PMXBOT Log file Viewer

#mongodb logs for Thursday the 21st of May, 2015

[00:00:07] <Antiarc> (despite the fact that an array of seeds - even a single one - is a clear RS signal!)
[00:00:16] <Boomtime> no, there is a flag for that
[00:00:53] <Boomtime> what you are trying to do is a special case, and you're complaining that your special case isn't optimized for - you can do what you want but you want it to be magic
[00:01:16] <Boomtime> what exactly is the problem you are experiencing?
[00:01:19] <Antiarc> I....really don't think so. I've contributed to the 1.x driver, I'm the maintainer of MongoMapper, I really do know my way around this stuff.
[00:01:43] <Antiarc> The problem is that I connect to a replica set and the autodiscovery says it's a single ("direct") topology
[00:01:52] <cheeser> i'm not sure i understand what the problem is either.
[00:02:01] <Boomtime> no, what you've done is connect using direct mode semantics
[00:02:09] <joannac> His problem is that there's a way to explicitly say "this is a single node"
[00:02:20] <joannac> and a way to explicitly say "this is a replica set"
[00:02:37] <cheeser> what does that mean, exactly?
[00:02:38] <Antiarc> ...why even have topology autodiscovery then?
[00:02:46] <joannac> in the absence of anything explicit, he wants it to be autodiscovery
[00:02:52] <joannac> when in fact it's "Single node"
[00:03:18] <cheeser> if you pass a seed list, the driver will autodiscover the topology. if you pass it one host, it'll only talk to that one host.
[00:04:37] <joannac> cheeser: but he thinks that since there's a way to say "this is a single node" (connect=direct), that the single node, no other options case should also default to autodiscovery
[00:04:50] <Antiarc> Sure, in the 1.x driver that was similar - a list - even a list of 1 - was treated as a seed list, rather than a single node. A URI or host:port string was single-node semantics.
[00:05:30] <cheeser> the new drivers all (should) implement the Server Discovery and Monitoring spec which determines that behavior.
[00:06:36] <joannac> I don't know about Ruby, but I don't think a single node list ever went to discovery in the drivers I've worked with (python, java)
[00:07:09] <Antiarc> It did in 1.x :)
[00:07:29] <Boomtime> well that was a bug then
[00:07:33] <Boomtime> http://docs.mongodb.org/manual/reference/connection-string/#uri.replicaSet
[00:07:53] <Boomtime> the spec says that if only one host is provided and no replica-set option then it defaults to direct connect
[00:08:01] <Boomtime> perhaps you want to raise a feature request?
[00:08:31] <Antiarc> Okay, that's fair enough then. If it's spec, then I can live with it. Was just confusing, coming from the 1.x stuff.
[00:08:35] <Boomtime> maybe request an option to add "autoDiscover" or something to force discovery
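A hedged PyMongo 3.x sketch of the behavior being discussed (host and replica-set names are placeholders): a seed with an explicit replicaSet option triggers discovery, while a bare host gets direct semantics.

```python
from pymongo import MongoClient

# With an explicit replicaSet option, the driver discovers and monitors
# every member of the set, even from a single seed.
rs_client = MongoClient("mongodb://host1:27017/?replicaSet=rs0")

# A bare single host with no replicaSet option is treated as a direct
# ("Single") topology connection -- no discovery happens.
direct_client = MongoClient("mongodb://host1:27017")
```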
[00:08:50] <Antiarc> The bigger concern is the global logger (that's going to play *hell* with multithreaded stuff) and one-connection-per-DB
[00:09:32] <joannac> Antiarc: raise some tickets in the RUBY project then
[00:09:36] <Antiarc> kk, will do
[00:09:43] <Boomtime> yeah, that sounds ruby specific
[00:10:23] <Antiarc> This is just me being curmudgeony, but I have apps which run several hundred threads, touch multiple DBs, and so I meantina large connection pools, and the idea of multiplying the size of those pools to be able to have available resources to query multiple DBs sounds like a fast track to pain, heh
[00:10:36] <Antiarc> err, maintain. Stupid fingers.
[00:45:50] <cht> I've been using mongostat to compute inserts/sec over a period of time. Noticed that mongostat does not always report after every second, however the documentation says the insert column is inserts/sec
[00:46:20] <cht> anyone knows if it does an average over multiple seconds ?
[00:49:25] <Boomtime> cht: mongostat pretty much just runs the serverStatus command every 1 second
[00:50:03] <Boomtime> this means that if your mongod server is performing badly, it might not service the request in time, thus, "every 1 second" is dependent on the server responding in a timely fashion
[00:50:46] <cht> so the total number of inserts mongo did and what mongostat reports will not match ?
[00:51:05] <cht> say during a span of a couple of hours of testing
[00:51:06] <Boomtime> huh?
[00:51:13] <Boomtime> it will exactly match
[00:51:25] <Boomtime> the metrics come from the server, it isn't mongostat inventing them
[00:51:50] <Boomtime> if you want to know the inserts over a period of an hour though you're better off running serverStatus yourself
[00:52:05] <Boomtime> why use a period counter when you actually want a total counter?
[00:52:21] <cht> I want the trend and total both
[00:53:03] <Boomtime> mongostat is good for the trend, but it's probably bad at totals over a long period because the outputs will be rounded - i dont know if the rounding errors are accounted for
[00:53:21] <Boomtime> serverStatus on the other hand, is a counter only
[00:53:44] <Boomtime> it contains the global totals, mongostat works out a period by doing two serverStatus commands 1 second apart and comparing them
[00:54:29] <Boomtime> thus, if you want totals, you're better off taking serverStatus at the start, and again at the end and getting the absolute totals that way
[00:54:34] <cht> yep, mongostat is way off over long running tests for averages
[00:55:09] <Boomtime> right, mongostat is intended for trends, so it isn't really surprising that it drifts over time
[00:55:45] <cht> however if the sum of inserts over the run is exactly the same that should do it too
[00:56:13] <cht> just timeboxing the run and using the time measured rather than relying on mongostat time
[01:00:53] <Boomtime> cht: i would still recommend using the serverStatus output, you obviously want more fidelity so go to the source
[01:06:03] <cht> Boomtime, the metrics come from mongod, does mongod report inserts/sec ?
[01:06:38] <Boomtime> no, it reports inserts as a counter
[01:06:46] <Boomtime> when there is an insert, the counter increments
[01:07:05] <cht> so I don't get what mongostat is doing then
[01:07:20] <Boomtime> it asks for the counter once, waits 1 second, then asks for it again
[01:07:36] <Boomtime> tada! stats over 1 second
[01:07:48] <Boomtime> and using that exact same method you can produce an average over any time frame you want
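A rough sketch of the method Boomtime describes, assuming a local mongod; `opcounters.insert` is the server-wide insert counter returned by serverStatus.

```python
import time
from pymongo import MongoClient

client = MongoClient()

def insert_count():
    # serverStatus exposes a monotonically increasing insert counter
    return client.admin.command("serverStatus")["opcounters"]["insert"]

start = prev = now = insert_count()
for _ in range(60):
    time.sleep(1)
    now = insert_count()
    print("inserts in the last second:", now - prev)   # the per-second trend
    prev = now

print("exact total over the whole run:", now - start)  # no averaging involved
```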
[01:07:49] <cht> however when the wait time is more than 1 second, then what ?
[01:08:00] <cht> so it does do an average ?
[01:08:11] <Boomtime> what?
[01:08:22] <Boomtime> no, it prints how many inserts occurred in the preceding second
[01:08:27] <Boomtime> that's all
[01:08:49] <cht> how ? if mongod only returns a counter how can it do that without computing the average
[01:10:19] <cht> say you have 5 second test time, mongod is doing 2000/sec. For the first second mongostat reports 2000 inserts and then only reports in the 3rd second, however it reports 2000 in the 3rd second, why won't it report 4000 over 2 seconds ?
[01:10:32] <cht> it seems it is doing 4000/2 = 2000 inserts ?
[01:10:46] <Boomtime> because the counter didn't change in the second interval
[01:11:22] <Boomtime> if you ask the server for the counter value, then wait 1 second and ask for the counter value, and the two values are the same, then no inserts occurred for that 1 second
[01:11:35] <Boomtime> this is what mongostat does
[01:11:47] <cht> if no inserts occurred, then mongostat should report 0
[01:11:55] <Boomtime> and it does
[01:11:59] <cht> not always
[01:12:04] <Boomtime> i assure you it does
[01:12:12] <Boomtime> if it doesn't then an insert occurred
[01:12:31] <cht> we are talking about cases where mongostat requests the server and probably waits for 7 secs during which server is still writing and mongostat probably is doing total writes/7
[01:12:55] <Boomtime> what do you mean "waits for 7 seconds"
[01:13:03] <Boomtime> mongostat queries every second
[01:13:12] <cht> and if the query does not return every second ?
[01:13:28] <Boomtime> then your stats will be wrong because the server is apparently buried in work
[01:13:46] <cht> how wrong, will mongostat do average in that case ?
[01:14:07] <cht> say mongostat queries, and returns after 5 seconds.
[01:14:14] <Boomtime> how can it know?
[01:14:19] <cht> :P
[01:14:31] <cht> so what's the expected behavior ?
[01:14:45] <cht> it definitely does not report the inserts that happened during those 7 seconds
[01:14:51] <Boomtime> it reports the precise difference between the values provided in the two snapshots it was given
[01:15:48] <Boomtime> if the server is buried in work and those snapshots are not being provided in the time you need, then you cannot expect the stats to reflect the wall clock time
[01:16:05] <Boomtime> however, the stats will be absolutely accurate over time
[01:16:23] <Boomtime> serverStatus is a counter, it is unrelated in time, it has no concept of time
[01:16:34] <cheeser> like my kids...
[01:16:38] <Boomtime> if an insert occurs, the insert counter increments
[01:18:18] <cht> yep serverStatus is fine, but mongostat is not displaying the mere difference of that counter
[01:28:58] <Boomtime> cht: if you want to know precisely what mongostat does, then the source is the definitive authority
[01:29:05] <Boomtime> https://github.com/mongodb/mongo-tools/blob/master/mongostat/mongostat.go#L55
[01:29:29] <Boomtime> note that mongostat keeps only a single snapshot from the last poll
[01:30:03] <Boomtime> it may calculate drift in between using the time of the response, but it must do so using only 2 counter values
[01:31:34] <cht> From documentation: mongostat returns values that reflect the operations over a 1 second period. When mongostat <sleeptime> has a value greater than 1, mongostat averages the statistics to reflect average operations per second.
[01:31:58] <cht> so when mongostat is not reporting every second, can we assume it is averaging?
[01:32:19] <cht> also documentation says, inserts: The number of objects inserted into the database per second.
[01:33:14] <cht> so if it is doing what the documentation says, it has to average, otherwise how can it get a per-second number from 7 seconds of wait time..
[01:34:09] <cht> and if it is reporting the average value, then the sum of inserts reported by mongostat will not match the actual number of inserts that have happened
[01:34:46] <Boomtime> indeed
[01:35:42] <Boomtime> i can't say how it could attain that behavior - but assuming that description is correct, then yes, the average over time will not reflect precisely because "average" by definition is a loss of fidelity
[01:36:23] <cht> also I'm more interested in if it averages even when sleeptime is 1 second...going to be lots of digging now
[01:36:54] <Boomtime> right, assuming it has that ability it seems sensible that some logic like that would be in play
[01:37:39] <Boomtime> you have a simple option to avoid this entire issue - use serverStatus at the start of your test, and again at the end, and diff the counter values, this will tell you exactly what was obtained
[01:38:02] <cht> yep I will do that
[01:51:29] <cht> https://github.com/mongodb/mongo-tools/blob/76b5a5d048e1bc1c1307db14feaffa144e997a44/mongostat/stat_types.go#L707
[01:51:31] <cht> meh
[02:36:46] <Razerglass> how do i look in my DB to see if there has already been a collection saved with the same title
[02:38:25] <cheeser> show collections;
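The driver-side equivalent, as a hedged sketch with a recent PyMongo ("title" stands in for the collection name being checked):

```python
from pymongo import MongoClient

db = MongoClient().mydb
already_exists = "title" in db.list_collection_names()
```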
[03:00:11] <pwlarry> say I was to create an article that has multiple module types assigned to it, would the best pattern be " modules: [<module_id>, <module_id>, ...] "
[03:00:25] <pwlarry> is there any downside to that approach?
[06:01:46] <V10l4t3d> Hi all, can you help me with a problems with php driver ?
[06:13:19] <hahuang65> anyone know why db.runCommand({setParameter : 1, verboseQueryLogging : "false"}) is not working? It keeps returning { "was" : true, "ok" : 1 } no matter how many times I do it
[06:16:39] <joannac> hahuang65: don't you have a subscription?
[06:18:41] <hahuang65> joannac: yeah, felt like this was a quick answer, anyways, it was cuz "false" (a string) instead of false (a boolean)
[06:18:46] <hahuang65> joannac: thanks :)
[06:19:05] <hahuang65> I didn't want to wake anyone just for this answer, just trying to see if anyone could spot something quick on IRC :)
[07:17:06] <mitereiter> I got a tag range like this
[07:17:06] <mitereiter> tag: shard0003 { "_id.countryCode" : "USA", "_id.firNR" : { "$minKey" : 1 } } -->> { "_id.countryCode" : "USA", "_id.firNR" : { "$maxKey" : 1 } }
[07:17:07] <mitereiter> but I got such chunks (not on the right shard)
[07:17:07] <mitereiter> { "_id.countryCode" : "USA", "_id.firNR" : NumberLong(24124079) } -->> { "_id.countryCode" : "USA", "_id.firNR" : NumberLong(24231503) } on : shard0001 Timestamp(1, 1603)
[07:17:07] <mitereiter> so isn't it supposed to be moved to the shard specified by the tag range?
[07:18:07] <joannac> mitereiter: balancer is on?
[08:45:30] <jdo_dk> May be a noob question, but when i use pymongo and insert_one, i get an InsertOneResult back. How do i get the inserted_id from the InsertOneResult? I have tried: res = mongo...insert_one(my_doc) print res.inserted_id() without any luck
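The likely culprit: `inserted_id` is an attribute on the InsertOneResult, not a method, so calling it fails. A minimal sketch:

```python
from pymongo import MongoClient

coll = MongoClient().mydb.mycoll
res = coll.insert_one({"some": "doc"})
print(res.inserted_id)  # attribute access -- no parentheses
```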
[08:45:38] <svm_invictvs> /join ##Java
[08:56:54] <joannac> mitereiter: try checking the mongos log for move chunks
[11:41:05] <razieliyo> hi
[11:41:23] <razieliyo> so I have: db.runCommand({distinct:"some_collection", key: "some_key"})
[11:41:27] <razieliyo> how can I count the result?
[11:44:39] <razieliyo> ok, I got it, capture the return, then returnvar["values"].length
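The same count is also a one-liner through the driver's distinct helper (a PyMongo sketch; names taken from the question above):

```python
from pymongo import MongoClient

db = MongoClient().mydb
count = len(db.some_collection.distinct("some_key"))
```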
[12:08:56] <eagles0513875__> hey guys i have mongod running manually at the moment. how can i get the mongo daemon to run on startup as well as run in the background?
[12:16:36] <razieliyo> can I remove duplicates ignoring _id?
[12:27:01] <KekSi> is there a way to programmatically check whether a mongodb instance supports ssl?
[12:27:10] <KekSi> beside trying to connect and see if it fails?
[12:28:39] <cheeser> that's pretty much it unless you want to read the config file
[12:41:23] <KekSi> is there a way to do that without waiting for a com.mongodb.MongoTimeoutException?
[12:55:28] <CaptTofu_> hi all
[12:56:07] <CaptTofu_> quick dumb question -- I'm coming from a mysql background -- how does one create a user that allows one to back up the database? The docs seem to imply there is a backup user, but I don't see it.
[13:01:40] <StephenLynx> a read only user, maybe?
[13:04:27] <newbsduser> how can i calculate a query's dataset size, compressed or uncompressed? which method should i use to do that?
[13:13:33] <Constg> Hello, I have a problem I can't identify. All is going well on my replica set, when suddenly the queue rises a lot and quickly. With thousands of entries, I don't know how to find which operation is blocking the queue. Can somebody help me?
[13:14:21] <Constg> To fix it, I have to step down the current master to let another node of the replica set become master. Then everything is ok for two or three days.
[13:17:41] <deathanchor> db.currentOp()
[13:18:12] <deathanchor> you can query things like db.currentOp({ secs_running : { $gt : 10 } })
[13:22:54] <Constg> thx deathanchor, I did, but I have thousands of results. Hard to find which one is causing the queue.
[13:25:03] <deathanchor> Constg: http://docs.mongodb.org/manual/reference/method/db.currentOp/#active-operations-with-no-yields
[13:31:41] <Constg> Ha that would help
[13:31:46] <Constg> thank you deathanchor
[14:19:27] <pamp> hi
[14:19:29] <pamp> why does mongostat not present the page faults
[14:19:36] <pamp> i'm on ubuntu server
[14:21:05] <Lujeni> pamp, version ? engine ?
[14:22:15] <pamp> wiredTiger, version 3.0.3
[14:23:28] <Lujeni> pamp, Only for MMAPv1 Storage Engine. The number of page faults per second.
[14:23:42] <pamp> hm ok´
[14:24:39] <pamp> is there no way to see page faults per second with the wiredTiger engine?
[14:51:54] <mrfloyd> Hello
[14:53:49] <mrfloyd> I am about to build a social network where users can post videos (with comments and likes), also users can follow each other, any tips?
[14:54:14] <GothAlice> mrfloyd: Use a real graph database to store the connections between users.
[14:54:16] <GothAlice> Please. ;)
[14:54:39] <mrfloyd> I have around 2k active users / second and this number is expected to increase as soon as we release the new version
[14:55:00] <GothAlice> Yeah. Not using a real graph database killed my last project after we got hackernews'd and lifehacker'd the same week.
[14:55:23] <cheeser> agreed. there are ways to do certain bits of social networking on mongo (http://blog.mongodb.org/post/65612078649/schema-design-for-social-inboxes-in-mongodb) but it's not ideal
[14:55:40] <cheeser> graph dbs are built for the kind of network traversals you'd want to do.
[14:55:52] <GothAlice> As soon as you need to know "friends of friends", etc., MongoDB will fall short.
[14:56:07] <cheeser> pretty much anything but a graph db would.
[14:56:16] <mrfloyd> i've been doing front end objective-c stuff for the last 4 years, a bit outdated with the server side technologies, so sorry in advance if i ask anything stupid
[14:56:25] <GothAlice> Oh, no worries.
[14:56:29] <GothAlice> There are no bad questions, only bad answers. :)
[14:57:01] <cheeser> i 98% agree with that. i've seen some real stunners over the years. ;)
[14:57:08] <mrfloyd> i used mongodb for www.completure.com 4 years ago but i only got 100k users so i couldn't really test it. it was hosted on mongolab.com
[14:57:10] <GothAlice> Yeah, should have said "few bad questions". ;)
[14:57:50] <GothAlice> mrfloyd: Interesting app. I use "Breaking News" for the same purpose.
[14:58:18] <mrfloyd> thank you
[14:58:36] <mrfloyd> we got some hot media as you can see, but we quit the project
[14:58:47] <mrfloyd> is mysql good for those relations between the users ?
[14:58:52] <GothAlice> No.
[14:59:02] <GothAlice> MySQL is not a graph database. Neo4j is an example of a real graph DB.
[14:59:10] <mrfloyd> ya i checked the list
[14:59:10] <GothAlice> (We use Neo4j at work, these days.)
[14:59:53] <mrfloyd> Alright
[15:00:05] <mrfloyd> what do they use @instagram
[15:00:22] <mrfloyd> i am building an exact clone but for videos
[15:00:34] <cheeser> probably a combination of databases.
[15:00:39] <GothAlice> mrfloyd: http://instagram-engineering.tumblr.com/post/13649370142/what-powers-instagram-hundreds-of-instances
[15:00:41] <GothAlice> :)
[15:01:05] <mrfloyd> jesus
[15:01:32] <cheeser> pgsql++
[15:01:41] <GothAlice> On our last project at work we went a step further than Instagram did, we completely removed permanent storage from the equation on our DB servers.
[15:02:02] <GothAlice> (All were ephemeral EC2 instances, all were self-configuring, with WAL-e oplog streaming into S3.)
[15:02:29] <mrfloyd> "Data storage
[15:02:30] <mrfloyd> Most of our data (users, photo metadata, tags, etc) lives in PostgreSQL; we’ve written (http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram) about how we shard across our different Postgres instances. Our main shard cluster involves 12 Quadruple Extra-Large memory instances (and twelve replicas in a different zone.)"
[15:02:43] <GothAlice> Yuuuup.
[15:02:50] <GothAlice> It's all about the RAM, baby! :D
[15:02:58] <mrfloyd> i am using rackspace
[15:03:10] <GothAlice> As are we at work, these days.
[15:03:11] <mrfloyd> i am already paying 2k a month in bandwidth
[15:03:25] <mrfloyd> https://play.google.com/store/apps/details?id=com.feed.dubsmasher
[15:03:37] <mrfloyd> it is a dubsmash competitor have a look
[15:03:37] <GothAlice> Do you accidentally have your database connections going over the public interface instead of the servicenet one?
[15:03:40] <GothAlice> (ServiceNet is free.)
[15:03:42] <mrfloyd> now i am building the clean version ...
[15:04:09] <mrfloyd> ServiceNet free bandwidth for video streaming??
[15:04:31] <GothAlice> ServiceNet is free server-to-server bandwidth.
[15:04:36] <GothAlice> Not so much the client side.
[15:04:47] <GothAlice> Also, lolandroid. http://www.huffingtonpost.ca/2015/05/21/spy-agencies-target-mobile-phones-app-stores-to-implant-spyware_n_7349520.html
[15:05:24] <mrfloyd> most of the bandwidth is server to client
[15:05:33] <GothAlice> Indeed.
[15:05:33] <mrfloyd> actually from the CDN to the client
[15:05:58] <GothAlice> Yeah: that's the bigger point. You should be serving content out of a cheaper-cost CDN. App server bandwidth is expensive by comparison.
[15:06:01] <StephenLynx> lol instagram uses ubuntu for servers. didn't expect that.
[15:06:11] <mrfloyd> yes
[15:06:22] <mrfloyd> Natty Narwhal
[15:07:23] <GothAlice> StephenLynx: I'm progressively moving away from caring about underlying architecture. I push to a remote git repo, the app deploys, and I really don't want to have to care about how it's doing that… it should just work. ;)
[15:07:50] <GothAlice> (OFC I still use Gentoo…)
[15:08:04] <cheeser> gentoo++
[15:08:16] <GothAlice> Kernel compiles from depclean in ~45 seconds, yeah! :D
[15:08:20] <StephenLynx> When I get to the point where I don't have to do sysadmining, maybe I'll stop caring about that kind of stuff.
[15:10:09] <mrfloyd> So which database should i use :D
[15:10:22] <GothAlice> mrfloyd: I'd recommend Neo4j for storing the graph data, and MongoDB for the rest of the data.
[15:10:33] <GothAlice> StephenLynx: As I'm getting further on in my career I'm better at recognizing things that are simply "make-work projects". Anything worth doing twice is worth automating, and servers are… highly automatable. Anything less is make-work, and there are generally better uses for time than applying patches to boxen. ;)
[15:11:06] <mrfloyd> ya but i don't want any learning curve
[15:11:35] <StephenLynx> mrfloyd: of all the problems that might arise in software development, lack of knowledge is the easiest one to solve.
[15:12:36] <mrfloyd> I moved from software development to business unfortunately / fortunately
[15:12:49] <mrfloyd> so i always look to the fastest and easiest way around
[15:12:54] <mrfloyd> i shoot now and i aim later
[15:12:56] <GothAlice> mrfloyd: Doing without understanding is a recipe for disaster and a short career.
[15:13:17] <StephenLynx> yeah.
[15:13:24] <mrfloyd> this is why i am here asking for helo
[15:13:27] <StephenLynx> wrong decisions can kill projects.
[15:13:27] <mrfloyd> help*
[15:13:36] <StephenLynx> and GothAlice already recommended something.
[15:13:47] <GothAlice> One can not truly solve a problem without understanding it first. (And in my consulting work, it's often more about giving the client what they _need_ and not what they _want_, which requires even deeper understanding.)
[15:13:48] <mrfloyd> where can i host my Neo4j db?
[15:14:07] <mrfloyd> i use mongolab for the mongodb any similar solution for Neo4j
[15:14:08] <mrfloyd> ?
[15:14:21] <GothAlice> No, you're going to need to spin up some VMs and run that service yourself.
[15:14:35] <StephenLynx> yeah, clients have this issue understanding that if they knew what they are paying you to counsel them on, they wouldn't be paying you in the first place.
[15:14:39] <mrfloyd> Alright
[15:14:53] <GothAlice> mrfloyd: Also, as a note, "cloud hosted databases" are less cost effective than hiring a DBA and buying the physical servers yourself, for any amount of data exceeding around 1GB.
[15:15:30] <GothAlice> mrfloyd: Ref: https://twitter.com/GothAlice/status/582920470715965440
[15:15:39] <mrfloyd> Yes, it will come..
[15:15:48] <mrfloyd> we are currently based in lebanon and few resources in here
[15:16:02] <GothAlice> Er, TB, not GB, sorry.
[15:16:05] <mrfloyd> I can't find 1 objective-c developer, it has been 4 months i am trying
[15:16:11] <GothAlice> Ouch.
[15:16:57] <StephenLynx> hehe
[15:17:03] <StephenLynx> there is a reason for that, mrfloyd
[15:17:40] <StephenLynx> everyone who ever worked with obj-c either hated everything apple so much they swore they would never touch anything from them again, or loved apple so much they migrated to swift when apple released it.
[15:18:04] <StephenLynx> The former happened to me.
[15:18:15] <mrfloyd> in here they are all employed with very high salaries or doing their own apps
[15:18:30] <mrfloyd> swift is unstable yet
[15:18:37] <mrfloyd> not stable yet*
[15:19:10] <GothAlice> StephenLynx: Please, your over-broad statements fail to take into account that reality is a complex affair, and that there is likely a confluence of factors including geography contributing to the issue. (It seems clear you haven't actually seriously used Objective-C… it's like Python on top of C with all the introspection and OOP goodness, and is actually quite nice to work in once you're used to it.)
[15:19:35] <StephenLynx> I didn't say obj-c was bad.
[15:19:48] <StephenLynx> I said that working with apple platforms is.
[15:19:59] <GothAlice> "everyone who ever worked with" — ludicrously over-broad. ;)
[15:20:16] <StephenLynx> yeah, I go for hyperboles often :v
[15:20:34] <cheeser> they're not that useful in technical discussions
[15:20:36] <StephenLynx> tbh, obj-c was the most "ok" aspect for everything I had to interact from apple.
[15:20:37] <GothAlice> Not at all.
[15:20:39] <mrfloyd> i love objective-c
[15:20:45] <mrfloyd> it made me rich :)
[15:20:48] <GothAlice> :)
[15:21:14] <GothAlice> Well, ObjC + all of the effort Apple has spent developing the frameworks and toolchain.
[15:21:16] <mrfloyd> guys i gotta discuss the options with my partners and do some more readings
[15:21:30] <mrfloyd> you guys are so nice thank you for the support
[15:21:35] <GothAlice> It never hurts to help. :)
[15:21:45] <mrfloyd> it depends :P
[15:24:58] <eagles0513875> hey guys where can i find mongodb version 2.6.9 for ubuntu 14.04
[15:26:08] <GothAlice> eagles0513875: http://docs.mongodb.org/v2.6/tutorial/install-mongodb-on-ubuntu/
[15:48:03] <mitereiter> i got this message in my router process's log
[15:48:05] <mitereiter> SHARDING [Balancer] no where to put it :(
[15:48:06] <mitereiter> I SHARDING [Balancer] chunk { _id: "CompanyDB.Companies-_id.countryCode_"USA"_id.firNR_2262711", ns: "CompanyDB.Companies", min: { _id.countryCode: "USA", _id.firNR: 2262711 }, max: { _id.countryCode: "USA", _id.firNR: 2295949 }, version: Timestamp 1219000|0, versionEpoch: ObjectId('5552186b33c8a80418f1d028'), lastmod: Timestamp 1219000|0, lastmodEpoch: ObjectId('5552186b33c8a80418f1d028'), shard: "shard0003" } is not on a shard with the right tag: shard0003
[15:48:28] <mitereiter> i dont have maxsize set for my shards
[15:49:40] <mitereiter> in documentation I found this
[15:49:41] <mitereiter> By default, maxSize is not specified, allowing shards to consume the total amount of available space on their machines if necessary.
[15:52:50] <eagles0513875> ty GothAlice :) that is what i needed
[15:56:04] <Asenar> Hi, when using mongoimport from csv file, i get this error: read error on entry: line 5, extraneous " in field
[15:56:23] <Asenar> do you know how I can fix this ?
[15:56:27] <GothAlice> Asenar: Then your CSV is invalidly formatted, either not escaping a quote in that field, or truly with extra quotes.
[15:56:44] <Asenar> the csv is generated from mysql into outfile
[15:56:47] <GothAlice> Correct the source data, or write a little script to process it and gracefully handle errors.
[15:57:05] <Asenar> there are a lot of \" in the data
[15:57:40] <Asenar> so if the source is incorrect, how can I insert double quotes in strings ?
[15:58:03] <GothAlice> Asenar: Could you pastebin/gist line 5?
[15:58:16] <GothAlice> https://docs.python.org/2/library/csv.html#examples < Python's CSV reader + a little PyMongo would get you started with rolling your own mongoimport that can handle whatever formatting you want.
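A minimal sketch of that roll-your-own importer idea (the file, database, and collection names are assumptions; escapechar tells the csv module about MySQL's backslash-escaped quotes):

```python
import csv
from pymongo import MongoClient

coll = MongoClient().mydb.imported

with open("export.csv", newline="") as fh:
    # MySQL's SELECT ... INTO OUTFILE escapes quotes as \" by default
    reader = csv.DictReader(fh, escapechar="\\")
    for row in reader:
        coll.insert_one(row)  # wrap in try/except to skip bad rows gracefully
```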
[16:01:43] <Asenar> in fact it's line 5, column 460, and it corresponds to \"
[16:02:00] <GothAlice> Clearly, mongoimport doesn't seem to handle escaped quoting. :/
[16:02:10] <Asenar> thank you GothAlice
[16:03:46] <Asenar> maybe it will be simpler for me to import from json
[16:03:53] <GothAlice> Very likely, yes.
[16:04:10] <GothAlice> CSV isn't so much a standard as people doing things in roughly the same way. JSON is at least a standard. ;)
[16:05:56] <Asenar> Yeah, I wanted to do it quick, I have a mysql table with 523905 items :/
[16:18:16] <Asenar> GothAlice, I solved my problem by using tsv format (and hopefully no tabs were in the values)
[16:37:45] <qbar> I'm getting an Invariant failure when trying to start mongod on Ubuntu. It was working until I rebooted.
[16:38:14] <qbar> Anyone familiar with this? I'm new to Mongodb
[16:45:52] <StephenLynx> what error are you getting?
[16:46:06] <StephenLynx> and if have a choice, ubuntu is not the best thing to use on a server.
[16:46:12] <StephenLynx> if you have*
[16:52:03] <GothAlice> Migrations for deployment on the 25th are up to 35 minutes. T_T
[16:52:15] <GothAlice> (Running 8 tasks in parallel.)
[16:53:51] <GothAlice> Yay for longest maintenance downtime to date.
[16:54:27] <StephenLynx> a dude just told me today "there are no downtimes in real world" when bitching about mongo on another channel :v
[16:54:40] <GothAlice> Well, yeah, there is. You schedule it.
[16:54:46] <StephenLynx> i asked him "that is why world of warcraft goes offline every week for maintenance?"
[16:55:01] <GothAlice> Or EVE Online going offline for an hour every single day for maintenance? ;^)
[16:55:09] <StephenLynx> dude was completely biased.
[16:55:20] <StephenLynx> said document based dbs are useless lol
[16:55:36] <StephenLynx> and "why would one want to have sub-documents?" but apparently joins were a MUST.
[16:59:22] <qbar> When I try to start using systemd, I get "Job for mongod.service failed. See "systemctl status mongod.service" and "journalctl -xe" for details."
[16:59:31] <pamp> why ubuntu is not the best thing to use on a server? witch linux distribution is the best for MongoDB ?
[16:59:53] <pamp> which*
[16:59:58] <qbar> Failed to start LSB: An object/document-oriented database.
[17:03:02] <qbar> 2015-05-21T12:01:00.879-0500 I CONTROL ***** SERVER RESTARTED ***** 2015-05-21T12:01:00.901-0500 W - [initandlisten] Detected unclean shutdown - /var/lib/mongodb/mongod.lock is not empty. 2015-05-21T12:01:00.901-0500 I STORAGE [initandlisten] exception in initAndListen: 98 Unable to create/open lock file: /var/lib/mongodb/mongod.lock errno:13 Permission denied Is a mongod instance already running?, terminating 2015-05-21T12:0
[17:09:09] <StephenLynx> qbar: because it's not a distro focused on server work, and apt-get is kind of bad.
[17:09:19] <StephenLynx> as is ubuntu in general.
[17:09:27] <StephenLynx> for one I like to use centOS.
[17:10:07] <StephenLynx> and try running systemctl status mongod
[17:10:16] <StephenLynx> to see if its already running.
[17:10:51] <StephenLynx> pamp
[17:15:33] <pamp> Im using ubuntu
[17:15:47] <pamp> should I move to centOS?
[17:16:17] <pamp> I was in doubt about which distribution to use
[17:16:57] <pamp> I do not have much experience in linux
[17:17:35] <GothAlice> I would in general advise people to avoid the demands of sysadmin unless you happen to have an expert on hand. (The lesson I learned from years of mail hosting: don't do mail hosting. ;)
[17:18:28] <GothAlice> With the bonus of never needing to poke a server again.
[17:23:25] <StephenLynx> wouldn't a service like that limit what you can or can't run on the server?
[17:23:40] <StephenLynx> what if I want something like, using some RAM to mount a filesystem for temporary files?
[17:23:55] <Pinkamena_D> I have a question about joining two collections in one query, no this is not a sql 'join' or anything. I have basically two collections/databases whatever which are created from two different sources but have similar data and data format. In this case it is for audit entries. Simply documents like {'date':somedate,'message':somemsg}, for example.
[17:24:34] <Pinkamena_D> I am looking for a good way to make one query which will get for example 50 documents sorted by date, but from BOTH collections.
[17:25:04] <StephenLynx> Pinkamena_D: afaik it's not possible to work with two collections at the same time at all. The closest I've heard of is using a $out aggregation stage to save the aggregation output to a new collection that gets cleared before the new data is inserted.
[17:25:44] <StephenLynx> I don't think its possible to query two different collections at the same time in any shape or form.
[17:26:51] <Pinkamena_D> so from your experience so far, if you were presented with something like this, you would take from both collections with double the item limit, and do a merge after downloading from each?
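That client-side merge might look like the following hedged sketch (collection names assumed): over-fetch the limit from each collection, lazily merge the two sorted cursors, and keep the first 50.

```python
import heapq
import itertools
from pymongo import MongoClient

db = MongoClient().mydb
a = db.audit_a.find().sort("date", -1).limit(50)
b = db.audit_b.find().sort("date", -1).limit(50)

# Both cursors are already sorted newest-first, so a lazy merge suffices.
merged = heapq.merge(a, b, key=lambda doc: doc["date"], reverse=True)
newest_50 = list(itertools.islice(merged, 50))
```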
[17:41:26] <svm_invictvs> So, in Mongo is there a way to specify a write concern that will guarantee that all participants of a replica set will see the write before the write returns?
[17:41:45] <svm_invictvs> I see things like "SAFE" write concerns, and majority, but nothing seems to indicate that it can give any sort of guarantee like that.
[17:41:48] <svm_invictvs> Is that correct?
[17:42:53] <deathanchor> svm_invictvs: what is wrong with majority?
[17:43:07] <svm_invictvs> (Sorry, all my other assumptions still stand that this can only pertain to a single document at a time)
[17:43:45] <svm_invictvs> deathanchor: Nothing but it may not satisfy what I'm trying to do.
[17:44:00] <GothAlice> svm_invictvs: You can specify an exact number of replicas to require responses from. Unfortunately such a decision will negatively impact reliability. If one node goes away, that write will fail.
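In PyMongo terms, that exact-count write concern looks roughly like this (w=3 assumes a three-member set; the write raises an error if that many members cannot acknowledge within the timeout):

```python
from pymongo import MongoClient, WriteConcern

coll = MongoClient().mydb.mycoll.with_options(
    write_concern=WriteConcern(w=3, wtimeout=5000))
coll.insert_one({"some": "doc"})
```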
[17:44:08] <svm_invictvs> yeah
[17:44:33] <GothAlice> For any query that absolutely must have current data, you must simply direct that query at the primary.
[17:44:43] <svm_invictvs> I see.
[17:45:01] <svm_invictvs> So let's say I want to write document with _id = foo, right?
[17:46:38] <svm_invictvs> If one writer writes it, and another attempts to write it at the same time will I encounter any issues?
[17:46:50] <GothAlice> Yes. Only one will win.
[17:47:04] <svm_invictvs> Or will one be be guarnteed to see that exists (and therefore fail the write), and one will succeed
[17:47:07] <svm_invictvs> Okay
[17:47:13] <GothAlice> Writes must be directed at a primary, thus it naturally serializes the writes. Even if both queries are made in the same millisecond, one will win.
[17:47:23] <svm_invictvs> ah
[17:47:24] <svm_invictvs> I see
[17:47:32] <svm_invictvs> Yeah, taht's all I wanted.
[17:47:37] <svm_invictvs> because, I basically need a lock...
[17:47:47] <GothAlice> (This effect is what allows you to implement atomic locks using MongoDB update-if-not-different operations.)
[17:47:55] <svm_invictvs> Yep
[17:48:22] <GothAlice> svm_invictvs: https://gist.github.com/amcgregor/4207375 may be an interesting read. Includes code which demonstrates the locks.
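One way the atomic-lock trick can work, as a hedged PyMongo sketch (the collection name is an assumption): lean on the unique `_id` index so that when two writers race, exactly one insert wins.

```python
import datetime
from pymongo import MongoClient
from pymongo.errors import DuplicateKeyError

locks = MongoClient().mydb.locks

def acquire(name):
    try:
        locks.insert_one({"_id": name, "ts": datetime.datetime.utcnow()})
        return True   # we created the document, so we hold the lock
    except DuplicateKeyError:
        return False  # someone else won the race

def release(name):
    locks.delete_one({"_id": name})
```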
[17:49:34] <svm_invictvs> I hate the idea of locks, but sometimes you need to use them.
[17:49:44] <GothAlice> Sometimes you do. :)
[17:50:10] <svm_invictvs> I'm writing a lucene/gridFS storage driver
[17:50:29] <GothAlice> An interesting approach.
[17:50:48] <svm_invictvs> I'm not the first to do this.
[17:51:05] <svm_invictvs> But the one out there is based on Lucene 4.x and doesn't work with Lucene 5
[17:51:26] <svm_invictvs> GothAlice: Is "interesting" a euphemism for something?
[17:51:27] <svm_invictvs> haha
[17:52:23] <GothAlice> I tend to favour reduced architectural complexity at the expense of additional development time. I.e. the fewer moving parts the better, for me. Thus writing my own search engine in Python (with the rest of the app in Python, too) on top of MongoDB, rather than outsourcing that problem to additional infrastructure.
[17:53:40] <svm_invictvs> Heh
[17:53:52] <svm_invictvs> GothAlice: That's fair.
[17:54:18] <svm_invictvs> GothAlice: I had used Lucene in the past to do some fairly clever document matching/processing and I just wanted to tinker with it some more.
[17:54:39] <svm_invictvs> GothAlice: But my big beef with Lucene is that their core API changes so frequently
[17:54:45] <GothAlice> On the project those presentation slides were for, I joined about two weeks in. My first day saw us dropping redis, memcache, zeromq, rabbitmq, and postgres in favour of… just MongoDB. ;)
[17:55:15] <svm_invictvs> GothAlice: Yeah, a project I'm on right now I'd like to do that
[17:55:27] <svm_invictvs> GothAlice: Rip apart this mess of mysql, zeromq etc.
[17:55:31] <GothAlice> ^_^
[17:56:20] <greyTEO> I wonder how projects adopt so many technologies
[17:57:23] <greyTEO> usually my first process is how can I use an existing technology to do what I want to do. if it cant be done, I try to find another way.
[17:57:48] <greyTEO> but I definitely understand how use cases can be appealing
[17:58:27] <GothAlice> "We need a task queue!" "Celery = Redis + Zeromq." "We need caches!" "Memcache." "We need persistent event queues!" "Rabbitmq." "And data storage!" "Postgres." After introducing MongoDB that turns into: "Task queue?" "Capped capped collection." "Caches!" "TTL indexes." "Persistent queues!" "Capped collection backed by a real collection." "Data storage." "Documents."
[18:00:12] <greyTEO> I used rabbitmq, but solely because I needed to send messages across servers and between languages
[18:00:24] <greyTEO> lol later replaced by mongo.
[18:00:40] <greyTEO> and mongo-connector is a thing of absolute beauty
[18:00:50] <mike_edmr> could have used thrift or pbuffers
[18:01:01] <mike_edmr> they are meant for cross language rpcs
[18:01:21] <greyTEO> I have almost replaced my entire stack with mongo
[18:02:00] <mike_edmr> im not a fan of using mongo as a hammer
[18:03:02] <greyTEO> I liked the idea of rabbitmq because it was fire and forget. I am not sure about Apache Thrift
[18:04:20] <greyTEO> I will agree there are limits to mongo but it fits well in many different situations
[18:07:04] <GothAlice> mike_edmr: That task queue worker, three years ago, ran 1.9 million distributed RPC requests per second with two producers and five consumers. ;)
[18:07:08] <GothAlice> MongoDB is a glorious hammer.
[18:08:12] <GothAlice> (Considering that capped collections, the "queue", are how MongoDB itself handles multi-server replication, I expected it to be performant going in.)
[18:20:43] <mike_edmr> it doesnt work well for ad hoc querying/reporting, its not disk efficient, it doesnt work well for highly relational data
[18:20:52] <mike_edmr> and theres this: https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-reads
[18:23:33] <mike_edmr> disk *space* efficient i should probably qualify it
[18:23:33] <greyTEO> TO THE ##argument CHANNEL! lol
[18:24:25] <mike_edmr> i rly dont want to start an argument i just dont want anyone to suffer because they got swept up by hype
[18:24:50] <mike_edmr> which happens all the time unfortunately
[18:24:53] <greyTEO> I would go as far as to say that any of those items (except relational data) can be overcome very easily
[18:25:27] <greyTEO> mike_edmr, that is true. It's not a silver bullet, but that is true of any technology. try before you buy, really
[18:28:44] <StephenLynx> " I tend to favour reduced architectural complexity at the expense of additional development time" this.
[18:29:50] <StephenLynx> My projects mostly revolve around the runtime environment, the database and its driver.
[18:30:12] <StephenLynx> only people who never tried to implement the foundation of a system complain that it is too hard.
[18:48:20] <LyndsySimon> Does anyone here have a strong background in discrete mathematics/graph theory? I'd like an opinion on this course: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-042j-mathematics-for-computer-science-fall-2010/Syllabus/
[18:49:18] <LyndsySimon> I'm wanting to quickly establish a foundation to draw from, to build a fairly complex social data application. I've always taken to math, and have taken up to intermediate calculus in a formal academic setting.
[18:54:14] <StephenLynx> Have you studied game design?
[18:55:15] <StephenLynx> for one, I have an aversion to academia. they always seem disconnected from the real world to me.
[18:55:25] <StephenLynx> even people whose work I respect, like RMS.
[20:04:22] <Pinkamena_D> Trying to do a mongodb dump/restore with a single database
[20:05:06] <Pinkamena_D> The dump produces the database and the database.metadata , but restore wants "root directory must be the dump of a single collection"
[20:05:27] <Pinkamena_D> if I delete the metadata file it 'works', but this is this required?
[20:05:38] <Pinkamena_D> why is this*
[20:06:23] <GothAlice> Pinkamena_D: What are the exact mongodump and mongorestore lines you are using? (Less any users/passwords, of course.)
[20:07:37] <Pinkamena_D> mongodump --db=audit --collection=audit ; mongorestore --db=sonic --collection=siteAudit dump/audit
[20:08:12] <GothAlice> Indeed. It's your use of --collection that's biting you in the foot, here.
[20:08:27] <GothAlice> If you restore without the --collection option, and there's only one real collection to restore in the directory, it'll happily just restore that one.
[20:08:55] <Pinkamena_D> I do want only one collection, but I want it to restore to that collection, which as you can see is renamed
[20:10:28] <Pinkamena_D> of course if this was not supported for some reason I could understand, but it just seems weird that the metadata deletion 'solves' the issue, but it seems hacky.
[20:11:24] <Pinkamena_D> also Im guessing that messes with the indexes
[20:11:44] <GothAlice> The re-naming is a confusing factor, for mongorestore. It knows that you want to restore to siteAudit, but it must only be given a directory with a single collection dump in it otherwise it has no way of knowing which to restore to the alternate name.
[20:12:01] <GothAlice> Without the metadata, AFAIK, you have no indexes in the resulting collection.
[20:12:46] <Pinkamena_D> In my case this must be done manually once in a while to update a collection which already has indexes, so theoretically it wont be an issue.
[20:13:20] <GothAlice> If these are simply different DBs (and collection names) on the same server, could you not make use of http://docs.mongodb.org/manual/reference/command/cloneCollection/ ?
[20:13:30] <Pinkamena_D> but, I think maybe it should be addressed in a bug report, or do you think it is more my fault? Renaming is slightly out of scope but I could see it being a fairly common use case.
[20:14:06] <GothAlice> (cloneCollection copies with the same name, but you'd then follow it up with a renameCollection.)
[20:14:24] <GothAlice> Renaming is generally better handled using the proper tools. ;)
[20:14:39] <GothAlice> (I.e. renameCollection, cloneCollection)
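Roughly what those two commands look like from a driver (a hedged sketch; hosts are placeholders, namespaces come from the discussion, and cloneCollection existed on the 3.x servers of this era but was removed in MongoDB 4.0):

```python
from pymongo import MongoClient

client = MongoClient("target-host")

# Copy audit.audit from the source server, keeping the same namespace...
client.audit.command({"cloneCollection": "audit.audit",
                      "from": "source-host:27017"})

# ...then rename it into place (renameCollection is an admin command).
client.admin.command("renameCollection", "audit.audit",
                     to="sonic.siteAudit", dropTarget=True)
```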
[20:16:48] <Pinkamena_D> alright. Well, I appreciate your help.
[22:57:35] <DragonPunch> why does Who: {
[22:57:35] <DragonPunch> type: Schema.ObjectId,
[22:57:35] <DragonPunch> ref: 'userSchema'
[22:57:35] <DragonPunch> },
[22:57:40] <DragonPunch> only store 1 ID
[22:57:53] <DragonPunch> and not the user object
[23:03:00] <louie_louiie> please explain a little bit more
[23:05:52] <DragonPunch> louie_louiie: well, basically, in my database it's only saving the ObjectID; it's not saving all the attributes that userSchema has, for example the FirstName, LastName, etc.
[23:08:21] <louie_louiie> are you sending it into the db as an object?
[23:08:52] <louie_louiie> var object = {first: xxxxxx, last: xxxxxxx}
[23:09:06] <louie_louiie> collection.insert(object, { w: 0 });
[23:09:11] <DragonPunch> Well, I'm using MongooseJS so it happens via Schemas.
[23:18:57] <joannac> DragonPunch: maybe the fact that the type is a Schema.ObjectId ?
[23:19:29] <pjammer> So i have the same db setup in 2 spots. As i was migrating, i guess we left a supposedly NEVER used uploader on the site, and it got used. there are 1800 docs on the old server that i need to somehow migrate to the corresponding new server.
[23:20:18] <pjammer> i can't make this up.... but how do I begin to learn what to do and is it going to be the worst, or the absolute worst thing i've done?
[23:21:33] <joannac> pjammer: what's the problem? db.oldcollection.find().forEach(connection-to-new-db.newcollection.insert())
[23:21:38] <joannac> (in pseudocode)
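Spelled out with a real driver, that pseudocode is only a few lines (hosts, databases, and collection names are placeholders):

```python
from pymongo import MongoClient

src = MongoClient("old-server").mydb.oldcollection
dst = MongoClient("new-server").mydb.newcollection

for doc in src.find():
    dst.insert_one(doc)
```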
[23:21:50] <pjammer> gtfo really?
[23:22:15] <pjammer> i'll take it if it works. thanks @joannac
[23:44:45] <donCams> hi! say I have a capped collection. Can I create two queries where one has a tailable cursor while the other is just a normal find query?
[23:44:54] <donCams> at the same time? for that collection?