PMXBOT Log file Viewer

#mongodb logs for Tuesday the 4th of November, 2014

[00:04:02] <izolate> when I apt-get install mongodb, it automatically starts the daemon. is this expected behavior?
[00:04:14] <izolate> just trying to understand it fully
[00:05:07] <joannac> I think so, on debian / ubuntu / etc
[00:05:57] <izolate> yup
[00:16:02] <darkblue_b> hi all - I will find out more tomorrow, but I think the Java PhD colleague of mine, might be re-allocated to be the search algorithms guy, with Mongo retained as the search platform for cluster analysis
[00:16:13] <darkblue_b> I kind of like that - for no good reasons ;-)
[00:37:16] <hahuang65> what's the difference between doing a Bulk Write Operation for remove as opposed to just a remove (for a single collection)
[00:47:11] <joannac> hahuang65: a single remove does a getLastError each time. BulkWrite batches the getLastError calls
[00:47:27] <hahuang65> ah so it's much faster
[00:47:37] <hahuang65> that might... be a bad thing for me actually.
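A sketch of what joannac describes: with the 2.6+ Bulk API the queued writes travel together and are acknowledged once per execute() rather than once per document. The collection, filter, and id variables below are illustrative.
    var bulk = db.things.initializeUnorderedBulkOp();
    bulk.find({_id: id1}).removeOne();   // queue several deletes client-side
    bulk.find({_id: id2}).removeOne();
    bulk.execute();                      // one round-trip / acknowledgement for the whole batch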
[00:54:50] <hahuang65> I'm having issues because we need to bulk delete things and I think the index changes and replication fucks our performance.
[00:54:51] <hahuang65> Any ideas?
[00:58:05] <joannac> ideas for what? delete more slowly?
[00:58:37] <joannac> increase your oplog?
[00:58:45] <joannac> re-evaluate if you need all those indexes?
[00:59:09] <Boomtime> also, can you say what makes you think replication is the problem?
[00:59:19] <bpap> "increase oplog" <--- what does this mean?
[00:59:41] <Boomtime> bpap: http://docs.mongodb.org/manual/tutorial/change-oplog-size/
[01:01:47] <bpap> good stuff. thanks.
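For context on "increase your oplog": the current size and the replication window it covers can be checked from the shell before resizing, for example by running this on a replica set member:
    db.printReplicationInfo()   // prints the configured oplog size and the log length from start to end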
[01:01:56] <hahuang65> Boomtime: not necessarily that replication is the problem, just that a massive amount of writes at once is the issue. Every operation associated with the writes (replication, index updating) suffers.
[01:05:00] <hahuang65> is there any way I can batch up deletes or space them out or something
[01:05:49] <hahuang65> db.collection.remove({site_id: ObjectId(1)}, amount: 1000) or something
[01:09:17] <bpap> .findAndModify(<query>, {}, <remove=1>).limit(1000) ?
[01:09:31] <hahuang65> bpap: does that really work?
[01:09:40] <bpap> dunno. i've been using Mongo for 3 days.
[01:09:45] <hahuang65> ...
[01:09:46] <hahuang65> lol
[01:09:52] <Boomtime> no
[01:11:22] <Boomtime> ah, here it is: https://jira.mongodb.org/browse/SERVER-1599
[01:12:19] <Boomtime> note it refers to delete as remove
[01:12:30] <hahuang65> yeah sure
[01:12:35] <hahuang65> so okay, it's not implemented yet.
[01:12:58] <bpap> i definitely used the wrong syntax. but it looks implemented here: http://docs.mongodb.org/manual/reference/method/db.collection.findAndModify/
[01:13:00] <Boomtime> anyway, the standard solution to "slowing down" bulk deletes is to do a query for the thing you are removing, with a limit - then issue a remove using a $in for that set
[01:13:22] <hahuang65> hmmm that's a good suggestion. I'll try that out.
[01:13:55] <Boomtime> if you $in using the _id it will be very efficient, despite the round-trip silliness
[01:14:05] <hahuang65> Boomtime: sure yeah
[01:14:08] <hahuang65> Boomtime: appreciate it man
[01:14:35] <Boomtime> also, filter your results to provide only the _id in the query result so you reduce the amount of data returned/gathered
[01:15:18] <Boomtime> not that optimizing this will matter much since your delete has to hit disk no matter what
[01:18:21] <hahuang65> yups
[02:08:19] <quuxman> I have a beast of a heisenbug, presumably somewhere deep in Pymongo
[02:08:49] <quuxman> I get: TypeError: "'dict' object is not callable" on calling cursor.next(), but it's not consistent
[02:09:22] <quuxman> it happens a lot, but only for certain queries, some of the time. Restarting my app consistently causes it to disappear, where it will then quickly reappear shortly after
[02:09:45] <quuxman> I have no idea where in pymongo this dict object is trying to be called
[02:09:48] <quuxman> It's definitely not in my code
[02:10:34] <quuxman> I've stepped through the entire code path in pymongo multiple times, and then it appears almost randomly later somewhere in my code where there is no dict object that's being called
[02:36:20] <Boomtime> quuxman: "I get: TypeError: "'dict' object is not callable" on calling cursor.next(), but it's not consistent"
[02:36:29] <Boomtime> this sounds like "cursor" has been reassigned
[02:36:51] <Boomtime> this is not something that pymongo could possibly have done, since you own that object
[02:37:26] <Boomtime> do you have a trace showing it is deeper in the next() method?
[02:38:12] <quuxman> I was reassigning cursor, with `cursor = islice(cursor, limit)` and I think I just narrowed my problem to that somehow
[02:38:32] <quuxman> why would that break things some of the time, only for certain queries?
[02:40:39] <Boomtime> isn't islice a python builtin?
[02:41:06] <Boomtime> you have asked a question whose answer seems to be "the mysteries of python" so i have no idea
[02:44:37] <Boomtime> also, you should use limit() not islice
[02:44:39] <Boomtime> http://api.mongodb.org/python/current/api/pymongo/cursor.html#pymongo.cursor.Cursor.limit
[02:44:45] <ElysiumNet> it's part of the itertools module
[02:48:35] <quuxman> I replaced the islice with a for loop, and I still see the same error
[02:51:10] <Boomtime> can you paste the entire error message?
[02:51:45] <quuxman> it's not relevant because it's the wrong traceback
[02:52:02] <quuxman> the line where the exception supposedly occurs makes no function call
[02:52:07] <quuxman> and there are no dict objects
[02:52:23] <Boomtime> you said it is on cursor.next()?
[02:53:00] <quuxman> As far as I can tell, yes
[02:53:20] <Boomtime> that is a function
[02:53:34] <Boomtime> please paste the entire error message
[02:53:40] <Boomtime> or put it in a gist if it's large
[02:53:46] <Boomtime> or pastebin, etc
[02:54:06] <quuxman> I'm not actually explicitly calling cursor.next(), I'm using `for r in cursor`
[02:55:48] <Boomtime> in python that's probably the same thing
[02:56:42] <quuxman> ah, I have figured it out. It was an issue with an ifilter I had in my code. Now that I've turned the ifilter and islice into a loop, I actually get a meaningful traceback
[04:06:10] <annoymouse> How do indices work?
[04:06:28] <annoymouse> Do I have to specify to use an index in the query?
[04:06:55] <cheeser> the query planner will try to pick the best one
[04:07:11] <cheeser> you can use $hint to suggest one
[04:11:26] <annoymouse> cheeser: And how do I make an index if I'm using a db hosting service like mongolab?
[04:11:40] <cheeser> what would the hosting service matter?
[04:12:22] <annoymouse> cheeser: It appears that MongoLab has some special way of making indices
[04:12:35] <annoymouse> cheeser: Anyway, how would I do it normally?
[04:12:58] <cheeser> http://docs.mongodb.org/manual/reference/method/db.collection.createIndex/
[04:14:41] <annoymouse> cheeser: How does that differ from http://docs.mongodb.org/manual/reference/method/db.collection.ensureIndex/
[04:17:11] <cheeser> oh, i had it backwards. createIndex() is deprecated. ensureIndex() is what you want.
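For reference, creating an index looks the same from any mongo shell, hosted or not; the collection and field below are illustrative. (ensureIndex() was the 2.6-era helper name; later server versions prefer createIndex().)
    db.posts.ensureIndex({author_id: 1})   // build an ascending index on author_id
    db.posts.getIndexes()                  // confirm it is there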
[05:47:02] <asher^> hi all, does anyone know of a mongoose channel on freenode?
[05:48:39] <joannac> #mongoosejs
[05:49:03] <joannac> as linked on their site http://mongoosejs.com/
[05:49:33] <asher^> thanks
[07:05:06] <shushamna> hi
[07:05:31] <shushamna> Austin
[07:05:39] <shushamna> I am newbie to Mongodb
[07:06:21] <shushamna> just connected to freenode.. it looks a little bit different to communicate..
[07:07:09] <shushamna> just i am checking how to communicate through this window..
[07:07:09] <shushamna> please dont mind
[07:07:11] <shushamna> thank you
[07:07:35] <lqez> welcome :)
[07:08:47] <shushamna> Hi all
[07:20:21] <Hypfer> hi everyone
[07:20:32] <Hypfer> are there some tricks to reduce disk io of mongo?
[07:21:15] <joannac> um, what?
[07:21:39] <LouisT> does mongo do an abnormal amount of disk io?
[07:21:49] <joannac> what are you seeing diskio from? reads or writes?
[07:27:00] <Hypfer> LouisT: yes; joannac a lot of read IO
[07:28:27] <joannac> Hypfer: run an explain on your common queries
[07:29:30] <Hypfer> joannac: ah, that sounds promising. I'll report back in a few hours
[07:29:44] <Hypfer> thanks
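What "run an explain" looks like in the 2.x shell, with an illustrative query; a collection scan shows up as a BasicCursor with nscanned far larger than n.
    db.mycoll.find({user_id: 123}).explain()
    // "cursor": "BtreeCursor ..." means an index was used, "BasicCursor" means a full scan;
    // compare "nscanned" (documents examined) with "n" (documents returned)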
[07:33:14] <LouisT> so, if i need faster read/write access, would it be a bad idea to use ramdisk for mongodb? say i don't really care about keeping data on system restarts or w/e.. could always just make backups every X minutes or w/e..
[07:34:18] <joannac> would the speed of read/writes outweigh the backing up (writing your whole dataset to disk) every X minutes?
[07:37:40] <LouisT> well, i could probably figure out a way to do something like rsync or w/e... idk
[07:54:20] <zamnuts> LouisT, tail the oplog?
[07:54:35] <LouisT> zamnuts: what?
[07:55:08] <zamnuts> LouisT, you were talking about rsync'ing your db that would be stored on a hypothetical ram disk... i'm saying just tail the oplog
[07:55:31] <LouisT> how would that work?
[07:57:12] <zamnuts> instead of rsync. http://stackoverflow.com/a/9692678/1481489 start there
[07:57:18] <zamnuts> LouisT, ^
[07:58:58] <LouisT> zamnuts: yea but that wouldn't really help.. i could probably just set up mongodb on another server and use that there instead of reading the oplog manually but not really what i wanted to do
[10:27:20] <PirosB3> hi all, is it possible to select a super-set of a query in the aggregate pipeline
[10:28:07] <PirosB3> basically, I want a way to ask mongo: “get me ALL the coordinates of a specific ID, where at least one coordinate is near this point”
[10:28:54] <PirosB3> with geonear it will correctly select the coordinates I want, but it will only return the coordinates that match
[11:19:03] <Lope2> I realized that I want to make one of my keys the _id field in my collection. But when I tried to rename myKey to _id, mongoDB told me _id is immutable.
[11:19:33] <Lope2> how can I easily dump the collection from the DB, run a search/replace in a text editor, and import it again?
[11:36:25] <joannac> Lope2: you basically have the steps, i'm not sure what else you're looking for?
[11:36:55] <joannac> mongoexport / mongoimport ?
[11:38:22] <ssarah> hei guys, i make stuff like "sudo mkdir -p /srv/mongodb/rs0-0 /srv/mongodb/rs0-1" for my replica set. then how do i change its chown / chmod to make it work with mongo
[11:38:36] <ssarah> i'm doing chmod -R 0777 , but that's no good
[11:39:36] <Lope2> joannac: I used something like this and it wiped out my entire collection... oops. Luckily it was only 49 docs and I had dumped it in the terminal beforehand so I could just insert it again. db.status.find().sort({foo:1}).forEach(function(doc){ var id=doc._id; doc._id=doc.UserId; db.status.insert(doc); db.status.remove({_id:id}); })
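A more defensive variant of the snippet above: copy into a staging collection first so a partial run cannot eat the original, then swap. "UserId" comes from Lope2's snippet; the staging collection name is hypothetical.
    db.status.find().forEach(function (doc) {
        doc._id = doc.UserId;           // promote the key to _id
        db.status_new.insert(doc);      // write to a staging collection, not the source
    });
    // once db.status_new.count() matches db.status.count(), swap it in:
    // db.status_new.renameCollection("status", true)   // true drops the old collection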
[11:39:40] <kali> ssarah: you need to find out which user runs mongo (probably "mongo" or "mongod")
[11:39:57] <ssarah> i just list users?
[11:41:25] <kali> ssarah: it would be safer to check the startup scripts
[11:49:19] <ssarah> how does one do that, mr kali?
[12:31:19] <scruz> hello again.
[12:33:50] <scruz> i’m looking at the aggregation pipeline for doing some analysis on some documents.
[12:34:49] <scruz> suppose my documents contain a field, myList, that could contain any subset of the list [1..5]
[12:35:35] <scruz> how might i count how many occurrences of each item were in the set of documents?
[12:36:14] <scruz> a possible output would be {1: 15, 2: 34, 3: 0, 4: 27, 5: 12}
[12:36:59] <scruz> oh, right. there’s $unwind
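The $unwind route scruz lands on, sketched; the collection name is illustrative. Values that never occur simply don't appear in the output, so an entry like "3: 0" has to be filled in client-side.
    db.docs.aggregate([
        {$unwind: "$myList"},
        {$group: {_id: "$myList", count: {$sum: 1}}}
    ])
    // yields documents like {_id: 2, count: 34}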
[12:40:47] <Coffe> Howdy
[12:44:43] <Coffe> someone have time to explain light design hints to a sql noob? as i cant seem to break my habit of relations .. having X sessions that connect to X number of maps.. and connected to that map there is X number of zones in..
[12:47:32] <kali> Coffe: you need to think in terms of the queries you need to make efficient and the one you can afford to be expensive
[12:48:16] <Coffe> maps will be the small post.. tops 3000 points..
[12:48:50] <Coffe> today a sector has the id of the track saved in it.. feels like the wrong way of doing it.
[13:31:01] <ssarah> hei, can someone point me to docs on batch inserts?
[13:33:07] <lqez> ssarah: just put docs in one insert command.
[13:33:07] <lqez> http://docs.mongodb.org/manual/reference/method/db.collection.insert/
[13:33:25] <lqez> Or, if you're on 2.6+, you can use dedicated bulk methods.
[13:34:03] <lqez> http://docs.mongodb.org/manual/reference/method/Bulk.insert/
[13:34:19] <lqez> and drivers also support these bulk operations.
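The two forms lqez links, side by side; the collection name is illustrative.
    db.things.insert([{a: 1}, {a: 2}, {a: 3}]);        // array form of insert()
    var bulk = db.things.initializeUnorderedBulkOp();  // 2.6+ Bulk API
    bulk.insert({a: 4});
    bulk.insert({a: 5});
    bulk.execute();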
[13:46:05] <LinkRage> I added a replica set to be part of a shard by executing rs.add("test-sh03-rs/server3:27017,server1:27016") - server3:27017 is always the primary/priority=2 . But why then sh.status() shows test-sh03-rs's members in the opposite order? - output here: http://pastie.org/private/jwpcuzwuf3ygqaqyhhk0nq
[14:08:11] <lipiec> Hi! I have the following problem: https://groups.google.com/forum/#!topic/mongodb-user/m2fYbA4C-go
[14:08:32] <lipiec> Is anyone aware of what could go wrong?
[14:10:07] <scruz> hmm
[14:10:27] <scruz> seems the aggregation framework distinguishes between 2 and ‘2’
[14:11:51] <scruz> http://imgur.com/UioCGnu
[14:12:25] <kali> scruz: of course it does. mongodb does.
[14:13:57] <scruz> kali: javascript doesn’t
[14:14:43] <scruz> at least, not using ==
[14:18:27] <kali> scruz: mongodb is only marginally using javascript (in the shell or map/reduce). all the internal stuff works with BSON data, and BSON is typed.
[14:55:25] <scruz> kali: thanks.
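Concretely, because BSON values are typed, these two queries match different documents (collection and field names are illustrative):
    db.items.find({grade: 2})     // matches {grade: 2}, the number
    db.items.find({grade: "2"})   // matches {grade: "2"}, the string; neither matches the other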
[16:04:58] <Lonesoldier728> hey
[16:05:13] <Lonesoldier728> does anyone know if mongoose supports text search or if I have to use the plugin
[16:13:05] <GothAlice> kali: Ostensibly JavaScript (and PHP) are bad and should feel bad for treating them the same. See pages 80 and on, notably §11.9.3 and §11.9.4. It's insanity like that that forced me to prefix hex-encoded hashes with 'x' to prevent them being compared *numerically* in one application.
[16:13:17] <GothAlice> kali: (of http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf)
[16:13:54] <kali> GothAlice: you're preaching to the choir :)
[16:14:26] <GothAlice> 0A49 != 0CE7 (but they are under some circumstances…) *rips hair* ;)
[17:14:15] <GothAlice> http://cl.ly/image/2Y1t1W022p2x coded in time for the presentation. 16 aggregate queries gave their short, short lives (3-10ms, 4ms on average) to produce that.
[19:47:20] <boutell> hi. I needed to process results from mongo queries in parallel, but only up to a certain limit, and there wasn’t really an API for that, so I wrote an npm module: https://github.com/punkave/broadband
[19:47:32] <boutell> it’s nice if you don’t want to pull all the data into RAM first and feed it to async.eachLimit.
[20:40:10] <agend> hi, is there a channel where i could ask some general question about databases?
[20:41:02] <agend> i'm looking for a db which would be optimized for key value storing - where value would always be a number with increment function
[20:48:45] <Goopyo> agend: in memory?
[20:49:43] <Goopyo> agend: use redis. Though internally it stores ints as strings, it does allow for atomic incrementing of them (this is only a concern because your ints consume more memory)
[20:51:28] <bttf> A little confused ... I just kicked off mongod for the first time, then created an admin user. After that, I was still able to connect to mongod with mongo without using any credentials. This shouldn't happen, right?
[20:51:50] <GothAlice> bttf: Did you enable authentication on the command line or in the configuration file (and remembered to specify the configuration file on the command line…)
[20:52:10] <GothAlice> bttf: http://docs.mongodb.org/manual/tutorial/enable-authentication/
[20:52:26] <agend> Goopyo: nope, disk persistent
[20:52:30] <bttf> I just ran 'mongod' when I started the daemon, so no I don't think I enabled authentication on command-line nor did I load a config file
[20:52:45] <bttf> I'll take a look, thx GothAlice
[20:52:51] <GothAlice> bttf: That tutorial should have you covered. :)
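The gist of that tutorial, as a sketch: create an administrative user, then restart mongod with authorization enabled. The user name and password below are placeholders.
    use admin
    db.createUser({
        user: "admin",
        pwd: "choose-a-real-password",
        roles: [{role: "userAdminAnyDatabase", db: "admin"}]
    })
    // then restart with `mongod --auth` (or security.authorization: enabled in the config file)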
[20:53:03] <Goopyo> agend: Redis has disk persistence too. It's probably the fastest option for what you need
[20:53:35] <agend> Goopyo: how would redis do with terabytes of data?
[20:53:47] <Goopyo> ah
[20:53:59] <Goopyo> you want a distributed kv like cassandra
[20:54:14] <Goopyo> http://www.datastax.com/documentation/cql/3.0/cql/cql_using/use_counter_t.html
[20:54:14] <agend> Goopyo: if u say so :)
[20:54:25] <GothAlice> agend: As an alternate, I store per-document arbitrary key/value data as mappings or sometimes lists of nested {key: …, value: …} type things (depending on index needs). (Mostly for my CMS, where the arbitrary additional key/value pairs are tied to a more concrete document.)
[20:54:51] <GothAlice> agend: And I've got 24TiB of "arbitrarily key-value tagged" data at home. ;)
[20:55:04] <Goopyo> Although a sharded mongo will probably work too, Cassandra is mostly designed
[20:55:08] <Goopyo> for what you're doing
[20:55:14] <GothAlice> (In a MongoDB GridFS-based FUSE metadata filesystem.)
[20:55:36] <agend> I need something really fast with writes/upserts
[20:55:43] <Goopyo> GothAlice: are keys compacted now?
[20:56:10] <GothAlice> Goopyo: https://gist.github.com/amcgregor/1ca13e5a74b2ac318017#file-sample-py Mine are. :)
[20:56:26] <Goopyo> if you have terabytes of k/v data that are mostly ints, storing {key: …, value: …} would probably be 98% of your storage
[20:56:31] <GothAlice> (Each key in my documents takes 6 bytes exactly.)
[20:56:53] <agend> im using mongo now but it's not fast enough - and I thought all i need is string -> number with inc functionality
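For reference, the MongoDB shape of what agend describes is an upserted $inc, which is atomic per key; whether it keeps up at terabyte scale is exactly the concern in this discussion. The collection and key are illustrative.
    db.counters.update(
        {_id: "some-key"},
        {$inc: {value: 1}},     // create the counter if missing, otherwise bump it
        {upsert: true}
    )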
[20:57:03] <GothAlice> Goopyo: Yeah, I tune specific metadata values depending on need and sparseness in the dataset.
[20:57:21] <Goopyo> I like mongo, but definitely wouldn't recommend it for this... Hammer issue
[20:57:47] <agend> so cassandra?
[20:58:01] <Goopyo> yeah look at Cassandra or Hadoop or something like that. I'd go cassandra
[20:58:02] <GothAlice> Goopyo: I also use MongoDB to replace zeromq/rabbitmq. ;^)
[20:58:14] <Goopyo> RabbitMQ cool, ZeroMQ?
[20:58:23] <Goopyo> btw this schema really looks like stock market OHLC data ;)
[20:59:20] <agend> would leveldb also work for my case?
[20:59:42] <Goopyo> yup but it's probably going to be a lot less user friendly
[20:59:49] <GothAlice> Goopyo: Back in the day I joined a project that had Postgres, RabbitMQ, ZeroMQ, Redis (+ Celery), and membase. RabbitMQ had persistence, ZeroMQ was lower-latency. After some benchmarking and stress testing of failure scenarios, we replaced all of the aforementioned with MongoDB (for a 1M simultaneous user benchmarked Facebook game).
[21:00:42] <Goopyo> ah. Probably small zmq functionalities replaced. That is a full-fledged networking library though.
[21:01:15] <Goopyo> I was like "you do multicast hwm pub sub using mongodb?"
[21:01:57] <GothAlice> Goopyo: Yes.
[21:02:04] <GothAlice> Goopyo: But indirectly. ;)
[21:02:52] <GothAlice> Goopyo: Most of our pub/sub for per-user eventing is handled entirely in Nginx, see: http://ecanus.craphound.ca/chat/ >:D
[21:03:15] <GothAlice> It's the 1.9 million dRPC requests per second replacement for Celery that really makes use of MongoDB capped collections (as a replacement for other queues).
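The capped-collection-as-queue pattern GothAlice alludes to, in shell terms: a fixed-size collection plus a tailable, await-data cursor behaves like a lightweight broker. Names and sizes are illustrative.
    db.createCollection("task_queue", {capped: true, size: 16 * 1024 * 1024});
    var cur = db.task_queue.find()
                           .addOption(DBQuery.Option.tailable)
                           .addOption(DBQuery.Option.awaitData);
    while (cur.hasNext()) { printjson(cur.next()); }   // waits briefly for new documents before returning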
[21:04:22] <Goopyo> eventing is handled entirely in Nginx
[21:04:27] <Goopyo> got a link for info on that?
[21:04:31] <GothAlice> >:D
[21:04:43] <Goopyo> sounds very interesting
[21:04:52] <GothAlice> Goopyo: https://gist.github.com/amcgregor/2712270 is my working prototype (running on my /chat/ link, above).
[21:04:55] <Goopyo> my number 1 devops policy: if nginx can do it... make it do it
[21:05:34] <GothAlice> Goopyo: Uses webapp-mediated access to https://pushmodule.slact.net endpoints (internal redirect access only).
[21:06:02] <GothAlice> (Using X-Accel-Redirect returned if the webapp likes you; also obfuscates the channel names.)
[21:07:58] <GothAlice> Goopyo: https://github.com/bravecollective/forums/blob/develop/brave/forums/util/live.py is an example Python-side publisher for these channels; used by the forums software I wrote to live-update threads you have open when new comments are added (or comments are deleted/modified/etc.)
[21:09:36] <Goopyo> this is too cool. Thanks
[21:09:42] <GothAlice> :D
[21:29:08] <FIFOd[a]> { [MongoError: cursor killed or timed out] name: 'MongoError' } Any idea what would cause that? I'm iterating through my collection fine with cursor.nextObject() and then suddenly the cursor is killed
[21:29:45] <GothAlice> FIFOd[a]: Cursors are only allowed to live for a certain period of time before they are automatically cleaned up. You must have been .next'ing that cursor for nearly 10 minutes, though…
[21:30:00] <FIFOd[a]> GothAlice: I'm nexting for almost an hour
[21:30:10] <GothAlice> FIFOd[a]: http://docs.mongodb.org/manual/core/cursors/#closure-of-inactive-cursors
[21:30:29] <GothAlice> Ah… but you *waited* more than 10 minutes between two invocations of .next…
[21:30:32] <FIFOd[a]> GothAlice: It's not inactive though, or it doesn't seem to be.
[21:31:37] <GothAlice> Test cursor.alive (I believe). I think you'll find it's being cleaned up.
[21:32:10] <GothAlice> Well, either test cursor.alive and re-query when needed or handle the exception to re-query.
[21:34:34] <FIFOd[a]> I have like 250k items in the collection. The documentation says it needs to be idle for 10 minutes, but that's not really possible unless the cursor is exhausted. According to the docs, if the cursor is exhausted I would not see that error.
[21:37:15] <GothAlice> FIFOd[a]: I'm not sure… the cursor *may* have been exhausted. I'm not sure how .nextObject handles it in your case. (In Python raising StopIteration from a generator being consumed by a for loop will exit the loop gracefully…)
[21:37:43] <GothAlice> You may need to check cursor.alive and cursor.hasNext yourself before calling nextObject.
[21:38:00] <FIFOd[a]> The documentation implies to me that there will not be an error but that with nextObject(function(err, obj) {}); obj will be null
[21:38:55] <FIFOd[a]> http://mongodb.github.io/node-mongodb-native/api-generated/cursor.html#nextobject
[21:39:17] <GothAlice> Hmm.
[21:39:22] <GothAlice> Alas.
[21:39:43] <FIFOd[a]> documentation is pretty poor :(
[21:40:06] <GothAlice> And all of my long-running (multi-second queries) are also maxTimeMS limited… so my code expects and handles timeout errors.
[21:40:39] <FIFOd[a]> There is no default maxTimeMS?
[21:41:00] <GothAlice> AFAIK no, other than the standard 10-minute idle killer.
[21:41:14] <GothAlice> (Which is wholly separate from maxTimeMS.)
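The two knobs being contrasted, from the shell side: the 10-minute idle timeout can be disabled per cursor, while maxTimeMS caps total execution time on the server. Collection name illustrative.
    var c1 = db.big.find().addOption(DBQuery.Option.noTimeout);  // never idle-killed; be sure to exhaust or close it
    var c2 = db.big.find().maxTimeMS(5000);                      // aborts server-side after 5s instead of running on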
[21:41:29] <FIFOd[a]> I'm logging timestamps so I can see if it's 10 minutes between the last nextObject and the one that fails
[21:41:41] <FIFOd[a]> Too bad it takes an hour to recreate the failure
[21:41:43] <GothAlice> Must be doing some serious processing on that.
[21:42:37] <FIFOd[a]> GothAlice: just changing how the collections are organized. So much for being schema-less =)
[21:42:49] <FIFOd[a]> *how the objects in the collection are organized
[21:43:15] <GothAlice> FIFOd[a]: http://www.slideshare.net/montrealpython/mp24-the-bachelor-a-facebook-game — see slide 13. XD
[21:43:34] <GothAlice> Schema-free migrations are a pain point for many. :)
[21:43:49] <sheki> hey guys does anyone have any thoughts on why "$and" queries are not picking the right index.
[21:44:11] <sheki> i posted to the mailing list to no avail
[21:44:17] <Boomtime> sheki: can you give an example?
[21:44:43] <sheki> Boomtime: https://gist.github.com/sheki/b47180a562975993a48e
[21:44:44] <Boomtime> also, do an example query using the mongo shell and add .explain(true) to the end, put the output in a gist
[21:44:53] <sheki> ya i have the explain in the gist
[21:45:04] <sheki> each find is followed by the index
[21:45:08] <sheki> s/index/explain
[21:46:36] <florinandrei> when doing db.serverStatus(), what is the meaning of network.numRequests? I thought it's the sum of everything under opcounters.*, but obviously that's not true. Clues, please?
[21:48:30] <Boomtime> erk, competing indexes
[21:49:33] <sheki> hmm, but the implicit form does a better job.
[21:49:38] <Boomtime> you know that your "good case" is different right?
[21:49:46] <sheki> i don't
[21:49:52] <sheki> can you explain
[21:49:56] <Boomtime> "$or":[ {"_id":"NOvRVdV0ru","deviceIds":"id123"} ],
[21:50:07] <Boomtime> versus:
[21:50:08] <Boomtime> { "$or":[ {"_id":"NOvRVdV0ru"}, {"deviceIds":"id123"} ] }
[21:50:18] <Boomtime> the top one is the "good case"
[21:50:31] <Boomtime> which essentially solves to removing the $or
[21:50:35] <sheki> yes, but logically isn't it the same.
[21:50:40] <Boomtime> since there is only one actual condition
[21:50:50] <Boomtime> no, these are quite different
[21:51:04] <Boomtime> the second one is a $or, the first one is a $and
[21:51:40] <sheki> ok, in an ideal world they should return the same results, correct?
[21:51:46] <Boomtime> no
[21:52:07] <Boomtime> you know the difference between a logic "and" and a logic "or"?
[21:52:48] <sheki> satisfy both, satisfy any ?
[21:52:54] <Boomtime> right
[21:53:47] <Boomtime> if you have these two documents: a = {"_id":"NOvRVdV0ru","deviceIds":"none"}, b = {"_id":"yo","deviceIds":"id123"}
[21:54:03] <Boomtime> the first query pattern will not match either of them
[21:54:14] <Boomtime> the second query pattern will match both
[21:55:13] <sheki> i think the first one should match both, not clear why it wont
[21:55:54] <Boomtime> yes, that is exactly what i'm saying
[21:56:09] <sheki> if i had to rewrite my good case to use "$and" it would result in the same query ?
[21:56:11] <Boomtime> "$or":[ {"_id":"NOvRVdV0ru","deviceIds":"id123"} ], <- this will only match a document which has both fields set to these values
[21:56:22] <Boomtime> the $or is a red herring, there is no or condition here
[21:57:13] <Boomtime> your "bad" performing query looks like this: "$or":[ {"_id":"NOvRVdV0ru"}, {"deviceIds":"id123"} ]
[21:57:47] <Boomtime> that is a real $or, there are two possible paths to satisfying the query, indeed it must be fully evaluated twice and then the result sets merged
[21:58:03] <sheki> aha, right, good catch
[21:58:49] <Boomtime> the only reason you need $and is to include the _id field more than once, otherwise it is implicit
[21:59:21] <sheki> _id field or any other field for that matter.
[21:59:31] <sheki> it is also easier to generate $and queries programmatically.
[21:59:35] <Boomtime> right
[22:00:16] <sheki> i think there is a bug in how these queries are being generated, i will look into.
[22:00:21] <sheki> thanks for the patience
[22:00:31] <Boomtime> cheers
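To recap Boomtime's point as runnable queries (collection name illustrative):
    // one clause containing both fields: every field must match, i.e. effectively an AND
    db.users.find({$or: [{_id: "NOvRVdV0ru", deviceIds: "id123"}]})
    // two clauses: either may match, a real OR; each clause is planned separately and the
    // result sets merged, which is why index selection can differ between the two forms
    db.users.find({$or: [{_id: "NOvRVdV0ru"}, {deviceIds: "id123"}]})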
[23:09:34] <trco> Hello all. I am using mongo 2.4.8 and using the aggregation framework to group and count by a field. No problem. Now I would like to be able to a) put this into a collection (I don't have the $out command because that didn't come out until 2.6+) and b) manipulate the results (get a count of the count). Can anyone point me in the right direction?
[23:10:53] <joannac> 1. save the output of aggregation into a variable, and insert the result field into a collection
[23:11:42] <joannac> 2. just run another stage of the pipeline to get the count of the count?
[23:12:00] <trco> OH! If I could get it into a collection then that would work.
[23:12:13] <joannac> you don't need it in a collection
[23:12:17] <Boomtime> trco: do you want the results or just the "count of the count"?
[23:12:25] <joannac> you can have multiple group stages
[23:12:29] <joannac> also what Boomtime said
[23:12:49] <trco> I would like the results and then I would like to play with them. I think I actually would like them as a collection
[23:13:07] <Boomtime> right, then what joannac said
[23:13:10] <trco> How would I insert the result part of the var into a collection?
[23:13:57] <joannac> foo = db.foo.aggregate(...)
[23:14:04] <joannac> db.bar.insert(foo.result)
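Fleshed out slightly for the 2.4-era shell, where aggregate() returns a document with a result array; the pipeline and collection names are illustrative.
    var agg = db.events.aggregate({$group: {_id: "$type", count: {$sum: 1}}});
    db.typeCounts.insert(agg.result);                                          // persist the grouped counts
    db.typeCounts.aggregate({$group: {_id: null, total: {$sum: "$count"}}});   // the "count of the count"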