[00:23:53] <GothAlice> T_T Today was the first day I've ever incinerated security tokens within minutes of receiving the package. All because of this one conditional: if (!pw1.isValidated() && pw1_modes[PW1_MODE_NO81]) ISOException.throwIt(…)
[00:25:34] <GothAlice> My kingdom for a set of parentheses.
[01:10:22] <bros> How can I query for only the length of an array in a subdocument, not the actual contents?
[01:23:09] <GothAlice> Unwind emits one document per array element to the next pipeline stage, duplicating the other fields in the document.
[01:23:28] <GothAlice> I.e. {foo: 1, bar: [2, 3]} -> $unwind on bar -> {foo: 1, bar: 2}, {foo: 1, bar: 3}
[01:26:27] <bros> What do I do if I want to query certain fields of the documents I also want to aggregate?
[01:27:55] <GothAlice> bros: http://docs.mongodb.org/manual/core/aggregation-introduction/ < the documentation is quite in-depth, and there are numerous tutorials available on Google.
[01:28:14] <GothAlice> (In the case of querying fields, that's a $match pipeline stage.)
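A minimal sketch of the pipeline being described, assuming a hypothetical orders collection with an array field items (all names here are illustrative, not from the discussion):

    db.orders.aggregate([
        { $match: { status: "open" } },                      // filter first, ideally on indexed fields
        { $unwind: "$items" },                               // one document per array element
        { $group: { _id: "$_id", itemCount: { $sum: 1 } } }  // count the elements back up per order
    ])

    // On MongoDB 2.6+ the $size operator avoids the unwind entirely:
    db.orders.aggregate([
        { $match: { status: "open" } },
        { $project: { itemCount: { $size: "$items" } } }
    ])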
[01:32:23] <bros> I think the logic I want to achieve is too complicated to fit into a match/group/project sort of aggregation.
[01:32:24] <bros> for all orders matching X stores within the time span of Y and Z,
[01:32:24] <bros> loop through all orders, grouping each order's scans, misscans, and time elapsed by the user it belongs to
[01:32:24] <bros> if the order belongs to a batch, divide the time elapsed by the number of orders in the batch
[01:38:03] <GothAlice> Step 1: get it done using plain application logic. Step 2: benchmark it and find out where it's slow as a dog. Step 3: optimize the slowest bits. Then, if you attempt to refactor into a single monolithic map/reduce or aggregate, you have a "known good" process to compare against.
[01:40:31] <bros> I got it done in plain application logic. It falls apart in that it takes 3 seconds over 200 records.
[02:11:13] <JonGorrono> Anyone know a way to disable auto-creation of dbs in the mongo shell? .... every time I mistype a use <db name>... every time
[02:14:51] <joannac> JonGorrono: "use abc" does not create a database for me
[02:29:33] <JonGorrono> I guess not. I thought it was there, since it gets dropped.
[03:57:27] <dunkel2> is it possible to sort by a field string with a -number suffix? like mystring-1 mystring-2 mystring-10
[05:10:07] <dreamdust> People say skip() is not performant because mongo has to walk the whole collection… but isn't that true of any find query? What I mean is, if I have a find() query with some constraint based on an indexed field, is adding skip() to it any *less* performant?
[05:12:20] <dreamdust> IE find({ some: 'indexedField'}) and find({ some: 'indexedField'}).skip(0) should have the same performance, correct?
[06:06:26] <morenoh149> dreamdust: you should test it but I'd imagine it first does the filtering then the skipping. Wouldn't make much sense if it worked otherwise, unless mongodb intends for you to not use it that way
[06:06:41] <morenoh149> this is pretty in depth http://docs.mongodb.org/manual/reference/method/cursor.skip/
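One way to test dreamdust's hypothesis is to compare plans with explain(); a sketch assuming a collection coll with an index on some (names are illustrative):

    // both queries should show the same IXSCAN; skip() merely discards
    // the first N entries after the index scan, so skip(0) adds nothing
    db.coll.find({ some: 'indexedField' }).explain("executionStats")
    db.coll.find({ some: 'indexedField' }).skip(100).explain("executionStats")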
[10:03:37] <Johnade> hi everyone great to see a channel about MongoDB :)
[10:07:25] <Johnade> i'm using nodejs with express, i'm trying to render an image (jpg) from db to my browser
[10:23:03] <Johnade> is anyone here using mongodb with nodejs?
[11:29:48] <newbsduser> splitting data across 4 mongodb instances on the same machine VS keeping all data in a single mongodb instance on the same machine
[11:49:06] <Johnade> there are a lot of people and not so many of them talking
[12:05:07] <deathanchor> most of us are happy with our mongo usage :)
[12:08:34] <Johnade> you're the proof of late messages lol
[12:08:49] <Johnade> but yes it's cool to be happy with mongo
[12:16:42] <m4th> does anyone know if there is a clean way of doing a tail -f-like follow on a collection that is not capped (i.e. tailable cursors are not an option)?
[12:22:09] <deathanchor> yeah, but I don't think you can get a cursor that will just keep polling for new entries.
[12:22:12] <StephenLynx> why do you need it to be non capped?
[12:22:40] <StephenLynx> because I just read you can make a tail for capped collections, and capped collections will make room for new documents when needed.
[12:22:50] <m4th> StephenLynx: I have disk space, I want the collection to store as much data as possible, and I can lvresize if I ever need it
[12:23:05] <m4th> StephenLynx: I read that already
[12:23:20] <StephenLynx> sec, gonna search what is lvresize
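For reference, the capped-collection approach being discussed looks roughly like this in the legacy shell (collection name and size are illustrative):

    // create a capped collection sized to the available disk, then tail it
    db.createCollection("events", { capped: true, size: 10 * 1024 * 1024 * 1024 })
    var cursor = db.events.find()
                   .addOption(DBQuery.Option.tailable)
                   .addOption(DBQuery.Option.awaitData);
    while (cursor.hasNext()) { printjson(cursor.next()); }
    // in practice hasNext() can time out and return false, so a real
    // consumer re-opens the cursor in an outer loop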
[13:10:22] <StephenLynx> yes, for example. but I would have to look at it and try to do it by myself.
[13:10:57] <StephenLynx> it doesn't seem to be too complex to need a dependency.
[13:11:26] <Johnade> yes i used it before, it's cool. but i will work with a lot of json; it will be simple, no registration at all, just posting and reading lists of ads
[13:11:44] <Johnade> but i would like an auto refresh when there are new ads posted
[13:37:00] <StephenLynx> afaik, wiredtiger is not the default storage engine and several people wouldn't recommend it for production.
[13:37:24] <StephenLynx> even though 10gen uses it for production.
[13:48:37] <bros> How can I aggregate over multiple subdocuments in one document?
[13:49:19] <bros> I have a collection called orders with subdocuments called: log, scans, misscans, and shipments. I want to extract the size of each of these subdocuments over a set of data that's about ~300 entries long without overloading my servers.
[13:50:41] <StephenLynx> use a fixed _id for the outputted document.
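For the array-size part of the question, a $project with $size (MongoDB 2.6+) may be all that's needed; a sketch using the field names from above, with storeIds, y and z as hypothetical placeholders for the filter:

    db.orders.aggregate([
        { $match: { store: { $in: storeIds }, created: { $gte: y, $lt: z } } },
        { $project: {
            logCount:      { $size: { $ifNull: ["$log", []] } },    // $size errors on missing
            scanCount:     { $size: { $ifNull: ["$scans", []] } },  // fields, hence $ifNull
            misscanCount:  { $size: { $ifNull: ["$misscans", []] } },
            shipmentCount: { $size: { $ifNull: ["$shipments", []] } }
        } }
    ])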
[13:55:06] <jigax> hi everyone. was wondering if someone could kindly help me with this gist https://gist.github.com/fpena06/206d67acf4a5b4d3cd20
[13:57:46] <griffindy> does anyone here have experience running mongo with many many collections in one db? i'm talking about hundreds of thousands
[13:58:03] <StephenLynx> GothAlice how come I never saw OVH on your list? They have great prices from what I noticed, both for small stuff https://www.ovh.com/us/vps/vps-classic.xml and for dedicated servers.
[13:58:17] <StephenLynx> griffindy you have dynamic collection creation?
[13:59:18] <greyTEO> jigax, what help are you looking for? What is the problem?
[13:59:28] <GothAlice> StephenLynx: A VM bulk storage does not make.
[13:59:52] <StephenLynx> griffindy having hundreds of thousands is only possible if you have dynamic collection creation, which is a HUGE no no.
[14:00:15] <jigax> trying to run an async.waterfall to query some data then update it. but when i save it i don't see the update.
[14:00:46] <griffindy> StephenLynx out of curiosity, what's wrong with dynamic collection creation (not trolling, I just haven't found anything online)
[14:01:00] <StephenLynx> it's impossible to maintain.
[14:04:41] <StephenLynx> and impossible to document.
[14:04:41] <jigax> greyTEO: no they are not. i actually did a console.log after each update to see the lines, and the update was made
[14:09:03] <griffindy> StephenLynx one of my colleagues was under the impression it would offer better performance for indices fitting in memory
[14:09:52] <griffindy> instead of one giant index, parts of which could get paged out of memory
[14:10:04] <griffindy> the docs also say "Generally, having a large number of collections has no significant performance penalty and results in very good performance. Distinct collections are very important for high-throughput batch processing."
[14:13:23] <mrmccrac> maybe but if you're having to constantly query from all these multiple collections won't it be the same net result?
[14:15:50] <griffindy> i think my colleague's thoughts are if one collection is much hotter than the others, it would be easier to hold just that one index in memory, rather than a subset of the entire index, if that makes sense
[14:16:13] <griffindy> I'm also not 100% certain how mongo behaves with its indices when they can't fit in memory
[14:16:35] <GothAlice> StephenLynx: No joins needed if you aren't silly about it (i.e. client segregation means one _doesn't_ want data pollution) and it's automatically handled by the app; a new client signs up, it populates the indexes in a new collection.
[14:16:54] <GothAlice> A little "automation" goes a long way. ;)
[14:17:59] <griffindy> although at the end of the day, there seems to be a very hard cap of ~3m namespaces
[14:42:41] <jigax> guys i've honestly been googling all night to try and figure this one out and can't find much. I have an async waterfall which queries a line price and is supposed to update the line price. I can see the change in the first async call when doing a console.log, however when saving the document in the second call the document is being saved without the price that was previously updated.
[14:52:17] <jr3> is there a convention to follow on schemas? like I prefer to fully qualify an id so: new CarSchema({ carId: Number }) vs new CarSchema({ id: Number})
[15:06:58] <jigax> can someone please help me figure out why this gist isn't working as expected https://gist.github.com/fpena06/206d67acf4a5b4d3cd20 thanks
[15:10:17] <snowcode> Is it possible to use an expression inside the $group aggregate operation? Let me explain. I've a group operator which counts events by day (number of events per day): { $group: { _id: { $dayOfYear: "$time"}, eventscount: { $sum: 1 } }} Now I would like to get more detailed info: the number of events in this day with a property x > value and the number of events in this day with a property y > value2
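Conditional counts inside $group can be expressed with $sum over a $cond; a sketch built on snowcode's own stage, with events, value and value2 standing in for the real collection and thresholds:

    db.events.aggregate([
        { $group: {
            _id: { $dayOfYear: "$time" },
            eventscount: { $sum: 1 },
            xcount: { $sum: { $cond: [{ $gt: ["$x", value] },  1, 0] } },  // only counts docs where x > value
            ycount: { $sum: { $cond: [{ $gt: ["$y", value2] }, 1, 0] } }   // likewise for y > value2
        } }
    ])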
[16:30:06] <GothAlice> There are some notes regarding requirements to use that function, though.
[16:31:03] <GothAlice> For use outside of a sharded cluster, see: http://docs.mongodb.org/manual/reference/command/serverStatus/#globallock-activeclients
[16:31:11] <GothAlice> (It's just less fine-grained about what it returns, there.)
[16:36:09] <GothAlice> For connPoolStats you'll need clusterMonitor, clusterManager, clusterAdmin, or root. These are the ones I checked for that command.
[16:37:03] <tpayne> isn't there a default user that has 100% access to everything?
[16:37:10] <GothAlice> clusterMonitor and clusterAdmin also give access to serverStatus.
[16:37:30] <GothAlice> The first user you add should be an "admin" (i.e. "root" role), but it isn't actually enforced, I don't think.
[16:37:47] <fontanon> Hi everybody, I'm migrating from a single-node mongodb to a mongodb cluster. To migrate data, I temporarily added the single node as a shard, so the sharded collections started to drain data to the cluster, but ... what about the not-sharded collections? Once I remove the old single-node shard, would the not-sharded collections be copied to the rest of the shards in my cluster?
[16:59:59] <fxmulder> this is now the second time this has happened, and my week-long replica clone is starting over again
[17:01:00] <fxmulder> I'm going to have to rsync this. if there is any part of mongodb that needs to be looked at it is replica cloning, this is horrible
[17:04:12] <daidoji> hello, can I add space to the mongo data directory dynamically?
[17:04:38] <StephenLynx> I don't think it has any hard caps by default.
[18:15:47] <StephenLynx> well, how can you have a binary value for your document id? then you would just hold two documents in the collection?
[18:16:00] <StephenLynx> why did you change that in the first place?
[18:17:02] <ah-> it's a binary string, not just a bool
[18:17:31] <ah-> and it's very convenient, since my primary key/identifier for that collection is that binary string, which doesn't fit into an objectid
[18:20:07] <ah-> it's just a sha hash, and i'm pretty sure it worked in previous experiments with python
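Using a raw digest as _id does work; in the shell that is BinData with a base64 payload (collection name and the example digest are illustrative; _id may be any BSON type except an array):

    // subtype 0 is generic binary; the string is a base64-encoded SHA-256 digest
    db.blobs.insert({ _id: BinData(0, "n4bQgYhMfWWaL+qgxVrQFaO/TxsrC4Is0V1sFbDwCgg="), seen: new Date() })
    db.blobs.find({ _id: BinData(0, "n4bQgYhMfWWaL+qgxVrQFaO/TxsrC4Is0V1sFbDwCgg=") })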
[18:21:59] <StephenLynx> I find _id to be such a pain in the ass that I use another field to make my unique indexes.
[18:22:11] <StephenLynx> it is so much out of the norm
[18:22:32] <StephenLynx> even when you are projecting it has special rules.
[18:22:45] <StephenLynx> IMO it wasn't very well designed, this whole _id system.
[19:43:30] <christo_m> design question: i have users, and many queues belong to a user, and many queue items belong to a queue. should these all be nested subdocuments?
[19:48:31] <GothAlice> http://www.javaworld.com/article/2088406/enterprise-java/how-to-screw-up-your-mongodb-schema-design.html goes into a few of the factors involved in choosing when to embed.
[19:50:21] <GothAlice> From one app I wrote, MongoDB-powered forums, embedding replies to a thread in a thread? Great idea. Embedding replies in threads in forums? Terrible idea.
[19:50:41] <christo_m> GothAlice: is that because replies will be changing more?
[19:50:44] <christo_m> i mean being added frequently etc
[19:50:57] <christo_m> cheeser: is that why you said queue items should be their own collection?
[19:51:29] <GothAlice> It's a combination of factors: when looking at one thread, the user couldn't care less about the others, so you'd need to project them out. It's also nigh-on impossible to update beyond a single nesting without losing your sanity.
[19:52:16] <christo_m> GothAlice: say i do split these collections up
[19:52:17] <cheeser> christo_m: more or less, yes.
[19:52:24] <christo_m> what's the method for joining them together after? is that what mapreduce is about
[19:52:30] <cheeser> like GothAlice said, it's a bit of a nuanced decision
[19:52:33] <GothAlice> And it'd also restrict the amount of content available per forum, not per thread. (I.e. 16MB threads = ~3.3 million words per thread. 16MB forums? 3.3 million words in _all threads in that forum_.) There are ways around this, i.e. having "continuation" documents, but that adds a lot of complexity to the app.
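A sketch of the reference-instead-of-embed shape for the forum example (all names illustrative); the "join" is simply a second query from the application:

    var forumId = ObjectId(), threadId = ObjectId(), userId = ObjectId();
    db.threads.insert({ _id: threadId, forum: forumId, title: "..." })  // thread references its forum
    db.replies.insert({ thread: threadId, author: userId, body: "...", posted: new Date() })
    // fetch a page of one thread's replies without touching any other thread
    db.replies.find({ thread: threadId }).sort({ posted: 1 }).limit(50)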
[21:41:43] <drags> anything recommended for visualizing shard layout besides shard-viz? browsed through mongodb-tools.com and been searching, but not coming up with much
[22:13:32] <greyTEO> GothAlice, you mentioned you replaced redis(and/or)memcache with mongo. Is it as simple as creating a collection for calculated data? Same as memcache?
[22:13:47] <greyTEO> What approach did you take to purge it?
[22:14:25] <GothAlice> greyTEO: Depends. If you want to replicate a redis queue, use a capped collection. If you want to replicate a memcache auto-expiring key-value store, use TTL index (with zero delay) on a date field you set to the expiry time.
[22:15:30] <greyTEO> I am wanting to avoid introducing a new system that I have to manage. I think I can do it in mongo.
[22:17:52] <GothAlice> greyTEO: As a note, MongoDB TTL indexes are culled once a minute, and if a pass doesn't complete in time it won't clean up all expired data all at once.
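The TTL variant GothAlice describes is just an index option; a sketch with illustrative names:

    // expireAfterSeconds: 0 means "delete once the date stored in `expires` has passed"
    db.cache.createIndex({ expires: 1 }, { expireAfterSeconds: 0 })
    db.cache.insert({ _id: "some-key", value: 42, expires: new Date(Date.now() + 60000) })
    // the TTL monitor only runs about once a minute, so reads should
    // also filter on the expiry date themselves:
    db.cache.find({ _id: "some-key", expires: { $gt: new Date() } })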
[22:18:31] <GothAlice> greyTEO: Ref. my own cache implementation's "getter" which needs to handle the TTL edge case: https://github.com/marrow/cache/blob/develop/marrow/cache/model.py#L115-L133
[22:19:31] <GothAlice> Wait; that's not the right method.
[22:27:12] <GothAlice> ^_^ This lib is 100% tested across Python 2.6+ and Pypy2+ and is used in production at the moment. It's MIT. Feel free to steal any ideas that seem good. ;)
[22:36:54] <greyTEO> I will definitely reference it. Solid extension. I'll have to figure out how to navigate through python...
[22:46:20] <bros> GothAlice, Thanks for helping me bring my server usage from 100% to 5% and response times from 3s to 30ms with aggregation.
[22:51:54] <bros> How do the aggregations look? Ok?
[22:52:18] <GothAlice> bros: The .0. stuff is… potentially concerning. (I take it you are intentionally only checking the first array element in that first $match, line 70?)
[22:52:24] <bros> There's no way to check the last element of a subdocument, correct?
[22:52:35] <GothAlice> bros: There sorta is, using aggregation and some $unwind tricks.
[22:53:08] <GothAlice> $unwind on the array, $last the values you care about in a $group stage, bam, you'll (eventually) get the last value of each document's array.
[22:53:37] <GothAlice> Too many types of projection goin' on here. ^_^
[22:56:19] <bros> GothAlice, what can I do instead of the projection?
[22:56:38] <GothAlice> Wait, no, that was a comment on me getting confused, not your code.
[22:57:22] <bros> I would have never known about $project if it wasn't for you. All of the tutorials/documentation tried to lead me toward $group, it felt.
[23:04:42] <GothAlice> For certain edge cases involving lists, an $unwind/$group pair is effectively the only solution. I.e. your "get the last array element" issue.
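Concretely, the $unwind/$group/$last trick for "last element of an array" might look like this (orders and scans are illustrative names):

    db.orders.aggregate([
        { $unwind: "$scans" },  // elements flow through in original array order
        { $group: { _id: "$_id", lastScan: { $last: "$scans" } } }  // $last is a $group accumulator
    ])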