PMXBOT Log file Viewer


#mongodb logs for Monday the 19th of January, 2015

[00:12:16] <jumpman> Is there a way to structure an .update so I can use $addToSet on a field an object may or may not have?
[00:12:35] <jumpman> Currently: DAL.UsersDAO.collection.update({_id: userId}, {$addToSet: {roles: permission}});
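
As it happens, $addToSet already covers jumpman's case: it creates the array when the field is absent, and only appends the value if it isn't already present. A minimal sketch reusing the names from the question above:

    // Creates "roles" as a one-element array if the document lacks the field;
    // otherwise appends "permission" only when it is not already in the array.
    // (It errors only if "roles" exists but is not an array.)
    DAL.UsersDAO.collection.update(
        { _id: userId },
        { $addToSet: { roles: permission } }
    );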
[00:22:07] <jayjo> Boomtime: Thanks for the response, sorry I had walked away
[00:24:22] <jayjo> so I'll use the one-to-many relationship with document references, but if I go this way I'll be keeping objects connected through the objectid. Should I create a separate id variable to be used in my url and keep the reference through the objectid?
[00:25:10] <jayjo> Sorry - dumb question. I've figured it out just reading it again
[02:29:21] <huleo> hi
[02:29:42] <huleo> this surely will be simple - I'm taking my first steps with aggregation framework
[02:30:15] <huleo> using $geoNear operator as first element of the pipeline
[02:30:41] <huleo> which is cool - it takes "query" argument, to actually use it when querying for documents to process
[02:31:19] <huleo> but sometimes I'm not querying by geo-location... how do I "find" documents with a basic, simple query, to pass further down the pipeline?
[02:31:40] <joannac_> erm, just, $match ?
[02:32:04] <joannac_> same syntax as you would pass to a normal db.collection.find()
[02:32:15] <huleo> ooh, again
[02:32:17] <huleo> just found it
[02:32:19] <huleo> yup, match
[02:32:21] <huleo> $match*
[02:32:30] <huleo> always amazes me how quickly I can find solution to my problem
[02:32:34] <huleo> after I ask about it on IRC
[02:32:37] <huleo> :P
[02:32:52] <joannac_> clearly irc channels have magic powers ;)
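
For reference, a minimal sketch of joannac_'s suggestion, with hypothetical collection and field names; $match takes exactly the filter document you would hand to find():

    db.items.aggregate([
        { $match: { qty: { $gt: 13 } } },                 // same filter as db.items.find({ qty: { $gt: 13 } })
        { $group: { _id: "$category", n: { $sum: 1 } } }
    ])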
[02:37:29] <Jonno_FTW> hello, can I store a date-independent timestamp?
[02:40:05] <huleo> hmm
[02:40:08] <huleo> [{"$match":{"something.smthelse":{"$gt":13}}}]
[02:40:22] <huleo> this doesn't work - empty result
[02:40:37] <huleo> works fine with .find()
[02:40:41] <huleo> hmm
[02:43:20] <huleo> [{"$match":{"something":{"$elemMatch":{"smthelse":{"$gt":"2005-07-17T20:23:22.247Z"}}}}}]
[02:43:22] <huleo> this neither
[02:43:30] <huleo> (query by date field)
[02:44:23] <joannac_> huleo: can you pastebin a document that matches with a find(), but not the aggregation $match?
[02:44:31] <joannac_> also, your full aggregation pipeline
[02:45:08] <huleo> joannac_: one sec
[02:50:47] <huleo> http://jsfiddle.net/g5m4zyd7/
[02:52:29] <huleo> oh...
[02:52:57] <huleo> I was passing a numeric timestamp to the query instead of a new Date object
[02:52:59] <huleo> (js here)
[02:53:25] <huleo> that's the difference here that I didn't account for
[02:55:56] <joannac_> huleo: yeah, your elemMatch works for me here. I was about to ask why you're storing as strings rather than Date objects
[02:56:49] <huleo> joannac_: it's something like "publishedSince" filter, value of it given as number of seconds
[02:57:02] <huleo> therefore Date.now - publishedSince was just simpler
[02:57:07] <huleo> and didn't make any difference, until now that is
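
A sketch of the fix huleo describes (names hypothetical, except publishedSince): a BSON Date field only compares against a Date, so the computed cutoff has to be wrapped in new Date() rather than passed as a bare number:

    // publishedSince is a window in seconds
    var cutoff = new Date(Date.now() - publishedSince * 1000);   // Date object, not a raw timestamp
    db.items.aggregate([
        { $match: { published: { $gt: cutoff } } }
    ]);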
[03:03:25] <huleo> hmm
[03:03:34] <sgo11> hi, I just want to log all user queries. I did "db.setProfilingLevel(2)". When I do "db.system.profile.find().pretty()", it shows many queries against "test.system.indexes" and "test.system.profile" (I don't know what triggers those queries). After a few minutes there were 400 queries against the "indexes" and "profile" namespaces compared to only 7 against my actual collection. How can I disable logging of the "indexes" and "profile" queries?
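
sgo11's question goes unanswered in the log; one workaround (a sketch, not necessarily the only option) is to filter the system namespaces out when reading the profile collection, rather than at capture time:

    db.system.profile.find({
        ns: { $nin: ["test.system.indexes", "test.system.profile"] }
    }).pretty()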
[03:03:42] <huleo> $project - will I actually be able to /exclude/ specific fields, instead of including them one-by-one?
[03:03:47] <huleo> (not talking _id here)
[03:15:10] <lisangang> hi room
[03:16:04] <TheAncientGoat> Being lazy here, but does anyone have an estimate of the perf/efficiency benefits of using the aggregation pipeline over just manually grouping fetched data in node for a dataset of a couple k docs?
[03:27:34] <huleo> TheAncientGoat: ++
[03:28:21] <huleo> TheAncientGoat: all I can speak about is my case - and here aggregation saves me plenty of tinkering to get distance of each result (2dsphere) from the point I'm querying against
[03:28:24] <Boomtime> TheAncientGoat: that has a lot of variables
[03:28:56] <Boomtime> huleo: that sounds like you are benefitting from the geo API not strictly the aggregation part
[03:30:21] <huleo> Boomtime: do you know any other way to get calculated distance in the results?
[03:30:41] <huleo> (aside from calculating it myself)
[03:31:20] <TheAncientGoat> My case is basically grouping transactions by day, a lot less LOC through aggregation, but not sure what the perf considerations are regarding it
[03:33:09] <Jonno_FTW> is it wise to have very large documents? for example, my data looks like: site_number: {date:[{2010-01-01:[{time:10:00,readings:[{detector:1,count:50}]}]}]}}, where there are ~2m individual readings, spread across different sites, dates and times
[03:35:04] <Boomtime> huleo: I assume you mean you specifically use "distanceField" in your pipeline, although that is aggregation specific, it is the geo engine that supplies the value and does the calculation - for you, this does mean that aggregation has significant convenience over doing the calcs client-side
[03:35:50] <huleo> Boomtime: exactly, no way to get distanceField with "regular" find() AFAIK
[03:36:03] <Boomtime> clearly however, the calculation of distance is performed in a regular query too, since $geoNear does precisely this - you just don't have access to the value
[03:36:09] <huleo> (other than calculating it manually all over again)
[03:36:14] <Boomtime> right
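
A sketch of the pattern under discussion, with hypothetical names: $geoNear must be the first stage of the pipeline, distanceField names the output field that receives the computed distance, and query applies an ordinary filter alongside the geo search:

    db.places.aggregate([
        { $geoNear: {
            near: { type: "Point", coordinates: [ -73.99, 40.73 ] },
            distanceField: "dist",        // computed distance lands here
            spherical: true,              // use the 2dsphere index
            query: { category: "cafe" }   // plain find()-style filter
        }}
    ])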
[03:40:21] <Boomtime> TheAncientGoat: there are still lots of variables, and it depends also on what you actually desire to "optimize" - do you want to save network bandwidth or server resources?
[03:44:27] <TheAncientGoat> I have a lot of concurrent, low bandwidth queries for other data. Not sure if I'll be choking at bandwidth or CPU first, but the connections to the clients could be pretty slow, so both are a factor. Aggregation will save bandwidth, I can see that, I just don't know at what cost
[03:46:09] <Boomtime> you're not going to get a definitive answer, it depends on CPU, memory, disk IO (current load of previous three, and how much spare), how many documents are involved, how frequently this is run...
[03:46:17] <Boomtime> you should test it
[03:46:32] <Boomtime> the pipeline you describe sounds pretty straightforward
[03:47:57] <Boomtime> if you match, sort and group on fields which are indexed, then the pipeline can take advantage of that - something the remote client cannot - but the difference might not matter, depending on the numerous variables I mentioned before
[03:48:28] <Boomtime> aggregation is probably the right thing to do, but you should test it
[03:49:36] <Boomtime> btw, it's not just bandwidth, it's also latency - if you have more than a few hundred documents, the cursor has to go back to the server to get more - if you group these, you send less and subsequently get fewer round-trips to the server
[03:50:32] <TheAncientGoat> That's good insight, thanks
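
A sketch of the day-grouping pipeline TheAncientGoat describes, with hypothetical names; the grouping happens server-side, so only one document per day crosses the wire:

    db.transactions.aggregate([
        { $match: { date: { $gte: ISODate("2015-01-01") } } },   // can use an index on date
        { $group: {
            _id: { y: { $year: "$date" }, m: { $month: "$date" }, d: { $dayOfMonth: "$date" } },
            total: { $sum: "$amount" },
            n: { $sum: 1 }
        }},
        { $sort: { "_id.y": 1, "_id.m": 1, "_id.d": 1 } }
    ])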
[04:10:17] <Jonno_FTW> anyone?
[04:11:10] <morenoh149> Jonno_FTW: depends
[04:11:30] <morenoh149> 2m entries in a single document? how large is the whole document?
[04:11:46] <morenoh149> if it exceeds 16mb you need to fragment it
[04:12:42] <Nilium> I've had tons of fun with 16mb documents.
[04:14:27] <morenoh149> Nilium: you're asking for trouble. Just make it a top-level document
[04:15:01] <Nilium> It's not my doing.
[04:15:15] <Nilium> I'm just maintaining the awful code that produced said documents, and it makes me want to do a sick flip off a building.
[04:16:17] <Nilium> Nobody should ever have to experience the things I've experienced.
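
For data shaped like Jonno_FTW's, the usual way to stay well under the 16mb cap (a sketch, field names hypothetical) is one small document per reading instead of one huge document per site:

    db.readings.insert({
        site: 1,
        ts: ISODate("2010-01-01T10:00:00Z"),
        detector: 1,
        count: 50
    });
    db.readings.ensureIndex({ site: 1, ts: 1 });   // ~2m small documents query fine with this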
[09:34:35] <sjose_> Hi, how to update a value in an embedded document..?
[09:40:18] <sjose_> Hi, how to update a value in an embedded document..?
[09:40:53] <sjose_> Hi, how to update the value of an item in an embedded document..?
[09:46:08] <Derick> db.colname.update( { _id: ... }, { '$set' : { 'fieldname.subfieldname' : 42 } } ); ought to do... it's here in the docs: http://docs.mongodb.org/manual/reference/method/db.collection.update/#update-parameter and http://docs.mongodb.org/manual/core/document/#document-dot-notation
[10:40:04] <nelasx> hi! my mongodb DB got big, is there a way to shrink the white space and reclaim HDD space?
[10:42:51] <kali> a repair can help, for a while
[10:43:43] <kali> if you have a replicated setup, you can also do a full resync of the secondaries
[10:43:57] <nelasx> kali: i have a 3 shard cluster
[10:44:24] <kali> ok, what are the shards made of? single servers or replica sets?
[10:44:36] <nelasx> kali: single servers
[10:44:59] <nelasx> kali: i will just extend the lvm, and later in the week dump the data to a new replica set
[10:46:13] <nelasx> kali: will these work out? http://blog.mongolab.com/2014/01/managing-disk-space-in-mongodb/
[10:46:51] <kali> yeah
[10:47:10] <kali> just be aware that compact and repair will trigger significant downtime
[10:47:42] <nelasx> does it take locks?
[10:47:48] <kali> yes
[10:48:09] <nelasx> so adding a new LVM disk is the best way then
[10:48:33] <kali> yeah, growing your partition is a good option
[10:49:13] <kali> moving to replica set would be a nice plan if you can afford it
[10:49:37] <kali> that way you can take down one replica at a time to run a repair (or do a resync)
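
For reference, the operations kali mentions look roughly like this (collection name hypothetical); both block the node they run on, which is why a replica set lets you rotate through members:

    db.runCommand({ compact: "mycollection" })   // per-collection; blocks the database while it runs
    db.repairDatabase()                          // whole database; needs free disk roughly the size of the data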
[10:50:00] <sjose_> Hi, how to update value of an item in an embedded document..?
[10:50:28] <kali> sjose_: $set and dot notation, i guess
[10:52:18] <sjose_> kali, I want to update an embedded document using its id. Is that possible using $set...?
[10:52:53] <kali> yes. you also need the positional operator in this case ( array.$.field )
[10:59:17] <sjose_> kali, just see my document here... http://justpaste.it/iz3p
[10:59:52] <sjose_> kali, I just want to update the trainings entry using its id
[11:00:17] <kali> sjose_: db.blah.update({ "trainings.id" : ...}, { $set: { "trainings.$.date" : ... }})
[11:00:21] <kali> something like that
[11:00:51] <sjose_> kali, let me check that...
[11:00:59] <StephenLynx> for you to use the $ in the update block, you would also have to have something to query on in the match block
[11:01:17] <StephenLynx> oh
[11:01:20] <StephenLynx> nvm
[11:01:39] <StephenLynx> it is correct
[11:02:00] <StephenLynx> but I think you would have to use $elemMatch on the match block.
[11:02:19] <kali> StephenLynx: nope. i don't think so.
[11:02:21] <sjose_> kali, StephenLynx : as trainings is an embedded document, how can I locate it first...?
[11:02:41] <StephenLynx> using dot notation, like kali wrote
[11:20:04] <sjose_> kali, if I want to update the embedded item as a whole using its id... how shall I do it?
[11:20:47] <sjose_> all fields of the embedded item together..?
[12:12:44] <sjose_> Hi, how to replace an embedded item using its id
[12:13:45] <StephenLynx> what do you mean by embedded?
[12:13:51] <StephenLynx> an object in a field?
[12:13:56] <StephenLynx> or in an array?
[12:35:16] <mnms_> I'm trying to sum a field across all documents in a collection. It takes a long time, so I added a single-field index on that field, but it didn't help either
[12:35:55] <mnms_> The collection has 90 million documents.
[12:36:11] <kali> yeah, nothing will make it significantly better. your best option is to denormalize
[12:37:00] <mnms_> kali: what does denormalize mean? Because for me it can't get more denormalized..
[12:39:40] <kali> store and maintain this sum separately
[12:42:32] <mnms_> kali: can I somehow check whether a query uses a specific index without executing the query?
[12:43:59] <mnms_> kali: because waiting until it's over is sometimes annoying
[12:45:11] <kali> the explain option should work on the aggregation pipeline if you're running 2.6
[12:46:50] <mnms_> I'm running 2.6, but I'd still need to finish the query, which takes very long
[12:47:37] <mnms_> and my question is: is there any way to check if a query will use an index without waiting till it's over
[12:48:29] <kali> the explain option is all there is
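
The explain option kali refers to, in 2.6 shell syntax (collection and field names hypothetical), returns the plan, including index selection, without running the pipeline to completion:

    db.mycoll.aggregate(
        [ { $group: { _id: null, total: { $sum: "$amount" } } } ],
        { explain: true }
    )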
[13:23:16] <zhiyajun> hi there, what's a covered query?
[13:24:53] <zhiyajun> help, please
[13:27:07] <zhiyajun> anyone know that?
[13:31:09] <zhiyajun> Is anyone there?
[13:32:45] <zhiyajun> Am I in the wrong channel?
[13:33:21] <zhiyajun> Is this not about mongodb?
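
zhiyajun's question never gets an answer in-channel: a covered query is one that can be satisfied entirely from an index, so the server never has to fetch the documents themselves. A minimal sketch with hypothetical names:

    db.users.ensureIndex({ email: 1 });
    // Covered: the filter and the projection use only indexed fields,
    // and _id is excluded because it is not part of the index.
    db.users.find({ email: "a@example.com" }, { email: 1, _id: 0 });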
[14:37:21] <sjose_> StephenLynx, Please have a look over here 'http://justpaste.it/iz7v', here 'trainings' is an embedded document. so can I update trainings with "_id" : ObjectId("54bcaf4eba09015284646f6d") with all new values in an easier way....?
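
A sketch of what sjose_ seems to be after, reusing kali's earlier pattern (collection name and replacement fields hypothetical): setting "trainings.$" replaces the whole matched array element in one step:

    db.blah.update(
        { "trainings._id": ObjectId("54bcaf4eba09015284646f6d") },
        { $set: { "trainings.$": {
            _id: ObjectId("54bcaf4eba09015284646f6d"),   // keep the same id
            date: ISODate("2015-01-19T00:00:00Z"),
            title: "replacement values here"
        } } }
    )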
[16:25:54] <Raffaele> Hello. I've noticed there seems to be some mismatch between engine_v8-3.25.cpp and engine_v8.cpp, to the point that the former doesn't even compile because of silly errors like string and set missing the std:: qualifier
[16:26:09] <Raffaele> I wonder if this sounds completely new to anybody
[17:07:54] <toter> Hi everybody... I'm running a collection against another to find missing ['company'] values. The first collection has 345 documents and the second one, 20285. This operation is taking 63 seconds to complete... Is there anything I can do to improve this execution time? Code: http://hastebin.com/ifayexesuv
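
toter's pastebin isn't reproduced here, but if the 63 seconds comes from issuing one query per document, a single set-membership query is usually much faster (a sketch, collection names hypothetical):

    var known = db.companies.distinct("company");      // from the 345-document collection
    db.records.find({ company: { $nin: known } });     // one pass instead of 345 separate queries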
[19:49:16] <sekyms> I have a question regarding Schema as it relates to this: http://docs.mongodb.org/ecosystem/use-cases/product-catalog/#schema
[19:49:40] <sekyms> that I just answered by looking more carefully.
[19:52:05] <sekyms> is there a good place to find mongodb contractors
[20:09:41] <morenoh149> sekyms: I'm a contractor :)
[20:09:51] <morenoh149> how much can you afford ;P
[20:10:42] <sekyms> well, if you are going to play that game, so can I :-p
[20:10:52] <sekyms> I can't pay you now but I can offer you pizza
[20:10:54] <sekyms> jk
[20:11:15] <tbo_> "I can't pay you but it'll look great in your CV"
[20:11:20] <sekyms> heh
[20:11:26] <sekyms> IT WILL LOOK AWESOME
[20:11:35] <sekyms> Remember those painters who painted the facebook office?
[20:11:48] <sekyms> That will never be you, stop dreaming
[20:11:49] <morenoh149> srs tho
[20:11:57] <morenoh149> the.r3dm.com <- thats me
[20:12:40] <sekyms> is there anything about you?
[20:12:45] <sekyms> besides the contact form :D
[20:15:27] <morenoh149> sekyms: PM
[20:15:30] <morenoh149> harrymoreno.com
[20:15:31] <sekyms> saw it
[20:15:41] <sekyms> how would you rate your mongo experience
[20:15:54] <sekyms> 0 - 999,997,431
[20:16:00] <sekyms> 999,997,431 being the lowest
[20:16:03] <sekyms> 0 being the highest
[20:17:39] <morenoh149> 7777777
[20:20:19] <sekyms> how about 1-10 with 10 being the highest
[20:23:23] <morenoh149> 7
[20:25:51] <sekyms> oh WPI, I see
[20:25:59] <sekyms> couldn't get into MIT?
[20:26:03] <sekyms> I KIIID I KIIIID
[20:27:52] <morenoh149> :p missed the application date actually
[20:53:57] <j0k3r_> hi guys...
[22:35:14] <FunnyLookinHat> Is it fairly trivial to move a 2/1 ( member/arbiter ) set to a 3/0 ( primary with two secondary members ) ?
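
FunnyLookinHat's question also goes unanswered; the usual shape of the change (a sketch, hostnames hypothetical) is to add the new data-bearing member first and remove the arbiter once the new member has finished its initial sync:

    rs.add("mongo3.example.net:27017")       // new secondary; initial sync follows
    rs.remove("arbiter.example.net:27017")   // retire the arbiter afterwards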
[23:10:13] <mortal1> howdy folks, say I have a collection foo. This collection has a document with an id of 1 and an object bar that I'd like to remove
[23:10:55] <mortal1> so db.foo.remove({_id:"1"},{"bar":1});?
[23:17:40] <morenoh149> mortal1: should the id be unique?
[23:17:48] <morenoh149> and you only want to remove the bar?
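
One point worth flagging for readers: remove() deletes whole documents, and its second argument is not a projection, so mortal1's command as written would delete the matched document, not just the field. Removing only the bar field is done with $unset:

    db.foo.update({ _id: "1" }, { $unset: { bar: "" } });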
[23:34:25] <liamkeily> I want to append to an array in mongo, but want to make sure it's not overwritten by another process. What's the recommended way to do this?
[23:35:16] <Boomtime> liamkeily: $push or $addToSet
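
Both operators Boomtime names are atomic per document, so a concurrent writer cannot interleave with the append; a sketch with hypothetical names:

    var id = 1, newItem = "job-42";
    db.queues.update({ _id: id }, { $push: { items: newItem } });      // always appends
    db.queues.update({ _id: id }, { $addToSet: { items: newItem } });  // appends only if absent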