PMXBOT Log file Viewer

#mongodb logs for Saturday the 5th of January, 2013

[00:24:05] <fommil> hi all – if I do an update on a document, but don't specify some of the fields, will they be deleted?
[00:25:50] <drjeats> fommil: by default, yes. use $set (I believe that's the one, check the docs) to add/remove
[00:26:05] <fommil> drjeats: ok thanks
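An illustrative sketch of the difference drjeats points at, in the mongo shell (collection and field names are made up): replacing the whole document drops the unspecified fields, while $set only touches the named ones and $unset removes one.

    // full-document update: fields not listed are lost
    db.users.update({ _id: 1 }, { name: "bob" })
    // $set: other fields are left alone
    db.users.update({ _id: 1 }, { $set: { name: "bob" } })
    // $unset: explicitly remove a field
    db.users.update({ _id: 1 }, { $unset: { nickname: 1 } })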
[00:41:04] <fommil> if I do a sorted search and go through the cursor, and this takes say 5 minutes or so – what will happen if the collection is edited in that time?
[00:45:41] <fommil> also, does the timeout work for the whole search, or just for each next() call of the cursor?
[00:45:59] <fommil> I'm really struggling to find this info from the docs
[02:02:49] <Dr{Who}> anyone know what a bson type of 120 is? I am storing what I think is a copy of a bson oid type in a BSONElement but it seems to change from 7 to 120. I am thinking it's not a copy but a pointer to the real record? so what is 120 and what is a safe way to copy a BSONElement
[02:04:58] <Dr{Who}> meh. ya, just saying BSONElement A = B does not actually copy the data, looks to be by ref. as soon as i do a c.next() it changes to 120
[02:12:22] <Dr{Who}> ok i see part of the issue. I do copy the BSONObj using .copy(), but then I extract the _id to an element; once I change my copy it breaks. So I need a real copy of my field (in my case _id), something I can keep around a bit.
[02:13:04] <Dr{Who}> or do I have to just settle with a full copy of the document?
[02:41:34] <Guest_1448> is it possible to group documents by a field and return a result with the documents keyed by field?
[02:43:50] <Guest_1448> e.g. if I have [{a: 'b', foo: 'bar'}, {a: 'b', y: 'z'}, {a: 'c', blah: 'blah'}] I want mongo to group by 'a' field and return {'b': [{a: 'b', foo: 'bar'}, {a: 'b', y: 'z'}], 'c': [{a: 'c', blah: 'blah'}]}
[02:55:56] <Guest_1448> https://gist.github.com/dc0affbc01ed075a9dd1
[02:56:05] <Guest_1448> first is the collection
[02:56:15] <Guest_1448> second part is what I want the results to be
[02:56:20] <Guest_1448> how can I do that?
[02:56:46] <Guest_1448> if I group by 'type'
[02:57:29] <Guest_1448> if I use .group(), it only returns that one key that I used for grouping
[02:57:33] <Guest_1448> I want the whole document
[03:01:55] <Guest_1448> ?
[03:11:34] <Guest_1448> is it possible?
[03:37:39] <Guest_1448> what's better - .group(..) with a function to count or .aggregate(..) with count: {$sum: 1} ?
[03:40:01] <timeturner> count is faster I believe
[03:42:40] <Guest_1448> I see, thanks
[03:56:37] <Guest_1448> and regarding my original question, I managed to do that using .group() + reduce function
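For reference, a rough sketch of the two approaches Guest_1448 compares, in the mongo shell (the 'type' field comes from the conversation above; everything else is illustrative):

    // .group() with a reduce function that collects whole documents per type
    db.coll.group({
        key: { type: 1 },
        initial: { docs: [] },
        reduce: function (doc, out) { out.docs.push(doc); }
    })
    // counting per type with the aggregation framework
    db.coll.aggregate({ $group: { _id: "$type", count: { $sum: 1 } } })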
[03:58:14] <sander__> What limitations does mongodb have compared to e.g. mysql?
[04:01:03] <sander__> No foreign keys and no joins?
[04:02:35] <sander__> How do I effectively search/combine (like joining) documents in mongodb?
[04:07:45] <timeturner> you have to do that on the client side
[04:07:47] <Guest_1448> you know there's a google
[04:08:35] <timeturner> you'll have to run queries serially on the app side to do joins
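A minimal sketch of the app-side "join" timeturner describes, in the mongo shell (collection and field names are illustrative):

    // fetch the parent document, then query the related collection yourself
    var order = db.orders.findOne({ _id: 42 });
    var customer = db.customers.findOne({ _id: order.customerId });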
[04:34:57] <Dr{Who}> collection.stats does not seem to include anything that is clearly a unique id of the collection. does such a beast exist?
[07:22:13] <dawra> http://docs.mongodb.org/manual/applications/read/
[07:22:49] <dawra> i read this, but i cannot figure out how to do a query like WHERE field != '0' AND field > 0
[08:29:35] <dawra1> got DC
[08:29:39] <dawra1> but no answers i guess
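For the record, a query of that shape is usually written with the comparison operators, e.g. in the mongo shell (field name is illustrative):

    // WHERE field != '0' AND field > 0
    db.coll.find({ field: { $ne: "0", $gt: 0 } })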
[11:55:55] <Kakera> is it possible to remove multiple documents by id?
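The question goes unanswered here; one common way is an $in match on _id, e.g. in the mongo shell (the ids are placeholders):

    db.coll.remove({ _id: { $in: [ ObjectId("..."), ObjectId("...") ] } })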
[11:57:46] <hever> Hello, since mongodb lets you use Array.unique in Map/Reduce, I'm wondering what JavaScript version or framework it runs server-side, because Array.unique is not a "standard" JS function.
[12:58:29] <wereHamster> heroux: mongodb uses spidermonkey 1.8
[13:09:37] <hever_> Can I somehow arrange that data with just one value are ignored and not part of the result?
[13:11:35] <hever_> I'm using map and reduce, and if there's data with just one value there's nothing to do for reduce (it's not even passed to reduce), so in finalize I have to change the data, but I'd rather skip that data in the result, or mark it with some flag perhaps...
[13:12:43] <kali> hever_: there is something fishy here. reduce should not change the format of the data, just aggregate it
[13:12:58] <kali> hever_: precisely because some data can go through the reduce 0 to N times
[13:13:14] <hever_> hmmm
[13:13:39] <kali> hever_: so basically, the idea is, if one key is emitted only once, its value as output of map should already be in the final format
[13:17:38] <kali> hever_: also, if you can achieve whatever you're doing with the aggregation framework, it will be way faster
[13:19:41] <kali> hever_: also, if you can achieve whatever you're doing with the aggregation framework, it will be way faster
[13:19:49] <kali> not sure you got that last time
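kali's point, sketched in the mongo shell (field names are illustrative): map must emit values in the same shape reduce returns, because a key emitted only once never goes through reduce at all.

    var map = function () { emit(this.category, { count: 1 }); };
    var reduce = function (key, vals) {
        var out = { count: 0 };
        vals.forEach(function (v) { out.count += v.count; });
        return out;   // same shape as what map emits
    };
    db.coll.mapReduce(map, reduce, { out: { inline: 1 } })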
[15:29:10] <bizzle> Does anyone know how, in java, I can keep an insert attempt from throwing an exception on a duplicate key error? I would like to just handle it in the application by checking commandResult.getLastError().ok()
[15:29:25] <bizzle> and avoid using try { } catch { }
[15:38:36] <Guest_1448> if I do db.coll.find({...}).limit(100)
[15:38:59] <Guest_1448> is it possible to find out the total amount of documents that the query would return IF I hadn't applied the limit?
[15:39:06] <Guest_1448> I just want to get a 'max' value for paging
[15:39:44] <ron> nope. you'd have to do a count() separately.
[15:40:27] <Guest_1448> can I do foo=db.coll.find({...}); foo.count() then continue calling .limit() etc on the 'foo' cursor?
[15:41:50] <ron> that, I do not know. sorry.
[15:43:40] <Guest_1448> ok, thanks anyway
[15:51:34] <skot> yes, count does not change foo, but simply copies stuff out of it to run the count command.
[15:51:57] <skot> type foo.count — without parens — to see what it does.
[15:52:23] <skot> (assuming you are using the shell …)
[15:55:25] <Guest_1448> ah good
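A minimal sketch of what skot describes, in the mongo shell (query and field names are illustrative):

    var foo = db.coll.find({ status: "active" });
    var total = foo.count();            // runs the count command; foo is not consumed
    foo.sort({ x: 1 }).limit(100);      // modifiers can still be chained before iterating
    foo.forEach(printjson);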
[15:56:11] <Guest_1448> is there a way to 'apply' the options to a cursor later then? like .find(..).options({sort:'x', limit: 200}) instead of .find(..., options)
[15:56:25] <NodeX> no
[15:56:44] <NodeX> but that is a good idea
[16:15:22] <skot> yes, there is in the shell.
[16:15:41] <skot> it is find(…).limit(…).sort(…) etc.
[16:16:02] <skot> It just mutates or changes fields on the local cursor instance.
[16:16:14] <skot> you cannot do this once you have started to return results from the cursor
[16:16:17] <skot> see the docs
[16:18:36] <Guest_1448> no, not the .limit() etc methods
[16:18:40] <Guest_1448> but like passing an object
[16:21:54] <skot> you mean changing the query?
[16:23:04] <skot> anyway, the cursor is just a holder for fields which will be sent to the server before you start getting results.
[16:27:45] <Guest_1448> not the query, the options object that holds the sort/limit/skip options
[16:28:06] <Guest_1448> I want to do a .find(query), get the count then apply the rest of the options/paging stuff and carry on
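A rough sketch of the paging flow Guest_1448 wants, again in the mongo shell (page size and query are illustrative):

    var pageSize = 100, page = 3;
    var cursor = db.coll.find({ type: "event" });
    var total = cursor.count();                        // the 'max' value for paging
    var pages = Math.ceil(total / pageSize);
    cursor.sort({ _id: 1 }).skip(page * pageSize).limit(pageSize);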
[16:42:22] <igotux> hey guys.. am getting this on a slave node :- "[replslave] all sources dead: data too stale halted replication, sleeping for 5 seconds"
[16:42:40] <igotux> can someone help me make this slave node up to date?
[16:53:23] <kali> igotux: replslave? you're still using the old master/slave replication?
[17:01:16] <igotux> kali: this is for migrating to a new RS...
[17:01:50] <igotux> currently our master is small.. so thought of porting this data to a bigger node first, and then we will make this slave node the master and start a replica set
[17:02:50] <kali> i assume this is a live system ?
[17:02:58] <kali> you can't afford much downtime ?
[17:03:52] <igotux> nope
[17:04:02] <igotux> this is a real time production system...
[17:04:05] <kali> ok
[17:04:47] <igotux> it synced almost all of the data to the new system, and around the end it started showing this error and stopped replication...
[17:05:26] <kali> so you had one master and one slave, but you outgrew them, so you're trying to replace the slave with a bigger one, and plan to use this slave as a seed for a replica set ?
[17:06:50] <igotux> our DB is growing pretty fast and now we need to convert it to a RS ... so we are using a big node and made it a slave of the current master.. while the current slave is also syncing from the current master...
[17:06:58] <igotux> but on the bigger node, am seeing this issue
[17:07:16] <igotux> this bigger node i'll use as an RS node
[17:07:33] <igotux> and add more fat nodes to this bigger node.. sounds good ?
[17:08:46] <igotux> the idea is to get rid of current master -> slave nodes
[17:09:04] <kali> sure
[17:09:21] <kali> i don't know what this error means :/
[17:09:44] <igotux> does this help :- http://pastie.org/pastes/5627196/text?key=wgf5lm3cza2yithp9mcbba
[17:09:56] <kali> master / slave is so old and deprecated, most people with live experience are dead and buried :)
[17:10:10] <igotux> lol
[17:10:39] <kali> i think i would try to degrade the current master/slave to standalone, and seed the replica set from there
[17:11:14] <igotux> you mean, start the current master with a RS name ?
[17:12:19] <kali> mmm yeah: 1/ make the current master a standalone node 2/ check everything is fine 3/ restart the standalone node with a rs name, add a node, and hope the sync will be fast enough
[17:13:17] <kali> igotux: but i've never done that: i don't know how "broken" a master/slave setup is if you try to "downgrade" it to standalone
[17:14:16] <igotux> there is no change on the master side if you make it standalone, right.. as per my understanding, a master node is a standalone node as well.. correct ?
[17:14:42] <kali> i don't know, i have no experience on the old master/slave system
[17:15:29] <kali> but i have some on RS, so at least, i might help you once we're on this terrain :)
[17:15:30] <igotux> master and slave are running on 2.2.1
[17:16:24] <igotux> kali: in the pastie, "Sat Jan 5 08:32:12 [replslave] repl: nextOpTime Jan 5 06:26:37 50e7c79d:4 > syncedTo Jan 5 05:26:56 50e7b9a0:1c"
[17:17:11] <kali> it looks like your oplog is not deep enough
[17:18:07] <igotux> 32G is the oplog size on the master
[17:18:51] <kali> what does db.printReplicationInfo() say on the master ?
[17:21:00] <igotux> kali: http://pastie.org/private/aczsefqk24euf769i0q
[17:21:50] <kali> igotux: so it's a bit less than two hours
[17:22:00] <kali> how long did the initial sync take ?
[17:22:16] <igotux> sync takes 4-5 hours
[17:22:45] <kali> ok. that's your problem
[17:22:59] <kali> well, probably
[17:24:00] <igotux> it synced most of the data... and started throwing this error only towards the end...
[17:24:51] <kali> yes, it synced to a state 4 or 5 hours stale, and the oplog is not deep enough to close the gap
[17:24:53] <igotux> can you explain the relationship between an oplog of less than 2 hours and the sync breaking, pls?
[17:25:25] <igotux> so if i restart with a bigger oplog value, will it fix this ?
[17:25:53] <kali> i think so. make the oplog 128G, assuming you have the disk space, and restart the sync
[17:25:55] <igotux> i mean if we restart the primary with a bigger oplog value, will we be able to fix it ?
[17:25:57] <kali> that should do it
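A rough sketch of the check and the restart kali suggests (sizes and options are illustrative; note that an already-allocated oplog is generally not resized by the flag alone and has to be recreated):

    // on the master: how much time does the oplog cover?
    db.printReplicationInfo()
    // restart mongod with a larger oplog, e.g. 128 GB (--oplogSize is in MB):
    //   mongod --master --oplogSize 131072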
[17:26:48] <igotux> or what about this: we sync the data manually from the backup mongodump to the new node and make the new node a slave..
[17:26:57] <igotux> that way it just needs to do a small catch-up, no?
[17:27:02] <igotux> not from the beginning
[17:27:24] <igotux> will that work ?
[17:28:50] <igotux> you got what i am trying to do ?
[17:28:55] <kali> yes
[17:29:13] <kali> but just checking the documentation, it does not say you can perform a resync with a dump
[17:29:37] <kali> and honestly, if the initial sync took 5 hours, i doubt you can perform a dump and a restore in less than two hours
[17:29:51] <kali> your best bet is to increase your oplog
[17:30:17] <igotux> ok... got it... will let you guys know my findings
[17:30:52] <igotux> btw, i can slave from a slave, right ?
[17:32:08] <kali> igotux: the docs mention slave chaining...
[17:33:07] <igotux> sure..ty
[22:15:37] <timah> hey all… new to room… welcome criticism… please /slap if necessary...
[22:15:45] <meghan> hi timah
[22:16:04] <timah> hi! not sure if smalltalk is frowned upon here.
[22:16:16] <timah> hence the brevity.
[22:17:30] <timah> i've thoroughly combed the inter webs for an answer, some clues, anything really.
[22:18:52] <timah> i'm utilizing mongodb for event logging.
[22:19:45] <timah> specifically usage events… mobile ios/android apps, mobile web, full web...
[22:21:25] <timah> my event documents look something like this: { _id: { g: 123, o: Object() }, a: 0, b: 1, c: 2, d: 3 }
[22:24:55] <timah> that's the new design… the previous design (created by someone other than myself) :) looked something like this: { _id: ObjectId(), properties: { some_key: 'some_static_option_value', some_other_long_key: 'some_other_long_static_option_value', etc: 'etc_etc' } }
[22:27:49] <timah> my new design includes the application layer handling the on-the-fly transformation (compacting, really) of both keys and values for all of this static data.
[22:31:56] <timah> by utilizing multiple collections (g.1.e - events, g.1.c - counts/aggregates), as well as including the second level grouping id in the document._id itself, i've drastically reduced the overall size of the data (dataSize?), as well as improved performance.
[22:33:22] <timah> however… it seems as though either the preallocation or nssize is forcing my uber tiny documents to still pad much more than necessary.
[22:34:23] <timah> in size that is… if i'm not being clear i apologize… it's not really all that complicated… i'm really just trying to figure out why the size on disk doesn't seem to change much, even though the size of my documents has changed a lot.
[22:35:24] <Dr{Who}> timah: did you rebuild the entire collection or just modify it on the fly?
[22:36:19] <timah> this is without any modifications post-transformation (old>new).
[22:36:25] <Dr{Who}> timah: db.collection.stats() -> pastebin
[22:36:38] <timah> kk.
[22:37:29] <Dr{Who}> timah: you may try exporting all the bson data and then re-importing, or just doing a repair on the collection, to see if it changes.
[22:38:31] <timah> i'll try that first.
[22:38:40] <Dr{Who}> big collection?
[22:39:11] <Dr{Who}> just be prepared for global locks for as long as it takes.
[22:40:02] <timah> dev. env.
[22:40:07] <Dr{Who}> k
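A rough sketch of the two options Dr{Who} mentions: dump and restore the collection's BSON with mongodump/mongorestore, or rewrite storage in place from the shell (collection name is illustrative; both hold heavy locks on 2.2, as noted above):

    // rewrite the whole database's files, reclaiming space
    db.repairDatabase()
    // or compact a single collection (2.0+); blocks the database while it runs
    db.runCommand({ compact: "events" })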
[22:43:50] <Dr{Who}> timah: be warned, btree indexes in general on any database tend to fall apart for insert speed when you get into the millions or billions of records.
[22:44:51] <timah> yeah… speaking of… so what i've done is partitioned using collections.
[22:45:11] <Dr{Who}> ya that works from my experience.
[22:46:35] <Dr{Who}> a recent downside I am finding is collections don't seem to have any unique id, so it can be hard to manage the shards.
[22:47:43] <cad> Are pymongo lists atomic? For ex. if I append something to a list in an existing document, then save() it, will this be atomic?
[22:49:20] <timah> publishers > merchants (o:m), merchants > consumers (m:m), consumers > events (m:m)
[22:49:56] <timah> p.1.e (p=publisher,1=mysqlId,e=event)
[22:50:15] <timah> p.1.c (p=publisher,1=mysqlId,c=count).
[22:51:34] <timah> p.1.e: { _id: { m: merchantId, o: objectId }, a: 1, b: 2, c:3 }
[22:52:32] <timah> p.1.c: { _id: { m: merchantId, d: timestamp }, a: 321, b: 654, c:987 }
[22:52:46] <timah> follow?
[22:53:39] <Dr{Who}> cad: until you save afaik no communication happens so the original document is not touched.
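A side note on cad's question: save() sends the whole document, so the read-modify-save round trip is not atomic with respect to concurrent writers. The usual way to make an append atomic is an in-place $push update; a sketch in the mongo shell (pymongo's update() accepts the same operators; names are illustrative):

    // append one element to the 'tags' array atomically on the server
    db.coll.update({ _id: someId }, { $push: { tags: "new-tag" } })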