[00:06:00] <nooga> or do i have to retrieve whole a
[00:16:08] <geoffeg> how do i get a secondary to reconnect to a different secondary for syncing? i just took one secondary into recovery for compaction and there are two other secondaries syncing off it, which are now falling behind
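A minimal sketch of one way to do what geoffeg asks, assuming MongoDB 2.2+ where the replSetSyncFrom command exists (host names here are placeholders):

    from pymongo import MongoClient

    # Connect directly to the lagging secondary (placeholder host/port).
    secondary = MongoClient("secondary2.example.com", 27017,
                            directConnection=True)

    # Ask it to pull its oplog from a healthy member instead of the
    # one that was taken into recovery for compaction.
    secondary.admin.command("replSetSyncFrom", "secondary3.example.com:27017")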
[01:51:36] <doxavore> To be sure I understand correctly, using the 10gen apt repo, you can only install the latest version?
[01:52:15] <doxavore> So do people actually use that, or does everyone just install the same version from the binary zip to keep their servers running the same version?
[02:17:19] <svm_invictvs> Can Mongo run in memory only?
[02:48:56] <vsmatck> svm_invictvs: wassssssup! .. Mongo uses a memory mapped file. If you keep your database smaller than your physical memory it's basically in-memory.
[02:49:11] <vsmatck> But there's no option to explicitly do it.
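Since there is no explicit in-memory mode, a rough way to check whether the data set is effectively being served from RAM is to compare the resident and mapped figures from serverStatus (the "mem" section is reported in MB under the memory-mapped storage engine); a sketch:

    from pymongo import MongoClient

    client = MongoClient()  # assumes a local mongod on the default port

    # If 'resident' stays close to 'mapped', reads are mostly hitting RAM
    # rather than faulting pages in from disk.
    mem = client.admin.command("serverStatus")["mem"]
    print("resident: %d MB, mapped: %d MB" % (mem["resident"], mem["mapped"]))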
[04:01:40] <svm_invictvs> Hm, I'm having trouble doing an upsert
[04:02:25] <svm_invictvs> If I have a document like {_id:"some_id", foo:"bar"} and I want to ensure that for all documents foo is unique, how would I go about doing that?
[04:02:41] <svm_invictvs> I'm looking at examples for findAndModify, and I'm striking out.
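For what svm_invictvs seems to want, a unique index enforces uniqueness of foo and an upsert avoids a find-then-insert race; a sketch (database, collection, and values are placeholders):

    from pymongo import ASCENDING, MongoClient
    from pymongo.errors import DuplicateKeyError

    db = MongoClient().test  # placeholder names

    # Enforce that no two documents share the same 'foo'.
    db.things.create_index([("foo", ASCENDING)], unique=True)

    # Upsert: update the document if it exists, insert it otherwise.
    try:
        db.things.update_one({"_id": "some_id"},
                             {"$set": {"foo": "bar"}}, upsert=True)
    except DuplicateKeyError:
        pass  # another document already holds foo == "bar"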
[07:59:20] <ShishKabab> A unit test I wrote just broke in a way that puzzles me. I think it is best explained by some (Python) code. Could anyone tell me why this fails? http://pastebin.com/iJNcKrP9
[08:00:10] <NodeX> are you likely to be doing 250million inserts in a 4gb windows 7 box constantly ?
[08:13:04] <cubud> I'd be likely to scale it up to about 32GB, but I would expect Mongo to degrade performance rather than to crash
[08:21:20] <cubud> Using YouTube video comments as an example. If a web page were to show a list of 10 videos including "Total number of comments", "Total thumbs up", and also "Total thumbs down" would I run a map-reduce on the action data (liked, commented, etc) or would it be better to periodically count them up and store them in the Videos collection object?
[08:22:48] <NodeX> I would store that hot data inside an embedded object in the document
[08:22:58] <NodeX> then page the rest out to a comments collection
[08:23:31] <cubud> So a new comment would go into a comments collection + store the most recent 20 in the video? Or do you mean store the comment count in the video?
[08:23:35] <NodeX> I would strongly suggest you store the count of the comments in the document too as count() of collections when using queries is costly
[08:27:37] <cubud> I am just reading a page on conflict resolution, it seems that there might be a programmatic way of dealing with 2 comments added to the same video on different DB servers
[08:27:46] <NodeX> it does mean you'll have 20 comments of duplicate data for every video but it stops the need for a second query and HDD's are cheaper than RAM so it will scale more efficiently
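A sketch of the write path NodeX is describing: the full comment goes to a comments collection, while a hot copy is pushed onto the video document and the cached counters are bumped (all field and collection names are made up for illustration):

    from datetime import datetime, timezone
    from pymongo import MongoClient

    db = MongoClient().test  # placeholder names

    def add_comment(video_id, user, text):
        comment = {"video_id": video_id, "user": user, "text": text,
                   "at": datetime.now(timezone.utc)}
        db.comments.insert_one(comment)  # full history lives here
        db.videos.update_one(
            {"_id": video_id},
            {"$push": {"RecentComments": comment},  # hot copy in the doc
             "$inc": {"CommentCount": 1}})          # cached count for paging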
[08:33:15] <cubud> The purpose is for load balancing. You could have servers on opposite sides of the world serving IP ranges geographically close to the client and then replicating between themselves. It also means there is no single point of failure because the client will try another master node if it is unable to reach the one it used last
[08:33:39] <cubud> It also receives load statistics from the master and so knows which server will give the fastest response for the next request
[08:33:41] <NodeX> cubud : then use cassandra if it's more suited to your needs
[08:37:09] <NodeX> always remember that all these things bleed thru app facing caches
[08:37:26] <NodeX> they then trickle to the databases and shards over time via queues
[08:37:50] <NodeX> so as long as the caches are close to the client then the app never slows
[09:01:24] <mbuf> is there a recommended way to run tests for a Rails project that uses MongoDB? any specific object mapper that you suggest that can also be used with the testing?
[09:45:42] <NodeX> but you should keep the total comment count anyway for paging
[09:46:02] <cubud> Yes that wouldn't be too difficult
[09:46:15] <NodeX> so you'll always know... you can then do in one query $pop : where your count $gt : 20
[09:47:14] <cubud> yes I suppose I could run two queries couldn't I? Push the comment, inc the CommentCount, then another query to pop a comment but with a filter "CommentCount >= 10"
[09:47:45] <NodeX> It's probably more efficient to always push then every so often just pop the last few
[09:48:14] <NodeX> you're only ever going to read the last 20 anyway so what does it matter if an extra 5 creep through if it maintains a faster app
[09:48:24] <NodeX> you're not going to echo them out to the client
[09:49:19] <cubud> What I might do is add them anyway, and have a thread which selects the videos with the highest RecentCommentCount then pops and resets the RecentCommentCount to 0
[09:50:01] <cubud> Comments, CommentCount, and RecentCommentCount - I'll think it through
[09:50:18] <cubud> I am certainly going to have to change my app from a Domain based one :)
[09:51:12] <cubud> If I have a Video collection with the info in, then a VideoComments collection I could simply push the latest comment to the Comments list and inc CommentCount, and then when paging select a slice
[09:51:29] <cubud> That sounds promising doesn't it?
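A sketch of the two pieces just discussed, assuming the field names from above: a $slice projection pages the embedded array without pulling the whole thing over the wire, and the occasional $pop trims the oldest entry once the count creeps past the window:

    from pymongo import MongoClient

    db = MongoClient().test  # placeholder names
    video_id = "abc123"      # placeholder

    # Page 2 of embedded comments, 10 per page.
    page, per_page = 2, 10
    video = db.videos.find_one(
        {"_id": video_id},
        {"RecentComments": {"$slice": [page * per_page, per_page]}})

    # Every so often, drop the oldest embedded comment once the count
    # has crept past 20, as NodeX suggests.
    db.videos.update_one(
        {"_id": video_id, "CommentCount": {"$gt": 20}},
        {"$pop": {"RecentComments": -1}})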
[09:52:27] <NodeX> your documents are capped to a size limit, at present it's 16mb
[09:54:06] <NodeX> take my advice and store the first page of comments in both the Video document and the comments collection
[09:54:46] <cubud> Yes, considering what you have just told me about the 16mb limit and having to read all comments into memory even when slicing I think that would be best
[09:55:18] <NodeX> even disregarding that, 1 query is better than 2
[09:58:51] <cubud> The web app yes. The DB initially, but might move it to Linux if the load is high
[09:59:03] <cubud> I will use ASP MVC for the web front end
[10:04:19] <cubud> By "Domain" I mean that I currently load a C# object in its entirety, update stuff, save it all. For Users this is okay, but for other stuff (such as Video) I will need my app to update individual column values only
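The move away from load-modify-save that cubud describes usually comes down to targeted update operators; a sketch with made-up field names:

    from pymongo import MongoClient

    db = MongoClient().test  # placeholder names
    video_id = "abc123"      # placeholder

    # Instead of reading the whole Video, changing it in C# and saving
    # it all back, touch only the fields that changed:
    db.videos.update_one(
        {"_id": video_id},
        {"$set": {"Title": "New title"}, "$inc": {"ViewCount": 1}})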
[10:04:45] <cubud> I like this! I hope MongoD doesn't crash again in a couple of hours :)
[10:07:20] <_johnny> [11:57] < NodeX> I can tell you're a windows based programmer from the CamelCasing <- lol, NodeX :)
[10:15:01] <_johnny> btw, i converted all my xml data (3gb) to json (800mb). roughly 2.4 million docs, imported in half an hour. i have restored faith in mongoimport :)
[10:16:00] <_johnny> the biggest time delay was my ignorance (big surprise, huh) though. i was querying the upper case names with a compiled regex, rather than just the regex itself
[10:16:30] <_johnny> compiled added 0.5 secs of latency! removed it, got 0.008s on average on queries. d'oh
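One plausible explanation for those numbers, offered as an assumption rather than a diagnosis of _johnny's setup: an anchored, case-sensitive regex can walk an index on the field, while a case-insensitive compiled pattern forces a collection scan. A sketch:

    import re
    from pymongo import MongoClient

    db = MongoClient().test  # placeholder names

    # Can use an index on 'name': anchored and case-sensitive.
    fast = db.docs.find({"name": {"$regex": "^SOMENAME"}})

    # Cannot use the index: case-insensitive patterns scan every document.
    slow = db.docs.find({"name": re.compile("^somename", re.IGNORECASE)})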
[12:14:10] <NodeX> I'm just guessing because I would never run mongo on a 1gb RAM box but I would hazard a guess that some of the internals that normally go on in RAM are happening on disk
[12:15:15] <Littlex> okay, so what would be the minimum of ram you would assign?
[14:40:50] <ShishKabab> Is there any way to query on a subobject without key order mattering? Because I use Python with Pymongo, the key order of my objects is not guaranteed. So if I do a query like db.coll.find({x: {y: 1, z: 2}}) it randomly fails. See http://pastebin.com/Mwpqx72g .
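Exact subdocument matches are order-sensitive, which is why unordered Python dicts bite here; querying the individual fields with dot notation matches regardless of key order (note it also matches subdocuments that carry extra fields):

    from pymongo import MongoClient

    db = MongoClient().test

    # Order-sensitive: only matches if the stored subdocument is exactly
    # {y: 1, z: 2} with the keys in that order.
    db.coll.find({"x": {"y": 1, "z": 2}})

    # Order-insensitive: matches on the fields themselves.
    db.coll.find({"x.y": 1, "x.z": 2})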
[14:41:56] <circlicious> 343 people, no one helps :(
[14:42:21] <circlicious> can someone help me with this - https://gist.github.com/53039e8b3f209759d091 ?
[14:44:57] <algernon> circlicious: you'll have to do that client side (or use m/r), I think. Unless the new(ish) aggregation framework can do that.
[14:46:28] <algernon> looks like the aggregation framework can help, at a first blink: http://docs.mongodb.org/manual/applications/aggregation/
[14:47:47] <algernon> it's probably more straightforward to do it on client side, though.
[14:47:49] <circlicious> oh NodeX was talking about this ? ok will have to read. but i am not using 2.1, i am using 2.0.6 i think, algernon
[14:48:03] <circlicious> can you help me with map/reduce to achieve this?
[14:48:04] <algernon> well, client side it is then :)
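Since the gist itself isn't shown, here is only the general shape algernon is pointing at, with made-up field names: a server-side $group on 2.1+, and the same tally done client-side for 2.0.x:

    from collections import Counter
    from pymongo import MongoClient

    db = MongoClient().test  # placeholder names and fields

    # Server side (MongoDB 2.1+): group and count in the database.
    server_counts = db.items.aggregate(
        [{"$group": {"_id": "$category", "n": {"$sum": 1}}}])

    # Client side (works on 2.0.x): same tally in Python.
    client_counts = Counter(
        doc["category"] for doc in db.items.find({}, {"category": 1}))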
[15:28:56] <scott1> I tried posting a question to the google group yesterday, but it appears not to have shown up. Is there some kind of moderation that I'm not getting to, or am I just delusional about having posted?
[15:32:50] <Derick> scott1: first post by a new person is indeed moderated - sadly, I don't have access to the moderation queue
[15:33:28] <scott1> thanks Derick. I suppose I will just have to be patient then.
[16:03:45] <Vile> I need an idea. I have a hierarchically arranged collection (using materialized path). I need to do a m/r on it, but… for proper processing of each document i need all of its parents
[16:07:27] <Vile> basically, i need to emit parent node for each of its direct and indirect child nodes (and for itself of course)
[16:08:28] <Vile> question is - how can i do that?
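One way to read Vile's requirement: in the map step, each document emits under its own _id and under every ancestor id from its materialized path, so the reduce for a node sees its whole subtree. A simplified sketch that only counts descendants (emitting whole documents would need a reduce that merges them); it assumes paths look like ",a,b,c," and uses PyMongo 3.x's map_reduce helper:

    from bson.code import Code
    from pymongo import MongoClient

    db = MongoClient().test  # placeholder names

    mapper = Code("""
    function () {
        emit(this._id, 1);  // the node itself
        // ...and every direct and indirect parent from the path.
        this.path.split(',').forEach(function (ancestor) {
            if (ancestor) emit(ancestor, 1);
        });
    }
    """)

    reducer = Code("function (key, values) { return Array.sum(values); }")

    db.nodes.map_reduce(mapper, reducer, "subtree_counts")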
[17:51:41] <thedahv1> Anybody had to implement a change/audit-log on documents?
[17:51:47] <thedahv1> I'm thinking about how I want to design it
[17:51:56] <thedahv1> Might be nice to throw some ideas against the proverbial wall and see what sticks
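One common shape for thedahv1's idea, sketched with made-up names: do the targeted update, and record each change as an immutable event in a side collection:

    from datetime import datetime, timezone
    from pymongo import MongoClient

    db = MongoClient().test  # placeholder names

    def audited_update(coll, doc_id, changes, who):
        before = db[coll].find_one({"_id": doc_id})
        db[coll].update_one({"_id": doc_id}, {"$set": changes})
        db.audit_log.insert_one({
            "collection": coll,
            "doc_id": doc_id,
            "who": who,
            "at": datetime.now(timezone.utc),
            "before": {k: before.get(k) for k in changes},  # old values
            "after": changes,                               # new values
        })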
[23:26:46] <vsmatck> Ah, I see what is going on there. In that first example it's fetching the document and decrementing that "qty" field.
[23:27:09] <svm_invictvs> So I'm assuming that I'd basically do something like, original = find("_id":"myId"); changes = original.copy(); /* make changes */ update(original, changes, false, false);
[23:27:10] <vsmatck> The update command is trying to find a document with a specific _id and a specific qty. If it can't find that document it won't update.
[23:32:19] <vsmatck> yeah, it seems like this should work. With update.
[23:32:20] <svm_invictvs> I just tried using my local shell, seems to work alright.
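The pattern vsmatck is describing is a compare-and-swap: the old value is part of the query, so the update only lands if nothing changed in between. A sketch:

    from pymongo import MongoClient

    db = MongoClient().test  # placeholder names

    # Only decrement if qty still holds the value we read earlier.
    result = db.products.update_one(
        {"_id": "myId", "qty": 5},
        {"$inc": {"qty": -1}})
    if result.matched_count == 0:
        pass  # lost the race: re-read the document and retry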
[23:40:22] <crudson1> owen1: so you want not the full documents that a certain user is in, but just their bits in the answers array for all documents
[23:45:01] <rboyer> if w=majority is specified for a write, is that measured off of the same N as the majority vote calculation for Election? or something else?
[23:45:21] <fbjork> is it possible to return only matching sub array elements?
[23:45:25] <rboyer> my gut would say it's the majority of N, where N is the total number of electable members (master, and non-hidden secondaries)
[23:45:34] <rboyer> but i can't find reference documentation to corroborate that
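For the PyMongo side of rboyer's question, a majority write concern can be attached per collection; as far as I know the majority is computed over the replica set's voting members, which matches rboyer's guess, but the exact rule should be checked against the docs for the version in use. A sketch (connection string is a placeholder):

    from pymongo import MongoClient, WriteConcern

    client = MongoClient("mongodb://host1,host2,host3/?replicaSet=rs0")
    coll = client.test.get_collection(
        "events", write_concern=WriteConcern(w="majority", wtimeout=5000))

    # Does not return until a majority of members acknowledge the write.
    coll.insert_one({"ok": True})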
[23:47:24] <crudson1> owen1: if so you could do something like: db.u.aggregate({$match:{'answers.user':'josh'}}, {$unwind:'$answers'}, {$match:{'answers.user':'josh'}})
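The same pipeline crudson1 gives, translated to Python for anyone following along with PyMongo:

    from pymongo import MongoClient

    db = MongoClient().test
    pipeline = [
        {"$match": {"answers.user": "josh"}},  # docs josh appears in
        {"$unwind": "$answers"},               # one doc per array element
        {"$match": {"answers.user": "josh"}},  # keep only josh's elements
    ]
    for doc in db.u.aggregate(pipeline):
        print(doc)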
[23:48:57] <jarrod> is aggregate used in favor of group() now?