[04:39:23] <nemothekid> Is there a way to pre split the chunks before data insertion?
[04:39:53] <nemothekid> (I want to write 10 million records to an empty sharded collection, but I don't want all the data to hit a single shard)
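(A sketch of pre-splitting in the mongo shell, run against a mongos. The database, collection, shard key, chunk boundaries, and shard name below are all placeholders, not from the chat.)

```javascript
// Shard the (still empty) collection, then split it at chosen key
// boundaries so the initial bulk load spreads across shards instead
// of hammering a single one.
sh.enableSharding("mydb")
sh.shardCollection("mydb.records", { uid: 1 })

// Create empty chunks at boundaries matching the expected key range.
for (var i = 1; i < 10; i++) {
    sh.splitAt("mydb.records", { uid: i * 1000000 })
}

// Optionally move chunks to specific shards before inserting:
sh.moveChunk("mydb.records", { uid: 1000000 }, "shard0001")
```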
[07:36:15] <k610> if i want to specify the "_id" for my collection myself, should i use { "_id" => "mystring" } or { "_id" => BSON::ObjectId("mystring") }?
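(For what it's worth: any unique value can serve as `_id`, but `ObjectId()` only accepts a 24-character hex string, so an arbitrary string has to be stored as a plain string. A shell sketch with made-up values:)

```javascript
// A plain string is a perfectly valid _id:
db.things.insert({ _id: "mystring", v: 1 })

// ObjectId() is only valid for 24-hex-character strings:
db.things.insert({ _id: ObjectId("507f1f77bcf86cd799439011"), v: 2 })
```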
[13:14:46] <efazati> how do i update the current cursor document in a forEach?
[13:15:42] <efazati> some thing like this -> db.books.find().forEach(function(obj) { obj.update({$set: {'circulation':sum}}) })
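(The snippet above fails because the documents yielded by the cursor have no `update()` method of their own; the update has to go through the collection, keyed by `_id`. A corrected sketch, where `sum` stands in for however the new value is computed, which the paste doesn't show:)

```javascript
// Update each document via the collection, matching on its _id.
db.books.find().forEach(function(obj) {
    db.books.update({ _id: obj._id }, { $set: { circulation: sum } });
});
```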
[13:40:32] <Bilge> Didn't I read somewhere that it is possible to run mongodb in a mode where it will reject all queries that are not indexed?
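(Yes, that's the `notablescan` server parameter. A sketch of enabling it at runtime; it can also be set at startup with `--setParameter notablescan=1`:)

```javascript
// With notablescan on, any query that would require a full collection
// scan (i.e. cannot use an index) returns an error instead of running.
db.adminCommand({ setParameter: 1, notablescan: 1 })
```

Note it's usually meant for test environments, since it also rejects legitimately unindexed administrative queries.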
[15:27:14] <OliverJAsh> I have a model called "users" which needs to know how many "posts" that user has. Would it be a good idea to create a property in the "users" collection which represents a "posts" count, and update that every time a new post comes in? How would you do it?
[15:29:15] <kali> OliverJAsh: it's a perfectly valid idea, but you have to do it yourself, in your application code.
[15:30:05] <OliverJAsh> kali: i wasn't sure if it would be best to create a property on the schema, or just count the posts every time a "user" is requested — so it is sent along to the API but doesn't persist. what's better?
[15:31:36] <kali> OliverJAsh: as i said, it makes sense to have a post counter field on the user document
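(A sketch of the counter-field approach in the shell; collection and field names are assumptions, not from the chat. In application code, each new post inserts the post and bumps the user's counter:)

```javascript
// Insert the post, then atomically increment the denormalized counter.
db.posts.insert({ user_id: userId, title: "hello", body: "..." })
db.users.update({ _id: userId }, { $inc: { post_count: 1 } })
```

`$inc` is atomic per document, so concurrent posts won't lose counts; the trade-off is that the two operations aren't transactional, so a periodic reconciliation against `db.posts.count({user_id: userId})` can correct drift.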
[16:57:36] <joshua> Anyone familiar with flushRouterConfig command for mongos? Any issue running that periodically?
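(For reference, the command is run against the admin database of a mongos:)

```javascript
// Forces this mongos to reload cluster metadata from the config
// servers; mongos normally refreshes this itself when needed.
db.adminCommand("flushRouterConfig")
```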
[17:48:30] <scoates> Given this http://paste.roguecoders.com/p/c11acea40731540ae4da79f38799a675.txt , does anyone know why 1) indexOnly is false, and 2) without the hint(), it uses a different index that has `owner` but not the other fields… ?
[17:52:24] <Derick> scoates: you need _id => 0 in the projection
[17:58:22] <scoates> for #1: what's the projection bit?
[17:58:48] <Derick> "indexOnly" only shows up when your "projection" (select field1, field2 as opposed to select *) can be satisfied by data coming just from the index
[17:59:00] <Derick> ie, if you only want owner, backup and date fields
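(A sketch of a covered query, using the field names mentioned in the chat; the collection name and index are assumptions. The key detail from Derick's point: `_id` must be explicitly excluded, because it isn't in the index:)

```javascript
// All queried and projected fields live in one compound index:
db.items.ensureIndex({ owner: 1, backup: 1, date: 1 })

// _id: 0 is what lets explain() report indexOnly: true —
// otherwise fetching _id forces a trip to the document itself.
db.items.find(
    { owner: "scoates" },
    { owner: 1, backup: 1, date: 1, _id: 0 }
).explain()
```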
[18:06:49] <scoates> Derick: "millis is a number that reflects the time in milliseconds to complete the query." <-- that's the time it takes to complete the actual query, not the time it takes to explain(), right?
[18:06:55] <Derick> unless you're using the Aggregation Framework
[18:07:17] <Derick> scoates: hmm, how is that different?
[18:08:00] <scoates> I guess that's kind of what I'm asking. I don't know if/how explain()'s inputs/outputs might be cached.
[18:08:51] <scoates> the cursor might be generated differently (as it is for .limit()) if there's an explain() before a read, too (but I hope not)
[18:09:45] <joshua> That data is there whether you ask for it or not, I thought, because the profiler will give you some of it for slow queries
[18:10:30] <joshua> Check this out https://github.com/TylerBrock/mongo-hacker
[18:10:37] <scoates> joshua: limit() was a bad example there. sort() definitely affects a query between find and read
[18:11:10] <scoates> Derick: anyway, this is what I get (without hint) now: http://paste.roguecoders.com/p/6b47429ea920fdb9fe24c508c2817bd6.txt … thanks for the help
[18:11:42] <scoates> I just need to adjust the app to use `from_backup: {$in: [true, undefined]}` instead of the current `from_backup: {$ne: [false]}`
[18:12:46] <scoates> hmm. I wonder how to do undefined (or if that's the same as null) in PHP land.
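(On the null/undefined question: in MongoDB query matching, `null` also matches documents where the field is missing entirely, and the BSON `undefined` type is deprecated, so `null` is the one to use. A sketch:)

```javascript
// Matches documents where from_backup is true, explicitly null,
// or absent altogether:
db.items.find({ from_backup: { $in: [true, null] } })
```

From PHP, a plain `null` in the `$in` array serializes to BSON null, which gives the same matching behaviour.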
[18:13:21] <scoates> oh. I'm a dummy. Was using the wrong query in my shell anyway.
[18:23:20] <Derick> perhaps try rebuilding the index then? should save you a lot of space too
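(A sketch of rebuilding; the collection name is a placeholder:)

```javascript
// Rebuilds all indexes on the collection in place. It blocks the
// database while running, hence the per-node offline maintenance
// scoates describes below.
db.items.reIndex()
```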
[18:23:26] <scoates> yeah. old index; survived through the upgrades
[18:23:46] <scoates> yeah. that's on the list. there are a bunch. annoying I have to do them offline, though
[18:23:54] <scoates> it takes around an hour to do each one.
[18:24:04] <scoates> (30 mins per node in that collection, and currently 2 nodes)
[18:29:21] <joshua> I have a database with too many indexes at the moment, and I didn't write the app. Going to have to go through and determine what needs to be removed. Fun
[18:31:49] <scoates> Derick: "however, when trying the indexes, the first one that has 101 matches, "wins"" <-- even if the n and nscanned are very different, and even if millis is much longer than others returning 101?
[18:34:22] <scoates> joshua: I'd started writing a tool for that (based on the profiler data), but never got it to useful-to-share state. /-:
[18:36:27] <scoates> joshua this might help: http://paste.roguecoders.com/p/05ae5ea9da6b1237bc362b3aaea129f0.txt
[18:36:34] <joshua> There is this one https://github.com/wfreeman/indexalizer but the problem is in production so I don't know if I can turn on profiling. It's annoying not having control over everything
[18:36:41] <scoates> haven't used it in over a year, though, so it probably needs some tweaking.
[19:05:05] <zhodge> docs say import with --jsonArray can't be bigger than 16mb either
[20:19:52] <Snii> Hi, I have a selection like {{id:a, rev:0},{id:a, rev:1},{id:b, rev:0},{id:b, rev:1},{id:b, rev:2}} - How can i make a query to get only the highest rev of each id? like {a, 1} and {b, 2}?
[20:32:56] <Snii> Can it be done by a normal find(), or do I need an aggregation or group or something similar?
[21:44:14] <preinhei_> I guess i could use the FQDN, and put the internal IP in /etc/hosts
[21:44:29] <Derick> because the driver will talk to 192.168 and see it wants voltron, and will create another connection to that from the replset config
[21:47:38] <preinhei_> so, i want $mongo_host = "washington.wonderproxy.com:27017,mongo.voltron.wondernetwork.com:27017,sanantonio.wonderproxy.com:27017,montreal.wonderproxy.com:27017";
[21:47:43] <Snii> I'll try once more before going to sleep: I have a selection like {{id:a, rev:0},{id:a, rev:1},{id:b, rev:0},{id:b, rev:1},{id:b, rev:2}} - How can i make a query to get only the highest rev of each id? like {a, 1} and {b, 2}?
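(This needs the aggregation framework; a plain `find()` can't group. A sketch, assuming a collection named `docs` with documents shaped like `{ id: "a", rev: 1 }`:)

```javascript
// Group by id and keep the highest rev per group —
// yields { _id: "a", maxRev: 1 } and { _id: "b", maxRev: 2 }
// for the data Snii describes.
db.docs.aggregate([
    { $group: { _id: "$id", maxRev: { $max: "$rev" } } }
])
```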
[21:50:56] <preinhei_> thank you kindly, let's see what weird problems crop up now
[21:51:33] <preinhei_> this paste from earlier, it talks about connecting to washington, then says it hasn't found a server http://paste.roguecoders.com/p/11745a23d2514e57578344f84f128112.txt
[21:51:39] <preinhei_> Could you explain to me what went wrong in there?
[21:56:25] <nemothekid> I am downgrading my cluster to 2.2.4 from 2.4.2. Should I downgrade the config servers as well?
[21:58:44] <preinhei_> nemothekid: out of curiosity, why the downgrade?
[22:09:45] <Derick> preinhei_: your paste doesn't show about washington, or something
[22:10:04] <Derick> ismaster: the server name (washington.wonderproxy.com:27017) did not match with what we thought it'd be (127.0.0.1:27017)
[22:10:35] <Derick> yes, but 127.0.0.1 is wrong...
[22:11:08] <Derick> in /etc/hosts you need to have it set to the internet facing IP
[22:11:20] <Derick> maybe I'm confused now though...
[22:12:20] <preinhei_> I've switched everything I could find to the connection string we discussed
[22:12:44] <preinhei_> but the connection string seems to be a lot more fragile than I expected
[22:13:19] <fl0w> I'm doing some reading on MongoDB, and I'm thinking about using it for an experimental project of mine. If I understood things right (coming from a traditional relational DB background), to truly utilize the power of mongo I shouldn't normalise my data. But say I have blog posts created by user "John Doe": I shouldn't link to the user (again, to gain the power of mongo) - but what if John changes his name, and I need to update all the records
[22:13:19] <fl0w> based on this name? What if I have multiple John Does (though still different users)?
[22:16:08] <Derick> preinhei_: hmm, as long as you have the same names as in your repl set config, and all the members can access them under the same name and IP, it should work
[22:16:23] <Derick> preinhei_: I'm off to bed though... it's getting late
[22:19:46] <preinhei_> fl0w: I'm making it up as I go along, but I'd store a bit on the user there (their name, id and anything else you need to display when showing that blog post), and yes update if a user changes his name
[22:20:05] <preinhei_> Really, you're saving a bunch of joins every single time a user loads a page
[22:20:26] <preinhei_> at the expense of a larger operation on the rare instance that someone changes their name
[22:20:42] <fl0w> preinhei_: I should however keep a "foreign key"-like identifier though, because otherwise, I'm extremely confused.
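(Putting preinhei_'s advice and fl0w's point together: embed a snapshot of the author for fast reads, but keep the user's `_id` alongside it as the stable key, so name changes and duplicate names are both unambiguous. A sketch with assumed collection and field names:)

```javascript
// Each post carries a denormalized copy of the author's name
// plus the user's _id as the stable "foreign key".
db.posts.insert({
    title: "hello",
    author: { _id: johnId, name: "John Doe" }
})

// When John renames himself, fix the canonical record and then
// the denormalized copies in one multi-update, matched by _id
// (not by name, so other John Does are untouched).
db.users.update({ _id: johnId }, { $set: { name: "John Smith" } })
db.posts.update(
    { "author._id": johnId },
    { $set: { "author.name": "John Smith" } },
    { multi: true }
)
```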