PMXBOT Log file Viewer

#mongodb logs for Sunday the 21st of April, 2013

[02:49:55] <kexmex> is WriteConcern.Unack the same as SafeMode.false?
[02:50:00] <kexmex> and Acked is SafeMode.True?
[03:43:54] <kexmex> anyone here from Ukraine?
[03:50:30] <waheedi> hello bobs
[03:51:02] <waheedi> my global lock ratio is very high
[03:51:08] <waheedi> I use indexed
[03:51:14] <waheedi> index
[03:52:36] <waheedi> http://pastebin.com/Y7nP1vnT
[04:39:23] <nemothekid> Is there a way to pre split the chunks before data insertion?
[04:39:53] <nemothekid> (I want to write 10 million records to an empty sharded collection, but I don't want all the data to hit a single shard)
[07:36:15] <k610> if for my collection i want to specify the "_id" myself, should i use { "_id" => "mystring" } or { "_id" => BSON::ObjectId("mystring") }
[13:14:46] <efazati> how can i update the current document in a forEach?
[13:15:42] <efazati> something like this -> db.books.find().forEach(function(obj) { obj.update({$set: {'circulation':sum}}) })
[13:40:32] <Bilge> Didn't I read somewhere that it is possible to run mongodb in a mode where it will reject all queries that are not indexed?
[13:41:30] <kali> Bilge: http://docs.mongodb.org/manual/reference/mongod/#cmdoption-mongod--notablescan
[13:47:43] <Bilge> kali: wasn't it possible to enable this per-query or per database or something?
[13:50:23] <kali> Bilge: not that i know of
[14:19:53] <fomatin> hey what's the best way to convert an objectid to a unique integer
[15:10:56] <kali> fomatin: an objectid is a unique integer
[15:12:54] <fomatin> i tried to do parseInt(doc1._id)
[15:13:23] <fomatin> but it ended up: doc1._id != doc2._id; parseInt(doc1._id) == parseInt(doc2._id);
[15:14:55] <kali> what are you trying to do ?
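
A minimal sketch of why parseInt can return the same number for two different ObjectIds, and one possible way around it. It assumes the ids are available as 24-character hex strings; the sample ids and the helper name objectIdHexToBigInt are made up for illustration and are not part of any driver API.

```javascript
// Hypothetical sample ids: 24 hex characters, differing only in the last digit.
const idA = "5173e1f2a1b2c3d4e5f60001";
const idB = "5173e1f2a1b2c3d4e5f60002";

// parseInt reads decimal digits and stops at the first character that is not
// one (here the hex letter "e"), so both ids collapse to the same prefix.
const naiveA = parseInt(idA, 10);
const naiveB = parseInt(idB, 10);

// Interpret all 24 hex characters as one 96-bit integer instead.
// BigInt is used because the value does not fit in a double.
function objectIdHexToBigInt(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("expected a 24-character hex string");
  }
  return BigInt("0x" + hex);
}

console.log(naiveA === naiveB); // true: both are 5173
console.log(objectIdHexToBigInt(idA) === objectIdHexToBigInt(idB)); // false
```

The resulting integers are as unique as the ids themselves, but they are far too large for a 32-bit or 64-bit integer column, so whether this helps depends on what the integer is needed for.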
[15:27:14] <OliverJAsh> I have a model called "users" which needs to know how many "posts" that user has. Would it be a good idea to create a property in the "users" collection which represents a "posts" count, and update that every time a new post comes in? How would you do it?
[15:29:15] <kali> OliverJAsh: it's a perfectly valid idea, but you have to do it yourself, in your application code.
[15:30:05] <OliverJAsh> kali: i wasn't sure if it would be best to create a property on the schema, or just count the posts every time a "user" is requested, so it is sent along to the API but it doesn't persist. what's better?
[15:31:36] <kali> OliverJAsh: as i said, it makes sense to have a post counter field on the user document
[16:57:36] <joshua> Anyone familiar with flushRouterConfig command for mongos? Any issue running that periodically?
[17:47:27] <scoates> hello.
[17:48:30] <scoates> Given this http://paste.roguecoders.com/p/c11acea40731540ae4da79f38799a675.txt , does anyone know why 1) indexOnly is false, and 2) without the hint(), it uses a different index that has `owner` but not the other fields… ?
[17:52:24] <Derick> scoates: you need _id => 0 in the projection
[17:52:49] <Derick> and I doubt you need the hint
[17:53:07] <Derick> actually, you have no projection at all!
[17:53:22] <Derick> indexOnly shows whether you have a covering index...
[17:53:31] <Derick> so that Mongo doesn't need to read the document at all
[17:55:54] <scoates> Derick: I don't quite understand.
[17:56:09] <scoates> first, the hint: it's a new index (last night), so maybe the statistical optimizer hasn't noticed yet?
[17:56:37] <Derick> the statistical optimiser doesn't exist
[17:57:13] <scoates> really? I thought query plans were based on trial+error + best plan…? is that old? or imaginary?
[17:57:14] <Derick> it only remembers the "best index" for 10000 queries
[17:57:23] <scoates> ah, yeah. that's what I meant.
[17:57:23] <Derick> but that gets reset whenever you add/drop an index
[17:57:40] <scoates> I added the index with the offline build approach (RS). Maybe that has something to do with it?
[17:57:40] <Derick> your index really might just not be better
[17:57:47] <Derick> shouldn't really
[17:57:52] <Derick> but, that could be a bug :-)
[17:57:55] <scoates> heh
[17:58:15] <scoates> ok. so... let's skip #2 for now.
[17:58:17] <Derick> (i've never tried)
[17:58:19] <Derick> yeah, as for one
[17:58:22] <scoates> for #1: what's the projection bit?
[17:58:48] <Derick> "indexOnly" is only true when your "projection" (select field1, field2 as opposed to select *) can be satisfied by data coming just from the index
[17:59:00] <Derick> ie, if you only want owner, backup and date fields
[17:59:05] <scoates> ah. that makes sense.
[17:59:25] <scoates> ok. sec.
[17:59:36] <Derick> so you need to add { _id: 0, owner: 1, backup: 1, date: 1 } to your query (As 2nd argument)
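
A rough sketch of the rule Derick describes, written as a plain function rather than a real query. The function name couldBeIndexOnly and its parameters are hypothetical, and the check is deliberately simplified: it only asks whether every field in the filter and the projection is part of the index, and the server may apply further conditions that this ignores.

```javascript
// indexKeys:   the index spec, e.g. { owner: 1, backup: 1, date: 1 }
// queryFields: field names used in the find() filter
// projection:  the second argument to find(), or undefined if there is none
function couldBeIndexOnly(indexKeys, queryFields, projection) {
  const indexed = new Set(Object.keys(indexKeys));
  const included = Object.keys(projection || {}).filter(
    (f) => f !== "_id" && projection[f]
  );
  // No included fields means whole documents come back ("select *").
  if (included.length === 0) return false;

  const returned = new Set(included);
  // _id is returned unless it is explicitly switched off with _id: 0.
  if (!("_id" in projection) || projection._id) returned.add("_id");

  return [...queryFields, ...returned].every((f) => indexed.has(f));
}

const index = { owner: 1, backup: 1, date: 1 };
console.log(couldBeIndexOnly(index, ["owner"], { _id: 0, owner: 1, backup: 1, date: 1 })); // true
console.log(couldBeIndexOnly(index, ["owner"], { owner: 1, backup: 1, date: 1 })); // false, _id is still returned
console.log(couldBeIndexOnly(index, ["owner"], undefined)); // false, no projection at all
```

The second call shows why the _id: 0 matters: without it the _id field has to be fetched from the document, since this index does not contain it.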
[18:00:02] <scoates> right. I actually need more data. I just misunderstood indexOnly (thought it had to do with scanning)
[18:00:12] <Derick> nope
[18:00:22] <scoates> so, with the hint, I get n=17960 and nscanned=17960
[18:00:29] <joshua> Ah so it still has to pull the results from non-indexed data unless you add the projection to limit it. Hmm
[18:00:40] <Derick> joshua: yes
[18:00:48] <Derick> scoates: that looks good
[18:00:52] <scoates> without the hint (different index), I get n=17866, nscanned=22094
[18:01:11] <scoates> so my (hinted) index is actually better, right?
[18:01:18] <Derick> yeah, that *can* happen
[18:02:05] <Derick> I have a slide for the index finding, one sec
[18:02:12] <joshua> Are hints something available from the API to put in code or for the shell, and will it still work if the index doesn't exist?
[18:02:37] <scoates> joshua: at least some drivers have a hint()
[18:02:41] <scoates> PHP's: http://php.net/manual/en/mongocursor.hint.php
[18:03:15] <joshua> I guess python too http://api.mongodb.org/python/current/api/pymongo/cursor.html
[18:03:35] <scoates> yeah. I'd guess it's there in most or all drivers.
[18:03:38] <Derick> gah, can't find it
[18:03:46] <joshua> " Hinting will not do anything if the corresponding index does not exist." that answers my other question
[18:04:02] <scoates> Derick: in production, this query has a limit(), too. I wonder if that affects the `best` logging in the query planner
[18:04:58] <Derick> yes, it might
[18:05:00] <scoates> so, I think my #1 question is resolved (I just misunderstood indexOnly), but #2 about hinting is still confusing.
[18:05:07] <Derick> I don't quite recall the algorithm
[18:05:16] <Derick> however, when trying the indexes, the first one that has 101 matches, "wins"
[18:05:35] <Derick> depending on the case, the index that you think works best is not the one that hits 101 items first
[18:05:56] <scoates> heh. it's hitting it without the hint if I add limit()
[18:06:13] <scoates> so I guess that answers my question (about limit affecting it)…
[18:06:35] <joshua> There was a section on the online course where they detailed the order of find operations like sort/limit etc.
[18:06:36] <Derick> :-)
[18:06:48] <Derick> that's always find, sort, limit
[18:06:49] <scoates> Derick: "millis is a number that reflects the time in milliseconds to complete the query." <-- that's the time it takes to complete the actual query, not the time it takes to explain(), right?
[18:06:55] <Derick> unless you're using the Aggregation Framework
[18:07:17] <Derick> scoates: hmm, how is that different?
[18:08:00] <scoates> I guess that's kind of what I'm asking. I don't know if/how explain()'s inputs/outputs might be cached.
[18:08:51] <scoates> the cursor might be generated differently (as it is for .limit()) if there's an explain() before a read, too (but I hope not)
[18:09:45] <joshua> That data is there if you ask for it or not I thought, cause the profiler will give you some of it for slow queries
[18:10:30] <joshua> Check this out https://github.com/TylerBrock/mongo-hacker
[18:10:37] <scoates> joshua: limit() was a bad example there. sort() definitely affects a query between find and read
[18:10:43] <scoates> s/query/cursor/
[18:11:10] <scoates> Derick: anyway, this is what I get (without hint) now: http://paste.roguecoders.com/p/6b47429ea920fdb9fe24c508c2817bd6.txt … thanks for the help
[18:11:42] <scoates> I just need to adjust the app to use `from_backup: {$in: [true, undefined]}` instead of the current `from_backup: {$ne: [false]}`
[18:12:46] <scoates> hmm. I wonder how to do undefined (or if that's the same as null) in PHP land.
[18:13:21] <scoates> oh. I'm a dummy. Was using the wrong query in my shell anyway.
[18:17:53] <Derick> sorry, was away for a bit
[18:18:12] <Derick> scoates: explain is just running the query
[18:18:34] <Derick> scoates: NULL is not undefined
[18:18:39] <scoates> I'm still confused by the hint()
[18:18:54] <Derick> hint() will just make mongodb use the index you specify
[18:18:54] <scoates> Derick: yeah. undefined was wrong anyway.
[18:19:02] <Derick> ok :)
[18:19:15] <scoates> right. I meant I'm confused by why my hint works better than the query planner.
[18:19:42] <scoates> http://paste.roguecoders.com/p/78710a2160a1eb4fe21849e6065bebad.txt
[18:20:13] <scoates> maybe it's the `$in` ?
[18:21:06] <Derick> shouldn't be.. $in can use an index too
[18:21:29] <Derick> sorry, can't explain it :S
[18:22:05] <scoates> the other index is v0… was created on MongoDB 1.x
[18:22:21] <scoates> <shrug/> thanks for looking anyway, Derick
[18:22:59] <Derick> np
[18:23:06] <Derick> 1.x ? :-)
[18:23:20] <Derick> perhaps try rebuilding the index then? should save you a lot of space too
[18:23:26] <scoates> yeah. old index; survived through the upgrades
[18:23:46] <scoates> yeah. that's on the list. there are a bunch. annoying I have to do them offline, though
[18:23:54] <scoates> it takes around an hour to do each one.
[18:24:04] <scoates> (30 mins per node in that collection, and currently 2 nodes)
[18:29:21] <joshua> I have a database with too many indexes at the moment, and I don't write the app. Going to have to go through and determine what needs to be removed. Fun
[18:31:49] <scoates> Derick: "however, when trying the indexes, the first one that has 101 matches, "wins"" <-- even if the n and nscanned are very different, and even if millis is much longer than others returning 101?
[18:34:22] <scoates> joshua: I'd started writing a tool for that (based on the profiler data), but never got it to useful-to-share state. /-:
[18:36:27] <scoates> joshua this might help: http://paste.roguecoders.com/p/05ae5ea9da6b1237bc362b3aaea129f0.txt
[18:36:34] <joshua> There is this one https://github.com/wfreeman/indexalizer but the problem is in production so I don't know if I can turn on profiling. It's annoying not having control over everything
[18:36:41] <scoates> haven't used it in over a year, though, so it probably needs some tweaking.
[18:36:51] <scoates> meh. just turn it on (-;
[18:37:07] <scoates> (just kidding)
[18:37:18] <scoates> that'll be much harder to do if you can't turn profiling on.
[18:37:53] <joshua> I'll have to go back to the app dev and the person who did the load testing (they seem to keep track of it).
[18:38:06] <joshua> When I am not around people start messing with the database. heh
[18:39:00] <scoates> (IIRC to use that script, you need to copy the profiler collection into a not-capped collection (because of the updates))
[19:04:02] <zhodge> getting "exception:JSONArray file too large" from mongoimport
[19:04:11] <zhodge> is there a limit on how big the json array can be?
[19:04:24] <zhodge> I know no individual document can exceed, what is it, 16mb?
[19:04:55] <zhodge> oh nevermind
[19:05:05] <zhodge> docs say import with --jsonArray can't be bigger than 16mb either
[20:19:52] <Snii> Hi, I have a selection like {{id:a, rev:0},{id:a, rev:1},{id:b, rev:0},{id:b, rev:1},{id:b, rev:2}} - How can i make a query to get only the highest rev of each id? like {a, 1} and {b, 2}?
[20:32:56] <Snii> Can it be done by a normal find(), or do I need an aggregation or group or something similar?
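
Nobody in the channel answered this, so the following is only a sketch of one approach that might fit. A plain find() has no grouping step, whereas the aggregation framework has a $group stage with a $max accumulator. The pipeline is shown as data, and highestRevPerId is a hypothetical plain-JS stand-in for what that stage would compute, so the example runs without a server. The field names id and rev are taken from Snii's sample.

```javascript
// Pipeline as it might be passed to aggregate() on the collection.
const pipeline = [{ $group: { _id: "$id", rev: { $max: "$rev" } } }];

// Plain-JS equivalent of that $group stage, for in-memory sample data.
function highestRevPerId(docs) {
  const best = new Map();
  for (const { id, rev } of docs) {
    if (!best.has(id) || rev > best.get(id)) best.set(id, rev);
  }
  // Mirror the shape $group produces: the grouping key ends up in _id.
  return [...best].map(([id, rev]) => ({ _id: id, rev }));
}

const docs = [
  { id: "a", rev: 0 }, { id: "a", rev: 1 },
  { id: "b", rev: 0 }, { id: "b", rev: 1 }, { id: "b", rev: 2 },
];
console.log(highestRevPerId(docs)); // a -> 1, b -> 2
```

If the full documents are needed rather than just the id and rev pair, the pipeline would need more than this single stage.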
[21:07:37] <robinson_k> hi
[21:07:54] <robinson_k> any node mongoose users here?
[21:37:27] <preinhei_> Derick: ping?
[21:37:33] <Derick> heh
[21:37:37] <Derick> I just tweeted you
[21:37:40] <preinhei_> saw it
[21:37:45] <Derick> wow
[21:37:45] <preinhei_> It's been a rough pair of days
[21:37:48] <Derick> fast!
[21:37:50] <preinhei_> yeah
[21:38:02] <preinhei_> thanks :)
[21:38:04] <Derick> I saw your email thread too - that got resolved, right?
[21:38:08] <preinhei_> it did
[21:38:10] <preinhei_> though
[21:38:15] <preinhei_> I don't understand why that fix worked
[21:38:36] <Derick> i just read the first and last post in the thread ;-)
[21:38:43] <preinhei_> Didn't work: 'host' => "127.0.0.1:27017,sanantonio.wonderproxy.com:27017, montreal.wonderproxy.com:27017",
[21:38:51] <preinhei_> sorry
[21:38:55] <preinhei_> that one worked
[21:38:56] <Derick> you need the FQDN
[21:39:10] <Derick> exactly as it says in the repl set config
[21:39:23] <Derick> now you end up connecting to four hosts
[21:39:32] <Derick> 127.0.0.1 - and the 3 members of the repl set
[21:39:55] <Derick> i think you had washington not resolve correctly or something
[21:39:58] <preinhei_> right
[21:40:02] <preinhei_> yeah
[21:40:06] <preinhei_> washington is 127 on that host
[21:40:14] <preinhei_> http://paste.roguecoders.com/p/11745a23d2514e57578344f84f128112.txt
[21:40:15] <preinhei_> is recent
[21:40:19] <preinhei_> and it really baffled me
[21:40:26] <Derick> 17, mon
[21:40:29] <Derick> has a space?
[21:40:36] <preinhei_> yeah
[21:40:46] <Derick> that will cause issues
[21:40:46] <preinhei_> which is also apparently bad
[21:41:09] <Derick> Couldn't connect to ' montreal.wonderproxy.com:27017'
[21:41:38] <preinhei_> Is this acceptable?
[21:41:39] <preinhei_> $mongo_host = "washington.wonderproxy.com:27017,192.168.42.3:27017,sanantonio.wonderproxy.com:27017,montreal.wonderproxy.com:27017";
[21:41:48] <Derick> no
[21:41:55] <Derick> what does rs.status() say?
[21:42:42] <preinhei_> http://paste.roguecoders.com/p/27ba25be44343ba23b3256658a79e0ae.txt
[21:43:18] <Derick> what's voltron in there?
[21:43:37] <preinhei_> voltron is the 192.168 one there
[21:43:44] <Derick> ah
[21:43:51] <preinhei_> they're VMs on the same box
[21:43:53] <Derick> can you not use the FQDN?
[21:44:14] <preinhei_> I guess i could use the FQDN, and put the internal IP in /etc/hosts
[21:44:29] <Derick> because the driver will talk to 192.168 and see it wants voltron, and will create another connection to that from the replset config
[21:44:35] <preinhei_> ahh
[21:44:36] <preinhei_> okay
[21:47:38] <preinhei_> so, i want $mongo_host = "washington.wonderproxy.com:27017,mongo.voltron.wondernetwork.com:27017,sanantonio.wonderproxy.com:27017,montreal.wonderproxy.com:27017";
[21:47:39] <preinhei_> everywhere
[21:47:43] <Snii> I'll try once more before going to sleep: I have a selection like {{id:a, rev:0},{id:a, rev:1},{id:b, rev:0},{id:b, rev:1},{id:b, rev:2}} - How can i make a query to get only the highest rev of each id? like {a, 1} and {b, 2}?
[21:47:56] <preinhei_> :w
[21:49:04] <preinhei_> where's it up looks to be working
[21:50:07] <Derick> preinhei_: yes, you want that
[21:50:56] <preinhei_> thank you kindly, let's see what weird problems crop up now
[21:51:33] <preinhei_> this paste from earlier, it talks about connecting to washington, then says it hasn't found a server http://paste.roguecoders.com/p/11745a23d2514e57578344f84f128112.txt
[21:51:39] <preinhei_> Could you explain to me what went wrong in there?
[21:56:25] <nemothekid> I am downgrading my cluster to 2.2.4 from 2.4.2. Should I downgrade the config servers as well?
[21:58:44] <preinhei_> nemothekid: out of curiosity, why the downgrade?
[22:09:45] <Derick> preinhei_: your paste doesn't show about washington, or something
[22:10:04] <Derick> ismaster: the server name (washington.wonderproxy.com:27017) did not match with what we thought it'd be (127.0.0.1:27017)
[22:10:11] <preinhei_> right
[22:10:15] <preinhei_> but it is connected to it
[22:10:18] <preinhei_> and it is the master
[22:10:35] <Derick> yes, but 127.0.0.1 is wrong...
[22:11:08] <Derick> in /etc/hosts you need to have it set to the internet facing IP
[22:11:20] <Derick> maybe I'm confused now though...
[22:12:20] <preinhei_> I've switched everything I could find to the connection string we discussed
[22:12:44] <preinhei_> but the connection string seems to be a lot more fragile than I expected
[22:13:19] <fl0w> I'm doing some reading on MongoDB, and I'm thinking about using it for an experimental project of mine. If I understood things right (coming from a traditional relational DB background), to truly utilize the power of mongo, I shouldn't normalise my data - but say I have a blog post created by user "John Doe", I shouldn't link it (again, to gain the power of mongo) - but what if the user John changes his name, and I need to update all the records
[22:13:19] <fl0w> based on this name? What if I have multiple John Doe's (though still different users)?
[22:16:08] <Derick> preinhei_: hmm, as long as you have the same names as in your repl set config, and all the members can access them under the same name and IP, it should work
[22:16:23] <Derick> preinhei_: I'm off to bed though... it's getting late
[22:16:29] <preinhei_> thank you kindly, Derick
[22:19:46] <preinhei_> fl0w: I'm making it up as I go along, but I'd store a bit on the user there (their name, id and anything else you need to display when showing that blog post), and yes update if a user changes his name
[22:20:05] <preinhei_> Really, you're saving a bunch of joins every single time a user loads a page
[22:20:26] <preinhei_> at the expense of a larger operation on the rare instance that someone changes their name
[22:20:42] <fl0w> preinhei_: I should however keep a "foreign key"-like identifier though, because otherwise, I'm extremely confused.
[22:20:49] <preinhei_> I do
[22:21:22] <preinhei_> I'm not storing a blog, but in that similar circumstance I am keeping a foreign key like element
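
The trade-off preinhei_ describes can be sketched with in-memory arrays standing in for a users and a posts collection. All names here (users, posts, renameUser, the author sub-document) are hypothetical. Each post carries a copy of the author's name for display plus the author's id as the "foreign key"-like element, which is what keeps two users both called John Doe apart.

```javascript
// Hypothetical in-memory stand-ins for a users and a posts collection.
const users = [
  { _id: 1, name: "John Doe" },
  { _id: 2, name: "John Doe" }, // a different user with the same name
];
const posts = [
  { _id: 10, title: "First", author: { id: 1, name: "John Doe" } },
  { _id: 11, title: "Second", author: { id: 2, name: "John Doe" } },
  { _id: 12, title: "Third", author: { id: 1, name: "John Doe" } },
];

// Rename one user and refresh the copied name on that user's posts only.
// Returns the number of posts touched.
function renameUser(userId, newName) {
  const user = users.find((u) => u._id === userId);
  if (!user) throw new Error("unknown user " + userId);
  user.name = newName;
  let touched = 0;
  for (const post of posts) {
    if (post.author.id === userId) {
      post.author.name = newName;
      touched += 1;
    }
  }
  return touched;
}

console.log(renameUser(1, "John Smith")); // 2
console.log(posts.map((p) => p.author.name)); // [ 'John Smith', 'John Doe', 'John Smith' ]
```

Reading a post never needs a second lookup for the author's name; the cost is the extra pass over that user's posts on the rare rename. Against a real database that loop would presumably become a single multi-document update on posts matching the author id.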
[22:21:47] <fl0w> aye, neither will I … just an example to make my point.
[22:22:08] <fl0w> I'm actually not ever doing a web-related application
[22:22:14] <fl0w> s/ever/even*