PMXBOT Log file Viewer


#mongodb logs for Thursday the 10th of October, 2013

[01:42:20] <Astral303> hey folks
[01:42:32] <Astral303> is "sharded connection to X not being returned to the pool" in mongos any kind of alarm or is that just a random informational log message?
[03:43:58] <jonasschneider> given an _id index, is there a way to find out which key is at position N of the index?
[03:44:18] <jonasschneider> i'm trying to partition an id-space into roughly-equally-sized parts
[03:44:30] <joannac> db.coll.find().sort({_id:1}).skip(N-1)
[03:44:59] <jonasschneider> is this… a good idea for very large collections?
[03:51:27] <jonasschneider> looks like it is!
[03:51:46] <jonasschneider> talk about forests and trees. thanks joannac!
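joannac's skip-based answer can be sketched in plain JS over an in-memory sorted id array (function and variable names are made up; against a real collection each boundary would come from `db.coll.find().sort({_id: 1}).skip(n).limit(1)`):

```javascript
// Cut a sorted id space into `parts` roughly equal ranges by picking
// the id sitting at each cut position.
function partitionBoundaries(sortedIds, parts) {
    var bounds = [];
    for (var i = 1; i < parts; i++) {
        bounds.push(sortedIds[Math.floor(i * sortedIds.length / parts)]);
    }
    return bounds;
}

// 8 ids cut into 4 parts -> boundary ids at positions 2, 4, 6
console.log(partitionBoundaries([1, 2, 3, 4, 5, 6, 7, 8], 4)); // [ 3, 5, 7 ]
```

Each returned id is the lower bound of the next partition; on a large collection the `skip()` calls are O(n) each, which is why this is only reasonable as a one-off.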
[06:27:07] <liquid-silence> hi all
[06:27:19] <liquid-silence> I am trying to model a directory structure with permissions in mongodb
[06:27:48] <liquid-silence> so a folder will contain folders and files
[06:28:01] <liquid-silence> each user might have access to some of the files or some of the folders
[06:28:22] <liquid-silence> so I was thinking an array of permissions (permission per user)
[06:28:38] <liquid-silence> but I am struggling to find the best way to model the whole solution
[06:28:42] <liquid-silence> for query purposes
[06:28:59] <liquid-silence> for example, I would like to pass in a userid and get all the folders and files he has access to
[06:29:14] <liquid-silence> was looking at materialized trees
[06:29:20] <liquid-silence> but not too sure it will work
[06:30:33] <liquid-silence> any ideas / docs will be helpful
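The materialized-path idea liquid-silence mentions can be sketched in plain JS (all names and the ACL layout are made up): each node stores its full ancestor path plus a per-user ACL array. In MongoDB the same two lookups would be `find({acl: user})` and `find({path: /,folderId,/})`, with indexes on `acl` and `path`.

```javascript
// Materialized paths: ",root,docs," means this node lives under root/docs.
var nodes = [
    { _id: "root", path: ",",           acl: ["alice"] },
    { _id: "docs", path: ",root,",      acl: ["alice", "bob"] },
    { _id: "cv",   path: ",root,docs,", acl: ["bob"] }
];

// Everything a given user can see (MongoDB: find({acl: user})).
function accessibleBy(user) {
    return nodes.filter(function (n) { return n.acl.indexOf(user) !== -1; })
                .map(function (n) { return n._id; });
}

// All descendants of a folder (MongoDB: find({path: /,folderId,/})).
function descendantsOf(folderId) {
    return nodes.filter(function (n) {
        return n.path.indexOf("," + folderId + ",") !== -1;
    }).map(function (n) { return n._id; });
}

console.log(accessibleBy("bob"));   // [ 'docs', 'cv' ]
console.log(descendantsOf("root")); // [ 'docs', 'cv' ]
```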
[07:05:01] <rhalff> hi, not sure if I understand mongodb. An _id is not an id, and you can have an _id that looks like an _id but isn't an ObjectId(). I now have two different documents in the db, both with the same _id, but one wrapped inside ObjectId() and the other not.
[07:08:05] <rhalff> ok, rant over, I'll read the docs again... :)
[07:15:39] <rhalff> ok I was under the assumption I could save a document with an _id, this is not the case?
[07:15:42] <rhalff> I should use update?
[07:17:40] <admiun> rhalff: you get it for free? just save something { test: 1 } and it will get an _id automatically?
[07:17:59] <admiun> or is that not what you meant
[07:24:06] <rhalff> admiun, I think I misunderstood how to update a document, I thought when I give save() the full document, it would just save, now trying it with .update({ _id: ObjectId(obj.id) }, { $set: obj })
[07:25:17] <admiun> save() will save the full document yes
[07:30:42] <rhalff> admiun, ok, doesn't work for me yet.
[07:30:57] <admiun> hmm, how are you retrieving the document?
[07:31:52] <rhalff> ah, I actually got an error back now, mod on _id not allowed.
[07:32:09] <rhalff> should I just leave _id alone, and only work with an id for my application?
[07:32:26] <rhalff> I'm retrieving the document from the database, then trying to send it back.
[07:32:46] <admiun> yeah in that case you shouldn't have to touch _id unless you want it to be something special
[07:33:16] <rhalff> ok :)
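The "mod on _id not allowed" error above comes from round-tripping the fetched document straight into `$set`, _id included. A plain-JS sketch of the fix (helper name is made up): strip _id before building the update.

```javascript
// Copy a document minus its immutable _id, so it can be passed to $set.
function withoutId(doc) {
    var copy = {};
    for (var k in doc) {
        if (k !== "_id") copy[k] = doc[k];
    }
    return copy;
}

console.log(withoutId({ _id: "524d73a7", test: 1 })); // { test: 1 }
// Then: db.coll.update({ _id: obj._id }, { $set: withoutId(obj) })
```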
[07:34:10] <luminous> hello! I'm installing mongodb with salt stack, and trying to automate adding the PPA
[07:34:18] <luminous> sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
[07:34:33] <luminous> where can I find/determine the user/archive?
[07:35:04] <luminous> http://docs.saltstack.com/ref/states/all/salt.states.pkgrepo.html#salt.states.pkgrepo.managed <<< ppa is in form of 'user/archive'
[09:45:03] <gerryvdm_mbp> i have a document with a hash embedded, is there any way to query the documents that have at least one element in the hash?
[09:46:48] <kali> gerryvdm_mbp: not easily and efficiently
[09:47:32] <gerryvdm_mbp> with map reduce then?
[09:55:07] <kali> gerryvdm_mbp: http://uu.zoy.fr/p/hotebumo#clef=vckpmbfpcpokogee
[09:55:35] <kali> gerryvdm_mbp: the best i can imagine
[10:09:30] <gerryvdm_mbp> thx kali
[11:24:02] <salty-horse> hey. I upgraded a sharded cluster from 2.2 to 2.4.6. Running db.serverStatus({workingSet:1}) doesn't give me any workingSet info. Does it take time to gather it? Perhaps I did something incorrectly when upgrading?
[11:30:16] <kali> salty-horse: are you calling it against a mongod or a mongos ?
[11:31:02] <salty-horse> kali: mongos. I guess that's my mistake :)
[12:18:40] <frzen> Hi, I'm currently an undergraduate Computer Science student, and I was thinking about looking into the mongoDB codebase as a preliminary check to see if there are any interesting project ideas. (I'm trying to do a project on distributed systems and I was wondering if replacing some protocols with PAXOS would be a challenge) Is this the right place
[12:18:40] <frzen> to ask questions on the codebase? or is there a dev channel?
[12:19:39] <ron> heh
[12:39:11] <dllama> hello
[12:42:42] <dllama> can someone help out a mongo noob with some optimization please
[12:44:32] <dllama> i have an 11gb db with approximately 40k records
[12:48:55] <dllama> yesterday with half of those records it was lightning fast, today i can barely get it to load. i really am not sure what indexes i should put or where, but any input would be greatly appreciated
[12:55:24] <kali> dllama: what is slow ? read ? write ?
[12:55:31] <dllama> read
[12:55:43] <dllama> its taking like 60+ sec to load the page
[12:55:52] <dllama> and i see in top, mongo is taking up about 16.3gb of VIRT
[12:56:25] <kali> dllama: yeah. that's ok
[12:56:42] <kali> dllama: show us a typical slow query, and a typical document
[12:56:48] <dllama> i think i'm left with maybe 64kb of free ram
[12:57:29] <dllama> for now its just a listing
[12:57:36] <dllama> 1 sec, i'll gist the query
[13:00:58] <dllama> https://gist.github.com/mvoloz/e9eab3186f23ab075137
[13:01:04] <dllama> that's from my app logs
[13:03:31] <kali> VIRT is not RAM, it's adress space
[13:04:02] <dllama> like i said, i'm an absolute noob with mongo, so i dont know where to look to try to optimize my code, or how to even read these queries tbh
[13:04:43] <dllama> i havent yet been able to figure out how to do eager loading in my queries which i suspect will improve things a bit
[13:04:48] <dllama> but i mean as it is now, its pretty terrible
[13:05:07] <kali> have you checked the indexes ?
[13:05:21] <dllama> ive only created 1 index
[13:05:28] <dllama> on my Award model
[13:05:40] <dllama> index(award_date: 1)
[13:05:44] <dllama> i have that for award_date
[13:05:49] <kali> you need one on claimant_id :P
[13:06:23] <dllama> do i need one for award_id ?
[13:06:43] <kali> not from what you've pasted
[13:06:45] <dllama> sorry if this is a stupid question, but shouldn't id's get indexed automatically?
[13:06:49] <kali> no.
[13:07:00] <dllama> ok i'll add the claimant_id index now,
[13:07:44] <dllama> kali, what's the syntax to reindex?
[13:07:59] <kali> "reindex" ?
[13:08:09] <kali> what do you mean ?
[13:08:25] <dllama> i'm going to add the index now, but how do you populate it with data?
[13:08:37] <dllama> or it'll just pick up?
[13:09:43] <kali> just create it, mongodb will build it
[13:09:46] <dllama> ok
[13:10:22] <dllama> sorry, another stupid question
[13:10:30] <dllama> do i create that index on the claimant model or awards model?
[13:10:39] <dllama> its an association
[13:10:55] <dllama> hmm, ok that paste was actually from the claimants model
[13:10:58] <dllama> so i'll index that
[13:11:07] <kali> what ? no !
[13:11:45] <kali> look at your log, you're iteratively counting awards for 30 claimants, each time scanning your whole collection for 6-7 seconds
[13:15:30] <dllama> kali, i see the queries in the log, but i'm not seeing where to add them.
[13:15:36] <dllama> i mean for now i can disable counting
[13:15:44] <dllama> counting the associated records*
[13:18:12] <dllama> just to start the debugging process.
[13:18:52] <dllama> alright, i've removed the count query from the claimants model and its pretty quick again, so i'm fairly confident the culprit is the awards model
[13:19:36] <dllama> i'm going to gist another query in a sec
[13:19:43] <dllama> this one is taking exceptionally longer
[13:23:01] <agend> hi all
[13:23:50] <dllama> kali, https://gist.github.com/mvoloz/e9eab3186f23ab075137
[13:23:57] <dllama> i've updated the log to show both
[13:25:25] <birdy111> I am facing an issue in database drop...
[13:25:34] <birdy111> Running dropDatabase from mongos failed, for 150 GB+ database.....
[13:25:47] <agend> I have a nested doc, and the nested doc's keys are ISODates - and while doing update I want to do smth like: 'values.months.(ISODate here).n' : 5. And trying to do it from python with pymongo - how can I make key like this with ISODate?
[13:27:00] <birdy111> when i checked show dbs on mongos it showed that database.. But when i tried it on all shards, I Observed it was deleted from some of the shards
[13:28:23] <kali> dllama: well, MOPED: db1.mongo:27017 QUERY database=db_admin collection=awards selector={"$query"=>{"processed"=>true}, "$orderby"=>{"award_date"=>-1}} flags=[] limit=30 skip=240 batch_size=nil fields=nil runtime: 1284.6350ms
[13:28:38] <kali> dllama: you need an index on (processed, award_date)
[13:28:48] <kali> dllama: and you need to read about mongodb query optimisation.
[13:28:50] <birdy111> Then again I tried dropDatabase from mongos... this time I received error message "Config DB not found"
[13:29:15] <dllama> kali, i will be reading up on it for sure. like i said, this is my first attempt @ an app powered by mongo
[13:29:33] <dllama> and while i was ecstatic about performance last night, today i'm miserable
[13:30:09] <birdy111> This was the exact error message in second time - "couldn't find database [ZZZZ] in config db"
[13:33:49] <dllama> kali, in my model i have award_date: 1, but in that query i'm seeing award_date => -1, can i have 2 indexes for same column? meaning 1 for ascending, 1 for descending?
[13:34:12] <kali> agend: you'll have to refactor your documents. using keys which are values and not keywords is a bad idea
[13:35:09] <kali> dllama: it's not needed, mongodb knows how to read an index backwards. but you need the index to be on (processed, award_date) and not only (award_date) as you're filtering on processed
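kali's suggestion, as a hedged sketch in 2013-era shell syntax (collection and field names taken from the pasted moped log line):

```javascript
// Compound index: equality field first, then the sort field.
db.awards.ensureIndex({ processed: 1, award_date: 1 })

// The planner can answer the slow query by walking this index
// backwards -- no separate descending index is needed:
db.awards.find({ processed: true }).sort({ award_date: -1 }).explain()
```

`explain()` should then report an index scan (a `BtreeCursor` in this MongoDB generation) instead of a full collection scan.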
[13:35:18] <agend> kalI: but is it possible to have isodate as key?
[13:36:43] <kali> agend: mmm actually, no, it's not possible, keys are strings
[13:37:22] <agend> kali: ok - thanks
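kali's refactoring advice can be sketched in plain JS (function name and field names are made up): turn the date-keyed subdocument into an array of `{ month, n }` subdocuments, so the dates become values rather than keys and stay queryable.

```javascript
// Convert { "2013-10-01": 5, ... } into [ { month: Date, n: 5 }, ... ],
// sorted by month. Real Date values can then be indexed and range-queried.
function monthsToArray(monthsObj) {
    return Object.keys(monthsObj).sort().map(function (k) {
        return { month: new Date(k), n: monthsObj[k] };
    });
}

var out = monthsToArray({ "2013-10-01": 5, "2013-11-01": 7 });
console.log(out.length); // 2
console.log(out[0].n);   // 5
```

A pymongo update could then target an element with the positional operator, e.g. matching on `values.months.month` and setting `values.months.$.n`.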
[13:42:35] <birdy111> any help on this is appreciated
[13:43:07] <dllama> kali, i've gotten SOME improvement from adding the processed index, but my biggest bottleneck seems to be when i'm doing associations
[14:07:08] <kali> dllama: well, you need to learn much more about mongodb (and i do mean mongodb, not moped). your schema and querying patterns could probably be improved
[14:07:53] <dllama> kali, rails forms the queries for me
[14:08:08] <dllama> i'm also in the mongoid room trying to see if maybe i'm using the rails adapter wrong
[14:13:39] <kali> dllama: well, rails/mongoid/moped whatever does a crappy job, here
[14:13:57] <kali> dllama: you need to know how a database works, even if you're using mongoid/moped/mongoose, etc
[14:18:09] <bertodsera> I am having trouble in storing a Decimal on Debian (python 2.6 / mongo 2.4.6), whereas the same code works fine on Gentoo (with python 26/2.7 and mongo 2.2). Any idea of where the missing bson conversion could be? It seems that Debian has something outdated
[14:18:16] <remonvv> Hey. Anyone any theories on why a collection takes 21Gb storage while dataSize and indexSize combined is 3.8Gb. I'm aware there's always a gap, sometimes significant ones but this is a new record for me.
[14:18:36] <cheeser> have you deleted a bunch of data from the collection?
[14:19:11] <remonvv> No. This is a users collection for one of our customers. Users never get deleted, only blanked and flagged as such.
[14:19:40] <cheeser> huh
[14:19:52] <kali> remonvv: are they growing ?
[14:20:25] <remonvv> kali: Not usually, perhaps sometimes if significantly larger values are stored for the same field.
[14:20:40] <saml> hey, how can I do SELECT DISTINCT foobar FROM tb; ?
[14:20:50] <remonvv> I'm tempted to compact it with a higher padding.
[14:20:59] <saml> db.tb.find({foobar: {$exists: 1}})
[14:21:12] <saml> and then only display distinct
[14:21:17] <saml> let me google because i'm good
[14:21:40] <saml> db.tb.distinct('foobar')
[14:21:55] <kali> saml: yeah, and the filter as second parameter
[14:22:19] <saml> level unlocked: google search skills
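What `db.tb.distinct('foobar', filter)` does can be mimicked in plain JS (function name is made up), which also shows why saml's `$exists` step is implicit — documents without the field contribute nothing:

```javascript
// First-seen-order distinct values of `field` across an array of docs.
function distinct(docs, field) {
    var seen = {};
    var out = [];
    docs.forEach(function (d) {
        if (field in d && !(d[field] in seen)) {
            seen[d[field]] = true;
            out.push(d[field]);
        }
    });
    return out;
}

console.log(distinct([{ foobar: 1 }, { foobar: 2 }, { foobar: 1 }, { x: 3 }], "foobar")); // [ 1, 2 ]
```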
[15:42:58] <feggot007> good night guys, can you help me out?
[15:43:06] <feggot007> i got an issue with the _id creation of mongodb.
[15:43:14] <feggot007> it generates invalid _id's by default.
[15:43:34] <cheeser> i ... doubt that.
[15:43:34] <feggot007> its like 5256ca3925d16a2667743b6ae145b4
[15:43:36] <cheeser> invalid how?
[15:44:04] <feggot007> im using the javascript native driver, but i also tried in mongo console
[15:44:22] <feggot007> when i try to search for it for example the ObjectId function gives an error.
[15:44:47] <cheeser> how are you querying?
[15:45:33] <feggot007> let me get you a code snippet.
[15:45:43] <feggot007> i simply use .find and .insert
[15:45:44] <feggot007> atm.
[15:46:23] <feggot007> i use expressjs
[15:46:31] <feggot007> i added an extra closing for it
[15:46:31] <feggot007> process.on('SIGINT', function() { console.log('successful.'); db.close(function(){process.exit()}) });
[15:46:57] <feggot007> it simply generated wrong _id's after some time
[15:47:29] <feggot007> db.collection('categories').insert(toadd, function(a,b){res.redirect('/categories')});
[15:47:33] <feggot007> i insert like that
[15:48:11] <cheeser> that's the javascript driver or the shell? looks like the js driver...
[15:48:17] <feggot007> yes, its the js driver.
[15:50:09] <feggot007> did you see problems like that already?
[15:57:23] <feggot007> guys, have you seen any _id's generated by mongodb like this: 5256ca3925d16a2667743b6ae145b4 ?
[15:58:22] <feggot007> i have a problem with id generating, any of you have any suggestion?
[15:58:48] <cheeser> what's the collection look like in mongo?
[16:00:00] <feggot007> i have records like that: { "name" : { "original" : "none", "actual" : "Intel" }, "pricemargin" : 0, "children_of" : ObjectId("524d729af6596d176c000002"), "_id" : ObjectId("524d73a7a188f40d6e000008") }
[16:00:30] <feggot007> *documents
[16:00:44] <feggot007> nothing badass:/
[16:00:54] <cheeser> that doc looks fine
[16:00:56] <feggot007> everytime i wanna go forward i get hit by that issue
[16:01:03] <feggot007> yes, but the others
[16:01:16] <cheeser> i've not used the js driver so i'm not sure if that syntax is appropriate or not
[16:01:20] <cheeser> what others?
[16:02:24] <feggot007> the syntax is good, because its working
[16:02:35] <feggot007> but the _id's generated by mongodb are bad after some time
[16:04:10] <cheeser> i'm not convinced the problem is in mongod
[16:04:25] <cheeser> but then, i still haven't seen any documents with these bad IDs
[16:05:49] <ron> there are no bad ids. only bad people.
[16:10:05] <feggot007> cheeser
[16:10:07] <feggot007> when i insert
[16:10:13] <feggot007> a blank document in mongo shell
[16:10:18] <feggot007> or with a test data
[16:10:24] <feggot007> it still generated bad id's
[16:11:02] <feggot007> > db.categories.insert({test:'data'}) > db.categories.find({test:'data'}) { "_id" : ObjectId("5256d15adc6b5f3a311797b2"), "test" : "data" }
[16:14:46] <jyee> how is that a bad id?
[16:16:30] <feggot007> that one is not.
[16:16:31] <feggot007> hmm.
[16:16:36] <feggot007> let me check some stuff.
[16:18:50] <jyee> feggot007: but what do you mean by "bad" id? are you getting duplicates (i.e. collisions)?
[16:19:01] <jyee> or is there something else wrong with the id?
[16:19:56] <feggot007> jyee: i had longer _id's generated
[16:20:19] <feggot007> its got fixed randomly
[16:20:27] <feggot007> i tried a mongorestore from an older db
[16:20:36] <feggot007> seems like i had luck this time
[16:20:57] <feggot007> i had ids like 5256ca3925d16a2667743b6ae145b4 jyee
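The suspect ids above can't be real ObjectIds: a BSON ObjectId is 12 bytes, always rendered as exactly 24 hex characters, while `5256ca3925d16a2667743b6ae145b4` is 30 characters long — which is why `ObjectId()` rejects it. A quick pure-JS check (no driver needed; function name is made up):

```javascript
// A valid ObjectId hex string is exactly 24 hex characters (12 bytes).
function isValidObjectIdString(s) {
    return /^[0-9a-fA-F]{24}$/.test(s);
}

console.log(isValidObjectIdString("5256d15adc6b5f3a311797b2"));       // true
console.log(isValidObjectIdString("5256ca3925d16a2667743b6ae145b4")); // false (30 chars)
```

Ids like that typically point at whatever layer built the string, not at mongod's id generation.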
[16:26:25] <schnittchenzzz> I think I found a bug in the init.d script, what should I do next?
[16:31:12] <schnittchen_> seems I need to do a bit more of debugging first
[16:44:39] <Number6> schnittchen_: What Linux version, and what's the bug?
[16:45:27] <schnittchen_> Number6: found https://jira.mongodb.org/browse/SERVER-6008 in the tracker by now. trying to apply the diff locally
[16:46:01] <schnittchen_> I guess it's fixed upstream
[17:18:54] <Nomikos> What could cause cursor->count() to return 306, but looping through it returning over double the items?
[17:36:12] <leifw> Nomikos: is there anything else happening on the system while you're doing this? concurrent inserts or updates?
[17:38:28] <Nomikos> leifw: I've just "narrowed it down" to if I leave off the ->sort(['_id'=>1]) (it's php) it /does/ behave..
[17:42:53] <Nomikos> so I'm running the find(), then sort(), then count() - that returns 306. but when I foreach through them, I get 700 unique products somehow. If I leave out the sort(), it has the correct nr of products
[17:45:18] <Nomikos> I should add it's an $or search, and one of the items uses regex. searching *only* on the regex I get 306 vs 316
[17:45:26] <Nomikos> (unless, again, leaving out the sort())
[17:45:55] <leifw> can you show an example document and the actual query?
[17:45:55] <Nomikos> the other part of the $or matches nothing
[17:46:11] <leifw> and what indexes do you have?
[17:48:13] <Nomikos> trying to put something comprehensible up
[17:50:28] <Nomikos> leifw: http://pastebin.com/QHCmUZeX
[17:51:57] <Nomikos> leifw: example doc http://pastebin.com/VW3Y9yfg
[17:55:07] <Nomikos> I can not reproduce it in the CLI, does that make it a driver issue? or is my code/operation order to blame somehow?
[17:55:50] <leifw> I don't know how the php driver works
[17:55:57] <leifw> but I did find this, it seems related: https://jira.mongodb.org/browse/SERVER-1205
[17:56:02] <leifw> hope that sets you on the right track
[17:58:37] <Nomikos> leifw: thanks, I'll try to work around it somehow. they should be small enough result sets to sort in code
[17:58:49] <Nomikos> and I'll google some more
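Nomikos's fallback (sorting small result sets in application code instead of via the cursor's `sort()`) can be sketched in plain JS (sample docs are made up; this assumes _id values that compare lexically):

```javascript
// Fetch unsorted, then order client-side with a comparator on _id.
var docs = [{ _id: "b" }, { _id: "a" }, { _id: "c" }];
docs.sort(function (x, y) {
    return x._id < y._id ? -1 : x._id > y._id ? 1 : 0;
});

console.log(docs.map(function (d) { return d._id; })); // [ 'a', 'b', 'c' ]
```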
[22:08:36] <heewa> Question about complex indexes. I'm saving _id values like {"t": new Date(2013, 02, 03), "m": NumberLong("120")} (so a date & a big number). The _id is indexed, obviously. Then, we want to query a specific "m" value, on a range of dates. So, how do we do that?
[22:11:15] <heewa> db.coll.find({_id: {t: {$gte: new Date(2010, 1, 1), $lte: new Date(2018, 1, 1)}, m: NumberLong("120")}}) or db.coll.find({_id: {$gte: {t: new Date(2010, 1, 1), m: NumberLong("120")}, $lte: {t: new Date(2018, 1, 1), m: NumberLong("120")}}}) ?
[22:58:52] <joannac> index on m, t in that order
[23:04:16] <joannac> Actually, is that your only find() usecase?
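A hedged sketch of joannac's answer in 2013-era shell syntax (collection name assumed): dot notation reaches into the embedded _id, but the built-in _id index covers the whole `{t, m}` subdocument, not its parts, so the subfield query needs its own compound index, with the equality field m leading.

```javascript
// Separate index on the _id subfields, equality field first:
db.coll.ensureIndex({ "_id.m": 1, "_id.t": 1 })

// Specific m, range of t:
db.coll.find({
    "_id.m": NumberLong("120"),
    "_id.t": { $gte: new Date(2010, 1, 1), $lte: new Date(2018, 1, 1) }
})
```

Neither of heewa's original forms does this: querying `_id` against a whole subdocument matches only by exact (field-order-sensitive) equality or by BSON comparison order on the full value.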
[23:28:31] <rbd_> hey guys...say I have a datetime field called 'when' and I want to get results where when >= date1 and when <= date2. To properly index this, do I need a compound index with 'when' twice in it, or will just a simple index work? if compound, does indexing order matter?
[23:35:17] <crudson> rbd_: did you try .explain() and see what it says?
[23:43:46] <rbd_> crudson: I will...but I haven't built the DB yet, so I figured I'd ask here after looking on google a bunch
[23:53:51] <crudson> rbd_: ah. Yes it will use a single index just fine
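crudson's answer, as a sketch (collection name assumed, `date1`/`date2` are placeholders): a single-field index serves a range with both bounds on the same field; there is no reason to repeat a field inside one compound index.

```javascript
// One simple index covers `when >= date1 AND when <= date2`:
db.coll.ensureIndex({ when: 1 })
db.coll.find({ when: { $gte: date1, $lte: date2 } }).explain()
```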