[00:06:09] <Godslastering> do mapreduce jobs run, in any way, concurrently? i.e. - will running mapreduce take advantage of multiple available cores?
[00:55:30] <dstorrs> I have a sharded DB. there are no backups or replicas. I tried to create an index, it ate the entire CPU and locked the DB. I ^C'd the operation (which I've been able to do before). That didn't work. I tried to killOp the operation.
[00:55:59] <dstorrs> then I tried to service mongoX stop my servers. one shard went down easily, the other took a while then said FAILED. it now has a lock file hanging around.
[00:56:13] <dstorrs> I'm reluctant to start up again until I know where I am.
[00:56:27] <dstorrs> Log files are available. Can anyone help?
[11:14:45] <akaIDIOT> so I need another roundtrip, but for the 'inner' query i only need to retrieve something to restrict the result set for the second part
[11:15:26] <akaIDIOT> can i tell a query (from Java) to give me just the objids and tell the second query to restrict itself to the objids
[11:15:49] <akaIDIOT> the latter could just be a $nin: <objids>
[11:16:13] <akaIDIOT> or should i just iterate it and cache all the objids manually
[11:17:22] <NodeX> you can bring back the documents you want to parse and use $in
[11:17:56] <NodeX> since _id's are always indexed it would make it a fast(er) round trip
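A minimal sketch of the two-step approach NodeX describes, in mongo shell syntax (the collection and field names are made up for illustration): project only _id in the first query, then feed the resulting ids into the second query with $in (or $nin):

    // first query: fetch only the _id values that define the restriction
    var ids = db.outer_coll.find({status: "done"}, {_id: 1}).map(function(d) { return d._id; });
    // second query: restrict to (or exclude, with $nin) those ids
    db.inner_coll.find({ref_id: {$in: ids}});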
[13:44:28] <souza> I'm trying to insert a big BSON document into my mongodb, but I can't get it to work :( - it runs normally, but nothing gets inserted into the database. This is my code: http://pastebin.com/azpxbgbS and if I try to print the BSON that will be inserted into MongoDB, I get this stack trace > http://pastebin.com/tB6uTnJU
[13:44:44] <MANCHUCK> how do i create a user that will be able to access all databases ?
[13:47:27] <souza> MANCHUCK: i recommend you read this > http://www.mongodb.org/display/DOCS/Security+and+Authentication =)
[13:48:19] <MANCHUCK> i followed that, however i still need to run the command on the database. i basically want to create a new root user
[13:49:28] <NodeX> you need to login to the admin database first
[13:50:39] <MANCHUCK> ahh that's what i was missing
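For reference, a rough sketch of what NodeX means in the 2.x shell (user name and password are placeholders): a user added while on the admin database can then authenticate once for the whole server:

    use admin
    db.addUser("superuser", "secret")   // placeholder credentials
    db.auth("superuser", "secret")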
[15:08:48] <simon___> Debian 6.0 vs Ubuntu 12.04 for mongodb server? Whats your thoughts?
[15:09:41] <simon___> Ive been using 10.04 for a long time without any big issues. But thinking about how many more problems Ive had with ubuntu on the desktop than with Debian, Im thinking about moving over to Debian.
[15:10:20] <simon___> The reason I switched on the desktop is the out-of-date desktop packages. On a server I really could not care less about the version of XFCE etc.. :)
[15:10:54] <simon___> I.e using Debian has always been a more stable experience for me when it comes to desktop, is there any real difference on the servers?
[15:11:02] <simon___> (not used Debian on servers since years ago)
[15:12:06] <vsmatck> Ubuntu is debian unstable with a few changes. So I just weigh it as debian stable vs debian unstable (and some people like to run testing, unstable, or mixed).
[15:13:29] <vsmatck> With mongodb the important consideration is mainly the filesystem.
[15:16:47] <TubaraoSardinha> Any idea on why mongo won't save dates on local time using ruby driver?
[15:19:38] <kali> TubaraoSardinha: mongodb dates are stored as offset from 1970-01-01T00:00:00 utc, without timezone information
[15:21:34] <TubaraoSardinha> kali: I was expecting this, but the thing is that I need to store consolidated user actions per day and my fear is that some user actions might end up on the wrong day due to timezone differences.
[15:22:57] <TubaraoSardinha> I imagined something like actions: {03082012: 5, 04082012: 3} with the key representing the day
[15:23:09] <TubaraoSardinha> Maybe my logic is wrong
[15:32:08] <kali> TubaraoSardinha: i would not use mongodb dates for that, because they are actually timestamps
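A sketch of the alternative kali is hinting at, close to TubaraoSardinha's own idea above: skip BSON dates entirely, have the application compute a day key in the user's timezone, and $inc a per-day counter (collection name, field names and key format are only examples):

    // "2012-08-03" computed application-side in the user's local timezone
    db.user_stats.update(
        {user_id: 42},
        {$inc: {"actions.2012-08-03": 1}},
        true    // upsert
    );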
[16:13:57] <jordanorelli> is there a tool out there that will let me specify a hash/dict/object with collection names for keys and query documents for values, that will dump all relevant documents?
[16:14:19] <jordanorelli> that is, for doing something like getting all the data for a particular week, to extract a portion of my data to make a testing database.
[16:23:11] <Godslastering> i'm attempting to do something like this ( http://paste.pound-python.org/show/ckzGbjCZHg3qOy5ixFEI/ ) in python using pymongo. this is running pretty slow to iterate over so many rows, and i'm wondering if i can do some custom advanced query in mongodb to speed it up. Basically, i want to peer into multiple collections with one query
[17:06:31] <nofxx> anyone got a replset who can run `mongotop` on the master? I'm not sure this is normal: local.oplog.rs 1094ms 1094ms 0ms
[17:30:55] <Godslastering> can anyone possibly help explain to me how the heck this is happening? http://paste.pound-python.org/show/uvXZxFrwxkdMvoFauLHk/
[18:00:30] <Godslastering> this is uh .... extremely annoying, does anyone have any clue how in the world this could happen? http://paste.pound-python.org/show/uvXZxFrwxkdMvoFauLHk/
[18:01:15] <chubz> Is there a way to start a replica set with all its members all at once?
[18:03:58] <mpobrien> @godslastering what does the resulting doc have in it?
[18:05:00] <Godslastering> mpobrien: {u'name':'bob',u'hostname':'bob.com'} it doesn't have u'ip' .. i'm just wondering how in the world mongodb is giving me the document if it doesn't have u'ip'. my query is wrong somehow, i'm guessing
[18:10:23] <mpobrien> it shouldn't really be too bad unless there are a LOT of those docs that are missing the ip field; is that the case?
[18:10:40] <Godslastering> mpobrien: yes. about 3 million are missing the ip field, and about 12 million have the ip field
[18:11:46] <mpobrien> ok, that does make sense then
[18:16:15] <Godslastering> also, if i have about 10 pre-allocated files, and i delete a 2gb chunk of data, can i tell mongodb to re-arrange the data and remove empty pre-allocated files?
[18:49:03] <Godslastering> mpobrien: http://paste.pound-python.org/show/uvXZxFrwxkdMvoFauLHk/ still having the same issue. i've tried the same query another place in my application and i'm getting the same error; this seems like a proper query, but it's acting wrongly
[18:53:36] <chubz> how come when i type rs.status().members.stateStr i get no output in mongo?
[19:01:46] <chubz> mpobrien: is there a way to display more than one member's state? like instead of rs.status().members[0].state something like rs.status().members[0..3].state
[19:02:07] <mpobrien> you can just write a function in javascript to do it
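One way to write the helper mpobrien suggests, as a shell one-liner over the members array:

    rs.status().members.forEach(function(m) { print(m.name + ": " + m.stateStr); });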
[19:08:50] <mpobrien> @godslastering what happens if you run that query from the shell, do those docs come up
[19:09:40] <Godslastering> mpobrien: actually what i pasted is wrong. i dont want those docs. i'm lost, i've gotta come back to this problem later or tomorrow.
[19:12:14] <slavik_> is it possible to recursively search for a key=value pair and return the top level doc if it is found somewhere?
[19:16:20] <wereHamster> slavik_: do it manually, using where
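A rough sketch of the manual $where approach wereHamster mentions (the key/value pair is an example; note that $where runs JavaScript for every document and cannot use indexes, so it is slow on large collections):

    db.mycoll.find({$where: function() {
        // walk the document recursively, looking for the key/value anywhere
        function has(o) {
            for (var k in o) {
                if (k === "status" && o[k] === "error") return true;   // example pair
                if (o[k] !== null && typeof o[k] === "object" && has(o[k])) return true;
            }
            return false;
        }
        return has(this);
    }});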
[19:23:05] <Godslastering> mpobrien: i am having an issue though: viewing my mongod output log, i'm getting like 80% of my queries over 150ms ... it's running on a quad core 3.2GHz i7 ... these are pretty simple queries, and i'm querying on indexed data. is this normal?
[19:24:24] <mpobrien> you should double check what the query plan is with .explain() just to be sure
[19:24:47] <mpobrien> also look in mongostat to see if theres page faults
[19:25:15] <Godslastering> mpobrien: ok i'm running mongostat. what do i do now? what am i looking for?
[19:25:25] <mpobrien> do you have a "faults" field in there?
[19:26:18] <Godslastering> mpobrien: no. i don't believe so.
[19:26:46] <mpobrien> FYI cpu is rarely the bottleneck for queries, its usually RAM or io
[19:27:22] <Godslastering> mpobrien: mongod is using 1gb ram, and i've got about 6gb of ram free on the system. how would i know if the bottleneck here was I/O?
[19:28:11] <mpobrien> try running iostat and monitor that
[19:28:38] <mpobrien> also in the log, what does it say when it reports a slow query? should have something like nscanned etc.
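Putting mpobrien's diagnostics together, a minimal check from the shell (query values taken from the conversation): without a usable index, explain() reports a BasicCursor and an nscanned close to the collection size, which matches the slow queries in the log:

    db.mycoll.find({ip: "192.168.0.1"}).explain();
    // look at "cursor" (BasicCursor vs BtreeCursor), "nscanned" and "millis"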
[19:34:49] <Godslastering> mpobrien: hm, ok. that explains it. but if i do db.mycollection.ensureIndex({ip:1,hostname:1,name:1}), why wouldn't this work?
[19:35:03] <mpobrien> whats the query you're running?
[19:35:34] <Godslastering> mpobrien: db.mycoll.find({ip:'182.168.0.1'}) in the one i pasted
[19:41:20] <Godslastering> mpobrien: alright, db.mycoll.getIndexKeys() is telling me exactly what i'd expect, but it's still using a basiccursor in .explain()
[19:42:26] <Godslastering> mpobrien: and attempting to force an index with db.mycoll.find({ip:'192.168.0.1'}).hint({ip:1}).explain() is telling me bad hint
[19:43:00] <mpobrien> what does getIndexKeys() say?
[19:43:35] <Godslastering> mpobrien: that's what i pasted earlier, the [{_id:1}] stuff
[19:44:10] <mpobrien> thats still incorrect though - the ordering matters
[19:44:25] <mpobrien> if you have an index on (name, ip, hostname)
[19:44:39] <mpobrien> you can query on (name) or (name, ip), or (name, ip, hostname) and it will use the index
[19:44:45] <Godslastering> mpobrien: ok i was confused at how indices work
[19:44:57] <Godslastering> mpobrien: i just did db.mycoll.ensureIndex({ip:1}) and now it's using a BTreeCursor
[19:45:03] <mpobrien> but you can't query on just ip, because its not the first field
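A sketch of the prefix rule mpobrien describes (collection name and values are examples):

    db.mycoll.ensureIndex({name: 1, ip: 1, hostname: 1});
    db.mycoll.find({name: "bob"});                         // prefix: uses the index
    db.mycoll.find({name: "bob", ip: "192.168.0.1"});      // prefix: uses the index
    db.mycoll.find({ip: "192.168.0.1"});                   // not a prefix: falls back to a full scan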
[19:47:44] <Godslastering> mpobrien: wow, since i did that i haven't seen a single slow query show up yet
[19:53:51] <Godslastering> mpobrien: ok, since i was confused before, let me be sure i know what i'm doing here: i only have to do ensureIndex once, even if i insert a lot of data after this? will it rebuild the index?
[19:59:04] <kali> Godslastering: yes. indexes are maintained when you write (not on the first read, a la couchdb)
[19:59:49] <Godslastering> kali: alright, thanks. i was pretty confused with the documentation that i read
[20:01:25] <kali> Godslastering: that said, if you have a big chunk of data to load, it might be a good idea to insert and then to setup the index
[20:02:48] <Godslastering> kali: i already have the bulk of my data inserted, from now on it will be just like 10-20000 entries an hour, compared to the initial chunk. should i check to be sure the indices exist at the beginning of my script before i run to make sure i dont get bad performance?
[20:04:12] <kali> Godslastering: you don't have to do anything. if the index is there, it won't go away :)
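A sketch of the bulk-load pattern kali describes: insert the data first, create the index once afterwards, and it is then maintained automatically on every later write (the data here is fabricated for illustration):

    for (var i = 0; i < 100000; i++) {
        db.mycoll.insert({name: "host" + i, ip: "10.0." + Math.floor(i / 255) + "." + (i % 255)});
    }
    db.mycoll.ensureIndex({ip: 1});   // one-time; subsequent inserts keep it up to date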
[20:06:17] <kali> i think ip:null actually works for both null and lack of value
[20:06:27] <kali> unless you define the index as sparse
[20:06:30] <Godslastering> ah, alright, good then.
[20:06:59] <Godslastering> and db.mycoll.find({ip:{$nin:[null,'__unresolved__']}}) was giving me some documents in which 'ip' wasn't even defined though
[20:09:54] <Godslastering> kali: that's fine, yeah, but db.mycoll.find({ip:{$nin:[null,'__unresolved__']}}) is still giving me documents where ip might be null
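If documents that lack the field entirely should definitely be excluded, one option is to add $exists to the same clause (a sketch only; whether this explains the behaviour Godslastering is seeing is unclear from the log):

    db.mycoll.find({ip: {$nin: [null, "__unresolved__"], $exists: true}});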
[20:25:47] <Godslastering> kali: luckily this is a personal project, but because it _is_ personal, lots of disk usage can get annoying
[20:35:15] <Godslastering> kali: if mongodb allocated 20 files when i had a lot of data, and then i remove like 4gb, can i tell it to get rid of unnecessary pre-allocated files?
[20:37:52] <mpobrien> if you do a repairDatabase, it will re-write the datafiles with the minimum space needed
[20:38:11] <Godslastering> mpobrien: so, from the mongo client, db.mycoll.repairDatabase() ?
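Note that repairDatabase is a database-level helper rather than a collection method, so from the shell it would be run as below (it needs roughly as much free disk as the data it rewrites and blocks the database while it runs):

    use mydb          // placeholder database name
    db.repairDatabase();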
[21:20:36] <mpobrien> well, depends on what acceptable performance is for you… the potential issue is that .skip() gets expensive if you have a lot of docs
[21:20:49] <mpobrien> so it works better on smaller collections
[21:21:12] <mpobrien> an alternative is, populate a random number into each doc when you insert it
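A sketch of the random-field trick mpobrien describes (field and collection names are examples): store a random number on each insert, index it, then pick a document near a random point instead of paying for skip():

    db.mycoll.insert({name: "bob", rand: Math.random()});
    db.mycoll.ensureIndex({rand: 1});

    var r = Math.random();
    var doc = db.mycoll.findOne({rand: {$gte: r}}) || db.mycoll.findOne({rand: {$lt: r}});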
[22:53:42] <nofxx> My server is almost idle, got some spare ram for mongo, 90% of queries are < 100ms, but every once in a while i get some finds by id going 300-700ms ...
[22:53:57] <nofxx> and this local.oplog.rs 1973ms 1973ms 0ms in mongotop.... is this normal? first trip, sailor...
[22:55:25] <pas256> is there a reason why MongoDB with a read only workload doing geospatial queries would not use all cores on a system?
[22:55:41] <nofxx> that and mongoid always puts an $orderby, even in finds by id...: { $query: { _id: ObjectId('501c46c343d8a44c26000163') }, $orderby: { _id: 1 } } ... wondering if it could hit performance in some way
[22:56:05] <nofxx> pas256, iirc you need sharding to use multiple cores
[22:57:56] <nofxx> manveru, using mongoid? 3 has findAndModify working
[22:58:07] <pas256> nofxx: On a smaller box, it used all cores
[22:59:05] <Liquid-Silence> is there a way to remove an element from all documents in a collection?
[22:59:28] <jordanorelli> manveru: in mgo it's Find(…).Apply(). it might be fixed by now, but there's a bug on findAndModify if your session mode is set to eventual
[23:00:41] <Liquid-Silence> jordanorelli: any idea?
[23:01:16] <mpobrien> Liquid-Silence: use the $unset operator
[23:01:54] <Liquid-Silence> not too sure what you mean mate
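What mpobrien is pointing at, as a sketch (the field name is an example): an update with $unset and multi=true removes the field from every document in the collection:

    db.mycoll.update({}, {$unset: {obsolete_field: 1}}, false, true);   // upsert=false, multi=true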
[23:26:10] <jordanorelli> e.g., i get a document, regardless of whether it has some key, and then i want to see, for that document that i've already retrieved, if i have that key.