[01:38:25] <styles> I have a mongo server and am running a map reduce. It only uses one core, which I've read is normal? The map reduce is taking ~34s for ~1M records (growing about 10k /day). Is there anything I can do to speed this up?
[01:40:14] <styles> sorry, typo: it's about 100k /day
[03:38:02] <runnyspot> i want to iterate over all items in a huge collection and want to resume if something goes wrong, what's the best way? can i sort by mongoid for free?
[03:40:25] <joannac> sort by _id and keep track of the last one you saw
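A minimal shell sketch of what joannac describes, assuming a hypothetical collection named `items` and some external place to persist the checkpoint:

    // Resume-friendly scan: the _id index makes the sort and the $gt filter cheap.
    // `items` is a placeholder collection name; persist lastSeenId somewhere durable.
    var lastSeenId = null;   // load the saved checkpoint here on restart

    var query = (lastSeenId === null) ? {} : { _id: { $gt: lastSeenId } };
    db.items.find(query).sort({ _id: 1 }).forEach(function (doc) {
        // ... process doc ...
        lastSeenId = doc._id;   // checkpoint: save this after each doc (or each batch)
    });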
[05:03:39] <pause> I'm currently using mongodb with nodejs/mongoose, and i have a hard time getting consistent unique index errors when inserting documents
[05:04:56] <pause> I'm spamming document insertions, and the uniqueness error only happens like one time in ten tries
[05:27:31] <stevo__> is there any possible way to reset the password on an MMS account without knowing the username?
[05:27:49] <stevo__> I only have the email address belonging to our account
[05:41:02] <netQt> Hi everyone, is there a way to force a secondary to sync a specified db from the primary? The thing is I have a db missing on a secondary, and I need it to sync from the primary
[05:58:25] <joannac> how did you get a primary with a db that the secondary doesn't have?
[05:59:25] <netQt> look, here's the scenario: I have a set where all of the members are synced, then I stop a secondary, run it as an instance independent from the set, and drop the db
[05:59:41] <netQt> then when i start it as a member of the set again, it no longer syncs
[06:00:42] <joannac> i assume constant write activity during all this time?
[06:03:51] <netQt> my problem is that i want to reclaim disk space, but i didn't run repair db because it will go through everything in the directory, and i have several other dbs there too, so it will take a long time
[06:04:50] <netQt> with an initial sync it will have to copy all the databases, which i wanted to avoid, but it seems that's the way to go
[06:05:10] <joannac> well, it's the same "load" as repair
[06:17:20] <zhodge> question about BSON: if I serialize the object { a : true }, I get back binary [09 00 00 00 08 61 00 01 00]
[06:17:34] <zhodge> after following http://bsonspec.org/spec.html I know that 09 is the size of the serialized document in bytes
[06:18:44] <zhodge> 08 denotes either a boolean true or false, 61 is the ascii representation of ‘a’ which refers to the key name, 00 ends the key name, and 01 denotes that the field is true not false
[06:19:00] <zhodge> but I don’t know what those three 00s are in indexes 1-3
[06:19:22] <zhodge> anyone with more experience with bson know why those are there?
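For what it's worth, per the linked bsonspec.org page those bytes are part of the length field: the document size is a 4-byte little-endian int32, so 09 00 00 00 together encode the value 9. A byte-by-byte breakdown of the example:

    09 00 00 00   int32 total document size = 9, little-endian (the three 00s are its high bytes)
    08            element type: boolean
    61 00         element name "a" as a null-terminated cstring
    01            boolean value: true
    00            end-of-document terminator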
[07:30:11] <stefuNz> hey, i know this is a broad question, but my mongodb instance (3 nodes, replica set) is getting slower and slower (after some months now) - of course there's more data over time. how can i identify the bottlenecks and improve the performance? problem is, it's bringing down mysql, redis and solr, which also run on those hosts.
[07:38:45] <rspijker> stefuNz: resource monitoring (is it CPU, RAM, IO that’s bringing it down). Query monitoring (enable profiling), are there bad/missing indices?
[07:39:08] <rspijker> start gathering some data, basically… Then you can dive into it and narrow it down further and further
[07:39:59] <stefuNz> rspijker: Thanks, i will gather data. How can i tell which of CPU, RAM or IO is bringing everything down?
[07:40:45] <rspijker> stefuNz: monitor usage. Is it using 100% CPU at peak times? Is it swapping a lot? Are the read/write queues big?
[07:41:15] <rspijker> are you using MMS on this instance?
[07:41:29] <rspijker> if you're not familiar with monitoring systems, it can make life a lot easier
[07:41:51] <stefuNz> rspijker: No i'm not, but looks promising. I think i'll give it a shot.
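A minimal sketch of the profiling step rspijker mentions, run in the mongo shell; the 100 ms threshold and the collection/field names are just illustrative:

    // Level 1 = log only operations slower than the threshold (here 100 ms).
    db.setProfilingLevel(1, 100)

    // Later, look at the slowest captured operations.
    db.system.profile.find().sort({ millis: -1 }).limit(10).pretty()

    // Check a suspect query for collection scans / missing indexes.
    db.mycollection.find({ someField: "value" }).explain()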
[08:48:53] <Quincy> I have a JS-script with a mongo-query, which should return something in case the criteria is the same as the 1st parameter, but somehow it also returns something when it ISN'T the same..
[08:49:12] <Quincy> Am I in the correct place to ask for help with this?
[08:50:45] <rspijker> Quincy: sure, pastebin the query?
[08:51:07] <Quincy> Well, that would be the whole JS function :p Hold on ^^
[08:53:01] <Quincy> http://pastebin.mozilla.org/5561987 Here we go, I also added a little bit of info of how I call it
[08:57:22] <rspijker> don’t see anything really weird with it :/
[08:57:29] <bestis> hmm.. could someone say what i'm doing wrong, as $pullAll seems to work if i run it by hand, but from php it doesn't seem to affect all documents.
[08:59:10] <Quincy> rspijker: would it help if I grabbed the document out of my collection to inspect for you? Ohlol I just noticed your name is dutch :P
[09:10:48] <Xteven> I'm currently using mongo 2.0.4 (Ubuntu 12.04) and on a "group" query, mongo uses about 45% CPU load... I've asked sysadmins to upgrade to mongo 2.6. Will that perform better? It's a 32-core machine with 64GB RAM
[09:11:14] <slainer68> I thought mongodb does not verify types, so why am I getting errors like: " failed with error 16837: "The field 'tags' must be an array but is of type String in document {_id: ObjectId('4f870265bb94786e84002f56')} "
[09:16:03] <kali> slainer68: you're trying to $push or $addToSet to something which is not an array
[09:16:47] <kali> slainer68: there is no integrity check at insertion time, but on update there has to be something, as some operators can only operate on certain types...
[09:17:05] <slainer68> kali: thanks for your answer. that seems to be that.
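A small shell sketch of how that error arises and how the document has to look for $push/$addToSet to work; the collection and values are made up:

    // 'tags' stored as a string...
    db.things.insert({ _id: 1, tags: "red" })

    // ...so an array operator on it fails with error 16837:
    db.things.update({ _id: 1 }, { $push: { tags: "blue" } })

    // Stored as an array, the same operators are fine:
    db.things.insert({ _id: 2, tags: ["red"] })
    db.things.update({ _id: 2 }, { $addToSet: { tags: "blue" } })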
[09:24:35] <kali> well, it's less straightforward, but you can configure your server as a replica-set, your server being the only "real" member, and your laptop a hidden member. once a day you start your laptop as a replica so that it can pull the oplog (which has to be big enough), and the rest of the time, you run your laptop mongo as a standalone server
[09:24:51] <kali> just don't perform writes on your laptop, or you'll have a desync
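A rough sketch of the member configuration kali is describing, with invented hostnames; the once-a-day switch between standalone and replica mode, and sizing the oplog large enough to cover the gap, are separate operational steps not shown here:

    // On the "real" server, started with --replSet backupset (set name is made up):
    rs.initiate({
        _id: "backupset",
        members: [
            { _id: 0, host: "server.example.com:27017" },
            // the laptop: hidden from clients and never eligible to become primary
            { _id: 1, host: "laptop.example.com:27017", hidden: true, priority: 0 }
        ]
    })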
[09:32:35] <Quincy> Is this a correct mongo statement? db.rounds.find({"gameID" : "53c641bf95fe1b5c0b5153ee"})
[09:33:50] <kali> Quincy: it is valid, but i guess you need an ObjectId() around the hash string
[09:34:23] <bearclaw> ObjectId("53c641bf95fe1b5c0b5153ee") does not match "53c641bf95fe1b5c0b5153ee", said differently
[09:37:30] <Quincy> bearclaw: I noticed that as well, hence my question, as I do have an objectid in my document. second question: do I need to include ObjectId if I call a query from a JS file as well?
[09:41:35] <bearclaw> depends on where the value comes from and what its type is. It can print itself as a string, but actually be an ObjectId.
[09:44:53] <Quincy> I pass the value into my function as a string, where I insert it into a collection.find({}) call
[09:54:27] <shambat> I'm storing data in mongodb using pymongo. I'm getting the following error: pymongo.errors.OperationFailure: database error: BSONObj size: 32076923 (0x1E9747B) is invalid. Size must be between 0 and 16793600(16MB) .... what is causing this error?
[09:56:04] <lqez_> shambat: there is a hard limit on the maximum object size in MongoDB: 16MB.
[10:08:51] <shambat> lqez_: could this happen if I use the db.collection.save() function very often? I am going through a large set of data performing .save() operations on them
[10:09:11] <shambat> is there some other function I should use to save the altered data?
[10:09:55] <Quincy> I use this to set my criteria in JS: var criteria = { gameID: GID }; and in my DB i store it as an ObjID, is this the correct way to call it?
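If the field is stored as an ObjectId, the string has to be wrapped before it will match; a minimal sketch using the names from the discussion (ObjectId() here is the shell/server-side JS constructor; the Node driver exposes an equivalent ObjectID class):

    // A plain string never equals an ObjectId value, so this finds nothing:
    var criteria = { gameID: GID };               // GID is a hex string

    // Wrapping the string makes the types match:
    criteria = { gameID: ObjectId(GID) };
    db.rounds.find(criteria)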
[10:10:03] <movedx> How come when you drop a database in Mongo it says it's done it, but it still shows up in "show dbs"?
[10:13:04] <Quincy> You might have to update the whole collection set to make it definite
[10:13:45] <lqez_> shambat: no, not from frequent save operations. A BSONObj size error is only caused by the size of the object itself.
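One way to sanity-check the size of a suspect document from the mongo shell (Object.bsonsize is a shell helper; the collection name is a placeholder):

    var doc = db.mycoll.findOne()    // grab a document you suspect is growing
    Object.bsonsize(doc)             // size in bytes; must stay under the 16MB limit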
[10:20:17] <Quincy> movedx: I'm not too sure on that, maybe lqez_ has the answer to that
[10:20:35] <movedx> Quincy: For the record, these databases are empty.
[10:49:01] <movedx> Does sharding a collection activate a read/write lock on said collection?
[10:55:57] <movedx> I was thinking of index creation, but that can be backgrounded.
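For reference, the two operations being compared, as they would be run against a mongos; database, collection and shard key names are placeholders:

    // The shard key index can be built in the background beforehand...
    db.events.ensureIndex({ userId: 1 }, { background: true })

    // ...then sharding is enabled and the collection is sharded on that key.
    sh.enableSharding("mydb")
    sh.shardCollection("mydb.events", { userId: 1 })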
[11:12:38] <remonvv> Hi all. We're seeing significant performance degradation when using 2.12.x Java drivers compared to the 2.11.x ones.
[11:13:11] <remonvv> Known issue? And if so, how do we circumvent these performance issues? It's about a 10-20% difference
[11:22:31] <shambat> lqez_: I have a question about my previous problem. would that error occur if the query itself is very big? say I had a list/array of values that together are more than 16MB, would the query then fail with that error?
[11:24:25] <cheeser> if your query is larger than 16M you might consider the possibility you're doing something wrong. :D
[11:26:05] <Derick> shambat: yes, you would get an error
[11:32:38] <shambat> I want to find all records where sighash is not in that list and where data.sources.source is a given string
[11:36:40] <remonvv> shambat: That's a very questionable way to do that. Consider what you're asking the database to do in that query. You'll probably want to add a boolean to that document that is set if it has a sighash that is (or isn't) in the sourceSigs list.
[11:37:11] <remonvv> Which basically means you need to do an update on the sigs collection whenever the sourceSigs list mutates in some way.
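A rough sketch of that denormalisation, with the field name inSourceSigs invented and sourceSigs standing for the big list from the earlier query:

    // Run whenever sourceSigs changes: flag documents in and out of the list once.
    db.sigs.update({ sighash: { $in: sourceSigs } },
                   { $set: { inSourceSigs: true } }, { multi: true })
    db.sigs.update({ sighash: { $nin: sourceSigs } },
                   { $set: { inSourceSigs: false } }, { multi: true })

    // The hot query then becomes a small, indexable lookup instead of a giant $nin:
    db.sigs.find({ inSourceSigs: false, "data.sources.source": "someSource" })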
[11:37:28] <remonvv> cheeser: You're the Java driver maestro are you not? Any idea about my performance issue mentioned above?
[11:39:09] <cheeser> maestro? no. but i help work on it.
[11:39:25] <cheeser> i've passed on your question but it's a few hours yet until NY comes online.
[11:39:47] <cheeser> i do remember some perf regression numbers but I think it was against the 3.0 async APIs.
[11:39:53] <cheeser> i'm checking for you, though.
[11:41:20] <remonvv> cheeser: If you need further info let me know or point me to a ticket where I can answer questions. Basically we updated the driver without changing anything else and (especially on EC2 machines for some reason) we see a drop of 1/5th or so
[11:41:43] <cheeser> that doesn't sound promising.
[11:41:56] <remonvv> shambat: Welcome. As a general rule of thumb you want to avoid big queries. It's almost always an indication of some sort of implementation/design issue.
[12:17:02] <orweinberger> Is there anything wrong with running a mongo findOne query and immediately in the callback doing a db.close()? I'm running this: http://pastebin.com/D53u9Wyg and SOMETIMES I get an error saying 'Connection was destroyed by application'
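Hard to say without the pastebin, but a common cause of that error is closing the connection while something else is still using it. A minimal Node-driver sketch (the callback-style API of that era, placeholder names) where close() only runs after the last operation has finished:

    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/mydb', function (err, db) {
        if (err) throw err;
        db.collection('users').findOne({ name: 'someone' }, function (err, doc) {
            console.log(err, doc);   // finish all work with the result first...
            db.close();              // ...and close only if nothing else still uses `db`
        });
    });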
[12:27:44] <shambat> my document contains, among other things, arrays of values. I need to make a query where I find documents where all the values equal a certain value. This query would give me a document if "active_s" is 0 somewhere in data.sources: db.sigs.find({"data.sources.active_s": 0}) , but I want to find the ones where active_s is 0 in all the array entries in data.sources
[12:29:25] <obiwahn> shambat: look at aggregation with $unwind and $match
[12:30:07] <shambat> obiwahn: could $all be helpful here?
[12:30:10] <obiwahn> and maybe you want to use $project first so that the $unwind does not eat too much ram
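Two sketches for the "active_s is 0 in every entry" requirement: a plain find using $not/$elemMatch (not what obiwahn suggested, but it avoids the pipeline), and an aggregation along the $project/$unwind/$match lines above. Collection and field names are taken from shambat's query:

    // Plain find: no element of data.sources may have active_s different from 0.
    // Note this also matches documents whose data.sources is empty or missing.
    db.sigs.find({ "data.sources": { $not: { $elemMatch: { active_s: { $ne: 0 } } } } })

    // Aggregation variant: count non-zero entries per document, keep the zero-count ones.
    db.sigs.aggregate([
        { $project: { "data.sources.active_s": 1 } },   // shed other fields before $unwind
        { $unwind: "$data.sources" },
        { $group: {
            _id: "$_id",
            nonZero: { $sum: { $cond: [ { $eq: [ "$data.sources.active_s", 0 ] }, 0, 1 ] } }
        } },
        { $match: { nonZero: 0 } }
    ])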
[12:50:42] <remonvv> cheeser: We ran tests with 2.4.x and 2.6.x with 2.11.x and 2.12.x drivers. Seemed unrelated to mongod version
[12:50:53] <remonvv> cheeser: Although 2.6.x seems faster for our typical load patterns
[12:51:25] <cheeser> ok. here's what one of the drivers guys explained to me.
[12:51:48] <lqez> shambat: are you still in trouble?
[12:52:19] <cheeser> 2.6's new write mechanism is slightly slower (one test suggested only 10%) and by upgrading your driver to 2.12, you started using that new system.
[12:52:45] <shambat> lqez: separate but related issues :)
[12:53:45] <remonvv> cheeser: We're seeing 10-20% although more the latter than the former on EC2 (it's not very consistent). Is it by design or an issue that's going to be fixed?
[12:54:05] <cheeser> i can't speak for the server team but ... both? :D
[12:54:14] <cheeser> it's definitely a known issue, though.
[12:54:35] <cheeser> iirc, it's an artifact of the new write commands and the bulk write features
[12:55:05] <remonvv> Yes, we noticed bulk writes were slower than non-bulk writes too
[12:55:20] <remonvv> In anything but straight throughput runs anyway
[12:55:43] <remonvv> Okay, well, I'll assume this will magically improve in the future ;)
[12:55:50] <cheeser> i've seen similar numbers and i've always found them a tad confusing
[12:57:01] <remonvv> Well yes. A more skeptical soul could point out that releasing a new feature that should improve bulk writes, but actually makes those and all other writes slower than they were, is a questionable decision ;)
[13:27:52] <obiwahn> i just thought that when you query that information often
[13:28:13] <obiwahn> you should save it directly in the object and update it when the array changes
[13:28:45] <obiwahn> that way you can probably skip all the entries that are not activated at all
[13:32:13] <cheeser> remonvv: what kind of work are you doing that you see that 20% drop?
[13:33:26] <obiwahn> or use my query and compare the length of the reduced and original array but it would be slower
[14:22:49] <remonvv> cheeser : Mix of writes and reads at maximum throughput (meaning; we basically go full blast until the work is done). Let me check with one of our devs for more details.
[14:42:28] <remonvv> cheeser: Here's mongostat from a test on 2.6.3+2.11.x : http://pastie.org/9396603 and here 2.6.3+2.12.x : http://pastie.org/9396605
[14:43:08] <remonvv> Quick glance is roughly an average of 25k/sec versus 19k/sec, so about 20%
[14:43:50] <remonvv> As you mentioned, 2.12.x might use another codepath for writes on 2.6.x, so this might actually show the difference in perf between the old and new write path
[14:49:21] <cheeser> he's actually one of the c# driver devs but works with us on the java driver a fair bit, too.
[14:49:42] <cheeser> do you guys have any 2.4.x servers still in the mix?
[14:53:10] <cheeser> also, is this through mongos? is it a replica set?
[14:55:27] <cheeser> are you doing single inserts or calling insert(List) ?
[17:09:14] <pseudo_> I am using the C api to insert documents into a mongo database (using mongoc_collection_insert) and it seems to fail after exactly 50 documents are uploaded to my collection. Is this a limit that I need to set somewhere?
[17:10:49] <pseudo_> The error that I am getting back is "Failed to read 4 bytes from socket."
[17:12:21] <pseudo_> answered my own question. document was too big to insert.
[17:15:31] <jdkeller> does anyone in here have any experience with pymongo?
[17:27:46] <pseudo_> turns out i did not answer my own question. i saw 'packets.$cmd command: insert { $msg: "query not recording (too large)"' messages in my log file, but i think that is unrelated
[17:27:59] <pseudo_> how can i get more information to aid in debugging this?
[17:45:06] <pseudo_> any hints would be much appreciated. i'm kinda spinning my wheels right now
[18:01:34] <BigOrangeSU> any pymongo users here who use tailable cursor?
[18:40:57] <pseudo_> I am using the C api to insert documents into a mongo database (using mongoc_collection_insert). I am getting the error "stream: Failure to buffer 4 bytes: Failed to buffer 4 bytes within 300000 milliseconds." followed promptly by "Failed to read 4 bytes from socket." 50 documents make it into the database and then this failure happens. I am just trying to debug what is going wrong. I followed the api tutorial here: https://api.mongodb.org/c/current/insert
[18:41:44] <pseudo_> any advice on how i can find more information about what is going wrong here? i turned on verbose server logging, but am not seeing any error messages in mongod.log
[18:47:31] <zhodge> I’m using mongodb to store log/analytics type of data (page requests, errors, etc.)
[18:47:50] <zhodge> and right now I have a single collection ‘logs’ which distinguishes between the kind of logs with a ‘type’ property
[18:48:28] <zhodge> after realizing that errors are a bit context free without including which page request caused the error to fire, I’m wondering if the error should duplicate the pertinent page request data in its document
[18:48:46] <zhodge> or if it should reference an already existing page request log document
[18:49:07] <zhodge> what are the considerations I should make when deciding between these two options?
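To make the trade-off concrete, the two document shapes might look roughly like this (field names invented): duplicating makes every error self-contained to read but costs storage and can go stale, while referencing keeps a single source of truth but needs a second lookup (there are no joins) whenever you inspect an error:

    // Option 1: duplicate the pertinent request data inside the error document.
    { type: "error", message: "TypeError: foo is undefined",
      request: { url: "/checkout", method: "POST", userAgent: "Mozilla/5.0" } }

    // Option 2: reference the already-existing page-request log document by _id.
    { type: "error", message: "TypeError: foo is undefined",
      requestId: ObjectId("507f191e810c19729de860ea") }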
[21:32:53] <LouisT> is there a way to push something into an array with upsert?
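For reference, $push does combine with upsert; a minimal shell sketch with made-up names:

    // If no document matches, one is created with tags: ["first"];
    // otherwise "first" is appended to the existing tags array.
    db.items.update(
        { _id: "some-key" },
        { $push: { tags: "first" } },
        { upsert: true }
    )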
[22:13:39] <lfamorim> Hello! Does someone know why the primary steps down to secondary when the last active secondary member goes down?
[23:00:37] <zamnuts> i'm in the process of updating from mongodb 2.6.0 to 2.6.3: setup is a cluster w/ 3 config servers, 2 mongos, 2 sharded repl sets at 3 mongods each; question: when updating secondaries, rs.status() always lists them as "SECONDARY" and never "STARTUP2" or "RECOVERING" per the "upgrade revision" docs at docs.mongodb.org
[23:01:06] <zamnuts> the upgrade appears to work just fine, i'm just curious about the stateStr property of rs.status() - why doesn't this reflect what the docs say it should?
[23:02:51] <zamnuts> also worth noting: i'm seeing the "syncing to: <host>" message in the lastHeartbeatMessage property
[23:11:01] <joannac> zamnuts: why would they be at startup2? they should just spin up fine...
[23:11:50] <joannac> zamnuts: if you look in the logs you'll probably see them go briefly through STARTUP2
[23:12:15] <zamnuts> joannac, http://docs.mongodb.org/manual/tutorial/upgrade-revision/#upgrade-secondaries i honestly have no idea what "startup2" means, but from the docs, i expect at least one of startup2 or recovering _before_ seeing a secondary again
[23:15:31] <zamnuts> joannac, good call, it happens quick
[23:16:05] <zamnuts> i assumed the initial sync on startup would maintain a state other than secondary - some sort of indication that I can move on to another secondary or primary
[23:16:52] <joannac> oh, it will if you're in initial sync
[23:18:25] <joannac> as in, if the member is empty, you've just added it, and it needs to initial sync, it stays in ... STARTUP2 or RECOVERING, i can't remember
[23:18:27] <zamnuts> fyi, i don't mean initial to be "full" or "resync"
[23:18:56] <zamnuts> right, yea, i screwed up that terminology
[23:19:03] <joannac> zamnuts: why? you haven't changed any data
[23:19:07] <joannac> all you did was change binaries
[23:19:17] <joannac> there's nothing to "catch up on"
[23:19:43] <joannac> hence why it takes a couple of millis :)
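For reference, a quick way to watch those member states from any member's shell while stepping through the upgrade:

    // One line per replica-set member: host and current state (SECONDARY, STARTUP2, ...).
    rs.status().members.forEach(function (m) {
        print(m.name + " : " + m.stateStr);
    });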