[00:12:02] <pykler> Question: in replication, for simplicity one primary and one secondary ... what is the replication connection direction. Does the primary initiate the connection to the secondary or does the secondary connect to the primary?
[00:13:23] <pykler> essentially the primary connects on a much faster route to the secondary ... whereas the secondary connects to the primary via a VPN and other hops ... so I would want the connection to be primary => secondary ... is there a way to force that?
[00:20:53] <joannac> pykler: the link has to be bidirectional
[00:21:40] <pykler> joannac: sure, but who initiates the connection, or are there two connections? if it's two, then over which one is most of the data transferred?
[00:21:51] <joannac> for oplogs, i believe the secondary connects to the primary
[00:22:35] <pykler> and the oplog is the only data transferred once the repl set is up and running, right?
[00:25:29] <pykler> joannac: that answers my Q, thanks
[00:25:56] <pykler> i wish it were somehow configurable, but it's an edge case
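For reference, the pull direction joannac describes can be seen in rs.status(), where each member reports the host it syncs from (2.6-era field name); a sketch:

```javascript
// run on any member: "syncingTo" is the host this member pulls its oplog from
rs.status().members.forEach(function (m) {
  if (m.syncingTo) print(m.name + " pulls from " + m.syncingTo);
});
```

rs.syncFrom("host:port") can choose *which* member a secondary pulls from, but the connection is still initiated by the secondary, so it doesn't flip the direction pykler is after.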
[00:43:47] <culthero> Howdy! Has anyone used TokuMX to great success in reducing disk I/O? I have an application that does between 500 and 900 upserts a second.. and need to somehow manage I/O on Linode
[02:16:55] <pykler> anyone know if tokumx has the same 16mb doc limit as MongoDB does?
[02:30:47] <joannac> pykler: you'd be better off asking in the tokumx community?
[03:21:06] <markfletcher> Hi, I'm troubleshooting an instance of mongo 2.6.3 - the configuration is a three-node replica set running on CentOS 6.5
[03:22:07] <markfletcher> A coworker wrote a process that basically creates a duplicate collection in a db and runs a mongoimport on the duplicate collection
[03:22:41] <markfletcher> so for example if you have a collection called foo, mongoimport is run on a collection called foo_stage
[03:23:21] <markfletcher> And then basically the old collection is swapped out with the new collection foo -> foo_old, foo_stage -> foo
[03:24:17] <markfletcher> What I'm seeing is that one collection has 2m documents in it, and mongod is crapping out on one of the secondaries
[03:26:04] <markfletcher> After the collections are renamed, the process runs a touch on each collection on the primary. I'm seeing index rebuilds on the secondaries in the logs, and it's during the index rebuild that mongod is dying
[06:54:40] <Viesti> trying to make mongo replica set eat data faster
[06:56:50] <Viesti> 3 nodes, 1 primary 2 secondaries. Using 2.6. Seems that running bulk upserts in parallel with 3 threads (2-core CPUs, downloading files in between updates), I can get to 6 hours for a 70GB data set which is probably mostly updates
[06:57:18] <Viesti> this on a development replica, production replica is way slower so I'm back to using mongoimport
[06:58:43] <Viesti> the development replica has less data in the collection I'm writing to, and that 6-hour test didn't have read traffic (2 REST APIs making ~20 queries/s each against the two secondaries)
[07:01:48] <Viesti> a microbenchmark of bulk api on a single node is way faster than mongoimport, but running it on a replica set with read traffic seems to be totally different :/
[09:11:39] <Razz> Hi, can I find a roadmap of the new CXX driver?
[10:35:21] <Guest59941> does somebody know what role I should use for my check_mongodb.py (nagios plugin) if I use a replica set and auth?
[10:36:47] <Guest59941> when I use this: "/usr/lib/nagios/plugins/check_mongodb.py -H my_host -u nagios -p nagiospass -A replset_state -P 27017 -W 0 -C 0" it shows me this: "CRITICAL - General MongoDB Error: local variable 'state' referenced before assignment"
[10:37:47] <Guest59941> but the mongodb is ok, i think that my user role is not correct
[10:38:16] <Guest59941> i use role "read" for db "admin" and db "local"
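For what it's worth: the replset_state check calls replSetGetStatus, and under 2.6 auth that requires the clusterMonitor role on admin rather than plain read. A hedged sketch (user name and password taken from the command above):

```javascript
// in the mongo shell, while authenticated as a user administrator
use admin
db.createUser({
  user: "nagios",
  pwd: "nagiospass",
  roles: [ { role: "clusterMonitor", db: "admin" } ]
})
```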
[11:16:21] <devastor> Hi, I sharded a collection using hashed _id as the key (_id is just an integer, and there's both normal and hashed index for it), it created two chunks where one has "min" : { "_id" : { "$minKey" : 1 } }, "max" : { "_id" : NumberLong("-9169502971928964967") } and the other "min" : { "_id" : NumberLong("-9169502971928964967") }, "max" : { "_id" : { "$maxKey" : 1 } }. That seems strange, but is it normal? The end result after it synced is that most of ...
[11:16:26] <devastor> ... the documents remained in the original shard and only a couple were moved to the other shard. Any ideas?
[12:00:00] <joannac> devastor: that does seem weird, yeah
[12:11:06] <devastor> joannac, yep, that was with mongo 2.4.9 btw
[12:11:52] <devastor> I'm wondering if the fact that _id has both normal and hashed index caused something strange to happen
[12:12:39] <kali> devastor: mmm no, that should be fine
[12:13:50] <kali> devastor: you only have two chunks? and one of them has been migrated?
[12:14:59] <devastor> kali, correct, but that chunk only got a couple of documents while the other one has most of them
[12:15:58] <kali> devastor: mmmm... maybe you are somehow unlucky
[12:16:11] <kali> devastor: no chance of a pre-2.4 mongos surviving somewhere ?
[12:17:19] <devastor> Hm, actually _id is a string and not an integer, but I guess that shouldn't make much difference
[12:17:44] <devastor> Well, I'll try to split the chunks and see what happens. at least mongos is able to find the documents from both chunks just fine
[12:18:43] <kali> devastor: string _id should not be a problem (i have a sharded collection on _id where _id is a string)
[12:21:26] <rspijker> if you only have 2 chunks it’s very early to draw conclusions about distribution
[12:22:09] <kali> yeah, you may just be unlucky. wait for more data to come
[12:23:21] <devastor> yeah, I'm starting to think that the way mongo shows the min and max is just normal, and since the chunk size is quite big and there's not much data in the collection yet, there just wasn't a need to create more chunks or migrate more documents
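A quick back-of-the-envelope check supports that reading: hashed shard keys are spread (uniformly, by design) over the signed 64-bit range, and the split point quoted above sits very close to the bottom of that range, so the first chunk covers only a sliver of the key space:

```python
# Chunk boundaries from the metadata quoted above.
INT64_MIN = -2**63              # hashed shard keys span the signed 64-bit range
split = -9169502971928964967    # the reported split point between the two chunks

low_chunk_width = split - INT64_MIN     # width of the [minKey, split) chunk
fraction = low_chunk_width / 2**64      # its share of the whole key space
print(f"first chunk covers {fraction:.2%} of the key space")  # ~0.29%
```

With a uniform hash, roughly that share of documents lands in the first chunk, which matches "only a couple were moved".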
[12:34:34] <agend> i'm trying to use collection.group with a keyf function, and I wonder: is there any way to send extra params to this keyf function?
[12:38:35] <agend> or is there any other way to do something like this: I have documents in mongo with ids like 1x33h174, where 33 is the key of a campaign. So far so good, but in one case I need to group the data by product, which is something bigger than a campaign, and there can be many campaigns in one product. So i group by campaign id, get the data out of mongo and add up the campaigns in python to get the product data. Is it possible to do all this in mongo? I'd have to pass some object
[12:38:35] <agend> like campaignToProduct = {'1': '100', '2': '100'} and make mongo use it to group my data like this. How can I do it?
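group's keyf can't take extra parameters, but mapReduce accepts a scope document whose fields become variables inside the map/reduce functions, so campaignToProduct could be passed to the server that way. The client-side fold agend describes, sketched in Python (the id format and field names are assumptions based on the example above):

```python
import re
from collections import defaultdict

# hypothetical documents; the campaign key is assumed to sit between "x" and "h"
docs = [
    {"_id": "1x33h174", "value": 10},
    {"_id": "2x33h991", "value": 5},
    {"_id": "1x44h007", "value": 7},
]
campaign_to_product = {"33": "100", "44": "200"}  # placeholder mapping

totals = defaultdict(int)
for doc in docs:
    campaign = re.search(r"x(\d+)h", doc["_id"]).group(1)
    totals[campaign_to_product[campaign]] += doc["value"]

print(dict(totals))  # {'100': 15, '200': 7}
```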
[12:49:11] <Nodex> in fact scratch that.... With the PHP driver's batch insert http://www.php.net//manual/en/mongocollection.batchinsert.php is there any way to tell which docs got inserted or dropped in case of error?
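On the batchInsert question: with the continueOnError option set, the server keeps going past failures, and 2.6-style write results report each failure with an "index" pointing back into the batch, so the surviving documents are everything else. A language-neutral sketch in Python (the error payload below is a hypothetical example of that shape, not a live driver call):

```python
batch = [{"_id": 1}, {"_id": 2}, {"_id": 2}, {"_id": 3}]  # third doc duplicates _id 2

# hypothetical payload in the shape 2.6 write commands report on partial failure
result = {
    "nInserted": 3,
    "writeErrors": [{"index": 2, "code": 11000, "errmsg": "duplicate key error"}],
}

failed = {err["index"] for err in result["writeErrors"]}
inserted = [doc for i, doc in enumerate(batch) if i not in failed]
print(inserted)  # [{'_id': 1}, {'_id': 2}, {'_id': 3}]
```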
[12:58:34] <Repox> Hi guys. I'd like to run a mongoexport, but I'd like to limit my export to only export documents containing a specific field value. How can I do that?
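mongoexport takes a --query (-q) filter, so the export can be limited to documents matching a condition; a sketch with placeholder db/collection/field names:

```shell
# export only the documents whose "status" field equals "active"
mongoexport --db mydb --collection foo \
  --query '{"status": "active"}' \
  --out foo_active.json
```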
[14:29:49] <superhans> Hi! Not sure if this goes in here or if it's more a shell question, but maybe someone can help me. I've installed mongodb using homebrew. /usr/local/etc/mongod.conf says my data should be stored in /usr/local/var/mongodb rather than in /data/db.... but when i start mongodb by just typing mongod in my terminal it gives me an error saying /data/db does not exist and i must create it, then it just exits.
[14:36:41] <kali> superhans: when you run mongod in the terminal, it looks in the default mongod.conf locations, and /usr/local/etc/mongod.conf is not one of them. so it falls back to the default configuration, and the default value for dbpath is /data/db
[14:37:37] <kali> superhans: you can use -f to specify the path to your actual configuration location (and i guess that's what the homebrew launch script does somehow)
[14:37:56] <kali> superhans: but don't do it, you'll probably mess up the permissions in the data location :)
[14:39:52] <superhans> Uh yea :) Basically i just want to be able to run a mongo server by typing mongod in a terminal, and preferably i'd like to store the data in /usr/local/var/mongodb
[14:40:00] <superhans> But maybe its not as easy as i thought then
[14:40:30] <kali> superhans: --dbpath should do it, but make sure brew doesn't have a mongod of its own running there
[14:42:13] <superhans> yea, i've been starting it by typing mongod --dbpath /usr/local/var/mongodb up until now. i got kinda bored of typing it all out, so i thought i could set that as the default instead somehow :)
[14:44:03] <kali> superhans: well, i guess you can put a mongod.conf in /etc
[14:44:10] <kali> superhans: not super clean, but it should do it
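A minimal sketch of what that looks like, assuming the Homebrew paths from above; 2.6 still accepts this old ini-style format alongside YAML:

```
# /etc/mongod.conf (or point mongod at the Homebrew copy with: mongod -f /usr/local/etc/mongod.conf)
dbpath = /usr/local/var/mongodb
logpath = /usr/local/var/log/mongodb/mongo.log
```

A shell alias like `alias mongod-local='mongod -f /usr/local/etc/mongod.conf'` is another way to avoid retyping --dbpath.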
[14:56:28] <markfletcher> Hi, if I create a collection that will hold a large number of documents, is it better to create an empty collection, add an index, then insert data, or to create the collection, insert data and then add the index?
[14:57:28] <cheeser> if you're bulkloading, defer the index creation to the end
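cheeser's ordering in 2.6 shell form, as a sketch with hypothetical collection and field names:

```javascript
// load the data first (unordered bulk writes; "docs" is a hypothetical array)
var bulk = db.events.initializeUnorderedBulkOp();
docs.forEach(function (d) { bulk.insert(d); });
bulk.execute();

// ...then build the secondary index once, over the complete data set
db.events.ensureIndex({ ts: 1 });
```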
[15:21:41] <remonvv> I'm noticing a rather big performance difference between writes to a single-instance shard and writes to a 3-member replica set (~40%). Is this about expected, or is the gap in performance larger than expected?
[15:40:16] <remonvv> I'm stuck in the office due to a bomb threat.
[18:41:50] <urbanenomad_> sorry if this is a newb question... I am setting up a shared sharded cluster of mongodb. I have a super user account that basically creates a database and a database admin with username/password, and I provide that to one of our clients. I want to allow the database admin to manage their own shard configuration for only the database they own. Is there a built-in
[18:41:50] <urbanenomad_> role I can add to the database admin account to allow him/her sharding control for only their database?
[19:10:44] <fourq> hello, can someone point me in the right direction? I'm trying to find in the docs how to update a collection in node with an array of documents that have an existing ID.
[19:11:26] <fourq> I've found this but I don't see how to check for existing docs except by iterating
[19:27:48] <culthero> So I asked yesterday about using TokuMX. I used it as a drop-in replacement for Mongo and managed to get my disk I/O to go down from 140,000/s to 1,400/s.. And my database is 50% the size. However, now my CPU is running at high utilization :)
[19:32:04] <insanidade> hi. I have a quick question about the size of indexes. db.mycollection.stats() reveals that my indexSizes value is 4635792. that would mean something around 4MB. What does that mean exactly? Does it mean EACH index will consume 4MB?
[20:23:44] <insanidade> anyone, please: If I define a collection and don't specify an index, it's assumed that an _id index will be created and ObjectId will be the type for that field, right ?
[20:23:52] <insanidade> if I define my index, is that default _id index still created ?
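Both answers are yes: every collection automatically gets a unique index on _id (the ObjectId default only applies when the inserting client doesn't supply its own _id value), and user-defined indexes are added alongside it, not instead of it. A quick shell sketch with a hypothetical collection:

```javascript
db.things.insert({ a: 1 });       // no _id given, so the driver fills in an ObjectId
db.things.ensureIndex({ a: 1 });  // a user-defined index
db.things.getIndexes();           // lists { _id: 1 } as well as { a: 1 }
```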