[00:24:55] <dstorrs> ah yes, Amazon. "Here, have these awesome cloud services! You get to choose your mandatory catastrophic failure mode: too slow or too volatile!"
[00:31:56] <cgriego> downloads-distro.mongodb.org doesn't act like it's on S3. Claims Apache 2
[01:23:31] <Frozenlock> Is there a way to return every other item in a collection? (or every other other other.... item)
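MongoDB has no built-in "every Nth document" operator, so one common answer to Frozenlock's question is to filter client-side while iterating the cursor. A minimal sketch (the `docs` list stands in for a pymongo cursor; all names are illustrative, not from the log):

```python
from itertools import islice

def every_nth(cursor, n, start=0):
    """Yield every n-th item from any iterable, e.g. a pymongo cursor."""
    return islice(cursor, start, None, n)

# Stand-in for collection.find(); a real cursor works the same way.
docs = [{"i": i} for i in range(10)]
every_other = list(every_nth(docs, 2))
# every_other -> the documents with i = 0, 2, 4, 6, 8
```

For "every other other… item", change `n` (and `start` to pick the offset).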
[01:59:51] <sx_> hallo. new to mongodb so this is probably easy for you guys. trying to upsert a document. if uniqueid exists, just need to $push to a field. if doesn't exist, need to add entire object plus the $push to it.
[02:11:14] <dstorrs> sx_: so...you've got 3 cases, I think? document doesn't exist, exists with uniqueid, exists w/o uniqueid .
[02:11:48] <sx_> dstorrs: even easier. either id exists or it doesnt
[02:12:02] <dstorrs> to the room -- splitvector is taking 400-500 ms and happening relatively often. is there any way to speed it up? does indexing help?
[02:12:24] <dstorrs> sx_: so you know the document exists ?
[02:12:56] <dstorrs> if so, why are you doing upsert?
[02:13:10] <sx_> dstorrs: no, i don't know if it exists. let me rephrase the question
[02:13:39] <dstorrs> maybe easier if you back up and explain the overall situation, actually. what's the end goal at a business level?
[02:14:48] <sx_> dstorrs: so I am querying an API, and retrieving a JSON object. if this object's id is already saved in my db, then I just want to $push some values to two array fields. if the object ID is not in my db (i.e. new document), then I want to save the entire object, and also $push to it.
[02:15:27] <sx_> i can easily do this with one separate find() and $push, but I am thinking that one update/upsert command may be able to do it all
[02:16:24] <sx_> the belabored way: z = find({id:x}); if (z) {$push} else {save + $push}
[02:17:47] <topriddy> say I want an entity to have an attribute that stores someone's country. is having a list of countries and doing foreign-key linking usual in mongodb too?
[02:17:57] <sx_> but thinking one upsert command could do it?
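sx_'s two cases can often be collapsed into one upsert: `$push` applies whether the document is inserted or updated, and on MongoDB 2.4+ `$setOnInsert` seeds the fields that should only be written when the document is new. A hedged sketch that just builds the query/update documents (the field names `id`, `values`, and the helper itself are hypothetical; on the 2.2-era servers in this log `$setOnInsert` does not exist yet, so a find-then-update fallback is still needed there):

```python
def build_upsert(obj_id, new_value, insert_only_fields):
    """Build the (query, update) pair for a single upsert that $pushes
    on every call and seeds insert-only fields when the doc is new."""
    query = {"id": obj_id}
    update = {
        "$push": {"values": new_value},       # runs on insert and update
        "$setOnInsert": insert_only_fields,   # runs only on insert (2.4+)
    }
    return query, update

query, update = build_upsert("abc", 42, {"created": "2012-08-01"})
# then: collection.update(query, update, upsert=True)
```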
[02:18:15] <topriddy> also, how would i do a mass load of list of countries save writing the code manually myself.
[02:19:02] <dstorrs> topriddy: just put the country in the document where you want it.
[02:19:12] <dstorrs> sx_: this is actually trickier than it seems.
[02:23:37] <topriddy> dstorrs: what then do you do to avoid problems with denormalization? :S
[02:23:53] <dstorrs> topriddy: what problems? be specific.
[02:24:50] <dstorrs> topriddy: this isn't a SQL database. It works on different principles, and if you try to force it to act like SQL you are giving up most of the power and making your own life harder.
[02:25:43] <topriddy> dstorrs: i have a country entity (pardon my use of sql terms) with say president name. What happens when the President changes and i have a denormalized database?
[02:26:30] <topriddy> SQL teaches me that if i have a foreign-key and well normalized db i just need to change this info in one place. I dont know what i'll get from nosql, but then it doesnt look "so-good" already
[02:26:38] <dstorrs> sx_: so far as I know, that's not legal syntax.
[02:26:40] <topriddy> dstorrs: what do you advise?
[02:27:00] <dstorrs> do you mean db.widgets.find({id:x}).forEach(function...) ?
[02:27:03] <sx_> dstorrs: yeah. the $push will actually be an update command
[02:27:16] <sx_> maybe -- gotta run but thanks for help
[02:28:26] <dstorrs> because I was talking to sx_. be patient.
[02:28:38] <dstorrs> topriddy: use the right tool for the job. Judge each tool for the job it's intended for.
[02:29:00] <dstorrs> SQL is intended for implementing constraints on your data.
[02:29:32] <topriddy> dstorrs: i only chose Mongo cos of scalability. surely twitter is using NoSQL. Do i have to lose all constraints by using nosql?
[02:29:33] <dstorrs> used correctly, it lets you maintain a clean data store relatively easily, at the expense of making it very difficult to scale horizontally
[02:30:15] <topriddy> dstorrs: am asking sincere questions here. if you have a good link to point me too, i dont mind doing some more reading
[02:30:16] <dstorrs> constraints get moved to the app layer. Or, more commonly, to the ORM layer.
[02:30:47] <topriddy> dstorrs: i'm using java and Morphia
[02:30:54] <dstorrs> In general, yes, you lose all constraints.
[02:32:20] <dstorrs> hm. don't have one off the top of my head. one sec.
[02:32:31] <topriddy> dstorrs: but twitter somehow still avoids people using same "username", guaranteeing uniqueness. Also, really the trivial Country example. Data would CHANGE in real life. I cant possibly walk through all entities and update them cos of the denormalized data?
[02:35:16] <topriddy> dstorrs: while waiting for the link, i need to store a user picture. considering setting up another entity just for that. (this is based on my SQL background/school of thought)
[02:35:16] <dstorrs> this is short and a bit trite, but all good points: http://facility9.com/2010/09/five-reasons-to-use-nosql/
[02:36:51] <dstorrs> topriddy: schema design still matters in NoSQL, and I don't know your app. but my users collection has attributes like this: name, age, thumbnail_url, canon_name, display_name
[02:37:53] <dstorrs> I like to have a canonical version of the name that makes for guaranteed deduplication and sort ordering, and then a 'display_name' which is how it was actually entered.
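The canon_name/display_name split dstorrs describes can be sketched as a small normalization helper. The exact rules below (Unicode normalization, lowercasing, whitespace collapsing) are an illustrative assumption, not his actual code:

```python
import unicodedata

def canonicalize(display_name):
    """Derive a canonical name for deduplication and stable sort order:
    Unicode-normalize, lowercase, and collapse internal whitespace."""
    name = unicodedata.normalize("NFKC", display_name)
    return " ".join(name.lower().split())

# Store both forms: canon_name for uniqueness/sorting, display_name as entered.
user = {
    "display_name": "  Ada   LOVELACE ",
    "canon_name": canonicalize("  Ada   LOVELACE "),
}
```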
[02:41:43] <topriddy> dstorrs: well, i won't want the time-lag and redundancy from getting picture objects every time i ask for a user. that's why i'm thinking of separating it.
[02:42:01] <dstorrs> oh, and one other thing which you probably already know -- don't store actual images in your DB. not only does it bloat everything, but you risk blowing the 16M-per-document limit
[02:42:34] <dstorrs> if you really want to do that, I would suggest using Mongo's GridFS
[02:42:54] <dstorrs> possibly with memcache (because it implements LRU semantics)
[02:43:34] <dstorrs> that said, think about how many users you have and whether or not the time to deep-link a picture is really going to matter.
[02:46:34] <topriddy> dstorrs: alright. maybe i can store in same entity afterall.
[02:47:17] <dstorrs> *shrug* up to you. like I said, I don't know your app or schema. I'm just making a suggestion, because I think it will make your life easier.
[03:31:46] <hdm> still getting weird bson corruption - but with odd results. mongodump/mongoexport + mongorestore/mongoimport both work for this data, but doing a map-reduce triggers a invalid BSON error (randomish)
[03:32:00] <hdm> most queries still seem to work too, just not map-reduce
[03:37:56] <skot> Can you post an copy of the error to gist/pastebin?
[03:39:32] <hdm> hit the problem consistently between 2.1.x, 2.2.0rc, and 2.0.6
[03:39:54] <hdm> let me paste the error in a second - i've been loading/reloading the data (takes 48+ hours each time) trying to figure out if it's something with my system
[03:40:32] <hdm> it seems to be triggered by certain content in the string field (binary bits)
[03:40:49] <hdm> at least, it doesnt trigger until i load a ton of http responses in
[03:44:50] <hdm> waiting on the m/r to trigger it again
[03:54:16] <jwilliams> is it possible that bulk inserts slow down because too many servers are doing the same thing (around 50 servers doing bulk inserts at the same time)?
[06:04:09] <hdm> -d dbase seems to work, still, yuck
[07:01:55] <bullfrog3000> hi all, i am using pymongo, and am running a command db.test.update({}, {"$set": {"a": 5}}). It seems without safe enabled, it returns right away. Is there no way to block until the call is done, without paying the safe=True performance cost? It appears significant on my dataset.
[08:48:32] <jwilliams> we've sharded the server, etc. but when monitoring the activity (mpstat -P ALL 1), we notice that a large amount of the time only 1 cpu is not idle.
[08:52:17] <ron> Derick: Now that you guys are more active in the channel (and kudos for that), you may want to start using an IRC bouncer :)
[08:54:18] <Derick> nah, I was just checking some client settings in irssi
[08:54:33] <Derick> irssi runs on my server so shouldn't go down unless I do it myself
[09:35:23] <vak> oops, I've got an unexpected performance artifact with pymongo. Although using "fields=" in my case makes I/O bandwidth 3 times lower (as expected), it also makes the whole time of looping through the collection about 3 times longer... how come?..
[09:44:13] <vak> ok, it blows up badly in pypy, not in cpython
[10:11:46] <new2nosql> hi guys, running a repair ... getting a FAILED_TO_UNCOMPRESS any ideas
[11:16:48] <remonvv> Anyone aware of changes to findAndModify behaviour between 2.0 and 2.1? Unit tests of our system are failing on 2.1 that are okay on 2.0. Unspecified assertion error in find_and_modify.cpp:140
[11:18:07] <kali> remonvv: when you have found what has changed, i'm interested
[11:18:35] <remonvv> Trying 2.2.0-rc0 now, may have been a bug that was fixed.
[11:19:06] <kali> i'm in the laborious process of re-applying an adhoc patch on 2.2.0-rc...
[11:20:03] <remonvv> Jep, it's "broken" in 2.2.0 as well.
[11:25:22] <remonvv> If so, please try and reproduce if you have the time.
[11:59:18] <remonvv> Okay, so this findAndModify fails since 2.1+ : db.test.findAndModify({query:{stringField:1}, update:{$set:{stringField:2}}, upsert:true, new:true})
[12:00:10] <remonvv> Assuming it has something to do with querying on the same field I'm updating with new=true/upsert=true
[12:06:39] <awestwell007> I am trying to create a mapreduce in mongo and am getting empty values back from the reduce part. I have a collection with an id and a count, and I am trying to get all the values based on the id value (views or counts)
[12:06:55] <remonvv> NodeX, yes. It fails if you findAndModify with the query and the update using the same field and new=true and upsert=true. All other permutations of that command are okay.
[13:24:14] <Samuel__CTX> I have a question. In PHP I can find devices based on some field: $devices = $conn_devices->find(array("key" => "value"));, however when I try to do this on an _id this doesn't work: $devices = $conn_devices->find(array("_id" => "id"));
[13:24:32] <Samuel__CTX> how can I get a document when I have its ID?
[13:30:38] <souza> guys i have a high level problem, i must create a database for a HUGE data structure. my question is: should i "cut" this huge environment into a lot of collections, or put it all into one collection?
[13:37:30] <souza> if you have a big environment and must put it in MongoDB, would you create several collections and share ObjectIDs, or put everything into one collection?
[14:34:08] <Rrjois> souza: I want to do case insensitive string match. I tried regex. but it returns the value even if its a substring. I tries "is" but that is case sensitive here is my code http://pastebin.com/4rAEBdPK
[14:40:23] <Rrjois> souza: you want me to explain once?
[14:41:53] <souza> i think i understand: you want to create a query that gets the data from the database and doesn't care about the case of the string
[14:47:35] <Rrjois> souza: yes. the problem is, suppose i have two values in my collection, say "bill status" and "Bill". if I search for bill, it returns "bill status" (maybe because it's found first). I want it to match the full string "Bill status"
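An anchored, escaped, case-insensitive regex gives the whole-string match Rrjois wants and avoids the substring hits. A Python sketch; the compiled pattern can be passed as a pymongo query value, e.g. `{"field": pattern}` (the field name is hypothetical, and note that a case-insensitive regex generally cannot use an index efficiently):

```python
import re

def exact_ci_regex(term):
    """Build a case-insensitive whole-string pattern for a Mongo query.
    re.escape keeps user input from being read as regex syntax;
    the ^...$ anchors prevent substring matches like "bill status"
    matching a search for "bill"."""
    return re.compile("^" + re.escape(term) + "$", re.IGNORECASE)

pattern = exact_ci_regex("bill")
```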
[14:55:30] <remonvv> ron, oh that IBC ;) Maybe, probably not.
[14:56:39] <ron> remonvv: my company has a stand there, so if you do go, you should visit. unfortunately, I won't be there. I'm staying to hold down the fort.
[14:57:09] <remonvv> ron, I will. Why aren't they sending you?
[14:57:30] <ron> remonvv: "I'm staying to hold down the fort." :)
[14:58:44] <ron> remonvv: gonna overview the R&D while the big guys are away. should be fairly exciting there though, it's basically the company's debut.
[15:02:16] <hjb> funny fact of the day: when that ff addon is installed (http://www.bolwin.com/software/snb.shtml), it breaks access to the html admin interface of mongod
[15:02:45] <hjb> btw. IE8 isn't able to access it at all
[15:02:55] <SisterArrow> Hiya! I updated mongodb with an apt-get update, restarted mongodb, and restarted the MMS-agent on a machine, lets say machine2. But now (after a 15 minute wait) the MMS-webpage only reports machine1 and machine3, but not machine2 :o
[15:03:11] <SisterArrow> The debug output from the agent just says starting watching machine1 and machine3, but not itself.
[15:03:21] <SisterArrow> Anyone know why that might be?
[15:10:23] <SisterArrow> On the machine thats gone away(machine2) I updated from 2.0.4 to 2.0.6..
[15:14:59] <remonvv> ron, cool. I might go but I'm not that interested in the conference itself and I have to keep an eye on my annual conference budget ;)
[15:16:18] <ron> remonvv: hehe :) well, so far I didn't get to travel from work (though surprisingly, two of my team members do).
[15:17:25] <remonvv> ron, you're doing it wrong ;) Shame you're not coming to Amsterdam. Would be good to have a coffee.
[15:18:07] <SisterArrow> Ok, manually adding the server in MMS solved the issue.
[15:18:14] <ron> remonvv: don't be silly. I don't drink coffee. but yeah, had I come, I would let you know so we could meet. maybe to next year's IBC.
[16:55:56] <jwilliams> http://www.mongodb.org/display/DOCS/Inserting#Inserting-Bulkinserts says continue-on-error is implied in a sharded environment and can not be disabled. does the user need to explicitly enable it?
[16:56:08] <jwilliams> or just use DBcollection.insert would work?
[19:02:24] <rnickb> i'm using scopeddbconnection and my program is crashing when it terminates. is there something i have to do to properly destroy the database connection pool?
[19:46:13] <y3di> vsmatck: I can't really figure out which option would be best, im worried im going to mess something up
[19:46:33] <cgriego> downloads-distro.mongodb.org is still running slow today :(
[19:51:50] <y3di> would it be possible to do a db restore from a simple log
[19:59:52] <y3di> how can i test if ./mongodump successfully stored all the data?
[20:00:20] <ron> you can import it to a new database?
[20:42:58] <linsys> y3di: Also mongodump doesn't restore indexes..
[22:13:08] <mmlac> How do I store tags best? As an embedded array or as a separate model that is linked to? Use-cases are: get the distinct tag names, display every model that has a specific tag
[22:19:19] <trbs> mmlac, for a similar use case i use a list in the document. my application keeps a set() of all distinct tag names, which gets updated whenever a unique tag is added (if your tag list is big i guess you could put that in a collection as well). both my set and the distinct tag names are small enough for this to be no problem, and i'm fine with the set() being checked and updated via celery (cron-like jobs) every once in a while
[22:30:27] <mmlac> ic. How do you find the objects that contain a certain tag?
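For the embedded-array approach trbs describes, querying a tag maps to MongoDB's array-containment semantics: `collection.find({"tags": tag})` matches any document whose `tags` array contains the value, and `collection.distinct("tags")` gives the distinct tag names. A pure-Python sketch of those semantics with a tiny in-memory stand-in for the collection (all names illustrative):

```python
def matches(doc, tag):
    """Mimic the containment test MongoDB applies for {"tags": tag}."""
    return tag in doc.get("tags", [])

docs = [
    {"_id": 1, "tags": ["mongodb", "nosql"]},
    {"_id": 2, "tags": ["sql"]},
]

tagged_ids = [d["_id"] for d in docs if matches(d, "mongodb")]
distinct_tags = sorted({t for d in docs for t in d.get("tags", [])})
```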