[02:52:15] <kataracha> kali: you werent lying about it taking a long time to build
[03:35:27] <kataracha> does anyone know why I might be getting this: scons: *** [build/install/bin/bsondump] Source `src/mongo-tools/bsondump' not found, needed by target `build/install/bin/bsondump'.
[03:35:49] <kataracha> I have built mongo and ran: scons install
[03:50:02] <cheeser> kataracha: you should ask the mongodb-dev list
[04:16:52] <blargypants> For some reason I thought it was possible to initiate a whole replica set as a shard from my mongos instance. Am I right that it's possible or am I imagining things
[06:34:40] <pyarun> I am new to mongodb and trying some queries. I have a collection named person, and I want to get a dictionary with the person's last_name as key and the rest of the information about the user as value. here is the sample input/output: http://pastie.org/9777746#25
[08:43:34] <kali> pyarun: this is not the mongodb way. you may manage to format the answer this way, but at a very high price in terms of code complexity and performance (with map/reduce)
[08:43:58] <kali> pyarun: presenting results is the purpose of the application server, not the database
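kali's point above is that this reshaping is a few lines of application code. A sketch in plain JavaScript; the person fields other than last_name are invented, since the pastie is no longer available:

```javascript
// Reshape an array of person documents into an object keyed by last_name,
// in the application server rather than in mongodb.
// Field names besides last_name are made up for illustration.
function keyByLastName(docs) {
  const out = {};
  for (const { last_name, ...rest } of docs) {
    out[last_name] = rest; // a later duplicate last_name overwrites an earlier one
  }
  return out;
}

const persons = [
  { last_name: "smith", first_name: "john", age: 30 },
  { last_name: "jones", first_name: "ann", age: 25 },
];
const byName = keyByLastName(persons);
console.log(byName.smith); // { first_name: 'john', age: 30 }
```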
[12:00:07] <YaManicKill> Hey guys, I have a data structure for something which is similar to: {user: "sffwer", url: "http://mg.reddit.com"}, I want to get a list of these objects that match a certain regex for the url, but at most 1 per user.
[12:00:17] <YaManicKill> I got this far: db.reference.find( { url: {$regex: "mg.reddit.com"} } ).count()
[12:00:29] <YaManicKill> But obviously that doesn't do at most 1 per user. Any ideas?
[12:01:13] <YaManicKill> I thought I would just be able to stick on a ".group({key: user})", but that doesn't work...it seems to be a completely different function with a completely different way of doing things.
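For the record, the aggregation framework (rather than the old group() helper) is the usual answer here: $match on the regex, then $group by user keeping $first of each group. The shell form is sketched in the comment; the plain-JavaScript version below shows the same "at most one per user" logic:

```javascript
// In the shell, roughly:
//   db.reference.aggregate([
//     { $match: { url: /mg\.reddit\.com/ } },
//     { $group: { _id: "$user", doc: { $first: "$$ROOT" } } }
//   ])
// The same logic in plain JavaScript:
function onePerUser(docs, urlPattern) {
  const byUser = new Map();
  for (const doc of docs) {
    if (urlPattern.test(doc.url) && !byUser.has(doc.user)) {
      byUser.set(doc.user, doc); // keep only the first match per user
    }
  }
  return [...byUser.values()];
}

const refs = [
  { user: "sffwer", url: "http://mg.reddit.com/a" },
  { user: "sffwer", url: "http://mg.reddit.com/b" },
  { user: "other",  url: "http://example.com/c" },
];
console.log(onePerUser(refs, /mg\.reddit\.com/).length); // 1
```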
[13:41:55] <alexgg> So, is there a good, robust way to have a stable "insertion order" sort?
[13:44:19] <jiffe> so when I add a shard, how can I tell if things are moving?
[13:53:03] <kali> alexgg: if you use auto generated ObjectId for _id, they will be "roughly" sorted by creation date, and it will be stable
[13:53:37] <kali> alexgg: it's rough because two documents created in the same second by different processes and/or servers will not necessarily be in the right order
[13:53:41] <alexgg> yes, roughly :) I'm implementing an event store and external system will ask to be synchronized from event X
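The "roughly sorted" property comes from the ObjectId layout: the first 4 bytes are a big-endian Unix timestamp in seconds, so _id order tracks creation time only at one-second resolution. A small decoder:

```javascript
// The first 8 hex chars of an ObjectId are seconds since the Unix epoch;
// the remaining bytes (machine, pid, counter) only break ties per-process,
// hence kali's caveat about different processes/servers within one second.
function objectIdToDate(hexId) {
  return new Date(parseInt(hexId.substring(0, 8), 16) * 1000);
}

console.log(objectIdToDate("54849132e1382367aa6b8dcf").toISOString());
// 2014-12-07T17:41:06.000Z
```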
[13:56:24] <alexgg> but if not, can't the DB handle the Id generation?
[13:57:33] <kali> alexgg: there is a cost. you need to "pull" an id from mongodb to put it in your document before inserting them: http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/
[13:57:57] <alexgg> yeah I saw that, that would increase latency
[13:58:57] <alexgg> is that read query very performant too?
[13:59:10] <arnser> what's the best way to see how much memory/cpu mongo is using during a stress test? I am sending significant load to my server at the moment but can't see mongo in top anywhere; it seems to be using almost no memory or cpu
[13:59:27] <kali> alexgg: which read query ? it's not a read query, it's a findAndModify
[13:59:39] <kali> alexgg: so you'll have to perform twice as many write ops
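The counter pattern from the tutorial kali links is sketched below; the findAndModify per insert is the extra write op he mentions. The counter name is adapted to alexgg's event store and the shell form in the comment follows the tutorial, so treat names as illustrative:

```javascript
// Shell-side, the tutorial's pattern is roughly:
//   var next = db.counters.findAndModify({
//     query: { _id: "eventid" },
//     update: { $inc: { seq: 1 } },
//     new: true
//   }).seq;
//   db.events.insert({ _id: next, ... });
// A toy stand-in showing the two-step flow (the real $inc is atomic server-side):
const counters = { eventid: 0 };
function nextSeq(name) {
  counters[name] += 1;
  return counters[name];
}

const eventA = { _id: nextSeq("eventid"), payload: "first" };
const eventB = { _id: nextSeq("eventid"), payload: "second" };
console.log(eventA._id, eventB._id); // 1 2
```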
[13:59:45] <alexgg> kali: ah, I had the second scenario in mind
[14:00:40] <alexgg> There's a problem with the findAndModify approach I think
[14:00:43] <kali> alexgg: ha, you mean optimistic insertion ?
[14:01:09] <kali> alexgg: yeah, that one will be quite fast (just a lookup on the last item of an index, O(1))
[14:01:26] <alexgg> Writer A findAndModifies counter 3000, Writer B findAndModify counter 3001, but Writer B then writes the event faster than Writer A
[14:02:22] <alexgg> If the "read last event id" operation is O(1) then that would be quite excellent.
[14:02:42] <alexgg> Except the use of ever incrementing ids perhaps
[14:03:49] <kali> but i think you can get unlucky and find race conditions leading to local permutation in event order too
[14:04:20] <alexgg> even though the id would have an unique index?
[14:04:22] <kali> short of implementing a global lock, I think there are always race conditions
[14:05:45] <kali> alexgg: yes. A comes first, performs the max lookup, finds 3000, and tries 3001 but fails: B did the same lookup and its insert of 3001 won the race. A tries again and gets 3002
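The optimistic-insertion loop being described, sketched with a Set standing in for the collection's unique _id index; in real code the retry would be triggered by a duplicate-key error on insert:

```javascript
// Optimistic insertion: look up the current max _id, try max + 1, and retry
// if another writer claimed it first. A Set plays the unique index here.
const ids = new Set([2999, 3000]); // pretend 3000 is the current max _id
function insertNextId(ids) {
  for (;;) {
    const candidate = Math.max(...ids) + 1; // the O(1) tail-of-index lookup
    if (!ids.has(candidate)) {
      ids.add(candidate);
      return candidate;
    }
    // duplicate key: another writer won the race; loop and try again
  }
}

console.log(insertNextId(ids), insertNextId(ids)); // 3001 3002
```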
[14:33:40] <alexgg> I already needed a capped collection because the interface will watch changes in real time
[14:35:46] <alexgg> The only issue I can think of is back pressure
[14:36:22] <alexgg> http requests will happily return once the event is in the capped collection, but potentially the cold collection won't be able to keep up
[14:36:34] <alexgg> I don't think I will reach that level of activity though.
[15:23:00] <jiffe> so I've added a replica set as a shard to a cluster and watching logs I see the balancer locking and unlocking something but I don't see anything moving
[15:37:14] <jiffe> and when I run sh.isBalancerRunning() it varies between true and false, so it seems like it's turning on and off
[15:53:49] <kali> jiffe: yeah, the balancer sleeps and wakes up once in a while, so this is expected
[15:54:15] <kali> jiffe: have you sharded a database and a collection ? how big is the collection ?
[15:59:24] <jiffe> kali: the db was sharded but not the collection. I called shardCollection() but it's taking a while for that call to complete, and I don't see anything happening in the logs yet. the collection is 22869151710404 in size so not sure how long it should take
[16:25:50] <DesertRock> So in figuring out how to do something similar to a SQL join, I've been seeing lots of things suggesting I do a forEach, and then insert each modified record into a new collection. Is this really the suggested method of operations?
[17:49:44] <alexi5> I have a document pasted at http://pastie.org/9778447. is it possible to do a modify query on the document by incrementing votes and adding the string "121345" to the Numbers array of the sub document that has keyword equal to christmas ?
[17:50:37] <alexi5> can you give me an example of a modification document that can accomplish this
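The pastie is gone, so the array field name below ("keywords") is a guess; "votes", "Numbers", and "keyword" come from the question. With the positional $ operator the update would be roughly as in the comment; note that $ resolves to the first matching array element only:

```javascript
// In the shell, roughly (field names partly guessed, pastie unavailable):
//   db.coll.update(
//     { "keywords.keyword": "christmas" },
//     { $inc:  { "keywords.$.votes": 1 },
//       $push: { "keywords.$.Numbers": "121345" } }
//   )
// The same logic on a plain object; like $, it touches only the first match:
function voteAndPush(doc, keyword, number) {
  const sub = doc.keywords.find(k => k.keyword === keyword);
  if (sub) {
    sub.votes += 1;           // $inc
    sub.Numbers.push(number); // $push
  }
  return doc;
}

const doc = {
  keywords: [{ keyword: "christmas", votes: 2, Numbers: ["99"] }],
};
voteAndPush(doc, "christmas", "121345");
console.log(doc.keywords[0].votes); // 3
```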
[20:01:27] <mnms_> Can a single shard be replicated?
[20:01:47] <Derick> In the general setup, a shard is a replicaset.
[20:03:17] <mnms_> Derick: What does that mean? each shard should be on an individual machine, and this machine should have e.g. its own slave with replicated data?
[20:14:10] <Derick> mnms_: that's why you have three copies
[20:14:25] <Derick> mnms_: it does mean no data can be moved between shards (as shards normally autobalance)
[20:14:41] <mnms_> Ok I see test architecture it is with one config server
[20:14:42] <Derick> in general, you don't really need sharding though...
[20:15:02] <Derick> just a 3 node replicaset gets you very far - and you don't need mongos or config servers
[20:17:06] <mnms_> But to be clear: if two of the three config servers are down, data will not be moved between shards?
[20:24:52] <kali> with one config server down, the configuration database goes read-only: chunks stay where they are
[20:25:12] <mnms_> Derick: Ok, I'm a little confused: why should data be moved between shards? isn't data stored in a particular shard based on the key you choose?
[20:25:40] <mnms_> kali: if only one of the three servers stays up, what happens then?
[20:26:02] <Derick> mnms_: yes, but if you start adding data, you want each shard to have about the same amount of data. And balancing moves data between them
[20:26:31] <Derick> mnms_: same thing with 1 of 3 up - but that's a situation you don't want to be in
[20:26:51] <Derick> if you only have *configured* 1 config server, then there is no "read only" situation
[20:26:53] <mnms_> Derick: but it can be that way until I fix it
[20:26:56] <kali> mnms_: with two config servers down, it starts to stink. config database is still locked, of course, but you can no longer restart a mongos. as long as they stay up, the cluster "works"
[20:35:04] <mnms_> Derick: So why can the failure of one of the config servers have an impact on moving data between shards?
[20:35:33] <kali> because the config servers have to stay in sync
[20:35:49] <kali> if one of them is down, the balancing mongos can no longer update it
[20:36:36] <kali> in real life, it's usually fine: you don't want any server to stay down for very long, and you don't want balancing to happen too often either
[20:37:49] <mnms_> I don't understand the concept: I have three config servers to provide HA, and yet when one is down I lose some functionality
[20:38:58] <kali> your cluster will still work fine from the application point of view
[20:40:22] <kali> balancing is not a functionality, it's a sad necessity
[20:42:18] <mnms_> I'm reading about sharding right now because I don't understand some things...
[20:45:05] <mnms_> I take it the config servers don't need strong machines?
[21:18:57] <mnms_> MongoDB reads data from the config server data in the following cases: 1) A new mongos starts for the first time, or an existing mongos restarts 2) After a chunk migration, the mongos instances update themselves with the new cluster metadata
[22:22:34] <blizzow> I have a collection that was created but is not showing up when I run show collections on my mongos. I tried to drop the collection and saw a complaint "ns not found" from the servers in two of my three replica sets. :( I try the drop again and it fails with the same ns not found message. I'm able to re-create the collection, but it doesn't show up when I do show collections, and it complains that it's already sharded. How can I rem
[22:23:57] <Derick> It stopped at "sharded. How can I rem"
[22:24:03] <Derick> IRC has a 500 char limit per message
[22:29:03] <engirth> also, example in the pastie updates only the first match
[22:33:31] <blizzow> Derick: How can I remove this stubborn beast.