[01:04:40] <bakadesu> say like in a SaaS model, would client_id be a bad shard key?
[01:35:21] <Goopyo> bakadesu: SaaS models vary…. what exactly are you trying to do?
[01:36:03] <Goopyo> The crucial thing is key cardinality, so if your client_id gives you good chunks of data according to which you want to shard, it could be a great key
[01:37:06] <bakadesu> say a client gets really large, say 25-50% of database
[01:38:29] <Goopyo> then yeah I can see it being a good shard key. Thing is 'good shard keys' are relative to how good other keys would be which you have to investigate relative to your dataset
[01:39:15] <bakadesu> still trying to wrap my head around it, been reading some conflicting information
[01:39:50] <bakadesu> like I would think a listing of people by first name would be a decent shard key
[01:40:44] <bakadesu> but I don't know if/how things change if that listing is already sharded by client_id
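For reference, a minimal mongo-shell sketch of the setup being discussed; the database "app", collection "people", and field names are all made up for illustration:

    // assumes a running sharded cluster and a mongos connection
    sh.enableSharding("app")
    // shard on client_id alone
    sh.shardCollection("app.people", { client_id: 1 })
    // a compound key, e.g. { client_id: 1, first_name: 1 }, is one way to
    // raise cardinality if a single client grows to 25-50% of the data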
[08:51:42] <lerian> hi guys, a friend is looking for a System Engineer (the middleman will get 200euros if the person is hired): http://pastebin.com/zpCTvK64
[12:50:58] <aganov> Hi all, can I search for a specific string across an entire database? I have a huge db (10GB) and a Java application that uses mongodb, and I need to find where it's storing a specific piece of data/record
[12:52:55] <lucian> aganov: if you set an index for that query, it'll be reasonably fast
[12:55:34] <aganov> lucian, it doesn't need to be fast, I just need to find where in the db a specific "string" is stored, e.g. db.find("awesome") -> { objects that contain the string "awesome" }
[12:59:15] <aganov> NodeX, I have a big db which is used by a closed-source Java application; that application returns some data from mongodb, and I need to find in which collection a specific piece of information is stored. For example the string "awesome"
[12:59:54] <NodeX> you have to loop each collection and look for it
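A rough mongo-shell sketch of that brute-force scan; the search string is just a placeholder, and this will be slow on a 10GB database, but it only needs to run once:

    // walk every collection in the current db and print the ids of
    // documents whose JSON representation contains the string
    var needle = "awesome";
    db.getCollectionNames().forEach(function (name) {
        db.getCollection(name).find().forEach(function (doc) {
            if (tojson(doc).indexOf(needle) !== -1) {
                print(name + " : " + doc._id);
            }
        });
    });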
[13:24:21] <whitehat752> hi! if i have a json document inside another json document and I assign that document to a variable - how can i change the inner document through the variable?
[13:24:42] <whitehat752> a = {title: "test", body: {par1: "blabla", par2: "tata"}}
[14:05:48] <Gargoyle> whitehat752: After you have done var a = db.coll.find(), then you can type a.help() to find the methods on the cursor that you can use.
[14:06:36] <Gargoyle> I'm only getting 34KB/s from downloads-distro.mongodb.org
[14:08:56] <Gargoyle> whitehat752: …find() returns a cursor, findOne() returns a document.
[14:09:16] <NodeX> a cursor -must- be looped / iterated
[14:09:42] <whitehat752> with findOne - it works :) thanks
[14:11:22] <whitehat752> for find I should use for, foreach.. for findOne I can use the variable like a document, i've got it :)
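A small mongo-shell sketch of the difference; the collection and field names are made up:

    // findOne() returns a plain document, so the embedded doc can be
    // edited in place and written back
    var a = db.posts.findOne({ title: "test" });
    a.body.par1 = "something else";
    db.posts.save(a);

    // find() returns a cursor, which has to be iterated
    db.posts.find().forEach(function (doc) {
        printjson(doc.body);
    });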
[14:13:45] <Gargoyle> Derick: I am just rebuilding the box I got the info for those segfaults from. But I think I can get my mac to do it as well. Just let me know if there's any more info I can dig up for you.
[14:55:22] <Gargoyle> I like poking round to see what it's trying to do! So many these days are just "come and stick your username and password in my phishing site".
[15:04:37] <WoodsDog> we upgraded from mongo 2.0 to 2.2. we are running a replica set. in the 2.0 version, we were using a read only user to backup with mongodump.
[15:18:31] <doxavore> If background flushing is averaging 15 seconds, is there something that should be tweaked? I'm seeing periodic locking of the DB (reads and writes). I've heard background flushing is truly in the background and shouldn't affect anything, but I'm at a loss as to what's causing it.
[15:19:04] <doxavore> I've tried using a single disk, various levels of RAID, they all see disk busy% spike and the DB freeze.
[15:39:27] <jrdn> geez, replica has been in recovery and doing something for 6 hours
[16:02:58] <addisonj> hrm, so I have a large collection of log data that I just dropped, some 35 gb, will mongodb reuse those storage files, or do I still need to do a compact?
[16:03:33] <NodeX> it will; eventually it'll come back and page out/in
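If the space actually needs to be handed back to the OS rather than reused, the usual options in this era are compact (per collection) or repairDatabase (whole database); a sketch, with a made-up collection name:

    // compact rewrites and defragments a single collection
    // (it blocks the database it runs on while it works)
    db.runCommand({ compact: "logs" })
    // repairDatabase rewrites every collection in the current db and
    // needs free disk roughly equal to the current data size
    db.repairDatabase()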
[16:05:09] <jrdn> NodeX, so our primary crashed last night and our secondaries promoted themselves… but, the secondaries somehow were stale.. so we recovered our initial primary and are trying to create new replicas for it… doing a full resync has been taking about 6 hours and there's only 12G of data
[16:05:27] <jrdn> we're going to attempt just copying the data from the primary (since we have that data snapshotted anyway)
[16:06:47] <NodeX> It's not something I know a lot about jrdn, sorry
[16:06:49] <addisonj> make sure your oplog isn't empty on the secondary, otherwise it will just trash the data
[16:08:03] <addisonj> 6 hours though, for only 12gb... you should be done or pretty close, unless your write lock % has been consistently high
[16:08:27] <addisonj> or your network blows (but it sounds like you are on ec2)
[16:10:17] <_m> As addisonj said, you should be caught up or really close by now. I would spin up a new primary with the snapshot's data. Make sure the secondary's oplog isn't empty though.
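The oplog check being described can be done from the mongo shell on the member in question, roughly like this:

    // replication state of the set and of this member
    rs.status()
    // oplog size and the time window it covers
    db.printReplicationInfo()
    // the oplog itself lives in the local database
    db.getSiblingDB("local").oplog.rs.count()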
[16:11:50] <doxavore> we sync about 500GB of data in 6 hours when turning up a new replica member - and we usually have positively horrible MongoDB performance :-/
[16:30:28] <codemagician> Is there an advantage to using Doctrine MongoDB versus writing my own custom Data Mapper patterns?
[16:30:45] <Derick> I am of the opinion you don't need an ODM with MongoDB
[16:31:13] <Derick> don't add extra layers, you can use Mongo from your models
[16:31:24] <jrdn> we were using odm, but got rid of it and started just using mongo on top of typical domain modeling
[16:31:27] <codemagician> Derick: yes, I had been thinking this
[16:33:03] <codemagician> is there a simple way to convert model classes to arrays… perhaps just overriding the __toArray() magic method and then casting the model objects before passing them into save() ?
[16:33:27] <codemagician> or using a SPL interface?
[16:36:09] <bhosie> i noticed here http://www.mongodb.org/display/DOCS/Excessive+Disk+Space that running repairDatabase() is a blocking action. Does it block at the database level, or will it block the whole mongod process?
[16:39:50] <jrdn> codemagician_, we have repositories, services, and domain models.. repositories only do persistence to mongo, services use the repository, finds hydrate the domain models, and saves "dehydrate" them… you can use toArray() or explicitly get what you need in each save method.
[16:40:25] <jrdn> addisonj, _m, so if the oplog count is 0 on the replica that's trying to resync.. that means it's F'd right?
[16:42:02] <codemagician_> jrdn: is the MongoDB PHP API enough that I don't need to write an abstraction layer (such as a data mapper pattern) between my controllers and it?
[16:43:00] <jrdn> as Derick said, you can just put the document data into a $data property in your domain model
[16:43:42] <codemagician_> jrdn: if for example I have a class User { private $_id; private $data1; .. } could I just do a $db->save($user). If the value of _id is null, will it insert and if it's set to a MongoDb object will it update?
[16:46:38] <Derick> codemagician_: $_id will automatically be added
[16:48:07] <_m> Also see: https://github.com/mongodb/mongo-php-driver/blob/master/collection.c#L1090-1126
[16:48:42] <codemagician_> What about if my app has a model hierarchy and a child and parent model change, would I then gain from using an ODM like Doctrine?
[16:48:44] <Derick> dunno if it actually will update the $_id property of the object
[16:51:22] <NodeX> upserting is done with a query, if the query is matched then it updates else inserts
[16:51:29] <NodeX> if that explains it any better for you
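A mongo-shell sketch of the save() semantics being described here; the PHP driver's MongoCollection::save() follows the same insert-or-update-by-_id rule, though whether it writes $_id back onto a PHP object is the part Derick is unsure about:

    // no _id yet, so save() inserts and the shell assigns one
    var user = { name: "alice" };
    db.users.save(user);
    printjson(user._id);

    // _id is now set, so the same call becomes an update (upsert by _id)
    user.name = "alice in chains";
    db.users.save(user);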
[16:51:32] <codemagician_> say I have a tree of objects like A which contains B, C and D. If I delete D and update C. Then I save A, will MongoDb PHP just take care of all this?
[16:51:45] <codemagician_> i.e. will it chain down
[16:51:46] <Derick> there is no magic in the driver
[16:52:03] <codemagician_> so then, I will end up with a fat controller scanning for changes?
[16:52:16] <_m> codemagician_: I find it easier to use an ORM for larger applications as generally some amount of an ORM's functionality will need to be reimplemented within my stack.
[16:52:21] <codemagician_> which is why I had leaned towards having an abstraction layer
[16:52:37] <Derick> remember though that in mongodb, you don't have relations between collections
[16:58:54] <jrdn> we still use domain objects and custom mapping to ease validation and provide something to the view (so we show 'name' instead of 'n' / 'createdAt' instead of 'ca', etc)
[16:59:10] <jrdn> but it's just one file now instead of 1000000012398489072398074
[16:59:35] <jrdn> and then a custom cursor to do object mapping… but where our app needs speed, we just use raw mongo.
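A hypothetical sketch of the short-to-long field mapping being described, written as a plain JavaScript function (their version is a custom PHP cursor); the field map is invented for illustration:

    // map the abbreviated field names stored in mongo to readable ones
    var fieldMap = { n: "name", ca: "createdAt" };
    function hydrate(doc) {
        var out = {};
        for (var key in doc) {
            if (doc.hasOwnProperty(key)) {
                out[fieldMap[key] || key] = doc[key];
            }
        }
        return out;
    }
    db.events.find().forEach(function (doc) { printjson(hydrate(doc)); });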
[17:25:11] <jrdn> what causes the "replSet not trying to sync from *, it is vetoed for N more seconds"
[17:34:37] <TecnoBrat> do we have a realistic timeline for 2.2.1? every day I look its pushed back another day.
[17:35:01] <tomreyn> hi, i'm trying to back up all mongo databases (ideally including any relevant meta data such as permissions) on a server using mongodump --username $DBUSER --password $DBPASSWORD --out $BAKDIR/$TIMESTAMP
[17:59:30] <jrdn> could my data be corrupt? if so how can i check?
[17:59:39] <NodeX> about 21 miles from Liverpool iirc
[19:03:54] <LouisT> I'm trying to allow users to search multiple fields in a database using a string or regex.. what would be the safest way to do user supplied regex?
[19:11:13] <crudson> LouisT: sanitizing input doesn't get you anything. you can strip out whatever for peace of mind, but there is no 'bson injection' or such. If you want to validate that a regex is valid or conforms to some specific rules that you want, you'll have to do that in your application; the options available to you will depend on the language being used.
[19:13:02] <LouisT> Well it'll be PHP, but my issue is that I planned on using $where with a function to check multiple fields, I'm just not sure if it'd be easy/possible for them to exploit it.. So I figured someone else would know more than I do.
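One way to avoid handing user input to $where is to query the fields directly with $regex under an $or; a mongo-shell sketch, with made-up collection and field names and the pattern assumed to be validated application-side:

    // search several fields with one user-supplied pattern, no $where needed
    var pattern = "awesome";   // user input, assumed already validated/escaped
    db.things.find({ $or: [
        { title:       { $regex: pattern, $options: "i" } },
        { description: { $regex: pattern, $options: "i" } }
    ] });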
[21:38:04] <tystr> yes we've set up mms monitoring
[21:38:21] <tystr> but it doesn't seem too flexible as far as notifications go
[21:55:17] <meghan> friendly reminder for the west coast people here MongoSV (silicon valley mongodb conference) early bird is ending this week http://www.10gen.com/events/mongosv
[22:02:45] <Zelest> NodeX, looks like my hatred for debian/ubuntu remains.. ugh. :P
[22:37:32] <alx___> is it possible to store a matrix in mongodb
[23:10:17] <aboudreault> do you sometimes manage your app users inside mongodb, and not just the app itself (big data, etc.)?
[23:12:11] <drunkgoat> i need some help with node-mongodb. i'm doing collection.find({_id:{'$in':userList}}).toArray(function(err, users) {} ). this should return all users with _id in userList[]. is that right?
[23:19:35] <drunkgoat> so it must be unrelated to mongo
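For what it's worth, one common cause of an empty result with that query is passing the ids as hex strings rather than ObjectIDs; a sketch assuming the node-mongodb-native driver:

    // if userList holds the ids as hex strings, convert them first
    var ObjectID = require('mongodb').ObjectID;
    var ids = userList.map(function (id) { return new ObjectID(id); });
    collection.find({ _id: { $in: ids } }).toArray(function (err, users) {
        if (err) return console.error(err);
        console.log(users.length + ' users found');
    });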
[23:55:23] <jin> I have a question about replica sets with 4 nodes. I am looking to configure a secondary node to be a backup by setting a slaveDelay. However, I can't find any documentation on setting this backup node to sync only from the other secondary, so that it won't bog down the primary. I would appreciate it if you have some pointers.