[03:26:26] <Siraris> If I have a field with Andrew Jones and one with Andrew Jones Smith, and I do a find for Andrew Jones, how do I make sure I don’t get Andrew Jones Smith?
[03:40:40] <joannac> Siraris: erm, if you do an exact match, that won't happen?
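A minimal sketch of joannac's point, assuming the names live in a hypothetical `name` field. An equality filter compares the whole string, while an unanchored regex filter matches any value that contains the substring. The filters are plain dicts in the shape a MongoDB `find` accepts; the `matches` helper is only a local illustration for these two filter shapes and is not part of any MongoDB API.

```python
import re

# Hypothetical documents; the field name "name" is an assumption.
docs = [{"name": "Andrew Jones"}, {"name": "Andrew Jones Smith"}]

# Equality filter: matches only when the whole value equals the string.
exact_filter = {"name": "Andrew Jones"}

# Unanchored regex filter: matches any value containing the substring.
substring_filter = {"name": {"$regex": "Andrew Jones"}}


def matches(doc, flt):
    """Local stand-in for server-side matching, limited to equality and $regex."""
    for field, cond in flt.items():
        value = doc.get(field)
        if isinstance(cond, dict) and "$regex" in cond:
            if not isinstance(value, str) or re.search(cond["$regex"], value) is None:
                return False
        elif value != cond:
            return False
    return True


exact_hits = [d["name"] for d in docs if matches(d, exact_filter)]
substring_hits = [d["name"] for d in docs if matches(d, substring_filter)]
```

Only the equality filter excludes "Andrew Jones Smith"; the regex form is the one that would return both.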
[06:39:11] <hahuang61> balancer seems to be not moving... very slowly at least. lots of disk io on the machine that has all the chunks and barely any network usage...
[06:39:22] <hahuang61> a mongorestore just finished, is it possible that it's still flushing something to disk?
[06:50:28] <Boomtime> i'd start with the one with lots of disk io
[06:51:03] <hahuang61> Boomtime: I see: luster cw-rs2-1.internal.bv:27018,cw-rs4-1.internal.bv:27018,cw-rs6-1.internal.bv:27018 pinged successfully at Thu Feb 5 22:48:25 2015 by distributed lock pinger 'cw-rs2-1.internal.bv:27018,cw-rs4-1.internal.bv:27018,cw-rs6-1.internal.bv:27018/cw-rs1-1.internal.bv:27019:1422559530:1804289383', sleeping for 30000ms
[06:51:14] <hahuang61> why's it sleep for 30 seconds in between?
[07:02:08] <Boomtime> yeah, it has to constantly verify, robustness costs speed
[07:02:15] <hahuang61> the machine with all the chunks
[07:02:25] <hahuang61> but 1 chunk per like 4 minutes seems way too long?
[07:03:48] <Boomtime> yeah, that isn't actually very good.. but it's not horrible either - a good shard key and pre-splitting will largely avoid the need to balance at all though
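Boomtime's suggestion of pre-splitting could look roughly like the sketch below. It assumes a hashed shard key on `_id` and a namespace of `game.players`, neither of which appears in the log. `admin` is expected to behave like PyMongo's `client.admin`, and sharding is assumed to be already enabled for the database.

```python
def shard_empty_collection(admin, namespace, shard_key, initial_chunks):
    """Shard an empty collection up front so later inserts spread across shards.

    `admin` only needs a `command` method, as PyMongo's `client.admin` has.
    """
    # numInitialChunks applies to hashed shard keys on an empty collection.
    return admin.command(
        "shardCollection",
        namespace,
        key=shard_key,
        numInitialChunks=initial_chunks,
    )
```

Against a real cluster the call might be `shard_empty_collection(client.admin, "game.players", {"_id": "hashed"}, 8)`, run before the restore so the data does not all land on one shard first.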
[07:04:22] <hahuang61> we're looking at network stats on the machine and it's kilobytes.
[07:04:26] <Boomtime> how big are the documents on average?
[07:06:32] <hahuang61> what's the process? these machines have gigabit transfer speed between them, so unless it's ALL disk IO, there's like no activity happening
[07:07:38] <joannac> what is the disk io on the one with all the chunks?
[07:14:44] <hahuang61> joannac: here's the story so far
[07:15:05] <hahuang61> joannac: we had everything sharded and ready to go, then we noticed that upsert is not an option for mongorestore.
[07:15:35] <hahuang61> joannac: we already set our maintenance window with our customers for tonight, so we had to do a last minute player collection dump (6.5 hours) and restore (26 hours) to the new database
[07:15:46] <hahuang61> joannac: which means I dropped the players collection, and lost the sharding settings I suppose
[07:16:09] <hahuang61> joannac: and now that we restored the players we have to wait for this thing to balance, cuz I didn't shard again after I dropped the collection.
[07:19:57] <hahuang61> i suppose nothing I can really do except keep doing deltas for the other collections then try this again in a few days (we have to give 1 week of notice for maint windows).
[07:20:03] <joannac> hahuang61: right. not sure what to tell you. you're doing it the slow way
[07:20:19] <hahuang61> joannac: doesn't matter, it's 1 week either way it seems
[07:46:33] <hahuang61> bah, yeah it's about a minute
[07:47:21] <Boomtime> goodo, that is what i'd expect
[07:48:07] <Boomtime> the shell command can determine pretty quickly if the commit step is going to work, but it doesn't wait for the entire operation to finish
[07:49:06] <Boomtime> meanwhile, the source shard will not permit a new migration to or from it to begin until it has fully completed the previous one - this includes cleaning (removing) the dual copies of documents which have migrated away
[07:49:36] <Boomtime> you are bound by the fact that all migrations are occurring at a single mongod - something that normally does not happen
[08:40:52] <hahuang61> joannac: thanks for the info and help today
[08:40:59] <hahuang61> joannac: I dropped the and am reimporting
[09:00:01] <tim___> Hey all. Using Morphia and Java. If I have several documents with Array<User> in each, how do I pull out all documents that contain arrays that contain a specific user?
[09:00:50] <tim___> i am hoping I will not have to iterate through them all in java but have the mongo layer find them for me
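tim___'s question got no reply in the log. As a hedged sketch of the underlying query shape (written in Python rather than Morphia, and assuming each document has a `users` array whose elements carry a `name` field), dot notation on an array field lets the server do the matching:

```python
def docs_containing_user(collection, user_name):
    """Return documents whose `users` array has an element with this name.

    `collection` is expected to behave like a PyMongo collection. The field
    names `users` and `name` are assumptions for illustration.
    """
    # Dot notation into an array matches if ANY element satisfies the condition.
    return list(collection.find({"users.name": user_name}))
```

The same filter document can be expressed through a Java driver or mapper, so no client-side iteration over all documents is needed.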
[10:34:34] <amcsi_work> I have PHP end-to-end tests written, each starting with mongo wiping the test db and inserting a bunch of static test data. This setUp does take half a second each which adds up quickly. Is there a way to speed this process up for testing?
[12:33:29] <StephenLynx> if I use findOne to retrieve an object with a subarray, is it faster than an aggregate that retrieves a number of documents that result more or less in the same size of the sub array in the first operation?
[12:34:05] <joannac> StephenLynx: an aggregate that retrieves full documents?
[12:34:25] <StephenLynx> no, I would use projection to retrieve only relevant data.
[12:34:53] <StephenLynx> I would use the same projection on both operations.
[12:35:38] <joannac> I am really confused by this usecase
[12:36:13] <StephenLynx> in the first case my data is {identifier:a, subArray:[{datax:1},{datax:1}]} and in the second {identifier:a, subidentifier:b, datax:1}
[12:47:37] <StephenLynx> yeah, I was thinking about that. I had to manually slice the posts because mongo can't slice a sub array.
[12:47:41] <joannac> an array is going to be smaller and less to retrieve, but none of that is going to matter if you have to parse your array and pull out only the bits you want
[12:48:30] <StephenLynx> when I retrieve posts, I take a parameters that dictates to return only posts with an id greater than the informed one.
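One way to do that filtering on the server instead of slicing by hand is an aggregation that unwinds the sub-array and matches on the element id. This is only a sketch: the field names `identifier`, `posts` and `id` are assumptions, and whether it beats client-side slicing is not settled in the log.

```python
def posts_after_pipeline(thread_identifier, last_seen_id):
    """Build a pipeline returning only sub-array posts with id > last_seen_id."""
    return [
        # Select the parent document first so later stages touch little data.
        {"$match": {"identifier": thread_identifier}},
        # Emit one document per element of the posts array.
        {"$unwind": "$posts"},
        # Keep only the elements newer than the id the caller already has.
        {"$match": {"posts.id": {"$gt": last_seen_id}}},
        # Reassemble the surviving elements into a single array per parent.
        {"$group": {"_id": "$_id", "posts": {"$push": "$posts"}}},
    ]
```

The returned list can be handed to a driver's `aggregate` call as-is.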
[14:50:48] <agend> but what about: "Pipeline stages have a limit of 100 megabytes of RAM. If a stage exceeds this limit, MongoDB will produce an error. To allow for the handling of large datasets, use the allowDiskUse option to enable aggregation pipeline stages to write data to temporary files."
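The option agend quotes is passed alongside the pipeline. A minimal sketch, assuming `collection` behaves like a PyMongo collection (the function name is made up for illustration):

```python
def aggregate_spilling_to_disk(collection, pipeline):
    """Run a pipeline while letting stages write temporary files on the server."""
    # allowDiskUse=True lifts the per-stage RAM limit by allowing temp files.
    return collection.aggregate(pipeline, allowDiskUse=True)
```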
[16:04:03] <LindaKendall> turns out I had a dupe slave somehow, thanks for the help ;)
[20:06:40] <dsirijus> if you have an array of {id:String, unlocked:Boolean}, and max size is ~20, is it better to allocate all 20 items, then just switch up the "unlocked" property, or just have an array of {id:String} and add/delete elements of it?
[20:10:49] <joannac> is there a Number in each array element?
[20:12:07] <dsirijus> (there's several of such array types, but each in itself is of fixed types)
[20:12:22] <dsirijus> but let's say this one is just a 5 char string and a Boolean
[20:12:39] <dsirijus> and other one is 5 char string, boolean and a number (but that's another array)
[20:16:41] <dsirijus> order of magnitude of all such arrays is 1, so 10-100 elements
[23:39:12] <jeebster> I know it's generally standard practice to index foreign keys in a relational database, but does this hold true in mongo?
[23:41:14] <jeebster> or is there not much reason since there's no concept of 'join' in mongo?
[23:43:35] <Boomtime> jeebster: MongoDB doesn't recognise the concept "foreign key" (as you say, it arises from the idea of joins), but you should index whatever patterns you commonly do queries on
[23:48:23] <jeebster> Boomtime: gotcha. I generally add an _id attribute for what I'd consider a 'relation' and query against it so perhaps that's a candidate for indexing
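Indexing such a reference field is a single call. A minimal sketch, assuming `collection` behaves like a PyMongo collection and using a hypothetical field name such as `author_id`:

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def index_reference_field(collection, field):
    """Create a single-field ascending index on a commonly queried field."""
    # An index only pays off if queries actually filter or sort on this field.
    return collection.create_index([(field, ASCENDING)])
```

For example, `index_reference_field(db.posts, "author_id")` would support frequent lookups of posts by author.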
[23:55:05] <esko> hi guys! could someone help with a quickie.. im just starting out with mongo. http://pastebin.com/tzH88wfB i need to insert some placeholder value inside all tags: which are empty