[00:24:15] <AlmightyOatmeal> https://jira.mongodb.org/browse/SERVER-17688 <-- open for almost a year and a half, yet the default storage engine still doesn't support this feature, and https://docs.mongodb.com/manual/reference/command/parallelCollectionScan/ makes no mention of WiredTiger not supporting it
[00:39:25] <AlmightyOatmeal> GitGud: you would be better off dumping the data to a backup. if you just copy the directory then you risk inconsistent data, among a number of other potential problems
[00:39:45] <AlmightyOatmeal> GitGud: referring to a backup vs. filesystem snapshotting
[00:39:54] <GitGud> "dumping the data to a backup" meaning?
[00:40:47] <AlmightyOatmeal> GitGud: actually i am wrong. copying the data directory isn't as fragile as it is on a larger RDBMS
[00:41:03] <AlmightyOatmeal> GitGud: https://docs.mongodb.com/manual/core/backups/ and more of what i was referring to: https://docs.mongodb.com/manual/tutorial/backup-and-restore-tools/
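(For reference, the backup-and-restore-tools page linked above covers mongodump/mongorestore, which read through a running mongod rather than copying raw files. A minimal sketch, with the database name and backup path made up:)

    mongodump --db mydb --out /backups/mongo-2016-09-01
    mongorestore --db mydb /backups/mongo-2016-09-01/mydb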
[00:41:21] <GitGud> well AlmightyOatmeal what I'm doing is more of a rough copy, not a production solution
[00:41:34] <GitGud> because in a production env I'm going to add replica sets anyway
[00:43:34] <AlmightyOatmeal> GitGud: makes sense. looks like mongo gives you a plethora of options then :)
[00:44:37] <GitGud> AlmightyOatmeal, basically I have 1 db, 1 repl set, and 1 backup filesystem copy I do every week
[00:45:07] <AlmightyOatmeal> GitGud: if you use ZFS then you can send filesystem snapshots to a remote host ;)
[00:46:08] <GitGud> AlmightyOatmeal, on the regular?
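(A rough sketch of what sending a ZFS snapshot to a remote host can look like, assuming the mongod data directory lives on its own dataset; the pool, dataset, and host names here are made up. Run from cron this gives a regular off-host copy, and incremental sends via zfs send -i keep later transfers small:)

    zfs snapshot tank/mongodata@weekly-2016-09-01
    zfs send tank/mongodata@weekly-2016-09-01 | ssh backuphost zfs receive backup/mongodata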
[09:45:18] <crazyadm> ok i got them all added to replication
[09:45:35] <crazyadm> now, how do I make set1 and set2 shards?
[09:51:22] <crazyadm> is there a tutorial on sharding?
[09:52:57] <crazyadm> when I add a shard, must it be the primary or a secondary?
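(On the sharding questions above: shards are added through a mongos, and each shard is the replica set as a whole, identified by its set name plus a seed member, rather than its primary or a secondary specifically. A hedged pymongo sketch with made-up set and host names:)

    from pymongo import MongoClient

    # connect to a mongos router, not directly to a shard member
    mongos = MongoClient("mongodb://mongos-host:27017")

    # add each replica set as a shard, in the form "<setName>/<seed host:port>"
    mongos.admin.command("addShard", "set1/set1-host1:27017")
    mongos.admin.command("addShard", "set2/set2-host1:27017")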
[12:37:39] <jrmg> Hello everyone. I've got a question on the best way to rebuild a collection that acts as a cache layer. The collection contains aggregation results from another collection (>1M documents). What would be the best way to rebuild the entire cache collection without the process taking ages? via the shell? a multithreaded client?
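(One common answer here is to let the server rebuild the cache collection itself by ending the aggregation with a $out stage, so the >1M source documents never travel through a shell or client at all. A hedged pymongo sketch; collection and field names are made up:)

    from pymongo import MongoClient

    db = MongoClient()["mydb"]

    # recompute the aggregates server-side; $out replaces the "cache" collection
    # with the new results when the pipeline finishes
    db.source.aggregate([
        {"$group": {"_id": "$category", "total": {"$sum": "$amount"}}},
        {"$out": "cache"},
    ], allowDiskUse=True)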
[13:19:27] <mumla> hi everyone! does somebody know how to execute db.stats() in java with the mongodb-driver? I just downloaded the newest version 3.4.0beta and the command "getStats" doesn't exist as expected according to the docs.
[13:26:06] <cheeser> what are you calling getStats() on? pastebin your code somewhere.
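(On the db.stats() question: shell helpers like db.stats() are thin wrappers around database commands, so a driver that lacks a helper can still run the command directly; the Java driver exposes this as MongoDatabase.runCommand. Illustrated here with pymongo just to show the idea:)

    from pymongo import MongoClient

    db = MongoClient()["mydb"]
    stats = db.command("dbStats")   # the same command db.stats() issues in the shell
    print(stats["dataSize"], stats["storageSize"])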
[14:51:44] <AlmightyOatmeal> is there a secret to using a compound index? i'm trying to query a single field that is part of a compound index but the query planner is telling me that it's going to scan the entire collection instead of using the compound index :(
[15:09:37] <StephenLynx> how is the compound index built and what is the query?
[15:14:44] <AlmightyOatmeal> StephenLynx: what do you mean by "how is the compound index built"? it's a simple find() with a field that is in the compound index and a corresponding value
[15:15:52] <AlmightyOatmeal> oh. the field i'm querying is one of the last fields in the compound index
[15:16:18] <AlmightyOatmeal> there are maybe 22 fields in the compound index
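(That is the expected behaviour: a compound index can only serve queries on a prefix of its key fields, so filtering on one of the last of 22 fields falls back to a collection scan. A small pymongo sketch with hypothetical field names:)

    from pymongo import MongoClient, ASCENDING

    coll = MongoClient()["mydb"]["events"]
    coll.create_index([("a", ASCENDING), ("b", ASCENDING), ("c", ASCENDING)])

    # can use the index: the filter covers a prefix of the key (a, or a+b, or a+b+c)
    print(coll.find({"a": 1, "b": 2}).explain())

    # cannot use the index efficiently: "c" alone is not a prefix, so this scans
    print(coll.find({"c": 3}).explain())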
[15:33:33] <AlmightyOatmeal> mongo decided to stop returning results around 14M documents, which is under half of the documents in that collection. so now i need to re-run a find() and upsert all of the 14M documents that mongo has spent the past 12 hours spitting out at me.
[15:39:03] <AlmightyOatmeal> GothAlice: the entire collection won't fit in RAM but that's a good suggestion, let me check my vm stats
[15:40:24] <GothAlice> However, more generally, a process which requires examining every single record in a collection of millions in a single pass is sub-optimal. At a minimum you're going to need to track progress, handle cursor timeouts, and retry. If bulk write and bulk read is purely the goal, you might also investigate capped collections, which are somewhat more efficient for FIFO use.
[15:44:44] <GothAlice> (Capped collections also provide a means to stream process data instead of buffering it all then processing it all in one go.)
[15:45:37] <AlmightyOatmeal> GothAlice: mongo is grabbing ~1k documents at a time and even if the entire collection doesn't fit into memory, a majority share can fit in memory but that still wouldn't explain the slowness.
[15:49:39] <AlmightyOatmeal> GothAlice: but i have very little disk I/O happening.
[15:50:41] <GothAlice> AlmightyOatmeal: Faults represent access to memory that was not loaded into the process. It might already be in cache, so the fault would "link" the missing memory into the MongoDB process. This wouldn't show up as disk IO directly.
[15:51:59] <GothAlice> However faulting freezes the thread that faults until the linking is complete, so, it's terrible for performance.
[15:52:31] <AlmightyOatmeal> GothAlice: oh, now that does make sense
[15:52:38] <GothAlice> (It might show up as io-wait, but not as literal IO bus activity.)
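(The faulting GothAlice describes can be watched from serverStatus; on Linux the extra_info section includes a page_faults counter. A quick pymongo sketch:)

    from pymongo import MongoClient

    status = MongoClient().admin.command("serverStatus")
    # cumulative page faults for the mongod process since startup
    print(status["extra_info"]["page_faults"])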
[17:32:06] <FrancescoV> Hi all, I'm trying to use mongodb in a docker container but I get this error: 'KeyError: 'DB_PORT_27017_TCP_ADDR'' with this code: client = MongoClient(os.environ['DB_PORT_27017_TCP_ADDR'],27017)
[17:32:09] <FrancescoV> any advice on what it should be?
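(The KeyError just means DB_PORT_27017_TCP_ADDR isn't set inside the container; Docker only injects that variable when the app container is started with a --link to a container aliased "db". A hedged workaround is to fall back to a known hostname, e.g. a compose service name, when the link variable is absent; the "db" fallback below is an assumption:)

    import os
    from pymongo import MongoClient

    # use the docker link variable if present, otherwise a service/host name
    host = os.environ.get("DB_PORT_27017_TCP_ADDR", "db")
    client = MongoClient(host, 27017)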
[20:27:52] <cheeser> replace the binaries. restart.
[21:00:40] <edrocks> do you have to do anything special to upgrade a replica set's `protocolVersion` to 1? I just upgraded everything to mongodb 3.2.9
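(For the protocolVersion question: once every member is on 3.2+, the switch is a normal replica set reconfig, i.e. fetch the config, set protocolVersion to 1, bump the version, and reconfigure. The same steps are usually done in the shell with rs.conf()/rs.reconfig(); a pymongo sketch with a made-up host name:)

    from pymongo import MongoClient

    # connect to the current primary
    admin = MongoClient("mongodb://primary-host:27017").admin

    cfg = admin.command("replSetGetConfig")["config"]
    cfg["protocolVersion"] = 1
    cfg["version"] += 1         # reconfig requires a higher config version
    admin.command("replSetReconfig", cfg)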
[22:22:19] <DarkCthulhu> I have a question about mongodb replica sets. If I'm running a replica set in production, I need to always send writes to the master, right? How do I discover the master?
[22:22:28] <DarkCthulhu> How does a production setup work, rather?
[22:23:51] <joannac> connect using a replica set connection string; the driver will determine which one is the primary
[22:29:01] <DarkCthulhu> joannac, do you have docs for this? How does the driver figure it out?
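(What the driver does under the hood: given a seed list plus the replicaSet name, it periodically runs isMaster against the members, learns the full set membership from the replies, and routes writes to whichever member currently reports itself primary, re-discovering after a failover; this is specified in the drivers' server discovery and monitoring (SDAM) documentation and in each driver's connection-string docs. A minimal pymongo example with made-up host names:)

    from pymongo import MongoClient

    # seed list + replicaSet name; the driver discovers the rest of the set
    # and always sends writes to the current primary
    client = MongoClient(
        "mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0"
    )
    client.mydb.mycoll.insert_one({"hello": "world"})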