[04:06:33] <Boomtime> ok, so that's fine, but since you group on it, i'm trying to understand how many docs are likely to actually group on that field
[04:06:41] <dimaj> actually, i also have a question around that... when I do 'new ISODate("$date")', i either blow up or get a huge negative number for the year... not sure why
[04:07:16] <dimaj> so, this db is for test results... if i have 100 tests within a single run - 100 docs
[04:07:38] <Boomtime> 'new ISODate' would be evaluated on the client, because that's a javascript language object
[04:07:50] <Boomtime> you need it to be evaluated on the server
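A minimal sketch of the difference Boomtime describes (collection and field names are assumptions): on the client, "$date" is just a string handed to the Date constructor, while a server-side date expression such as $dateToString (MongoDB 3.0+) receives each document's actual field value.

    // broken form (client-side): ISODate() receives the literal string "$date", so it errors
    //   db.results.aggregate([{ $project: { day: new ISODate("$date") } }])
    // server-side form: the aggregation expression receives each document's date value
    db.results.aggregate([
      { $project: { day: { $dateToString: { format: "%Y-%m-%d", date: "$date" } } } }
    ])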
[04:31:39] <dimaj> found my answer... http://stackoverflow.com/questions/21967233/sorting-aggregation-addtoset-result
[04:42:42] <Boomtime> @dimaj: rather than $addToSet then $unwind like the suggestion, in your case, you can $group once (with the array field as part of it) to cull out duplicates, then $sort to get the right order, then $group again to build the array
[04:43:19] <Boomtime> on the second $group you will know there are no duplicates, so $push will work cleanly
[04:43:35] <Boomtime> should be cheaper than $addToSet followed by $unwind :p
[04:44:05] <Boomtime> but you can always test it yourself - the result is the same either way
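A sketch of that group/sort/group shape, using the field names from dimaj's later example (name, date) as assumptions:

    db.results.aggregate([
      // first $group: one document per (name, date) pair, which drops duplicate dates
      { $group: { _id: { name: "$name", date: "$date" } } },
      // sort so the dates enter the array in order
      { $sort: { "_id.name": 1, "_id.date": 1 } },
      // second $group: rebuild one document per name; $push is safe because duplicates are gone
      { $group: { _id: "$_id.name", dates: { $push: "$_id.date" } } }
    ])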
[04:44:12] <dimaj> well... i'm thinking that if it's a rather expensive procedure, i might just leave it the way $addToSet gives it to me
[05:28:18] <dimaj> @Boomtime: one more question... is it possible to inject a static object into an array of each resulting document?
[05:28:44] <swaps> meaning, what do you want to do here?
[05:30:33] <dimaj> so, after i do my aggregate pipeline, I end up with a document that looks like this: '{name: 'some name', runs: [ {date: '2016-03-29', threads: ['thread1', 'thread2']}]}'
[05:30:56] <dimaj> I would like to inject a $literal 'All Runs' as the first element of the '$runs' array
[05:36:54] <Boomtime> @dimaj: the easy way would be to use $setUnion but i think that won't assure order again (cos sets)
[05:37:31] <Boomtime> can you carry that extra information in a separate field?
[05:38:53] <dimaj> no.. i want to have a new "date" that would be interpreted by my web app as "give me results for all dates"
[05:41:02] <Boomtime> wait.. your example looks pretty simple
[05:41:30] <Boomtime> you just want an extra field in the object? - so add in a literal prior to the $group that forms that array
[05:42:31] <Boomtime> field:{$literal:<value>} in the $project btw
[05:43:23] <dimaj> this is my new pipeline: http://pastebin.com/k0PpS1SU
[05:47:44] <Boomtime> ok, i don't think i understand your original question, but whatever works for you
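For reference, a minimal sketch of the $literal-in-$project approach Boomtime mentions (stage placement and field names are assumptions):

    db.results.aggregate([
      // $literal injects a constant, evaluated on the server, so it isn't mistaken for a field path
      { $project: { date: 1, threads: 1, label: { $literal: "All Runs" } } }
      // ...later stages can $push or reference "$label" like any other field...
    ])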
[06:05:59] <YoutubeAPI> Can someone explain why this doesn't save when I use my `id` and `reply` variables? https://gist.github.com/InternetExplorer7/3de11fd6bba592077e61f3ab43d1a476
[06:09:49] <dimaj> @YoutubeAPI: if I were to make a guess, it's because the default behavior of _id is an autogenerated ObjectId.. you might want to modify your model to rename your '_id' to something else like 'id'
[06:10:50] <dimaj> also, my schema looks something like this: new Schema({// my types //}, {collection: 'myCollection'});
[06:11:54] <YoutubeAPI> @dimaj: I knew that _id generated unique IDs per document, are we not allowed to edit the _id field in a document?
[06:12:55] <YoutubeAPI> Oh, you know what I think it is?
[06:13:18] <YoutubeAPI> I think there might possibly be a limit in length.
[06:13:24] <Boomtime> _id is immutable, but it can be set to whatever you like on first insert
[06:14:10] <Boomtime> if it is not present, the driver will invent one for you; if you beat up the driver and force an insert to the server of a document without an _id, the server will invent one for you
[06:14:22] <YoutubeAPI> Is there a limit on how long your _id can be?
[06:14:41] <Boomtime> it is a field like any other, but the index on that field has the usual limit of 1024 bytes
[06:15:42] <Boomtime> it also cannot be an array - a subdocument is also considered bad form, but not expressly denied
[06:16:05] <YoutubeAPI> Right, I'm converting it from an array to a string before insert.
[06:16:24] <YoutubeAPI> Hmm, then how did 1120 length String fit into another field?
[06:16:45] <Boomtime> it is not the field length that matters, as i said, it is the index on that field that matters
[06:17:15] <Boomtime> you can store whatever you like in any field you like, but if you try to index a field the index will only permit values of 1024 bytes to be indexed
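A small sketch of the _id rules Boomtime describes (the collection name and values are hypothetical):

    // a custom _id is fine as long as it is supplied on the first insert
    db.videos.insert({ _id: "comment:abc123", body: "some reply text" })
    // it is immutable afterwards: attempting to change it is rejected with a write error
    db.videos.update({ _id: "comment:abc123" }, { $set: { _id: "something-else" } })
    // and, like any indexed field, the _id index only accepts keys up to 1024 bytes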
[06:17:49] <YoutubeAPI> Right, and that makes sense. Thanks.
[09:39:30] <Ange7> is it possible to measure the time of one query in the mongo shell?
[09:52:51] <Ange7> i'm trying to optimize one aggregation which takes 150 sec with the PHP driver. so i tried with another driver ... but i see that it's very slow in the mongo shell too, so i don't know how to resolve the performance problem
[09:53:38] <Derick> can you share the explain output, and the aggregation pipeline?
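Two common ways to time a query from the shell, as a sketch (collection and filter are hypothetical):

    // 1) ask the server how long execution took
    db.orders.find({ status: "A" }).explain("executionStats").executionStats.executionTimeMillis

    // 2) time the full round trip, which also works for an aggregation
    var t0 = new Date()
    db.orders.aggregate([ { $match: { status: "A" } } ]).toArray()
    print("elapsed ms: " + (new Date() - t0))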
[13:11:16] <gcfhvjbkn> trying to figure out why my shard won't drain; i run removeShard, it is in the "ongoing" stage now, but the number of chunks doesn't decrease; moreover, whenever i run "db.collection.getShardDistribution()" it shows the same doc count on the target shard
[13:11:36] <gcfhvjbkn> is it wrong to think that getShardDistribution can be used to track draining progress?
[13:11:50] <gcfhvjbkn> either that, or my draining has stalled
[13:12:04] <gcfhvjbkn> none of my collections are primary on the target shard btw
[14:15:16] <bjpenn> we had an issue yesterday where mongodb cpu spiked to near 100% usage on all CPUs
[14:15:28] <bjpenn> anyone know how this could happen?
[14:15:40] <bjpenn> i thought usually it would be IOPs taking a hit
[14:58:45] <Ange7> is it possible to do : findAndModify(['key' => $key], ['count' => '$count' + $otherVarCount]) ?
[15:17:26] <Derick> JustMozzy: what is the pipeline supposed to do?
[15:17:54] <JustMozzy> Derick: it is supposed to give me the count of all unique users and an array of all unique userids
[15:18:46] <Derick> easiest would be to just add (After line 20): , 'users' => [ '$addToSet' => '$users' ]
[15:18:59] <Derick> but, doing an unwind, and then an add to set is a bit silly
[15:20:42] <JustMozzy> Derick: just tested to see what would happen if the unwind was removed. it would result in a count=1. I think I got my head wrapped around the unwind functionality now
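A shell sketch of the shape Derick describes, assuming each document carries a users array (collection and field names are assumptions, since the actual pastebin pipeline isn't shown here):

    db.events.aggregate([
      { $unwind: "$users" },                                      // one document per user entry
      { $group: { _id: null, users: { $addToSet: "$users" } } },  // collect the unique user ids
      { $project: { _id: 0, users: 1, count: { $size: "$users" } } } // count them
    ])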
[15:23:00] <bjpenn> anyone know what can cause mongodb to go to 100% cpu?
[15:27:09] <Ange7> is it possible to do : findAndModify(['key' => $key], ['count' => '$count' + $otherVarCount]) ?
[15:27:53] <JustMozzy> bjpenn: mapreduce on a large (20gb+) dataset?
[15:30:06] <bjpenn> JustMozzy: wouldn't that just increase iops?
[15:31:06] <JustMozzy> bjpenn: good question. am no expert ;) but we once had the problem that we fired too many aggregates at the server at once and it went close to 100%
[15:33:15] <Doyle> bjpenn, if your CPU is under provisioned, the compression in WT could cause that. Indexing. An archive gzip dump.
[15:35:46] <Doyle> Here's a MongoDB 3.0 question. Sometimes, during periods of high load, I see the mongodb process using swap space and then getting killed by the OOM killer. I know MMAP shouldn't use swap, but it does. How much swap space should be provisioned on a MongoDB server? Considering it shouldn't use any, I've been giving them 1GB of swap for system use, but it seems that's not enough, or there's a bug.
[15:39:14] <bjpenn> Doyle, JustMozzy: doing db.currentOp() I found a bunch of queries taking a really long time
[15:39:27] <bjpenn> do you think that's a symptom of the high CPU?
[16:17:34] <GothAlice> Doyle: MMAP will use swap; its entire point is to let the kernel handle paging memory on and off disk by using "memory mapped files". If you have more data than will fit in RAM, that process of loading/unloading is "swapping".
[16:24:40] <GothAlice> invapid2: .skip(100).limit(100) roughly. However, please note that skipping requires scanning the index to find where to continue from, making it slower to skip further at roughly O(log n).
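For example, page two of a 100-per-page listing might look like the sketch below; the skipped entries still have to be walked, which is why deep skips get slower (collection name is hypothetical):

    db.items.find().sort({ _id: 1 }).skip(100).limit(100)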
[16:52:33] <Doyle> In situations where your dataset is 1TB+, you can't fit it in ram. The indexes likely won't fit in ram. Would you still limit the swap to 8GB, or give it crazy swap? 256GB swap?
[16:53:04] <GothAlice> Doyle: Neither. Both are incorrect solutions, where sharding is the correct solution.
[16:54:21] <Doyle> So in sharding, you'd want to limit the storage capacity of each shard, to the amount of ram you've got? Ideally, that is. Not give each shard a TB of storage...
[16:55:21] <GothAlice> For very large datasets, it's the indexes that really matter.
[17:09:10] <MacWinner> GothAlice, have you moved to WT?
[17:09:30] <GothAlice> Aye, with the release of 3.2.
[17:11:01] <MacWinner> cool, is there any performance downside to going to WT? the articles I read seem to indicate there are no downsides to WT, but it's confusing to me because of the compression/decompression overhead
[17:11:17] <MacWinner> i guess that overhead is minimal compared to other gains?
[17:12:58] <GothAlice> The compression algorithm used by default is "stupid fast".
[17:12:59] <Doyle> MacWinner, CPU became a bottleneck for me with 4 cores on a test system.
[17:13:21] <Doyle> Instead of disk, which was a first, so not bad.
[17:13:23] <GothAlice> Doyle: But under what workload?
[17:13:34] <Doyle> GothAlice, a stupid workload :P
[17:13:34] <GothAlice> Such stats are meaningless without context.
[17:14:08] <Doyle> Very heavy reads with an index being created in the background
[17:14:32] <Doyle> It wasn't a bad thing, the performance overall was hugely improved with WT over MMAP
[17:14:43] <Doyle> With MMAP disk was always the bottleneck
[17:15:08] <Doyle> And CPU was just hanging out, being a bruh, not doing much
[17:15:37] <Doyle> As far as I've seen, it'd be a strange day when someone didn't benefit from WT
[17:16:18] <Doyle> If your server was running a pentium M... maybe
[17:18:29] <GothAlice> I'm just glad for the restricted choice of compression algo. While I use lzma for my deep archive material (… I'm always adding features to MongoDB before they're added to core …) offering it or something like bzip2 would only result in support tickets about CPU use. XP
[17:18:43] <Doyle> MacWinner, you'll want to match your cores to the average number of active operations under WT for best performance. I noticed that with an 8 thread CPU, performance was best when the active ops didn't exceed 8. The AR/AW columns of mongostat
[17:20:48] <Doyle> LOL, don't give the masses a space shuttle instrument panel when they can barely drive a Fiat?
[17:23:02] <GothAlice> I noticed someone requesting bzip2 on JIRA a while back.
[17:23:37] <GothAlice> That just blew my mind; bzip2 is _terrible_ for speed (gzip wins) and compression ratio vs. speed (lzma thrashes bzip2).
[17:25:12] <Doyle> That's good to know. Noted. One feature I'd like is a delayed drop mechanism.
[17:27:05] <MacWinner> Doyle, I'll keep an eye out on mongostat too.. so right now my workload is primarily writing activity log data to a mongodb.. highly repetitive data.. only recent log data is in the working set.. I see in my test environment that it gets reduced by about 80% on disk. I have another gridfs collection about 175 gigs.. after going to WT, it became about 150 gigs.. but the collection is full of PDFs and PNGs which aren't compressing well..
[17:27:24] <MacWinner> GothAlice, alrighty.. if WT has your stamp on it, i'm going to do it this weekend
[17:27:26] <GothAlice> Another approach, depending on dataset size, would be to have an "expires" field and add a TTL index with the time to live explicitly defined. The data won't likely all get cleaned up the same instant, of course, so the solution would depend on needs.
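A sketch of that TTL approach (collection name and dates are hypothetical):

    // expire each document at the time stored in its own "expires" field
    db.archive.createIndex({ expires: 1 }, { expireAfterSeconds: 0 })
    db.archive.insert({ payload: "some payload", expires: new Date("2016-06-01") })
    // the TTL monitor (which runs roughly once a minute) removes documents once "expires" has passed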
[17:28:36] <GothAlice> MacWinner: Snappy compression is all about speed. If you want improved ratios at the cost of CPU, switch to zlib.
[17:28:53] <GothAlice> But as a note, WiredTiger uses page compression, not record compression.
[17:30:31] <GothAlice> https://docs.mongodb.org/manual/reference/configuration-options/#storage.wiredTiger.collectionConfig.blockCompressor < with a neat trick involving switching compression algos prior to creating specific collections in order to pin the algorithm used per-collection.
[17:30:56] <GothAlice> I.e. you can construct your GridFS collections without compression, then enable compression before constructing the remainder.
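The same per-collection pinning can also be requested explicitly at creation time via createCollection's storageEngine option, roughly as in this sketch (fs.chunks is the default GridFS chunks collection; activity_log is hypothetical):

    // GridFS chunks here hold already-compressed binaries, so skip block compression for them
    db.createCollection("fs.chunks", {
      storageEngine: { wiredTiger: { configString: "block_compressor=none" } }
    })
    // heavier compression for a highly repetitive collection
    db.createCollection("activity_log", {
      storageEngine: { wiredTiger: { configString: "block_compressor=zlib" } }
    })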
[17:55:04] <StephenLynx> I store the file and a compressed version.
[17:55:19] <StephenLynx> then use the compressed one if the client can read it.
[18:19:39] <GothAlice> StephenLynx: In my case it's a bit more convoluted; material older than 6 months that hasn't been accessed within the last 30 days is expensively LZMA'd, replacing the uncompressed version. I do my own full text indexing (also pre-dating MongoDB support for it ;) so searches aren't affected by the "deep archiving".
[18:20:24] <GothAlice> Turns out that even already compressed video is still further compressible (if you're willing to dedicate enough CPU to it ;) due to the repeated packet headers.
[18:24:42] <StephenLynx> yeah, I can afford not to bother that much because content is ephemeral in my scenario.
[18:24:53] <StephenLynx> so there is never too much data and data is never too old.
[18:29:46] <GothAlice> Hehe. Very different than "I haven't manually deleted anything since 2001". XP
[19:04:17] <kgee> I have a (fairly complex) mongodb query that is returning “InternalError: too much recursion” upon first execution. After a second execution, however, the query returns results. db.version() reports 2.6.11. Can someone explain the recursive behaviour? I don't see how my query requires recursion in the first place. http://pastebin.com/j9PNivBT
[20:02:04] <basldex> hi. if I use db.cloneCollection to copy a collection from a host to my current instance, is there any danger of duplicated object ids? (quite important data so I want to be sure)
[20:02:42] <cheeser> no, there's no (practical) risk.
[20:03:10] <cheeser> there's a machine id component to ObjectID as well as time
[20:03:26] <basldex> that's great! thank you very much
[20:10:57] <GothAlice> cheeser: Though it's important to note that with every client driver I can think of, the IDs are generated application-side, not mongod-side, identifying the application worker and not primary.
[20:24:53] <uuanton> hi y'all, anyone know why starting a secondary requires "recover create file /data/local.1 2047MB"? it takes forever
[20:26:11] <GothAlice> uuanton: In order to maintain internal state replicas maintain a "local" database containing such information as the oplog, oplog counters, plus a copy of the replication configuration and current status of all known nodes. If the initial 2GB allocation is too much for you, you can try enabling --smallFiles.
[20:27:14] <uuanton> thanks GothAlice, it creates 10 of these files, a total of 2GB * 10
[20:31:48] <GothAlice> What size oplog are you using?
[20:37:59] <uuanton> i ran db.getReplicationInfo() and got "logSizeMB" : 20502.3212890625
[20:39:30] <mexiwithacan> Can anyone advise me whether it's possible to update and $set a field to a dynamic calculation? I'm trying to do this: https://bpaste.net/show/97792aca060a
[20:44:37] <GothAlice> … that is a massive oplog, uuanton.
[20:46:35] <GothAlice> For one of my at-work datasets, that oplog would preserve 20 years of operations.
[21:22:36] <uuanton> GothAlice I haven't set up anything special, it was the default behavior
[21:28:07] <uuanton> @GothAlice any way to avoid it? smallFiles is not the best option for me because i have 2 databases that are pretty large
[21:32:18] <GothAlice> The deciding factor for oplog size is the duration of _time_ you want to be able to easily recover from vs. operation size. I.e. if you're writing a gigabyte of oplog a day, and you need 48 hours (two days) recovery time, then you'll need a two gigabyte oplog plus a little room for growth.
[21:33:22] <GothAlice> (I.e. as long as a secondary that has become disconnected reconnects to my hypothetical dataset within 48 hours it can "catch up" using the oplog, beyond that period of time it would need to perform a full sync.)
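A shell sketch for checking how much time the current oplog actually covers, which is the number that matters for the sizing rule above:

    var info = db.getReplicationInfo()   // run on a replica-set member
    print("oplog size (MB): " + info.logSizeMB)
    print("window (hours):  " + info.timeDiffHours)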
[21:38:45] <uuanton> makes sense. why is the default size so big?
[21:42:39] <kgee> I’m trying to figure out why my mongodb 2.6.11 console is running out of stack space (InternalError: too much recursion). I’ve boiled it down to two queries, one that works and one that doesn’t. I still don’t understand the root cause though. http://pastebin.com/snGUXLw5
[21:44:10] <kgee> I think the combination of or(and(a,b),and(c,d)) could be causing the issue
[21:47:07] <kgee> If I take the regex out, the same problem occurs. It’s not a large collection (<600 records) so I’m really at a loss
[22:16:28] <mexiwithacan> kgee I ran into a similar problem earlier today with an $and operation.
[22:17:05] <kgee> mexiwithacan: so the query is failing on our production server (v2.6.11) but working on our staging server (v2.6.6)
[22:17:40] <kgee> mexiwithacan: the strange thing is that if I run the query twice, the second time works? Something is very strange
[22:19:16] <mexiwithacan> Searching for "too much recursion" on Google makes it seem like a JavaScript-related issue.
[22:19:30] <kgee> mexiwithacan: well mongodb seems to use a javascript shell
[22:20:11] <kgee> so the trouble must come when the json query gets parsed on the mongodb server. but that’s not transparent to me from my client application
[22:21:30] <kgee> I’ve taken out as much complexity as I can while reproducing the problem. It still happens with boolean comparisons: or(and(a:true,b:true),and(c:true,d:true))
[22:23:03] <kgee> in fact I mean that quite literally: db.getCollection('messagingSummary').find({$or: [{$and: [{a: true}, {b: true}]}, {$and: [{c: true}, {d: true}]}]})
[22:23:15] <kgee> regardless of the schema, that is crashing for me
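For what it's worth, the same predicate can be written without the explicit $and, since conditions listed together in one document are already ANDed; a sketch that may sidestep whatever the shell is tripping over:

    db.getCollection('messagingSummary').find({
      $or: [
        { a: true, b: true },
        { c: true, d: true }
      ]
    })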