[07:46:16] <Industrial> Hi. I have a query that I'm printing with console.log(`db.segments.find(${JSON.stringify(query)})`)
[07:46:50] <Industrial> if I take that and put it in the mongo terminal it works. If I run it from Node.js with client.db().collection('segments').find(query).toArray() then I get 0 results
[07:47:03] <Industrial> In the MongoDB terminal I get one result (expected).
[10:56:46] <GothAlice> Industrial: Because find() takes an object, not a string.
[10:56:59] <GothAlice> Why are you JSON.stringify’ing the query first?
[11:03:04] <GothAlice> Ah, that's a byproduct of trying to "generate code" like this. I am not rightly able to apprehend the kind of confusion of ideas…
[11:24:27] <Industrial> yeah, generating queries is so weird :D
[11:24:50] <Industrial> My error was using a $lte instead of a $gte
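(A minimal sketch of the corrected pattern; the field name, value, and connection setup are placeholders, not taken from the log. The point is that find() takes the plain query object, and JSON.stringify is only useful for logging a shell-pasteable command.)

```js
const { MongoClient } = require('mongodb');

async function findSegments(uri, since) {
  const client = new MongoClient(uri);
  await client.connect();
  try {
    // The bug described above: $lte where $gte was intended.
    // "startedAt" is a hypothetical field name.
    const query = { startedAt: { $gte: since } };

    // Stringify only to log something you can paste into the mongo shell;
    // find() itself takes the object, never the string.
    console.log(`db.segments.find(${JSON.stringify(query)})`);

    return await client.db().collection('segments').find(query).toArray();
  } finally {
    await client.close();
  }
}
```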
[11:26:24] <Industrial> GothAlice: btw, hi, my nickname is Industrial, and it has been for over 15 years, because I used to go out to goth clubs/parties in Amsterdam :)
[11:42:57] <GothAlice> Of course, now I have access to: https://www.heavymontreal.com/en >:DDD
[15:22:18] <bluezone> Is there some kind of library that can create a mongo database from a kind of specification file in yml or json
[15:22:42] <bluezone> I see some for testing, they make testing fixtures, I'm not sure if that will work for me
[15:27:54] <GothAlice> bluezone: I combine pyyaml with pymongo. The “tool” you speak of is four lines.
[15:28:49] <GothAlice> bluezone: You’re on your own, there. I avoid JS like the plague. (To the point that I run Node.js code under Python, without Node, and compile my Python to front-end JS.)
[15:29:40] <bluezone> well looks like I will need to write a script to do this
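(That script can indeed be tiny. A hedged Node.js sketch of the same idea GothAlice describes with pyyaml + pymongo, assuming js-yaml, the official mongodb driver, and a spec file that maps collection names to arrays of documents; the spec format is an assumption, not anything from the log.)

```js
const fs = require('fs');
const yaml = require('js-yaml');
const { MongoClient } = require('mongodb');

// Assumed spec shape:
//   users: [{ name: "a" }, ...]
//   posts: [{ title: "b" }, ...]
async function buildFromSpec(uri, dbName, specPath) {
  const spec = yaml.load(fs.readFileSync(specPath, 'utf8'));
  const client = new MongoClient(uri);
  await client.connect();
  try {
    const db = client.db(dbName);
    for (const [collection, docs] of Object.entries(spec)) {
      // Creating the collection implicitly by inserting its documents.
      await db.collection(collection).insertMany(docs);
    }
  } finally {
    await client.close();
  }
}
```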
[15:30:14] <GothAlice> def multiply(a:int, b:int=2): return a * b — lol when JS can invoke this function in these ways: multiply(4); multiply(4, 12); multiply(a=42); multiply(**some_dict_containing_at_least_a); multiply('na', 9) + " Batman!"
[15:30:16] <bluezone> the team is worried about permission management, disc space, security
[18:29:31] <RxMcDonald> Hello, is there a way to force MongoDB to keep the index in RAM? For some reason there's a lot of RAM for the indexes but it doesn't keep the index in RAM
[18:30:33] <RxMcDonald> Also, it's using an index and it's scanning 80k documents yet it takes 30 seconds
[18:31:04] <RxMcDonald> How can it take so long to loop through 80k docs?
[18:31:08] <GothAlice> RxMcDonald: Essentially, no. Are your indexes small enough to fit in RAM? MongoDB uses memory mapped files, allowing the kernel itself to perform memory page optimizations such as eager fetch (predictive read-ahead), etc., etc. Ensure you have disabled “transparent huge page” support, if your kernel offers it.
[18:31:58] <GothAlice> THP can make a “page fault” (request for a page of memory not currently loaded from disk) highly variable in terms of performance. Instead of every memory block being 4KiB, they can be gigabytes.
[18:32:30] <RxMcDonald> My indexes are much smaller than the RAM available, it says 9GB and the cluster has hundreds of GB of RAM
[18:34:16] <GothAlice> “Cluster has…” — how much does a single data-carrying node have in relation to how much data it is carrying?
[18:35:08] <GothAlice> For example, on my “cluster” I can compile a kernel from depclean in less than 40 seconds. MongoDB, using BOOST, does not benefit, and takes as long to compile on the cluster as it does on a single machine—availability does not mean usability.
[18:35:10] <RxMcDonald> A lot more than enough, each RS has 3 nodes and each node 16GB of ram and 70GB of data of which only a tiny fraction is indexed
[18:35:47] <RxMcDonald> I've got 10 shards, each with 3 nodes
[18:36:09] <RxMcDonald> Meaning total cluster RAM is almost 500GB
[18:36:21] <RxMcDonald> Total index size is less than 10GB
[18:36:47] <RxMcDonald> For some reason it's taking 30 seconds to loop through 80K index values
[18:39:53] <RxMcDonald> But it wouldn't take that long if it was actually using the index, which is what bothers me
[18:40:36] <GothAlice> I’ve got 40TiB of data in platter-backed MongoDB at my home office; I’m using extremely slow storage arrays, and even it isn’t 30 seconds slow on queries.
[18:40:50] <RxMcDonald> The issue could also be because I'm sorting by _id -1
[18:41:09] <RxMcDonald> Yeah, I know that's why I'm trying to find the issue
[18:41:24] <GothAlice> I think 7 seconds was the slowest one I’ve issued against it. _id is indexed; that final index scan is likely there to facilitate just that sort. So, same question: is the initial query actually using the index?
[18:42:03] <GothAlice> (The index you wanted, i.e. the one that can optimize the query, that is.)
[18:42:48] <RxMcDonald> It shows inputStage.stage = "FETCH", filter: my index, should that be IXSCAN instead of FETCH?
[18:43:27] <RxMcDonald> Stage: SHARDING_FILTER, then inputStage.stage = "FETCH" ...
[18:45:15] <GothAlice> So, yup, not using your tuned index. https://docs.mongodb.com/manual/reference/method/cursor.hint/index.html
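(For reference, a hedged mongo-shell style sketch of what hint() looks like; the collection name, index name, and query are made up for illustration.)

```js
// Force the planner onto a named index instead of whatever it picks,
// then ask for executionStats to confirm an IXSCAN stage appears.
db.items.find({ countries: { $in: ["CA", "US"] } })
  .sort({ _id: -1 })
  .hint("countries_1")
  .explain("executionStats")
```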
[18:45:17] <RxMcDonald> Actually I just changed the query and instead of doing $in for the array field I did a plain query, i.e. field: "value" and it took 300ms
[18:45:31] <GothAlice> Likely using the index in that case.
[18:46:21] <GothAlice> If the initial query were using an index, there would be an IXSCAN entry contained within the FETCH entry, as a child node, I do believe.
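(Roughly the shape to look for in the explain output; an illustrative, trimmed plan, not output from the log.)

```js
// When the index is used, FETCH has an IXSCAN child that supplies the
// record ids; when it is not, the FETCH (or COLLSCAN) stage carries the
// filter itself and every candidate document is examined.
const winningPlanWhenIndexed = {
  stage: "SHARDING_FILTER",
  inputStage: {
    stage: "FETCH",
    inputStage: {
      stage: "IXSCAN",
      indexName: "countries_1",        // hypothetical index name
      keyPattern: { countries: 1 }
    }
  }
};
```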
[19:06:58] <RxMcDonald> What changed in the explain now is that it shows IXSCAN at the bottom with countries instead of _id
[19:09:22] <RxMcDonald> Adding the hint didn't fix it, don't know what to do
[19:11:26] <RxMcDonald> It bothers me that it doesn't keep the index for some reason. I make a query, then the same one right after and it returns instantly; then if I wait 10 minutes it takes a minute
[19:13:15] <RxMcDonald> For some reason it shows executionTimeMillisEstimate 266 for the IXSCAN and the top SHARDING_FILTER -> FETCH 70 seconds
[19:20:52] <RxMcDonald> Maybe I should run a database repair?
[19:29:10] <RxMcDonald> It goes from a FETCH of 26 seconds to an IXSCAN of 60ms, I don't understand
[19:31:29] <RxMcDonald> GothAlice: Any other thing I could try?
[19:34:01] <GothAlice> RxMcDonald: Gist your $explain; alas, I’m actually at work and can’t really 1:1 assist.
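(A hedged sketch of pulling that explain from the Node.js driver; the query and collection name are placeholders, and it assumes an already-connected client. Run through the mongos, the output includes per-shard sections.)

```js
// Hypothetical query; explain('executionStats') returns the winning plan
// plus timing, with each shard's stages included when issued via mongos,
// so a single gist can cover the whole cluster.
const explanation = await client.db()
  .collection('items')
  .find({ countries: { $in: ['CA'] } })
  .sort({ _id: -1 })
  .explain('executionStats');

console.log(JSON.stringify(explanation, null, 2));
```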
[19:38:20] <RxMcDonald> Do you need the output of all the shards?