#mongodb logs for Tuesday the 3rd of May, 2016

[00:21:00] <GothAlice> hardwire: It's not new, though. SSL was an enterprise feature.
[00:21:36] <GothAlice> As was/is deeper authentication (LDAP-style) integration, which is a typically enterprise need. (And a hideous PITA to support.)
[00:27:36] <cheeser> kerberos is fun!
[00:27:53] <GothAlice> In my experience, also fragile. ;^P
[00:29:56] <GothAlice> "You know that pre-signed host key you were using? Yeah… it's not so good any more." "But, but why?" "Reasons."
[00:41:36] <nicolas_FR> hi there. Maybe not the right place but I can't find help. Mongoose (?) problem I think with sub documents : http://pastebin.com/GLJhSHyd
[00:42:45] <joannac> nicolas_FR: have you tried the various mongoose support forums?
[00:46:27] <nicolas_FR> joannac: like stackoverflow ?
[00:46:55] <joannac> nicolas_FR:
[00:46:58] <joannac> http://mongoosejs.com/
[00:47:10] <joannac> SO, github, google group, gitter
[00:48:09] <joannac> (i don't know anything about mongoose, so can't help further than that)
[00:48:30] <nicolas_FR> joannac: thanks, will post on stackoverflow
[07:45:58] <zerOnepal> Hi there, I have a server running with a collection weighing 1TB and trending upwards in size; my application was designed that way, but I'm trying to trim down the unnecessary entries
[07:48:31] <Boomtime> hi zerOnepal, sounds like you have a use case problem - is there something specific you want help with?
[07:48:36] <zerOnepal> any recommendation on this, how do I safely delete old records without hampering live traffic? I don't have much time to figure this out and I can't afford to add an additional disk as a temporary fix... My only choice is to trim the old data, I am thinking of keeping only the latest 3 months of data
[07:50:36] <Boomtime> zerOnepal: each deletion will add load, these are write operations, albeit small ones, but a large batch will have an effect - if you want to reduce this then delete in small batches (a few hundred at a time perhaps?) with a writeConcern of majority and have a small sleep between batches (say, 1 second?)
[07:51:04] <Boomtime> notably, this is your use case, these are just general recommendations
[07:51:13] <Boomtime> the details need to be filled in by you
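A minimal shell sketch of the batch-and-sleep deletion Boomtime describes, assuming a placeholder db.logs collection and a created_at date field; note the writeConcern option on remove() needs a 2.6+ shell, so on an older deployment you would follow each batch with a getLastError call instead:

    // Delete in small batches, wait for a majority write concern, and sleep
    // between batches so live traffic is not starved.
    // db.logs and created_at are placeholder names for this sketch.
    var cutoff = new Date(Date.now() - 90 * 24 * 60 * 60 * 1000); // keep ~3 months
    while (true) {
        var ids = db.logs.find({ created_at: { $lt: cutoff } }, { _id: 1 })
                         .limit(500).toArray().map(function (d) { return d._id; });
        if (ids.length === 0) break;
        db.logs.remove({ _id: { $in: ids } }, { writeConcern: { w: "majority" } });
        sleep(1000); // grace period between batches
    }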
[07:53:20] <zerOnepal> batch deletions, with certain grace time delay... hmmm
[07:54:18] <kurushiyama> zerOnepal: Do you have a date field on the data?
[07:54:23] <zerOnepal> any mongo tools you want to recommend ?
[07:54:31] <kurushiyama> zerOnepal: shell
[07:54:34] <zerOnepal> yes I do, rails does that for me
[07:54:39] <zerOnepal> hahah kurushiyama
[07:54:49] <zerOnepal> nice one
[07:54:52] <kurushiyama> zerOnepal: No kidding. I have yet to find something better
[07:55:16] <Boomtime> clarify which thing you're laughing at; the shell suggestion, or the idea you have an index
[07:55:20] <kurushiyama> zerOnepal: If you have a date field, you might want to use a TTL index.
[07:55:24] <Boomtime> no!
[07:55:38] <kurushiyama> Boomtime: No?
[07:55:50] <Boomtime> a TTL will issue a single delete for everything past that date
[07:55:55] <zerOnepal> shell thing Boomtime, :D.. I wish there existed something like the percona tools for such alterations...
[07:56:19] <Boomtime> you must not use a TTL to 'catch up' on the fact that you don't have a TTL
[07:56:26] <kurushiyama> Boomtime: Erm... no? There is an expireAfterSeconds param...?
[07:56:44] <Boomtime> and what would you set that to?
[07:57:28] <Boomtime> TTL is extremely useful so long as not many documents are _initially_ to be deleted
[07:57:58] <Boomtime> if all the documents in the collection are more recent than 3 months, then by all means, create a TTL index for an expiry of 3 months
[07:58:15] <zerOnepal> what is TTL guyz ??
[07:58:22] <zerOnepal> time to live or sth ??
[07:58:23] <kurushiyama> Boomtime: Say the date field is set to the insert date, and you want to have the doc deleted after 3 months, so you'd set it to 3*30*24*60*60? As per the initial deletion: It is a background process, and we are talking of a fairly recent MongoDB ;)
[07:58:31] <kurushiyama> zerOnepal: Aye
[07:58:41] <Boomtime> if you have a large collection, with documents spread across a year, do not create a TTL with an expiry of 3 months, because that just means the first run needs to delete 3/4 of the collection in the first pass
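For reference, the TTL kurushiyama is suggesting is just an index option; a sketch assuming a created_at Date field on a placeholder db.coll collection (ensureIndex used since the server version is not yet known at this point):

    // TTL index: the TTL monitor removes documents ~3 months after created_at.
    // As Boomtime warns, on an existing collection spanning a year the first
    // TTL passes would have to delete everything already past that cutoff.
    db.coll.ensureIndex(
        { created_at: 1 },
        { expireAfterSeconds: 3 * 30 * 24 * 60 * 60 }
    );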
[07:59:42] <kurushiyama> zerOnepal: Which version and (if applicable) which storage engine do you use?
[08:00:25] <zerOnepal> I am stuck with an old mongo release, 2.4.x
[08:00:42] <zerOnepal> I am thinking of upgrading it to 3.xx in days to come
[08:00:57] <zerOnepal> Storage engine is the default one on legacy 2.4.x
[08:03:47] <zerOnepal> xhagrg anything you want to add ?
[08:05:58] <zerOnepal> so you guyz recommend/use built in mongo shell to interface/alter/trim mongo records ??
[08:15:26] <Boomtime> zerOnepal: you have provided very little information about your actual use case - collection size, but not document counts, or how many you expect to be deleted, or anything about the application - we'd use the shell for most simple one-off scripts because it's easy
[08:16:25] <Boomtime> a simple loop that deletes a certain number of matching docs out of a progressive cursor, with a sleep between is about 4 lines of javascript - and it's efficient - what do you want?
[08:16:35] <zerOnepal> it's about a 1TB collection, which I get from the mongo-hacker awesome prompt, and I can't afford to calculate the count of the records :(
[08:18:18] <Boomtime> mongo-hacker is a shell extension - with or without it won't change the 4 lines of javascript - but the impact of those 4 lines will change radically depending on your circumstances
[08:18:53] <Boomtime> what criteria will you use to delete appropriate records?
[08:20:15] <zerOnepal> my only available low hanging fruit is the "created_at" field on that collection and
[08:20:34] <kurushiyama> zerOnepal: Whut? You can not do a db.coll.count()?
[08:20:46] <Boomtime> awesome - is there an index on that field?
[08:21:12] <Boomtime> maybe just post the .stats() of that collection to a pastebin
[08:21:29] <zerOnepal> that's an expensive query kurushiyama... let me check on slave
[08:21:31] <Boomtime> redact the namespace if it's important
[08:21:37] <zerOnepal> nope boomtime :(
[08:21:38] <Boomtime> lolwut?
[08:21:54] <kurushiyama> zerOnepal: Uhm, it does not get any cheaper than a db.coll.count().
[08:21:58] <Boomtime> db.coll.count() is an expensive query?
[08:22:14] <Boomtime> if that takes a long time you have serious problems
[08:22:17] <zerOnepal> wait I am checking
[08:22:22] <zerOnepal> ya :(
[08:22:35] <zerOnepal> deep shit
[08:23:06] <Boomtime> are you running it with parameters? - when we say "count()" we mean count with no parameters
[08:23:16] <kurushiyama> Well, give it a sec or 2 to return. 1TB may be a hell of a lot of docs, and iirc, the way count works was changed a little.
[08:23:29] <Boomtime> he said 2.4
[08:23:39] <Boomtime> that's an MMAP collstats record return
[08:23:44] <kurushiyama> Aye
[08:23:45] <zerOnepal> it roughly equals 500413374
[08:23:48] <zerOnepal> sorry my bad
[08:24:09] <Boomtime> righto - i have to go, but hopefully kurushiyama can be around for a bit
[08:24:26] <kurushiyama> Sort of.
[08:24:28] <zerOnepal> thanks Boomtime
[08:24:33] <kurushiyama> Will try to.
[08:25:03] <zerOnepal> thanks all, we are 297 in the room and I am still getting company here in the irc room
[08:26:35] <zerOnepal> kurushiyama the db.col.count() returns around 500413374 in a few secs, I was wrong
[08:27:08] <kurushiyama> zerOnepal: Ok, that relieves me a bit. If it takes longer than a few secs, there is something weird going on.
[08:27:30] <kali> rf
[08:27:32] <kali> oups
[08:28:23] <kurushiyama> zerOnepal: how critical is the disk space?
[08:33:35] <zerOnepal> way too critical, I will be doomed in a few days :(
[08:34:34] <zerOnepal> Hey, isn't there a way to delete in the background? Like when we build indexes?
[09:09:19] <kurushiyama> zerOnepal: Well, we have to find out a few things. I am not too sure whether you say "We can not have any impact" because of the paranoia DBAs usually have or whether your system is really at its limits and a delete operation would have a noticeable or even critical impact on UX.
[09:58:34] <jokke> hi
[09:59:13] <jokke> wt has a cache. what is being cached exactly? queries? aggregations?
[12:30:33] <Freman> wow, we scored 66% on mms health score... that's... more than I expected
[12:32:04] <kurushiyama> Freman: That is not bad. Follow their advice, however ;)
[12:48:37] <silviolucenajuni> o/
[12:54:43] <kurushiyama> silviolucenajuni: \o
[13:44:41] <bros> Why does this query return 1 matching store and not 4: { '$elemMatch': { 'stores': { '_id': { '$in': [ObjectId(...), ObjectId(...), ObjectId(...), ObjectId(...)] } } } }, { 'stores.$': 1 }
[14:44:03] <zylo4747> I posted a question to https://groups.google.com/forum/#!forum/mongodb-user but it isn't showing up. Is there something I need to do for a post to become visible? I can't even find it now, it's just gone
[14:45:00] <cheeser> zylo4747: https://groups.google.com/forum/#!contactowner/mongodb-user
[14:45:12] <cheeser> contact the group owner and see what's up.
[14:45:26] <zylo4747> ok thanks
[14:53:01] <Industrial> Hi.
[14:53:18] <Industrial> How do I find all documents that have no key X OR Y ?
[14:54:07] <cheeser> $exists
[14:54:15] <Industrial> right
[14:54:29] <Industrial> I see, and https://docs.mongodb.org/manual/tutorial/query-documents/#specify-or-conditions
[14:55:59] <Industrial> so `.find({ $or: [ { retailer_id: { $exists: false } }, { category_id: { $exists: false } }, ] })` :)
[14:56:12] <cheeser> "try it and see" :)
[14:56:22] <cheeser> but it looks right
[15:28:59] <Justin_T> Hi all, is possible to add the result from one collection to another for print in a query?
[15:30:28] <cheeser> you can $lookup in an aggregation
[15:32:08] <Justin_T> https://bpaste.net/show/5c9418723335
[15:32:29] <zylo4747> I posted the issue on serverfault if any of you want to take a look and see if you can help me out http://serverfault.com/questions/774567/intermittent-crashes-of-a-mongodb-replica-set
[15:32:30] <Justin_T> I'm using mongodb 3.0 :( I can't use lookup
[15:35:40] <saml_> object.count = tasks.next().count; printjson(object); ?
[15:37:11] <Justin_T> InternalError: too much recursion
[15:38:46] <saml_> db._User.find().forEach(function(u){ u.count = db.GeneralTask.count({userID: u._id}); printjson(u); });
[15:41:06] <Justin_T> thanks saml_, can I add a match there? because I need this: $match: { "isDone": { $ne: false }
[15:41:16] <saml_> sure
[15:41:29] <saml_> db._User.find().forEach(function(u){ u.count = db.GeneralTask.count({userID: u._id, "isDone": { $ne: false }}); printjson(u); });
[15:42:54] <Justin_T> great, thanks!
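For anyone on MongoDB 3.2+, the $lookup cheeser mentioned could fold the per-user count into a single aggregation; a sketch reusing the names from saml_'s snippet (_User, GeneralTask, userID, isDone), not an option on Justin_T's 3.0 deployment:

    // Join each _User to its GeneralTask docs, then count the ones that are
    // not explicitly isDone: false. Requires $lookup and $filter (3.2+).
    db._User.aggregate([
        { $lookup: { from: "GeneralTask", localField: "_id",
                     foreignField: "userID", as: "tasks" } },
        { $project: { count: { $size: { $filter: {
            input: "$tasks", as: "t",
            cond: { $ne: ["$$t.isDone", false] }
        } } } } }
    ]);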
[15:55:38] <zylo4747> Do you know if db.repairDatabase() will tell me if corruption was found and fixed?
[16:27:38] <jokke> i'm measuring write performance and read performance for different schemas on a dockerized mongodb sharded cluster. i've noticed that at some point the performance drops radically and the cpu usage of the replica set members goes to a constant 100%. Any leads on what might be causing this?
[16:27:54] <jokke> this drop happens when testing writes
[16:28:09] <jokke> after a few minutes into the benchmark
[16:29:12] <jokke> i've capped the max memory of each replication set container to 2g
[16:29:31] <jokke> and assigned individual cpu cores
[16:31:47] <bros> Does elemMatch only return the first element and I need to use aggregate to get them all?
[16:32:55] <jokke> elemMatch doesn't return anything. It just matches documents where at least one element of an array field matches the quer(y|ies)
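To spell out jokke's point for bros: the positional projection ('stores.$': 1) only ever returns the first matching array element, so getting all matching stores does mean an aggregation; a sketch with a placeholder collection name and storeIds array:

    // Return every matching embedded store instead of just the first one.
    var storeIds = [ /* the four store ObjectIds */ ];
    db.accounts.aggregate([                               // "accounts" is a placeholder
        { $match: { "stores._id": { $in: storeIds } } },   // docs with at least one match
        { $unwind: "$stores" },
        { $match: { "stores._id": { $in: storeIds } } },   // keep only matching elements
        { $group: { _id: "$_id", stores: { $push: "$stores" } } }
    ]);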
[18:47:23] <hardwire> GothAlice: thanks (late reply).
[18:48:19] <GothAlice> No worries. :)
[18:50:21] <hardwire> Just have to treat them as different things. MongoDB CE and MongoDB Enterprise. One reason I'm not using InfluxDB right now is due to it recently becoming crippleware.
[18:50:42] <hardwire> The most important use is now removed from the open source software.
[18:52:54] <cheeser> the philosophy with such enterprise features in mongodb is to only do that to features needed by large enterprises. the more "run of the mill" features will be open and available. (*not an official statement. details subject to change without notice or consultation)
[18:53:47] <hardwire> cheeser: indeed
[18:54:19] <hardwire> I'm not sure what made influx remove clustering capabilities other than the need for cashmoneyz
[18:54:30] <cheeser> that's as good a reason as any
[18:54:37] <hardwire> I'm just not sure how much of a need it was.
[18:54:49] <hardwire> Their support contracts are pretty well priced for both sides.
[18:54:52] <cheeser> enough of one apparently
[20:44:18] <jokke> hi
[20:45:12] <jokke> i have a compound index as _id consisting of the fields sensor_id, panel_id and timestamp
[20:45:37] <jokke> the timestamp field changes every 2 seconds
[20:45:48] <jokke> i use this as my sharding key
[20:46:01] <jokke> but the shards aren't distributed evenly at all
[20:46:25] <cheeser> the shard key shouldn't be mutable, i believe
[20:46:34] <cheeser> i could be wrong about that one, though.
[20:58:07] <quadHelix> How do I pull back all records using mongoexport? Here is my command: mongoexport --type=csv --host <ip>:<port> --db myorders --collection 57290da5-b67c-4b6b-b9bd-3fa1608e448c -f 'created,job_id,name,' -o myfile.csv
[20:58:20] <quadHelix> gah, my text is grey
[21:17:17] <jfhbrook> hello! Looking for opinions on when to use _id for a property vs making a new property
[21:17:31] <jfhbrook> I have a scenario where most of my lookups are against a unique key, but that key is a string
[21:17:40] <jfhbrook> and I'm wondering if there are reasons for/against setting that to _id
[21:17:44] <jfhbrook> thoughts?
[21:17:59] <jfhbrook> I think setting that as _id is a good idea, my team isn't convinced
[21:18:04] <jokke> jfhbrook: sure
[21:18:09] <jokke> go ahead
[21:18:12] <jokke> totally fine
[21:18:18] <jfhbrook> okay, so fine, but good?
[21:18:18] <jokke> if you're sure it's unique
[21:18:27] <jfhbrook> yeah, everything falls apart if it isn't unique
[21:18:38] <jokke> jfhbrook: it's not bad practice if that's what you're asking
[21:18:49] <jfhbrook> well that's part of what I'm asking
[21:18:49] <jokke> fairly common even
[21:18:54] <jfhbrook> but I'm also asking if it's *good* practice
[21:18:57] <jfhbrook> and if so, for what reasons
[21:19:03] <jokke> well that surely depends
[21:19:13] <jokke> would you add an index for that field anyway?
[21:19:17] <jfhbrook> yes
[21:19:29] <jokke> then i'd say it's good practice
[21:19:45] <jokke> i'm even using a subdocument as an index
[21:19:55] <jokke> as _id i mean
[21:20:04] <kurushiyama> cheeser: _id's are immutable, for starters ;)
[21:20:04] <hardwire> jfhbrook: one big consideration is how you plan to partition data for sharding, if ever.
[21:20:05] <jfhbrook> yeah, we do that in one place but my team never trusted the contractor that wrote that code
[21:20:18] <kurushiyama> jfhbrook: _id's are unique. So you can not have more than one doc per string.
[21:20:30] <jfhbrook> kurushiyama, that's a goal here
[21:20:38] <kurushiyama> jfhbrook: That is one impact you have to carefully consider.
[21:20:48] <kurushiyama> jfhbrook: Furthermore, it makes a bad shard key.
[21:20:55] <hardwire> if you want to determine how information is stored against a partitioned cluster, you'd use your own key regardless.
[21:21:09] <jokke> cheeser: ah yeah sorry, i meant that the timestamp fields of the docs are 2 secs apart
[21:21:31] <jfhbrook> so times I wouldn't want to do this is if I can get in a scenario where I want to update the doc by changing this key (not going to happen ever)
[21:21:44] <jfhbrook> or if I want to do sharding against that property
[21:21:44] <jfhbrook> ?
[21:21:59] <jokke> kurushiyama: ObjectId is also a bad shard key
[21:22:09] <jfhbrook> this property is a string fwiw
[21:22:23] <jfhbrook> in fact, it's a uri
[21:22:44] <kurushiyama> jfhbrook: _id's are immutable.
[21:22:55] <jfhbrook> and what do you mean by that kurushiyama
[21:23:03] <jokke> jfhbrook: that you cannot change it
[21:23:05] <jokke> ever
[21:23:06] <kurushiyama> jfhbrook: You can not change it.
[21:23:09] <jfhbrook> do you mean that I can't change the _id of a document after the fact? because if so that's desirable
[21:23:18] <kurushiyama> jfhbrook: Correct.
[21:23:18] <jfhbrook> at least imo it's desirable
[21:23:25] <jfhbrook> okay
[21:23:36] <jokke> jfhbrook: only way would be to delete and create a new doc with the same data
[21:23:45] <jfhbrook> yeah and that's the current mental model, so good
[21:24:08] <jokke> alright
[21:24:25] <jokke> for sharding you'd want a compound index though (most of the time)
[21:24:55] <jfhbrook> we're not using sharding
[21:25:00] <jokke> not yet ;)
[21:25:05] <jfhbrook> I mean
[21:25:10] <jfhbrook> our data is relational
[21:25:17] <jfhbrook> so you can guess what the long game is here
[21:25:30] <jfhbrook> but in the short term, great!
[21:25:31] <jfhbrook> thanks
[21:25:51] <jokke> kurushiyama: any thoughts on my shard distribution problem?
[21:27:15] <jokke> https://p.jreinert.com/7Xyb/
[21:27:18] <kurushiyama> jokke: a) you can not change the _id, which you are trying to do, as far as I understood. b) You need to use an {_id:"hashed"} index, since most likely you have a monotonically increasing shard key with the use of a date field.
[21:28:10] <jokke> i'm not
[21:28:13] <kurushiyama> jokke: Problem is: you can not change the designated shard key.
[21:28:22] <jokke> i'm not changing anything
[21:28:40] <kurushiyama> jokke: Chunk migration threshold is not met
[21:29:24] <kurushiyama> jokke: https://docs.mongodb.org/manual/core/sharding-balancing/#migration-thresholds
[21:29:26] <jokke> but isn't chunk migration done only when it doesn't distribute evenly on its own?
[21:30:10] <kurushiyama> jokke: No.
[21:30:26] <jokke> hm
[21:30:35] <kurushiyama> Chunks are split when they cross maxChunkSize/2
[21:30:59] <kurushiyama> Then, in case the chunk migration threshold is met, the balancer kicks in.
[21:31:06] <jokke> but from what i've gathered _id: 'hashed' is a bad idea with time series data
[21:31:23] <kurushiyama> jokke: Why that?
[21:31:45] <jokke> since it results in scatter gather queries for time range queries (which is the most common kind of query)
[21:31:52] <kurushiyama> jokke: Actually, "hashed" was invented for a monotonically increasing shard key.
[21:32:06] <kurushiyama> jokke: You still have key ranges
[21:32:21] <jokke> yeah sure, if my shard key were _only_ a timestamp, that'd be a bad shard key
[21:32:28] <kurushiyama> jokke: But yes, there is a tradeoff.
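A sketch of the hashed shard key kurushiyama means, with a placeholder telemetry.readings namespace:

    // Shard on a hashed _id so a monotonically increasing key still spreads
    // writes across shards, at the cost of scatter/gather for time-range reads.
    // (On a non-empty collection the { _id: "hashed" } index must exist first.)
    sh.enableSharding("telemetry");
    sh.shardCollection("telemetry.readings", { _id: "hashed" });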
[21:33:08] <jokke> a sample doc: https://p.jreinert.com/yVxKi/
[21:33:16] <kurushiyama> jokke: The thing is this: let us assume your _id consists of 2 arbitrary strings plus a timestamp
[21:33:42] <jokke> it does :)
[21:33:44] <kurushiyama> jokke: Now, let us assume we have 1970-01-01T00:00:01
[21:34:10] <jokke> ok
[21:34:11] <kurushiyama> so our ts would be 1000, and two strings.
[21:34:28] <jokke> 1000?
[21:34:35] <kurushiyama> 1000ms since epoch
[21:34:38] <jokke> ok
[21:35:01] <kurushiyama> So, lets say we do string + ts + string
[21:35:09] <kurushiyama> a1000b
[21:35:19] <kurushiyama> Now, the next doc comes in
[21:35:25] <kurushiyama> a2000b
[21:35:36] <kurushiyama> A second later, obviously
[21:35:52] <kurushiyama> now do a lexicographic sort on that
[21:36:12] <kurushiyama> Monotonically increasing
[21:36:28] <kurushiyama> Now lets say we'd put it ts+string+string
[21:36:28] <jokke> um
[21:36:36] <jokke> hold on
[21:36:46] <jokke> there are others coming in too of course
[21:36:54] <jokke> b1000c etc
[21:36:57] <kurushiyama> Lets stay simple for now
[21:37:05] <kurushiyama> lets say we do it different
[21:37:11] <kurushiyama> 1000ab
[21:37:14] <kurushiyama> 2000ab
[21:37:19] <kurushiyama> Now sort again
[21:37:36] <jokke> still monotonically increasing
[21:37:39] <kurushiyama> Aye
[21:38:13] <kurushiyama> We could use _any_ combination, and it would be monotonically increasing from a lexicographic pov.
[21:39:32] <kurushiyama> So, now say we have b1000c
[21:40:11] <jokke> but i thought that choosing compound indexes allows the sharding algorithm to distribute for example docs with the first string being a into one shard and chunk it via timestamp
[21:40:20] <kurushiyama> It would still be "bigger" than a1000b, and b-docs would monotonically increase in their own right.
[21:41:28] <kurushiyama> jokke: Well, even if you do assign the key ranges manually, you still would be adding to the plus infinity shard only.
[21:41:56] <kurushiyama> jokke: "plus infinity" being a rather abstract thing here, but you get the picture.
[21:42:07] <jokke> i'm not sure i do
[21:45:12] <kurushiyama> Ok, your example is a bit complicated. let us assume you assign a* -b* to shard1 and c*-d* to shard2. At some point, you have an imbalance due to simple variance. Clear so far?
[21:46:43] <jokke> no :/ sorry. why would i have an imbalance?
[21:47:14] <jokke> because for some reason i get more docs in the a* -b* range for example?
[21:47:22] <kurushiyama> jokke: aye
[21:47:26] <jokke> why?
[21:47:31] <kurushiyama> Sensor down
[21:47:46] <kurushiyama> Godzilla stomped on a line
[21:47:49] <jokke> well.... statistically that's very rare
[21:48:46] <kurushiyama> A variance of (32MB * migration threshold) + 1 is sufficient.
[21:49:13] <jokke> mhm
[21:50:05] <kurushiyama> And it goes beyond that. A: you can not assign this way. you can assign from -infinity to a certain value, and from that value to +infinity
[21:50:26] <kurushiyama> As defined by $minKey and $maxKey
[21:50:43] <jokke> yeah
[21:51:33] <kurushiyama> jokke: Please pastebin the output of sh.status(true)
[21:52:14] <jokke> https://p.jreinert.com/awWzE/
[21:53:38] <kurushiyama> jokke: Have a look at the key ranges. Pretty self-explanatory.
[21:54:37] <jokke> i don't know how to interpret the output
[21:54:46] <kurushiyama> jokke:
[21:54:49] <kurushiyama> jokke: Ok
[21:54:50] <neoadventist> ok
[21:55:28] <jokke> sorry i'm so slow... :/ this is the first time i'm dealing with sharding
[21:56:57] <kurushiyama> jokke: Ok, you see the key ranges assigned to the respective shards?
[21:58:58] <kurushiyama> All those "{...}-->>{...} on rsX"
[21:59:04] <jokke> no? :/ i don't understand the output at all. what's up with the -->> and Timestamp(x, y)
[21:59:24] <kurushiyama> jokke: Those are the key ranges assigned to the respective shard.
[22:00:04] <kurushiyama> jokke: Ok, one step back.
[22:00:27] <jokke> but how should i read this?
[22:01:19] <kurushiyama> well, have a look at the ranges. let us take lines 37 to 43
[22:02:11] <jokke> alright
[22:02:18] <kurushiyama> Understood?
[22:03:01] <jokke> so everything from $minkey to aaaa, data_source_1 and 2016-05-03T20:31:03.427Z ?
[22:03:25] <kurushiyama> jokke: Aye
[22:03:50] <jokke> ok
[22:03:54] <kurushiyama> jokke: Now have a (casual) look over the other key ranges of that database only
[22:04:39] <kurushiyama> jokke: until line 124 exclusive
[22:05:20] <olpery> mongodb is a piece of shit that lost my data again. fuck you.
[22:05:33] <kurushiyama> olpery: writeConcern is your friend ;)
[22:05:41] <jokke> almost everything is assigned to the first shard
[22:05:50] <jokke> lol @ olpery :D
[22:07:31] <jokke> kurushiyama: is that what i was supposed to see?
[22:08:14] <kurushiyama> jokke: Sort of. That more data was written to rs1 is quite obvious from the chunk distribution. Question is: Why?
[22:09:05] <kurushiyama> jokke: Now we know that a broader range of keys gets written to rs1. Why does this happen?
[22:09:29] <jokke> i wish i knew :)
[22:10:05] <jokke> i guess i just don't know well enough how the shard keys are used by mongodb
[22:10:13] <kurushiyama> Oh, that is easy
[22:10:24] <kurushiyama> Lets start simple
[22:10:32] <kurushiyama> our shard key is an integer
[22:10:45] <jokke> ok
[22:11:13] <kurushiyama> so we have $minKey => 10 assigned to rs1 and 11 => $maxKey assigned to rs2
[22:12:01] <kurushiyama> Now, let us assume our shard key increases monotonically. Autoincrement, if you will
[22:12:20] <jokke> quick follow up question: should $minKey and $maxKey sort of define the "middle" of the sharded data?
[22:12:40] <kurushiyama> jokke: no
[22:12:43] <hardwire> newp
[22:13:06] <kurushiyama> In our example $minKey literally means -infinity, in case we talk of signed ints
[22:13:23] <kurushiyama> vice versa with $maxKey
[22:13:46] <kurushiyama> So, we now add our docs.
[22:13:48] <jokke> ah i misinterpreted your sentence. you meant "from $minkey to 10" and "from 11 to $maxKey"
[22:13:59] <kurushiyama> Aye
[22:14:09] <kurushiyama> Now, we add our docs
[22:14:10] <jokke> ok
[22:14:29] <kurushiyama> docs 1-10 go to rs1, _all_ other docs go to rs2
[22:14:34] <jokke> sure
[22:14:41] <kurushiyama> we only have two chunks so far
[22:15:27] <kurushiyama> Now, we get to doc 100 of our rather large docs and the chunk on rs2 crosses the 32MB border and gets split.
[22:15:56] <jokke> ok
[22:15:58] <kurushiyama> This happens a few times, until the chunk migration threshold is reached and the balancer moves _one_ chunk to rs1
[22:16:13] <jokke> yeah
[22:16:49] <jokke> but wouldn't that mean rs1 is always chunk migration threshold chunks "behind" rs2?
[22:17:15] <kurushiyama> jokke: not quite. Chunk migration threshold -1
[22:17:27] <jokke> yes
[22:17:47] <jokke> ok but constant chunk migration is bad anyways
[22:17:52] <kurushiyama> Plus, all writes go to rs2, but lets keep this aside for the moment.
[22:17:56] <jokke> yes
[22:18:03] <jokke> that's the major bummer
[22:18:18] <kurushiyama> Now, how does mongos know where to look for a doc?
[22:18:19] <jokke> plus the io load from the chunk migration
[22:18:29] <kurushiyama> jokke: That is rather laughable.
[22:18:34] <jokke> it is?
[22:18:40] <jokke> well network load then
[22:18:48] <kurushiyama> A chunk can have max 64MB, otherwise it becomes jumbo and is not migrated any more
[22:19:00] <jokke> k
[22:19:31] <kurushiyama> Which is one of the reasons for an early split. Furthermore, there can be only one, and exactly one, chunk migration in a cluster at any given point in time.
[22:19:55] <jokke> mhm
[22:20:32] <kurushiyama> jokke: Which can become part of the problem when choosing a bad shard key, for obvious reasons.
[22:21:19] <kurushiyama> So, now you want to access the data you saved. How does mongos know where to get the doc?
[22:21:36] <kurushiyama> Assuming you query by shard key.
[22:22:44] <kurushiyama> It gets the key ranges assigned to the individual shards and does a rather simple lexicographic comparison (single or compound does not matter much, as we have seen) to identify the shard the document lives on.
[22:23:51] <jokke> alright
[22:24:06] <kurushiyama> Then mongos contacts that shard, fetches the data pretty much like you'd do in a standalone instance (iirc, _exactly_ the same way) and gets you the data.
[22:24:46] <kurushiyama> That last part "gets you the data" can become pretty tricky, but for now that is sufficient.
[22:25:14] <jokke> ok
[22:25:14] <kurushiyama> The thing is that we are talking of (more or less) contiguous ranges of keys assigned to each shard.
[22:27:38] <kurushiyama> So, our example made it pretty obvious why you do want to prevent anything monotonically increasing at all costs.
[22:28:41] <jokke> hm yes i guess so
[22:29:07] <jokke> but i'm not happy with a hashed _id either
[22:29:11] <jokke> it's too random
[22:29:19] <jokke> no query isolation at all
[22:29:28] <kurushiyama> You are afraid of scatter/gather, as I could tell?
[22:29:33] <jokke> yes
[22:30:26] <jokke> the most common query would be "get me all data from panel X in the time range from Y to Z"
[22:31:10] <jokke> which is why i thought using _id would've made sense
[22:31:47] <kurushiyama> jokke: Well, the alternative is a constantly overloaded shard. That does not do much for performance. In worst case scenarios I have seen, there were 8 shards (rather big ones), 7 of them idling around, 1 of them totally overloaded up to a point where the application came to a standstill.
[22:33:38] <kurushiyama> Compared to inserting data, how often do you do this query?
[22:34:33] <kurushiyama> jokke: I know that is a stupid question. But you really have to ask yourself.
[22:34:43] <jokke> sure...
[22:35:10] <jokke> hm i don't know. even the write performance was way better without the hashed _id
[22:35:32] <jokke> i don't know why...
[22:35:37] <kurushiyama> jokke: The actual writes? Sure. Taking balancing into account? I doubt that.
[22:36:04] <jokke> mmh i guess you're right about that
[22:36:30] <jokke> from the docs: Generally, choose shard keys that have both high cardinality and will distribute write operations across the entire cluster.
[22:37:03] <kurushiyama> jokke: Well, there are several approaches we could use.
[22:37:53] <kurushiyama> jokke: First, we could use a redundant key.
[22:38:01] <jokke> yeah
[22:38:09] <jokke> seems inevitable...
[22:38:35] <kurushiyama> jokke: I am not too convinced. Let me check
[22:39:35] <kurushiyama> I need a smoke. be back in 5
[22:40:44] <jokke> kurushiyama: i need to get some sleep.. i'll see you tomorrow. Thanks for being so patient and helping me out like that. I really appreciate it, knowing you're usually getting paid for this stuff.
[22:44:01] <kurushiyama> jokke: You might want to dig into "tag based sharding". Just an idea, haven't elaborated on it yet
[22:44:55] <jokke> alright
[22:46:37] <kurushiyama> nah, we still are monotonically increasing. Can you pastebin a sample doc quickly? So maybe I can come up with something.
[22:50:25] <jokke> ok, so this will be the schema i'm most likely going with based on the performance and the way the data comes in: https://p.jreinert.com/nzKb/
[22:51:09] <jokke> but this seems to distribute pretty well: https://p.jreinert.com/mPK0/
[22:51:38] <kurushiyama> Sort of. From first sight: flat data model, sharded by data source.
[22:52:10] <kurushiyama> Pre-split.
[22:53:19] <jokke> i won't go with a flat data model :) i know we had this discussion a few times already. The data comes in packets though, per panel. so it's way faster to store them inside a "panel-document" (since they're most likely being queried together too)
[22:54:37] <jokke> the doc i just pasted is basically the closest representation of how the data comes in
[22:56:22] <kurushiyama> jokke: Well, it would have the advantage of distributed load (no problems with monotonically increasing shard key). If panels do not have overlapping datasources, we could eliminate scatter/gather by tag based sharding.
[22:56:59] <jokke> no overlapping datasources
[22:57:09] <kurushiyama> jokke: With index prefix compression you would be pretty well off.
[22:57:42] <jokke> they might have the same id. it's unique only in the scope of a panel
[22:58:48] <kurushiyama> jokke: Ok, but I guess we can overcome that. Actually, we could use the panel's ID/name/distinctive whatever for that.
[22:59:18] <jokke> sure
[22:59:59] <kurushiyama> jokke: even better: if the panel ID distinguishes them, we might use the panel ID as a shard key, halfway even distribution taken for granted. Scatter/gather eliminated altogether without tag based sharding.
[23:01:09] <kurushiyama> And if the distribution is uneven, we still could resort to tag based sharding.
[23:01:10] <jokke> hm but wouldn't the chunks become huge?
[23:01:17] <kurushiyama> No.
[23:01:21] <kurushiyama> They get split ;)
[23:01:26] <jokke> by what?
[23:01:48] <kurushiyama> A chunk does not have a designation on its own, remember?
[23:02:00] <kurushiyama> We are talking of ranges.
[23:02:19] <kurushiyama> aa - ab may consist of 10k chunks
[23:02:36] <jokke> yes
[23:02:48] <kurushiyama> still, it would hold all docs from aa-ab
[23:04:12] <jokke> gosh i feel so stupid :/ i thought something like a timestamp would be needed for the splitting to work...
[23:05:35] <jokke> just using _id.panel_id seems like hammering every doc with the panel_id aaaa to aabb into one chunk
[23:05:45] <kurushiyama> Lets check. Distributed writes, no scatter/gather, an index we need anyway. At the expense of 8 bytes/record? Great deal in my book.
[23:06:03] <kurushiyama> jokke: Nah
[23:06:10] <kurushiyama> jokke: We'd need to pre-split.
[23:06:16] <jokke> pre-split?
[23:06:19] <kurushiyama> jokke: Say you have 1k panels
[23:06:27] <jokke> yeah
[23:06:43] <kurushiyama> And they are pretty even in terms of expected data size.
[23:06:55] <jokke> ok
[23:07:17] <kurushiyama> Say our panel IDs are ints
[23:08:09] <kurushiyama> Now, we simply tell MongoDB beforehand to split at 500 and put the panels from 0-500 on rs1 and the rest on rs2
[23:08:44] <kurushiyama> Since our shard key is panelID, and the data is even for each panel, everything is fine.
[23:08:48] <jokke> oh ok we can do that?
[23:09:21] <kurushiyama> jokke: You can always do that, but then the shard key must have a high cardinality.
[23:09:36] <kurushiyama> jokke: To make sense, that is.
[23:09:42] <jokke> i see
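A sketch of the pre-split kurushiyama describes, assuming integer panel IDs, two shards named rs1 and rs2, and the same placeholder telemetry.readings namespace, this time sharded by panelId:

    // Shard by panelId, split the initial chunk at 500, and pin each half to a shard.
    sh.shardCollection("telemetry.readings", { panelId: 1 });
    sh.splitAt("telemetry.readings", { panelId: 500 });
    sh.moveChunk("telemetry.readings", { panelId: 1 }, "rs1");   // chunk covering panels 0-499
    sh.moveChunk("telemetry.readings", { panelId: 500 }, "rs2"); // chunk covering panels 500+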
[23:11:20] <kurushiyama> Hence, again: flat data model, with panel ID in the docs, sharded by said panelID, pre-split or even tagged.
[23:11:45] <jokke> wouldn't using something like the date + hour of a timestamp hashed with some modulo be a good shard key too?
[23:12:05] <kurushiyama> That is basically a hashed key, no? ;)
[23:12:19] <jokke> yes but with a lot more control
[23:12:21] <kurushiyama> And does not eliminate scatter/gather ;)
[23:12:29] <jokke> sure does!
[23:12:33] <kurushiyama> Nope.
[23:13:06] <kurushiyama> Since it is sort of hashy.
[23:13:18] <jokke> but not really random
[23:13:32] <kurushiyama> A hash is not, either.
[23:13:51] <kurushiyama> Plus, you have a good chance of collisions.
[23:14:02] <jokke> it makes sure all docs within an hour on the same day get written to the same chunk
[23:14:26] <kurushiyama> Uhm say hour only
[23:14:39] <kurushiyama> take a modulo of 5
[23:14:40] <jokke> yeah why not
[23:14:45] <kurushiyama> Ok
[23:15:21] <jokke> why 5? :)
[23:16:01] <jokke> or was it just an example
[23:16:04] <kurushiyama> Lets take 2, does not matter
[23:16:08] <jokke> yeah
[23:16:10] <jokke> ok
[23:16:27] <kurushiyama> so, you want to shard by h%2
[23:16:27] <jokke> basically lets you control the chunk size
[23:16:32] <kurushiyama> no
[23:16:34] <jokke> no?
[23:16:36] <kurushiyama> size is fixed
[23:16:43] <kurushiyama> split at 32MB
[23:16:50] <jokke> mhm
[23:16:53] <kurushiyama> (sort of fixed)
[23:17:26] <kurushiyama> jokke: You can control the number of values your shard key can have. But so does the panelId, no?
[23:17:50] <jokke> yes
[23:17:56] <kurushiyama> qed ;)
[23:18:54] <jokke> :)
[23:20:25] <kurushiyama> jokke: So, my solution eliminates scatter/gather and distributes writes evenly while giving good performance, at the expense of like 20 bytes (12-byte ObjectId + 8-byte additional TS) per record.
[23:20:31] <kurushiyama> jokke: Your turn ;P
[23:21:12] <jokke> additional TS?
[23:21:56] <kurushiyama> jokke: Yeah, you'd need a TS for each of the flat docs to correlate for the panel. If a panel itself is time based, even that TS gets eliminated.
[23:22:14] <kurushiyama> Oh, but the panelId would be redundant, ofc.
[23:22:31] <kurushiyama> so 12 bytes oid + panelId.length
[23:23:15] <kurushiyama> jokke: either way, we could correlate, either by time or panelId. As said: your turn ;)
[23:25:16] <kurushiyama> jokke: So { _id: ObjectId(), panelId:yourPanelId, datasource:ds, r:1,i:1.5}, sharded by panelId
[23:25:49] <jokke> i'm still not using flat docs though :D
[23:26:00] <kurushiyama> jokke: Have you read the above?
[23:26:06] <jokke> yes
[23:26:30] <kurushiyama> Well, I don't get it. It solves your problems. All of them.
[23:26:53] <jokke> not the one of a huge index and the one with slower writes
[23:28:19] <jokke> as i said. the data arrives packaged together. so for a panel with 50 datasources i'd have 50 inserts with the flat data model as opposed to 1 insert with my "flat_panel" model
[23:28:22] <kurushiyama> Well, you can not have the cake and eat it. You have to make tradeoffs. And as proven, updates are slower. Your call. But accepting a foul compromise when it comes to sharding, you are doomed.
[23:28:40] <jokke> i won't have updates
[23:28:45] <jokke> ever
[23:28:52] <jokke> just inserts
[23:30:03] <kurushiyama> Well enough. But still, you have the sharding problem. You may be able to use something to get the distribution done, but that would be an index either way. Oh, and you might want to check bulk inserts ;)
[23:30:37] <kurushiyama> Try it, at least.
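A sketch of the bulk insert kurushiyama points at, using the flat model he proposed above ({_id, panelId, datasource, r, i} plus a ts); "packet" stands for one incoming panel packet and is purely hypothetical, and initializeUnorderedBulkOp needs a 2.6+ shell:

    // Insert all datasource readings of one panel packet in a single round trip.
    var bulk = db.readings.initializeUnorderedBulkOp();   // db.readings is a placeholder
    packet.datasources.forEach(function (ds) {
        bulk.insert({ panelId: packet.panelId, datasource: ds.id,
                      ts: packet.ts, r: ds.r, i: ds.i });
    });
    bulk.execute({ w: "majority" });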
[23:30:40] <jokke> i'll promise to look into bulk inserts and index compression
[23:30:48] <jokke> then we'll talk again :)
[23:31:17] <jokke> but you took a look at the flat_panel doc right?
[23:31:25] <kurushiyama> Bulk inserts for the datamodel above, sharding based on panelId. Note the details (you should have about 1M sample docs).
[23:31:43] <jokke> and the data distributed pretty nicely
[23:31:54] <kurushiyama> jokke: Only briefly ;)
[23:32:09] <jokke> nah! i kept inserting
[23:32:23] <jokke> til i got something like 1.2 gigs of data
[23:32:43] <jokke> and it was distributed 60 / 40 or so
[23:32:46] <kurushiyama> 20/13 (iirc) with a lot of failed migrations because it was stopped is not what I call nice ;)
[23:32:55] <jokke> sure, not ideal but fair enough
[23:33:17] <kurushiyama> Failed migrations is something I'd take _very_ seriously.
[23:33:34] <jokke> what do you mean?
[23:33:34] <kurushiyama> Especially given the rather small size of the cluster.
[23:33:56] <jokke> well it's small now
[23:34:04] <jokke> doesn't mean it stays that way
[23:34:18] <kurushiyama> line 27
[23:34:33] <jokke> oh that was before i inserted more docs
[23:34:37] <kurushiyama> That is a _serious_ problem.
[23:35:09] <kurushiyama> Interupted chunk migrations are a really bad sign.
[23:35:37] <jokke> how does that happen?
[23:37:59] <kurushiyama> jokke: I _think_ it happens when a chunk is supposed to be migrated, but changes after the chunk started to be migrated. Or sth like that. It is a symptom I have regularly seen at... You can guess three times ;)
[23:40:58] <jokke> wow.. i just killed and set up the cluster again and ran _just_ the insert benchmark for the flat_panel model
[23:41:06] <jokke> it's ready in about a minute
[23:41:17] <jokke> but shard distribution is looking _very_ good
[23:41:32] <jokke> one chunk apart
[23:41:40] <jokke> and constantly changing
[23:41:49] <jokke> with _id as shard key
[23:43:20] <jokke> https://p.jreinert.com/Yxndp3/
[23:43:38] <jokke> and https://p.jreinert.com/CcYy/
[23:43:47] <kurushiyama> Too small to be meaningful.
[23:44:26] <kurushiyama> Well, let us dissect that tomorrow, however ;)
[23:44:38] <jokke> hehe
[23:44:41] <kurushiyama> Unhashed _id?
[23:44:45] <jokke> yes unhashed
[23:45:28] <jokke> hm but the read performance is pretty messed up
[23:45:35] <kurushiyama> There is something strange. But let us have a look then. You know what we need. If you can, please insert like a shitload of documents.
[23:46:02] <jokke> on the other hand i'm doing stupid queries... i'm getting the _latest_ data sample
[23:46:24] <jokke> haha will do
[23:46:27] <kurushiyama> Makes sense. As said: please add like 10M docs.
[23:46:41] <kurushiyama> 100M, if you can.
[23:47:46] <jokke> jeez
[23:48:06] <jokke> this is running in docker on my laptop, you know? :)
[23:48:59] <kurushiyama> I do not do it different ;)
[23:52:14] <jokke> ssd's have their limits :P
[23:52:24] <jokke> fortunately i just put in a 1TB one :)
[23:53:33] <kurushiyama> Ok, we'll see tomorrow.
[23:57:40] <jokke> ok thanks again!