#mongodb logs for Sunday the 5th of June, 2016

[00:24:21] <bros> I just put 8,551 docs from MongoDB into Postgres on my MacBook running local servers, and the SQL is 2x as fast at reading than Mongo...
[00:58:17] <oky> bros: back again?
[00:58:22] <bros> oky: yo
[00:58:34] <oky> bros: are both running local? i'm not surprised at 2x read perf on table scan
[00:58:43] <bros> i am running local
[00:58:44] <bros> why?
[00:58:53] <oky> just making sure similar circumstances for both
[00:59:01] <oky> bros: what type of data is it, anyways?
[00:59:23] <oky> bros: doesn't postgres support multi-threaded scans now, too? and column stores?
[01:00:09] <bros> column store being?
[01:02:06] <oky> bros: columnar store. it lets you load only fields off disk relevant to query (but if you are doing full doc scans, irrelevant)
[01:02:34] <oky> bros: if postgres works for you, why not use it?
[01:02:41] <bros> oky: https://gist.github.com/brandonros/e68710bea0dae6a2d0819cd4b942a672 example doc
[01:02:52] <bros> I'd have to rewrite my entire app
[01:02:52] <bros> lol
[01:03:11] <bros> I'm trying to see if I can rewrite only the performance-sensitive parts
[01:11:15] <oky> bros: sounds reasonable to me - does your app need to do large doc scans often? is it part of a normal request?
[01:11:36] <oky> if it's not, maybe put it in the async queue for offline processing
[01:15:00] <bros> oky: do you ever use last modified timestamps or anything similar in mongodb?
[01:15:03] <bros> I think that's what I Need here
[01:17:12] <oky> bros: what do you mean? i've used an ORM that adds created/updated fields and keeps them sync'd
[01:17:23] <bros> what ORM? isn't an ORM an antipattern?
[01:17:28] <oky> is there a mongodb you are referring to?
[01:17:33] <oky> erm... mongodb feature*
[03:28:04] <bros> oky: are you still around
[03:28:36] <oky> bros: no
[03:28:41] <bros> kidding?
[03:30:17] <bros> GothAlice told me not to use timestamps for lastModified because they weren't atomic or something?
[03:30:23] <bros> Do you have any experience with cache invalidation?
[03:31:22] <bros> Can I have a WriteResult include the IDs it modified in the case of a multi update?
[07:40:51] <GothAlice> Hmm. No, I would have mentioned creation timestamp because it's included in the ObjectId. Alas.
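A quick mongo shell illustration of the point above; the "orders" collection name is a stand-in:

```javascript
// Every ObjectId embeds its creation time in its first four bytes,
// so a separate "created" field is usually redundant.
var doc = db.orders.findOne();        // "orders" is a hypothetical collection
printjson(doc._id.getTimestamp());    // ISODate at which the document was created
```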
[10:30:06] <nofxx> Is there some kind of binary field? Need to store weekdays, as in, user chooses: monday, or monday and thursday, or whatever.. so was thinking 0b1111111, one bit per day, and some kind of OR search... all from sunday: 0b1000000
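A sketch of that bitmask idea using the bitwise query operators added in MongoDB 3.2; the collection/field names and the bit layout (bit 6 = Sunday, matching 0b1000000 above) are assumptions:

```javascript
// Store the chosen weekdays as one small integer, one bit per day.
db.users.insert({name: "a", weekdays: parseInt("0100100", 2)}); // Mon + Thu
// OR-style search: a document matches if *any* requested bit is set.
db.users.find({weekdays: {$bitsAnySet: parseInt("1000000", 2)}}); // includes Sunday?
// Use $bitsAllSet instead to require every requested day to be set.
```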
[12:42:09] <oky> GothAlice: good point
[12:42:19] <oky> GothAlice: i was wondering if ObjectID has updated stamp, though
[12:42:28] <oky> i don't think it does?
[17:01:55] <GothAlice> oky: Certainly it would not. Otherwise your ID would change every time you save, which would be entirely unmanageable.
[17:03:12] <GothAlice> oky: As a note, bros is the fellow who chose to store deeply recursive structures (arrays of sub-documents containing arrays of sub-documents, several levels deep) and multi-megabyte documents, and who decided that instead of fixing his broken structure, reimplementing the entirety of MongoDB at the application level (as a "cache") would be a good idea. (It wasn't.)
[17:14:18] <GothAlice> oky: In terms of tracking modification time, adding a {$currentDate: {updated: "timestamp"}} to your update operation will do that. Atomically.
[17:14:51] <GothAlice> (Where "timestamp" may also be "date")
[18:24:01] <oky> GothAlice: ah, cool - thanks for explaining!
[18:24:55] <oky> i don't have bias towards or against mongodb - just trying to help people towards their end goal. if they don't listen when the proper solutions are presented, i don't see why i should spend time helping them fix it, either
[18:25:07] <oky> so... duly noted
[18:25:22] <GothAlice> That was my situation; I had spent quite a bit of time one-on-one with him, and pretty much every bit of advice was ignored. :'(
[18:48:17] <KostyaSha> Is it possible to have a secondary hidden/priority=0 replica that wouldn't participate in elections (so a lone primary doesn't step down to secondary when the hidden replica dies)?
[18:54:48] <GothAlice> KostyaSha: Yes.
[18:55:04] <GothAlice> KostyaSha: https://docs.mongodb.com/manual/tutorial/configure-a-non-voting-replica-set-member/
[18:55:30] <GothAlice> + https://docs.mongodb.com/manual/tutorial/configure-secondary-only-replica-set-member/
[18:55:58] <KostyaSha> cool :) somehow missed this part
[18:57:17] <KostyaSha> GothAlice, thanks! the documentation is so cross-linked that i periodically get lost in it :(
[19:24:28] <KostyaSha> GothAlice, how priority vs vote works?
[19:28:55] <KostyaSha> vote is just 1/0 switch?
[19:34:35] <bros> Mongo transactions... no way?
[19:54:40] <GothAlice> bros: https://docs.mongodb.com/manual/core/write-operations-atomicity/ + https://docs.mongodb.com/manual/tutorial/perform-two-phase-commits/
[19:56:01] <GothAlice> bros: In terms of tracking modification time, adding a {$currentDate: {updated: "timestamp"}} to your update operation will do that. Atomically. I would also not have mentioned the _id ObjectId except for creation time, which it includes. (It would be entirely nonsensical for it to contain a modification time, as your ID would change on every save.)
[19:56:21] <GothAlice> ("timestamp" or "date" being valid values, determining which type of field is used.)
[19:56:46] <bros> Didn't you tell me not to rely on that $currentDate last-updated timestamp idea?
[19:56:54] <GothAlice> No.
[19:57:15] <bros> I'm running into race conditions where one process is reading from a collection while another is in the middle of writing to it and invalidating it (in terms of cache).
[19:57:20] <bros> What can I do here? Locks?
[19:58:37] <GothAlice> Your cache is a fundamentally flawed concept, adding nothing but more work for you. You could lock, but then the situation is worse, and you aren't addressing the real problem.
[19:59:02] <GothAlice> http://xyproblem.info
[19:59:13] <bros> What's the real problem at the moment, from what you can tell?
[20:00:29] <bros> What are your thoughts on http://rain1017.github.io/memdb/
[20:01:29] <GothAlice> Incorrect data architecture; you've structured your data in direct opposition to documented practice, wilfully ignoring the many documented warnings and advice regarding this. You're mis-using MongoDB in several fundamental ways (over-large documents, whole-document updates, and deep nesting).
[20:02:09] <GothAlice> (Instead of addressing this, you choose to add more layers of complication and abstraction, only creating new problems.)
[20:02:24] <bros> Should I even be using MongoDB in your opinion?
[20:02:42] <GothAlice> Without restructure? No. With restructure to make use of MongoDB features, possibly.
[20:03:35] <GothAlice> Minimal atomic updates are a huge thing vs. whole-document replacements. With over-large documents, the difference will be night and day, performance-wise.
[20:04:29] <GothAlice> You can only do minimal atomic updates (i.e. restricted to just the "changing fields", and making use of more than just $set…) if a) your documents are not deeply nested, and b) no change needs to modify multiple arrays at the top level, i.e. avoiding "broad" updates of nested data.
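For example (collection, field, and variable names are made up), the difference between the two styles looks like:

```javascript
// Whole-document replacement: rewrites every field, clobbers concurrent writers.
db.orders.update({_id: someId}, wholeModifiedDoc);
// Minimal atomic update: one operation, touching only what changed.
db.orders.update(
    {_id: someId},
    {
        $set: {"shipping.status": "packed"},           // just the changed field
        $inc: {itemCount: 1},                          // no read-modify-write race
        $push: {events: {t: new Date(), e: "packed"}}  // append to a single array
    }
);
```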
[20:04:35] <bros> I just need to solve the problem where one process can be invalidating the cache collection while the other is reading from it at the moment.
[20:04:51] <GothAlice> With atomic updates, that is not an issue at all.
[20:05:21] <bros> It is in my app. I do client side looping to send updates, not bulk operations because I have business logic making each operation unique.
[20:06:29] <GothAlice> If your concerns are multi-document, then there are several approaches. Locking is the classic one, but treads on performance terribly (denying reads). Versioned documents and/or two-phase commit allows for reading of stale values while the new values are being "committed".
[20:06:49] <GothAlice> (Eliminating the read/write lock issue.)
[20:08:27] <bros> GothAlice: so I would need to change all of my writes (in order to use two-phase commit) to go through a transactions collection
[20:09:06] <bros> and add an array called pendingTransactions to all documents
[20:09:26] <bros> rewrite all of my updates, saves, removes, bulk inserts/upserts
[20:10:21] <GothAlice> And have a persistent process coalescing the data.
[20:10:42] <GothAlice> To handle: https://docs.mongodb.com/manual/tutorial/perform-two-phase-commits/#recovering-from-failure-scenarios
[20:11:13] <GothAlice> As to the "business logic", at work we too have several complex processes requiring calculations and API calls outside of MongoDB. We still boil changes down to minimal atomic updates. (Some of our individual update statements, and queries, are kinda crazy, as they're machine-built.) We even use versioned collections for aggregate query reporting, so as not to give users partial "stored views".
[20:12:14] <bros> actually
[20:12:15] <bros> hold on
[20:12:20] <bros> I just rewrote my caching system the other day
[20:12:28] <bros> to only read from a separate database called cache
[20:12:49] <bros> I think I might be ok... The changes happen in the "db" database, then make their way to the "cache" database
[20:40:13] <pdekker> Hi, I have made a replica set with two mongodb instances. What is a good way to check that the contents of the two databases are the same, and that replication is thus successful?
[20:47:07] <Quick_Wango> Hi! My mongod 3.2 is not starting when started as a service on debian jessie, because it cannot write to the pid file. The pid file is 644 and owned by root; changing the permissions does not help. How can I solve this?
[20:50:21] <GothAlice> pdekker: rs.status()
[20:50:34] <GothAlice> https://docs.mongodb.com/manual/reference/method/rs.status/
[20:51:35] <GothAlice> Quick_Wango: You need to find out which user MongoDB is trying to run as, and make sure the PID file is owned (chown) by that user. I suspect during initial setup you ran the server as root to test; this would have created a fair number of files (PID file, log files, and data directory + files) that will all need to be updated.
[20:56:03] <Quick_Wango> GothAlice: interestingly it seems like the init.d script created it as root
[20:56:16] <GothAlice> That's odd.
[20:56:21] <Quick_Wango> chown'ing it did fix it, chmod'ing however did not
[20:56:55] <Quick_Wango> I never manually started the daemon, always through service mongod (re)start
[21:02:38] <Quick_Wango> GothAlice: after rebooting the system I have the same problem again
[21:03:57] <GothAlice> Quick_Wango: How did you install MongoDB?
[21:04:21] <GothAlice> Via https://docs.mongodb.com/manual/tutorial/install-mongodb-on-debian/ or using the core packages?
[21:04:28] <Quick_Wango> GothAlice: I installed the package from wheezy/mongodb-org/stable (on a jessie system)
[21:05:16] <Quick_Wango> could the problem be related to systemd?
[21:05:30] <GothAlice> Alas, I have no idea what the consequences of cross-distro-version use would be. I don't Debian. Could very likely be, as I do find some discussion over issues with System V compatibility.
[21:13:57] <Quick_Wango> GothAlice: I've seen there is a 3.3 package for jessie already, but I guess that's not stable yet, or is it?
[21:28:59] <pdekker> GothAlice: If the output of rs.status() shows that date and lastHeartBeat are the same, can I be sure the contents are the same?
[21:30:10] <GothAlice> Quick_Wango: Unlikely to be so, no.
[21:31:26] <GothAlice> pdekker: Ah, no. That's a bit more involved. http://blog.mlab.com/2013/03/replication-lag-the-facts-of-life/#How_do_I_measure_lag
[21:31:57] <GothAlice> The contents will always "be the same" as of a certain point in time, usually a little in the past. That's the nature of replication. (By a little, it can be as low as a few milliseconds.)
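To actually see that lag from the shell, something like:

```javascript
// Each member's optimeDate is the time of its last applied operation;
// the gap between primary and secondary is the replication lag.
rs.status().members.forEach(function (m) {
    print(m.name + " (" + m.stateStr + ") optime: " + m.optimeDate);
});
// Or let the shell summarize how far each secondary trails:
rs.printSlaveReplicationInfo();
```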
[22:48:18] <bros> GothAlice: what do you recommend for IPC?