#mongodb logs for Wednesday the 27th of July, 2016

[01:59:33] <Trinity> how do you guys deal with mongodb race conditions?
[02:00:09] <Trinity> for instance, I want to update a document based on a previous find query on that document
[02:01:08] <Trinity> I know I could use implicit application-specific logic to guarantee that I never enter a race condition, by never updating those documents/fields concurrently, but it doesn't seem like a good method
[02:09:18] <Trinity> hmm, i guess I could also use something like redis-lock to manage it if I have clusters
[02:09:30] <Trinity> but that doesn't solve the problem if I have more than one system
[02:13:31] <Trinity> wait, I think redis can support multiple systems
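A common way around this read-then-update race, without an external lock, is to fold the read's predicate into the update itself so the check and the write happen atomically on the server. A minimal mongo-shell sketch; the `counters` collection, the `jobs` document, and the guard value are all hypothetical:

```javascript
// Atomic check-and-set: the query doubles as the "find", so no separate
// read is needed and no other writer can slip in between.
db.counters.findAndModify({
    query:  { _id: "jobs", value: { $lt: 100 } },  // only match the state we expect
    update: { $inc: { value: 1 } },
    new:    true                                   // return the updated document
});
// Returns null if another writer moved 'value' past the guard first,
// in which case the application can re-read and retry.
```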
[04:58:05] <trq> I've done this before in my dev environment, but am now needing to do some data merging in our staging environment.
[04:58:40] <trq> Are there any consequences to running a mongorestore and trying to import data into already existing databases?
[04:59:01] <trq> The export from mongoexport has mostly the same data but some new data.
[04:59:34] <trq> On my dev box I got a whole bunch of errors due to indexes already existing, but in the end, it seemed to work.
[04:59:40] <trq> Is this a bad idea?
[05:01:29] <trq> The alternative is to restore these databases into temp databases, then create a script to loop through all their collections, looking for docs that don't exist in the proper databases and inserting them.
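A sketch of that temp-database merge script in the mongo shell, assuming the docs can be matched on `_id`; the database names `temp_mydb` and `mydb` are placeholders:

```javascript
// Restore the dump into a temp database first, then copy over only
// the documents that don't already exist in the real database.
var src = db.getSiblingDB("temp_mydb");
var dst = db.getSiblingDB("mydb");
src.getCollectionNames().forEach(function (name) {
    src.getCollection(name).find().forEach(function (doc) {
        if (dst.getCollection(name).count({ _id: doc._id }) === 0) {
            dst.getCollection(name).insert(doc);
        }
    });
});
```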
[07:02:04] <afp> Hi there Mongoers!
[07:02:47] <afp> I have taken a mongodump from a standalone mongod instance on a box I have, moved the data to a new box and have tried to do mongorestore and it appears to be stuck in some form of a continuous loop
[07:04:36] <afp> https://puu.sh/qfYhC/0720a4eb0b.png Here is an example of what I mean
[07:05:12] <afp> not sure if it is important, but the mongod instance I am trying to do mongorestore to is a member of a replica set: rs1.. I did specify rs1/127.0.0.1:27017 as the host as per docs
[07:05:22] <afp> but yeah.. pretty confused, i've never seen anything like this before
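For reference, the documented replica-set form of the invocation afp describes looks like this; the dump path is illustrative:

```
# "rs1" is the replica set name; mongorestore resolves the primary
# from the seed host after the slash.
mongorestore --host "rs1/127.0.0.1:27017" /path/to/dump
```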
[09:48:20] <mroman> What do you do when you have {_id : 'key', 'B' : 5}, {_id : 'key', 'C' : 6}, {_id : 'key2', 'B' : 4}, and you want everything where 'B' < 10 and 'C' > 3 or something like that.
[09:48:58] <mroman> you obviously can't just do {'B' : {$lt : 10}},{'C' : {$gt : 3}} because these are in different documents.
[09:49:31] <mroman> The only way I know of is to do a group, but that has horrible implications later on.
[09:50:32] <mroman> (especially since grouping requires disk use and in general it seems to take minutes to group just a couple of million documents)
[09:59:21] <mroman> http://codepad.org/5HcsW9Yj <- that's my problem.
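The group-then-match approach mroman means, as a sketch: `key` stands in for the shared identifier from his example (two documents cannot literally share an `_id`), and the collection name is hypothetical. Since each fragment carries only one of the fields, `$max` over the group recovers whichever value is present:

```javascript
db.coll.aggregate([
    { $group: { _id: "$key",
                B: { $max: "$B" },      // collapse the per-key fragments
                C: { $max: "$C" } } },  // ($max ignores missing fields)
    { $match: { B: { $lt: 10 }, C: { $gt: 3 } } }  // the AND is now possible
]);
```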
[12:13:33] <cheeser> mroman: i'm heading into the office (and I may not have an answer for you even still) but it'd help to see your agg pipeline
[12:57:50] <mroman> cheeser: http://codepad.org/TV027BWr
[12:57:53] <mroman> that's the aggregation pipeline
[12:59:46] <mroman> then I can add matches such as {$match: {'measurements.v' : {$lt : 50}, 'measurements.d' : {$gt : 0}}}
[13:00:45] <mroman> http://codepad.org/RdixOalr (example data + query + result)
[13:04:32] <mroman> I've simplified it a bit.
[13:05:45] <mroman> (location is actually an array for the real data set and consists of waypoints)
[13:06:22] <cheeser> mroman: you'll want to match first so you can use an index.
[13:07:37] <mroman> that's why I tried putting the matches at the start of the pipeline as an $or
[13:08:12] <mroman> I can't match for < 50 AND > 0 before grouping
[13:08:33] <mroman> but I can match for either of them, then group and then check if both hold.
[13:09:04] <mroman> however, d > 0 pretty much matches 99.9% of all the data
[13:09:32] <mroman> which means you end up grouping pretty much the whole data set
[13:09:57] <mroman> (the queries are dynamically generated by users)
[13:10:16] <mroman> (by which I mean the 'match' criteria are based on what a user wants)
[13:10:42] <mroman> (so there's the case that somebody is interested in everything where d > 0)
[13:11:18] <mroman> if I do a {$group : {_id : '$location'}},{$limit : 1}
[13:11:27] <mroman> it takes a long time.
[13:13:09] <mroman> around 20s on the test data set.
[13:13:27] <mroman> (that's around 5 million rows)
[13:13:37] <cheeser> do two matches. first an 'or', do your thing, then an 'and'
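cheeser's two-match idea as a pipeline skeleton, with the field names (`measurements.v`, `measurements.d`, `location`) taken from the conversation and everything else assumed: the `$or` runs first so it can use indexes and shrink the input, and the real AND runs after the group:

```javascript
db.data.aggregate([
    { $match: { $or: [ { "measurements.v": { $lt: 50 } },
                       { "measurements.d": { $gt: 0 } } ] } },  // indexed pre-filter
    { $group: { _id: "$location",
                measurements: { $push: "$measurements" } } },   // merge per location
    { $match: { "measurements.v": { $lt: 50 },
                "measurements.d": { $gt: 0 } } }                // the AND, post-group
], { allowDiskUse: true });
```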
[13:14:08] <mroman> still, worst case the match matches nothing because the user is interested in something that is quite common :)
[13:14:13] <mroman> *matches everything
[13:14:39] <mroman> also I have not figured out how I can do where measurements.a == measurements.b
[13:14:42] <mroman> except with a $where clause
[13:15:07] <mroman> and $where clauses take about 2.5 minutes on those 5 million rows.
[13:15:14] <cheeser> { "measurements.a" : { $eq : "$measurements.b" } } ?
[13:15:33] <cheeser> i've not tried that. no idea if it works
[13:15:42] <mroman> I thought you could only use constants
[13:16:12] <mroman> but that's an issue for later.
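mroman is right that a plain query's `$eq` only takes constants, so cheeser's guess wouldn't work as written. Inside the aggregation pipeline, though, a field-to-field comparison is possible without `$where`; a sketch with the same assumed field names:

```javascript
db.data.aggregate([
    { $project: { doc: "$$ROOT",
                  sameAB: { $eq: ["$measurements.a", "$measurements.b"] } } },
    { $match: { sameAB: true } }
]);
// Newer servers (3.6+) can express this directly in a find():
// db.data.find({ $expr: { $eq: ["$measurements.a", "$measurements.b"] } })
```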
[13:17:14] <mroman> Using a collection with pre-grouped data works kinda well (performance-wise)
[13:17:38] <mroman> but there's no "incremental" grouping I'm sure :)
[13:18:09] <mroman> so the live data and the pre-grouped data would diverge over time of course.
[13:18:54] <cheeser> you could schedule that grouping. dump it to a new collection with $out
[13:19:04] <cheeser> 3.4 will have views fwiw.
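The scheduled pre-grouping cheeser suggests, as a sketch: run the expensive `$group` periodically and let `$out` replace the target collection atomically when the pipeline completes. The target collection name is hypothetical:

```javascript
db.data.aggregate([
    { $group: { _id: "$location",
                measurements: { $push: "$measurements" } } },
    { $out: "data_by_location" }   // swapped in atomically on completion
], { allowDiskUse: true });
```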
[13:20:30] <mroman> there's no way to do a find with the conditions ORed and then do some kind of intersection?
[13:20:40] <Derick> cheeser: views - really?! :) I did not know that!
[13:20:43] <mroman> that could eliminate grouping
[13:21:28] <mroman> wait
[13:21:29] <mroman> wait
[13:21:43] <mroman> hm.
[13:22:07] <mroman> no you can't just chain $matches
[13:22:28] <mroman> the problem is that my predicates are actually predicates over multiple documents.
[13:22:54] <mroman> one group is just to merge all measurements about one location into one document
[13:24:21] <cheeser> group then match? sort should *not* be the first stage.
[13:24:35] <cheeser> (though I think the aggregation engine can optimize some of that.)
[13:24:56] <mroman> but I need to sort to filter out deprecated measurements
[13:25:15] <cheeser> you'd still sort. just later.
[13:25:35] <mroman> although sorting with an index seems to be really fast :)
[13:25:51] <mroman> but when should I do the sort?
[13:26:10] <mroman> after the grouping you can't sort by timestamp anymore afaik?
[13:26:16] <cheeser> right before you're trying to find the first of whatever.
[13:26:46] <mroman> that's what I'm doing already?
[13:26:56] <mroman> the pipeline is two groups actually
[13:27:10] <cheeser> the sort i was first
[13:27:14] <mroman> first you group by location,instrument to only get the latest results for a location by an instrument
[13:27:15] <cheeser> i *saw* was first...
[13:27:55] <mroman> then you group again to merge all measurements left for the same location into the same document so you can do the AND thing
[13:28:21] <cheeser> sure. group. match. sort. group.
[13:28:49] <mroman> I don't get that.
[13:29:01] <mroman> I'll show you what I meant *just a sec*
[13:29:15] <cheeser> group to get both measurements. match against both. sort. group to get the first.
[13:32:52] <mroman> cheeser: http://codepad.org/6ybgerTW
[13:33:24] <mroman> oh wait the instrument needs to be kept too
[13:33:38] <mroman> but how do I now pick the most recent measurement out of that array?
[13:33:43] <mroman> and by instrument
[13:34:04] <cheeser> you'd have to $unwind it
[13:34:56] <mroman> yeah ok $unwind and then $group again would work too, yes
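Putting the thread together, a sketch of the two-group pipeline mroman describes, with the final AND-match cheeser proposed; the field names (`timestamp`, `location`, `instrument`, `measurements.v/.d`) come from the conversation, the rest is assumed:

```javascript
db.data.aggregate([
    { $sort: { timestamp: -1 } },                     // newest first (can use an index)
    { $group: { _id: { location: "$location", instrument: "$instrument" },
                latest: { $first: "$$ROOT" } } },     // latest measurement per instrument
    { $group: { _id: "$_id.location",
                measurements: { $push: "$latest.measurements" } } },  // merge per location
    { $match: { "measurements.v": { $lt: 50 },
                "measurements.d": { $gt: 0 } } }      // AND across the merged array
], { allowDiskUse: true });
```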
[13:46:28] <wspider> I have a document, I want to replace some of its values and insert new ones
[13:46:40] <wspider> and keep everything else
[13:47:36] <wspider> since I haven't figured it out yet, what I was thinking about is replacing it completely..
[13:48:01] <wspider> as I already have the doc which I want to modify in memory
[13:48:47] <mroman> upsert?
[13:49:20] <mroman> no wait that's a regular update.
[13:51:09] <mroman> or findAndModify
[13:53:37] <mroman> wspider: db.collection.update({'a':5},{$set : {'a' : 3, 'new' : 'one'}}) ?
[13:55:38] <wspider> mroman: I will try it, I am not sure if $set will let me add new properties and at the same time modify old ones
[13:57:07] <mroman> wspider: http://codepad.org/wFMQBEgq it does.
[13:58:35] <wspider> mroman: great
[13:59:03] <wspider> that's exactly what I was looking for :p
[13:59:16] <wspider> thanks!
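What mroman's one-liner demonstrates, spelled out: `$set` overwrites existing fields and creates missing ones in the same update, leaving every other field untouched. A runnable shell sketch with a hypothetical collection:

```javascript
db.things.insert({ a: 5, keep: "me" });
db.things.update({ a: 5 }, { $set: { a: 3, "new": "one" } });
db.things.findOne();   // { _id: ..., a: 3, keep: "me", "new": "one" }
```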
[14:04:30] <wspider> didn't try it before because I got misled by my misunderstanding of a stackoverflow answer, sry
[14:05:13] <mroman> I wouldn't consult stackoverflow before searching through the documentation for at least 15 minutes.
[14:05:29] <mroman> That's general advice from me :p
[14:06:31] <mroman> Even if you end up not finding what you were looking for and have to consult IRC or stackoverflow, you will probably stumble over other things that you can come back to later.
[14:07:01] <mroman> also... security and safety and best practice advice is usually mentioned in the docs more often than in forum answers.
[14:14:53] <wspider> ok, I agree
[14:15:02] <wspider> advice taken
[16:20:58] <wspider> hmm, weird.. it works using the interactive shell but not in nodejs
[16:28:33] <wspider> I think it's because of mongodb versions..
[18:24:31] <wspider> mine was an implementation fault, sry for the spam :P
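For completeness, the same `$set` update through the 2.x-era Node.js driver, since that is where wspider's attempt went wrong; the connection string and names are placeholders, and the update document itself is identical to the shell version:

```javascript
var MongoClient = require("mongodb").MongoClient;

MongoClient.connect("mongodb://localhost:27017/test", function (err, db) {
    if (err) throw err;
    db.collection("things").updateOne(
        { a: 5 },
        { $set: { a: 3, "new": "one" } },
        function (err, result) {
            if (err) throw err;
            console.log(result.modifiedCount);  // 1 if a document matched
            db.close();
        });
});
```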
[23:04:45] <afp> https://puu.sh/qfYhC/0720a4eb0b.png Here is an example of what I mean
[23:04:59] <afp> I have taken a mongodump from a standalone mongod instance on a box I have, moved the data to a new box and have tried to do mongorestore and it appears to be stuck in some form of a continuous loop
[23:05:05] <afp> can anyone answer this question pls?
[23:13:27] <afp> I left it running overnight
[23:13:30] <afp> and it's still the same status
[23:14:11] <afp> is it because i am doing mongodump from standalone and then mongorestore to a replset ?
[23:17:37] <cheeser> that should be fine. you're importing against the primary?
[23:17:47] <cheeser> i would expect you'd have to but ...
[23:18:15] <cheeser> check the logs on the server you're importing to and see what's there
[23:20:35] <afp> I fixed it, I noticed the 2 other secondaries were in state STARTUP
[23:20:47] <afp> I changed the hostname of the primary (member id 0) to its IP address
[23:20:48] <afp> to fix it
[23:20:49] <afp> why is that?
[23:21:51] <afp> lol
[23:21:56] <afp> #justmongodbthings
[23:24:41] <cheeser> there are certain ... sensitivities with hostname/IPs in cluster configs.
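The fix afp describes, as shell commands run against the primary; the IP address is a placeholder:

```javascript
cfg = rs.conf();
cfg.members[0].host = "10.0.0.5:27017";  // replace the unresolvable hostname
rs.reconfig(cfg);
rs.status();                             // secondaries should leave STARTUP
```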
[23:32:05] <afp> Lol come on man
[23:32:08] <afp> :P
[23:32:20] <afp> It does not make sense to me :/ I have bound to 0.0.0.0
[23:32:39] <afp> and it configured its name by itself, via its own FQDN
[23:32:46] <afp> which is a valid DNS A record
[23:32:52] <afp> ~_~