PMXBOT Log file Viewer


#mongodb logs for Tuesday the 12th of May, 2015

[00:00:35] <joannac> godzirra: "what's wrong with it" is very vague. Do you see an error? Does it not return what you expect? What's your evidence that something is wrong with it?
[00:57:42] <godzirra> joannac: Sorry, I get an error when I run it, saying "$near requires geojson point, given { type: \"Point\", coordinates: [ 36.0, -115.0 ] }"
[00:59:01] <joannac> okay. so fix the point.
[00:59:23] <joannac> here are the docs: http://docs.mongodb.org/manual/reference/operator/query/near/
[00:59:37] <joannac> your point is not valid.
[01:03:02] <godzirra> argh. This whole time my test just had longitude and latitude reversed. :/
[01:03:07] <StephenLynx> lol
[01:03:18] <godzirra> Now I just need to figure out why my $near query doesn't return anything.
[01:03:27] <godzirra> Actually that's fixed too. Awesome.
[01:03:48] <godzirra> joannac: Thanks for pointing out the obvious. (no, that wasn't sarcastic. I seriously mean thanks. :)
[01:04:35] <joannac> no probs
[01:04:44] <joannac> and yes, it's longitude, then latitude. It's the GeoJSON spec :(
[01:07:48] <godzirra> Yeah, I know. I had it correct in my inserts.
[01:07:51] <godzirra> Just not in my query.
[01:08:20] <godzirra> Now I just have to figure out why node seems to hang after processing my inserts (and failing on all of them)
[01:09:21] <godzirra> Awesome. Thanks guys. :)
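A minimal sketch of the well-formed query godzirra ended up with; the field name "location" and the distance are illustrative, not from the log:

```javascript
// A $near query document in the shape the server expects. GeoJSON orders
// coordinates as [longitude, latitude], which is what tripped godzirra up.
function nearQuery(lng, lat, maxMeters) {
  return {
    location: {
      $near: {
        $geometry: { type: "Point", coordinates: [lng, lat] },
        $maxDistance: maxMeters
      }
    }
  };
}

// 36°N, 115°W: longitude first.
const q = nearQuery(-115.0, 36.0, 5000);
console.log(q.location.$near.$geometry.coordinates); // [ -115, 36 ]
```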
[02:05:38] <shlant> anyone using mms: is it possible to have a different instance type for arbiters? Can I deploy 2 members with one type and then an arbiter for the same replset with a t2.micro or something?
[02:21:17] <joannac> shlant: sure, you just need to deploy first, then distribute processes the way you want
[06:47:28] <amitprakash> Hi, I have a collection with 10m records on mongo 2.8 (mmapv1 engine).. what would be the fastest way to restore this to a mongo 3.0 instance (WiredTiger)?
[06:48:04] <amitprakash> Currently I am syncing by adding the 3.0 instance as a replicaset member, however it's taking over an hour to sync
[06:48:11] <amitprakash> Is there a better/faster way?
[09:16:17] <nixstt> is this normal disk IO usage for mongodb 3.0.2? http://i.imgur.com/Y8dTQhy.png
[10:48:12] <pamp> hi, is there any equivalent command for touch in wiredtiger?
[11:25:14] <bogn1> so nixstt, from looking at your disk IO graph, I suppose you're pushing time series data into hour documents. We're doing that and have those sawtooth query runtimes as well. Finding out why is on our list. Seek time shouldn't be that big of an issue with structured hashes having minutes at top level and seconds below. That's only 59+59 jumps as per MongoDB's documentation. Another candidate is record moves due to padding not being available any longer in 3.0.2.
[11:26:34] <nixstt> bogn1: yeah pushing data into hourly documents here, noticed this since testing 3.0.2 also having some memory usage issues it’s using way too much memory (much more than specified in the config file)
[11:27:04] <bogn1> 3.x removed the padding factor which preallocated dynamically
[11:27:28] <bogn1> with 3.0.2 you have to take care of this yourself
[11:27:49] <bogn1> or use powerOf2 allocation which dramatically increases collection size
[11:28:02] <bogn1> I think the latter is the default
[11:28:05] <Derick> powerof2 is the default now though IIRC
[11:28:28] <bogn1> but that might be too slow as well
[11:28:55] <bogn1> we're looking at this: http://docs.mongodb.org/ecosystem/use-cases/pre-aggregated-reports/
[11:29:03] <bogn1> preallocating the stuff up-front
[11:29:38] <bogn1> no need to incrementally increase the document by powerOf2
[11:29:45] <bogn1> which should hurt as well
[11:30:27] <nixstt> hmmm I might give that a try, that use case looks a lot like what I’m doing :)
[11:30:49] <bogn1> the magic here seems to minimize the overhead of manual preallocation
[11:30:54] <bogn1> seems to be
[11:31:56] <bogn1> I will look into stored procedures for not having to send the bulky {0: {0: Infinity, ...}, ...} structures for every pre-allocation
[11:32:02] <bogn1> of the next hour that is
[11:32:22] <bogn1> because that hit me when the pre-allocation probability was high
[11:34:11] <nixstt> I’m actually using timestamps as keys in my subdocuments but that seems to be a really bad idea looking at that article
[11:34:36] <bogn1> depends very much I think
[11:35:02] <bogn1> I think with wired tiger the approach would be wholly different
[11:35:21] <bogn1> as that uses appends instead of in-place updates
[11:36:58] <bogn1> updates look like this {$set: {"values.11.23": 2.03}}
[11:37:19] <bogn1> with above structure that is also outlined in the article
[11:39:46] <bogn1> are you using a schema similar to this one on slide 45: https://www.mongodb.com/presentations/mongodb-time-series-data
[11:40:35] <nixstt> Yes like this {$set: {"d.1431430607" => $large_data_array}}
[11:40:58] <nixstt> one document per hour, with one upsert every minute
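A rough sketch of the hour-document scheme from the pre-aggregated-reports article linked above; the field names ("metric", "hour", "values") and the null placeholder are illustrative assumptions:

```javascript
// Pre-allocate all 60x60 minute/second slots up front, which avoids the
// document growth (and record moves) bogn1 mentions.
function preallocatedHourDoc(metric, hourStart) {
  const values = {};
  for (let m = 0; m < 60; m++) {
    values[m] = {};
    for (let s = 0; s < 60; s++) values[m][s] = null; // placeholder slot
  }
  return { metric: metric, hour: hourStart, values: values };
}

// One sample becomes an in-place update on a dotted "values.<min>.<sec>"
// path, matching the {$set: {"values.11.23": 2.03}} shape quoted in the log.
function sampleUpdate(date, reading) {
  const path = "values." + date.getUTCMinutes() + "." + date.getUTCSeconds();
  return { $set: { [path]: reading } };
}

const u = sampleUpdate(new Date(Date.UTC(2015, 4, 12, 11, 23, 45)), 2.03);
console.log(u); // { '$set': { 'values.23.45': 2.03 } }
```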
[11:41:41] <bogn1> upsert? You mean an update of a sub-document, don't you?
[11:42:40] <nixstt> I use upsert for these hourly statistics
[11:42:42] <bogn1> ok, that's nitpicking sorry
[11:43:29] <bogn1> I have the upsert parameter set as well, to avoid "exists?"
[11:43:41] <nixstt> it’s hard for me to preallocate all the data, i’m saving for example the disk IO statistics you saw in that image, sometimes a disk might be added/removed and the whole structure changes
[11:44:09] <bogn1> that's why I have a document per metric
[11:44:29] <bogn1> per metric, per hour that is
[11:44:56] <bogn1> which also has its implications I'd say
[11:45:53] <nixstt> Yeah I like to just grab a document and have all the metrics in there instead of grabbing everything from different collections
[11:46:26] <nixstt> my application uses quite a bit of memory cause of that but it’s a dashboard type application so it’s not heavily used all the time
[11:47:00] <bogn1> wiredtiger might be interesting for you, it is for us as well, as that doesn't need to pre-allocate
[11:47:17] <nixstt> this is wiredtiger already using it
[11:47:37] <bogn1> we did that as well
[11:47:46] <bogn1> performance wasn't good
[11:47:55] <nixstt> it does save me a ton of space I went from 140 gb to 22gb
[11:48:00] <bogn1> because our schema and update mechanics were not suited for it
[11:48:05] <bogn1> exactly
[11:48:18] <bogn1> but we delayed the move to the new update mechanics
[11:49:28] <bogn1> my not-very-informed impression is that you should no longer $set, but instead $push
[11:49:34] <Naeblis> What should be the design for a field which can take multiple types of responses? I have a "discussion" model and a "response" model. Response is currently text only, but I'd like it to also include things like polls etc.
[11:54:30] <nixstt> giving $push a try now
[11:55:24] <bogn1> have you seen my direct message?
[11:56:01] <bogn1> and also this seems to be a mixture of the multiple metrics schema with the indexed time-slots: https://www.mongodb.com/presentations/webinar-internet-things-bosch-concept-code
[11:58:33] <bogn1> its memory requirements might be lighter
[11:59:09] <bogn1> but I think the indexed time-slots approach doesn't mix well with WiredTiger but that's just an initial impression
[12:01:07] <bogn1> A question to the broader audience. Does anybody know, whether it's possible to see a graph of record moves (due to document growing) in MMS?
[12:02:37] <nixstt> Not sure, haven't used MMS in a while
[12:02:57] <bogn1> I'm currently tracking it with a cron-job
[12:03:20] <bogn1> and sending the value picked out of serverStatus' json into grafana
[12:03:28] <bogn1> it works okaish
[12:33:10] <tadeboro> Hi all. I'm wondering how pure the map function of map/reduce should be.
[12:33:41] <tadeboro> Would doing this mess up anything:
[12:34:23] <tadeboro> delete this._id.field; emit(this._id, this.value);
[12:35:02] <tadeboro> Or do I need to make a copy of _id field and operate on that?
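A sketch of the safer variant tadeboro asks about: copy _id and strip the field from the copy, leaving `this` untouched, since map functions are expected to be pure. The names "field" and "value" come from the question; the simulated document is illustrative:

```javascript
// Build a map function around an injected emit, so it can be exercised
// outside the server.
function mapFn(emit) {
  return function () {
    const key = Object.assign({}, this._id); // shallow copy of the key
    delete key.field;                        // mutate only the copy
    emit(key, this.value);
  };
}

// Simulate the server calling map with a document bound to `this`:
const emitted = [];
const doc = { _id: { name: "a", field: "drop-me" }, value: 42 };
mapFn(function (k, v) { emitted.push([k, v]); }).call(doc);
console.log(emitted);        // [ [ { name: 'a' }, 42 ] ]
console.log(doc._id.field);  // 'drop-me' (original untouched)
```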
[14:09:45] <abyss> Hi
[14:12:41] <abyss> I'm reading about sharding and replication. For my requirements sharding is too much so I chose replica. I'm reading about replica: http://docs.mongodb.org/master/MongoDB-replication-guide.pdf and I have a question, because that doc mentions a heartbeat.
[14:14:10] <abyss> if three nodes are working: primary -> secondary -> secondary, then everything is ok, because I can add a heartbeat between the secondary nodes: secondary <-heartbeat-> secondary and put reads there... But when the primary falls down, one of the secondaries becomes primary... Am I right?
[14:15:09] <abyss> so... What happens with writes? How can I handle that, I mean, avoid doing writes to a secondary node... Should I handle this in the heartbeat, or how?
[14:16:10] <lietu> I've been reading a bit on some practical aspects of mongodb replication .. apparently regardless of what we set write concern etc. values to, there is absolutely no way to say what data is in the database if you're running a replicaset? if you write with a write concern of e.g. 3 with 4 nodes out of which 2 are down .. it will write to the 2 that are up, and hang indefinitely until more are coming up .. if you set wtimeout it won't "hang"
[14:16:11] <lietu> but you have no idea if something was written or if it will eventually be written to the DB .. so is there a way to rollback the failing write, or kill the query and write the old data back, or how do people deal with this in practice?
[14:16:49] <lietu> and the indefinite hanging is of course not a very practical solution for .. well .. really any purpose
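A sketch of how lietu's situation surfaces in practice. With {w: 3, wtimeout: ...} the server acknowledges only once three members hold the write; on timeout it reports a write-concern error (code 64, WriteConcernFailed), but the write itself is not rolled back and may still replicate later, so the caller compensates (e.g. issues a delete) rather than assuming the data is gone. The helper below is an illustrative pattern, not driver API:

```javascript
// Options object in the shape the driver/shell accept for per-write concern.
function writeOptions(members, timeoutMs) {
  return { writeConcern: { w: members, wtimeout: timeoutMs } };
}

// Interpret the outcome: a wtimeout is NOT a failed write, only an
// unacknowledged one; the data may be on the primary and replicate later.
function classifyWriteError(err) {
  if (!err) return "replicated to the requested members";
  if (err.code === 64) {
    return "timed out waiting for replication; write may still apply";
  }
  return "write failed: " + err.message;
}

console.log(classifyWriteError(null));
console.log(classifyWriteError({ code: 64 }));
```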
[14:35:59] <unseensoul> How do I access a particular field from a document returned by findOne?
[14:36:36] <StephenLynx> document.fieldName
[14:37:04] <unseensoul> StephenLynx: great. Thanks :)
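As StephenLynx says, findOne hands back a plain JavaScript object (or null when nothing matched), so fields are ordinary property accesses; the document below is illustrative:

```javascript
// A document as findOne would return it.
const found = { _id: 1, name: "ada", address: { city: "London" } };

console.log(found.name);          // 'ada'
console.log(found.address.city);  // 'London' (nested fields chain naturally)
console.log(found["name"]);       // bracket syntax works for computed keys
```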
[18:02:49] <DragonPunch> sadf
[18:02:52] <DragonPunch> is it possible
[18:02:53] <DragonPunch> to
[18:03:03] <DragonPunch> sort across multiple collections?
[18:03:30] <cheeser> no
[18:03:31] <cheeser> it
[18:03:36] <cheeser> isn't. that makes
[18:03:39] <cheeser> little sense
[18:06:58] <DragonPunch> Oh okay. I guess I have to put some data into collections and define it inside of the collection.
[18:07:14] <ChALkeR> Does anyone here run mongodb on debian stable?
[18:07:50] <ChALkeR> Are there any problems that I should expect when using oldstable mongodb packages?
[18:08:40] <ChALkeR> Btw, the topic is a bit off, 3.0.3 got released.
[18:11:03] <StephenLynx> someone needs to update the topic then
[18:12:23] <ChALkeR> Strange.
[18:12:35] <ChALkeR> It's not listed on https://www.mongodb.org/downloads
[18:12:44] <ChALkeR> But it got pushed to debian oldstable repo.
[18:13:38] <ChALkeR> http://repo.mongodb.org/apt/debian/dists/wheezy/mongodb-org/stable/main/binary-amd64/mongodb-org_3.0.3_amd64.deb
[18:13:54] <ChALkeR> (that's the meta package)
[18:14:02] <StephenLynx> yeah, mongo already pushed it to centOS too.
[18:14:05] <ChALkeR> Everything else is here: http://repo.mongodb.org/apt/debian/dists/wheezy/mongodb-org/stable/main/binary-amd64/
[18:14:11] <StephenLynx> don't know about centOS packages though.
[18:37:27] <fxmulder> when rsyncing a replica over another do I need to keep anything on the new replica or can I just rsync the whole data directory
[20:05:51] <Nepoxx> You know what sucks? Mongoose's callback documentation.
[20:06:41] <StephenLynx> You know what sucks? Mongoose
[20:07:13] <Nepoxx> Know any other good NodeJS ODMs?
[20:08:50] <StephenLynx> there are none.
[20:08:57] <StephenLynx> I just use node.js driver with io.js
[20:09:17] <StephenLynx> works perfectly well, I am pretty sure the dev keeps an eye on io.js development to support it.
[20:10:01] <Nepoxx> Hmm... I'm not sure I'm ready to use the barebone driver
[20:10:15] <StephenLynx> very well documented and straight forward.
[20:10:25] <StephenLynx> never had any issue.
[20:10:37] <Nepoxx> I've had my fair share of issues with Mongoose, I'll tell you that
[20:11:24] <doc_tuna> question why you are using an ODM versus just working with the driver
[20:11:28] <Nepoxx> I kinda like having a schema safety net
[20:11:56] <Nepoxx> Also, I'm coming from Java-Hibernate-Spring world, so I've come a long way :P
[20:12:01] <doc_tuna> i dont think apps generally need that much abstraction
[20:12:46] <StephenLynx> schema safety is useless.
[20:12:59] <StephenLynx> any error related to that is easily caught.
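A sketch of the "easily caught" point: a plain validation function gives a schema safety net without pulling in an ODM. Field names and rules here are illustrative assumptions, not from the log:

```javascript
// Validate a document before handing it to the driver; an empty array
// means it is safe to insert.
function validateUser(doc) {
  const errors = [];
  if (typeof doc.name !== "string" || doc.name.length === 0) {
    errors.push("name must be a non-empty string");
  }
  if (!Number.isInteger(doc.age) || doc.age < 0) {
    errors.push("age must be a non-negative integer");
  }
  return errors;
}

console.log(validateUser({ name: "ada", age: 36 })); // []
console.log(validateUser({ age: -1 }));
// [ 'name must be a non-empty string', 'age must be a non-negative integer' ]
```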
[20:13:35] <Nepoxx> removing Mongoose from my current project is going to be a PITA, however I'll try without on my next one. Or I might try Monk, that looks promising too
[20:14:00] <StephenLynx> nah
[20:14:02] <StephenLynx> its crap too
[20:15:38] <Nepoxx> https://github.com/mafintosh/mongojs or https://github.com/mongodb/node-mongodb-native?
[20:21:12] <StephenLynx> the second one, it is supported by 10gen
[20:21:20] <StephenLynx> but you just need the mongodb module on npm
[20:21:57] <StephenLynx> not to mention it has ten times more commits.
[20:22:10] <StephenLynx> 7 times more contributors
[20:22:21] <StephenLynx> and is on 10gen github account.
[20:28:32] <Nepoxx> I'm not refuting your arguments :P
[20:54:35] <ehershey> 3.0.3 is out!
[21:05:20] <GothAlice> Apparently proper $slice during aggregate projection was added. Wewt!
[21:05:49] <GothAlice> It's a good day when JIRA tickets I watch get closed.
[21:06:08] <Nepoxx> "good day" and "JIRA" in the same sentence
[21:06:13] <GothAlice> Rare, eh?
[21:06:14] <Nepoxx> don't belong**
[21:06:17] <Nepoxx> haha :P
[23:05:08] <StephenLynx> TTL is taking too long to drop documents.
[23:05:19] <StephenLynx> any common issue people have to cause this?
[23:05:32] <StephenLynx> it takes several seconds after its expiration time.
[23:06:03] <StephenLynx> ah
[23:06:11] <StephenLynx> "Warning: The TTL index does not guarantee that expired data will be deleted immediately. There may be a delay between the time a document expires and the time that MongoDB removes the document from the database."
[23:13:19] <GothAlice> StephenLynx: Known thing, as you noticed. My use of MongoDB as a cache takes that into account when accessing cached values.
[23:14:04] <GothAlice> StephenLynx: https://github.com/marrow/cache/blob/develop/marrow/cache/model.py#L126-L128
[23:14:24] <GothAlice> Reminds me: I need to set a "to hell with reality" write concern on that delete.
[23:15:52] <StephenLynx> I am using it for flood control. I just added a $gte: current date on the query and removed the unique key for ip.
[23:16:05] <GothAlice> That'd work, too.
[23:18:13] <GothAlice> I prefer throwing away invalid responses, rather than having MongoDB do the extra document/index scanning, as I'm looking up a unique key (I'm using _id, in fact), so it's less computational power needed overall, esp. if the delete is a fire-and-forget, which it is now.
[23:19:40] <GothAlice> (If the TTL index hasn't gotten to it, in the off chance that I hit the minute window for the entry *I'll* delete it.)
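A sketch of the pattern discussed above: because the TTL monitor only runs periodically (roughly once a minute), the lookup filters on the expiry field itself and treats anything past its time as already gone. The field name "expires" is illustrative; the index spec mirrors the TTL documentation's expireAfterSeconds form:

```javascript
// TTL index on the expiry timestamp: documents are removed once "expires"
// is in the past (expireAfterSeconds: 0), but only when the monitor runs.
const ttlIndex = { key: { expires: 1 }, expireAfterSeconds: 0 };

// Lookup that ignores entries the TTL monitor should have removed but
// has not gotten to yet.
function freshEntryQuery(id, now) {
  return { _id: id, expires: { $gt: now } };
}

const q = freshEntryQuery("cache-key", new Date("2015-05-12T23:00:00Z"));
console.log(q.expires.$gt instanceof Date); // true
```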