PMXBOT Log file Viewer


#mongodb logs for Wednesday the 20th of April, 2016

[02:58:40] <Ryzzan> testing, 1, 2, 3
[02:59:23] <Ryzzan> any good tutorial for definitively learning how to deal with aggregation on a populate result?
[03:00:19] <Ryzzan> or would anyone help with it? :)
[03:29:20] <Progster> I'm trying to create a web app using the MEAN stack, with mongoose for the ODM. It seems that an ODM isn't really an ORM, or at least not like the ones I've used before. With ORMs, I could translate from domain models to the DB and back, but mongoose doesn't really do that type of mapping. It seems that I have to hand-roll a set of domain models and then write translators from the mongoose models to them. Or does developing on the MEAN stack presuppose that all
[03:29:20] <Progster> of your models, the way they're shown in views, are the way they should be saved in the DB? That would be a big issue for separation of concerns
[05:36:21] <Skaag> does _id have to be an object?
[05:36:53] <Skaag> I have a situation where I'm replicating data from PostgreSQL to a MongoDB collection. I know that in the PG table, the 'id' field is unique, since PG handles that just fine
[05:37:47] <Skaag> And I see in the mongodb docs that: "The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array"
[06:06:29] <joannac> Skaag: _id can be whatever you want. It does not have to be an ObjectID
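
A minimal mongo shell sketch of joannac's point; the collection name and values are made up:

    // _id may be of any non-array type, as long as it is unique in the collection
    db.replicated.insert({ _id: 42, source: "pg" })        // integer key taken from PostgreSQL
    db.replicated.insert({ _id: "row-43", source: "pg" })  // a string key works too
    db.replicated.insert({ _id: 42, source: "pg" })        // fails: duplicate key error
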
[07:23:26] <mk1> hello there. I'm trying to export a collection from my db and would like to EXCLUDE a single field. mongoexport seems to only provide a projection (i.e. include fields). Any ideas without resorting to a scripting language?
[07:27:36] <kurushiyama> mk1: Projection can be used to include and exclude fields. http://pastebin.com/YAHa8S5Z
[07:28:10] <mk1> kurushiyama: could you paste it on gist? my company's proxy is blocking pastebin
[07:29:10] <kurushiyama> mk1: https://gist.github.com/mwmahlberg/1017f01cdaef960e488f054eaf8ce30b
[07:30:50] <mk1> sorry, I wasn't clear enough. I'd like to use mongoexport in the command line
[07:31:12] <kurushiyama> mk1: Same rule applies: exclude via document...
[07:32:00] <kurushiyama> Ah, export
[07:32:04] <kurushiyama> wait a sec
[07:32:22] <mk1> it has only a --fields option :-/
[07:33:31] <kurushiyama> Well, then denote the fields. Or do an aggregation with a corresponding $project and an $out stage, and dump that collection.
[07:33:48] <kurushiyama> The $out collection, that is.
[07:34:44] <mk1> is it even possible to use an aggregation?
[07:35:17] <kurushiyama> mk1: On the _shell_
[07:35:34] <kurushiyama> mk1: Do the aggregation via the shell first, then do the dump.
[07:36:27] <mk1> any pointers? I'm sorry asking dumb questions. I'm usually not in the db department of the project
[07:37:25] <kurushiyama> mk1: Gimme 5, need a smoke.
[07:37:34] <mk1> k
[07:40:39] <kurushiyama> Dang, does not work as expected. Gimme a sec...
[07:41:52] <kurushiyama> mk1: I fear you have to specify the fields manually.
[07:45:06] <mk1> I figured. Well, it's only a few, so that's okay. Hopefully there won't be any changes in the next version. Thanks anyway
[07:54:21] <kurushiyama> mk1: You might want to file a feature request.
[07:56:46] <mk1> yeah, I should
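
A sketch of the workaround discussed above, with hypothetical database, collection, and field names. At the time, $project apparently only supported inclusion (apart from suppressing _id), which would explain why the exclusion attempt failed; listing the fields to keep does work:

    // mongo shell: keep everything except the unwanted field by listing the rest
    db.source.aggregate([
        { $project: { name: 1, email: 1, createdAt: 1 } }, // fields to keep
        { $out: "source_trimmed" }                         // write the result to a new collection
    ])

    # then dump the trimmed collection from the command line
    mongodump --db mydb --collection source_trimmed

    # or do it directly with mongoexport, denoting the fields to keep
    mongoexport --db mydb --collection source --fields name,email,createdAt --out trimmed.json
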
[08:52:14] <siyb> is pmxbot an official logging service to this channel?
[09:04:51] <Derick> siyb: yeah - sort of. If it were up to me, it'd get booted
[09:05:49] <siyb> Derick: as long as it's official, I am ok with it; that's the channel ops' call. Personally, I dislike logging services like that, but it's not my call to make ;)
[09:06:59] <Derick> yes, I dislike them too
[09:07:19] <Derick> not sure who runs it actually
[09:19:12] <siyb> does morphia understand JPA annotations, i.e. can I use the JPA annotation equivalents instead of the mongodb ones?
[09:34:34] <siyb> wow, the API of morphia is truly awful Oooo
[09:47:10] <kurushiyama> siyb: Actually, I liked it. It is only awful when you try to use it as an RDBMS replacement ;)
[09:48:51] <siyb> kurushiyama: the annotation API is awful; you have very little control over what you are doing, because the annotations do not support getters where it would be required. In addition, they could have used the JPA standard where applicable, so that you do not have to write adapter objects for your pojos
[09:49:35] <kurushiyama> siyb: Huh? Wait a sec, they must have changed _a lot_ then.
[09:50:59] <siyb> kurushiyama: the @Id annotation cannot be used on a getter, for instance, and JPA @Entity is not recognized, etc. I understand that some things are necessary design decisions, but the annotations seem broken to me
[09:52:07] <kurushiyama> siyb: In case you try to use MongoDB as a drop-in replacement (which is what I conclude from your statements), you are setting yourself up for trouble.
[09:53:14] <kurushiyama> siyb: Taking an SQL approach to data modelling (identify entities, their properties, and their relations to each other) is a pretty sure way of breaking scalability and performance.
[09:53:42] <siyb> kurushiyama: haha, nope that's not what i am doing, but i am actually contemplating switching to couchbase later on (actually my sysop is doing that)
[09:54:20] <siyb> at least i am not planning to use mongodb as a drop in replacement for rdbms
[09:54:30] <kurushiyama> siyb: Well, I came from the Java world myself and had to make the same decision. tbh, there are very few things CouchDB does better than MongoDB.
[09:54:43] <siyb> kurushiyama: not couchdb, couchbase
[09:55:21] <kurushiyama> siyb: Sorry, misread that.
[09:55:48] <siyb> i have been using mongodb for a few years now; in general i am very happy, but there are a few quirks that annoy me, and our sysop is not happy about the possibilities of mongodb when it comes to the ops part of things
[09:56:17] <siyb> soooo, i am using mongodb now but i am planning ahead for a possible switch in the future
[09:56:23] <kurushiyama> siyb: For example? Just curious...
[09:57:23] <siyb> kurushiyama: for me it's some of the broken API (don't use map/reduce, etc); for our sysop, it's mainly fear of data loss.
[09:57:39] <kurushiyama> siyb: Argh........
[09:58:00] <kurushiyama> siyb: The data loss is only an issue if you choose the wrong write concern
[09:58:34] <kurushiyama> siyb: People ranting about losing data with MongoDB _always_ used a write concern of 0 or 1.
[09:58:36] <siyb> kurushiyama: he has had to deal with data loss due to messed-up sharding before; afaik, it's still an issue, even when picking the correct WC
[09:58:54] <kurushiyama> siyb: Which is the rough equivalent of sending a UDP message
[09:59:25] <kurushiyama> siyb: In sharding? Well, then sb f... it up big time.
[09:59:40] <kurushiyama> siyb: With an emphasis on "sb"
[10:00:01] <siyb> kurushiyama: he took over from another op
[10:00:08] <siyb> anyway, i am just keeping my options open
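
For reference on the write-concern point above: the concern is set per operation in the mongo shell; a minimal sketch with illustrative names:

    // w: 0 is fire-and-forget (the "UDP" case); w: 1 waits for the primary only;
    // w: "majority" waits until a majority of the replica set has acknowledged
    db.events.insert(
        { type: "login", at: new Date() },
        { writeConcern: { w: "majority", wtimeout: 5000 } }
    )
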
[10:01:33] <kurushiyama> Well, I chose a different approach. Unix-inspired, I chose one document DB and learned it to the point where I understood it.
[10:10:41] <jokke> hi
[10:12:10] <jokke> i've read into schema design for time-series data and i'd be happy to receive any sources for comparisons between MMAPv1 and WiredTiger regarding these schema design approaches. As in: since WiredTiger, what has changed?
[10:13:00] <jokke> the talks and articles i saw/read were from 2014 which was before WiredTiger afair.
[10:47:05] <kurushiyama> jokke: Basically everything.
[10:47:19] <kurushiyama> jokke: First and foremost document level locking.
[10:47:40] <kurushiyama> jokke: There are no document migrations any more.
[10:47:51] <kurushiyama> jokke: There is no padding any more.
[10:48:07] <Derick> jokke: Documents are no longer updated in place
[10:51:39] <kurushiyama> jokke: The default compression of the data files enhances hardware utilization and tends to give you more bang for the buck.
[10:52:22] <kurushiyama> jokke: Bottom line: Unless you have _very_ good reasons to do so, use WT instead of MMAPv1.
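
For reference, the storage engine is selected when mongod starts (MongoDB 3.0+; WiredTiger became the default in 3.2); the dbpath here is illustrative:

    # start mongod with WiredTiger explicitly
    mongod --storageEngine wiredTiger --dbpath /data/db
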
[10:53:03] <kurushiyama> Derick: Are there any detailed performance comparisons available?
[10:54:10] <Derick> kurushiyama: not that I'm aware of
[10:54:46] <kurushiyama> Derick: I tend to refer to https://objectrocket.com/blog/company/mongodb-wiredtiger but this is a bit shallow, imho.
[10:57:04] <Derick> https://objectrocket.com/blog/company/mongodb-wiredtiger
[10:57:05] <Derick> hah
[10:57:11] <Derick> i was just reading that to paste it :D
[10:57:45] <Derick> i'm sure there are some others, but can't find them now
[10:57:49] <kurushiyama> I'd like to see the data sets.
[10:57:58] <kurushiyama> And ops.
[10:59:07] <Derick> he mentions sysbench-mongodb - i think that's a standard set
[10:59:25] <Derick> https://github.com/tmcallaghan/sysbench-mongodb
[10:59:56] <kurushiyama> Well, I'll have to google that. My personal experience is similar or even better, though. In one project, we achieved orders of magnitude better insert performance.
[11:00:12] <kurushiyama> Derick: Thanks for the service ;)
[11:00:18] <Derick> that's alright :)
[12:08:11] <owg1> I'm looking into benchmarking mongodb on different filesystems; do you think that tool would be a good place to start? The issue I am facing is creating a test that will actually use the disk, not just RAM, for reads.
[14:05:19] <quattro_> is there any difference in performance between $inc and $set? I'm keeping a collection with average memory usage (in bytes) and using $inc the values are getting very large
[14:42:32] <StephenLynx> I'd guess $inc does more work behind the scenes.
[14:42:42] <StephenLynx> since it has to work on the existing value
[14:42:51] <StephenLynx> while $set just puts whatever you give it in place.
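
The two operators side by side in the mongo shell, with made-up names; $inc adds to the stored value while $set overwrites it, which matches the growth quattro_ observed:

    db.metrics.update({ _id: "host1" }, { $inc: { memBytes: 1048576 } }) // adds to the current value
    db.metrics.update({ _id: "host1" }, { $set: { memBytes: 1048576 } }) // replaces the current value
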
[15:15:15] <Ryzzan> hi
[15:15:49] <Ryzzan> ok... after populating, how am i supposed to aggregate ($group) the results of this population?
[15:16:18] <Ryzzan> (sry about my bad english, brazilian here)
[15:19:52] <Ryzzan> anyone? :)
[15:20:20] <StephenLynx> hm
[15:20:27] <StephenLynx> I know you can use group on aggregation
[15:20:42] <StephenLynx> you could also use a map reduce, but that's usually not worth it
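
A minimal $group sketch in the mongo shell; the collection and fields are hypothetical:

    db.orders.aggregate([
        { $group: {
            _id: "$customerId",          // the group key
            total: { $sum: "$amount" },  // per-group sum
            count: { $sum: 1 }           // per-group document count
        } }
    ])
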
[15:23:28] <mkjgore> hey folks, so I seem to be having an issue where my replica sets in my shard cluster are always (based on iotop) reporting constant disk reads of ~800KB
[15:23:49] <mkjgore> which occasionally jump up to 36 MB for a few seconds
[15:23:58] <mkjgore> this is on all the members of the replica sets on both shards
[15:24:33] <zylo4747> We are running MongoDB v2.6.7. For the past week we have seen a stack trace occur on different nodes of a single 3 node replica set a few times. I don't know how to analyze it to find the issue. Can someone guide me?
[15:24:36] <mkjgore> is there a way to see what's been going on in the mongo instance to cause all this?
[15:28:20] <mkjgore> I've tried to turn on profiling via "db.setProfilingLevel(2)" for a short while (30 seconds) to capture "normal" bad behavior up until "very bad" behavior, then shut it off with "db.setProfilingLevel(0)", but to no effect
[15:28:44] <mkjgore> I observed the behavior with shell tools, but when I went to admin.system.profile the collection was empty
[15:28:49] <mkjgore> ¯\_(ツ)_/¯
[15:44:04] <deathanchor> mkjgore: you turn on profiler per db, not per server
[15:44:31] <deathanchor> turning it on your admin db is pointless unless you have issues with that db
[15:46:40] <deathanchor> mkjgore: use suspectDBname; db.setProfilingLevel(2); db.system.profile.find();
[15:46:57] <deathanchor> if you do it on a secondary, add db.setSlaveOk()
[15:47:39] <deathanchor> on a secondary, the system.profile collection gets removed when you turn off the profiler and attempt to drop it and drop the connection from your shell.
[15:48:22] <deathanchor> I recommend tailing your mongo logs before doing profiler in production
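
deathanchor's steps, spelled out as a mongo shell sketch (the database name is hypothetical):

    // profiling is enabled per database, not per server
    use suspectDBname
    db.setProfilingLevel(2)                             // 2 = profile all operations
    // ... reproduce the slow behavior, then inspect the newest entries:
    db.system.profile.find().sort({ ts: -1 }).limit(5)
    db.setProfilingLevel(0)                             // profiler off again
    // on a secondary, run db.setSlaveOk() first so the reads are allowed
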
[15:53:15] <owg1> I'm running those benchmarks on different filesystems like I mentioned yesterday
[15:53:24] <owg1> It's pretty interesting.
[15:55:23] <StephenLynx> just throw ext4 at it :^)
[15:56:06] <owg1> StephenLynx: Yeah, it is looking like the fastest at the moment; we wanted to investigate how ZFS performs with various options though, so that is the main focus.
[15:57:15] <owg1> So far: http://i.imgur.com/Bc8MwIu.png
[15:57:17] <bros> I am running 10 threads that are querying a subdocument (an array of objects inside a document)
[15:57:38] <bros> I'm noticing 200% CPU usage on a two core machine. Is it due to the fact that I am using subdocuments?
[15:57:44] <owg1> Using https://github.com/tmcallaghan/sysbench-mongodb
[15:58:09] <StephenLynx> 6.6k inserts per second
[15:58:11] <StephenLynx> daium
[15:59:02] <owg1> StephenLynx: Is that a lot? It is just running on a 4GB VM on Linode
[15:59:57] <StephenLynx> I wouldn't have any concern at all with that speed.
[16:00:21] <StephenLynx> especially considering mongo's focus is read-intensive scenarios.
[16:00:27] <bros> owg1: was that to me?
[16:01:37] <owg1> bros: Not aimed at anyone in particular, just thought some people may be interested
[16:11:42] <bros> owg1: you using wiredtiger?
[16:12:29] <zylo4747> Can someone look at this stack trace I'm getting in production? (MongoDB v2.6.7, Win2012). http://pastebin.com/U1qwxFRU
[16:12:51] <zylo4747> We've gotten it twice this week already and we've had it a few times in the past. I don't know if there's anything I can do to identify / prevent the issue.
[16:13:12] <zylo4747> When it happens, the replica set member crashes and the service has to be restarted
[16:14:21] <bros> https://www.airpair.com/postgresql/posts/sql-vs-nosql-ko-postgres-vs-mongo Does MongoDB still lose data like mentioned in this article?
[16:16:08] <StephenLynx> it never did.
[16:16:32] <mkjgore> @deathanchor: is there a way to get a server-wide analysis? the wall we're hitting is that there are ~100 dbs on this server, though they're all mostly tiny, making per-db analysis an issue
[16:18:26] <mkjgore> I only ask here because all arrows point to no
[16:29:47] <deathanchor> mkjgore: you could change the server startup option: https://docs.mongodb.org/manual/reference/configuration-options/#operationProfiling.slowOpThresholdMs
[16:29:50] <deathanchor> have you tried tailing your mongodb logs first?
[16:29:50] <deathanchor> 90% of the time I see all slow crap in there
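
The log-based route deathanchor recommends, as a sketch: with the profiler off, operations slower than the slowms threshold are still written to the mongod log, and the threshold can be adjusted at runtime:

    // level 0 = profiler off; the second argument sets slowms (here 100 ms),
    // so slow operations still show up in the mongod log
    db.setProfilingLevel(0, 100)
    // then watch the log, e.g.: tail -f /var/log/mongodb/mongod.log
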
[16:32:06] <owg1> bros: Yeah I am using all the mongodb defaults
[16:53:44] <owg1> Is there any way I can check that fsync is being run by mongodb?
[16:54:42] <cheeser> you can use the appropriate write concern
[16:55:56] <owg1> cheeser: Do you know if sysbench-mongodb will be doing that?
[16:56:11] <cheeser> i have no idea what that even is, so no. :)
[16:56:37] <owg1> cheeser: I'm using this benchmarking tool: https://github.com/tmcallaghan/sysbench-mongodb
[16:56:56] <owg1> And I want to make sure that the data is actually getting synced to the disk, but I'm not sure what the best way of checking that is
[16:57:26] <cheeser> that's ... not great code.
[16:57:55] <cheeser> looks like you can pass "fsync_safe" in the WC parameter position
[16:57:59] <cheeser> https://github.com/tmcallaghan/sysbench-mongodb/blob/master/src/jmongosysbenchexecute.java#L111
[16:58:11] <owg1> I just noticed it is using `write concern = SAFE` by default
[16:58:16] <cheeser> yep
[16:58:23] <owg1> Will that be using fsync?
[16:58:59] <cheeser> everything gets fsync'd eventually. that's how things get to disk. the fsync WC level simply waits for that to happen before returning.
[16:59:49] <owg1> Cool, thanks that makes sense.
[17:00:26] <owg1> Annoyingly, I've run all my tests with the default SAFE option; I shall re-run one with FSYNC_SAFE and hope there isn't a massive difference!
[17:02:47] <cheeser> it'll be slightly slower
[17:09:36] <kurushiyama> I would like to add that calling fsync in the write concern reduces the interval in which the data is synced to disk, iirc. owg1: A write concern > 1 is usually enough; it rarely happens that 2 servers fail at the same time. If you want to be really sure, set wc = ceil(members/2)
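
In write-concern terms, the options discussed in this thread look roughly like this in the mongo shell; a minimal sketch, the collection name is made up:

    db.bench.insert({ x: 1 }, { writeConcern: { w: 1 } })          // acknowledged only (the benchmark's SAFE)
    db.bench.insert({ x: 1 }, { writeConcern: { w: 1, j: true } }) // also waits for the journal commit (closest to FSYNC_SAFE)
    db.bench.insert({ x: 1 }, { writeConcern: { w: 2 } })          // waits for a second replica set member as well
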
[18:11:30] <mkjgore> @deathanchor: It seems that the performance issue has nothing to do with ops coming from outside, but with some sort of internal ops. is there anything like this that we can look into?
[18:12:18] <mkjgore> I'm wondering if some "housekeeping" functionality might have gotten borked after we ran the restore operation
[18:26:11] <mikebommarito> How come I cannot do a mongorestore on 3.2 with a .gz exported from another 3.2 instance?
[18:26:48] <mikebommarito> here is the command that is failing str8 from the docs: sudo mongorestore --gzip --archive=NAMEOFARCHIVE.gz -db NAMEOFNEWDATABASE
[18:26:53] <cheeser> is the file store the same?
[18:27:10] <mikebommarito> ya built off same image on AWS
[18:27:17] <cheeser> i don't think mmapv1 dumps can be applied to a WT instance, e.g.
[18:28:25] <mikebommarito> ya when I run it I get this but it doesn’t create the DB… guess I’ll do it the old way: mongodump / tar / transfer to new server / untar / restore
[18:28:44] <mikebommarito> here is what the output is tho: 2016-04-20T18:02:53.018+0000 creating intents for archive
[18:28:44] <mikebommarito> 2016-04-20T18:03:50.045+0000 done
[18:29:15] <cheeser> dumb question: you're sure the gzip isn't empty?
[18:29:48] <mikebommarito> that’s a great question
[18:29:49] <mikebommarito> hold on
[18:29:59] <cheeser> *fingers crossed* :D
[18:30:08] <mikebommarito> no it’s 323mb
[18:30:13] <cheeser> bah! :D
[18:30:43] <cheeser> try passing -vvvvv to mongorestore and see if you get anything new
[18:31:08] <mikebommarito> ok I will try. love 3.2 and wiredtiger, my dbs were compressed by like 90%
[18:31:10] <mikebommarito> crazy
[18:34:42] <mikebommarito> Ya, I am getting crazy output now. I think it has to do with the .gz having all the collections .gz’ed too
[18:35:05] <mikebommarito> demux namespaceHeader: {DBNAME COLLECTIONNAME false 0}
[18:35:08] <mikebommarito> than at the end
[18:35:29] <mikebommarito> 2016-04-20T18:26:11.530+0000 demux End
[18:35:29] <mikebommarito> 2016-04-20T18:26:11.530+0000 demux finishing (err:<nil>)
[18:35:31] <mikebommarito> 2016-04-20T18:26:11.530+0000 restoring up to 4 collections in parallel
[18:35:31] <cheeser> your .gz has .gz files in it? that doesn't sound right.
[18:35:32] <mikebommarito> 2016-04-20T18:26:11.530+0000 will listen for SIGTERM and SIGINT
[18:35:34] <mikebommarito> 2016-04-20T18:26:11.530+0000 starting restore routine with id=1
[18:35:35] <mikebommarito> 2016-04-20T18:26:11.530+0000 ending restore routine with id=1, no more work to do
[18:35:36] <mikebommarito> 2016-04-20T18:26:11.530+0000 starting restore routine with id=0
[18:35:36] <cheeser> don't paste here
[18:35:39] <mikebommarito> sorry
[18:35:42] <mikebommarito> first time
[18:35:50] <mikebommarito> it even asked if I was sure :)
[18:36:23] <mikebommarito> I’ll go the old way I have been doing it… just glad I have a place to ping people that are way smarter than me.
[18:36:58] <cheeser> or at least more experienced. ;)
[18:37:19] <mikebommarito> positive way to look at it thanks!
[19:18:39] <kurushiyama> cheeser: Maybe a über-gz containing the dump files?
[19:19:08] <cheeser> yeah. but that sounds wrong.
[19:20:38] <kurushiyama> Well, it does not make sense, that is for sure.
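
Two hedged observations on the command pasted earlier. The flag is spelled `-db` there, while the documented spelling is `--db` (or `-d`), which may be part of the problem. Also, when restoring from an archive, `--db` reportedly selects which database inside the archive to restore rather than renaming it, so filtering on a database name that is not present in the archive would restore nothing, which would match the "creating intents ... done" output. A corrected sketch, with the same placeholder names:

    # documented flag spelling; with --archive, --db selects which database
    # inside the archive to restore (it does not rename it)
    mongorestore --gzip --archive=NAMEOFARCHIVE.gz --db NAMEOFORIGINALDATABASE
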
[20:32:26] <Doyle> Hey. In the mongodb docs for mongodump --oplog, it says "no effect when running mongodump against a mongos instance to dump the entire contents of a sharded cluster" Will --oplog work when a collection isn't sharded?
[20:32:33] <Doyle> or not at all against a mongos?
[22:45:32] <kurushiyama> Doyle: access via mongos == no effect, regardless of sharding status of collection. --oplog only has an effect when run against a replset directly.
[22:48:03] <Doyle> ah, obviously, no oplog on mongos server
[22:48:11] <Doyle> thanks kurushiyama, it just clicked
[22:48:30] <kurushiyama> Doyle: You are very welcome!
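
A sketch of the distinction in the last exchange, with hypothetical host names:

    # against a replica set member: --oplog captures a consistent point-in-time dump
    mongodump --host rs0-node1:27017 --oplog --out /backups/rs0
    # against a mongos, --oplog has no effect (per the docs quoted above)
    mongodump --host mongos1:27017 --out /backups/cluster
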