#mongodb logs for Wednesday the 5th of June, 2013

[00:00:52] <bjori> generally
[00:01:09] <bjori> Aartsie: are you just running mongodb on localhost? then that's perfectly safe yes
[00:01:19] <bjori> if you have 50+ shards.. maybe not
[00:01:38] <Aartsie> oke :)
[00:01:59] <Aartsie> how can you update a database with 50+ shards
[00:02:15] <Aartsie> I really love mongodb but have a lot to learn
[00:04:22] <ackspony> what does that mean
[00:05:00] <ackspony> oh, he said shards
[00:05:21] <ackspony> Aartsie, if you have replication implemented, don't just do an apt-get upgrade
[00:05:28] <ackspony> that's all he's saying :p
[00:18:54] <bjori> it should be fine with replica sets, just start with the secondaries and execute stepDown on the primary before doing apt-get upgrade on the primary
[00:19:08] <bjori> but in a large sharded environment things become a little bit more tricky :)
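
A minimal sketch of the rolling upgrade bjori describes, assuming a standard replica set and the 10gen Ubuntu packages (service and package names are placeholders; adjust for your setup):

    // 1. On each secondary host, upgrade and restart:
    //    $ sudo service mongodb stop
    //    $ sudo apt-get upgrade mongodb-10gen
    //    $ sudo service mongodb start
    // 2. Connected to the current primary in the mongo shell, make it
    //    hand over to an already-upgraded secondary:
    rs.stepDown(60)   // won't seek re-election for 60 seconds
    // 3. Run the same package upgrade on the old primary.
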
[00:59:42] <ehershey> /2
[00:59:57] <ehershey> doh
[01:35:05] <ckd> anyone from 10gen happen to be awake?
[06:42:27] <bjori> it's better to actually ask the question than to just ask if someone is around..
[06:55:33] <Kosch> rehi
[06:56:19] <Kosch> Is it possible to configure replication so it is performed in a synchronous way?
[07:57:14] <platonic> mornin'
[07:57:36] <platonic> i seem to remember there used to be some sort of mongo iops testing tool somewhere - i just can't find it. anyone got a pointer?
[07:57:56] <Nodex> mongostat?
[07:58:01] <Kosch> Hm. I've got a replica set with a master and a slave. I stopped the master, and rs.status() on the secondary node shows that it has recognized the master is gone, but the secondary does not become primary. Inside the log I see "replSet can't see a majority, will not try to elect self". So the secondary never becomes primary?
[07:58:08] <Nodex> mongoperf?
[07:58:22] <Nodex> mongotop?
[07:58:35] <platonic> mmmnope. think it was a bit of a hack
[07:59:00] <platonic> doesn't directly relate to mongo (i.e. doesn't actually need a mongodb install). but actually, i'll try to dig deep in those to see if i can't figure out what's going on
[07:59:33] <double_p> Kosch: you need an odd number of replSet members (at least 3) to vote. you can use "empty" mongod instances to fulfill that
[07:59:48] <Kosch> double_p: ah, k. thx
[08:00:46] <Kosch> double_p: the arbiter stuff, isn't it?
[08:01:20] <platonic> think mongostat can at least help out a bit
[08:01:48] <platonic> anyone care to see if they can point me in the right direction with my issue? it's a really simple one, one that i'm sure you've never heard before.. 'my mongo is actually running insanely slow and i don't know why'
[08:02:11] <double_p> Kosch: yep
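
A sketch of the arbiter approach double_p and Kosch land on, assuming a two-member set named rs0 (host and port are placeholders):

    // Start a lightweight arbiter on a third machine or a spare port:
    //   $ mongod --replSet rs0 --port 30000 --dbpath /data/arb
    // From the current primary, add it as a non-data-bearing voter:
    rs.addArb("arbiter.example.com:30000")
    rs.status()   // with 3 voters, the secondary can now elect itself
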
[08:02:25] <platonic> fresh rhel6 install, nothing else running on it, 12gb ram, fast 15k sas disks...
[08:02:57] <double_p> how big is the dataset?
[08:03:40] <Nodex> is it building indexes?
[08:03:49] <Kosch> double_p: another question: Do I need to configure all members of a replica set at client configuration?
[08:04:02] <platonic> dataset is damn near empty at the moment, it's inserting that's slow
[08:04:04] <platonic> from mongostat:
[08:04:08] <platonic> http://pastie.org/private/u7adtmw3cyj3xbjkeqa
[08:04:21] <platonic> essentially, right now i'm hammering it as much as i can with updates (upserts).
[08:04:32] <platonic> and it's going between 8-13.
[08:04:47] <platonic> refresh rate is per second on that default mongostat so i presume that's actually the performance i'm getting
[08:05:00] <double_p> Kosch: i think it depends on the driver. but usually the replset members will be known to the client. if you leave some out in the configuration, you must ensure that the ones in the configuration are alive at client startup
[08:05:23] <Nodex> platonic : does it start off slow or gradually get slower?
[08:05:37] <double_p> like, if you add members to the replset while client is running, at some point (refresh whatever) they'll be contacted by the clients
[08:05:41] <platonic> i can wipe it and see what happens
[08:05:48] <platonic> right now it has 96k documents in it
[08:05:59] <platonic> i don't seem to remember it being /this/ slow in the beginning
[08:06:08] <platonic> one thing that concerns me though is that i don't see mongo actually taking any memory
[08:06:27] <platonic> Mem: 12133992k total, 1145916k used, 10988076k free, 210264k buffers
[08:06:29] <platonic> that's from top
[08:07:11] <platonic> 907m virt, 88m res, 47m shr
[08:07:12] <Kosch> double_p: ic. What I thought. I use the NodeJS MongoDB driver... but after shutdown of the primary, the application fails. Can this be caused by not having a primary at the moment?
[08:07:13] <double_p> free memory w/ mongo? HOW did that happen? :D
[08:07:28] <platonic> is what the process is using..
[08:07:32] <platonic> i've checked ulimits
[08:08:31] <platonic> ulimits for the pid if anyone's interested: http://pastie.org/private/mnrjmn8tqo9rtso3zpcniw
[08:09:27] <platonic> nodex: if i wipe it now and restart my insertion, any tips on how to look at it? can't use mms unfortunately (firewall issues for now.. :/)
[08:09:33] <Nodex> is there an index on the collection / query you're inserting?
[08:09:37] <platonic> there is
[08:09:43] <platonic> i can try to remove it and try again
[08:09:51] <Nodex> it could well be the index
[08:10:58] <Nodex> is read performance also bad?
[08:11:14] <platonic> don't think so actually
[08:11:26] <platonic> haven't tried reading except through robomongo, and that feels pretty responsive
[08:12:23] <Nodex> dunno what that is sorry
[08:12:24] <platonic> i've really flooded mongo with, well, about a million update requests with writeconcern 0 at the moment just to see
[08:12:37] <platonic> just dropped the indexes - should i see any difference in performance then?
[08:12:53] <platonic> or do you think i would be better off restarting the operation
[08:13:11] <platonic> i'm kind of hoping it might be the disks being slow for some reason (hence why i tried to find that iops thing)
[08:14:09] <platonic> i presume though that upserts (update with upsert:true) are generally slower than inserts - that's fine enough.
[08:14:23] <Nodex> test your disks with an external application to see throughput
[08:19:50] <platonic> just to make sure i'm not expecting too much here though - 10 upserts per second IS slow right?
[08:20:21] <platonic> just as a general feeling
[08:32:13] <Nodex> yes, that's not very fast
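
For readers following along, the operation under discussion is an update with upsert:true; unlike a plain insert it must first search for a match, so the query fields need an index (collection and field names below are invented):

    // Plain insert: no lookup, just a write.
    db.events.insert({ name: "foo", count: 1 })

    // Upsert: mongod first searches for a matching document; without
    // an index on "name", every one of these is a full collection scan.
    db.events.update(
      { name: "foo" },
      { $inc: { count: 1 } },
      { upsert: true }
    )
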
[08:47:40] <ggherdov> Hi all. Does anybody happen to know the name of the ubuntu package for the C++ mongodb drivers?
[08:48:20] <ggherdov> also: is such a package provided at the 10gen ubuntu repo, http://downloads-distro.mongodb.org/repo/ubuntu-upstart ?
[09:07:38] <ggherdov> Hi again. Getting a bunch of errors while compiling the tutorial for the C++ drivers. Here are the code, the compilation command and the errors: http://bpaste.net/show/vXN0IXjthUUC64FUVDog/ what am I doing wrong?
[09:08:08] <ggherdov> BTW the ubuntu package for C++ drivers is mongodb-dev
[09:09:30] <remonvv> \o
[09:09:57] <remonvv> Does anyone know where I can find a detailed spec about what happens and what guarantees are given during removeShard/addShard?
[09:49:56] <dsds__> Hi all! I'm using nodejs + mongodb (and the native driver). Should I create a new MongoClient for every request I receive in the server? Right now I'm using a single instance of the MongoClient shared by every request to communicate with mongo, and it works quite well actually, but I'm used to MySQL where you must end connections as soon as possible, and something seems off...
[10:03:25] <remonvv> dsds__, I have no experience with node.js but the answer is almost certainly no. Never do heavy lifting on a per-request basis.
[10:07:37] <dsds__> remonvv, have you got experience with any mongo driver? I suspect that behavior must be similar in other drivers
[10:19:20] <Nodex> in the PHP driver the driver maintains the connection pool
[10:19:40] <Nodex> i.e. you call connect and it either uses an empty connection or it makes a new one
[10:42:32] <dsds__> well, it seems that actually you only open the client once, and then reuse the db in each request (according to the nodejs driver author), so I guess my approach is OK, thanks for the answers
[10:43:59] <remonvv> dsds__, yes, and in most other drivers it's a singleton, heavy-weight object
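
A sketch of the open-once pattern dsds__ settled on, for the Node.js native driver of that era (module layout and names are illustrative, not from the thread):

    // db.js -- connect once at startup and share the handle;
    // the driver maintains its own connection pool internally.
    var MongoClient = require('mongodb').MongoClient;
    var db = null;

    exports.connect = function (cb) {
      MongoClient.connect('mongodb://localhost:27017/app', function (err, database) {
        if (err) return cb(err);
        db = database;
        cb(null, db);
      });
    };

    exports.get = function () { return db; };
    // Request handlers call exports.get() instead of reconnecting.
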
[10:56:15] <khushildep> Hi all
[10:56:31] <khushildep> Can anyone help with this (http://pastebin.com/h3FaTDcP) error please?
[11:08:19] <remonvv> khushildep, what kind of help are you expecting? There's zero information in that paste other than that something's crashing.
[11:14:37] <Nodex> something looks like it's not working in that paste to me :P
[11:15:45] <Kosch> have you tried turning it off and on again? *scnr*
[11:37:20] <dirk_> Hi, anyone here that could possibly help me out with a mapreduce issue on a sharded collection?
[11:41:52] <Nodex> better to just ask the question
[11:42:03] <khushildep> yeah - I'm basically trying to start it and it's just dumping out - it's a really minimal config - just a database path and a log path.
[11:57:08] <dirk_> Ok here goes. I am doing a map reduce on a collection with out: reduce set to a large sharded collection. But it seems like the data never gets to the shards, looking at postProcessCounts.
[11:57:41] <dirk_> If I do the same against a small collection with the same indexes and sharded as well, postProcessCounts shows that it has written the new data.
[11:58:36] <dirk_> The only difference I can come up with is that one of the collections is small and the other is large.
[12:07:08] <dirk_> It somehow just doesn't get to the reduce stage when the out: reduce option points to a large collection. I am at a loss as to how to fix this.
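
For context, the rough shape of the operation dirk_ describes — a mapReduce whose output is reduce-merged into an existing sharded collection (map/reduce bodies and collection names are placeholders, not from the thread):

    db.source.mapReduce(
      function () { emit(this.key, this.value); },          // map
      function (key, values) { return Array.sum(values); }, // reduce
      // merge results into the sharded output collection
      { out: { reduce: "aggregated", sharded: true } }
    )
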
[12:33:06] <ggherdov__> Hi all, problems installing pymongo on Ubuntu. I can "import pymongo", but not "from pymongo import MongoClient"; look: http://bpaste.net/show/6Y9mHc9z2gdaRQDlcV40/ Is the client class in another package? What is my mistake?
[13:01:56] <harenson> ggherdov__: hi, how are you installing pymongo?
[13:03:23] <harenson> ggherdov__: import pymongo; then: connection = pymongo.MongoClient('mongodb://localhost')
[13:09:02] <ggherdov__> harenson: via apt-get, and I am on ubuntu 13.04, which ships pymongo 2.2-4. and the file mongo_client.py https://github.com/mongodb/mongo-python-driver/blob/master/pymongo/mongo_client.py
[13:10:08] <ggherdov__> ... is not shipped
[13:12:03] <ggherdov__> harenson: I just finished this report that I hoped to hand to 10gen somehow to get some hints http://bpaste.net/show/wMp96gy8I66Ix1tX9OEp/
[13:12:03] <ggherdov__> but maybe it is as simple as that: pymongo 2.2-4 *does not have* MongoClient
[13:13:26] <harenson> ggherdov__: you have to exit from the current ipython, bpython or whatever you're using
[13:13:38] <harenson> then try again with: from pymongo import MongoClient
[13:15:23] <ggherdov__> harenson: done that but no luck. But this is NOT surprising: the file mongo_client.py is *not* in my system. The MongoClient class is in that file. no file, no class.
[13:23:10] <d1b1> morning. Is this the correct place to pose an index question?
[13:37:17] <harenson> ggherdov__: try opening a new shell
[13:37:45] <harenson> ggherdov__: or create a new .py file
[13:41:17] <harenson> ggherdov__: create a new .py file with this content http://pastebin.com/gj2vZYYG
[13:42:14] <harenson> then execute it, e.g.: python /path/to/file.py
[13:42:31] <harenson> what's the output?
[13:53:32] <pinvok3> Hey there, I'm pretty new to mongodb and I come from long-time SQL development, so it's hard to get the whole syntax. I have this layout: http://pastebin.com/U7a10gcu I want to count the errors.key.message in this array. I have the same message twice. What should the aggregation look like? I just don't get it. Thanks in advance
[13:57:53] <harenson> pinvok3: could you paste a mongodb output?
[13:58:16] <harenson> pinvok3: I mean, the same query, but, the results in json
[14:01:20] <pinvok3> harenson: http://pastebin.com/M3zhbXzP the 2048 and the 8 keys are the id of a PHP error. I think it was E_WARNING or so.. It's a simple error log and I want to have some statistics about the most frequently occurring errors and warnings
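
The pastebins are gone, so this is only a guess at the aggregation pinvok3 is after: unwind the errors array and group on the message (collection and field names are assumptions):

    // Count how often each message occurs, most frequent first.
    db.errorlog.aggregate([
      { $unwind: "$errors" },
      { $group: { _id: "$errors.message", count: { $sum: 1 } } },
      { $sort: { count: -1 } }
    ])
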
[14:02:59] <Nodex> you should avoid naming your keys as numbers
[14:05:04] <pinvok3> Nodex: Is it a "you should" or an "it's important that you don't use numbers"? It's easier to work with those numbers in php when I've fetched the data
[14:07:35] <Nodex> they will work but when it comes to adjusting objects/arrays you will run into problems
[14:15:32] <ggherdov__> harenson: thank you for your support; I ended up re-installing pymongo with pip this time, got a newer version (2.5.something) and it all went fine. Ubuntu just packages some obsolete crap
[14:15:33] <pinvok3> Nodex: What other approach would you suggest? Save every message in a separate collection and link an id?
[14:23:03] <Nodex> in a separate collection? No, a separate document maybe
[14:23:17] <Nodex> collection ~ table.... document ~ row
[14:23:26] <Nodex> roughly speaking
[14:32:27] <platonic> nodex: think i figured out why upserts took so damn long
[14:33:36] <Nodex> what was it platonic ?
[14:33:39] <platonic> lookup:{receiver:{fullAddress:[{name: xx, addr1: yyy, zip: 242, place: nnn},{...}]}}
[14:33:58] <platonic> basically i was upserting based on a fullAddress object
[14:34:16] <platonic> and noticed that it's going daaaamn
[14:34:17] <Nodex> pinvok3 : how I learned Mongo quickly was to forget everything I knew about SQL and treat objects like PHP arrays and/or JSON objects
[14:34:25] <platonic> slow.
[14:34:34] <Nodex> I did wonder how large your upsert was
[14:35:01] <platonic> so anyway, inserts went quickly, about 4000/s or so
[14:35:20] <platonic> so now that i have my data, i've started querying
[14:35:23] <platonic> and that's fine
[14:35:37] <platonic> except if i try to find a specific address.. then it takes about 1.5 seconds
[14:35:45] <harenson> ggherdov__: that's better
[14:35:49] <redsand_> looking it up by _id ?
[14:35:49] <Nodex> add an index and it won't ;)
[14:35:50] <platonic> even adding an index to the array of objects.
[14:35:58] <harenson> ggherdov__: you're welcome :D
[14:36:15] <platonic> yeah, see, that's the thing - thought that the index would help - but doesn't seem to at the moment
[14:36:22] <platonic> so i think i may need to reconsider my structure
[14:36:47] <Nodex> you might have the wrong index ;)
[14:37:14] <platonic> well, can't see how i can index it better..
[14:37:24] <platonic> key: [{obj}]
[14:37:28] <platonic> i can only index the key
[14:37:45] <platonic> or rather, can i actually index inside the object inside the array?
[14:37:57] <Nodex> depends what your document looks like
[14:38:08] <Nodex> please pastebin a typical document
[14:38:17] <platonic> will do
[14:38:18] <Nodex> (remove sensitive information)
[14:38:26] <platonic> yup ;)
[14:40:18] <platonic> http://pastie.org/private/uhx29jufsjvuytp19sbvig
[14:40:45] <platonic> index is currently on lookup.receiver.fullAddress
[14:40:47] <Nodex> is receiver only ever 2 addresses?
[14:40:57] <platonic> that's the problem, it's 1:N
[14:41:08] <platonic> so i'm thinking of restructuring the document logic if i have to
[14:41:25] <platonic> just making sure i actually have shot myself in the foot with the current document structure
[14:41:38] <Nodex> you can do a compound index on 4 fields
[14:41:42] <platonic> there are two use cases for this - one is easy - using services.xxxx.active and updateClass as keys
[14:42:11] <platonic> nodex: so you're saying i can actually index exactly inside lookup.receiver.fullAddress.fullname for example?
[14:42:17] <Nodex> yeh
[14:42:20] <platonic> damn
[14:42:34] <platonic> here i thought that i could only index up until the array and not inside of it
[14:42:37] <Nodex> but you're limited on the number of indexes a collection can have
[14:43:25] <platonic> how so?
[14:43:34] <platonic> i'm not expecting to have more than 5 or so.
[14:43:36] <platonic> aha
[14:43:45] <platonic> "err" : "ns name too long, max size is 128",
[14:44:03] <Nodex> you will need to rename your keys to make it work
[14:44:10] <Nodex> a.b.c.d
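
Pulling the index discussion together: the compound multikey index works, and the 128-character limit is on the full namespace including the index name, which defaults to the concatenated key paths. Besides shortening the document keys, giving the index an explicit short name also gets under the limit (collection and index names here are invented):

    db.shipments.ensureIndex(
      { "lookup.receiver.fullAddress.fullname": 1,
        "lookup.receiver.fullAddress.adr1": 1 },
      { name: "recv_addr" }   // short explicit name instead of the default
    )
    db.shipments.getIndexes()  // verify the name and key paths
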
[14:44:46] <platonic> funky.
[14:45:28] <Nodex> large keys are a huge waste of space, mongo doesn't compress them (at the moment)
[14:45:34] <platonic> so the name is the combined length of all the keys
[14:45:44] <platonic> for the index that is
[14:45:46] <Nodex> db.foo.getIndexes();
[14:45:53] <Nodex> see for yourself how it's made up
[14:46:04] <platonic> bwahahahahaha.... 0.002 seconds for the find that previously took 1.5 seconds.
[14:46:23] <Nodex> try a different query, that one may have been cached
[14:46:28] <platonic> "lookup.receiver.fullAddress.fullname_1_lookup.receiver.fullAddress.adr1_1" : 93059232
[14:47:36] <platonic> yup, still ~0.002s
[14:47:40] <platonic> beautiful
[14:48:01] <Nodex> ;)
[14:48:05] <platonic> think i actually got my head mixed up from some old typical walkthroughs.. like indexing arrays of tags
[14:48:26] <platonic> since, well, there's no other way. this is just. heh. awesome.
[14:50:27] <hectron> That was interesting.
[14:50:54] <Nodex> make sure you drop your old index else it will take up space
[14:51:35] <platonic> yup, i'll rebuild before we start heavy development anyway - probably wise to trim the keys regardless
[14:53:48] <Nodex> I wrote a map for mine, I can feed into it what I want and it gives me back the corresponding mongo key / solr key / redis key w/e
[14:56:38] <hectron> Asking once more since I got a lag spike earlier: I am going to be participating in a hackathon tomorrow and wanted to leverage MongoDB. I wanted to add functionality to an existing website, and include a Javascript file that collects user information and saves it to my db. Do I need Node.js?
[14:57:56] <Nodex> you want to take data from a website and add it to mongodb?
[15:02:46] <hectron> Yes.
[15:02:58] <harenson> hectron: no
[15:03:26] <harenson> hectron: you could use ajax + your server side programming language of preference
[15:03:57] <hectron> I see. I was hoping to have a drop-in solution, which is the reason that I was leaning towards javascript.
[15:05:00] <hectron> The applications I would be implementing this in would be C# websites and potentially Python/Django websites.
[15:06:24] <harenson> hectron: http://api.jquery.com/jQuery.ajax/
[15:07:29] <harenson> hectron: if you want to use javascript, I recommend to you jQuery
[15:08:09] <hectron> Thank you harenson!
[15:09:03] <harenson> you're welcome hectron :D
[15:11:52] <hectron> Pardon the noobery, but this is my first time actually creating something rather than following or maintaining.
[15:11:57] <hectron> So it's exciting.
[15:13:59] <harenson> hectron: #IknowThatFeelBro http://th03.deviantart.net/fs70/PRE/f/2011/288/7/a/i_know_that_feel_bro_by_rober_raik-d4cxn5a.png
[15:14:13] <hectron> rofl.
[15:25:27] <hectron> My registration icon for the event is the feels guy. So I think that it's definitely relevant.
[15:27:38] <kurtis> Hey guys, I am looking for a good "caching scheme". We're doing 'real-time' analysis and I'd like to use existing analysis results (cached) combined with the later results. The primary place I'm running into trouble is determining whether or not it's safe to save a "date" (or objectid) and simply grab all of the objects that came afterwards. Any suggestions?
[15:28:38] <Nodex> chuck the results in redis/memcache and check that first?
[15:31:40] <kurtis> Nodex, Well -- I'll definitely be using redis. But, I'm trying to identify a sane methodology for separating "what's already been processed" and "what's new and needs to be processed". As a simple example -- we have millions of MongoDB documents. I need to take a single field, count the number of occurrences, sort it, and return it as a list. The more data we have, the slower this operation becomes
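
One common pattern for kurtis's checkpoint question, sketched with assumptions: an ObjectId embeds its creation time in its first four bytes, so if _id is a driver- or server-generated ObjectId you can remember the highest _id processed and fetch only newer documents (caveats: one-second resolution, and ordering across multiple writers is only approximate). Collection name is invented:

    // Remember the newest _id seen in this analysis pass:
    var last = db.events.find().sort({ _id: -1 }).limit(1).next()._id;

    // Next pass: only documents inserted after the checkpoint.
    db.events.find({ _id: { $gt: last } })

    // Or build a synthetic checkpoint from a wall-clock time
    // (8 hex chars of epoch seconds + 16 zeros = a 24-char ObjectId):
    var ts = Math.floor(Date.now() / 1000).toString(16);
    db.events.find({ _id: { $gt: ObjectId(ts + "0000000000000000") } })
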