PMXBOT Log file Viewer

#mongodb logs for Thursday the 1st of October, 2015

[03:29:38] <sorabji> howdy. i'm working on an application to tail the oplog in a write-intensive mongo cluster. i have a hidden secondary that i'm connecting to and creating a tailable cursor to its local oplog
[03:31:04] <sorabji> the issue i'm having is keeping up with the rate at which documents are being inserted into the oplog. i'm wondering if there are any pointers you can offer to speed the process up
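
For reference, a minimal sketch of a tailable oplog cursor in the legacy mongo shell of that era. The oplogReplay option is the usual pointer for speeding up ts-bounded oplog scans; reading local.oplog.rs assumes a replica-set member.

    // grab the most recent oplog timestamp, then tail everything after it
    var oplog = db.getSiblingDB("local").oplog.rs;
    var last = oplog.find().sort({ $natural: -1 }).limit(1).next().ts;
    var cur = oplog.find({ ts: { $gt: last } })
        .addOption(DBQuery.Option.tailable)     // cursor stays open at the end
        .addOption(DBQuery.Option.awaitData)    // block briefly waiting for new entries
        .addOption(DBQuery.Option.oplogReplay); // fast ts-based positioning
    while (cur.hasNext()) { printjson(cur.next()); }
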
[06:48:57] <plodder_> hello
[06:49:23] <plodder_> can anyone tell me what are some popular web apps that use mongodb?
[06:52:22] <plodder_> e.g. chat clients, discussion forums
[06:52:50] <plodder_> or specialized apps, preferably open source
[06:59:54] <Boomtime> @plodder_: https://www.mongodb.org/community/deployments
[07:00:41] <Boomtime> scroll down that page, you probably want to start with 'use cases'
[07:01:16] <Boomtime> but what you're asking for is going to produce a very long list - that list is just a selected set of novel deployments
[07:09:51] <plodder_> @Boomtime thanks for the list, but most of it seems to be proprietary
[07:10:07] <plodder_> would you know of any open source stuff?
[07:11:20] <plodder_> that i can tinker with?
[07:14:24] <psuriset__> ping
[07:14:39] <Boomtime> pong
[07:15:02] <psuriset__> To run Mongodb with numa interleave on centos
[07:15:02] <Boomtime> @plodder: dunno any specific, but you'll find plenty of projects on github using mongodb in some form
[07:15:28] <psuriset__> i see that my nodes are interleaving fine
[07:15:28] <plodder_> @Boomtime: Ok. Thanks for your help!
[07:15:29] <Boomtime> @psuriset__: http://docs.mongodb.org/manual/administration/production-notes/#mongodb-and-numa-hardware
[07:15:29] <psuriset__> numactl --interleave=all /usr/bin/mongod -f /etc/mongodb.conf
[07:16:00] <psuriset__> Boomtime, i checked there. i did my work before checking.
[07:16:13] <Boomtime> @psuriset__: yeah, mileage varies, push it hard and you'll probably see differences
[07:16:14] <psuriset__> Boomtime, looks like mongodb fails to start after interleave.
[07:17:15] <psuriset__> Boomtime, For centos, i don't find init scripts. i suspect it to be an init issue.
[07:17:34] <psuriset__> Boomtime, Any centos init script to do numa interleave?
[07:19:10] <Boomtime> do you get a mongod log, or anything in syslog?
[07:21:02] <psuriset__> Boomtime, yes.
[07:21:14] <psuriset__> 1 sec. let me send in pastebin
[07:23:54] <psuriset__> Boomtime, https://gist.github.com/psuriset/6f1dfed1ad5ad68141c8
[07:24:40] <psuriset__> Boomtime, Two things. 1) Thu Oct 1 03:19:03.084 [initandlisten] exception in initAndListen: 10310 Unable to lock file: /var/lib/mongodb/mongod.lock. Is a mongod instance already running?, terminating
[07:25:05] <psuriset__> 2) Interleave is already done, as you see in the logs. still it mentions to interleave
[07:25:29] <Boomtime> -> exception in initAndListen: 10310 Unable to lock file: /var/lib/mongodb/mongod.lock. Is a mongod instance already running?, terminating
[07:25:57] <psuriset__> Boomtime, yes. same thing. i killed mongod and started with numactl interleave
[07:26:20] <psuriset__> Boomtime, then when i start mongod service i see this
[07:27:19] <Boomtime> yep, sorry, i was looking at the log at the same time as you apparently..
[07:27:38] <psuriset__> Boomtime, Without interleave : https://gist.github.com/psuriset/b465710fdb4560322a25
[07:27:48] <psuriset__> it dowks fine as expected
[07:28:20] <psuriset__> s/dowks/works
[07:28:53] <Boomtime> what the heck.. can you repeat that experiment?
[07:29:03] <psuriset__> Boomtime, tried multiple times.
[07:29:47] <Boomtime> and whenever you specify interleaving (which doesn't "take" btw since the process reports none in both cases), the lockfile error occurs instead?
[07:30:15] <psuriset__> Boomtime, After a little googling, i realised an init script needed to be added. Added that too.
[07:30:17] <psuriset__> Boomtime, https://gist.github.com/psuriset/6df05a7410ebe0fe8cb8
[07:31:07] <psuriset__> Boomtime, interleaving is happening and it's listening on the port. i feel interleave is working.
[07:31:21] <psuriset__> Boomtime, as you see its interleaving between nodes 0-1
[07:33:12] <psuriset__> Boomtime, looks like similar issue: https://jira.mongodb.org/browse/SERVER-6008
[07:33:12] <psuriset__> ?
[07:34:09] <Boomtime> centos isn't deb though right? and fixed before your version... otherwise, yeah, looks similar
[07:35:54] <psuriset__> Boomtime, yes. looks similar
[07:37:31] <Boomtime> ok, can you test this without the init.d? like, try using a command-line and don't even bother with --fork or log, just try launching a server direct in a shell
[07:38:17] <Boomtime> when possible, you should at least get the latest 2.4 too, or better, upgrade to the latest 3.0
[07:39:00] <psuriset__> Boomtime, ok.
[07:41:05] <psuriset__> Boomtime, https://gist.github.com/psuriset/36df14fb44a7428729af
[07:42:01] <psuriset__> Boomtime, my idea is to get perf numbers with/without interleave. Of course interleave would be good.
[07:42:25] <psuriset__> Boomtime, Just want to make sure how other components will perform in such a scenario
[07:51:03] <psuriset__> Boomtime, could it be a bug?
[07:51:30] <psuriset__> Boomtime, i am running 2.4.9
[12:10:57] <terabyte> hey
[12:12:49] <terabyte> in my mongodb directory I have a number of files. myschema.0,.1,.2,.3,.4,.5 they double in size each time. I understand each file is 'preallocated' so given the latest is 2gb i haven't actually used the 2gb yet. just wondering, is the database data all contained in the latest 2gb file, or is it spread across all 5 files?
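
The log never answers this, but with the MMAPv1 storage engine of that era, data fills the files in order, so it is spread across the earlier files; the newest (largest) file is mostly preallocated headroom. db.stats() shows the gap between what is used and what is allocated:

    // dataSize = bytes actually holding data; fileSize = bytes preallocated on disk
    db.stats(1024 * 1024)   // scale argument: report sizes in MB
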
[12:30:28] <Grummfy> hello, a question about how to do something. I need to check, on a single collection, all records that match an element 'a' against another element 'b', but 'b' is in an array. So $elemMatch + $where is not possible, so i tried aggregate + $unwind, but $where is not possible inside. So except with a map-reduce I don't see an easy way to do it
[12:33:36] <sorabji> howdy. i'm working on an application to tail the oplog in a write-intensive mongo cluster. i have a hidden secondary that i'm connecting to and creating a tailable cursor to its local oplog
[12:33:51] <sorabji> the issue i'm having is keeping up with the rate that documents are being inserted into the oplog. i'm wondering if there's any pointers you can offer to speed the process up
[12:43:22] <Bookwormser> Good morning. Did the method for importing a schema change between 2.6.9 and 3.0.6? I keep getting a "bad JSON array format - found no opening bracket '[' in input source" error, though the schemas have never contained brackets.
[12:43:30] <Bookwormser> It worked with 2.6.9
[12:50:06] <Grummfy> solved my issue with a big where function
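
Grummfy's "big where function" isn't shown; a minimal sketch of the shape such a query might take, assuming a scalar field a and an array arr of scalars (both field names invented here):

    // $where runs JavaScript per document, so it cannot use indexes and is slow
    db.coll.find({
        $where: function () {
            return this.arr && this.arr.indexOf(this.a) !== -1;
        }
    })
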
[12:56:10] <jamiel> Hi all, does WiredTiger do any prefixing of field names for collection storage like it does on indexes? i.e. Does it make using short field names redundant as a space-saving technique? Or will that still help..? Thanks.
[13:14:33] <Bookwormser> exit
[13:16:11] <NoReflex> hello! I have a collection of about 121 million rows with an average size of 168 bytes; I'm running a query which uses an index that should retrieve around 1.4 million rows
[13:16:50] <NoReflex> this query runs in about 5000 seconds on my machine while with postgres with the same database and index it runs in about 200 seconds (that is 25x faster in PG)
[13:18:59] <NoReflex> the collection holds data sent by many devices; ti represents the device, mo is the moment, rt is the record type while the other data is not useful for indexing
[13:19:46] <NoReflex> here you can see the definition in PG and the queries used: http://pastebin.com/92yyQU3w
[13:20:32] <NoReflex> the only difference is that some key-values are transformed into columns in postgres
[13:20:57] <NoReflex> (the ones that appear in every document)
[13:21:08] <NoReflex> where could this huge difference come from?
[13:26:10] <jamiel> Extra 10 years of development on PG? :) j/k ... you probably want to send us the explain() results for that query and the index definition
[13:29:00] <NoReflex> jamiel, index on Mongo: { "v" : 1, "key" : { "ti" : 1, "rt" : 1, "mo" : 1 }, "name" : "ti_1_rt_1_mo_1" }
[13:29:07] <Ulrar> Hi, I installed a mongodb for a client and I have some problems with access rights. I have a user with a userAdmin role on a database, but that user still gets "not authorized" errors when trying to insert data in that database
[13:29:25] <Ulrar> Any idea what could be going wrong ?
[13:32:28] <NoReflex> jamiel, I pasted the useful parts of the explain output here: http://pastebin.com/pTmmFN1j
[13:32:56] <NoReflex> I just used ... in order not to paste the whole ranges for the ti value
[13:35:21] <Ulrar> Okay my bad, I added the readWrite role and it works. feels a little strange that userAdmin can't write, but okay
[13:40:32] <jamiel> @NoReflex you should try creating the index with the highest-cardinality field first, or whichever would be the most restrictive. "rt" should definitely be the last field in the index, as it provides no real benefit. it really depends on your data, but you should experiment with { ti: 1, mo: 1, rt: 1 } and { mo: 1, ti: 1, rt: 1 } and compare results
[13:41:01] <jamiel> Also do a db.coll.stats(1024 * 1024) and check the size of the index whether it fits in memory or if the query is going to disk
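
A sketch of that experiment in the shell, with one of jamiel's candidate orders; the query shape, the rt value, and the date bounds are assumptions pieced together from the discussion:

    // one candidate order; re-run the query with .explain() and compare plans
    db.coll.createIndex({ ti: 1, mo: 1, rt: 1 })

    var start = ISODate("2015-01-01"), end = ISODate("2015-07-01")  // illustrative 6-month window
    db.coll.find({ ti: { $in: [ /* ~500 device ids */ ] },
                   rt: 1,
                   mo: { $gte: start, $lt: end } }).explain()

    // sizes in MB; an index that doesn't fit in RAM forces disk reads
    db.coll.stats(1024 * 1024)
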
[13:48:18] <NoReflex> jamiel, there are around 2200 distinct ti values in the collection of which I'm selecting around 500, 4 rt values (0,1,2,3) of which I'm selecting only one
[13:48:46] <NoReflex> regarding the moment (mo) the data spans around 10 months of which I'm selecting 6 months
[13:50:24] <NoReflex> here are the stats: http://pastebin.com/WCqr28nM
[13:55:47] <MadWasp> Hi, I need to change approximately 20 million documents in my mongo db. They currently reference another collection in which documents have a params field. This params field should become part of the 20 million documents. What would be the fastest way to achieve this?
[14:18:42] <MadWasp> For now I've written a PHP script to do it but it takes a couple of days to run.
[14:31:21] <saml> are you saying writes are slow?
[14:35:39] <MadWasp> the whole thing is pretty slow
[14:35:47] <MadWasp> i have to read a subset of all documents
[14:35:58] <MadWasp> query another collection for each document and save the document again
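
One way to cut that from one round trip per document to one multi-update per referenced params document, sketched in the shell (collection and field names here are assumptions):

    // copy each params value onto every document that references it:
    // one multi-update per referenced document instead of one write per document
    db.paramsColl.find().forEach(function (ref) {
        db.bigColl.update(
            { paramsId: ref._id },                // assumed reference field
            { $set: { params: ref.params } },
            { multi: true }                       // update all matches in one command
        );
    });
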
[14:45:17] <mantovani> sparse databases are less efficient to read than non-sparse databases.
[14:46:56] <mantovani> wrong channel, sorry.
[16:23:55] <bmarick> Hey, I have come across an SSL host validation issue. Is there anyone here that could point me to a location in the code base so I can try to track down the error?
[16:33:01] <brianboyko> Hello. I'm building out the schemas now using Mongoose. How do you create an array of objects? Specifically, each test has questions, each question has a correct response, a student response, and a point value.
[16:33:27] <StephenLynx> I strongly suggest you don't use mongoose.
[16:33:39] <StephenLynx> it is slow, incomplete and inconsistent.
[16:33:55] <StephenLynx> hands down the worst ODM in existence.
[16:34:21] <brianboyko> Open to alternatives...
[16:34:27] <StephenLynx> the regular driver.
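
For reference, the array-of-objects schema brianboyko asked about might look like this in Mongoose (schema and field names are assumptions):

    var mongoose = require('mongoose');

    var questionSchema = new mongoose.Schema({
        correctResponse: String,
        studentResponse: String,
        pointValue: Number
    });

    var testSchema = new mongoose.Schema({
        questions: [questionSchema]   // an array of embedded question documents
    });

    var Test = mongoose.model('Test', testSchema);
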
[17:36:48] <spellett> hello all
[18:18:50] <steeze> is there a way to do an exclusive query?
[18:18:58] <steeze> like db.users.find({roles: {$in: ['user']}})
[18:19:21] <StephenLynx> yes.
[18:19:26] <steeze> but like $only. i want users who ONLY have 'user' in their roles array. this matches any user with 'user' + other roles
[18:19:27] <StephenLynx> $nin, $ne
[18:19:36] <StephenLynx> ah
[18:19:41] <StephenLynx> roles : ['user']
[18:19:54] <steeze> ahh, dont need the $in
[18:19:56] <steeze> i was overthinking this
[18:19:58] <StephenLynx> yeah
[18:20:08] <StephenLynx> you can ask for a 100% match on arrays.
[18:20:22] <StephenLynx> that way
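
The form StephenLynx means; note that exact array equality is order- and length-sensitive:

    // matches only documents whose roles array is exactly ['user'];
    // ['user', 'admin'] or ['admin', 'user'] would not match
    db.users.find({ roles: ['user'] })
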
[18:20:47] <steeze> well i feel silly haha
[18:20:48] <steeze> thanks
[19:31:47] <blizzow> I'm running a query via pymongo: db.mycollection.find( { "_id": "mydocumentname", $and: [ { "timestamp": { $exists: true } } ] } )
[19:31:47] <blizzow> That returns: {u'timestamp': u'201510011850', u'_id': u'mydocumentname'}
[19:31:47] <blizzow> Is there a way to make the query only return the value of the timestamp field?
[19:32:13] <StephenLynx> use the projections
[19:32:26] <StephenLynx> usually you put the projection on the second argument of the find.
[19:32:32] <StephenLynx> but I don't know pymongo or python.
[19:33:48] <blizzow> StephenLynx: Thanks. I'll read up on projection.
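
In the shell the projection looks like the sketch below; pymongo takes the same projection as the second argument to find(). The $and wrapper in the original query is redundant, since top-level conditions are ANDed anyway.

    // second argument is the projection: include timestamp, suppress _id
    db.mycollection.find(
        { _id: "mydocumentname", timestamp: { $exists: true } },
        { timestamp: 1, _id: 0 }
    )
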
[19:48:27] <teknopaul> Hi, I'm finding that update() or updateOne() wipes all the data in an object. my update has no need for "dot notation", AFAICS it's a flat object, so I'm just trying to add some fields: { '$set': { desc: 'baa', lastUpdate: 1443728439630 } }
[19:52:30] <cheeser> can you pastebin the document and your entire update command?
[19:55:21] <teknopaul> ok
[19:56:51] <teknopaul> http://pastebin.com/xrFHKL9X
[19:57:16] <teknopaul> ok sorry my bad
[19:57:41] <teknopaul> I'm missing the $set when it calls the update
[19:57:44] <StephenLynx> :v
[19:57:58] <StephenLynx> I was about to say that and didn't even have to check your code. that happened to me a few times when I started with mongo
[19:58:22] <StephenLynx> when you wipe your stuff once or twice you don't forget it anymore v:
[19:59:07] <teknopaul> :)
[19:59:40] <teknopaul> if (! update.$set) throw new Error() :)
[20:01:06] <StephenLynx> you might want to make an update that does other stuff without a $set though
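
The failure mode teknopaul hit, side by side (someId stands in for the real _id):

    // REPLACES the matched document with just these two fields
    db.coll.update({ _id: someId }, { desc: 'baa', lastUpdate: 1443728439630 })

    // modifies only the named fields, leaving the rest of the document intact
    db.coll.update({ _id: someId }, { $set: { desc: 'baa', lastUpdate: 1443728439630 } })
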
[20:39:08] <pagios> pagios: Don't use MongoDB - it has many issues, and is essentially never the right solution to your problem. Here's a list of issues: http://cryto.net/~joepie91/blog/2015/07/19/why-you-should-never-ever-ever-use-mongodb/ and a more technical article at http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/ CAN SOMEONE please explain this to me once and for all coz i am tired of ppl's comments and noise
[20:39:27] <pagios> i am new to DBs and i want to choose a db to use with node.js
[20:39:31] <pagios> mainly for prototyping
[20:39:58] <pagios> features of my app: authentication, checking credits, credit manipulation, backend user statistics
[20:43:45] <StephenLynx> explain what?
[20:44:09] <pagios> if i should use mongodb or not
[20:44:12] <StephenLynx> btw, first tip I can give: use the nodejs native driver.
[20:44:27] <pagios> k
[20:44:46] <StephenLynx> if you should? from the little you said, it seems like you COULD use it.
[20:44:50] <pagios> are those comments true?
[20:44:57] <StephenLynx> probably not.
[20:44:59] <pagios> read that article...
[20:45:05] <pagios> it is scary
[20:45:09] <teknopaul> For prototyping with nodejs I'd say mongodb has to be a pretty good option. Using a document DB is very handy for this.
[20:45:20] <StephenLynx> already read a bunch of articles like those. they are all rubbish.
[20:45:30] <pagios> even for user authentication, data manipulation etc?
[20:45:43] <StephenLynx> "hurr I used the database wrong, I am not retarded, is the database that is bad derpderpderp"
[20:46:04] <StephenLynx> that's the tl;dr for all those "mongo is the suxx0r" articles
[20:47:08] <pagios> so when should i use a relational db and when should i use a document based db?
[20:47:20] <pagios> what are the limitations of mongo
[20:47:59] <StephenLynx> when your data is heavily relational, when you need joins.
[20:48:04] <StephenLynx> those are the cases where I would use an RDB.
[20:48:25] <StephenLynx> mongo's limitations: counts are slow, if I am not mistaken. you have a 16MB limit on documents
[20:48:28] <teknopaul> Read this "For most cases, what you want is actually a relational database. " and you realise the guy is on a rant.
[20:48:40] <StephenLynx> no relational integrity checks or tools, no joins
[20:48:47] <pagios> i am concerned with data loss
[20:48:51] <StephenLynx> not an issue.
[20:48:59] <pagios> if a problem happens when data is being processed
[20:49:04] <StephenLynx> you get an error.
[20:49:05] <pagios> no locking in document based
[20:49:12] <StephenLynx> and then you work based on the error.
[20:49:15] <pagios> in relational it is ACID
[20:49:16] <StephenLynx> simple as that.
[20:49:30] <pagios> so i treat that in the application logic..
[20:49:32] <StephenLynx> yes.
[20:49:41] <pagios> revert the commit
[20:49:52] <StephenLynx> but keep in mind
[20:50:07] <StephenLynx> that your data shouldn't be too relational and involve too many collections in the first place.
[20:50:22] <StephenLynx> if your operation involves a dozen collections, you really should use a relational database.
[20:50:28] <pagios> coz i would need to revert many things in case of errors, right?
[20:50:36] <StephenLynx> yes.
[20:50:41] <StephenLynx> actually, no.
[20:50:46] <StephenLynx> its because your data is too relational.
[20:50:47] <StephenLynx> period.
[20:51:18] <StephenLynx> if you need to keep track of a considerable amount of relations, that's where relational dbs come in.
[20:51:30] <pagios> when you say relational it means my documents are very interrelated, tables are very interconnected?
[20:51:35] <StephenLynx> yes.
[20:51:47] <idd2d> I have Users, each has_many Addresses (via parent referencing). What's the best way to find which User has the most Addresses?
[20:51:50] <StephenLynx> like, over 3 or 4
[20:51:53] <idd2d> speaking of relations.
[20:52:10] <pagios> i get your point
[20:52:11] <StephenLynx> and if you NEED to have relational integrity.
[20:52:29] <StephenLynx> if your relations can go wrong and that's not a big issue, then it's ok.
[20:52:40] <pagios> if i have say a user profile document, a user credit document, and a user token document, that's not very tightly relational
[20:52:46] <StephenLynx> yeah.
[20:52:55] <StephenLynx> those are vaguely related.
[20:52:57] <pagios> coz a transaction would involve 'changing' one at a time
[20:53:19] <StephenLynx> and even then, that's not much stuff to relate.
[20:53:41] <StephenLynx> idd2d I would pre-aggregate that kind of information on the users.
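
Besides pre-aggregating a counter on the users, a one-off answer to idd2d's question with the aggregation pipeline might look like this (collection and field names assumed):

    // count addresses per user, then take the top one
    db.addresses.aggregate([
        { $group: { _id: "$userId", addresses: { $sum: 1 } } },
        { $sort: { addresses: -1 } },
        { $limit: 1 }
    ])
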
[20:53:41] <pagios> but if i have a query that reads doc1, compares it with doc2 and modifies doc3, that's too relational, right?
[20:53:49] <StephenLynx> not really.
[20:54:11] <StephenLynx> if the modification on doc3 goes wrong, what do you have to revert?
[20:54:17] <pagios> doc3
[20:54:21] <StephenLynx> no
[20:54:27] <StephenLynx> that modification never happened.
[20:54:38] <pagios> i dont revert anything then
[20:54:42] <StephenLynx> exactly.
[20:54:58] <pagios> can you give me an example of a too relational case?
[20:55:05] <StephenLynx> hm
[20:55:18] <StephenLynx> when you have to make an update that keeps relational integrity across 4 collections or so.
[20:55:26] <StephenLynx> and thats a core part of the application.
[20:55:35] <pagios> concrete example?
[20:55:38] <StephenLynx> dude
[20:55:40] <StephenLynx> dunno :v
[20:55:43] <StephenLynx> use your imagination.
[20:56:19] <pagios> compare col1 col2 col3 col4, if true add col1 to col2 - col3 * col4
[20:56:38] <StephenLynx> you only modified col2
[20:57:01] <StephenLynx> are you 100% new to databases at all?
[20:57:05] <pagios> store all in col1 col2 col3 col4
[20:57:08] <pagios> no
[20:57:27] <pagios> i know SQL
[20:57:32] <StephenLynx> how much?
[20:57:35] <pagios> basic
[20:57:49] <StephenLynx> then stick to it for a while and then check mongodb.
[20:57:57] <StephenLynx> mongo gives you freedom, lots of it.
[20:58:12] <StephenLynx> and with it, comes a steeper learning curve and more traps.
[20:58:58] <StephenLynx> gotta go
[23:01:34] <Torkable> hey, having trouble comparing dates
[23:01:45] <Torkable> or searching by date rather
[23:02:11] <Torkable> have a query that looks like
[23:02:18] <Torkable> find({ createDate: { $gte: new Date('2015-09-29T00:00:00.000Z') } })
[23:02:32] <Torkable> doesn't seem to match anything
[23:03:12] <Torkable> using node.js btw
[23:04:11] <Torkable> and that query finds lots of documents when I just run it on mongo directly
[23:10:19] <joannac> Torkable: pastebin your query?
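
The log ends before the pastebin arrives. For reference, the equivalent query through the node native driver of that era might look like the sketch below; the connection URL and collection name are illustrative assumptions. Since the same query matches when run in the shell directly, the usual suspects are the node code pointing at a different database or collection, or the cursor never being consumed.

    var MongoClient = require('mongodb').MongoClient;

    // URL and collection name are assumptions for illustration
    MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
      if (err) throw err;
      db.collection('events')
        .find({ createDate: { $gte: new Date('2015-09-29T00:00:00.000Z') } })
        .toArray(function (err, docs) {
          if (err) throw err;
          console.log(docs.length);   // how many documents actually matched
          db.close();
        });
    });
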