PMXBOT Log file Viewer


#mongodb logs for Thursday the 24th of October, 2013

[00:15:53] <Guest76948> hi all, I'm pretty new to mongo. I've read a lot about using collections vs embedded documents. I'm having trouble understanding which one I should use in a given situation.
[00:15:58] <Guest76948> any general tips?
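To illustrate the choice Guest76948 is asking about, here is a minimal sketch using plain Python dicts in place of BSON documents. The collection and field names (`post`, `comments`, `post_id`) are invented for illustration, not from the log.

```python
# Style 1: embedded -- comments live inside the post document.
# Good when the children are always read with the parent and stay small.
post_embedded = {
    "_id": 1,
    "title": "Intro to Mongo",
    "comments": [
        {"author": "alice", "text": "nice post"},
        {"author": "bob", "text": "thanks"},
    ],
}

# Style 2: referenced -- comments are a separate collection holding the
# parent's _id. Good when children are unbounded or queried on their own.
posts = [{"_id": 1, "title": "Intro to Mongo"}]
comments = [
    {"_id": 10, "post_id": 1, "author": "alice", "text": "nice post"},
    {"_id": 11, "post_id": 1, "author": "bob", "text": "thanks"},
]

def comments_for(post_id):
    """The 'join' the application must do itself under the referenced style."""
    return [c for c in comments if c["post_id"] == post_id]
```

The usual rule of thumb: embed when the sub-documents are read together with the parent and bounded in number; reference when they grow without bound or need to be queried independently.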
[02:21:44] <dharmaturtle> In this tutorial, what does the "previous" parameter do? It's only in the code once, and is never used. http://cookbook.mongodb.org/patterns/count_tags/
[02:24:05] <cheeser> reduce() takes a key/value pair.
[02:24:17] <cheeser> why it's called previous, i dunno
[02:26:38] <dharmaturtle> so it could be anything? I could call it "lisjflkjslefj" if I wanted to?
[02:37:11] <begizi> Anyone out there using node and mongoose, what is a clean and secure way to prevent sensitive user data from being sent out from the database on an api? ie keep the hashed password from being sent via the user rest api
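For begizi's question: in mongoose the usual answers are `select: false` on the sensitive schema path or a `toJSON` transform. The general idea, sketched in Python on a plain dict (the field names `passwordHash` and `salt` are hypothetical):

```python
def scrub(doc, sensitive=("passwordHash", "salt")):
    """Return a copy of a user document with sensitive fields removed
    before it is serialized for an API response."""
    return {k: v for k, v in doc.items() if k not in sensitive}

user = {"_id": 1, "email": "a@b.com", "passwordHash": "x9f", "salt": "s1"}
public_user = scrub(user)
```

Doing the stripping at the model/serialization layer (rather than in each route handler) is what keeps it from leaking through a forgotten endpoint.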
[05:05:56] <dharmaturtle> Does anyone know why this isn't incrementing "total", but appending it like it's an array? I think? I'm very new to javascript. http://imgur.com/DkAnVJf
[05:08:25] <joannac> Reduce takes in a key and an *array of elements*, not a single element (which I think is what you're assuming)
[05:08:40] <joannac> array of elements for the "value"
[05:09:06] <dharmaturtle> oh... okay. Hm. Thank you.
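joannac's point, sketched in Python: the reduce function in Mongo's map-reduce receives a key and an *array* of values, and must fold that array into one result. (As cheeser says, the tutorial's "previous" name is arbitrary; it is just a parameter name.) The document shape here is invented for illustration.

```python
def reduce_fn(key, values):
    # 'values' is a list of partial documents emitted for this key,
    # not a single document -- the mistake dharmaturtle was making.
    total = 0
    for v in values:
        total += v["count"]
    return {"count": total}

result = reduce_fn("python", [{"count": 1}, {"count": 1}, {"count": 3}])
```

The folded output has the same shape as the inputs, which is what lets the server re-reduce partial results.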
[08:32:40] <ayemeng> hey does anyone know if it's possible to specify the write concern in the mongo shell?
[08:38:59] <ayemeng> ping?
[08:39:12] <zokko> icmp reply timeout.
[08:39:56] <Nodex> ayemeng : it's all documented in the docs
[08:40:08] <ayemeng> i've been looking at the docs
[08:40:25] <ayemeng> and in particular the db.collection.update() api doesn't take in a w
[08:40:56] <ayemeng> and i attempted to try and set getLastErrorDefaults to set it globally for the db
[08:41:07] <ayemeng> but that doesn't seem to be setting the write concern
[08:42:04] <ayemeng> i am able to set it correctly if i use the pymongo driver
[08:42:28] <ayemeng> but in the mongo shell, i can't find a way to set it at the db, collection, or operation level
[08:42:31] <Nodex> I think by default it's a safe write because it always calls getLastError after a command
[08:42:57] <ayemeng> right, that's what i'm assuming. so i thought setting getLastErrorDefaults would do the trick.
[08:43:04] <ayemeng> but it doesn't appear so. there is no improvement in speed.
[08:43:52] <Nodex> you could try "w" in the options
[08:44:43] <Nodex> options being the last object... db.foo.update({criteria},{$set...},{multi:true.....})
[08:44:48] <ayemeng> that doesn't work
[08:44:58] <ayemeng> DBCollection.prototype.update = function( query , obj , upsert , multi )
[08:46:18] <Nodex> on the actual mongo cli there are only 3 arguments, not 4
[08:48:01] <ayemeng> okay i just modified my code. it doesn't work. it doesn't seem to be respecting the w option
[08:48:27] <ayemeng> my call is the following: db.collection.update(query, update, { 'w': 0, 'j': 0})
[08:49:06] <Nodex> what exactly are you trying to do ? some sort of test ?
[08:49:20] <ayemeng> sure, i have a js file that is loaded on the server
[08:49:29] <ayemeng> which does a bunch of updates
[08:49:44] <ayemeng> the pseudo code is the following
[08:50:05] <ayemeng> foreach doc in collection_1
[08:50:26] <ayemeng> look up an identifier
[08:50:30] <Nodex> and you want these updates to be async ?
[08:50:33] <ayemeng> yes
[08:50:42] <Nodex> or Fire and Forget
[08:50:51] <ayemeng> sorry i want fire and forget
[08:50:53] <Nodex> ok. what speed are you comparing to
[08:51:10] <ayemeng> i am comparing it to a sample function i wrote in python
[08:51:13] <ayemeng> that does the same thing
[08:51:19] <Nodex> python is multi-threaded, yes?
[08:51:25] <ayemeng> basically, i have 800 records which are updated
[08:51:35] <ayemeng> i don't use threads in python
[08:51:46] <ayemeng> in python, the 800 records take ~37 seconds
[08:52:04] <ayemeng> loading the js server side, takes ~98 seconds
[08:52:10] <Nodex> wow, what on earth are you doing to the records that takes that long?
[08:52:42] <ayemeng> good question. when i profiled my code in python
[08:52:51] <ayemeng> the long pull is the db.collection.find_one() call
[08:53:08] <Nodex> does it have an index on it?
[08:53:13] <ayemeng> i set up a hash index on the field i'm looking up
[08:53:25] <Nodex> I don't know what a hash index is, sorry
[08:53:53] <ayemeng> in short, yes, i set up an index on it.
[08:54:23] <Nodex> have you tailed your log to see what's taking the time?
[08:54:36] <ayemeng> how do i do that?
[08:54:55] <ayemeng> my mongodb is hosted on mongohq
[08:55:37] <Nodex> that I don't know!
[08:55:50] <Nodex> do they not have some kind of debugging?
[08:56:37] <ayemeng> they do but i can't seem to find any logs that would show information about my queries
[08:57:02] <Nodex> look for possibly "slow queries" maybe?
[08:57:32] <ayemeng> they do happen to have a slow query tab, but it's empty :(
[08:57:42] <Nodex> :/
[08:57:57] <Nodex> can you double check the indexes?
[08:59:03] <ayemeng> check for their existence?
[08:59:05] <ayemeng> they do exist
[08:59:33] <ayemeng> [ { "v" : 1, "key" : { "_id" : 1 }, "ns" : "<removed>", "name" : "_id_" }, { "v" : 1, "key" : { "number" : "hashed" }, "ns" : "<removed>", "name" : "number_hashed" } ]
[08:59:38] <Nodex> pastebin
[08:59:44] <Nodex> DONT paste multi lines in IRC
[08:59:58] <ayemeng> woops
[08:59:59] <ayemeng> sorry
[09:00:00] <ayemeng> http://pastebin.com/nwJ2dWKn
[09:01:24] <Nodex> are you on MongoDB >=2.4 ?
[09:01:40] <ayemeng> nope 2.2.5
[09:02:04] <Zelest> Nodex, on sunday it's time for daylight savings btw..
[09:02:10] <Zelest> Nodex, time to see if my TTL index b0rks again :)
[09:02:13] <Nodex> hashed indexes are in >=2.4
[09:02:19] <Nodex> Zelest : hahah
[09:02:31] <ayemeng> I see, I can change it back to single field index
[09:02:39] <ayemeng> but it's still the same issue
[09:02:48] <Nodex> remove it and re-add it as a single value
[09:02:50] <ayemeng> I just dropped the index and readded a single field index
[09:02:55] <Zelest> what's so special about hashed indexes btw?
[09:03:01] <joannac> What does the explain say?
[09:03:04] <Zelest> <-- clueless moron
[09:03:25] <Nodex> Zelest : today is the first I've read about them. It seems they can serve some kind of ring function for sharding / equality
[09:03:43] <Zelest> ah
[09:03:58] <Nodex> I can't think of a case where I personally need one atm
[09:04:23] <Nodex> http://docs.mongodb.org/manual/core/index-hashed/#index-type-hashed
[09:05:12] <Nodex> though they must have some use, else the function would not exist
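A rough sketch of why a hashed index (MongoDB >= 2.4, per the linked docs) helps equality lookups and sharding but not range queries: the index stores a hash of the field value, so ordering is destroyed. The hash function below is just an illustrative stand-in, not the one MongoDB actually uses.

```python
import hashlib

def h(value):
    # stand-in hash; MongoDB's real hashed-index hash differs
    return hashlib.md5(str(value).encode()).hexdigest()

docs = [{"number": n} for n in (3, 1, 7)]
hashed_index = {h(d["number"]): d for d in docs}

# Equality: compute one hash, do one lookup.
found = hashed_index[h(7)]

# Range (number < 5): the hashes of 1 and 3 are scattered, so the index
# offers no usable order -- a real server would fall back to scanning.
in_range = [d for d in docs if d["number"] < 5]
```

The scattering is also exactly what makes hashed values useful as shard keys: it spreads monotonically increasing keys evenly across chunks.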
[09:05:35] <Nodex> you noticed what Google have done to php.net today?
[09:06:08] <Zelest> I've heard people complain about it :)
[09:06:09] <Nodex> "This web page at php.net has been reported as an attack page and has been blocked based on your security preferences."
[09:06:10] <ayemeng> fyi, changing to a single field index did not change performance
[09:06:24] <Nodex> ayemeng : you are going to have to run an explain
[09:07:40] <ayemeng> http://pastebin.com/2zTvZrH5
[09:07:44] <ayemeng> do you see anything unusual?
[09:08:33] <Nodex> Zelest : http://imgur.com/g7H0Cej
[09:09:05] <Nodex> apart from "nscanned" being zero
[09:09:14] <Zelest> oh
[09:09:34] <ayemeng> fyi in this particular case for this data
[09:09:45] <ayemeng> there is no record that matches the key I'm looking for
[09:10:09] <Nodex> please run an explain on a valid record
[09:11:39] <ayemeng> http://pastebin.com/EkWZPkrx
[09:15:54] <joannac> That's as good as you're going to get
[09:16:14] <joannac> nscanned 1
[09:16:56] <Nodex> are you sure your script is using an int and not a string?
[09:18:00] <ayemeng> in python it's an int
[09:18:12] <ayemeng> in mongo shell/js i don't know how to tell
[09:18:34] <Nodex> if you quote it then it becomes a string
[09:18:42] <ayemeng> no its not quoted
[09:18:43] <ayemeng> it's
[09:19:18] <Nodex> I'm out of ideas, all I can advise is to dump the db to a local db which will tell you if it's a communication problem between you and mongohq
[09:19:28] <Nodex> dump it + run it
[09:19:41] <ayemeng> i see
[09:19:41] <ayemeng> okay
[09:19:45] <ayemeng> thanks for your help nodes :)
[09:19:49] <Nodex> spin up an amazon free tier, takes a minute
[09:19:50] <ayemeng> nodex* damn autocorrect
[09:20:06] <Nodex> no problems, let me know what it was when you find the error !
[09:20:12] <ayemeng> sure
[09:20:17] <ayemeng> i might just restructure my code
[09:20:27] <ayemeng> and build the join in memory
[09:20:39] <ayemeng> cuz basically we are joining collections ish
[09:20:54] <ayemeng> we are reading data from another collection and appending it to an existing collection
[09:21:15] <ayemeng> rather than for each iteration lookup in the other collection for the joining record
[09:21:20] <Nodex> still shouldn't take 30+ seconds on 800 records
[09:21:39] <ayemeng> i entirely agree :)
[09:21:52] <ayemeng> when i restructure it, it takes only a couple of seconds
[09:22:03] <ayemeng> ill ping mongohq and see if they can possibly give some insight
[09:22:07] <ayemeng> but it's 2am and i'm tired now.
[09:22:10] <ayemeng> i appreciate all the help
[09:22:12] <Nodex> on 50k+ records I can do a random lookup on a 1.8m doc collection and that takes less than 3 seconds
[09:22:35] <ayemeng> really? so if you iterate over the 50k records, and each iteration look up in a 1.9m doc collection
[09:22:38] <ayemeng> it takes only 3 seconds?
[09:22:41] <Nodex> yup
[09:22:56] <ayemeng> damn now that's fast
[09:23:24] <Nodex> good cpu's / ram but even so, on a slow machine it's within 10s probably
[09:23:40] <ayemeng> this is all local though right?
[09:23:43] <Nodex> yeh
[09:24:05] <ayemeng> do you have a test with a remote db?
[09:24:12] <Nodex> all over a local domain socket so zero tcp overhead (not that it's all that much)
[09:24:13] <ayemeng> err, have you done something similar with a remote db?
[09:24:22] <ayemeng> i c
[09:24:30] <Nodex> I've never used a remote DB and probably never would, I find them very expensive
[09:30:34] <joannac> ayemeng: what are we testing?
[09:31:08] <ayemeng> I'm running into a performance issue with live code
[09:31:17] <ayemeng> so i wrote a test to isolate the issue
[09:31:30] <ayemeng> thus far, i found two issues
[09:31:47] <ayemeng> 1. i can't seem to find a way to set the w option in js/mongo shell
[09:32:06] <ayemeng> 2. db.collection.find_one(query) is taking a really long time to find a record, despite being indexed on the field i'm querying
[09:32:22] <ayemeng> the collection in question is only 800+ documents
[09:32:32] <ayemeng> and it takes 30 seconds
[09:32:35] <joannac> was that the test you did before?
[09:32:42] <Number6> ayemeng: What are the specs of the machine?
[09:33:08] <joannac> because that test showed the doc was being served in 0ms
[09:33:42] <ayemeng> i don't know off top, it's probably m1.medium from aws
[09:34:02] <ayemeng> 3.75gb of memory
[09:34:17] <ayemeng> 1 vcpu (whatever that means)
[09:35:43] <Number6> ayemeng: Is the CPU or RAM being maxed out?
[09:36:11] <Number6> ayemeng: As for write concerns in the Mongo Shell, you need to call getLastError - which will set the desired write concern for you (journal, etc)
[09:36:18] <Number6> ayemeng: http://docs.mongodb.org/manual/reference/command/getLastError/#dbcmd.getLastError
[09:36:48] <ayemeng> neither (cpu/ram) appears to be maxed out
[09:37:10] <ayemeng> i don't follow how that works, the api is confusing
[09:37:25] <ayemeng> i thought getlasterror is called by the user if they are curious if the write succeeded
[09:37:55] <ayemeng> do i call something like db.adminCommand({getLastError:1, w: 0, j: false})
[09:38:17] <Number6> For journal safe, it's db.runCommand( { getLastError: 1, j: true } )
[09:39:17] <ayemeng> okay i called that and got the following
[09:39:34] <ayemeng> http://pastebin.com/QKqJnKAX
[09:39:44] <ayemeng> is that expected? it seems to be success
[09:40:49] <Number6> Yes
[09:41:09] <ayemeng> okay does that persist for the session or db
[09:41:24] <ayemeng> like if i ended my live session with mongo
[09:41:29] <ayemeng> does that setting persist?
[09:42:41] <Number6> Just that connection
[09:44:31] <ayemeng> i see
[09:44:39] <ayemeng> i think my issue is that find_one is extremely slow for me
[09:44:42] <ayemeng> on my indexed fields
[09:44:48] <ayemeng> but explain shows it to be fast
[09:45:38] <Number6> ayemeng: I imagine that is because the documents are already in RAM. What is the RES output from top, for MongoD
[09:46:17] <ayemeng> 5.9 gigs
[09:47:27] <Number6> Are you sure that's not VIRT? 5.9G RES on a machine with 3.75G doesn't sound right
[09:47:46] <ayemeng> most likely my statement about the box's specs is wrong
[09:47:49] <ayemeng> i don't know
[09:47:54] <ayemeng> its probably a bigger box.
[09:48:04] <ayemeng> I'm looking on mongohq and under the RES column
[09:48:06] <ayemeng> its 5.9gigs
[09:54:46] <ayemeng> Number6: I'm heading to bed now. If you have an idea, feel free to send it my way. Thanks for your help.
[11:35:12] <Nodex> http://i.imgur.com/ord27qI.gif
[13:16:46] <Nodex> http://www.theregister.co.uk/2013/10/22/seagate_letting_apps_talk_direct_to_drives/
[13:23:24] <Number6> Nodex: So it's a micro SAN, essentially?
[13:37:01] <Nodex> looks that way but for all intents and purposes we can forgo the operating system to write to disk
[13:37:37] <cheeser> it reminds me of the S3 interface in a box, really.
[13:37:42] <cheeser> put this, get that.
[13:37:43] <Nodex> fire it across ethernet to A+B+C (write concerns) - hopefully it will have natting so it can work over WAN too
[13:37:58] <Nodex> hopefully it's faster than S3
[13:38:35] <Nodex> tbh I'm surprised nobody thought of Key/Value store drives before, I used to assume that's how they worked anyway
[13:39:30] <Nodex> Personally I think the possibilities are endless with it
[13:40:03] <Nodex> they have essentially given us Application level raid at the same time
[13:43:56] <Number6> I miss my IB storage cluster, that was great for low latency
[14:27:36] <_Heisenberg_> Hi folks, I'm playing around with 2PC as described here: http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/ and came to a problem. Let's say I'd like to check the account balance in the same query where I update it. Can I determine for which reason the update may have failed? It could fail because the balance is not sufficient or because the transaction has already been applied (being in a recovery process)
[14:31:08] <Nodex> That's down to your application to determine
[14:33:39] <_Heisenberg_> Nodex: to clarify: http://pastie.org/8426986
[14:35:28] <_Heisenberg_> the problem is: if the update fails because the balance is too low, the transaction needs to abort. If it fails because the transaction has already been applied, it should go on.
[14:36:11] <_Heisenberg_> checking it in two queries is not possible since the queries would not be atomic
[14:38:14] <Nodex> I don't think 2pc will work for that
[14:41:58] <_Heisenberg_> I think they tried to explain how to do it under "Using Two-Phase Commits in Production Applications" but I don't understand it. In particular what is meant by: "... the application would also modify the values of the credits and debits as well as adding the transaction as pending." (first point of the second enumeration)
[14:54:27] <Nodex> if it were me I would just send very important transactions to a database designed for it
[15:00:24] <_Heisenberg_> Nodex: this is actually what I'm doing, just looking for alternatives ;)
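One way out of _Heisenberg_'s dilemma, sketched in plain Python over an in-memory account document (field names follow the 2PC tutorial's shape, but the helper itself is hypothetical): keep the update atomic and conditional, and when it matches nothing, classify the failure with a follow-up read. The classification read need not be atomic with the update, because the "already applied" state can only be reached by this transaction and never un-happens.

```python
account = {"_id": "A", "balance": 50, "pendingTransactions": []}

def try_debit(acct, txn_id, amount):
    # mimics the atomic conditional update:
    # update({_id, balance: {$gte: amount}, pendingTransactions: {$ne: txn_id}}, ...)
    if acct["balance"] >= amount and txn_id not in acct["pendingTransactions"]:
        acct["balance"] -= amount
        acct["pendingTransactions"].append(txn_id)
        return "applied"
    # the update matched nothing -- classify why with a follow-up read
    if txn_id in acct["pendingTransactions"]:
        return "already-applied"   # recovery path: carry on with the 2PC
    return "insufficient-funds"    # abort the transaction

first = try_debit(account, "t1", 30)
second = try_debit(account, "t1", 30)   # replayed during recovery
third = try_debit(account, "t2", 100)   # more than the balance
```

Against a real server the conditional update would be a single `update` with both predicates in the query, and the classifier would be a `findOne` on the account.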
[15:54:06] <akkroo_nick> Hi, quick question, is there a way to specify what interface mongodb uses to connect to other replica nodes
[15:54:48] <akkroo_nick> or is this specified by the IPs it is set to bind to
[16:01:07] <akkroo_nick> anyone?
[16:02:01] <cheeser> i don't think there is
[16:34:48] <defunctzombie> This is giving me access denied: http://downloads.mongodb.org/homebrew
[16:35:17] <defunctzombie> got that from: https://github.com/mxcl/homebrew/pull/23452
[16:48:11] <dabreaka`tv> re
[17:18:12] <Novimundus> I'm using PyMongo and I want to pull solely one field from every object in my collection. I keep using find({attr:val}), but it's returning the entire object...how do I pull solely the attr value?
[17:20:14] <retran> Novimundus, http://docs.mongodb.org/manual/core/read-operations/#projections
[17:20:37] <retran> you need to include a "projection" parameter
[17:20:39] <retran> in your find
[17:21:01] <Novimundus> Ah. It appears I am reading the wrong documentation. Much thanks.
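In PyMongo the projection retran points at is the second argument to `find()`, e.g. `coll.find({"attr": val}, {"attr": 1, "_id": 0})`. Mimicking the inclusion-projection semantics on plain dicts (the helper is illustrative, not part of PyMongo):

```python
def project(doc, projection):
    """Apply a Mongo-style inclusion projection to a plain dict."""
    included = {k for k, v in projection.items() if v == 1}
    # _id is returned unless explicitly suppressed with {"_id": 0}
    if projection.get("_id", 1) == 1:
        included.add("_id")
    return {k: v for k, v in doc.items() if k in included}

doc = {"_id": 1, "attr": "x", "other": "y"}
only_attr = project(doc, {"attr": 1, "_id": 0})
```

Note the `_id` default: forgetting to suppress it is the usual reason a "one field" projection still returns two.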
[18:14:49] <ccmonster> hey guys - i have a pymongo script, and my mongo obj is Database(MongoClient('localhost', 27017), u'twitterData'), but i keep getting this error: UserWarning: database name or authSource in URI is being ignored. If you wish to authenticate to twitterData, you must provide a username and password.
[18:15:07] <ccmonster> I don't know why that is the case. There's no user/pass on the mongodb
[20:01:19] <ghostbar> hey guys. Someone using mongoose? Does mongoose freeze the responses? I'm trying to add more info into them manually and it does not accept it. The response stays static...
[21:22:22] <mrb_bk> Is there a $ keyword for the entire matched document inside a group in aggregate? I want to $push the results from all fields
[21:27:58] <dharmaturtle> Hi, could someone explain why I'm getting a float result, despite calling Math.floor on this? http://i.imgur.com/zmdcK6L.png
[21:29:22] <jnewt> db.collection.ensureIndex( { a: 1 }, { unique: true } ) what is the 1 for? (example from manual)
[21:29:48] <cheeser> order ascending
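Expanding on cheeser's answer: the `1` in `ensureIndex({a: 1})` is the index's sort direction (1 ascending, -1 descending), not a value. A toy model of the ordered key lists the two directions would produce:

```python
docs = [{"a": 3}, {"a": 1}, {"a": 2}]

# the key order an index built with {a: 1} would hold
asc_keys = sorted(d["a"] for d in docs)

# the key order an index built with {a: -1} would hold
desc_keys = sorted((d["a"] for d in docs), reverse=True)
```

For a single-field index the direction rarely matters (it can be walked either way); it matters for compound indexes, where directions must match the sort you want.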
[22:31:58] <eph3meral> so, I read recently (yesterday or so, somewhere along the docs/getting started) that mongo can and or will store and perform queries in memory provided there is enough memory available
[22:32:42] <eph3meral> is this true, how can I confirm this, and or where do I go to configure how much memory mongo is allotted and or whether or not mongo is allowed or supposed to keep everything in memory vs on disk
[22:33:37] <eph3meral> essentially I'm looking at using mongodb as a caching layer in which to read denormalized data that I package together from a relational DB
[22:34:13] <eph3meral> so, mongo won't be intended for userspace writes, just being synced up occasionally with the data in the SQL db through a pub/sub type listening daemon I've written
[22:56:49] <jnewt> i'm trying to model data that takes the form of a tree: top level being parts, second level being serial numbers, third being events, fourth being event logs. i can't figure out how to query across a level, say i want all the serial number records, regardless of part number (how to organize this?)
[22:58:17] <eph3meral> jnewt, map/reduce probably
[22:58:58] <eph3meral> jnewt, 1) filter all the parts that have matching serial number records 2) map the set to be just the serial number records
[23:00:21] <joannac> Can you just use the aggregation framework to unwind?
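joannac's `$unwind` suggestion, sketched on plain dicts: if serial numbers are embedded under parts, an unwind-style flattening yields one record per serial regardless of which part owns it. The `part`/`serials`/`sn` field names are invented stand-ins for jnewt's schema.

```python
parts = [
    {"part": "P-1", "serials": [{"sn": "A1"}, {"sn": "A2"}]},
    {"part": "P-2", "serials": [{"sn": "B1"}]},
]

def unwind(docs, field):
    """One output document per element of `field`, like {$unwind: '$field'}."""
    out = []
    for d in docs:
        for elem in d[field]:
            flat = {k: v for k, v in d.items() if k != field}
            flat[field] = elem
            out.append(flat)
    return out

all_serials = unwind(parts, "serials")
```

Against a real server this is the pipeline `db.parts.aggregate([{$unwind: "$serials"}])`; each output document still carries its parent's fields, so the part number remains available even though the query "crossed" a level.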
[23:03:13] <jnewt> let me explain a little better maybe. i'm moving over to mongo (hopefully), because i cannot handle the data very easily in mysql (where it resides now). right now i have a table of parts, with names, descriptions, etc., as well as a table with serial numbers, notes, manufacture date, and some other stuff.
[23:03:40] <jnewt> the serials table has a fk pn,
[23:04:37] <jnewt> then there's a users table, and a users-serials map table (many to many)
[23:04:38] <eph3meral> jnewt, you may want to consider doing something like I'm trying right now - keep your data stored relationally and then use NoSQL as a denormalized caching layer for frontend/read only querying
[23:05:43] <jnewt> i need to select all the serials rows where the current user has a mapping row in the users-serials table
[23:06:48] <jnewt> eph3meral, i've hit a roadblock that will require me to either have a table of log data with a ton of columns or multiple log tables based on pn, or go to a name - value (all varchar or something) setup.
[23:07:49] <jnewt> mysql worked when there was only one part with all the same log data, now it doesn't, and the people over in #mysql pretty much confirmed that i've got a situation that could benefit from a nosql approach
[23:11:32] <eph3meral> jnewt, when all else fails paste some code
[23:12:08] <eph3meral> jnewt, if you can give us something that we can play with instantly, almost as good as jsfiddle.net, it'll be much easier for us to help
[23:13:27] <eph3meral> jnewt, as usual, strip out irrelevant details of your actual problem domain but give us e.g. 1) a few mongo insert statements for some example data to play with, 2) some example query results you would like to achieve, 3) some example queries you are trying right now, and 4) how your example queries differ in results from what you would like
[23:13:53] <eph3meral> jnewt, it's the standard "how do I get help on IRC" principle, really... 1) show us you've done some work and 2) make it easy for us to help you
[23:14:12] <eph3meral> jnewt, I believe you have done work, it's just nearly impossible for us to connect with human worded descriptions of things
[23:14:33] <eph3meral> over a medium as terse and time consuming and easy to misinterpret as typing
[23:16:52] <jnewt> ok, i'll work out some inserts, and post in a few. but i'll have to give you the desired queries in sql or pseudo, as i cannot make the jump from the documentation to the actual problem.
[23:17:26] <eph3meral> jnewt, that's fine, do that, the more important thing is to see the results you would like to achieve and the data set you currently have
[23:18:24] <eph3meral> jnewt, presuming to know the right way to get there is also no-no number 2 of irc club - ask about what your actual end human goal is (relatively speaking here, you have a human goal of needing your results formatted in a certain way, so just tell us what you would like in the end)
[23:18:39] <eph3meral> and then if we can copy paste your insert statements we can just play with ideas for queries
[23:19:12] <eph3meral> jnewt, but as usual, it always helps to show you've done your homework and what you have tried even if it's mostly in your head
[23:19:15] <eph3meral> or pseudocode