#mongodb logs for Tuesday the 5th of November, 2013

[02:06:58] <pzuraq> in general is it a good idea to normalize hasMany relationships in a collection? Seems like it, given the cost of reallocation
[02:07:10] <pzuraq> if there is no upper bound on the number of child elements
[02:09:24] <pzuraq> Better question: When is it a good idea to not normalize a relationship?
[02:10:12] <cheeser> if you're always going to need those relationships every time you fetch the parent doc, it can make sense to nest them.
[02:10:48] <cheeser> if there are going to be a *ton* of them, it might make sense to break them out and fetch them separately.
[02:11:36] <cheeser> especially if you're going to only want a subset e.g. for pagination, a separate collection would be better
[02:12:50] <pzuraq> cheeser: So let's say I have Posts, and every post has many Comments. When listing all posts you can't see the comments, but when looking at any post you will see all comments w/o pagination, and there can theoretically be several thousand comments
[02:13:14] <cheeser> i would put the comments in a separate collection.
[02:13:21] <pzuraq> mmk
[02:13:39] <pzuraq> can you give an example of something you would nest?
[02:14:07] <cheeser> say, orders and products
[02:15:47] <pzuraq> so an order can have many products, but it's not likely to have them change much or have a huge number
[02:15:52] <pzuraq> so yeah that makes sense
[02:17:12] <cheeser> right
[02:28:55] <pzuraq> cheeser: Would you do that even if you had places elsewhere in the app where you need the products on their own?
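A minimal mongo shell sketch of the two patterns cheeser describes above; all collection and field names are hypothetical:

    // Embedded: small, bounded child sets that are always fetched with the
    // parent (the orders/products case):
    db.orders.insert({
        customer: "pzuraq",
        products: [ { sku: "A1", qty: 2 }, { sku: "B2", qty: 1 } ]
    })

    // Referenced: unbounded child sets kept in their own collection
    // (the posts/comments case), queried and paginated separately:
    db.comments.insert({ postId: 42, author: "cheeser", text: "nice post" })
    db.comments.find({ postId: 42 }).sort({ _id: 1 }).limit(20)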
[04:46:47] <mun> hi
[04:46:56] <jkitchen> hi
[04:46:56] <mun> does anyone know of a nice gui viewer for mongo?
[04:47:04] <jkitchen> gui.
[04:47:21] <jkitchen> quick look I see there's mongovue
[04:47:40] <jkitchen> http://docs.mongodb.org/ecosystem/tools/administration-interfaces/
[04:47:45] <jkitchen> there's a page with some stuff
[04:48:01] <jkitchen> fang of mongo actually looks pretty nifty
[04:49:36] <mun> is mongo3 any good?
[04:52:00] <cheeser> is what now?
[04:53:30] <mun> mongovue is windows only right...
[04:53:59] <jkitchen> mun: I'm unsure. I haven't vetted any of those suggestions, was just trying to point you in the right direction
[04:54:09] <mun> i see
[04:54:14] <mun> jkitchen, do you use any yourself?
[04:55:26] <cirwin> Will anything bad happen if I set the MaxObjectPerShard 10 times higher?
[04:55:35] <jkitchen> I haven't, no
[04:56:19] <cirwin> *MaxObjectPerChunk
[06:01:37] <brucelee1> he
[06:01:39] <brucelee1> hey
[06:01:39] <brucelee1> guys
[06:02:07] <brucelee1> i have 3 clusters but why can't one cluster work by itself, and need 2 clusters?
[06:02:13] <brucelee1> whats the point of having 3 clusters vs 2 then
[06:02:42] <brucelee1> i read that it's because of votes
[06:02:57] <joannac> Do you mean you have 3 nodes in a cluster (replica set)?
[06:03:08] <joannac> If you have 3 and lose 1, the other 2 can continue.
[06:03:22] <joannac> If you have 2 and lose 1, the other 1 can't stay up by itself
[06:03:37] <joannac> It will not accept writes
[06:04:41] <joannac> (because it can no longer see a majority of the set)
[06:04:49] <joannac> Does that answer your question?
[06:18:21] <brucelee1> joannac: yeah
[06:18:38] <brucelee1> joannac: but it makes the app unusable though since the db cant be written to
[06:18:47] <brucelee1> joannac: i read that an arbiter can be used to make the last node a primary
[06:19:16] <brucelee1> joannac: so to have a 3 node setup, where 2 can fail and still have the whole thing functioning, it would need to actually have 3 nodes, and an arbiter right?
[06:23:52] <joannac> No. To keep going after 2 failed nodes you need 5 nodes.
[06:25:08] <joannac> You need a majority of votes. 2 votes out of 4 is not a majority. 3 votes out of 5 is a majority.
[06:27:58] <joannac> brucelee1: sorry i forgot to highlight you
[06:28:16] <brucelee1> joannac: but in a 3 node cluster, it makes sense that if 2 fails, 1 is still there
[06:28:33] <brucelee1> joannac: but from what youre saying, if 2 fails, it renders the app unusable
[06:28:38] <brucelee1> (since we cant write to the db)
[06:29:04] <brucelee1> joannac: in fact this is what happens (it will fail after 2 nodes fail) in our 3 node setup
[06:29:04] <joannac> brucelee1: Correct. The remaining node can't tell the difference between the other 2 nodes dying, and a network partition.
[06:29:13] <brucelee1> joannac: right
[06:29:24] <brucelee1> joannac: so i guess it's a protection mechanism
[06:29:28] <brucelee1> joannac: but whats the point of having 3 then
[06:29:30] <joannac> For all it knows, the other 2 nodes are running happily on the other side of the network partition.
[06:29:44] <brucelee1> joannac: for the purposes of high availability for application, whats the point of going 3 nodes, vs 2
[06:29:56] <joannac> 2 will not give you any failover.
[06:30:10] <brucelee1> 2 can survive 1 node failure right?
[06:30:11] <joannac> If you have 2 and 1 dies, or there's a network partition, you will have no primary.
[06:30:16] <brucelee1> ah ok
[06:30:17] <joannac> and no writes.
[06:30:23] <brucelee1> well, what about 2 nodes and an arbiter, vs 3 nodes
[06:30:27] <joannac> With 3, if one dies the other 2 can keep going
[06:30:28] <brucelee1> the same thing right?
[06:30:39] <joannac> 2 and an arbiter is okay if you don't have the resources to run 3
[06:31:06] <brucelee1> joannac: so from 3, we have to go to 5 to survive 2 node failures
[06:31:22] <joannac> correct
[06:31:27] <brucelee1> joannac: can we create 2 arbiters in that case?
[06:31:38] <brucelee1> just to be able to have 2 node failures, is that bad practice
[06:31:41] <joannac> you can, but arbiters are non data bearing
[06:31:58] <brucelee1> yeah they contain no data, what are you implying there? (any pitfalls?)
[06:32:40] <joannac> fewer secondaries to read from / sync from, less data redundancy
[06:33:04] <brucelee1> for a 3 node cluster, i presume everybody wants it set so it can support up to 2 failures (otherwise they can just use 2 nodes and 1 arbiter)
[06:33:25] <joannac> I'm sorry, I don't understand that question
[06:33:47] <brucelee1> if resources/money is no prob, ideally everyone would run a lot of real nodes, rather than arbiters
[06:33:54] <joannac> right
[06:34:00] <brucelee1> but since it is, often times, the goal is to achieve high availability
[06:34:12] <brucelee1> and with 2 nodes, and 1 arbiter, you get the same amount of high availability as 3 nodes
[06:34:26] <joannac> But less redundancy
[06:34:31] <brucelee1> oh yeah
[06:34:35] <brucelee1> because arbiters dont replicate
[06:34:41] <joannac> Correct
[06:35:03] <brucelee1> so i presume if someone goes from 2 nodes and 1 arbiter, they get the added redundancy by upgrading to 3 nodes
[06:35:20] <brucelee1> but still the entire app is not any more highly available
[06:35:31] <joannac> umm, it totally is.
[06:35:38] <brucelee1> so with 3 nodes, would it ever not make sense to add in 2 more arbiters to make it able to withstand 2 node failures
[06:35:43] <joannac> say your secondary has a dodgy disk that dies
[06:35:50] <joannac> and you replace it.
[06:36:03] <joannac> if you have another secondary you can initial sync from the other secondary
[06:36:15] <joannac> else you need to sync from the primary AND handle your normal app load
[06:36:30] <brucelee1> joannac: i see
[06:36:30] <brucelee1> joannac: good pt
[06:36:46] <joannac> (you can restore from backup but still need to sync from backup time to current time)
[06:36:54] <brucelee1> joannac: so i guess my final question is, any reason not to add 2 more arbiters to make a 3 node able to withstand 2 nodes down
[06:37:12] <brucelee1> or is that not good practice
[06:37:18] <joannac> sure, but you have the same problem as before
[06:37:24] <hkonkhon> sorry to be such a newbie, but what's the etiquette for asking a new question? do I just type it here?
[06:37:30] <joannac> you lose 2 nodes and you have 1 node left.
[06:37:46] <jkitchen> hkonkhon: yup, ask away. don't paste to the channel, use a pastebin type service
[06:37:56] <brucelee1> joannac: losing 2, have 1 left, the other 2 need to be completely sync'd up AND take normal loads when they come up
[06:38:01] <brucelee1> joannac: thats the problem right?
[06:38:04] <joannac> you need to repair 2 data nodes from 1 remaining data node. hope the last one doesn't die. keep up with app load.
[06:38:21] <brucelee1> joannac: i see, would you do it?
[06:39:19] <brucelee1> if theres 2 arbiters and 3 nodes, theres 5 voters altogether, if one dies, theres 4 voters, unable to make even vote right?
[06:39:35] <brucelee1> unable to make majority* vote
[06:39:41] <joannac> brucelee1: like you said, sometimes you don't have the resources to run as many nodes as you want. You can make the decision about what kind of risks you're willing to take
[06:40:22] <joannac> if one dies, you have 4 votes out of 5. that's an easy majority
[06:46:10] <brucelee1> joannac: ahh
[06:46:20] <brucelee1> joannac: thanks for all the info :P
[06:46:48] <joannac> brucelee1: No problem
[06:47:18] <joannac> brucelee1: I mean, this is all very worst-case. But High Availability is pretty worst-case thinking anyway :)
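A minimal shell sketch of the topology joannac describes: two data nodes plus an arbiter give three votes, so the set survives one failure; surviving two failures needs five voting members. Hostnames here are hypothetical:

    rs.initiate({
        _id: "rs0",
        members: [
            { _id: 0, host: "db1.example.com:27017" },
            { _id: 1, host: "db2.example.com:27017" },
            { _id: 2, host: "arb1.example.com:27017", arbiterOnly: true }
        ]
    })
    rs.status()   // shows member health and which node is primary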
[06:50:35] <jblack> It's all single point of failure anyway. one meteor can take out the whole planet
[07:01:04] <jkitchen> jblack: I need synchronous mongodb replication to my DR cluster on titan
[07:10:23] <rahulkmr> What happens when I add a mongodb node as a replica to another node? I understand it ships the oplog, but will the initial sync wipe out the data on the just-added replica? I have a primary containing a db, say Foo, which I want to be synced with the secondary. The secondary already has another db (Bar) and I don't want Bar to be wiped out. I just want Foo to be synced to the secondary so that I can run both Foo and Bar from the secondary
[07:26:54] <jblack> Still a SPOF. A wandering black hole could wander into the sun
[07:28:33] <brucelee1> how would i go about upgrading the mongodb schema without affecting the application in a 3 node set up?
[07:28:42] <brucelee1> is there any way to do it without affecting application functionality?
[07:28:50] <brucelee1> i dont want to take down the application
[07:36:56] <k_sze[work]> I don't quite understand how querying works in MongoDB. Can I return only parts of a document, or query only a part of a document?
[07:38:45] <xpen> k_sze[work], yes, you can
[07:39:15] <xpen> find({query_conditions}, {fields:1, your:1, want:1})
[07:41:23] <k_sze[work]> Can I extract only the education objects that are of postgraduate-level, for example?
[07:49:53] <k_sze[work]> I guess my question is really: can I query subdocuments?
[07:50:08] <GeertJohan> Question, is GridFS often being used with filenames like "/profilepics/<username>/firstUpload.png" ?
[07:50:44] <GeertJohan> k_sze[work]: yes, http://docs.mongodb.org/manual/reference/method/db.collection.find/#query-subdocuments
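A minimal sketch of the projection syntax xpen shows above; collection and field names are hypothetical:

    // The second argument is the projection: 1 includes a field, 0 excludes
    // it. _id is returned unless explicitly excluded.
    db.people.find(
        { city: "Hong Kong" },              // query conditions
        { name: 1, education: 1, _id: 0 }   // return only name and education
    )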
[08:47:06] <st0ne2thedge> How full must a collection's allocated file become before it gets a new allocation? ^^
[08:47:29] <Derick> you mean the database files? They are per database, not collection.
[08:47:46] <Derick> The moment there is a write in filename.n, the file with filename.(n+1) gets created
[08:55:44] <st0ne2thedge> Derick: Right, so Looking at the .stats().size and comparing it to .stats().storageSize isn't going to give me an idea when the mongodb will grow in size?
[08:57:45] <Derick> st0ne2thedge: no, not likely
[09:25:58] <st0ne2thedge> Derick: How would you advise going about it then? Right now I'm looping through every collection in every database and asking for the size variables
[09:34:24] <k_sze[work]> GeertJohan: I don't think that's what I mean.
[09:34:26] <Derick> st0ne2thedge: what are you trying to accomplish?
[09:36:51] <k_sze[work]> Suppose I have a rich_people collection, each document in rich_people has an array of cars (which are subdocuments in MongoDB parlance, I suppose). Would it be possible to use MongoDB API to first get Bill Gates out of the rich_people collection, and then return the blue cars that Bill Gates owns?
[09:37:30] <Derick> k_sze[work]: an example document on pastebin helps a lot
[09:38:11] <st0ne2thedge> Derick: I'd like to be able to calculate how full the allocated physical file is with actual data. That way I'd be able to guess when a new allocation will occur, making me able to guess when my partition will be full ^^
[09:39:07] <st0ne2thedge> Derick: From what I had understood reading some documentation, I was convinced looking into the stats() of the collections would do the trick :P I appear to have been wrong ^^
[09:39:38] <Derick> it's difficult to guess file allocation as fragmentation of data files plays a big role in that
[09:40:15] <Derick> if you're not in a really important production environment, then you can start mongodb without preallocation
[09:40:27] <Derick> then just make sure you always have >2GB free on disk and you'd be fine
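For reference, the size numbers being discussed can be pulled from the shell like this (a sketch; the collection name is hypothetical):

    db.stats()          // per-database: dataSize, storageSize, fileSize
    db.mycoll.stats()   // per-collection: size (data) vs. storageSize (allocated)
    // fileSize covers the preallocated data files, so it can sit well above
    // storageSize; as Derick notes, fragmentation makes growth hard to predict.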
[09:42:08] <NodeX> k_sze[work] : you've said that your rich_people document has a sub document so you already have the list of cars he owns
[09:46:56] <k_sze[work]> http://pastie.org/8456571
[09:47:47] <Derick> k_sze[work]: better to store "Bill Gates" with each cat.
[09:47:49] <Derick> car*
[09:48:09] <k_sze[work]> Effectively flattening it?
[09:48:23] <Derick> yes, you can't (effectively) return only one of the subdocuments
[09:48:39] <Derick> you tend to always get the whole document back (with or without some top-level fields)
[09:48:49] <k_sze[work]> Wouldn't that be ... inefficient, storage space wise?
[09:49:00] <k_sze[work]> Or are mongo users generally not concerned about storage space?
[09:49:03] <Derick> k_sze[work]: yes, but it'd be a lot easier to query
[09:49:08] <Derick> hard disk space is cheap :-)
[09:49:15] <NodeX> k_sze[work] : a document is more like a Row in a traditional RDBMS, you seem to be treating it more like a table
[09:49:17] <Derick> it's a trade-off of course
[09:50:55] <tilya> good afternoon. is this the right place to ask questions about mongodb administration?
[09:51:25] <k_sze[work]> I thought the whole idea about schemaless document-based database is that each document would be self contained.
[09:52:04] <NodeX> k_sze[work] : each doc is self contained - however your pastebin describes ONE document
[09:52:11] <NodeX> tilya : best to just ask
[09:53:30] <k_sze[work]> NodeX: but by taking the cars out of Bill Gates, and flattening it, I would need to query multiple collections to reconstruct a profile of Bill Gates.
[09:53:46] <k_sze[work]> (maybe I completely misunderstand the definition of 'self contained'?)
[09:53:48] <NodeX> I didn't say anything about flattening anything
[09:54:10] <tilya> i have a problem since 2.4.6. i have three nodes in replicaset, primary, secondary and arbiter, with authentication enabled. for the last few versions unfortunately, there's a problem with authentication on the arbiter node. is there a way to solve that, or is it on purpose?
[09:54:24] <k_sze[work]> NodeX: are we even saying 'flatten' to mean the same thing? :P
[09:54:46] <NodeX> you probably mean keep "cars" in a separate collection and do a "join" yes?
[09:55:09] <Derick> NodeX: no, what I suggested was to put bill gates on each of the cars.
[09:55:18] <NodeX> you currently have the right concept, you're just doing it one level too deep
[09:55:28] <NodeX> Derick : I wasn't suggesting you were
[09:55:32] <Derick> k_sze[work]: yes, a whole profile belongs together, but if a query shows that it's not efficient, change it.
[09:56:20] <Derick> NodeX: but yes, I do see a document level too much in there. Perhaps just badly pastebinned though... as the terminology that k_sze[work] uses in it is right.
[09:56:45] <NodeX> Derick : it's one document though - not an array of results from a find();
[09:56:47] <Derick> tilya: arbiters don't store data, so why does it need auth?
[09:57:03] <Derick> k_sze[work]: is what NodeX says true?
[09:57:16] <NodeX> else he would already have the cars [] ;)
[09:57:25] <k_sze[work]> rich_people is a collection in the silicon_valley database
[09:57:28] <NodeX> (when iterating each doc)
[09:57:46] <k_sze[work]> so you would go: use silicon_valley; db.rich_people.find(/* etc */);
[09:57:53] <Derick> k_sze[work]: so that's spot on
[09:58:02] <Derick> NodeX: no, just finding his blue cars is tricky
[09:58:08] <NodeX> k_sze[work] : please pastebin db.rich_people.findOne();
[09:59:09] <NodeX> personally I would store the cars like this. cars : [{name:"Mini",color:"blue"}]
[09:59:22] <Derick> NodeX: that's what he does :P
[09:59:28] <NodeX> not in his docs
[09:59:41] <Derick> yes he does, he just pastebinned it awkwardly
[09:59:45] <NodeX> I am really blind today my mistake
[09:59:59] <Derick> pretend line 5,6, 18 and 20 don't exist
[10:00:04] <NodeX> hahaha
[10:00:16] <NodeX> right, now I forgot the original question
[10:00:18] <Derick> k_sze[work]: let that be a lesson for you too, don't wrap it in an extra layer in examples :P
[10:00:23] <tilya> Derick: it seemed as a good idea for monitoring reachability over network, back in times it worked :)
[10:00:40] <k_sze[work]> ok. :D
[10:00:49] <Derick> tilya: I don't think it worked since 2.4.0
[10:00:53] <NodeX> ok, in this instance you need to do a compound query
[10:01:19] <NodeX> you can use positional operator to return parts of the array you're interested in
[10:01:50] <NodeX> http://docs.mongodb.org/manual/reference/operator/projection/positional/ <---
[10:01:53] <tilya> Derick: it somehow worked until 2.4.6. but well, if that's intended and not a bug, then i will dump this dumb monitoring and do something else. thanks.
[10:03:00] <NodeX> db.people.find({name:"bill","cars.color":'Blue'},{"cars.$":1});
[10:03:05] <NodeX> or something along those lines
[10:06:36] <Derick> NodeX: does that work, just cars.$ ?
[10:07:59] <NodeX> it's either cars.$ or cars.color.$, I can't remember
[10:08:15] <NodeX> hence the "or something along those lines" haha
[10:08:25] <Derick> :-)
[10:12:56] <st0ne2thedge> Derick: Can one run an existing database without preallocation? Is there any danger to existing data? We are currently storing our packaging in the database (pulpproject)
[10:13:39] <Derick> no, I don't think that is a problem
[10:18:12] <st0ne2thedge> Derick: you said you'd advise thresholding the mongodb's partition at >2G remaining space right? ^^
[10:21:09] <k_sze[work]> Wouldn't cars.$ return only the first blue car?
[10:25:41] <k_sze[work]> going home. might check back later.
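k_sze[work]'s suspicion is correct: positional projection returns only the first matching array element. A sketch against the pastebinned document shape (values hypothetical):

    // Returns Bill Gates with only the FIRST blue car in the cars array:
    db.rich_people.find(
        { name: "Bill Gates", "cars.color": "blue" },
        { "cars.$": 1 }
    )

    // To get ALL blue cars, the aggregation framework can unwind the array:
    db.rich_people.aggregate([
        { $match: { name: "Bill Gates" } },
        { $unwind: "$cars" },
        { $match: { "cars.color": "blue" } }
    ])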
[10:40:17] <gregorm> hey
[10:56:23] <Number6> Shun the non-believer... SHUNNNNNN
[10:57:52] <gregor0> lol
[10:58:03] <gregor0> i'm a believer :)
[10:58:09] <gregor0> i work for mongodb support :)
[11:00:31] <gregor0> hey derick :)
[11:00:32] <Derick> gregor0: real believers don't have "Adium User" as their ircname
[11:01:51] <robertjpayne> the "auth" setting isn't affected in any way by latency or network conditions, is it? I'm struggling to get it working on a remote mongo but it works just fine in a local vagrant mongo
[11:09:25] <robertjpayne> Gah, mongodb host always has to be an IP address, not a domain name, it seems?
[11:09:36] <Derick> no
[11:09:43] <Derick> please use hostname
[11:10:22] <robertjpayne> yea hmm, now trying to use my vagrant to connect to the one on the server, it only works with IP, but that may be because of network bridging and sorts
[11:11:05] <Derick> yeah...or a wrong DNS/host configuration
[11:11:27] <robertjpayne> SSH works fine
[11:11:36] <robertjpayne> firewall is turned off
[11:11:36] <rickogden> haha Derick, sorry for the delay, got distracted by real work
[11:11:56] <rickogden> does anyone know of any MongoDB-as-a-service providers they recommend for open source projects?
[11:12:41] <robertjpayne> rickogden: MongoHQ/MongoLab are the two big ones -- not sure they give any discounts for OSS
[11:13:32] <rickogden> robertjpayne: they both have free tiers though, which will do for now
[11:13:43] <robertjpayne> rickogden: yup :)
[11:13:47] <rickogden> just wondered if there were any companies I was missing
[11:14:09] <robertjpayne> rickogden: I'm sure there are others too but those are the two big ones I know of
[11:53:23] <robertjpayne> with the new role based auth is it possible to allow a single action?
[11:53:30] <robertjpayne> IE allow ["find"]
[11:55:55] <kamal_> afaik, no
[12:16:10] <HashMap> Hi there people.. I am just curious, I am doing the MongoDB for DBAs course and currently there is a week about sharding.. And my question is, for example there are three config servers.. does this usually in production mean three separate physical machines?
[12:18:44] <robertjpayne> HashMap: Sharding normally will happen across separate physical machines to gain more durability in the event of hardware failure.
[12:19:11] <robertjpayne> HashMap: That's not to say it's possible to run the shards in LXC or similar virtualized containers
[12:19:18] <robertjpayne> impossible*
[12:20:48] <HashMap> I see, thanks..
[12:20:59] <st0ne2thedge> I'm reading up on mongodb writes in mongodb v2.0, where as soon as a write has been buffered in the outgoing socket buffer of the client host, the insert is 'completed'. Is this fixed in later releases?
[13:03:46] <st0ne2thedge> right it is... up to the level of one of the mirroring servers' memory?
[13:05:17] <kali> st0ne2thedge: look for the writeconcern options in your client library
[13:05:36] <kali> st0ne2thedge: what you describe used to be the default
[13:05:47] <kali> st0ne2thedge: but it's been changed for about one year
[13:16:48] <st0ne2thedge> kali: Im reading through the release notes of 2.2 but having trouble finding anything about WriteConcern though
[13:18:33] <kali> st0ne2thedge: on the client side, and it was more at the time of 2.4
[13:18:52] <kali> st0ne2thedge: http://docs.mongodb.org/manual/release-notes/drivers-write-concern/
[13:20:36] <gregorm> http://docs.mongodb.org/v2.2/release-notes/drivers-write-concern/
[13:23:04] <kali> indeed, it's somewhere in between 2.2 and 2.4
[13:25:04] <st0ne2thedge> kali, gregorm: thx for the links!
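A sketch of the mechanism kali points at: the acknowledgement level is the write concern, set per operation or per connection in the drivers. In the 2.2-era shell the underlying pattern looks like this (collection name hypothetical):

    db.things.insert({ x: 1 })
    // Block until the write is acknowledged; w controls how many replica set
    // members must have it, j:true additionally waits for the journal:
    db.runCommand({ getLastError: 1, w: "majority", j: true, wtimeout: 5000 })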
[13:57:09] <alFReD-NSH> Anyone here know what might cause "Assertion: 10307:Client Error: bad object in message: invalid bson type" on the node js driver?
[17:48:52] <JEisen> Using collMod() to set usePowerOf2Sizes is a non-blocking operation on a primary, right?
[19:18:15] <ivica> hello everyone. is this the right place to ask a question about pymongo?
[19:44:35] <astropirate> I just added an Index on a field, now my queries return 0 documents
[19:44:45] <astropirate> nothing in the query has been changed
[19:45:26] <astropirate> wut broke
[19:46:11] <astropirate> :\
[19:46:17] <astropirate> i restarted mongod and it works now
[19:46:19] <astropirate> silly
[19:46:47] <astropirate> nope
[19:46:48] <astropirate> worked
[19:46:50] <astropirate> just stopped
[19:47:35] <astropirate> mongodb y u not work?!
[19:48:28] <astropirate> what the hell
[19:48:39] <astropirate> the first query works right after mongod restart
[19:48:47] <astropirate> but subsequent queries return 0 values
[19:49:16] <astropirate> Tue Nov 5 12:46:42.358 [conn39] query myDB.rankings query: { $query: { poll: ObjectId('51daef3076135c762d030238') }, orderby: { rank.weighted: 1 } } ntoreturn:100 ntoskip:0 nscanned:2781 scanAndOrder:1 keyUpdates:0 locks(micros) r:8222 nreturned:100 reslen:23671 8ms
[19:49:54] <astropirate> Tue Nov 5 12:46:48.771 [conn30] query myDB.rankings query: { $query: { poll: ObjectId('51daef3076135c762d030238') }, orderby: { rank.weighted: 1 } } ntoreturn:100 ntoskip:0 nscanned:0 keyUpdates:0 locks(micros) r:86 nreturned:0 reslen:20 0ms
[19:50:15] <astropirate> any ideas
[19:52:48] <astropirate> ECHO... Echo... echo... ooo
[20:01:01] <tystr> What are some metrics I can look at to see if I need to look into sharding?
[20:02:07] <JEisen> Lock percentage going up and if your working memory set is close to physical mem are your two biggest indicators.
[20:02:29] <JEisen> i.e. page faults growing
[20:02:52] <tystr> ya we had some issues the other week with 100s of page faults and disk io maxing out
[20:03:26] <tystr> but the servers weren't configured correctly with the suggested optimizations (ulimit, readahead, etc)
[20:03:40] <tystr> so I fixed those problems and things have been running smoothly
[20:03:49] <JEisen> you'll want to be really careful about whether sharding makes sense for your app in general. there are some major caveats.
[20:03:56] <tystr> right
[20:04:28] <JEisen> I would recommend upgrading the servers before sharding if that's an option, unless it really makes sense.
[20:04:32] <tystr> I'm just trying to understand if the issues we had were soley related to the misconfiguration, or if we're beginning to reach a bottleneck
[20:05:21] <tystr> how can I see the size of the working memory set/
[20:05:22] <tystr> ?
[20:05:27] <JEisen> keep an eye on those two measurements. :)
[20:05:51] <JEisen> it'll be in serverStatus() as of 2.4
[20:06:18] <tystr> running 2.2.3 on these
[20:06:27] <JEisen> keep an eye on your page faults.
[20:07:00] <tystr> I've got a few spikes over 100
[20:07:41] <JEisen> how big is the DB, how much RAM?
[20:07:44] <astropirate> Anyone know what is going on with my problem? I couldn't drop the index, it says the db is corrupt. I did a db repair, again it worked for the first query and then returns 0 values for subsequent queries
[20:07:48] <tystr> 64G ram on these
[20:08:47] <tystr> "dataSize" : 881968,
[20:09:13] <tystr> er, wait
[20:09:32] <JEisen> what scale? :)
[20:09:40] <joannac> astropirate: you run the exactsame query twice in a row, and the first time you get results, the second you don't?
[20:09:50] <astropirate> joannac, exactly
[20:09:51] <tystr> that was the wrong db
[20:09:53] <astropirate> until i restart mongod
[20:09:59] <tystr> but the docs don't say what scale that is heh
[20:10:08] <tystr> oh im dumb
[20:10:53] <joannac> astropirate: erm. do a count before and after?
[20:11:03] <astropirate> joannac, will do
[20:12:00] <JEisen> looks like the default is bytes
[20:12:02] <joannac> astropirate: If you really did nothing in between... i dunno, let me think about it.
[20:12:14] <astropirate> joannac, I just did db.repairDatabase() again
[20:12:18] <astropirate> and this time seems to have fixed it
[20:12:27] <astropirate> it isn't a deterministic operation? haha
[20:12:35] <astropirate> nothing new was inserted or removed from db
[20:12:40] <tystr> JEisen our total data is < 1 TB
[20:13:05] <joannac> astropirate: smells fishy.
[20:13:12] <astropirate> joannac, extremely
[20:15:46] <JEisen> tystr: Depending on your usage patterns, yeah, sharding might be valuable. But I'm definitely not saying "Go shard."
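The two indicators JEisen mentions can be watched from the shell (a sketch; the workingSet estimator is 2.4+, which is why it is unavailable on tystr's 2.2.3 nodes):

    db.serverStatus().extra_info.page_faults        // cumulative page faults (Linux)
    db.serverStatus({ workingSet: 1 }).workingSet   // working set estimate, 2.4+
    // lock percentage is easiest to watch live with the mongostat CLI tool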
[20:20:12] <J-Gonzalez> hello everyone
[20:20:28] <J-Gonzalez> I'm trying to figure out the best way to organize an app I'm building
[20:21:14] <azathoth99> common lisp coreserver or www.happstack.com
[20:21:29] <J-Gonzalez> I've got an events collection, with many events that get created by users
[20:21:58] <J-Gonzalez> these events have options for purchasing tickets. When a user purchases tickets, an order gets created in an order collection
[20:22:28] <J-Gonzalez> the one thing I can't seem to decide on is where a TICKET should be created (think like a hard ticket when you go to sporting events or concerts)
[20:22:49] <J-Gonzalez> Should tickets be their own collection that has a reference to the order and event
[20:23:10] <J-Gonzalez> or should tickets live as sub document within each event
[20:23:15] <JEisen> how often are you going to be querying on just the tickets vs. as part of the order,for example?
[20:23:56] <J-Gonzalez> The tickets get queried semi-often, during event checkins at different venues
[20:24:26] <JEisen> you'll need to weigh the disadvantage of having multiple round-trip queries vs. having to transfer/sort through a lot of unrelated data.
[20:25:35] <JEisen> but, say, if when you get the ticket you also want to know the order that generated it and so would use that data anyway… that might not be so bad.
[20:26:38] <J-Gonzalez> The orders don't necessarily get queried as often, and are more for accounting purposes, say a user wants to see all their past orders
[20:27:27] <JEisen> that sounds pretty separate to me, based on what you've said.
[20:28:12] <tystr> JEisen ya I've been sternly cautioned about sharding hehe
[20:28:54] <J-Gonzalez> yea the orders are pretty separate - basically, we just need the easiest way to pull up tickets when customers bring them in. And either having them as subdocs within the event that is being checked in, or querying the entire tickets collection
[20:29:59] <JEisen> my perspective is, don't use subdocs unless you automatically want that data when you get the parent data. but that may be hardline.
[20:30:45] <J-Gonzalez> hmm ok
[20:31:56] <tystr> or if your subdocs will grow in size a lot
[20:32:09] <tystr> JEisen thanks for the feedback btw
[20:33:13] <JEisen> anyone happen to know if collMod() is blocking or not? (looking at turning on usePowerOf2Sizes)
[20:38:46] <J-Gonzalez_> Thanks for your help JEisen
[20:38:55] <JEisen> np
[20:39:00] <J-Gonzalez_> I'm going to try the safe collection route for now
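A minimal sketch of that separate tickets collection, with references back to the order and event and an index for venue check-ins; all names are hypothetical:

    db.tickets.insert({
        eventId: ObjectId(),   // would come from the real event document
        orderId: ObjectId(),   // would come from the real order document
        holder: "J-Gonzalez",
        checkedIn: false
    })
    // make check-in lookups at the venue fast:
    db.tickets.ensureIndex({ eventId: 1, checkedIn: 1 })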
[20:39:25] <vicTROLA> Does anyone have any advice on best practices for paging through large datasets and 'maintaining state' (for lack of a better phrase)? My app is supposed to take the user through potentially thousands of individual documents. I'm unsure how to preserve offset state in the next request to avoid showing duplicates
[20:40:00] <vicTROLA> I'm thinking about storing limit skip offsets in cache and modifying them on every request. Is there a better way?
[20:40:20] <vicTROLA> or is there a way to 'freeze' the cursor and re-instantiate it on demand?
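vicTROLA's question goes unanswered in the log; one common pattern (an assumption, not from the discussion) is range-based paging on _id rather than storing skip offsets, since cursors time out and can't be frozen and re-instantiated:

    // The client remembers the last _id it saw (e.g. in the session) instead
    // of a numeric offset; _id order is stable, so nothing is duplicated or
    // missed as the user pages forward.
    var lastSeenId = ObjectId("000000000000000000000000")   // from the previous page
    db.docs.find({ _id: { $gt: lastSeenId } }).sort({ _id: 1 }).limit(50)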
[20:41:01] <joannac> JEisen: I doubt it, it only changes future allocations.
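For reference, the command JEisen is asking about (collection name hypothetical); per joannac it only changes future record allocations:

    db.runCommand({ collMod: "mycoll", usePowerOf2Sizes: true })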
[21:03:52] <liquid-silence> I see the mongodb-native does node readable streams
[21:04:17] <liquid-silence> can anyone explain this to me, as it seems super slow on seeking
[21:07:14] <astropirate> liquid-silence, what kind of query are you running?
[21:07:26] <liquid-silence> gridfs file
[21:07:29] <liquid-silence> and then doing a seek
[21:10:08] <liquid-silence> something like this
[21:10:08] <liquid-silence> https://gist.github.com/psi-4ward/7099001
[21:10:24] <liquid-silence> hitting the range header code
[21:19:28] <flatr0ze> I'm storing binary strings in mongo (both files and encrypted text), and getting console.log() output of "\u0012\u0032..." mixed with glyphs and symbols in the output. Can it be that it's just the output, and I'm not really saving it as UTF-16? I'm really trying to save space and store only bytes, no Unicodes... Using Node.js w/ MeteorJS framework, GNU/Linux.
[21:22:07] <cirwin> flatr0ze: unfortunately that's not going to work very reliably, I don't think the node mongo driver has good support for buffers
[21:22:41] <cirwin> you should also be careful because not all byte strings are technically valid unicode, but as far as I'm aware neither mongo nor node enforce that strictly
[21:23:45] <flatr0ze> cirwin: you think there's any way to store just a byte array? like in the good ol' times? I'm getting pretty sick of having utf where I don't need it at all
[21:25:36] <cirwin> flatr0ze: I think there's a way of putting raw data into BSON, but I've not tried super-hard
[21:25:48] <flatr0ze> I see
[21:26:48] <flatr0ze> thanks, cirwin I'll research the raw data insertion more!
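BSON does have a native binary type, which is what cirwin is gesturing at; from the shell it looks like this (a sketch, names hypothetical):

    // BinData(subtype, base64payload) stores raw bytes, no UTF encoding applied:
    db.blobs.insert({ name: "hello.bin", data: BinData(0, "SGVsbG8gd29ybGQ=") })
    db.blobs.findOne()   // data comes back as BinData, not a UTF-16 string
    // in Node, the driver maps this type to/from its Binary BSON class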
[21:28:22] <liquid-silence> cirwin care to look at my issue?
[21:28:43] <liquid-silence> https://gist.github.com/psi-4ward/7099001
[21:28:48] <liquid-silence> the performance is really bad
[21:29:01] <liquid-silence> takes like 5 minutes before it actually starts sending other ranges
[21:37:04] <ramsey> Derick: what's the alternative to using setSlaveOkay() in the PHP Mongo driver? We upgraded and now we're getting deprecation notices.
[21:39:48] <tavooca__> mongodb example python mapbox
[21:40:58] <Derick> ramsey: one sec
[21:41:20] <Derick> ramsey: on what object, mongocursor or mongocollection, or?
[21:41:37] <ramsey> Derick: mongoclient
[21:41:52] <Derick> http://us1.php.net/manual/en/mongoclient.setreadpreference.php
[21:42:14] <Derick> $client->setReadPreference(MongoClient::RP_SECONDARY_PREFERRED);
[21:42:32] <ramsey> Derick: thanks. I think our problem is that we don't understand what the read preference should be :-)
[21:42:35] <ramsey> is that it?
[21:42:39] <Derick> yup
[21:42:49] <ramsey> Thanks a bunch!
[21:42:52] <Derick> be careful with what you use that for
[21:42:56] <Derick> there is replication lag
[21:43:43] <liquid-silence> @Derick what is the status on gridstream?
[21:43:48] <liquid-silence> for native driver?
[21:44:03] <Derick> liquid-silence: "native driver" ?
[21:44:12] <liquid-silence> node native
[21:44:19] <Derick> i've no idea
[21:45:07] <liquid-silence> who here works on it?
[21:45:35] <Derick> he's not on here much I think
[21:45:45] <liquid-silence> hmm
[21:45:50] <liquid-silence> having a major issue
[21:46:01] <liquid-silence> going to try the 1.4 release but don't have much hope
[22:33:45] <tab1293> I'm trying to insert an integer value of 5368709120 but whenever I try to insert it into a field it stores in the db as 1073741823
[22:33:56] <tab1293> Im using the php drivers
[22:36:07] <azathoth99> sorcery!!
[22:36:23] <Derick> tab1293: you're using a 32bit PHP build?
[22:36:39] <joannac> That looks bigger than the largest 32-bit int
[22:36:40] <Derick> it can't deal with numbers larger than 2048000000 or so
[22:36:44] <Derick> right
[22:36:47] <angasulino> what's your driver version?
[22:36:49] <tab1293> Derick: I didn't think I was
[22:37:01] <Derick> no no, Derick needs to go sleep
[22:38:49] <tab1293> angasulino: its 1.4.1
[22:39:01] <bjori> tab1293: php -r 'var_dump(PHP_INT_MAX);'
[22:39:08] <bjori> or are you on windows?
[22:39:24] <tab1293> int(9223372036854775807)
[22:39:26] <tab1293> bjori:
[22:39:31] <bjori> interesting, thats 64
[22:39:37] <bjori> what about your mongod?
[22:39:55] <tab1293> im running 64 bit ubuntu so im assuming thats 64 as well
[22:39:59] <tab1293> i installed it with apt
[22:40:27] <tab1293> its weird cause if I insert the number with mongo shell it works, but with php it doesn't
[22:40:38] <bjori> can I see that actual PHP code?
[22:40:56] <tab1293> yeah hold on
[22:41:33] <tab1293> http://pastie.org/8458297
[22:41:47] <tab1293> its in the insert function of that class
[22:43:07] <bjori> interesting
[22:43:29] <bjori> tab1293: could you enable the context logging and see what we send over?
[22:43:38] <bjori> or wireshark for that matter, if it's not ssl
[22:44:23] <tab1293> bjori: I am connecting to the server over ssh so wireshark may be a pain
[22:44:33] <tab1293> how can I enable context logging?
[22:48:13] <bjori> tab1293: 1min
[22:50:29] <bjori> tab1293: http://pastie.org/8458312
[22:50:40] <bjori> tab1293: see the modified constructor, makeCTX and log_insert
[22:51:56] <tab1293> okay, where is that being logged to
[22:52:21] <tab1293> and also could this be the issue http://php.net/manual/en/mongo.configuration.php the native long variable
[22:53:56] <bjori> oh, I just var_dump()ed it, so it'll be there somewhere :)
[22:54:23] <bjori> and for native long.. I always get confused by that voodoo
[22:54:26] <bjori> try it :)
[22:56:43] <bjori> you are right!
[22:57:31] <bjori> http://pastebin.com/6tr2jG0n
[22:57:39] <bjori> tab1293: so.. set that setting to 1 :D
[23:23:18] <tab1293> bjori: okay i will try it
[23:25:22] <tab1293> bjori: where do i set that to 1 though? php.ini?
[23:30:20] <bjori> tab1293: in your php.ini file, or mongo.ini file if your distro splits extensions like that
[23:30:23] <bjori> tab1293: check phpinfo()
[23:30:53] <bjori> tab1293: it'll tell you which php.ini is loaded, and if other ini files are loaded. if you see a "mongo.ini" then use that, otherwise just add it to the main php.ini
[23:32:10] <tab1293> ok cool thank you!
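Once mongo.native_long = 1 is in effect, the 64-bit value should survive the round trip; a quick sanity check from the mongo shell (collection name hypothetical), which always stores large integers as 64-bit longs:

    db.numbers.insert({ value: NumberLong("5368709120") })
    db.numbers.findOne()   // value should read back as NumberLong("5368709120"),
                           // not the truncated 1073741823 seen earlier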
[23:32:23] <tab1293> bjori: any chance you work for mongo?
[23:51:06] <bjori> tab1293: I do
[23:54:16] <tab1293> bjori: Do you know anything in regards to when summer intern applicants should hear back from you guys? I just put in an application last week