[02:06:58] <pzuraq> in general, is it a good idea to normalize hasMany relationships into a separate collection? It seems like it would be, given the cost of reallocation
[02:07:10] <pzuraq> if there is no upper bound on the number of child elements
[02:09:24] <pzuraq> Better question: When is it a good idea to not normalize a relationship?
[02:10:12] <cheeser> if you're always going to need those relationships every time you fetch the parent doc, it can make sense to nest them.
[02:10:48] <cheeser> if there are going to be a *ton* of them, it might make sense to break them out and fetch them separately.
[02:11:36] <cheeser> especially if you're only going to want a subset, e.g. for pagination; a separate collection would be better
[02:12:50] <pzuraq> cheeser: So let's say I have Posts, and every post has many Comments. When listing all posts you can't see the comments, but when looking at any post you will see all comments w/o pagination, and there can theoretically be an upper limit of several thousand comments
[02:13:14] <cheeser> i would put the comments in a separate collection.
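A minimal sketch (mongo shell; collection and field names are hypothetical) of the layout cheeser suggests, with comments in their own collection holding a reference back to the post:

    db.posts.insert({ title: "Hello world", body: "..." });
    var post = db.posts.findOne({ title: "Hello world" });
    // each comment references its parent post instead of being embedded in it
    db.comments.insert({ postId: post._id, author: "alice", text: "First!" });
    // an index on postId keeps "all comments for this post" cheap even at several thousand comments
    db.comments.ensureIndex({ postId: 1 });
    db.comments.find({ postId: post._id });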
[06:18:38] <brucelee1> joannac: but it makes the app unusable, since the db can't be written to
[06:18:47] <brucelee1> joannac: i read that an arbiter can be used to make the last node a primary
[06:19:16] <brucelee1> joannac: so to have a 3 node setup where 2 can fail and still have the whole thing functioning, it would need to actually have 3 nodes and an arbiter, right?
[06:23:52] <joannac> No. To keep going after 2 failed nodes you need 5 nodes.
[06:25:08] <joannac> You need a majority of votes. 2 votes out of 4 is not a majority. 3 votes out of 5 is a majority.
[06:27:58] <joannac> brucelee1: sorry, i forgot to highlight you
[06:28:16] <brucelee1> joannac: but in a 3 node cluster, it makes sense that if 2 fail, 1 is still there
[06:28:33] <brucelee1> joannac: but from what you're saying, if 2 fail, it renders the app unusable
[06:28:38] <brucelee1> (since we can't write to the db)
[06:29:04] <brucelee1> joannac: in fact this is what happens (it will fail after 2 nodes fail) in our 3 node setup
[06:29:04] <joannac> brucelee1: Correct. The remaining node can't tell the difference between the other 2 nodes dying and a network partition.
[06:31:27] <brucelee1> joannac: can we create 2 arbiters in that case?
[06:31:38] <brucelee1> just to be able to tolerate 2 node failures; is that bad practice?
[06:31:41] <joannac> you can, but arbiters are non-data-bearing
[06:31:58] <brucelee1> yeah they contain no data, what are you implying there? (any pitfalls?)
[06:32:40] <joannac> fewer secondaries to read from / sync from, less data redundancy
[06:33:04] <brucelee1> for a 3 node cluster, i presume everybody wants it set so it can support up to 2 failures (otherwise they could just use 2 nodes and 1 arbiter)
[06:33:25] <joannac> I'm sorry, I don't understand that question
[06:33:47] <brucelee1> if resources/money is no prob, ideally everyone would run a lot of real nodes, rather than arbiters
[06:36:46] <joannac> (you can restore from backup but still need to sync from backup time to current time)
[06:36:54] <brucelee1> joannac: so i guess my final question is, is there any reason not to add 2 more arbiters to make a 3 node set able to withstand 2 nodes down?
[06:37:12] <brucelee1> or is that not good practice?
[06:37:18] <joannac> sure, but you have the same problem as before
[06:37:24] <hkonkhon> sorry to be such a newbie, but what's the etiquette for asking a new question? do I just type it here?
[06:37:30] <joannac> you lose 2 nodes and you have 1 node left.
[06:37:46] <jkitchen> hkonkhon: yup, ask away. don't paste to the channel, use a pastebin type service
[06:37:56] <brucelee1> joannac: losing 2, you have 1 left; the other 2 need to be completely re-synced AND take normal load when they come back up
[06:38:01] <brucelee1> joannac: that's the problem, right?
[06:38:04] <joannac> you need to repair 2 data nodes from 1 remaining data node. hope the last one doesn't die. keep up with app load.
[06:38:21] <brucelee1> joannac: i see, would you do it?
[06:39:19] <brucelee1> if there are 2 arbiters and 3 nodes, there are 5 voters altogether; if one dies, there are 4 voters, unable to make an even vote, right?
[06:39:35] <brucelee1> unable to make a majority* vote
[06:39:41] <joannac> brucelee1: like you said, sometimes you don't have the resources to run as many nodes as you want. You can make the decision about what kind of risks you're willing to take
[06:40:22] <joannac> if one dies, you have 4 votes out of 5. that's an easy majority
[06:47:18] <joannac> brucelee1: I mean, this is all very worst-case. But High Availability is pretty worst-case thinking anyway :)
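A minimal sketch (mongo shell; hostnames are hypothetical) of the setup brucelee1 is describing: two arbiters added to a 3-data-node replica set, giving 5 voters so a primary can still be elected after two data nodes fail:

    // run on the current primary; arbiters vote in elections but hold no data
    rs.addArb("arb1.example.com:27017");
    rs.addArb("arb2.example.com:27017");
    rs.status();   // should now list 5 members: 3 data-bearing nodes and 2 arbiters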
[06:50:35] <jblack> It's all single point of failure anyway. one meteor can take out the whole planet
[07:01:04] <jkitchen> jblack: I need synchronous mongodb replication to my DR cluster on titan
[07:10:23] <rahulkmr> What happens when I add a mongodb node as a replica to another node? I understand it ships the oplog, but will the initial sync wipe out the data on the just-added replica? I have a primary containing a db, say Foo, which I want synced to the secondary. The secondary already has another db (Bar) and I don't want Bar to be wiped out. I just want Foo to be synced to the secondary so that I can run both Foo and Bar from the secondary
[07:26:54] <jblack> Still a SPOF. A wandering black hole could drift into the sun
[07:28:33] <brucelee1> how would i go about upgrading a mongodb schema without affecting the application in a 3 node setup?
[07:28:42] <brucelee1> is there any way to do it without affecting application functionality?
[07:28:50] <brucelee1> i dont want to take down the application
[07:36:56] <k_sze[work]> I don't quite understand how querying works in MongoDB. Can I return only parts of a document, or query on only a part of a document?
[08:47:06] <st0ne2thedge> How full must a collection's allocated file become before it gets a new allocation? ^^
[08:47:29] <Derick> you mean the database files? They are per database, not collection.
[08:47:46] <Derick> The moment there is a write in filename.n, the file with filename.(n+1) gets created
[08:55:44] <st0ne2thedge> Derick: Right, so looking at the .stats().size and comparing it to .stats().storageSize isn't going to give me an idea of when mongodb will grow in size?
[09:25:58] <st0ne2thedge> Derick: How would you advise going about it then? Right now I'm looping through every collection in every database and asking for the size variables
[09:34:24] <k_sze[work]> GeertJohan: I don't think that's what I mean.
[09:34:26] <Derick> st0ne2thedge: what are you trying to accomplish?
[09:36:51] <k_sze[work]> Suppose I have a rich_people collection, each document in rich_people has an array of cars (which are subdocuments in MongoDB parlance, I suppose). Would it be possible to use MongoDB API to first get Bill Gates out of the rich_people collection, and then return the blue cars that Bill Gates owns?
[09:37:30] <Derick> k_sze[work]: an example document on pastebin helps a lot
[09:38:11] <st0ne2thedge> Derick: I'd like to be able to calculate how full the allocated physical file is with actual data. That way I could guess when a new allocation will occur, which would let me guess when my partition will be full ^^
[09:39:07] <st0ne2thedge> Derick: From what I had understood reading some documentation, I was convinced that looking into the stats() of the collections would do the trick :P I appear to have been wrong ^^
[09:39:38] <Derick> it's difficult to guess file allocation as fragmentation of data files plays a big role in that
[09:40:15] <Derick> if you're not in a really important production environment, then you can start mongodb without preallocation
[09:40:27] <Derick> then just make sure you always have >2GB free on disk and you'd be fine
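A minimal sketch (mongo shell, MMAPv1-era stats fields) of the per-database loop st0ne2thedge describes, comparing actual data size with the space already allocated on disk:

    db.adminCommand({ listDatabases: 1 }).databases.forEach(function (d) {
        var s = db.getSiblingDB(d.name).stats();
        // dataSize: bytes of actual data; storageSize: space allocated to collections;
        // fileSize: total size of the database's data files on disk
        print(d.name + ": dataSize=" + s.dataSize +
              ", storageSize=" + s.storageSize +
              ", fileSize=" + s.fileSize);
    });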
[09:42:08] <NodeX> k_sze[work] : you've said that your rich_people document has a subdocument, so you already have the list of cars he owns
[09:53:30] <k_sze[work]> NodeX: but by taking the cars out of Bill Gates, and flattening it, I would need to query multiple collections to reconstruct a profile of Bill Gates.
[09:53:46] <k_sze[work]> (maybe I completely misunderstand the definition of 'self-contained'?)
[09:53:48] <NodeX> I didn't say anything about flattening anything
[09:54:10] <tilya> i have a problem since 2.4.6. i have three nodes in a replica set, primary, secondary and arbiter, with authentication enabled. unfortunately, for the last few versions there's been a problem with authentication on the arbiter node. is there a way to solve that, or is it intentional?
[09:54:24] <k_sze[work]> NodeX: are we even saying 'flatten' to mean the same thing? :P
[09:54:46] <NodeX> you probably mean keep "cars" in a separate collection and do a "join" yes?
[09:55:09] <Derick> NodeX: no, what I suggested was to put bill gates on each of the cars.
[09:55:18] <NodeX> you currently have the right concept, you're just doing it one level too deep
[09:55:28] <NodeX> Derick : I wasn't suggesting you were
[09:55:32] <Derick> k_sze[work]: yes, a whole profile belongs together, but if a query shows that it's not efficient, change it.
[09:56:20] <Derick> NodeX: but yes, I do see a document level too much in there. Perhaps just badly pastebinned though... as the terminology that k_sze[work] uses in it is right.
[09:56:45] <NodeX> Derick : it's one document though - not an array of results from a find();
[09:56:47] <Derick> tilya: arbiters don't store data, so why does it need auth?
[09:57:03] <Derick> k_sze[work]: is what NodeX says true?
[09:57:16] <NodeX> else he would already have the cars [] ;)
[09:57:25] <k_sze[work]> rich_people is a collection in the silicon_valley database
[10:01:53] <tilya> Derick: it somehow worked until 2.4.6. but well, if that's intended and not a bug, then i will dump this dumb monitoring and do something else. thanks.
[10:12:56] <st0ne2thedge> Derick: Can one run an existing database without preallocation? Is there any danger to existing data? We are currently storing our packages in the database (pulpproject)
[10:13:39] <Derick> no, I don't think that is a problem
[10:18:12] <st0ne2thedge> Derick: you said you'd advise thresholding mongodb's partition at >2G remaining space, right? ^^
[10:21:09] <k_sze[work]> Wouldn't cars.$ return only the first blue car?
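Correct, the positional projection returns only the first matching array element. A hedged sketch (field names are hypothetical; aggregation operators available since 2.2) of both options, the single-element projection and an aggregation pipeline that returns every blue car:

    // projection with the positional operator: first matching car only
    db.rich_people.find(
        { name: "Bill Gates", "cars.color": "blue" },
        { "cars.$": 1 }
    );
    // aggregation: unwind the cars array, keep blue ones, regroup them per person
    db.rich_people.aggregate([
        { $match: { name: "Bill Gates" } },
        { $unwind: "$cars" },
        { $match: { "cars.color": "blue" } },
        { $group: { _id: "$_id", blueCars: { $push: "$cars" } } }
    ]);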
[10:25:41] <k_sze[work]> going home. might check back later.
[11:00:32] <Derick> gregor0: real believers don't have "Adium User" as their ircname
[11:01:51] <robertjpayne> the "auth" setting isn't affected in any way by latency or network conditions, is it? I'm struggling to get it working on a remote mongo but it works just fine in a local vagrant mongo
[11:09:25] <robertjpayne> Gah, the mongodb host always has to be an IP address, not a domain name, it seems?
[11:10:22] <robertjpayne> yea hmm, now trying to use my vagrant box to connect to the one on the server; it only works with the IP, but that may be because of network bridging and such
[11:11:05] <Derick> yeah...or a wrong DNS/host configuration
[12:16:10] <HashMap> Hi there people. I am just curious: I am doing the MongoDB for DBAs course and currently there is a week about sharding. My question is, if for example there are three config servers, does this usually mean three separate physical machines in production?
[12:18:44] <robertjpayne> HashMap: Sharding normally will happen across separate physical machines to gain more durability in the event of hardware failure.
[12:19:11] <robertjpayne> HashMap: That's not to say it isn't possible to run the shards in LXC or similar virtualized containers
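In production the three config servers typically do live on three separate hosts. A minimal sketch (hostnames and paths are hypothetical; pre-3.2 mirrored config server syntax) of starting them and pointing mongos at all three:

    # on each of the three config-server hosts
    mongod --configsvr --dbpath /data/configdb --port 27019
    # on each router host, list all three config servers
    mongos --configdb cfg1.example.com:27019,cfg2.example.com:27019,cfg3.example.com:27019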
[12:20:59] <st0ne2thedge> I'm reading up on mongodb writes in mongodb v2.0, where as soon as a write has been buffered in the outgoing socket buffer of the client host, the insert is 'completed'. Is this fixed in later releases?
[13:03:46] <st0ne2thedge> right it is... up to the level of 1 of the mirroring server's memory?
[13:05:17] <kali> st0ne2thedge: look for the writeconcern options in your client library
[13:05:36] <kali> st0ne2thedge: what you describe used to be the default
[13:05:47] <kali> st0ne2thedge: but it was changed about a year ago
[13:16:48] <st0ne2thedge> kali: I'm reading through the release notes of 2.2 but having trouble finding anything about WriteConcern
[13:18:33] <kali> st0ne2thedge: it's on the client side, and it was more around the time of 2.4
[13:23:04] <kali> indeed, it's somewhere in between 2.2 and 2.4
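A minimal sketch (Node.js driver, circa the 1.x API; database and collection names are hypothetical) of setting the write concern explicitly, so the callback fires only once the server has actually acknowledged the write rather than when it leaves the socket buffer:

    var MongoClient = require('mongodb').MongoClient;
    MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
        if (err) throw err;
        // w: 1 waits for the primary's acknowledgement; w: 'majority' waits for a majority
        // of the replica set; j: true additionally waits for the journal flush
        db.collection('events').insert({ msg: 'hello' }, { w: 'majority', j: true }, function (err, result) {
            if (err) throw err;
            db.close();
        });
    });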
[13:25:04] <st0ne2thedge> kali, gregorm: thx for the links!
[13:57:09] <alFReD-NSH> Anyone here know what might cause "Assertion: 10307:Client Error: bad object in message: invalid bson type" on the node js driver?
[17:48:52] <JEisen> Using collMod() to set usePowerOf2Sizes is a non-blocking operation on a primary, right?
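For reference, a minimal sketch of the command JEisen is asking about (collection name is hypothetical):

    db.runCommand({ collMod: "mycoll", usePowerOf2Sizes: true });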
[19:18:15] <ivica> hello everyone. is this the right place to ask a question about pymongo?
[19:44:35] <astropirate> I just added an Index on a field, now my queries return 0 documents
[19:44:45] <astropirate> nothing in the query has been changed
[20:04:28] <JEisen> I would recommend upgrading the servers before sharding if that's an option, unless it really makes sense.
[20:04:32] <tystr> I'm just trying to understand if the issues we had were solely related to the misconfiguration, or if we're beginning to reach a bottleneck
[20:05:21] <tystr> how can I see the size of the working set?
[20:07:41] <JEisen> how big is the DB, how much RAM?
[20:07:44] <astropirate> Anyone know what is going on with my problem? I couldn't drop the index; it says the db is corrupt. I did a db repair, and again it worked for the first query and then returned 0 results for subsequent queries
[20:20:28] <J-Gonzalez> I'm trying to figure out the best way to organize an app I'm building
[20:21:14] <azathoth99> common lisp coreserver or www.happstack.com
[20:21:29] <J-Gonzalez> I've got an events collection, with many events that get created by users
[20:21:58] <J-Gonzalez> these events have options for purchasing tickets. When a user purchases tickets, an order gets created in an order collection
[20:22:28] <J-Gonzalez> the one thing I can't seem to decide on is where a TICKET should be created (think of a hard ticket like when you go to sporting events or concerts)
[20:22:49] <J-Gonzalez> Should tickets be their own collection that has a reference to the order and event
[20:23:10] <J-Gonzalez> or should tickets live as subdocuments within each event
[20:23:15] <JEisen> how often are you going to be querying on just the tickets vs. as part of the order, for example?
[20:23:56] <J-Gonzalez> The tickets get queried semi-often, during event checkins at different venues
[20:24:26] <JEisen> you'll need to weigh the disadvantage of having multiple round-trip queries vs. having to transfer/sort through a lot of unrelated data.
[20:25:35] <JEisen> but, say, if when you get the ticket you also want to know the order that generated it and so would use that data anyway… that might not be so bad.
[20:26:38] <J-Gonzalez> The orders don't necessarily get queried as often, and are more for accounting purposes, say a user wants to see all their past orders
[20:27:27] <JEisen> that sounds pretty separate to me, based on what you've said.
[20:28:12] <tystr> JEisen ya I've been sternly cautioned about sharding hehe
[20:28:54] <J-Gonzalez> yea the orders are pretty separate - basically, we just need the easiest way to pull up tickets when customers bring them in. And either having them as subdocs within the event that is being checked in, or querying the entire tickets collection
[20:29:59] <JEisen> my perspective is, don't use subdocs unless you automatically want that data when you get the parent data. but that may be hardline.
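A minimal sketch (mongo shell; collection and field names are hypothetical) of the direction JEisen is pointing at: tickets as their own collection referencing both the order and the event, with an index to keep venue check-in lookups fast:

    var event = db.events.findOne();     // the event the ticket admits to
    var order = db.orders.findOne();     // the purchase that generated the ticket
    db.tickets.insert({ eventId: event._id, orderId: order._id, checkedIn: false });
    // check-ins query by event, so index on eventId (plus the check-in flag)
    db.tickets.ensureIndex({ eventId: 1, checkedIn: 1 });
    // at the venue: pull up only the outstanding tickets for the event being scanned
    db.tickets.find({ eventId: event._id, checkedIn: false });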
[20:39:00] <J-Gonzalez_> I'm going to try the separate collection route for now
[20:39:25] <vicTROLA> Does anyone have any advice on best practices for paging through large datasets and 'maintaining state' (for lack of a better phrase)? My app is supposed to take the user through potentially thousands of individual documents. I'm unsure how to preserve offset state in the next request to avoid showing duplicates
[20:40:00] <vicTROLA> I'm thinking about storing limit/skip offsets in a cache and modifying them on every request. Is there a better way?
[20:40:20] <vicTROLA> or is there a way to 'freeze' the cursor and re-instantiate it on demand?
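A minimal sketch (mongo shell; names are hypothetical) of the skip/limit approach vicTROLA describes, with the offset persisted between requests. A stable sort order is what prevents duplicates across pages; note that skip() has to walk past every skipped document, so it slows down as the offset grows:

    var pageSize = 50;
    var offset = 0;                 // stored per user between requests (cache, session, etc.)
    db.documents.find({})
        .sort({ _id: 1 })           // stable ordering across requests
        .skip(offset)
        .limit(pageSize);
    offset += pageSize;             // persist for the next request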
[20:41:01] <joannac> JEisen: I doubt it, it only changes future allocations.
[21:03:52] <liquid-silence> I see the mongodb-native does node readable streams
[21:04:17] <liquid-silence> can anyone explain this to me, as it seems super slow on seeking
[21:07:14] <astropirate> liquid-silence, what kind of query are you running?
[21:10:24] <liquid-silence> hitting the range header code
[21:19:28] <flatr0ze> I'm storing binary strings in mongo (both files and encrypted text), and getting console.log() output of "\u0012\u0032..." mixed with glyphs and symbols in the output. Can it be that it's just the output, and I'm not really saving it as UTF-16? I'm really trying to save space and store only bytes, no Unicodes... Using Node.js w/ MeteorJS framework, GNU/Linux.
[21:22:07] <cirwin> flatr0ze: unfortunately that's not going to work very reliably, I don't think the node mongo driver has good support for buffers
[21:22:41] <cirwin> you should also be careful because not all byte strings are technically valid unicode, but as far as I'm aware neither mongo nor node enforce that strictly
[21:23:45] <flatr0ze> cirwin: you think there's any way to store just a byte array? like in the good ol' times? I'm getting pretty sick of having utf where I don't need it at all
[21:25:36] <cirwin> flatr0ze: I think there's a way of putting raw data into BSON, but I've not tried super-hard
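A minimal sketch (Node.js driver, circa the 1.x API; names are hypothetical) of what cirwin is alluding to: storing raw bytes with BSON's Binary type instead of a UTF-8 string:

    var mongodb = require('mongodb');
    var MongoClient = mongodb.MongoClient;
    var Binary = mongodb.Binary;
    MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
        if (err) throw err;
        var payload = new Buffer([0x12, 0x32, 0xff, 0x00]);   // arbitrary bytes, not valid UTF-8
        // Binary stores the bytes as-is, with no UTF-8 validation or re-encoding
        db.collection('blobs').insert({ data: new Binary(payload) }, function (err) {
            if (err) throw err;
            db.close();
        });
    });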
[21:28:48] <liquid-silence> the performance is really bad
[21:29:01] <liquid-silence> takes like 5 minutes before it actually starts sending other ranges
[21:37:04] <ramsey> Derick: what's the alternative to using setSlaveOkay() in the PHP Mongo driver? We upgraded and now we're getting deprecation notices.
[21:39:48] <tavooca__> mongodb example python mapbox
[21:46:01] <liquid-silence> going to try the 1.4 release but don't have much hope
[22:33:45] <tab1293> I'm trying to insert an integer value of 5368709120, but whenever I try to insert it into a field it gets stored in the db as 1073741823
[23:30:53] <bjori> tab1293: it'll tell you which php.ini is loaded, and if other ini files are loaded. if you see a "mongo.ini" then use that, otherwise just add it to the main php.ini
[23:54:16] <tab1293> bjori: Do you know anything in regards to when summer intern applicants should hear back from you guys? I just put in an application last week