[00:27:04] <mikehaas763> I was watching a video on foundationdb. The guy said each piece of data is stored in 3 locations/nodes. Does mongodb work this way, or is each mongo node an exact replica?
[00:33:09] <cheeser> each replica is an exactish copy
[00:47:21] <mikehaas763> cheeser: Ok thank you. I'm assuming by exactish you mean that it has eventual consistency. Once any syncing that needs to happen happens
[09:30:27] <receptor> how can I save timezone in mongo ISODate?
[09:39:28] <kali> Gr1: you're confused then. each shard is a replica set. if you want sharding to do something, you need two shards, so two replica sets
[09:48:17] <Gr1> I see. So when I removed a secondary node, and when I re-added the same, it now says couldn't connect to new shard ReplicaSetMonitor no master found for set: rs0 from the mongos
[10:41:50] <gellious> does the group function in mongo work more slowly than map-reduce?
[10:43:46] <Gr1> Hi kali, I've hit a rather strange issue (at least for me) with my sharded replica set
[10:44:23] <Gr1> I have 3 boxes, box1, box2, box3 where box1 and 2 are secondary and box3 is primary
[10:44:45] <Gr1> when I take box3 down, and whenever box1 gets elected as primary,
[10:44:57] <Gr1> I see exception: ReplicaSetMonitor no master found for set: rs0 from my mongos
[10:46:25] <Gr1> Even the data directory size (/var/lib/mongodb) is also 2 GB less on box1 when compared to box2 and box3
[12:54:18] <agileadam> I'm building a simple task app in Node. Each user will have zero or one active task at all times. Should I store the active task ID in the user model, or should I have an "active" boolean in my task model? Can you explain why I should choose one over the other?
[12:57:49] <Nodex> do you query tasks or users more?
[13:00:31] <agileadam> Well, I'm often loading a large number of tasks for the user that's requesting them
[13:00:46] <agileadam> find all tasks associated with user X
[13:01:21] <agileadam> I rarely need the user exclusively.
[13:03:27] <Nodex> personally I would store all the information in a tasks collection
[13:05:32] <agileadam> Nodex: now that I thought about your question I have to agree.
[13:05:51] <agileadam> it is storing "more" but I should be able to access it more efficiently.
[13:06:37] <agileadam> Nodex: thanks for your input
[14:05:43] <nick_> Is there a loss of disk usage efficiency having nested objects stored in a mongodb?
[14:09:27] <Nodex> as opposed to storing them in their own collection?
[14:09:59] <Nodex> the padding is the same I would imagine, you probably have less overhead as you're not adding another _id index into RAM if you nest it
[14:10:28] <nick_> I'm looking at moving some data storage to mongodb to speed up writing.
[14:10:36] <bodie_> don't people use mongo because they're not worried about disk efficiency?
[14:10:53] <nick_> But it seems that just naively throwing my data in the database was less space efficient than just an ascii representation, which I thought was odd.
[14:11:50] <nick_> I was wondering if there's something about the structure of my data causing it to take so much space.
[14:12:26] <Nodex> disks are cheap, mongodb is not efficient with them, there is no compression
[14:28:43] <starfly> nick_: Your statement "I'm looking at moving some data storage to mongodb to speed up writing." is interesting,
[14:39:14] <nick_> starfly: I work on a particle physics experiment.
[14:39:36] <nick_> We're running a prototype detector at the moment and were just dumping some data into gzipped text files.
[14:39:57] <nick_> But I just upgraded our system so it can take data at a much higher rate, so I need to write data faster.
[14:40:16] <nick_> I'd had a play with mongodb before, so I thought I'd see how that would work.
[14:40:56] <nick_> Nodex: I know the database claims blocks of space, but I thought that db.name.stats() gives the "size" variable which is the size that is actually used, no?
[14:41:36] <starfly> nick_: do you need the attributes of MongoDB or any other database for downstream consumption? Otherwise, seems like you could stream into striped filesystems faster. Any database will add index generation overhead and space at a minimum, but realistically also database engine overhead
[14:42:55] <nick_> I don't strictly need the database.
[14:43:12] <nick_> Other fast, space efficient solutions would be good.
[14:43:46] <nick_> Although down the line when our system has increased in size we might need a database to have a central location for the data coming in from lots of data streams.
[14:44:28] <nick_> Is there much space overhead with subdocuments?
[14:44:38] <starfly> nick_: sounds like you'd be better off just storing metadata about what you're loading in a database and other, more simple ways of storing the actual data/files
[14:51:48] <hersha> Hi, quick question. I think I'm having an issue with collection fragmentation. I have a fairly high write load of 23000000 documents a day with a 3 day retention. It works great for about 3 days after it reaches its average document load, and then the disk IO begins to greatly increase to the point of IO contention. Any suggestions on where to start to deal with it?
[16:05:19] <Wil_> If I have an index, "address_1", an index, "createdAt_1", and an index, "address_1_createdAt_1"... can I delete the address_1 and createdAt_1 indexes?
[16:05:28] <Wil_> Does the compound index cover both?
[16:10:00] <cheeser> if you query them in that order
[16:10:11] <cheeser> address_1 is redundant either way
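A rough sketch of the prefix rule cheeser is describing, in plain Python with no driver involved: an index is treated as redundant when its key pattern is a leading prefix of another index's key pattern. The helper names (`is_prefix`, `redundant_indexes`) are made up for illustration, and the check ignores index options such as unique, sparse, or TTL, any of which could be a reason to keep a prefix index anyway.

```python
from typing import Dict, List, Tuple

# An index key pattern as an ordered list of (field, direction) pairs.
KeyPattern = List[Tuple[str, int]]


def is_prefix(candidate: KeyPattern, other: KeyPattern) -> bool:
    """True if candidate's keys are a strict leading prefix of other's keys."""
    return len(candidate) < len(other) and other[: len(candidate)] == candidate


def redundant_indexes(indexes: Dict[str, KeyPattern]) -> List[str]:
    """Names of indexes whose key pattern is a prefix of some other index."""
    return sorted(
        name
        for name, keys in indexes.items()
        if any(
            is_prefix(keys, other_keys)
            for other_name, other_keys in indexes.items()
            if other_name != name
        )
    )


# The three indexes from Wil_'s question.
indexes = {
    "address_1": [("address", 1)],
    "createdAt_1": [("createdAt", 1)],
    "address_1_createdAt_1": [("address", 1), ("createdAt", 1)],
}

print(redundant_indexes(indexes))  # ['address_1']
```

Only address_1 shows up: createdAt is the second key of the compound index, so createdAt_1 is not a prefix of it and is still needed for queries that filter or sort on createdAt alone.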
[18:09:26] <hopf> I have a replica set with one primary and two replicas. Right now, the replicas are slipping behind the primary, and there are a bunch of serverStatus ops that are taking a long time. What would cause this?
[18:14:35] <heewa> rkgarcia: hey, working with hopf. Stats show on a replica box that mongod process is pegging one of the four cpu cores at 100%, the other 3 are almost idle.
[18:15:38] <rkgarcia> heewa: is the primary's status ok?
[18:16:52] <hopf> rkgarcia: The primary is in state "PRIMARY". The secondaries are both in state "SECONDARY"
[18:18:24] <heewa> rkgarcia: The secondary is currently indexing, very slowly. But the index was created on the primary with background: true, so as I understand it, it should be allowing Ops through during this, right?
[18:27:01] <heewa> rkgarcia: network looks pretty normal (Or not any different than it's been in the last few days). In the ways that I can tell, primary is pretty healthy. It's the secondaries that seem… stuck..
[18:35:48] <G1eb> hello, I want to set something like last_seen value on a chat message stored in mongo
[18:36:11] <G1eb> and also which user it was, so kind of a link table
[18:36:32] <hopf> rkgarcia: Quite large! We just set a write lock on the primary, so I can't get the stats right now. However, this page is informative: http://docs.mongodb.org/manual/tutorial/build-indexes-on-replica-sets/
[18:36:35] <G1eb> what is the best way to do this, create a whole separate document as a link table?
[18:36:45] <hopf> apparently background index creation on primary becomes foregrounded on secondary
[18:43:22] <unholycrab> "the general rule is: minimum of N servers for N shards" is this true? if i get to the point where i have 4 shards, should i convert each shard to a 5 member replica set?
[18:46:45] <cheeser> well, at least a 3 member repl set
[18:46:49] <rkgarcia> hopf: you can follow these steps: http://docs.mongodb.org/manual/tutorial/build-indexes-on-replica-sets/#tutorial-index-on-replica-sets-stop-one-member
[20:11:41] <saml> "For nearly every case where you want to store a relationship between two documents, use manual references."
[20:16:15] <Aboba> Quick question, when building an app on node, is it better to just use a single open mongo connection, or open and close per user. Say for <100 users?
[20:16:48] <cheeser> open and closing would slow things down quite a bit.
[21:18:28] <gsd> should repl set members be on different ports?
[21:19:11] <gsd> right now, our connection string looks like: user:pass@host1:27017,host2:27017,host3:27017/my_db
[21:22:39] <SecretAgentX9> Hello, I have a schema question. The top level object of this is a Keg https://github.com/RightpointLabs/Pourcast/blob/master/RightpointLabs.Pourcast.DataModel/Schema.json "Keg : { Beer : { .. } }". Now I like that schema because I'm not terribly worried about Beer or Brewery info ever changing. The problem I'm running into is if a user wants to add a beer or brewery without a keg.
[21:22:47] <SecretAgentX9> And I think I've answered my own question