PMXBOT Log file Viewer


#mongodb logs for Friday the 14th of March, 2014

[00:27:04] <mikehaas763> I was watching a video on FoundationDB. The guy said each piece of data is stored in 3 locations/nodes. Does mongodb work this way, or is each mongo node an exact replica?
[00:33:09] <cheeser> each replica is an exactish copy
[00:47:21] <mikehaas763> cheeser: Ok thank you. I'm assuming by exactish you mean that it has eventual consistency, once any syncing that needs to happen has happened
[01:37:39] <cheeser> mikehaas763: yes. exactishly.
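In practice, "exactish" means secondaries replay the primary's oplog asynchronously, so they can lag briefly behind it. The lag is visible from the shell; a minimal sketch (rs.printSlaveReplicationInfo was the helper's name in this era):

    rs.status()                      // role and optime of each replica set member
    rs.printSlaveReplicationInfo()   // how far each secondary is behind the primary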
[05:06:37] <jfisk1987> hey all
[05:06:51] <jfisk1987> is there a way to view the value of a BSON ObjectId?
[05:07:01] <jfisk1987> i have documents that store objectIDs of course, and would like to see the value of them
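For the record, the shell exposes an ObjectId's raw value directly; a minimal sketch, assuming a hypothetical orders collection:

    var doc = db.orders.findOne()     // "orders" is a hypothetical collection
    doc._id                           // prints e.g. ObjectId("5322f1a0...")
    doc._id.str                       // the 24-character hex string itself
    doc._id.getTimestamp()            // the creation time embedded in the id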
[06:26:43] <jfisk1987> hey guys
[06:27:08] <jfisk1987> anyone have a favorite GUI for browsing their db?
[06:30:03] <rkgarcia> robomongo and mongovue
[06:30:14] <rkgarcia> but i prefer the shell :D
[07:43:27] <Gr1> Hi guys
[07:44:00] <Gr1> How can I make sure that my mongodb replicated sharded instance is actually sharding?
[07:44:03] <Gr1> I can see
[07:44:04] <Gr1> chunks:
[07:44:05] <Gr1> rs0 26
[07:44:16] <Gr1> where rs0 is my replica set name
[07:44:38] <Gr1> It does not show shard0 and shard1 and shard2 chunks
[07:50:49] <kali> you only have one replica set ?
[09:08:24] <gellious> hi
[09:11:35] <Gr1> Yep kali
[09:30:27] <receptor> how can I save timezone in mongo ISODate?
[09:39:28] <kali> Gr1: you're confused then. each shard is a replica set. if you want sharding to do something, you need two shards, so two replica sets
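For reference, sh.status() run against a mongos shows chunk counts per shard; with only one shard there is nothing to balance, so every chunk stays on rs0. A sketch of what a working two-shard split would look like (shard names hypothetical):

    sh.status()
    // with two shards, the chunks section should read something like:
    //   chunks:
    //     rs0  13
    //     rs1  13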
[09:39:39] <kali> receptor: you can't
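BSON dates are stored as UTC milliseconds with no zone attached, so the usual workaround is to keep the zone in a separate field and reapply it on read; a sketch with hypothetical names:

    db.events.insert({
        ts: new Date(),          // persisted as UTC milliseconds since the epoch
        tz: "Europe/Berlin"      // zone recorded separately by the application
    })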
[09:48:17] <Gr1> I see. So when I removed a secondary node and re-added the same one, the mongos now says: couldn't connect to new shard ReplicaSetMonitor no master found for set: rs0
[10:41:50] <gellious> does the group function in mongo work more slowly than map-reduce?
[10:43:46] <Gr1> Hi kali, I've hit a rather strange issue (at least for me) with my sharded replica set
[10:44:23] <Gr1> I have 3 boxes, box1, box2, box3 where box1 and 2 are secondary and box3 is primary
[10:44:45] <Gr1> when I take box3 down, and whenever box1 gets elected as primary,
[10:44:57] <Gr1> I see exception: ReplicaSetMonitor no master found for set: rs0 from my mongos
[10:44:59] <Gr1> But
[10:45:06] <Gr1> whenever box1 remains secondary
[10:45:14] <Gr1> and box2 or box3 gets elected primary,
[10:45:15] <Gr1> it works
[10:45:33] <Gr1> Only with box1 as primary does this error show up.
[10:45:44] <Gr1> If I switch to another primary, the error is gone
[10:45:51] <Gr1> I have no clue why
[10:46:25] <Gr1> Even the data directory size (/var/lib/mongodb) is 2 GB smaller on box1 than on box2 and box3
[12:54:18] <agileadam> I'm building a simple task app in Node. Each user will have zero or one active task at all times. Should I store the active task ID in the user model, or should I have an "active" boolean in my task model? Can you explain why I should choose one over the other?
[12:57:49] <Nodex> do you query tasks or users more?
[13:00:31] <agileadam> Well, I'm often loading a large number of tasks for the user that's requesting them
[13:00:46] <agileadam> find all tasks associated with user X
[13:01:21] <agileadam> I rarely need the user exclusively.
[13:03:27] <Nodex> personally I would store all the information in a tasks collection
[13:05:32] <agileadam> Nodex: now that I've thought about your question I have to agree.
[13:05:51] <agileadam> it is storing "more" but I should be able to access it more efficiently.
[13:06:37] <agileadam> Nodex: thanks for your input
[13:07:26] <Nodex> disks are cheap :)
[13:07:57] <agileadam> :D
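The layout Nodex is suggesting looks roughly like this; collection and field names are hypothetical:

    // one document per task, with ownership and state on the task itself
    db.tasks.insert({ userId: "x", title: "Write report", active: true })

    db.tasks.find({ userId: "x" })                    // all tasks for user X
    db.tasks.findOne({ userId: "x", active: true })   // the one active task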
[14:05:43] <nick_> Is there a loss of disk usage efficiency having nested objects stored in a mongodb?
[14:09:27] <Nodex> as opposed to storing them in their own collection?
[14:09:59] <Nodex> the padding is the same I would imagine; you probably have less overhead as you're not adding another _id index into RAM if you nest it
[14:10:28] <nick_> I'm looking at moving some data storage to mongodb to speed up writing.
[14:10:36] <bodie_> don't people use mongo because they're not worried about disk efficiency?
[14:10:53] <nick_> But it seems that just naively throwing my data in the database was less space efficient than just an ASCII representation, which I thought was odd.
[14:11:50] <nick_> I was wondering if there's something about the structure of my data causing it to take so much space.
[14:12:26] <Nodex> disks are cheap, mongodb is not efficient with them, there is no compression
[14:12:49] <nick_> How is the data actually stored?
[14:13:12] <nick_> I thought it was a binary representation.
[14:13:33] <Nodex> it's stored as BSON on disk
[14:15:36] <bodie_> how it BSON
[14:15:38] <bodie_> ;)
[14:15:50] <bodie_> correct response: true... true
[14:16:19] <_boot> wut
[14:16:32] <Nodex> eh?
[14:16:38] <Zelest> bodie_, bored? lol
[14:17:54] <nick_> So, is it less efficient to have, say, a list of list of ints, than one large list of ints?
[14:18:07] <bodie_> maybe
[14:18:37] <Nodex> the documents are padded out; if your large list of ints breaks the padding threshold then it will jump to the next
[14:18:52] <Nodex> hard to answer on a use by use basis
[14:19:00] <nick_> Naively I would expect that BSON uses less space than JSON.
[14:19:11] <_boot> i was under the impression it used more
[14:19:15] <_boot> as it stores field lengths etc
[14:19:28] <starfly> there's no substitute for trials
[14:19:39] <nick_> Well, I did a trial.
[14:19:54] <starfly> then keep trying other things! :)
[14:20:01] <nick_> And was confused that mongodb used twice as much space, which is too much.
[14:20:32] <nick_> So I was wondering if there are ways I could re-organise my data to be more space efficient.
[14:20:33] <Nodex> that's explained somewhere in the docs
[14:20:38] <Nodex> one sec, I'll find the link
[14:21:08] <Nodex> http://docs.mongodb.org/manual/faq/storage/#why-are-the-files-in-my-data-directory-larger-than-the-data-in-my-database
[14:28:43] <starfly> nick_: Your statement "I'm looking at moving some data storage to mongodb to speed up writing." is interesting
[14:39:14] <nick_> starfly: I work on a particle physics experiment.
[14:39:36] <nick_> We're running a prototype detector at the moment and were just dumping some data into gzipped text files.
[14:39:57] <nick_> But I just upgraded our system so it can take data at a much higher rate, so I need to write data faster.
[14:40:16] <nick_> I'd had a play with mongodb before, so I thought I'd see how that would work.
[14:40:56] <nick_> Nodex: I know the database claims blocks of space, but I thought that db.name.stats() gives a "size" field which is the size that's actually used, no?
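For what it's worth, db.collection.stats() reports both numbers, which is where the confusion usually comes from; a sketch (collection name hypothetical):

    var s = db.mycollection.stats()
    s.size             // size of the BSON documents themselves
    s.storageSize      // space allocated on disk, padding included
    s.totalIndexSize   // space taken by the collection's indexes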
[14:41:36] <starfly> nick_: do you need the attributes of MongoDB or any other database for downstream consumption? Otherwise, seems like you could stream into striped filesystems faster. Any database will add index generation overhead and space at a minimum, but realistically also database engine overhead
[14:42:55] <nick_> I don't strictly need the database.
[14:43:12] <nick_> Other fast, space efficient solutions would be good.
[14:43:46] <nick_> Although down the line when our system has increased in size we might need a database to have a central location for the data coming in from lots of data streams.
[14:44:28] <nick_> Is there much space overhead with subdocuments?
[14:44:38] <starfly> nick_: sounds like you'd be better off just storing metadata about what you're loading in a database and other, more simple ways of storing the actual data/files
[14:49:25] <nick_> starfly: maybe.
[14:51:48] <hersha> Hi, quick question. I think I'm having an issue with collection fragmentation. I have a fairly high write load of 23,000,000 documents a day with a 3-day retention. It works great for about 3 days after it reaches its average document load, and then the disk IO begins to greatly increase to the point of IO contention. Any suggestions on where to start to deal with it?
[16:05:19] <Wil_> If I have an index, "address_1", an index, "createdAt_1", and an index, "address_1_createdAt_1"... can I delete the address_1 and createdAt_1 indexes?
[16:05:28] <Wil_> Does the compound index cover both?
[16:10:00] <cheeser> if you query them in that order
[16:10:11] <cheeser> address_1 is redundant either way
[16:12:52] <Wil_> Well
[16:13:05] <Wil_> I'll either be doing address AND createdAt, OR just address.
[16:13:15] <Wil_> I don't believe I'm ever querying JUST createdAt
[16:29:39] <Nodex> pretty sure i told you yesterday your address index was redundant
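What cheeser is describing is index prefixing: a compound index serves queries on any leading subset of its keys, so address alone is covered but createdAt alone is not. A sketch (collection name hypothetical):

    db.things.ensureIndex({ address: 1, createdAt: 1 })

    db.things.find({ address: "10 Main St" })                          // uses the prefix
    db.things.find({ address: "10 Main St" }).sort({ createdAt: -1 })  // uses both keys
    db.things.find({ createdAt: { $gt: new Date("2014-03-01") } })     // cannot use it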
[16:34:29] <gellious> Hi , people. Please tell how to Update nested documents ?
[16:34:38] <gellious> nested arrays
[16:34:48] <chiel> hi guys - what's the best way to update only part of a document?
[16:34:57] <Joeskyyy> chiel: $set
[16:35:06] <gellious> add some new element
[16:35:16] <cheeser> $set
[16:35:19] <gellious> ok
[16:35:26] <gellious> thnx
[16:35:27] <cheeser> updates are documented on the website
[16:35:30] <gellious> I'll try
[16:43:39] <gellious> no, for adding a new element to a nested array in a document it's $push
[16:43:47] <gellious> with {upsert: true}
[16:44:13] <gellious> if the element doesn't exist, it will add it ..
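Putting gellious's and cheeser's answers together; collection and field names here are hypothetical:

    var id = ObjectId()   // stand-in for a real document's _id

    // change one field, leaving the rest of the document untouched
    db.posts.update({ _id: id }, { $set: { title: "New title" } })

    // append an element to a nested array, creating the document if absent
    db.posts.update({ _id: id }, { $push: { comments: { text: "hi" } } }, { upsert: true })

    // modify an element inside a nested array via the positional $ operator
    db.posts.update(
        { _id: id, "comments.text": "hi" },
        { $set: { "comments.$.text": "hello" } }
    )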
[17:07:56] <Mmike> Hi, lads.
[17:08:12] <Mmike> What do you use to stresstest or just get some performance testing done, regarding mongodb?
[17:20:46] <tham> test
[18:02:06] <Hyperking> how do i write empty values into an array?
[18:02:40] <rkgarcia> insert ""
[18:04:35] <Hyperking> would this be correct? http://bpaste.net/show/lZ1fZsGTwTFI6MpbHcM2/
[18:05:18] <Hyperking> and then call db.Players.save(Players)
[18:07:55] <Nodex> null != ""
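Nodex's point in runnable form: the empty string and null are distinct values, and either can be pushed into an array. A sketch against the Players collection from the paste:

    db.Players.insert({ name: "p1", scores: [] })
    db.Players.update({ name: "p1" }, { $push: { scores: "" } })    // empty string
    db.Players.update({ name: "p1" }, { $push: { scores: null } })  // null, a different value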
[18:09:26] <hopf> I have a replica set with one primary and two replicas. Right now, the replicas are slipping behind the primary, and there are a bunch of serverStatus ops that are taking a long time. What would cause this?
[18:11:39] <rkgarcia> network
[18:12:06] <rkgarcia> and 3 servers are ok in resources? cpu, memory?
[18:12:11] <rkgarcia> hopf:
[18:14:35] <heewa> rkgarcia: hey, working with hopf. Stats show that on a replica box the mongod process is pegging one of the four CPU cores at 100%; the other 3 are almost idle.
[18:15:38] <rkgarcia> heewa: is the primary's status ok?
[18:16:52] <hopf> rkgarcia: The primary is in state "PRIMARY". The secondaries are both in state "SECONDARY"
[18:17:52] <rkgarcia> but what's its status on cpu and ram?
[18:18:24] <heewa> rkgarcia: The secondary is currently indexing, very slowly. But the index was created on the primary with background: true, so as I understand it, it should be allowing Ops through during this, right?
[18:18:38] <rkgarcia> yes
[18:21:26] <hopf> rkgarcia: The primary is at 50% utilization, split among 8 cores
[18:23:54] <hopf> rkgarcia: mongo is using 62% of ram on the primary
[18:24:18] <rkgarcia> hopf: and network?
[18:27:01] <heewa> rkgarcia: network looks pretty normal (Or not any different than it's been in the last few days). In the ways that I can tell, primary is pretty healthy. It's the secondaries that seem… stuck..
[18:30:14] <rkgarcia> heewa: index size?
[18:35:48] <G1eb> hello, I want to set something like last_seen value on a chat message stored in mongo
[18:36:11] <G1eb> and also which user it was, so kind of a link table
[18:36:32] <hopf> rkgarcia: Quite large! We just set a write lock on the primary, so I can't get the stats right now. However, this page is informative: http://docs.mongodb.org/manual/tutorial/build-indexes-on-replica-sets/
[18:36:35] <G1eb> what is the best way to do this, create a whole separate document as a link table?
[18:36:45] <hopf> apparently background index creation on primary becomes foregrounded on secondary
[18:41:22] <G1eb> nevermind got it
[18:43:22] <unholycrab> "the general rule is: minimum of N servers for N shards" is this true? if i get to the point where i have 4 shards, should i convert each shard to a 5 member replica set?
[18:46:45] <cheeser> well, at least a 3 member repl set
[18:46:49] <rkgarcia> hopf: follow these steps http://docs.mongodb.org/manual/tutorial/build-indexes-on-replica-sets/#tutorial-index-on-replica-sets-stop-one-member
[18:46:56] <rkgarcia> for index creation
[18:47:07] <hopf> rkgarcia: I will surely do that in the future!
[18:47:24] <unholycrab> cheeser: right. im asking if my replica sets need to grow as the number of shards in my cluster grows
[18:47:42] <rkgarcia> hopf: that may be the problem with the ops
[18:47:54] <cheeser> hopf: for the record, this is coming in 2.6 https://jira.mongodb.org/browse/SERVER-2771
[18:47:59] <unholycrab> cheeser: ie: my quote from this article http://www.kchodorow.com/blog/2010/08/09/sharding-and-replica-sets-illustrated/
[18:48:04] <cheeser> unholycrab: i wouldn't think so...
[18:48:48] <cheeser> that still uses the master/slave language so it might not really apply to replica set clusters
[18:49:06] <cheeser> that page is 3+ years old at this point
[18:49:15] <unholycrab> yeah, it is old
[18:49:18] <hopf> cheeser: thx for the heads up!
[18:50:02] <cheeser> hopf: np
[18:51:36] <synth_> is it possible to limit the amount of documents returned when doing a find() ?
[18:52:11] <synth_> i'm finding php is taking forever to render 1,200+ documents on a webpage so i'd like to limit it to maybe 300
[18:53:17] <cheeser> .limit()
[18:53:45] <synth_> would i use it like $my_cursor.find($my_search_criteria).limit(300) ?
[18:54:10] <cheeser> the shell syntax is something like that. not sure what the php form would be
[18:56:01] <synth_> oops, my error was . instead of ->; $my_cursor->find($my_search_criteria)->limit(300); worked. thank you!
[18:56:12] <cheeser> np
[19:08:27] <ekristen> what is Average Flush Time and what affects it?
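That figure is surfaced by serverStatus; a sketch of where to read it (field names as reported by the mmapv1-era server):

    db.serverStatus().backgroundFlushing
    // { flushes: ..., total_ms: ..., average_ms: ..., last_ms: ..., last_finished: ... }
    // average_ms is the mean time taken to flush dirty pages to the data files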
[19:28:34] <zumba_addict> does mongo have to be reindexed? I'm asking because one server is slow
[19:31:23] <unholycrab> i don't think so, zumba_addict
[19:31:56] <zumba_addict> k
[19:32:05] <unholycrab> zumba_addict: http://docs.mongodb.org/manual/reference/command/reIndex/
[19:32:06] <zumba_addict> that means, problem is coming from a different area
[19:33:09] <unholycrab> i suppose you could try reindexing, or change what you are indexing
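The command from that link is a one-liner; a sketch (collection name hypothetical):

    db.mycollection.reIndex()   // drops and rebuilds every index on the collection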
[19:33:10] <bodie_> hi all, trying to build from source and I'm getting a truckload of errors because of:
[19:33:12] <bodie_> cc1plus: all warnings being treated as errors
[19:33:19] <bodie_> 2.4.9
[19:33:30] <bodie_> is that expected?
[19:34:29] <zumba_addict> k
[19:35:01] <bodie_> ./SConstruct: env.Append( CCFLAGS=["-Werror", "-pipe"] )
[19:35:07] <bodie_> so I'm assuming 2.4.9 is breaking?
[19:38:11] <zumba_addict> i'll tell it to the other team since I don't have control of the db
[19:38:49] <bodie_> fyi: https://svn.boost.org/trac/boost/ticket/7242
[20:10:45] <saml> "Unless you have a compelling reason for using a DBRef, use manual references."
[20:10:50] <saml> is DBRef that bad?
[20:11:41] <saml> "For nearly every case where you want to store a relationship between two documents, use manual references."
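A manual reference is nothing more than storing the other document's _id and issuing the second query yourself; a sketch with hypothetical collections:

    var author = db.authors.findOne({ name: "saml" })
    db.articles.insert({ title: "On DBRefs", authorId: author._id })

    // resolving the reference is an explicit second lookup
    var article = db.articles.findOne({ title: "On DBRefs" })
    var writer  = db.authors.findOne({ _id: article.authorId })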
[20:16:15] <Aboba> Quick question, when building an app on node, is it better to just use a single open mongo connection, or open and close per user. Say for <100 users?
[20:16:48] <cheeser> open and closing would slow things down quite a bit.
[20:16:54] <Aboba> thats what I was thinking
[20:17:02] <Aboba> any downsides to only opening once?
[20:17:22] <cheeser> everything has downsides...
[20:17:33] <Aboba> that's why I'm asking :)
[20:17:41] <Zelest> open and close a single one! :D
[20:17:44] <Zelest> *solved it!*
[20:17:54] <cheeser> many drivers actually have pools internally
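With the Node driver of that era, the usual pattern was to connect once at startup and share the handle, letting the driver's internal pool do the multiplexing; a sketch (URL and pool size hypothetical):

    var MongoClient = require('mongodb').MongoClient;

    // connect once; the driver keeps a pool of sockets behind this handle
    MongoClient.connect('mongodb://localhost:27017/mydb',
                        { server: { poolSize: 10 } },
                        function (err, db) {
        if (err) throw err;
        // pass "db" around the app rather than reconnecting per request
        db.collection('users').findOne({}, function (err, doc) {
            console.log(doc);
        });
    });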
[21:10:34] <gsd> what could cause connecting to a database to be so slow (77970ms)
[21:11:05] <gsd> poolSize is only 10
[21:11:19] <gsd> it is a rather large db but i don't see how that can slow down connections
[21:12:24] <cheeser> wow. that shouldn't happen
[21:12:33] <gsd> yea ...
[21:18:28] <gsd> should repl sets be on diff ports
[21:19:11] <gsd> right now, our connection string looks like: user:pass@host1:27017,host2:27017,host3:27017/my_db
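Members on separate hosts can all keep the default port; the usual addition to a string like that is the replicaSet parameter, so the driver treats the seed list as one set (a sketch, names hypothetical):

    mongodb://user:pass@host1:27017,host2:27017,host3:27017/my_db?replicaSet=rs0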
[21:22:39] <SecretAgentX9> Hello, I have a schema question. The top level object of this is a Keg https://github.com/RightpointLabs/Pourcast/blob/master/RightpointLabs.Pourcast.DataModel/Schema.json "Keg : { Beer : { .. } }". Now I like that schema cause I'm not terribly worried about Beer or Brewery info ever changing. The problem I'm running into is if a user wants to add a beer or brewery without a keg.
[21:22:47] <SecretAgentX9> And I think I've answered my own question
[21:22:50] <SecretAgentX9> well maybe not
[21:22:53] <SecretAgentX9> well maybe
[21:23:19] <SecretAgentX9> It would suck to make 3 calls to mongo to create a new Keg if there is no beer/brewery info
[21:24:38] <SecretAgentX9> I guess that's not a bad hit to take
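The two shapes being weighed, sketched with hypothetical names: embedding keeps keg creation to one insert, while separate collections let a beer or brewery exist with no keg, at the cost of extra calls:

    // embedded: one insert, beer/brewery data copied into the keg
    db.kegs.insert({
        beer: { name: "Hypothetical IPA", brewery: { name: "Example Brewing" } },
        tappedOn: new Date()
    })

    // referenced: a beer can exist alone, but a new keg takes several calls
    var beerId = ObjectId()
    db.beers.insert({ _id: beerId, name: "Hypothetical IPA" })
    db.kegs.insert({ beerId: beerId, tappedOn: new Date() })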
[21:58:28] <gsd_> created a stack question if someone has any ideas about why connecting to my db is slow
[21:58:28] <gsd_> http://stackoverflow.com/questions/22416428/why-does-it-take-so-long-to-connect-to-my-mongo-db-in-node
[23:17:33] <Bilge> Wow, amazing, thanks for sharing