[00:01:57] <Astral303> brycelane, depends on your use case… one downside I can think of: if you store everything in one collection and index on a common field, you will have a single large index. if you never need to use that index across multiple types of objects, you'll be unnecessarily working with a larger-than-necessary index. in contrast, if you had a collection per type of object, each index would only be as large as needed for a given type, even for a common property.
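For illustration, the trade-off in mongo shell terms (the collection and field names here are hypothetical):

    // one collection per type: each index covers only that type's documents
    db.articles.ensureIndex({ createdAt: 1 })
    db.comments.ensureIndex({ createdAt: 1 })

    // one shared collection: a single index spans every type, so a query
    // scoped to one type still works against the larger structure
    db.content.ensureIndex({ createdAt: 1 })
    db.content.find({ type: "article", createdAt: { $gt: ISODate("2013-01-01") } })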
[00:03:46] <brycelane> I'm considering the case of an ORM, which I suspect typically follows the method you describe. I noticed that there is a relatively small limit on the number of indexes per collection, which could get congested in a "one big collection" style.
[00:07:20] <Astral303> brycelane, the biggest reason for putting things into one collection would be that you can run aggregations on that one collection
[00:07:38] <Astral303> so if you have similar objects that you'd like to aggregate across, it might make sense
[00:08:53] <brycelane> Oh, I see. That does throw a wrench in my thinking.
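As a sketch of what that buys you, an aggregation across every type stored in a single collection (hypothetical names, shell syntax):

    // count documents per type in one pass; with a collection per type
    // this would need one query per collection plus client-side merging
    db.content.aggregate([
        { $match: { createdAt: { $gte: ISODate("2013-01-01") } } },
        { $group: { _id: "$type", count: { $sum: 1 } } }
    ])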
[04:19:28] <thesheff17> anyone compiled mongodb via scons with distcc? I have done this in the past but for some reason I can't get it working this time around
[14:16:12] <manjush> I can't figure out what a multi-version concurrency protocol means.
[14:16:43] <manjush> can anyone help me out with this
[14:43:51] <justHannes> hi everybody, i am somewhat new to mongodb (using the doctrine ODM). i have a setup where i have a relation from posts to provider (1:M) and i want to select posts based on information stored in the provider (my setup looks like http://pastebin.com/0b8sgmDR)
[14:44:22] <justHannes> what i would like to do is something like ` db.Post.find({ "publicationStatus": "1", "provider.publicationStatus": "1" } ) ` on post
[14:45:18] <monbro> is it common to reference embedded documents?
[14:48:56] <astro73|roam> justHannes: I haven't found any cross-document JOINs on Mongo, period.
[14:49:43] <astro73|roam> The aggregation framework _might_ (big might) have something usable. Otherwise, you'd have to do it in your application
[14:50:07] <astro73|roam> monbro: I would argue that if you don't have embedded documents, you're not doing it right
[14:50:37] <justHannes> astro73|roam thanks for the info, i feared as much ... well, doing it on the application level would be a bummer for pagination ... i will probably handle it via provider update events
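For reference, the application-level two-step "join" would look roughly like this (shell sketch; the Provider collection name and the way the reference is stored are assumptions based on the pastebin setup):

    // 1) collect the ids of providers with the desired status
    var providerIds = db.Provider.find(
        { publicationStatus: "1" }, { _id: 1 }
    ).map(function (p) { return p._id; });

    // 2) filter posts against that id list with $in
    // (if doctrine stores the reference as a DBRef, match on "provider.$id" instead)
    db.Post.find({ publicationStatus: "1", provider: { $in: providerIds } })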
[14:52:29] <monbro> astro73|roam: yes, that makes sense :-) I am struggling because doctrine is not handling dbref to embedded documents correctly… so I was confused
[14:53:01] <astro73|roam> oh, i thought you meant subdocuments, not referenced documents
[15:38:05] <monbro> a document size of more than 1 MB doesn't speak for a good schema, does it?!
[15:49:07] <astro73|roam> monbro: it depends on what's in it
[16:04:07] <monbro> starfly: right, but you can change the size as well, up to 2gb. I am quite unsure what document size is good for a "proper" process for mongodb itself
[16:04:17] <astro73|roam> starfly: that's the hard upper limit, not necessarily a practical one
[16:14:24] <starfly> monbro: there are many factors to consider, how many documents in your larger collections, how much physical memory will you have to accommodate large working sets, what kind of disk I/O throughput you'll have (MB/second), etc. Although you can technically modify nssize, I think most people would agree that would be ill-advised
[16:17:09] <monbro> starfly: thank you very much, yeah I just read something on stackoverflow saying that the default maximum of 16mb per document is not bad at all and a good orientation point, I guess
[16:18:02] <monbro> starfly: so if I had better I/O throughput and more physical memory, bigger documents would be processed faster?
[16:20:01] <starfly> monbro: sounds good, I think the best overall advice is to think carefully about what you'll need to store, model some options (embedding vs. linking), consider your data access patterns, etc. The truth is whatever you design will very likely be changed in the future
[16:20:24] <astro73|roam> monbro: if only computing were that simple
[16:21:53] <starfly> monbro: bigger isn't necessarily better, you have to weigh the traditional (from SQL world) issue of normalization vs. denormalization issues when considering embedding a lot vs. linking collections in code
[16:24:16] <starfly> monbro: model and design as much as possible up front to save refactoring later, but again, no matter which way you go, you'll likely evolve it into something else later.
[16:24:36] <monbro> starfly: mh, thank you a lot, that makes sense and will keep me/us thinking and planning for some more time :-)
[16:25:06] <starfly> monbro: good deal, and I agree with astro73|roam, wish it were simple!
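One practical way to keep an eye on document sizes while modelling, using built-in shell helpers (the collection name is hypothetical):

    // BSON size in bytes of one sample document
    Object.bsonsize(db.posts.findOne())

    // average document size across the whole collection
    db.posts.stats().avgObjSize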
[16:42:44] <starfly> monbro: one last thing, try to use small key names to minimize doc footprint; key names are stored in every document and can quickly add up, and changing them later is a pain
[16:44:05] <monbro> starfly: ah you mean "name" instead of "customerName" ?
[16:44:28] <monbro> starfly: or better "n" than "name"
[16:48:09] <codenapper> Hi, could someone have a look at the explain() of my slow running query (100ms as opposed to ~10ms for similar queries)? Indexes are set and are being used, but it's still so much slower that I wonder what I'm doing wrong.. http://pastebin.com/MRQujMK1
[16:51:23] <starfly> monbro: I mean to use name vs. customerName, you want the key names to be meaningful, but not over the top given the impact on document footprint
[16:52:06] <monbro> starfly: hehe alright, yeah I agree
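The effect is easy to demonstrate in the shell, since Object.bsonsize counts the key names too (hypothetical documents):

    // same data, different key names: the second document is smaller,
    // and the saving repeats for every document in the collection
    Object.bsonsize({ customerName: "Ada", customerAddress: "Berlin" })
    Object.bsonsize({ name: "Ada", address: "Berlin" })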
[16:57:20] <cTIDE> how do you address a replica set in which the secondary is consistently falling out of sync with the master?
[16:57:50] <cTIDE> we basically end up resetting our secondary from scratch every 2 weeks because it falls off
[16:58:45] <starfly> cTIDE: getting behind because of inadequately-performing hardware, write-storms, network saturation--any ideas?
[17:00:30] <starfly> cTIDE: sounds like your secondary is probably not hardware-sized like what's in use for the primary? You may need to throw hardware at the problem
[17:01:02] <cTIDE> ok, yeah, our master is an m2.xlarge and the slave is an m1.large
[17:01:28] <starfly> cTIDE: then it sounds like you're getting a predictable outcome… :)
[17:01:48] <cTIDE> well, our slave is really only in use as backup
[17:01:55] <cTIDE> so we figured it'd be safe enough to have it be smaller hardware
[17:02:00] <cTIDE> but i guess that's not the case :)
[17:02:30] <starfly> cTIDE: well, it's fine as long as you're up for the cost of rebuilding the secondary every 2 weeks :)
[17:03:03] <cTIDE> ok, since we're going to rebuild the box
[17:03:08] <cTIDE> can you have a secondary of a different version?
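For reference, the lag that forces those rebuilds can be watched from the shell before the secondary falls off the end of the oplog:

    // how far each secondary is behind the primary
    rs.printSlaveReplicationInfo()

    // the oplog window: how long a secondary can lag before
    // it can no longer catch up and needs a full resync
    rs.printReplicationInfo()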
[19:33:32] <spicewiesel> I could use some help sizing my mongodb instances, but I could not find any calculation examples. could you help me with that?
[19:37:13] <kali> yeah maybe, i don't know... ask the real question :)
[19:39:18] <spicewiesel> first the simple questions: Do I have to keep the index in RAM and do I have to keep all data in RAM?
[19:39:51] <spicewiesel> as far as I know, I have to keep the index in RAM, but I do not have to keep all the data there (with performance limitations then, of course)
[19:39:55] <kali> spicewiesel: index performance is absolutely terrible when the index is not in RAM
[19:41:18] <kali> for the rest, it depends on the request rate and the expected performance of course, but DBs with indexes in RAM and documents on disk are suitable for many apps
[19:45:45] <spicewiesel> okay, fine. That's what I learned while reading the last days.
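To size that in practice, the shell can report how much RAM the indexes actually need (database and collection names are hypothetical):

    // total index size in bytes for one collection
    db.mycollection.totalIndexSize()

    // index size summed over the whole database
    db.stats().indexSize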
[20:15:53] <jcalvinowens> Hello all. Is it possible to conditionally project a variable in an aggregation query? As in, {"$project": {"var": {"$cond": [CONDITION, True, False]}}}?
[20:16:15] <jcalvinowens> That always yields a literal "True", instead of including the variable
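$cond evaluates to whichever branch expression matches, so the trick is to put a field path in the branch instead of a boolean. A minimal sketch with hypothetical names (the else branch can only yield a placeholder such as null, it cannot omit the field entirely):

    db.scores.aggregate([
        { $project: {
            // emits the value of "score" when the condition holds, else null
            result: { $cond: [ { $gte: ["$score", 50] }, "$score", null ] }
        } }
    ])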
[21:43:34] <tjmehta> kali , updated : http://codeshare.io/mX5Fz
[21:43:43] <tjmehta> added help comments in some js code
[23:35:45] <jaCen915> I know it's possible to find documents based off an array of values but is it possible to do the same with an update? for example db.products.update({ _id: [1,2,3,4,5]}, {$set: {blah:blah}})
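A sketch of the usual approach: wrap the id list in $in and pass multi so every matched document is updated (otherwise only the first match changes):

    db.products.update(
        { _id: { $in: [1, 2, 3, 4, 5] } },
        { $set: { blah: "blah" } },
        { multi: true }
    )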
[23:49:46] <jude0_> does anyone know if you can query getCollectionNames from mongo's rest api?
[23:58:48] <bjori> does the rest api support commands?