[12:16:16] <Nodex> anyone know if $addToSet will push a key/value into an object (not an array) ... for example my object.... v:{"a":"foo"} ... db.foo.update({},{$addToSet:{v:{"b":"bar"}}}); ... to make my object look like .... v:{"a":"foo","b":"bar"}
[13:16:09] <Nodex> $set will just overwrite what I tell it to
[13:20:59] <scoates> Nodex: what I mean is that $set only works on parts of the document. In that example, if you had {c: 'blerp'}, you'd end up with { "_id" : ObjectId("51791d67256e0c9fb80cf774"), "a" : { "bar" : "baz", "foo" : "bar" }, "c": "blerp" }
[13:21:34] <Nodex> yes I know that lol, I don't see why I would ever be changing "c" when I am after changing "a"
[13:22:09] <scoates> you just asked if you can use it to set a whole object. I just wanted to be sure you knew it wasn't going to replace the document.
[13:23:06] <Nodex> no, I stated I am using it - hence the "self answer"
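For reference, $addToSet only operates on arrays; to add a key to an embedded document without replacing it, $set with dot notation works. A minimal shell sketch, assuming a document shaped like the one above in a hypothetical "foo" collection:

    // starting document: { v: { "a": "foo" } }
    db.foo.update({}, { $set: { "v.b": "bar" } });
    // result: v is now { "a": "foo", "b": "bar" }; the rest of the document is untouched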
[13:43:50] <diffuse> are there any good resources with examples using the MongoDB C API?
[13:47:06] <algernon> there's the tutorial: http://api.mongodb.org/c/current/tutorial.html
[13:47:42] <diffuse> algernon: been through that, it's out of date and only covers a very small subset of the api.
[13:48:11] <algernon> then I can't help you, I'm afraid. unless you're ok with using an alternative C driver :)
[13:48:32] <diffuse> algernon: i'd consider something if it was more complete :-)
[13:49:12] <Bilge> How would you write a query such that documents with a field set are ordered before the rest? i.e. sticky posts are ordered first
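One possible approach for the sticky-posts ordering (a sketch; the "posts" collection and the "sticky"/"created" field names are hypothetical): sort descending on the flag first, then on the usual key. Documents missing the field compare as null, which sorts below a set value, so flagged documents come first.

    db.posts.find().sort({ sticky: -1, created: -1 });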
[13:49:25] <algernon> one sec. seems github dropped the pages.
[13:51:36] <algernon> diffuse: https://github.com/algernon/libmongo-client/blob/master/examples/mongo-dump.c is a fairly simple, but complete example. full docs at http://algernon.github.io/libmongo-client/
[15:36:33] <theRoUS> i'm trying to set up a replica set using the same script on all hosts. one of those is the arbiter. the order in which the hosts come up is indeterminate. so far, either the two data hosts come up as primaries and don't communicate, or with some hand-tweaking #2 says 'Thu Apr 25 11:27:24 [conn4] authenticate db: local { authenticate: 1, nonce: "728f59a5be5467cf", user: "__system", key: "d2e82ce59f37634ecc122e03579a3a51" }' when #1 tries to set
[15:36:33] <theRoUS> up, or #1 says 'couldn't initiate : need all members up to initiate, not ok'
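One way to avoid the race is to have a single designated host run rs.initiate() with the full configuration (including the arbiter) once all members are reachable, rather than letting every host try to initiate. A sketch with hypothetical hostnames:

    rs.initiate({
      _id: "rs0",
      members: [
        { _id: 0, host: "data1.example.com:27017" },
        { _id: 1, host: "data2.example.com:27017" },
        { _id: 2, host: "arbiter.example.com:27017", arbiterOnly: true }
      ]
    });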
[15:41:42] <oskie> hm.. I installed 2.4.3 in debian, and now the version number says 2.4.4-pre- ... anyone else seen this?
[16:16:31] <ramesh> hi, which version of mongodb needs to be installed to work on Windows XP 32 bit?
[16:22:30] <vince_prignano> Anyone here use pymongo?
[16:26:30] <vince_prignano> When I convert a find() query to a list, memory is allocated in Python to manage those objects. But when I run the same query again, is it normal that it eats double the RAM?
[16:29:31] <vince_prignano> can anyone give me a suggestion?
[16:31:07] <Bilge> Does mongo actually store the same key names in every document or does it just store them once in some kind of lookup table?
[16:31:34] <Bilge> I mean would it make sense to use terse key names to make the document smaller or doesn't it matter because it only stores them once anyway
[16:32:14] <kali> Bilge: the keys are stored every time you use them
[16:32:42] <Bilge> So I might want to favor short key names?
[16:33:03] <kali> Bilge: so it makes sense to keep them short if you have a high number of small fields
[16:33:20] <kali> Bilge: if you store whole paragraphs of text, it's less relevant
[16:41:23] <scoates> Bilge: foursquare does their own keyname lookups to save a lot of working set RAM.
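The per-document cost of key names can be checked directly in the shell with Object.bsonsize(); the field names below are made up, purely to illustrate that key bytes are repeated in every document:

    Object.bsonsize({ description: "x", quantity: 1 });  // larger
    Object.bsonsize({ d: "x", q: 1 });                   // smaller - same values, shorter keys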
[18:21:46] <vince_prignano> When I convert a find() query to a list, memory is allocated in Python to manage those objects. But when I run the same query again, is it normal that it eats double the RAM? Please, anyone?
[18:46:52] <tombee> Hey folks, just looking at a few NoSQL technologies. One of the problems I'm trying to solve is I want to data warehouse raw incoming data, whether it be in XML, JSON or whatever. There's a lot of it, and it's growing fast. Would it be feasible to store this stuff in MongoDB?
[18:49:52] <rickibalboa> MongoDB is capable of handling massive amounts of data
[18:51:12] <tombee> rickibalboa: I suppose I'm trying to weigh up MongoDB's "data warehousing" capabilities, so for example, why should I use HDFS/MapReduce and inherit all that complexity if MongoDB can store all this data in a much simpler fashion, do you understand what I mean?
[18:53:49] <LesTR> tombee: depends on the use case. If you need to iterate over the whole dataset, mongodb isn't a good option (from my exp)
[18:54:24] <tombee> LesTR: OK that's good to know!
[18:54:46] <rickibalboa> Hmm yeah I think so. And I agree with LesTR
[18:55:06] <tombee> LesTR: 'intuitively' I probably see a balance somehow between using something like HDFS and MongoDB. I probably need evidence why though.
[18:55:28] <LesTR> mongodb's map/reduce is a parody of hadoop m/r, the aggregation framework is still new, and hadoop is still much better
[18:56:56] <LesTR> tombee: we had one "similar" use case - storing raw data in many formats
[18:57:41] <LesTR> mongodb was terrible here. Now we have this data in HBase (we need near-realtime random read access) and it all works fine
[18:58:09] <LesTR> offline analytics work perfectly thanks to hadoop : )
[18:58:24] <tombee> LesTR: OK, good information. Did you read any articles about this?
[18:58:41] <tombee> I'm searching around for evidence to put together my case for technology choices
[18:59:01] <LesTR> you mean about using mongodb and hadoop?
[18:59:10] <tombee> Yeah and the performance issues you encountered
[18:59:21] <rickibalboa> A good source of info is this. http://www.mongodb.org/about/production-deployments/ if you haven't already seen it, it gives you an idea of what mongodb is currently being used for. A lot of them talk about tick databases and other large scale storage
[19:00:44] <LesTR> that's very hard to find, because when I started looking for a solution for this use case, mongodb's hadoop driver could only work with exported files (mongoexport)
[19:01:03] <LesTR> that's why we moved this data out of mongodb
[19:03:43] <LesTR> (I'm talking about a small dataset, ~3TB. It's nothing for HBase, but storing this data in mongodb is much harder - sharding, ...)
[19:05:16] <tombee> LesTR: Yeah, I know what you mean. Did you read anything about people that had already encountered these problems though? Or did you just decide to use HBase without any research ?
[19:05:54] <rickibalboa> Yeah, HBase is really incredible for big data storage, but it depends on your use case. Like Les said, if you need fast random reads and have a huge data set (TBs), there's no way mongodb is getting its working set into RAM like it tries to do, even if you have a maxed out machine.
[19:06:56] <LesTR> yes, it's all about use cases and opportunity
[19:07:41] <rickibalboa> MongoDB has plenty of use cases, but like I said if you need fast random reads on a large set, maybe consider HBase
[19:08:08] <LesTR> tombee: I read a lot of articles about hadoop and solutions based on it. The big reason we need hadoop is map/reduce.
[19:08:13] <rickibalboa> Although, if you're pumping massive amounts of raw data into your data set, leaving it for analytics later, and rarely or never need to read from it, then sure, mongodb is fine. Scales well too
[19:08:56] <rickibalboa> Well, scales indefinitely really with sharding.
[19:09:03] <tombee> rickibalboa: Well the data warehousing thing is just 1 piece of the puzzle
[19:09:16] <LesTR> HBase is much harder to understand and manage
[19:10:51] <rickibalboa> Yeah agreed, mongodb is super simple to start off with. If that's just a piece of your puzzle, use a combination of data storage systems then :P
[19:11:26] <tombee> rickibalboa: Yeah, just wondering where you draw the line when it comes to using a combination.
[19:11:43] <tombee> So the offline analysis could feed MongoDB or something, I don't know! :)
[19:13:13] <rickibalboa> tombee, yeah I understand. For example I have a similar data model to a relational one, but with fewer joins. I also do a lot of large-ish storage, not huge, xx millions of rows etc. I just decided not to bother drawing the line and used mongo both for the large storage stuff and for the basic storage which would have been suitable in mysql or something.
[19:13:25] <vince_prignano> I'm asking this: is MongoDB safe as the sole backend for an application that serves data to a webapp/iphoneapp/androidapp?
[19:13:31] <rickibalboa> It's not a huge scale operation though, so this takes out the hassle of using multiple DBMSs
[19:23:51] <vince_prignano> can anyone actually answer my question?
[20:19:01] <tombee> rickibalboa: From what I've read, an issue can be getting MongoDB to keep the important stuff in RAM; if it isn't in RAM, performance degrades quite quickly.
[20:19:20] <tombee> Especially if there are a lot of indexes, as then it doesn't leave much room for keeping the data itself in RAM
[20:20:10] <tombee> Example here: http://blog.serverdensity.com/mongodb-monitoring-keep-in-it-ram/
[20:21:01] <tombee> Looks like their 50 gig database has 19 gig of indexes alone, so I presume they'd need to start sharding it.
[20:22:11] <scoates> if you shard, you still need that data in RAM. it just doesn't need to be all on the same box.
[20:23:13] <scoates> I currently have 45G resident on my primary. 180G mapped. 0 faults (for now)
[20:23:39] <tombee> scoates: OK, I'm not sure if I've misunderstood or just explained poorly :)
[20:24:03] <tombee> scoates: So in their case, they have a single server with 20G of RAM, and 19G is taken with indexes
[20:24:30] <tombee> scoates: If they sharded it, and set up a second identical server, they'd have a total of 40G of RAM.
[20:24:46] <scoates> yeah. I just added more RAM to my boxes. (-:
[20:24:57] <scoates> going to hit the limit soon, though
[20:25:26] <tombee> scoates: So my understanding is that would (in theory!) relieve the first server, so they might end up with 10G of indexes each in the best case.
[20:25:44] <tombee> leaving a spare 10G on each server for data 'caching'
[20:26:06] <scoates> not sure how that works in practice. I've avoided the headache of sharding for now.
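To see how much of the working set is index versus data, the shell's stats commands report sizes in bytes (the "posts" collection name below is hypothetical):

    db.stats();                   // dataSize, indexSize, storageSize for the whole database
    db.posts.stats().indexSizes;  // per-index breakdown for one collection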
[22:10:11] <dan__t_> I can find release notes, but not full documentation, for 2.0. Anyone know where I can find this?
[22:15:55] <scoates> dan__t_: I know this doesn't answer your question, but as an aside, I just upgraded my cluster from 2.0 through to 2.4, and it made a huge difference.
[22:16:30] <dan__t_> Yessir, I've seen that before, too.
[22:16:40] <dan__t_> I'm kind of locked in, unfortunately.
[22:46:29] <GRMrGecko> Hello, I am trying to find out how to view, in the mongo client, the data that was put into mongo via perl. I inserted into "$client->get_database('db')/$database->get_collection('test');". When I try "db.test.find();", I get nothing.
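A likely cause is looking at a different database than the one the Perl driver wrote to, or possibly an insert that failed silently under unacknowledged writes. A minimal shell sketch to check, assuming the Perl code really targeted a database named "db":

    // confirm the target database exists and has a non-zero size
    show dbs
    // switch to the name passed to get_database()
    use db
    db.test.count()  // should be > 0 if the insert succeeded
    db.test.find()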