[03:33:53] <dbe_> Anyone run into mongo having huge virtual memory (70 gigs on a machine with 8 gig physical) while no queries are running?
[03:37:41] <jmar777> dbe_: that's pretty normal. mongo does memory mapping for just about everything, so your virtual size includes any and all opened data files (+ some miscellaneous stuff)
[03:39:58] <dbe_> jmar777: So..if my database is 30gigs..then the 70gigs is probably indices etc?
[03:42:45] <jmar777> dbe_: you'll have to ask someone more knowledgeable on the specifics of what's in there. but i would suspect that that's true. the raw data, indexes, and journal are probably the biggest culprits
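A rough back-of-the-envelope illustration of jmar777's point: mongod memory-maps every data file, index extent, and journal file, so the virtual size counts all of them even when little is resident in RAM. All figures below except the 30 GB data size are invented for illustration, not taken from dbe_'s deployment.

```python
# Made-up breakdown showing how mmap-based storage inflates virtual size.
data_gb = 30          # raw collection data (dbe_'s reported database size)
index_gb = 25         # hypothetical index extents
journal_gb = 3        # hypothetical journal files
prealloc_gb = 12      # hypothetical preallocated data files
virtual_gb = data_gb + index_gb + journal_gb + prealloc_gb
print(virtual_gb)     # 70
```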
[06:18:52] <rh1n0> greetings, im working with a mongodb replica set. each node is a separate server that contains a full data set. 3 nodes, no shards, no arbiter. Do you need to configure some form of load balancer to detect when the node your application is using goes down? or does the application adapter handle this for you?
[06:22:23] <Oddman> rh1n0, I thought you were meant to configure your application to use all of them
[06:22:27] <Oddman> and mongo handles where the request goes
[06:22:49] <Oddman> I could be completely wrong, however - I'm not too educated on that part of mongo, just based on the small info I've read
[06:24:02] <rh1n0> we are using mongoid with rails. We do have the replica set configured in the configuration. In my experience (not with nosql) using regular application clusters we used heartbeat to detect failover and haproxy to balance the traffic. Im trying to understand how this scenario works with mongodb.
[06:24:40] <rh1n0> im having one heck of a time getting all three nodes to be recognized as part of the replica set
[06:30:00] <Oddman> I've done some small utilities in it so far
[06:30:00] <rh1n0> i had to monkey patch a lot of areas in the rails config to work with the asset pipeline because at that time compass wasnt ready for it.
[06:30:35] <Oddman> but I haven't liked rails since 2.3
[06:30:57] <Oddman> everything they put in rails 3, apart from the decoupling, should have been kept as a gem
[06:31:10] <rh1n0> yeah ruby is the primary reason im still using rails. I started out using ruby for non-rails work. Recently ive been heavy with chef and automation. Its awesome
[10:25:15] <Lujeni> Hello - My mongoS log is flooded by this message (http://pastebin.com/FXhJGEkc) is this normal? Thx
[11:50:08] <bagvendt> Hi guys, i have been struggling with this mapReduce problem for quite some time now. Any help would be greatly appreciated. My input and wanted output is --> http://pastebin.com/V3rpqYDA
[11:51:37] <bagvendt> Corrected some errors. Newest version -> http://pastebin.com/wk4xRDaX
[12:04:23] <bagvendt> Stackoverflow question here -> http://stackoverflow.com/questions/12745860/pivoting-data-in-mongodb
[12:21:51] <gheegh> hey all.. wanted an opinion.. I've got a single server and I'm about to add a second server to it for Mongo.. (and double the ram on both). We have a lot of batch processing that is pretty intensive on Mongo.. then we have a lot of occasional access/use. I'm wondering if anyone has opinions on whether to set the servers up as master/slave or to shard and split the tables onto different servers?
[13:09:14] <Carko> I'm having some trouble with the C# driver and LINQ when trying to use the Contains method - here's the simple example code and exception I get: http://pastie.org/4914659
[13:38:36] <estebistec> I think I know the answer, but just in case I'm missing something, is there a way to add a keyFile to a live replicaset and avoid the downtime?
[13:46:57] <stiang> Is it possible to create a query that matches one piece of the document against another part? I have a 'messages' collection, where each message has a 'replies' field (containing an array of reply documents), and an 'opened_by' field (containing an array of {user_id: …, last_opened: …} objects). I am trying to construct a query that will find all messages with new replies since the time in last_opened for a given user. Is that possible?
[13:49:09] <aster1sk> I have a serious problem with this model.
[14:01:12] <aster1sk> Perhaps, it's really just aggregating sub documents with unknown keys which is the problem.
[14:01:34] <NodeX> I count things by campaign id (issue) on date ("d")
[14:01:48] <aster1sk> Excellent, yeah I have no problem doing that.
[14:01:55] <aster1sk> The aggregate framework is perfect for that.
[14:02:01] <NodeX> I know all my keys , can't you make your keys values ?
[14:02:14] <aster1sk> Sure, but that's where it gets hairy
[14:02:25] <NodeX> the way I go about it is to start with a list of cid's (issues) and a date and loop it
[14:02:38] <aster1sk> As it's write heavy and you can't upsert a positional operator I'd require two queries on a write heavy db.
[14:03:01] <aster1sk> Not to mention we have read preference across our replica set, so it would be reading from the slave that may or may not have synced yet.
[14:03:06] <NodeX> I aggregate everything with todays date and cid etc and pull that into an upsert in another collection
[14:03:24] <NodeX> I also chop my data up and do it every ten minutes
[14:03:47] <NodeX> I keep my collections small as performance degrades heavily after about 1M docs (for me)
[14:03:53] <aster1sk> Not sure I can partition that way, there'd be far too much redundancy and we've a lot of users.
[14:04:21] <NodeX> I aggregate a history collection (a log of every action on the whole server)
[14:04:35] <NodeX> I am sure it's larger than your user collection lol
[14:04:42] <aster1sk> Don't think we have the hardware to support that kind of throughput.
[14:04:57] <NodeX> I understand you can't drop users, how are you sharded ?
[14:45:09] <aster1sk> OK all of the issue level methods are sorted out, time for page level, then reading.
[14:45:46] <hariom> Hi guys, I am new to MongoDB. My requirement is I want to store RDF Triples with support for the Sparql query language. Is there any library that can work with MongoDB as a triple store?
[15:20:49] <coalado> is it normal that a pretty simple map/reduce call over ~ 400 000 objects takes minutes?
[15:36:27] <wereHamster> coalado: m/r is not the fastest.
[15:36:52] <Lujeni> Hello - My mongoS log is flooded by this message (http://pastebin.com/FXhJGEkc) is this normal? Thx
[15:37:24] <coalado> wereHamster: but almost a minute for a mapreduce... this is too far away from fast
[15:38:23] <hariom> Hi guys, I am new to MongoDB. My requirement is I want to store RDF Triples with support for the Sparql query language. Is there any library that can work with MongoDB as a triple store?
[15:38:36] <wereHamster> 400k/minute ~= 6.7k/second. You should know that m/r calls a javascript function for each map and reduce step. That alone is a fair amount of overhead
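The arithmetic behind wereHamster's estimate: every mapped document costs at least one JavaScript function invocation, so map/reduce throughput is bounded by how fast the JS engine can be called.

```python
# coalado's workload: ~400k documents mapped in about a minute.
docs = 400_000
seconds = 60
rate = docs / seconds
print(round(rate))  # 6667 map() invocations per second, before any reduce work
```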
[15:38:54] <ppetermann> hariom: you are kind of repeating yourself ;)
[15:47:42] <hariom> ppetermann: Many new members joined, many old left; Do you know the answer?
[15:56:28] <ppetermann> hariom: rdf by its standard is either an xml or notation3 snippet. if you store it in mongo you either store a string, which makes it hard to search by sparql, or you store it in a document, however you have to parse it and build the document from it then. as for sparql you would have to write an adapter that reads sparql and converts it to the right finds
[15:58:34] <hariom> ppetermann: correct. Is there any library already written which provides support for this?
[15:59:24] <ppetermann> hariom: if there is then google might be the best place to look for it, but i highly doubt it
[16:00:05] <ppetermann> actually, my first hit on google for you is: http://www.infoq.com/news/2011/12/mongograph-qa
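A plain-Python sketch of ppetermann's suggestion: if each triple is parsed into its own document, a basic SPARQL triple pattern maps onto a find()-style filter. All names, prefixes, and field keys here are made up for illustration.

```python
# Each RDF triple stored as a {subject, predicate, object} document.
triples = [
    {"s": "ex:alice", "p": "foaf:knows", "o": "ex:bob"},
    {"s": "ex:bob",   "p": "foaf:name",  "o": "Bob"},
]
# SPARQL pattern  ?x foaf:knows ex:bob  becomes the filter below,
# roughly what a sparql-to-find adapter would have to emit.
pattern = {"p": "foaf:knows", "o": "ex:bob"}
matches = [t["s"] for t in triples
           if all(t[k] == v for k, v in pattern.items())]
print(matches)  # ['ex:alice']
```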
[16:48:24] <Black_Phoenix> I just started using MongoDB, can't seem to figure out... if I have documents with tags, I need to find all documents which contain at least one of the given tags
[16:48:40] <Black_Phoenix> is that possible? not "all of the listed tags" but "either of the listed tags"
[16:55:07] <Black_Phoenix> I wouldn't put the docs for these things all over the place
[16:55:08] <wereHamster> Black_Phoenix: did you actually google? http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24all
[16:55:40] <Black_Phoenix> come on wereHamster, you're not paying attention to my question yourself. $all is an "and" operator, at least works like that for tags as I tested it
[16:55:46] <Black_Phoenix> "$in" is the right one, and it works
[17:00:20] <Black_Phoenix> thanks for help by the way
[17:20:57] <maxamillion> is this channel alright to ask questions about building mongodb from source (for inclusion in a distro) or is there another irc channel that would be more appropriate?
[17:22:53] <_m> Black_Phoenix: It's obvious you didn't read/test/etc before asking your question. Why expect a response other than "google and read, yo."
[17:25:37] <Black_Phoenix> _m I sure did test $all, I just missed the $in part (I was mostly searching for multikey values, didn't realize that was relevant at first)
[17:25:44] <Black_Phoenix> $all works great, just not the functionality I want
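The two operators Black_Phoenix compared, simulated in plain Python (the sample documents are invented): `$all` is an AND over the listed tags, `$in` is an OR.

```python
docs = [
    {"title": "a", "tags": ["db", "nosql"]},
    {"title": "b", "tags": ["db"]},
    {"title": "c", "tags": ["web"]},
]
wanted = ["nosql", "web"]
# $all semantics: the document must carry every listed tag.
all_hits = [d["title"] for d in docs
            if all(t in d["tags"] for t in wanted)]
# $in semantics: any one listed tag is enough.
in_hits = [d["title"] for d in docs
           if any(t in d["tags"] for t in wanted)]
print(all_hits, in_hits)  # [] ['a', 'c']
```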
[17:27:02] <phpnode> hi, i'm trying to decide between couchdb and mongo for a new project and I wondered if someone could help me with a few questions?
[17:27:20] <phpnode> I want to store a lot of documents based on different entity types from schema.org, so a document could represent anything from a web application to a movie to a physical location in the world.
[17:27:45] <phpnode> Would this be a problem in mongodb? would I have to create a different collection for each different type defined in the schema?
[17:36:28] <phpnode> ok, looks like i need to consider this carefully then
[17:36:45] <phpnode> perhaps couch is a better choice in this particular case, i'm not sure
[17:37:49] <phpnode> for now i just want somewhere i can put documents, i'm not sure about how i'm going to query the data or how much of it there will be at this stage
[17:46:45] <NodeX> phpnode : can you describe what you think a colleciton is?
[17:47:31] <phpnode> NodeX: i'm extremely new to mongo, but i gather that it is basically a namespace that can have certain indexes
[17:47:46] <NodeX> can you tell me what you think it's equivalent to in sql?
[17:51:06] <NodeX> for whatever reason you thought you wanted to give each user its own collection (presumably for a certain type) - instead use that type as a key to describe/define a document
[17:54:06] <phpnode> NodeX: thanks, so i can store different kinds of document within that collection?
[18:12:06] <patroy> Hi, I have a live database and I want to create a replica set, can I make a backup with mongodump or do I need to stop the instance and copy the files over?
[18:24:27] <coopsh> "The mongodump utility can create a dump for an entire server/database/collection […] even when the database is running and active."
[18:24:50] <coopsh> you might be interested in the --oplog option
[18:26:14] <patroy> I read somewhere that only worked for master/slave and not replica-set but they might be wrong
[18:26:22] <patroy> I'll try that for sure thx coopsh!
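A sketch of the invocation coopsh is describing (host and output path are made up): `--oplog` records operations that arrive during the dump so the restore is consistent to a single point in time, which is why it needs a replica set's oplog to exist.

```python
# Build the mongodump command line for a live, point-in-time-consistent dump.
host = "node1.example:27017"          # hypothetical primary
out_dir = "/backups/mongo-2012-10-05" # hypothetical destination
cmd = ["mongodump", "--host", host, "--oplog", "--out", out_dir]
print(" ".join(cmd))
```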
[18:39:19] <arex\> As I understand it, Replication is using multiple servers for redundancy/failover (same data on all servers) and Sharding is using multiple servers for scaling (data spread out on multiple servers). Do I have that right?
[18:43:10] <kali> arex\: yes. just one thing: a replica set can also be used to scale read operations
[18:43:27] <kali> arex\: just to be more confusing
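One concrete way the read scaling kali mentions is exposed to applications: a read preference in the driver's connection string, which lets reads go to secondaries while writes still hit the primary. The host names below are invented.

```python
# Hypothetical replica-set connection string that prefers secondaries for reads.
uri = ("mongodb://node1.example,node2.example,node3.example/mydb"
       "?replicaSet=rs0&readPreference=secondaryPreferred")
print(uri)
```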
[18:43:40] <arex\> No, that's awesome. I just finished setting that up in MS SQL
[18:43:57] <arex\> AlwaysOn Availability Groups I think they call it.
[18:44:12] <arex\> With "a readable secondary replica"
[18:44:48] <arex\> Should be fun to learn about essentially the same thing in MongoDB
[18:45:25] <karamorf> I want to get all documents that have the same value for a key. Is this going to be a mapreduce command? (just want to make sure I'm reading the right docs)
[18:45:29] <arex\> Now, I didn't do sharding in MS SQL. Is it unproblematic to use both replication AND sharding?
[18:50:15] <arex\> kali: I see. I'm just setting it up in a development environment, so that's not really a concern. I was just wondering which topic to start with / implement first
[18:50:58] <kali> if you're a developer, the biggest challenge is usually model design, as you'll need to unlearn about everything you know
[18:56:11] <arex\> In a vacuum I'd select something else, but everything is M$ where I work ;P
[18:59:04] <arex\> Looking forward to starting a magical adventure in mongo land
[18:59:19] <Black_Phoenix> I have started mine today, and I'm loving it
[19:00:19] <arex\> kali: A two member replica set should be good enough for a development environment? That's what I did for MSSQL (with a shared disk witness)
[19:00:55] <kali> arex\: two is not very good, unless you add an arbiter
[19:01:48] <arex\> kali: I can probably put an arbiter on a web or mssql server :) I don't want to overpopulate the vm cluster either :)
[20:42:44] <Derick> bhosie: no real schedule - ASAP
[20:44:52] <bhosie> Derick: ok cool. i'm running into this. https://jira.mongodb.org/browse/PHP-484 i'm guessing waiting patiently is the only way to have a replica set and aggregation framework?
[20:54:30] <krispyjala> anybody have any experience converting standalone to sharded?
[21:18:52] <arex\> kali: A couple of questions I don't see immediately mentioned in the docs. 1) Would it be a good idea to have MongoDB logging or data on separate disks? 2) Should the replica members be able to communicate on a separate private network?
[22:25:16] <MartinElvar> Evening guys. i have an issue regarding a query, which returns the field used in the where clause as well.. please have a look here, got code examples http://stackoverflow.com/questions/12754583/mongodb-also-selects-my-where#comment17233408_12754583
[22:27:28] <MartinElvar> I can exclude _id, but players, is still not the root
[22:47:46] <arex\> A couple of questions I don't see immediately mentioned in the docs. 1) Would it be a good idea to have MongoDB logging or data on separate disks? 2) Should the replica members be able to communicate on a separate private network?
[22:54:34] <quazimodo> guys if i do db.foo.find() i can find all of foo
[22:54:55] <quazimodo> the problem I'm having is, i dont know what foo is called in the db, nor do i know what other document types may be in the db, how do I check?
[23:55:41] <arex\> A couple of questions I don't see immediately mentioned in the docs. 1) Would it be a good idea to have MongoDB logging or data on separate disks? 2) Should the replica members be able to communicate on a separate private network?