[01:40:33] <nicken> also, on a side note, I'm using collections as a sort of way to differentiate between types of documents
[01:40:42] <nicken> e.g. I'll have a users collection, articles collection, etc.
[01:40:49] <nicken> kind of in the same way as you would tables in SQL DBs
[01:41:04] <nicken> is that abusing the idea of collections?
[01:41:10] <cheeser> DBRefs keep explicit track of the remote collection, but if it's always the same collection, save some space and just use the ID values
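A minimal sketch of the two shapes cheeser is contrasting, using a hypothetical users/articles pair (collection and field names are illustrative):

    // Plain ObjectId reference: enough when the target is always the same collection.
    var uid = ObjectId();                        // generate the user's _id up front
    db.users.insert({ _id: uid, name: "alice" });
    db.articles.insert({ title: "hello", authorId: uid });

    // DBRef: also records which collection (and optionally which db) is being referenced.
    db.articles.insert({ title: "world", author: new DBRef("users", uid) });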
[03:10:07] <sinelaw> retran the problem "as a db user" is that my machine can't possibly keep 20gb in memory, and it looks like mongo hits the disk way too often. I'm trying to understand if these numbers are "the usual" or if there is some config I should know about
[03:10:22] <sinelaw> or maybe mongo isn't a good match for this use case
[03:10:57] <retran> what does 20GB in memory have to do with anything
[03:11:25] <retran> stop looking at disk hits and disk space usage
[03:58:33] <joannac> you can see how it's implemented, and can translate it to Node
[04:22:09] <Soothsayer> I want to keep track of a Customer Session in my e-commerce site (e.g. User X viewed Product A, then Product B, then Category C, Logged in, Added to Cart, etc.). Should I be storing one document per event or creating a document per hour with an array of events?
[04:22:27] <Soothsayer> one document per hour per customer session*
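The two shapes Soothsayer is weighing, sketched with hypothetical collection and field names; one document per event is simpler to write and index, while bucketing by hour trades that for fewer, larger documents:

    // Option 1: one document per event.
    db.events.insert({ session: "sess123", at: new Date(), type: "view_product", product: "A" });

    // Option 2: one document per customer session per hour, events $push-ed into an array.
    db.sessionEvents.update(
      { session: "sess123", hour: ISODate("2014-04-01T04:00:00Z") },
      { $push: { events: { at: new Date(), type: "add_to_cart", product: "B" } } },
      { upsert: true }
    );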
[05:48:26] <nicken> aren't DBRefs good to use for efficiency?
[05:48:55] <nicken> I guess what I'm wondering is, do they provide any advantage over manual IDs?
[06:30:22] <Soothsayer> nicken: how do you define efficiency?
[06:30:34] <Soothsayer> nicken: if you want to be efficient, dbrefs might not have an advantage over simple references
[06:30:56] <nicken> do DBRefs reduce the number of queries I need to write?
[06:31:07] <Soothsayer> nicken: what language are you using?
[06:31:10] <nicken> or rather, the amount of code I need to write?
[06:39:32] <Soothsayer> i don't see why you wouldn't use an ODM
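Either way the shell will not join for you: resolving the reference is a second query, and an ODM merely hides that step. A sketch continuing the hypothetical documents above (the shell's DBRef helper has a fetch() method):

    // Resolving a plain ObjectId reference: one extra find.
    var article = db.articles.findOne({ title: "hello" });
    var author  = db.users.findOne({ _id: article.authorId });

    // Resolving a DBRef: still a second round trip, just spelled differently.
    var article2 = db.articles.findOne({ title: "world" });
    var author2  = article2.author.fetch();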
[07:31:27] <c3l> I have two models, objA and objB, and they both reference each other. When I create a new objA, a new objB should also be created, whose id should be put in objA.refB, and the id of objA should be put into objB.refA. Can I do this with less than 3 database requests? :)
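One way to get this down to two requests is to generate both _ids client-side before writing, so each document can embed the other's id from the start; a sketch with hypothetical collection names (the two inserts are still not atomic):

    // Pre-generate both ids so the cross-references are known before either insert.
    var idA = ObjectId();
    var idB = ObjectId();
    db.objA.insert({ _id: idA, refB: idB });
    db.objB.insert({ _id: idB, refA: idA });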
[08:39:10] <KamZou> Hi, given the following entries: http://pastebin.com/yJucA887 what query could I use to list those with a date lower than 20140101, please?
[09:12:24] <kas84> I am having trouble with an aggregate query that takes a nondeterministic amount of time to execute depending on whether I make other queries to the mongodb or not
[09:12:34] <kas84> doesn’t make a lot of sense to me :/
[09:39:11] <Nodex> "mappedWithJournal" : 159732 <---- I think your system is thrashing when you're doing the aggregations
[09:44:44] <KamZou> Nodex, do you know how I could construct a query to get the elements whose "max" field is lower than a specific date, given these: http://pastebin.com/yJucA887 ?
[09:57:39] <Nodex> I don't even know what your data is supposed to be. Can you format it?
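Without seeing the pastebin the shape is a guess, but assuming each document carries a "max" field holding a zero-padded year-first string like "20131225", string comparison orders the same way as the dates, so a plain $lt works:

    // Hypothetical document shape: { name: "...", max: "20131225" }
    db.mycollection.find({ max: { $lt: "20140101" } });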
[10:23:14] <future28> Hi there, say I have a file full of IDs that I wish to find the addresses for in my database, is there a way to use mongo to iterate over these IDs? Or shall I just write a script to output queries?
[10:23:53] <Vejeta> Hello, I have a question about django and mongodb. Is this a good place to ask?
[10:24:36] <future28> So I have in my database: UserID and EmailAddr - I collect the user IDs that logged into the site within the last 2 weeks and want to send them all an email, so I need to get the email addresses from the DB. My file full of UserIDs is just newline delimited
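One low-tech way to avoid a per-id loop, assuming a hypothetical users collection with UserID and EmailAddr fields: read the newline-delimited file in the mongo shell with its cat() helper, then issue a single $in query (the file path is illustrative):

    var ids = cat("/tmp/userids.txt").split("\n").filter(function (l) { return l.length > 0; });
    db.users.find({ UserID: { $in: ids } }, { UserID: 1, EmailAddr: 1 });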
[10:31:17] <harryy> Hey. :) I realise MongoDB always listens on a Unix socket - but I can't find a way to connect to a unix socket using the mongo shell - is this supported?
[11:07:45] <Guest19432> Will indexing sub documents result in compound indexes?
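They are different things: an index on a subdocument field as a whole indexes the entire embedded document (useful mainly for exact-equality matches on the whole subdocument), while a compound index names the individual subfields with dot notation. A sketch with hypothetical names:

    // Single-field index on the whole embedded document.
    db.things.ensureIndex({ address: 1 });

    // Compound index on individual subdocument fields.
    db.things.ensureIndex({ "address.city": 1, "address.zip": 1 });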
[11:09:38] <Vejeta> What's the best subdocument structure to query django-tastypie with mongoDB as backend? https://stackoverflow.com/questions/22792828/best-subdocument-structure-to-query-django-tastypie-with-mongodb-backend Hopefully, someone here knows :)
[13:36:02] <slap_stick> hey i have set auth to true in mongo yet i am still able to login without any authentication, is there something else that is required? i'd kind of have presumed that it would just fail if i made a connection unless credentials were supplied
[13:41:16] <harryy> slap_stick: there's a localhost exception by default
[13:41:25] <harryy> i think it's removed after the first user is created, not sure
[13:41:48] <harryy> or after mongo is restarted, idk
[13:42:33] <slap_stick> yeh it is still allowing me to login remotely without any auth
[13:42:39] <slap_stick> and mongo has been restarted
[13:42:46] <slvrbckt> hey guys, i was wondering if there was a way to query for records /after/ a specific _id?
[13:42:49] <slap_stick> well to /test and then i can use db and switch db's
[13:43:01] <slvrbckt> ie. sorted by date, give me 5 records that occur after X id
[13:49:36] <vparham> @slap_stick, can you actually see the dbs (show dbs) or see any data (show collections, db.collection.find())?
[13:51:56] <slap_stick> if i do show dbs i get unauthorized or if i run any queries then yeh it wont work
[13:52:37] <vparham> That is similar to the behavior I see using keyfile auth. I couldn't find anything in the doc but assume it's "as intended" behavior.
[13:52:45] <vparham> Would love to know otherwise.
[13:53:50] <slap_stick> yeh kind of odd behavior, presumed it would just fail to connect completely as opposed to allowing a connection through
[14:28:59] <balboah> anyone know why mongorestore has chosen to restore indexes as the last step? is it a pro or was that just how it ended up being?
[14:30:17] <kali> balboah: it's just more efficient this way
[14:30:58] <skot> That is how it is designed for various reasons, mostly due to efficiency and optimizations.
[14:48:00] <MattBo> I think I have a simple question, but I'm new to mongo and kinda new to document databases. need a little help seeing if this is a problem mongo can even solve or if I should write my own...
[14:48:53] <MattBo> let's say I've got a whole buttload of these types of documents: { _id:"asdf", people:["matt","steve","mark"] }
[14:49:51] <MattBo> now, I want to know what _ids "matt" has access to, but doing a contains query against all of these is inefficient, I'd rather map the documents to something like: { person:"matt", ids:["asdf","asdf2","asdf3"] }
[14:50:10] <MattBo> so, basically, I want to map the original document to a set of new documents and "flip" the relationship.
[14:50:38] <MattBo> I feel like this is something a map/reduce could do, but I'm not sure how I could handle this. the alternative is I have to write a routine that manages the second documents when the first one is edited.
[14:51:13] <kali> first, forget about map/reduce. this is achievable with the aggregation pipeline
[14:51:24] <MattBo> oooh, please tell me more! =-}
[14:54:07] <kali> secondly, if this is an occasional query, aggregation pipeline (with an index on people) may be enough. But if this is a frequent query, you need to maintain the denormalisation at write-time.
[14:54:44] <MattBo> ok, that's what I was thinking of... the initial document is strictly for management, the second document would be queried pretty often
[14:54:48] <kali> basically, you have to pay for it at each read, or at each write. if your app is read intensive, you'll probably want to move this to write time
[14:58:27] <kali> to be honest, I think there is a theoretically possible optimisation with a covered index in your case, but I think the optimiser does not know how to perform it yet
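A sketch of the read-time version kali describes, assuming the documents above live in a hypothetical "acl" collection. For a single person a plain query on a multikey index is usually enough; the aggregation flips the whole relationship in one pass:

    // Which _ids can "matt" see? An index on the array makes this a direct lookup.
    db.acl.ensureIndex({ people: 1 });
    db.acl.find({ people: "matt" }, { _id: 1 });

    // Flip the whole relationship: one output document per person.
    db.acl.aggregate([
      { $unwind: "$people" },
      { $group: { _id: "$people", ids: { $addToSet: "$_id" } } }
    ]);
    // => e.g. { _id: "matt", ids: ["asdf", "asdf2", "asdf3"] }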
[15:00:40] <MattBo> kali, thanks for the help, I think this puts me in a great direction.
[15:00:53] <kali> MattBo: you may want to vote for SERVER-4463 :)
[15:31:53] <pradeepc_> I am running mongotop 1 but the total time it is showing is greater than 1 second
[15:35:36] <Katafalkas> Hey, is there a way to install only mongos on a server using official 10-gen distribution ?
[15:39:01] <pradeepc_> can anyone please explain the mongotop output to me
[15:40:34] <andrewferk> Yesterday, a co-worker and I were benchmarking Mongo with bulk importing. We started by running the node import script and mongo on the same vm.
[15:41:28] <andrewferk> We saw really good performance. Then, we moved the import script onto its own VM, as the mongo VM was I/O bound, and we saw even greater performance
[15:43:17] <andrewferk> We then set up a sharded environment, and the bulk import benchmark tanked, taking about 20% longer than when everything was on one vm
[15:43:49] <andrewferk> It seemed that the bottleneck was the mongo router (mongos)
[15:44:22] <andrewferk> We were under the impression that mongos was lightweight, but it was consuming about 50% of one of the vCPUs
[15:44:37] <andrewferk> The only queries being run were inserts
[15:45:21] <andrewferk> Is there an explanation why mongos would be slow on batch inserts? Or maybe our shard is misconfigured?
[15:46:39] <pradeepc_> My read queries are taking extremely long. The database is also very small, just 1GB. I want to know how I can debug
[15:46:45] <pradeepc_> can someone help me out please
[15:48:18] <cplantijn> Has anyone attempted to install mongo on a Hostgator/Bluehost server? I followed this tutorial http://rcrisman.net/article/11/installing-mongodb-on-hostmonster-bluehost-accounts. I get an error however
[15:49:32] <cplantijn> My bluehost account has a dedicated IP, but I get an error: couldn't connect at src/mongo/shell/mongo.js:145
[16:14:41] <rafaelhbarros> pradeepc_: with that you can check if you have the right indexes and what not
[16:16:59] <Katafalkas> Is there a way to install only mongos, without installing entire mongodb from ubuntu distro ?
[16:19:21] <kali> andrewferk: well, mongos has an overhead... but there might be something else. have you pre-split the data? if you haven't, all writes are sent to the same shard
[16:19:51] <pradeepc_> rafaelhbarros: Thank you for the help. I actually have the same data on two mongodb machines, both with no indexes. When I try db.collection.find().explain(), I see it taking much longer on one machine than on the other
[16:20:05] <kali> andrewferk: then, the point of sharding is not better absolute performance, it's better scalability. so sharding will harm you before it helps you
[16:20:38] <rafaelhbarros> pradeepc_: well, what is the distribution of data?
[16:20:51] <kali> andrewferk: then, a struggling mongos is easy to scale: just add one and split the load
[16:22:27] <pradeepc_> rafaelhbarros: I didn't get you. Can you please explain what you mean by distribution of data?
[16:23:04] <rafaelhbarros> pradeepc_: how much of the data is coming from the slow machine?
[16:24:21] <pradeepc_> rafaelhbarros: mongotop shows 182502ms for one of the namespace in read column.
[16:24:29] <kali> pradeepc_: the two machines have a copy of the same data, and one is slower than the other ?
[16:35:29] <kali> pradeepc_: not from... is it php ?
[16:35:54] <pradeepc_> kali: ohh ok .. this time I am in shell only
[16:36:01] <rafaelhbarros> same for python, it returns a cursor, but once you try to iterate through it, it locks
[16:36:49] <pradeepc_> kali: One difference I can see is the mongo version. The better-performing machine is on 2.4.3 and the worse-performing one is on 2.4.8
[16:37:09] <rafaelhbarros> pradeepc_: are the machines with the same hardware?
[16:37:21] <pradeepc_> yes both are AWS c1.xlarge instances
[16:37:48] <abhishek__> hey, guys. Do you think I can query time differences in mongo if I have dates in day-month-year format stored as strings?
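With day-month-year strings the comparison operators compare text, not time, so range queries and differences won't behave; the usual fix is to store real Date values (or at least zero-padded year-first strings). A sketch of the Date approach with hypothetical names:

    // Store dates as Date objects instead of "25-12-2013" strings...
    db.orders.insert({ item: "x", createdAt: ISODate("2013-12-25T00:00:00Z") });

    // ...then range queries behave chronologically.
    db.orders.find({ createdAt: { $gte: ISODate("2013-12-01T00:00:00Z"),
                                  $lt:  ISODate("2014-01-01T00:00:00Z") } });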
[16:38:14] <kali> rafaelhbarros: well, performance between two instances can vary a lot, so...
[16:38:47] <pradeepc_> But this difference is huge .. And I don't see any system parameter becoming a bottleneck
[16:39:19] <rafaelhbarros> pradeepc_: well, a find like that, with no limit, can be the factor. are you reading the second one over the network or directly in the other shell?
[16:39:21] <andrewferk> kali: We were expecting an initial decrease in throughput, but not at the size we saw. And why would mongos be utilizing so much CPU?
[16:39:26] <rafaelhbarros> pradeepc_: is the second one a replica?
[16:39:33] <kali> pradeepc_: is that relevant ? because fetching 44k docs is not really a real-life usecase...
[16:40:19] <andrewferk> kali: also, we noticed that we never set the ulimit for mongos. Could that be a possible issue if we are slamming it with inserts?
[16:40:42] <pradeepc_> rafaelhbarros: there is no replication at all.
[16:40:49] <kali> andrewferk: which ulimit are you concerned about ?
[16:40:51] <rafaelhbarros> andrewferk: well, you need to do that anyways
[16:40:59] <kali> pradeepc_: how did you copy the data from one server to the other ?
[16:41:01] <andrewferk> kali: and our data was splitting correctly, it was just really slow
[16:41:42] <kali> andrewferk: you would probably get better result with a pre-split
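A sketch of what kali means by pre-splitting, run against mongos and assuming a hypothetical test.events collection already sharded on { uid: 1 }: create the chunk boundaries (and optionally spread them) before the bulk load, so inserts fan out instead of all landing on one shard. The split points and shard name are illustrative:

    sh.splitAt("test.events", { uid: 250000 });
    sh.splitAt("test.events", { uid: 500000 });
    sh.splitAt("test.events", { uid: 750000 });

    // Optionally move some chunks so the ranges start out balanced across shards.
    sh.moveChunk("test.events", { uid: 500000 }, "shard0001");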
[16:42:02] <andrewferk> i think the default ulimits are 1024. We use the recommended 64000 for the shards. But the mongos was still at the default of 1024
[16:49:47] <pradeepc_> ok kali thank you for helping me :) :)
[16:49:54] <kali> also, consider using m1 class machines. mongo is not cpu greedy, but it loves its memory
[16:50:12] <kali> you may get better performance for the same price
[16:51:33] <pradeepc_> kali : I will try m1 instances as well and see performance difference
[16:52:29] <kali> pradeepc_: benchmarking on ec2 is hard, you need to bench on several "identical" machines and aggregate the results, or else you'll just get crap
[17:11:20] <andrewferk> OK. It was the only thing I could think of that we didn't do
[17:12:16] <andrewferk> I thought -n 1024 vs -n 64000 was pretty significant when we are trying to do batch inserts, but i really have no clue :)
[17:13:10] <kali> not really. -n is file descriptors, so it has to cover tcp connection and actual files. It would generate errors anyway, not slow it down
[17:14:01] <andrewferk> we were getting about 7500 inserts/sec when we had a mongo setup on a m3.2xlarge ec2 instance mostly using ephemeral ssd
[17:14:30] <andrewferk> without vertically scaling, we would like to see this number at 10K+ inserts/sec, so that's why we are benchmarking sharding
[17:15:09] <andrewferk> having 2 shards, we were not coming close to 7500 inserts/sec, so we assume we are doing something wrong
[17:15:52] <andrewferk> we backtracked, and discovered that the issue happens when we set up sharding.
[17:16:11] <andrewferk> we had node-vm inserting into mongo0-vm
[17:16:49] <pradeepc_> one question: do mongodump and mongorestore copy indexes as well?
[17:18:36] <andrewferk> and it knows to only send all requests to a single shard
[17:18:48] <pradeepc_> cheeser: so if i create some index on some custom field.. and take a mongodump and restore it on a different machine, the indexes will also get copied ?
[17:18:51] <andrewferk> mongos uses 50% cpu and slows the insertions/sec
[17:19:02] <cheeser> pradeepc_: try it and see but yes it should
[19:32:39] <coreyfinley> Does anyone know of an easy way to clear the cache for MongoDB? I'm running some aggregations on a local dataset, and I need to benchmark the run times. But after the first one it just returns the copy from memory.
[19:33:39] <cheeser> i'm not sure that's really true. i don't recall ever hearing that aggregation results are cached in memory.
[19:33:58] <cheeser> now, the collection data processed might still be in memory but that's what you want anyway
[19:34:28] <coreyfinley> ruby benchmark results estimate something like 20 seconds for one of my jobs, then when rerunning it, it comes back as around 4. With no changes to the calculation.
[19:34:36] <LucasTT> nevermind, ran it with --repair and worked
[20:17:51] <proteneer> wtf there is a 10MB overhead for each mongodb connection?
[20:25:17] <skot> It is the default stack size (per thread) and configurable
[20:38:45] <andrewferk> kali rafaelhbarros thanks for your help earlier. We are no longer having mongos issues today. We weren't able to figure out the difference from yesterday. I'm guessing it was an incorrect setup
[20:38:54] <andrewferk> Someone else set it up yesterday, and set it up again today
[20:38:59] <andrewferk> didn't work yesterday, works today
[20:39:41] <andrewferk> i'm just excited to see it working
[21:57:51] <ctorp> Is there an easy way to target an array index by an attribute on one of the contained objects with mongo?
[21:59:20] <ctorp> I basically have {foo: [{id:1},{id:2}]} and want to do something like db.test.findById where id==2, but hopefully with some mongo internal instead of doing a js loop
[21:59:44] <ctorp> where foo[index].id == 2, I mean
[22:02:22] <skot> You can just issue this query: find({"foo.id" : 2}) using dot-notation
[22:10:49] <ctorp> skot: if I do a lookup based on another property like db.test.findByIdAndUpdate({vers:123}...) can I still do a subdocument update only where "foo.id"==2 ?
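For updating just the matched element, the positional $ operator covers this: match on both the outer field and the array condition, then address the matched element with "foo.$". A shell sketch (field names from the example; the "flag" field is illustrative, and a driver or ODM helper like findByIdAndUpdate takes the same update document):

    db.test.update(
      { vers: 123, "foo.id": 2 },
      { $set: { "foo.$.flag": true } }
    );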