[00:48:22] <joshua> Can anyone tell me, when you set up a unique index and tell mongodb to dropdups, how does it decide which ones to keep?
[00:48:41] <joshua> Or if I am asking, maybe I shouldn't be using it as a quick way to remove duplicates :)
[00:51:42] <joannac> it keeps the first one it sees, and drops the rest
[00:53:00] <joannac> if you have questions about "how do i tell which is the first one it sees" you should probably do the dropping yourself
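For reference, the operation under discussion looks like this in a 2.x shell (a sketch with a hypothetical collection and field; dropDups was removed in MongoDB 3.0, and which duplicate survives is effectively arbitrary):

    db.coll.ensureIndex({ field: 1 }, { unique: true, dropDups: true })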
[00:57:01] <joshua> Someone restarted a group of app servers and started mongod instead of mongos. What a mess.
[01:00:06] <joshua> The dupes are based off a compound of two fields; I can get those out of the aggregation framework at least. Not a huge amount, so it probably doesn't matter. The client will re-create the data anyway if it's missing.
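A hedged sketch of finding those duplicates with the aggregation framework, assuming hypothetical compound-key fields a and b on a collection coll:

    db.coll.aggregate([
        // group by the compound key and collect the _ids sharing it
        { $group: { _id: { a: "$a", b: "$b" }, ids: { $push: "$_id" }, count: { $sum: 1 } } },
        // keep only the keys that occur more than once
        { $match: { count: { $gt: 1 } } }
    ])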
[01:00:40] <quantalrabbit> trying to install mongodb on ubuntu 12.04 with these instructions and it fails with "W: Failed to fetch http://downloads-distro.mongodb.org/repo/ubuntu-upstart/dists/precise/10gen/binary-amd64/Packages 404 Not Found" Any help greatly appreciated!
[01:01:50] <joannac> with which instructions, quantalrabbit ?
[01:02:17] <quantalrabbit> oops... with these: http://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/
[01:08:40] <quantalrabbit> used salt (saltstack) to add the repo in the file /etc/apt/sources.list.d/mongodb.list, but specified "dist: precise". "dist: dist" works.
[02:38:26] <whatadewitt> hello, is it possible for me to update all records in my mongo database using mongoose to "multiply" a value in my database?
[02:38:55] <whatadewitt> i basically have a constant that i want to multiply all my values by in a single "update" using mongoose
[02:39:08] <whatadewitt> and apparently i need someone to teach me how to explain these things :D
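One way to do this in a single update, as a sketch: MongoDB 2.6+ added a $mul update operator (older servers would need a read-modify-write per document). The model name Item and field price are hypothetical:

    // multiply `price` by a constant factor across all documents
    var FACTOR = 1.1;
    Item.update({}, { $mul: { price: FACTOR } }, { multi: true }, function (err, numAffected) {
        if (err) throw err;
        console.log('updated', numAffected, 'documents');
    });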
[05:15:34] <pasichnyk> i'm trying to write some schema migration scripts in the mongodb shell. If i have a document with a list of properties that are country codes (e.g., { _id: "somedoc", cc: { "us": {}, "cn": {} } } ) how can i enumerate the country code properties?
[05:15:55] <Eckardt> joannac: just wondering if mongo has an api for it, because zorba's (xquery) mongodb interface supposedly maps xquery indexes to mongo's
[05:19:27] <pasichnyk> basically, looking for the equivalent of foreach(doc.cc) { //do something with each country }
[05:22:18] <Eckardt> pasichnyk: for (var attr in doc.cc) { var c = doc.cc[attr]; /* do something */ }
[05:27:53] <pasichnyk> ok, i was making it too hard. Thanks! And in that case attr is just the name of the property, not the actual property, right?
[05:30:27] <Eckardt> yup. ECMAScript 6 has for (... of ...) to iterate over values, which is nicer (mongodb uses ECMAScript 5)
[05:34:01] <pasichnyk> ok, one last question then i need to take off for the night. If i'm enumerating over these fields, and i want to wipe the contents of a child property (e.g. cc.us.sales), what's the best way to do this? I was trying to use $concat to build the field name and then doing a separate update query back to the collection based on the _id, but i'd have to do several of those per document, and i
[05:34:01] <pasichnyk> didn't get $concat working for this. If i go through and modify the document in memory, then call db.collection.save(mydoc), it seems like it could have issues with wiping out other changes that are happening (or does it only update the changed fields when you call save)?
[05:36:23] <Eckardt> mutate in memory using the delete operator (ex delete cc.us.sales), then persist the changes by updating as you normally would with mongo
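A sketch of that delete-and-save approach in the shell, assuming a hypothetical collection coll shaped like the example above; note that save() rewrites the whole document, which is exactly the overwrite concern raised next:

    db.coll.find().forEach(function (doc) {
        for (var country in doc.cc) {
            delete doc.cc[country].sales;  // drop the unwanted child property in memory
        }
        db.coll.save(doc);                 // persists the entire document back
    });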
[05:37:25] <pasichnyk> hrm, i just skimmed through http://docs.mongodb.org/manual/reference/method/db.collection.save/, and it looks like the save method could cause issues with overwriting other changes that are happening (these documents get frequent $inc operations on them).
[05:39:26] <Eckardt> would probably be an issue then, rather than having all of this in a single doc, it might be better to be more relational about it?
[05:41:58] <pasichnyk> ah, we're past that at this point. These are reporting summaries that can be pulled and bound to our javascript viewmodels; there are just accidentally some extra properties under each country that i need to clean up. Plus it's good to get this schema migration stuff figured out before i actually NEED it. :)
[05:42:35] <Eckardt> ok. if you look at http://docs.mongodb.org/manual/faq/concurrency/, it seems that evaluating (db.eval()) locks
[05:44:50] <pasichnyk> ok, well i can build a list of fields that need to be $unset, then build one db.collection.update() call to do them all at once for the given document. In that case though, i just need to be able to dynamically build/get each field name to add to the list. I was trying to do this with concat (e.g., $concat: ["cc.", "$country", ".sales"]). is this a good way to go about that, or is there
[05:44:50] <pasichnyk> something easier/better? I didn't get it working, but seems like it should work and was just some syntax issue..
[05:44:58] <Eckardt> so I would believe that your mongo shell script locks mongo?
[05:46:38] <Eckardt> what was the issue with it? why didn't it work?
[05:47:48] <pasichnyk> hrm, yeah maybe. I'm just executing it through robomongo right now, but i didn't check if other things were blocked on the master. The reporting site still worked fine, but it prefers secondary reads. If the db.eval() is blocking, i can just do this in chunks (so as not to lock everything up for too long) with the delete method.
[05:48:51] <pasichnyk> oh, i was just trying to do this, but it was giving me syntax errors. Didn't work at it enough to get it figured out: {$unset: { {$concat: ["cc.", "$country", ".sales"]}: ""}}
[05:50:24] <Eckardt> yes, because the {$concat...} is supposed to be an attribute name
[05:51:08] <pasichnyk> yeah, that's the thing. I was trying to build that full attribute name dynamically. Is there a way i can extract it out of the 'country' variable or something?
[05:53:13] <Eckardt> it seems, as far as the doc page for $unset goes, that it only accepts a static object
[05:53:57] <pasichnyk> hrm, so the delete method is probably the best then...
[05:55:22] <Eckardt> mongodb doesn't support subqueries, but you could have two queries, one for getting the attribute names, then building the update
[05:59:25] <pasichnyk> yeah, or maybe it's easier to just do this in some other language, where i can more easily build the list of updates that need to happen for the given document, then apply them all at once and move onto the next. I'm just finding that doing it in javascript in the shell is a pain...
[06:00:12] <pasichnyk> probably because i just don't have experience with javascript in the mongo shell, but oh well... :P
[06:02:33] <Eckardt> is this something that you will keep doing often? if it is just for now, using delete would be easy to get going.
[06:03:33] <Eckardt> or just use a mongo driver for your favorite lang (unless it is awk or something), if it would be easier for you
[06:05:31] <pasichnyk> um, i imagine we'll need to do similar stuff one-off and/or at/after deployment time (for some schema migrations, etc) now and then. If there was a nice clean way to do this in javascript that didn't make my head spin, I figured that was fewer moving pieces, and easier to build new scripts. But i'm already at that point so maybe doing this in another language with the mongo driver would be
[06:07:59] <Eckardt> should probably do that anyway then
[06:10:09] <pasichnyk> i just tested out the delete method on a sample doc, and it worked well... but it likely has the chance of losing other updates. So yeah... I'll look into options for providing some locking around that operation (maybe running everything in a db.eval(), but with very limited document scope per run, or something), and look at ways to do it in another language that will still give me ease of
[06:10:09] <pasichnyk> quickly building/running new update "scripts".
[06:11:13] <Eckardt> I don't think you would lose other updates, I believe the concurrency doc says javascript blocks, so the other updates will just occur after.
[06:12:40] <pasichnyk> Oh, i guess i'm not (currently) running this from javascript directly with db.eval, i'm running it in a cursor.forEach(). That won't block, will it?
[06:12:54] <Eckardt> user on stackoverflow says "As a side note, MongoDB actually has more locks than this, it also has a JavaScript lock which is global and blocking, it does not have the normal concurrency features of regular locks."
[06:14:05] <Eckardt> getting more data from a cursor has a read lock
[06:14:12] <pasichnyk> http://docs.mongodb.org/manual/faq/concurrency/#what-kind-of-concurrency-does-mongodb-provide-for-javascript-operations says that in 2.4 there can be multiple javascript operations
[06:15:26] <pasichnyk> Ok, i'll read up more on locking in mongodb, to get a better idea of how this should all behave
[06:16:03] <Eckardt> ah, I see, in that case I would use $unset instead of delete to not lose the updates
[06:17:19] <pasichnyk> which is what i was trying originally, but back to the problem of not being able to define the field name dynamically. (there are way too many of them potentially, and some are truly dynamic so i can't hardcode a list)
[06:18:11] <Eckardt> break it up into separate tasks, rather than a single query.
[06:20:09] <Eckardt> a mongodb aggregate query (that uses $concat) to build the attribute names, then in javascript build a query document for the update, adding the fields to $unset (ex query.$unset['foo'] = ''), and run it.
[06:21:44] <pasichnyk> so just build it all as a string and then run with db.eval, to get around the inability to do dynamic fieldnames?
[06:25:09] <pasichnyk> you know, i'm going to want to do automatic schema upgrades on the fly on documents at some point (at access time) so i'll need to do this in another language anyway. I'll bite the bullet and dive in using the mongo driver, with a language i'm more comfortable with.
[06:25:28] <Eckardt> however way you want to evaluate it, in shell, etc. so, store the result of db.foo.aggregate({$project: {attr: {$concat: ["cc.", "$country", ".sales"]}}}), iterate over each of the results, building an update object (ex var unset = {}; ... unset['cc.us.sales'] = '';) then db.foo.update(..., {$unset: unset})
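Put together as one shell sketch, building the dotted field names in plain JavaScript rather than via $concat, so each document gets a single targeted $unset (collection name foo taken from the example above):

    db.foo.find().forEach(function (doc) {
        var unset = {};
        for (var country in doc.cc) {
            unset["cc." + country + ".sales"] = "";   // dynamic dotted field name
        }
        if (Object.keys(unset).length > 0) {          // $unset may not be empty
            db.foo.update({ _id: doc._id }, { $unset: unset });  // leaves concurrent $inc ops intact
        }
    });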
[06:26:19] <pasichnyk> its way past my bedtime, so i'm going to split. Thanks again for your help, and have a great night. I'll try out that example method in the AM. :)
[06:26:56] <Eckardt> no problem, you too. hope it works, or just use something you are more comfortable with. :)
[06:27:13] <pasichnyk> yeah, either way i'll give it a go tomorrow. Cheers.
[08:38:26] <laprice> Which npm module is the best for talking to mongodb?
[08:55:50] <scottyob> Howdy all. Is it possible to aggregate on my "community" fields in this kind of dataset? { "_id" : ObjectId("531ea4acf4250d104e713dfc"), "communities" : { "58698:100" : 69, "58698:101" : 10, "58698:102" : 40 }, "date" : "2014-03-11", "username" : "scotto" }
[09:46:22] <Diplomat> Guys, is there some kind of web GUI for mongodb ?
[10:09:35] <intrepid-> I went through some of it early last year
[10:10:57] <Nodex> easier to learn from Google imo
[10:11:43] <intrepid-> Depends how fast you need to dive in. Was good in my case since I had no immediate requirement. Just wanted to learn a little at a time.
[10:13:21] <Diplomat> the thing is.. I'm not used to things being this simple
[10:15:38] <Nodex> it can handle it - if your hardware can handle it
[10:15:42] <intrepid-> you've lost us with the "festival"
[10:16:10] <Diplomat> i know one guy who gets high and then he just sits quietly and laughs.. and his eyes are straight lines.. so we call him mongo festival, and somehow every time i hear the name "mongodb" i think about this guy
[11:22:29] <Nodex> in the event a user wants to see more comments then the join happens, but it's infrequent; you get speed and scalability for documents but you trade space and management difficulties
[11:22:38] <Diplomat> http://puu.sh/7rfgy.png guys.. any ideas why my query doesn't return anything?
[11:23:05] <Diplomat> it should return ~900 arrays
[11:23:50] <Froobly> "it really depends on how you intend to access the data" <-- so you weren't talking about joins as one of the options here Nodex?
[11:24:10] <Froobly> if not, what would i be deciding between?
[11:24:32] <Nodex> Froobly : I mean more along the lines of access patterns
[11:25:28] <Nodex> in the blog post example, if the post itself (including the embedded comments) is accessed more than the rest of the comments, then this makes sense
[11:26:01] <Nodex> if your app finds that one more page of comments is commonly accessed, then it makes more sense to embed more comments into the original post to save the extra query
[11:26:16] <Froobly> right, so if all the items are almost always viewed with the lists then shove them all in the same collection
[11:27:06] <Nodex> it's really down to your app. you know it better than we do; we can't comment on what might be best
[11:27:29] <Froobly> just wondering if my thinking is getting any better, trying to change my thinking from sql
[11:27:41] <Nodex> if your aim is performance then you want the path of least resistance (the path with the least number of queries)
[11:28:40] <Nodex> here is another example. I have an email alert list, a user creates a list, in relational DB's you might just store the UID and do a join, whereas I would store the UID, name, email, other info I needed to send out the alert
[11:29:24] <Nodex> that does mean that if the user changes their email I have more collections to update, but that's less frequent than the reads needed for the alert
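As a sketch, such a denormalized alert-list document might look like this (all names hypothetical); the subscriber details are copied in so sending the alert needs no join:

    {
        _id: ObjectId("..."),
        listName: "price-drop-alerts",
        subscribers: [
            { uid: 42, name: "Ann", email: "ann@example.com" },
            { uid: 97, name: "Bob", email: "bob@example.com" }
        ]
    }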
[12:03:50] <intrepid-> yes. But keep in mind that you are not replicating yet. And once you are, you need to consider how consistent you want the writes to be
[14:06:23] <Number6> nicktodd: doesn't --src need an IP address?
[14:07:14] <nizze> I have a really huge dataset (64 million records) and I'd like to reduce it to a smaller dataset. Put simply: I would first group by each record's url and then by the record's created-at, split that into 15-minute intervals, and keep each interval's first record.
[14:07:34] <nizze> But now it seems that ordinary mongo tools do not cut it any more.
[14:17:19] <nizze> Nodex: yes, but I hit all sorts of memory limits on that one also. I'd be happy to take a batch of 1000 * 100 records and process them.
[14:17:37] <nizze> And then move to next 1000 * 100
[14:19:26] <Nodex> you're going to have to map/reduce I think, because grouping and/or aggregation will hit the memory limits
[14:25:00] <nizze> When returning the results of a map reduce operation inline, the result documents must be within the BSON Document Size limit, which is currently 16 megabytes.
[14:25:25] <xerxas> Hi ! I came across this code: https://github.com/mongodb/mongo/blob/master/src/mongo/db/geo/geoquery.h#L76
[14:25:44] <xerxas> Can I use s2cell to query my data ?
[14:26:04] <xerxas> couldn't find any doc on this , but I want to sort my results by s2cells
[14:26:29] <Nodex> nizze : then you will have to pipe the map/reduce back to a collection or something then. It's not something I know a great deal about
[14:26:30] <nicktodd> Number6: something like this: -A OUTPUT -p tcp -d mms.mongodb.com --dport 443 -j ACCEPT ?
[14:29:25] <nizze> Nodex: thanks for your help, I'll check the map/reduce anyways
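A hedged sketch of that map/reduce, assuming hypothetical documents shaped like { url: ..., created_at: ISODate(...) } in a collection named records; the output goes to a collection rather than inline, sidestepping the 16 MB limit quoted above:

    db.records.mapReduce(
        function () {
            // bucket created_at into 15-minute (900000 ms) intervals per url
            var bucket = Math.floor(this.created_at.getTime() / 900000) * 900000;
            emit({ url: this.url, bucket: bucket },
                 { doc: this, t: this.created_at.getTime() });
        },
        function (key, values) {
            // keep the earliest record seen in each bucket
            var first = values[0];
            for (var i = 1; i < values.length; i++) {
                if (values[i].t < first.t) first = values[i];
            }
            return first;
        },
        { out: "records_reduced" }  // write results to a collection
    );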
[14:46:51] <bodie_> looking for a little guidance with mgo if anyone is familiar with it
[14:47:11] <bodie_> i'm trying to marshal into a struct that has a slice in it...
[14:48:04] <bodie_> basically, to get a document that has a list of strings as one of the fields... in go, which deals with these things in a slightly weird way
[14:48:34] <kAworu> I'm trying to use the $where operator in Node.js with a context but I could not find how.
[16:14:36] <Nodex> (I have never tried this but).... {field : {$in: [false,null,""]}} ..... might cover use cases better and I am pretty sure that $in uses an index
[16:15:08] <wc-> i mean this code is looking for exactly a boolean query like field: false or field: true
[16:15:17] <wc-> and rewriting it to field: {'$ne': opposite}
[16:23:31] <wc-> Nodex: thats a good point, these fields are either true or false based on schema validations; if we have a null or some other value then something else has gone wrong
[16:23:53] <wc-> so if the field is strictly true or false, it seems like converting x: true to x: {'$ne': false} and missing indexes is a bad thing?
[16:26:35] <Nodex> wc- if you're certain of the values then you can do false but if you're not then might make sense to include the rest
[16:26:50] <wc-> in this specific case we are certain
[16:26:59] <Nodex> and yes, if you know the value is true or false then that conversion is a very silly thing
[16:27:15] <wc-> and it gets included in every. single. query.
[16:27:20] <Nodex> in his defence he might not have known that it didn't use an index
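To illustrate the point, assuming a hypothetical boolean field active with an index on it: the equality form can walk the index directly, while the negated form is typically far less selective:

    db.users.ensureIndex({ active: 1 });
    db.users.find({ active: true });             // equality; index-friendly
    db.users.find({ active: { $ne: false } });   // negation; usually examines most of the index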
[17:08:14] <Gargoyle> zumba_addict: MongoDB doesn't have a rigid schema like a relational database. You can have documents with different structures in the same collection.
[17:08:45] <gsd> can someone point me to a resource that explains the relationship between servers and replica sets? they each have pooling and im not sure how many pools/conns get created when you have a server pool of 5 and a replSet pool of 5
[17:09:33] <gsd> if i have 3 mongods (1 primary, 2 secondaries) and my server conf has pool size 5 and my replSet has a pool size 5, how many pools and connections do i have?
[17:33:55] <zumba_addict> sorry folks, got invited to grab some food outside. I'm back now
[17:34:18] <zumba_addict> yes Gargoyle, I meant documents
[17:38:03] <zumba_addict> looks like it recreated it on my other mongod :)
[17:38:12] <zumba_addict> all I did was insert new data
[18:01:59] <MacWinner> when setting up mongo for the first time in an environment that you expect to grow in the medium term, should you just set up a basic replica set?
[18:02:17] <MacWinner> is that the best starting point to expand to the 11 node super HA cluster config?
[18:02:59] <MacWinner> I currently have a single mongodb node that I've been testing stuff on.. it's not high volume at all, but I want it to be as close as possible to the future state
[18:04:42] <starfly> MacWinner: your mileage will vary, the more setup done earlier, the better you'll have a feel for how the environment will play when you get to the super config, but setup is work...
[18:05:28] <MacWinner> starfly, so basic replica set config at a minimum? or would you have a higher minimum config?
[18:05:57] <MacWinner> according to the docs, it seems like it's a very simple task to upgrade a replica config to sharding and clustering
[18:08:28] <starfly> MacWinner: from my perspective, a replica set is primarily needed to provide data redundancy and horizontal scaling of reads. In any case, it's not a big deal (assuming an outage is OK) to reconfigure into a replica set. Sharding takes more design time to ensure you will probably get what you want from it, which is mostly horizontal scaling of writes. Shard key selection is "key," pardon the pun.
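A minimal three-member replica set initiation, as a sketch (replica set name and hostnames hypothetical):

    rs.initiate({
        _id: "rs0",
        members: [
            { _id: 0, host: "db1.example.com:27017" },
            { _id: 1, host: "db2.example.com:27017" },
            { _id: 2, host: "db3.example.com:27017" }
        ]
    });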
[18:39:25] <enjawork> hello, i'd like to set up an architecture where i have a "replica" of my production database that i can run aggregations on periodically without interfering with production performance
[18:39:47] <enjawork> is there some way i can use replica sets for this? or do i need to do a custom data pipeline to get data from my production db to my "reporting" db
[18:39:59] <cheeser> you can do aggregations against a secondary
[18:40:19] <enjawork> cheeser: and it will write to the collection in the secondary or in the primary?
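A sketch of cheeser's suggestion from the shell (collection and fields hypothetical): connect to a secondary, allow reads on it, and run the aggregation there. The aggregation only returns results to the client; it writes nothing back to either member unless you store the output yourself:

    rs.slaveOk();  // permit reads on this secondary connection
    db.orders.aggregate([
        { $group: { _id: "$status", n: { $sum: 1 } } }
    ]);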
[19:02:52] <joannac> switch to replication and you can
[19:02:55] <leifw> ernetas: you should probably use replica sets
[19:03:43] <joannac> with master/slave, i believe you can't sync slaves off other slaves
[19:05:06] <ernetas> leifw: can I then control which host is a slave and which is a master? I need to lock this, so I could perform backups...
[19:05:38] <leifw> you can use priority to control which is the primary
[19:06:40] <leifw> and if you want to chain a secondary off another secondary I'm not quite sure how to force that, but if the network ping times suggest that's better I think the other secondary will elect to sync from the first secondary
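A sketch of steering the primary with priorities, per leifw's suggestion (member indexes hypothetical):

    cfg = rs.conf();
    cfg.members[0].priority = 2;   // preferred primary
    cfg.members[1].priority = 1;
    cfg.members[2].priority = 0;   // never elected primary; convenient for backups
    rs.reconfig(cfg);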
[19:10:40] <betty> I have a data modeling question (new to mongo). I see my data more as a many-to-one relationship as opposed to a one-to-many. I have a lot of documents (many) that will share common meta-data (one). I'm currently including that meta-data in each document so that I can find these documents by the meta-data properties in a single operation. This will be common in our queries.
[19:12:00] <betty> I'm interested in any thoughts about why this might be bad. And what an alternative may be. Obviously, it's bad if we ever need to modify the meta-data
[19:12:30] <rafaelhbarros> betty: well, you will just have to change all the documents with that metadata...
[19:13:54] <betty> rafaelhbarros: I may be able to live with that, given how unlikely we believe it may be. If it does happen, we're talking about 1600 documents on average.
[19:14:13] <starfly> betty: just continue embedding if you're OK with updating metadata changes, or link (logical join) to a metadata collection
[19:15:08] <rafaelhbarros> betty: 1600 documents is nothing; if you ever have to update that and it's not something realtime, it would take minimal effort
[19:15:51] <betty> rafaelhbarros: that was my hope
[19:17:03] <betty> rafaelhbarros: I do want to index on this metadata
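A sketch of betty's layout (all names hypothetical): each document embeds a copy of the shared metadata, and a compound index on the embedded fields serves the common lookup in one operation:

    {
        _id: ObjectId("..."),
        payload: { /* document-specific data */ },
        meta: { project: "apollo", region: "us-east", owner: "betty" }
    }

    db.docs.ensureIndex({ "meta.project": 1, "meta.region": 1 });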
[20:47:38] <ernetas> Damn. Sorry, my terminology is still from MySQL... It should be a secondary (priority=0, hidden=true, delay = 2 hours). What seems to be the problem is that it has been in state STARTUP2 ever since launching. HDD load has calmed down by now.
[20:49:37] <bretep> Hi guys! Anyone have access to the MMS Support jira project? I have an issue open and was hoping for some help.
[20:50:07] <ernetas> I added a clean host to the replica set and it should have been syncing, but it does not seem to be... it only allocated space, has been doing some random writes for the past couple of hours, and has now calmed down... Is there a way to see the status of this?
[20:51:16] <joannac> ernetas: okay, so it's not a secondary, hence the message
[20:51:23] <joannac> ernetas: check the mongod log
[20:51:56] <ernetas> What should I be looking for? There are 26000 lines by now.
[20:52:28] <ernetas> Lots of lines about a connection initiated by the primary host, checking for a heartbeat.
[20:53:12] <ernetas> "[rsSync] 12955486 objects cloned so far from collection xxx" - is there a way to see status of this?
[20:53:46] <ernetas> (apart from grepping the log)
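Two shell helpers that surface initial-sync and replication state without grepping the log (exact output varies by version):

    rs.status();                      // per-member state, e.g. STARTUP2 while cloning
    db.printSlaveReplicationInfo();   // how far each secondary lags the primary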
[21:53:01] <abonilla> Does anyone run an elastic mongodb cluster here? how about private and elastic?
[22:47:23] <hoverbear> Hi all, does Mongoose have race conditions?
[22:47:53] <andrewfree> Can I do queries like this? UserMonth.first.providers.where(:name.in => ["moves","jawbone"]).all.days.where(:date.gte => "2014-03-10").to_a My UserMonth has many providers, providers embeds days
[22:48:01] <andrewfree> hoverbear: Did you ask it?
[22:49:17] <andrewfree> My query pretty much works if instead of name.in I just use :name and a single "moves", then first.days.where etc...
[22:49:48] <andrewfree> but I'm not sure if there is a way to iterate/move through the two things it should return.
[22:54:34] <ernetas> Seems like I'm stuck at STARTUP2 phase...
[22:55:06] <ernetas> It has rebuilt the index (a foreground job from the primary) multiple times...