[04:01:19] <cagmz> How do I return a document with the min and max for a key? I know min is db.cities.find().sort({pop: 1}).limit(1) , and another query is for max, but I need both min and max in a single query
[04:27:22] <Croves> is it possible to create collections like these in MongoDB? http://pastie.org/10797174
[04:27:36] <Croves> I mean, using untitled subarrays inside a key
[04:42:16] <Boomtime> @cagmz: why do you need the max/min in a _single_ query? what's the difference? - you could probably use aggregation though, not sure if it's possible to do in a single query without some forward knowledge
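(As Boomtime suggests, aggregation can do this; a hedged sketch for the mongo shell, assuming the `cities` collection and `pop` field from the question and a running mongod:)

```javascript
// One $group over the whole collection (_id: null) yields a single
// document carrying both extremes of the "pop" field.
db.cities.aggregate([
  { $group: { _id: null, minPop: { $min: "$pop" }, maxPop: { $max: "$pop" } } }
])
// returns one doc of the shape { "_id" : null, "minPop" : ..., "maxPop" : ... }
```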
[07:37:57] <kurushiyama> mroman: Ok, here is what I would do
[07:38:12] <kurushiyama> mroman: a) Get rid of the HT-Defrag warning
[07:38:33] <kurushiyama> mroman: b) Add swap. Some 2GB at least
[07:38:59] <kurushiyama> mroman: Optionally, give more RAM instead. 4GB + 1GB swap
[07:41:36] <kurushiyama> c) Do _not_ fiddle with memory settings until you are explicitly told to. Overcommitting combined with WT... ...unexpected and unpleasant results.
[07:47:31] <kurushiyama> mroman: Uhm, and we were talking of an allocation of some 27kB while dealing with a request. It may well have been a point in time at which you could not capture the free RAM.
[08:19:22] <quattro_> is there any way to set the initial sync source when adding a replica set member?
[08:19:56] <Boomtime> rs.syncFrom will work if you're fast enough
[08:20:15] <quattro_> Boomtime: any other way? I've tried like a 100 times already :(
[08:20:32] <quattro_> yesterday I succeeded once but now it's starting to seem impossible
[08:21:19] <Boomtime> yeah, it's a stupid race - do it anyway, but if it fails, go to the member it has locked on to, do a currentOp to get the operation the syncing member is doing and kill it - the syncing member will re-acquire using the syncFrom directive it has queued
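(A hedged sketch of that workaround in the mongo shell; hostnames are hypothetical, and it requires a running replica set:)

```javascript
// On the newly added member, queue the preferred source (hypothetical host):
rs.syncFrom("preferred.example.net:27017")

// If it already locked onto the wrong member, connect to THAT member and
// look for the syncing member's oplog tail (a getmore on local.oplog.rs):
db.currentOp(true).inprog.forEach(function (op) {
  if (op.ns === "local.oplog.rs" && op.op === "getmore") {
    printjson(op);          // inspect: op.client shows who is tailing
    // db.killOp(op.opid);  // kill only once you are sure it is the right op
  }
});
```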
[08:25:32] <kurushiyama> Hm, when the member is started and not added to the replSet yet, could one preemptively set "rs.syncFrom"?
[08:28:17] <quattro_> kurushiyama: no, you have to authenticate first
[08:28:38] <quattro_> it's almost impossible, i tried bashing the server with the syncFrom command but no go
[08:28:52] <quattro_> rs.add should really have the option to set initial sync source
[08:30:22] <kurushiyama> quattro_: I agree. I can see the reason why the default behavior is to sync from P, but there should be an option.
[08:31:30] <quattro_> kurushiyama: it's not even syncing from primary it's syncing from a secondary
[10:04:27] <Keksike> I'm getting an error: "Could not authenticate with MongoDB. Either database name or credentials are invalid." when I'm using Mongo 3.2 through Clojure/Monger. The database with that name exists, and a user with that username exists.
[16:00:46] <philipwhiuk> Hi - a colleague of mine tried to download and install MongoDB on CentOS but came up against an error with the GPG key used to sign the RPM. It appears to be signed with the .com domain instead of .org. Does anyone know where I'd file a bug on this?
[16:00:56] <philipwhiuk> (we used --nogpgcheck for the time being)
[16:56:25] <StephenLynx> $and is for the places where the implicit AND isn't enough.
[16:57:53] <StephenLynx> >MongoDB provides an implicit AND operation when specifying a comma separated list of expressions. Using an explicit AND with the $and operator is necessary when the same field or operator has to be specified in multiple expressions.
[16:58:17] <StephenLynx> for example, you wish to define a range for a field, like a date
[16:58:40] <StephenLynx> so you use $and to say the field must be less than X but greater than Y
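(A sketch of both spellings of such a date range, using a hypothetical `events` collection; the explicit $and form becomes mandatory when the same operator would otherwise collide, e.g. two $or clauses:)

```javascript
// Range on one field, explicit $and:
db.events.find({ $and: [
  { date: { $gt: ISODate("2016-01-01") } },
  { date: { $lt: ISODate("2016-04-01") } }
] })

// Same range, implicit AND (works here because $gt and $lt differ):
db.events.find({ date: { $gt: ISODate("2016-01-01"), $lt: ISODate("2016-04-01") } })

// $and is unavoidable when the same operator repeats, e.g. two $or lists:
db.events.find({ $and: [
  { $or: [ { kind: "a" }, { kind: "b" } ] },
  { $or: [ { size: 1 }, { size: 2 } ] }
] })
```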
[17:13:30] <kurushiyama> chinychinchin: Maybe you should describe what you want to achieve (from a high level perspective), and we might find a solution to your problem.
[17:13:49] <cagmz> I really like mongodb. my database course just got us into it after working with mysql
[17:14:07] <cagmz> I like how the data is easily accessible without joining n tables
[17:14:09] <kurushiyama> cagmz: expand on university.mongodb.com
[17:14:15] <chinychinchin> kurushiyama: we have three sharded clusters that i want to migrate to the one new nonsharded replica set in AWS -
[17:15:21] <chinychinchin> the existing setup uses 2 nodes + arbiter in each cluster - there are 3 clusters - mongodb 2.6 is used
[17:15:42] <StephenLynx> yeah, mongo really takes a lot of overhead out of the way
[17:15:48] <kurushiyama> chinychinchin: The new replset is big enough in terms of disk space, RAM and (most important to triplecheck) IO?
[17:15:50] <StephenLynx> "give me data, ill store it and you can read it"
[17:15:57] <chinychinchin> i want to migrate painlessly to mongodb 3.2 - i have 3 highly specced aws servers ready to go
[17:17:20] <chinychinchin> i will be using wiredtiger - the application is read heavy about 65-35
[17:17:53] <kurushiyama> chinychinchin: Well, there was a reason why you sharded. The procedure is straightforward: Add the new replset to the sharded cluster, then remove each of the old shards one by one. When all but the new shards are removed, stop the whole thing, ditch the config servers and connect to your shiny new replset.
[17:18:43] <cagmz> for upsert ( https://docs.mongodb.org/manual/tutorial/modify-documents/#specify-upsert-true-for-the-update-specific-fields-operation ) ... if user wants to modify doc and update fields, but no doc is found, is a new doc created with only those new fields?
[17:21:02] <StephenLynx> I can only wonder how messed up it was
[17:21:13] <StephenLynx> "hey, we have this cluster running some mission critical stuff"
[17:21:19] <StephenLynx> "give it to the new guy to migrate it"
[17:21:32] <cagmz> for doc upsert ( https://docs.mongodb.org/manual/tutorial/modify-documents/#specify-upsert-true-for-the-update-replacement-operation ) , what happens if a matching doc is found? are all the fields replaced by the update?
[17:21:50] <cheeser> they are mutated by the update, yes
[17:21:50] <chinychinchin> StephenLynx: very messed up - it's running on a vmware platform with no memory to spare for the dataset
[17:22:03] <StephenLynx> if you use $setOnInsert, no
[17:22:20] <kurushiyama> chinychinchin: Let me put it that way: I am a certified MongoDB DBA, and I would double or triple check _everything_ up and down before doing that stunt. PLUS: It is very, very hard to get dimensions right, especially for a beginner.
[17:22:36] <StephenLynx> oh yeah, that isn't using either.
[17:22:41] <StephenLynx> it will just set the whole document as that.
[17:22:42] <cagmz> what happens to fields that are already in the doc?
[17:23:45] <chinychinchin> kurushiyama: thanks for the warning - i appreciate honesty
[17:24:17] <kurushiyama> chinychinchin: Well, it would not help anybody much if you screw this up.
[17:25:04] <chinychinchin> kurushiyama: my butt will be grass if i do, too
[17:25:42] <chinychinchin> ill do the research - i have automated the setup of mongodb so i can play around in a nonprod environment
[17:25:51] <kurushiyama> chinychinchin: Ok, please get the terminology right, asap.
[17:25:51] <chinychinchin> kurushiyama: thanks im out
[17:26:01] <kurushiyama> chinychinchin: Wait, I have some tips.
[17:26:52] <kurushiyama> chinychinchin: a) Use MMS/CloudManager. You need to have hard facts before you should even think about migration. I am referring especially to IO.
[17:28:20] <kurushiyama> b) If you want to do that without downtime, you are going to need _a lot_ of time, since there can only be one chunk migration at any given moment across a cluster.
[17:29:12] <kurushiyama> a sharded cluster, that is. Sharded cluster = config servers + 1 or more mongos (query routers) + 1 or more shards (either standalones or replsets)
[17:31:09] <kurushiyama> c) Depending on how hard the requirement for no downtime is (for example you do not want any UX impact), you should set a balancing window.
[17:31:12] <cagmz_> in this query, is "details" field a sub-document? https://docs.mongodb.org/manual/tutorial/modify-documents/#use-update-operators-to-change-field-values
[17:31:41] <kurushiyama> cagmz_: nope, a field in the top-level doc
[17:32:47] <cagmz_> I have a city collection with individual cities as docs. If I want to update a city with info, but I'm not sure if the doc even exists, would it be prudent to use update() and $set to not only set the new fields needed (if doc exists), but also set defaults (if the doc isn't found, with upsert)? like this: https://gist.github.com/cagmz/09f7868e4d343f948ccd9505c8a3c449
[17:33:42] <kurushiyama> cagmz_: This structure seems familiar... somehow...
[17:37:46] <cagmz_> My collection has more than 1 document for a city, so I would like to update all documents with the same visit date. This query was intended for different cities, each with different default values and visit dates
[17:40:57] <kurushiyama> cagmz_: So you have sort of a point-in-time structure?
[17:41:28] <cagmz_> I'm not sure what that means but I made a comment with results: https://gist.github.com/cagmz/09f7868e4d343f948ccd9505c8a3c449
[17:44:39] <cagmz_> what is this field? loc: [ -118.986648, 34.22179 ] . is it an array?
[17:45:06] <kurushiyama> I just saw it. The thing is that depending on your data model, the query may either be correct or not. As StephenLynx said: you have to make a distinction between $set and $setOnInsert. With the way you have it now, if there was a document matched with a different _id (which may well be, since you have multiple docs / city), MongoDB would cause an error, since _id s are immutable.
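(A hedged sketch of that safer pattern, with hypothetical fields: insert-only defaults go under $setOnInsert, and _id is never $set, since changing a matched document's immutable _id would make the update fail:)

```javascript
// multi: true updates every existing doc for the city; on an upsert insert,
// the $setOnInsert defaults are applied as well. Never put _id in $set.
db.cities.update(
  { city: "VENTURA" },
  { $set: { lastVisit: ISODate("2016-04-14") },
    $setOnInsert: { state: "CA", pop: 0 } },
  { upsert: true, multi: true }
)
```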
[17:45:25] <kurushiyama> cagmz_: May be a lat/lng notation for geospatial data.
[17:46:12] <kurushiyama> cagmz_: But yes, in general, it is an array.
[17:46:44] <cagmz_> do I set it the same way, or does it need quotes? eg loc: [ -118.986648, 34.22179] loc: "[ -118.986648, 34.22179]"
[17:59:34] <kurushiyama> cagmz_: so {foo:"a"} and {"foo":"a"} are equivalent, while {"foo.bar":"a"} and {foo.bar:"a"} is not, with the latter causing an error.
[17:59:54] <kurushiyama> cagmz_: I still do not get why you have multiple docs for the same city.
[18:01:53] <cagmz_> in regards to the contents of values of fields, not field names
[18:02:35] <kurushiyama> cagmz_: field specifiers need to be quoted, in case they contain a ".", which has a special meaning
[18:04:00] <kurushiyama> cagmz_: other than that, your statement would be more precise as "For values, only strings must be quoted", which still is a bit of a simplification, but sufficient.
[18:04:29] <cagmz_> ok, since mongodb actually uses data types (like integer) for comparisons?
[18:05:41] <kurushiyama> cagmz_: It can. But it is not only about comparisons. Within aggregations, you might want to do calculations.
[18:06:36] <kurushiyama> cagmz_: The basic reason most likely is that it takes less space to save 1234567890 as an integer than "1234567890" as a string.
[18:13:23] <cagmz_> is it convention to write queries all on one line vs adding whitespace?
[18:14:02] <kurushiyama> cagmz_: What I do is to make it _readable_, when I have to
[18:36:08] <shlant> hi all. Is there a best practice for where to create users? I was creating them in the db that they have permissions for, but then I have to specify authSource every time. I was thinking it makes sense to just create all users in the admin DB for easier administration. If i'm not needing the associated users to be in my db dumps, would having them all in admin be considered bad practice?
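(A hedged sketch of the all-users-in-admin approach shlant describes; user name, password, and db name are hypothetical, and it needs a running mongod with auth:)

```javascript
// Create the user in admin but grant roles scoped to the target db,
// so admin is the single authentication database for every user:
use admin
db.createUser({
  user: "appUser",
  pwd: "secret",
  roles: [ { role: "readWrite", db: "myAppDb" } ]
})
// Clients then always authenticate against admin, e.g.:
// mongodb://appUser:secret@host:27017/myAppDb?authSource=admin
```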
[18:49:55] <StephenLynx> you can embed it manually, but mongo won't handle that relation.
[19:00:54] <cagmz_> how do I reference the other document though? It's in another collection. do I need to specify the fields that I want, or can I embed all? I'm trying to do something like this: https://bpaste.net/show/cb78a49a76ad
[19:04:48] <StephenLynx> you fetch the document and embed on another document manually.
[19:05:23] <StephenLynx> you can use a dbref, but these are just syntactic sugar and will be solved at the driver.
[19:06:56] <cagmz_> what do you mean by manually? I'm having trouble finding any examples
[19:12:11] <cagmz_> what if i want to embed a reference only? if my _id for a state is "IOWA", can I just insert state: "IOWA" into a city?
[19:12:57] <cagmz_> coming from mysql, foreign keys are checked for validity, and im wondering if mongodb does the same thing (checks to see if IOWA is in the state collection?)
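(A hedged sketch with hypothetical `cities`/`states` collections, needing a running mongod: MongoDB performs no referential check, and "manual" embedding just means fetching the referenced doc and copying it in yourself.)

```javascript
// Succeeds whether or not "IOWA" exists in states - no FK validation:
db.cities.insert({ _id: "DES MOINES", state: "IOWA" })

// "Manual" embedding: fetch the referenced doc, then copy it in:
var state = db.states.findOne({ _id: "IOWA" });
db.cities.update({ _id: "DES MOINES" }, { $set: { stateDoc: state } });
```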
[19:20:35] <cagmz_> ah, i'll have to use dbref. i have no application
[19:23:46] <bhang> is it safe to add an index on system.indexes, given that it's a system collection ? MongoDB 2.6.11
[19:24:30] <bhang> I'm seeing a lot of slow-ish collection scans against this collection while running indexing operations
[19:25:40] <StephenLynx> mongo does not implement relations at all.
[19:37:53] <kurushiyama> StephenLynx: Well, we could argue that $lookup does implement sort of relations. bhang: DON'T. Simply DON'T.
[19:38:59] <bhang> kurushiyama hence the question :-)
[19:39:46] <StephenLynx> I still need to catch up to $lookup though
[19:41:00] <kurushiyama> bhang: either do index creations in foreground (which tends to make them faster) or put them into background, which takes longer, but does not lock. fiddling with anything which is close to the system is a good way to get you into trouble.
[19:42:21] <bhang> understood. these ops were in the background but something seemed to be causing a lot of >100 ms collscans on the system.indexes collection to show up in the log
[19:44:05] <kurushiyama> StephenLynx: Well, it is basically just a convenience thingy, if you have narrowed down an aggregation to just a few docs and an according $in lookup would be more expensive or inconvenient. You can expect horrible results with an early $lookup.
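(A hedged sketch of a late $lookup, with the hypothetical `cities`/`states` collections from earlier; $lookup is new in 3.2 and the `from` collection must be unsharded:)

```javascript
// Left outer join from cities to states. $match first so that $lookup
// only runs over a few narrowed-down documents, per the advice above.
db.cities.aggregate([
  { $match: { state: "IOWA" } },   // narrow first
  { $lookup: {
      from: "states",              // collection to join
      localField: "state",         // field in cities
      foreignField: "_id",         // field in states
      as: "stateDocs"              // output array field
  } }
])
```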
[19:44:53] <kurushiyama> bhang: Maybe you already _are_ in trouble?
[19:46:08] <bhang> no, it's cool. It was definitely related to my index creation but that's done now so all good.
[19:46:43] <bhang> I'll see if I can track down why those queries were slowing down so much
[19:47:33] <AAA_awright> I'm running into an issue where mongod is constantly coming up 120-150% CPU usage on my server even though there's no active connections or open tasks
[19:47:49] <AAA_awright> I don't know what to look at next to figure out how to fix this
[20:10:34] <kurushiyama> AAA_awright: Well, for starters, how do you determine said utilization?
[20:11:26] <AAA_awright> kurushiyama: Very high system load on a system that shouldn't be doing anything, CPU usage numbers come from `top`
[20:11:54] <kurushiyama> AAA_awright: And then, the rough dimensions would surely help. Even a seemingly idle MongoDB can use quite some CPU cycles.
[20:12:28] <AAA_awright> An entire core constantly for a week, at least?
[20:13:52] <kurushiyama> AAA_awright: Well, you sure that it is only MongoDB? Let's start with http://unix.stackexchange.com/a/34436 , just to make sure.
[20:15:13] <kurushiyama> And then, I need some dimensions. That'd be clock speed, physical RAM and swap size, data size, MongoDB version, storage engine and last but not least OS and version.
[20:16:14] <kurushiyama> And while we are at it: What is the IOWait percentage?
[20:19:45] <kurushiyama> Oh, and filesystem type would not hurt, either. And whether MongoDB has its own partition or not.
[20:21:53] <AAA_awright> Gimme a minute, trying to pull out my collectd metrics first
[20:46:28] <AAA_awright> It's doing barely any I/O operations
[20:50:34] <kurushiyama> Hm. I am not too sure here. In general, I can not confirm said behavior. My assumption is that during high load times, there is high IOwait. Hmm. Does /var/lib/mongodb have its own partition?
[20:51:00] <AAA_awright> collectd reports this started 6 days ago, which wasn't around an upgrade
[20:51:46] <AAA_awright> kurushiyama: No, running on / mount, ext4 with software raid
[20:53:05] <AAA_awright> Two, I believe, it's been a while since I touched them
[20:53:18] <AAA_awright> I don't have any reason to believe it's I/O related, it does actual insert and queries just fine
[20:53:31] <AAA_awright> It's just on top of that all, it's also eating up 120% CPU
[20:53:58] <kurushiyama> Well, in general mirroring has the problem of seek times, worsened by RAID1. Actually, queries most likely are served from RAM, given the data size.
[20:54:20] <kurushiyama> Which is what makes me wonder.
[20:54:56] <AAA_awright> This gets almost no activity, it's pretty much insert-only unless I'm testing the program I'm working on (which works fine)
[20:54:58] <kurushiyama> The problems are just for writing, ofc
[20:56:23] <kurushiyama> Dang. I just ask because it is getting late here (+1) and wanted to ask if we can continue tomorrow.
[20:56:44] <AAA_awright> Ah, yeah sorry I wasn't quicker
[20:56:51] <AAA_awright> Feel free to ping me, I'll be here
[20:57:10] <AAA_awright> I'll also have an upgrade to 3.2.4 ready, maybe that fixes it
[20:57:33] <AAA_awright> Much appreciated kurushiyama
[20:57:48] <kurushiyama> AAA_awright: You are welcome. tty tomorrow!
[21:17:54] <AAA_awright> Running in the foreground -vvvvv, mongod is outputting this nonstop: 2016-04-14T17:08:39.084-0400 D JOURNAL [durability] Processing commit number 93761 / 2016-04-14T17:08:39.084-0400 D JOURNAL [durability] groupCommit end / 2016-04-14T17:08:39.084-0400 D JOURNAL [durability] groupCommit begin
[23:20:38] <AlexZan> If I wanted to develop a MEAN stack app, would it be best to start with the mean.io package, some other package, or to put it together manually? Just looking at reaching a consensus :)