[03:10:48] <edrocks> if you have a compound index on two fields but you will be querying by both fields or either one, will the single field queries use that index? or do you need separate indices?
[03:11:53] <cheeser> it'll use the index if the field you're querying against comes first in the index.
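A minimal sketch of that prefix rule, assuming pymongo and a hypothetical `events` collection with hypothetical `userId`/`createdAt` fields: one compound index on (userId, createdAt) serves queries on userId alone or on both fields, but not on createdAt alone.

```python
from datetime import datetime
from pymongo import MongoClient, ASCENDING

# Hypothetical collection, used only to illustrate the prefix rule.
client = MongoClient("mongodb://localhost:27017")
events = client.test.events

# One compound index on (userId, createdAt).
events.create_index([("userId", ASCENDING), ("createdAt", ASCENDING)])

cutoff = datetime(2016, 7, 1)

# These use the compound index: the leading field is present.
events.find({"userId": 42})
events.find({"userId": 42, "createdAt": {"$gte": cutoff}})

# This will not: createdAt alone is not a prefix of the index,
# so a separate {createdAt: 1} index would be needed for it.
events.find({"createdAt": {"$gte": cutoff}})
```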
[06:23:07] <terlar> I have created an admin user on the PRIMARY mongo database, and it is synchronized to the SECONDARY but not the ARBITER. What can I do in this case to make sure it gets synchronized to the arbiter also? What could possibly have gone wrong?
[06:41:56] <Boomtime> @terlar: arbiters do not store any user facing information, including credentials, nothing has gone wrong
[06:43:06] <terlar> But how come I cannot use that information to connect to the database on that machine with that user? While it works for the others
[06:45:40] <terlar> I get auth failed on the ARBITER using the information for that user
[06:53:51] <Boomtime> @terlar: credentials are replicated the same as any other user data - arbiters store no data, they store no credentials, so there are no credentials to authenticate you with - why are you logging in to an arbiter anyway?
[06:54:58] <terlar> it's for monitoring purposes, but I guess it should be enough to have it on the primary
[06:55:49] <Boomtime> monitoring an arbiter can just use the isMaster command, no auth required
[06:56:16] <Boomtime> try it; connect to an arbiter in the shell with no auth, run db.isMaster()
[06:57:04] <Boomtime> that command is how drivers discover the replica-set, using that meagre information you can generally get to the right place - and there is little other information that an arbiter actually has
[07:01:42] <terlar> thanks, that explains everything, I guess I will just stick to regular memory and CPU monitoring on those machines
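A sketch of that unauthenticated monitoring check in pymongo (host name is a placeholder): connect straight to the arbiter and run isMaster, which reports the member's role and what it knows about the replica set without any credentials.

```python
from pymongo import MongoClient

# Direct connection to the arbiter; no credentials needed for isMaster.
arbiter = MongoClient("arbiter.example.com", 27017)

info = arbiter.admin.command("isMaster")
print(info.get("arbiterOnly"))  # True when talking to an arbiter
print(info.get("setName"))      # replica set name
print(info.get("hosts"))        # data-bearing members the arbiter knows about
```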
[10:15:30] <renaissancedev> I'm running into an issue where if I authenticate using the mongo shell then I can execute a command against the target database. However, if I use pymongo and authenticate I'm getting an error with the same command.
[10:15:38] <renaissancedev> Any ideas of what the issue could be?
[10:16:27] <renaissancedev> For reference, I'm using this approach to authenticate in pymongo: https://github.com/mitodl/salt/blob/06f249901a2e2f1ed310d58ea3921a129f214358/salt/modules/mongodb.py#L63
[10:16:54] <renaissancedev> In mongo shell I'm using the command `mongo --username <name> --password <password> --authenticationDatabase admin`
[10:17:16] <renaissancedev> The test command that I'm executing is `db.version()`
[10:17:32] <renaissancedev> When I use pymongo I get the error: OperationFailure: not authorized on admin to execute command
[10:22:24] <kurushiyama> renaissancedev Well, it looks like <name> lacks sufficient privileges ;)
[10:23:31] <renaissancedev> That's the issue, I'm authenticating as my admin user which has the 'root' role
[10:23:46] <renaissancedev> And I'm even testing this in the admin database which is where that user was created.
[10:24:22] <renaissancedev> And when I run the same command in the mongo shell it works without any problem. It's only in pymongo that it throws an error.
[10:29:52] <kurushiyama> renaissancedev otoh, I'd check the auth mechanism and driver compat.
[10:34:48] <renaissancedev> I'm using Mongo v2.6.12 and pymongo version 3.3.0
[10:35:44] <kurushiyama> renaissancedev Side note: Update! ;)
[10:36:08] <renaissancedev> Yeah, I'm stuck with that version of Mongo because of compatibility issues with the application that's using it
[10:36:21] <renaissancedev> I would happily use a newer version given the opportunity
[10:37:02] <renaissancedev> The auth mechanism should be implicitly using MONGODB-CR, though I also tried setting that explicitly without any change in outcome.
[10:37:32] <renaissancedev> If the answer is just 'use an older pymongo' that's fine, I can do that more easily than upgrading Mongo itself.
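For reference, a hedged sketch of the pymongo call that matches that shell invocation, assuming pymongo 3.x against a 2.6 server; credentials and host are placeholders. The usual culprit is that the authentication source is not pointed at admin the way --authenticationDatabase does it in the shell.

```python
from pymongo import MongoClient

# Equivalent of:
#   mongo --username <name> --password <password> --authenticationDatabase admin
client = MongoClient(
    "mongodb://admin_user:secret@localhost:27017/"
    "?authSource=admin&authMechanism=MONGODB-CR"
)

# Same smoke test as db.version() in the shell: it reads the buildInfo command.
print(client.admin.command("buildInfo")["version"])
```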
[11:19:40] <magicboiz> Hi, I have some aggregation queries with very big numYields. Any help on how to debug this? :)
[11:26:33] <kurushiyama> magicboiz Uhm, what would be the bug?
[11:28:12] <kurushiyama> magicboiz We may be able to optimize your aggregation, however. Please pastebin.
[11:55:06] <Keksike> does having expire/TTL in an index add overhead/heaviness for the db? if I, let's say, have a million docs which all have a TTL
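For reference, a TTL index is just an ordinary single-field index with expireAfterSeconds set; a background task removes expired documents roughly once a minute, so the overhead is that periodic delete load plus the index itself. A minimal sketch, assuming pymongo and a hypothetical sessions collection:

```python
from datetime import datetime
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")
sessions = client.test.sessions  # hypothetical collection

# Documents are deleted by the TTL monitor once createdAt
# is more than an hour in the past.
sessions.create_index([("createdAt", ASCENDING)], expireAfterSeconds=3600)

sessions.insert_one({"user": "keksike", "createdAt": datetime.utcnow()})
```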
[12:07:43] <magicboiz> kurushiyama: I'm running openstack and ceilometer service which stores metering data into mongodb. I get queries like this: http://pastebin.com/jrkWJwyW
[12:09:18] <magicboiz> kurushiyama: I have queries with numYields:42830 for example, but my server is running with 24GB RAM and SSD disks....
[12:09:58] <kurushiyama> magicboiz yields have little to nothing to do with your hardware.
[12:10:44] <kurushiyama> magicboiz The ops are of little interest here – we need to see the actual queries.
[12:12:41] <magicboiz> kurushiyama: I thought there was some relationship between high numYields and slowness... ok. In the pastebin I put above, there are some queries logged by mongodb (which is running with profile - log slow queries)
[12:14:40] <kurushiyama> magicboiz The thing is that yields are necessary in case the document which is supposed to be processed next (or the collection, depending on your storage engine) might be locked.
[12:17:28] <kurushiyama> magicboiz Slow aggregations usually come through collscans – that might be the case because either the whole collection needs to be processed or the aggregation query is lacking what is called an "early match" (a match on an indexed field in the first stage of the aggregation _after_ it was processed by the query optimizer)
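A sketch of how one might check for that in pymongo, assuming a 2.6/3.0 server where the aggregate command takes an explain flag directly; the pipeline, field values, and date range are placeholders, with the field names taken from the discussion below. A COLLSCAN in the first stage's plan means the $match has no usable index.

```python
from datetime import datetime
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client.ceilometer

# Placeholder pipeline with an "early match" on indexed fields in the first stage.
pipeline = [
    {"$match": {"counter_name": "cpu_util",
                "timestamp": {"$gte": datetime(2016, 7, 1),
                              "$lt": datetime(2016, 7, 2)}}},
    {"$group": {"_id": "$resource_id", "samples": {"$sum": 1}}},
]

# Ask the server how it would execute the aggregation.
plan = db.command("aggregate", "meter", pipeline=pipeline, explain=True)
print(plan)
```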
[12:19:18] <magicboiz> kurushiyama: with my current profile level, all I've got is a log like "....keyUpdates:0 numYields:11664 locks(micros) r:245425709 reslen:284 140423ms"
[12:19:59] <kurushiyama> magicboiz Actually, since you are using a third party software, you should ask them, imho.
[12:20:30] <kurushiyama> magicboiz As far as I can see, it is the aggregation causing the problem, not MongoDB itself.
[12:21:40] <magicboiz> kurushiyama: I did, but they don't seem very comfortable debugging mongodb, which is why I decided to investigate myself :)
[12:23:27] <kurushiyama> magicboiz Well, you need to dig into the sources then to find the aggregation. Identifying the root cause of an aggregation being slow from the metadata only is like trying to find out how old the driver is by analyzing the debris after a car crash.
[12:36:37] <kurushiyama> magicboiz Please pastebin the output of http://pastebin.com/8gvb65zx put into the mongo shell (you should be able to c&p it). As far as I can see, the indices are f...ed up. Oh, and next time: "use ceilometer; db.meter.getIndices()" is a bit more readable ;)
[12:38:17] <magicboiz> kurushiyama: ok, thx. Let me connect to the server....
[12:53:42] <magicboiz> kurushiyama: what indices should I create (at least to test it)?
[12:53:42] <kurushiyama> magicboiz depends on your number of docs, the data model and whatnot. It is not easy to predict, if at all.
[12:54:39] <kurushiyama> magicboiz As a rule of thumb, only one index is used per query. So we need a compound index incorporating _all_ fields involved in the $match stage of the aggregation.
[12:55:40] <kurushiyama> magicboiz Another rule of thumb: order matters within compound indices. So db.foo.createIndex({foo:1,bar:1}) does not produce the same result as db.foo.createIndex({bar:1,foo:1})
[12:57:45] <kurushiyama> magicboiz Interestingly enough, the query has a reverse order on the timestamp range
[12:58:16] <magicboiz> kurushiyama: thx a lot for your help :)
[12:59:12] <magicboiz> kurushiyama: so I would need to create a new index like db.ceilometer.meter.createIndex({timestamp:1, counter_name:1})??
[12:59:37] <kurushiyama> magicboiz So I would create an index with db.meter.createIndex({timestamp:-1, counter_name:1,project_id:1, resource_id:1},{background:true})
[13:00:10] <kurushiyama> magicboiz The search order on the timestamp seems to be reversed, hence the -1 here. And the background:true is important. Otherwise, your collection will lock!
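For reference, the pymongo equivalent of that shell command (field names as given in the discussion; host is a placeholder). background=True builds the index without locking the collection, which matters on 2.6/MMAPv1.

```python
from pymongo import MongoClient, ASCENDING, DESCENDING

client = MongoClient("mongodb://localhost:27017")
meter = client.ceilometer.meter

# Same index as the shell command above, built in the background.
meter.create_index(
    [("timestamp", DESCENDING),
     ("counter_name", ASCENDING),
     ("project_id", ASCENDING),
     ("resource_id", ASCENDING)],
    background=True,
)
```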
[13:01:59] <magicboiz> kurushiyama: ummm ok. If I create this index, how could I monitor the RAM usage for this index?
[13:02:37] <kurushiyama> magicboiz This should cause the $match stage to be lightning fast. Depending on the traffic on your db, there still might be a lot of yields, but at least we eliminated the scanning of all docs with the given resource id.
[13:02:38] <kurushiyama> magicboiz Are you serious?
[13:08:07] <kurushiyama> magicboiz You might want to update to 3.0.x at least, and wiredTiger as a storage engine.
[13:08:24] <kurushiyama> magicboiz As you have a replica set, that should be easy enough.
[13:09:25] <magicboiz> kurushiyama: ok, we'll study that option too. is it easy to upgrade a small cluster?
[14:04:43] <jayjo> I made some serious progress last night - pretty cool... Here is how I have my aggregation pipeline currently, and this gives me a count for each day. But I want to further specify and get a unique count of a different field for each of these days. Is this an easy follow up step? https://bpaste.net/show/9e977b60e889
[14:20:51] <jayjo> I've seen some different approaches - is using two $groups the most efficient? The database is large and will grow exponentially in the future
[14:27:26] <jayjo> I don't think I want to use $unwind.
[14:37:49] <kurushiyama> jayjo Well, an example doc would be helpful. And the expected output.
[14:41:53] <jayjo> kurushiyama: here is my current attempt and some sample docs: https://bpaste.net/show/0c164cf891d0
[14:44:07] <jayjo> I want an example to be the unique "_p"'s for a particular level of data aggregation. So for this instance it is date (dayOfMonth, month, year), and the count of distinct "_p". Eg: {"date": ISODate("2016-07-12T00:00:00Z"), unique_visits: 1234}
[14:57:15] <jayjo> I feel like I'm on the right track: https://bpaste.net/show/aec0499f6632
[15:00:08] <jayjo> I don't know if the two counts are an issue, or if I have to access the first groups ids the way I did in the second group
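For the record, the usual two-$group shape for a distinct count per day, sketched in pymongo; the collection name and the "timestamp" date field are stand-ins for whatever the documents actually carry, while "_p" is the field from jayjo's paste. The first $group leaves one document per (day, distinct _p) pair; the second collapses the _p dimension and counts.

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
coll = client.test.visits  # hypothetical collection

pipeline = [
    # 1st group: one document per (day, distinct _p) pair.
    {"$group": {
        "_id": {
            "y": {"$year": "$timestamp"},        # "timestamp" is a stand-in
            "m": {"$month": "$timestamp"},       # for the real date field
            "d": {"$dayOfMonth": "$timestamp"},
            "p": "$_p",
        },
    }},
    # 2nd group: count how many distinct _p values each day had.
    {"$group": {
        "_id": {"y": "$_id.y", "m": "$_id.m", "d": "$_id.d"},
        "unique_visits": {"$sum": 1},
    }},
    {"$sort": {"_id.y": 1, "_id.m": 1, "_id.d": 1}},
]

for day in coll.aggregate(pipeline):
    print(day)
```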
[15:11:28] <tantamount> Hey Mongos, my technical team lead is insisting that we store dates as an integer in the form 20160713 for today's date, for example. This is because he believes Mongo's date type is error prone since it also has a time component which we do not need
[15:11:40] <tantamount> Is there any merit to this approach or is there something compelling I could present that might talk him out of it?
[15:12:21] <Derick> tantamount: date math becomes very painful that way
[15:12:38] <Derick> how many days in between 20160713 and 20160802 ?
[15:13:21] <Derick> it would be much better to store the date in ISODate making sure it's 00:00 *or* use the Unix epoch and make sure you round (floor) to 86400 seconds
[15:13:33] <Derick> you won't be storing things like DST changeovers correctly then
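To make that concrete, a small sketch: with real dates pinned to 00:00 the day difference is one subtraction, while with the integer encoding the raw arithmetic is meaningless and the calendar math has to be hand-rolled first.

```python
from datetime import datetime

# Stored as proper dates at 00:00: trivial.
a = datetime(2016, 7, 13)
b = datetime(2016, 8, 2)
print((b - a).days)  # 20

# Stored as integers like 20160713: 20160802 - 20160713 = 89, which means
# nothing; you must decode year/month/day yourself before doing any math.
def from_int(n):
    return datetime(n // 10000, (n // 100) % 100, n % 100)

print((from_int(20160802) - from_int(20160713)).days)  # 20 again, but only
                                                        # after hand-rolled parsing
```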
[15:13:54] <tantamount> That's an argument we already had but he seemed convinced it was trivial math to break it down into component parts or that we would only do such logic in the application and not in the DB
[15:14:31] <tantamount> Or that we would never do such logic at all with dates because you never want to know how many days are between two birthdays, apparently
[15:15:01] <tantamount> It seems you are in agreement with my thoughts, Derick, and I appreciate that coming from an authority such as yourself :)
[15:16:35] <cheeser> his arguments are specious at best
[15:17:04] <Derick> tantamount: doing your own date math is *not* trivial.
[15:17:12] <cheeser> there are builtin tools for comparing dates. he's asking you to engineer a whole bevy of new tools to circumvent issues that aren't real.
[15:18:00] <Derick> cheeser: oh, you didn't mean *my* arguments with that :)
[15:22:52] <tantamount> The thing you have to realise is that none of that matters in a corporate environment
[15:23:23] <tantamount> The problem with working for a company is it's a small, enclosed environment, and only the words of those in charge actually carry any weight
[15:23:46] <tantamount> There is no such thing as "authority" if it exists externally to the company
[15:34:07] <tantamount> Actual quote from my team lead, "So the same developer who fucked up PHP's DateTime is handing out advices on the same topic on mongo. Thanks, but no thanks."
[15:34:14] <tantamount> (after I mentioned you wrote the book)
[15:36:11] <cheeser> if all you want is range queries, then certainly his format would work. but it's of limited utility beyond that.
[15:36:57] <tantamount> "I have bear grudge for Derick's mistakes, just opt to go with a solution successfully used by SQL engine developers for decades by now, instead of listening to the half-finished implementation of mongo which strikes us as a single big bad decision so far."
[15:37:11] <tantamount> Since I introduced you as an author of MongoDB it seems he now thinks you're the sole author
[15:39:01] <tantamount> I tried to argue that DateTime is a superset of Date and therefore we can store Dates in a DateTime but he just told me I'm stupid lul
[15:39:02] <adrian_lc> hi, update({'list.sublist.attr': 'None'}, {$pull: {'list.$.sublist.': {attr: 'None'}}}) -> what would this remove, the obj in the 3rd level? or the 2nd
[15:39:05] <jayjo> sorry - lost connection, but still struggling with this double GROUP BY to get a distinct count
[15:39:15] <tantamount> More specifically, "please don't always look from a narrow-minded engineer perspective of implementation. A DateTime is not a superset of Date. Date is the name of a day shifting with timezones. It does not have an inherent mapping to and DateTime value aka a point in time. It can only be mapped via a fragile agreement (e.g. UTC midnight) which we have already seen as not to fit our purpose."
[15:39:25] <Derick> cheeser: that's just fractions of a unix timestamp
[17:12:04] <jayjo> wow can't believe how much I'm struggling with a second aggregation in the pipeline. Is there a cookbook with recipes for stuff like this?
[17:12:27] <jayjo> Trying to group a second time to get a unique count of a different field after grouping initially
[17:28:25] <kurushiyama> @cheeser Any updates on HPD, actually?
[17:29:46] <cheeser> kurushiyama: last i heard, "not in 3.4"
[17:30:52] <kurushiyama> @cheeser Thank you, albeit the statement is not making me too happy. Thought it would be in 3.4 (well, most of us did, I guess).
[17:32:38] <cheeser> demand was low i believe relative to other features.
[17:33:06] <cheeser> it would've really helped some of the driver/morphia work i've been doing, too.
[18:05:38] <kurushiyama> @cheeser Oh, you do morphia? My sincere thanks!
[18:05:46] <kurushiyama> Oh, and btw, some gem I found: https://youtu.be/E5KDTw_OQ1Y
[20:47:28] <atbe> in the ServerAddress constructor
[20:48:06] <atbe> Any help or recommendations would be appreciated. Anything at all that would allow me to create a mongo client using an ipv6 address.
[20:48:25] <syrius> what version of the mongo java driver are you using?
[21:07:15] <atbe> at com.mongodb.ServerAddress.<init>(ServerAddress.java:122)
    at com.mongodb.ServerAddress.<init>(ServerAddress.java:49)
    at com.bluemedora.mongo.mongodb.definitions.scratch.getMetric(scratch.kt:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at
[21:15:02] <atbe> val serverAddress = ServerAddress("[$hostAddress]", 27017)
[21:15:08] <cheeser> ok. backing up. looks like those [] *are* required (at least according to the comments in that code but none of the examples i've seen)
[21:15:16] <atbe> to initialize the 2 arg constructor explicitly
[21:15:47] <atbe> right, so here is the serverAddress expression
[21:15:48] <atbe> val serverAddress = ServerAddress("[$hostAddress]", 27017)
[21:19:11] <atbe> String host = "[2604:fe00:3:1103:250:56ff:fe98:c831]";
    ServerAddress address = new ServerAddress(host, 27017);
    MongoClient client = new MongoClient(address);
[21:19:42] <atbe> Constructing the ServerAddress is fine, but constructing the MongoClient is the issue
[21:20:02] <cheeser> i have to run but i'll be back online in a few hours. if you haven't figured this out by then i'll be back to my kotlin env and can walk through things from there as well.
[21:20:15] <cheeser> oh, i see. i'll try to take a look later...
[22:25:26] <bodom> Hi there! I am new to mongodb. I am trying to find all records with a date prior to a given one. I am storing dates like "date" : { "y" : 2010, "m" : 10, "d" : 31 } because I've been reading that native dates must include time and timezone. Is that doable?
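For what it's worth, a sketch of both options in pymongo (collection name and values are placeholders): with the component encoding, "before a given date" needs a nested $or over the parts, while a native Date pinned to midnight UTC makes it a single $lt that can use an index.

```python
from datetime import datetime
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
coll = client.test.records  # hypothetical collection

# Option 1: keep {"date": {"y": ..., "m": ..., "d": ...}} and spell out "before":
before_2010_10_31 = {"$or": [
    {"date.y": {"$lt": 2010}},
    {"date.y": 2010, "date.m": {"$lt": 10}},
    {"date.y": 2010, "date.m": 10, "date.d": {"$lt": 31}},
]}
coll.find(before_2010_10_31)

# Option 2: store a native Date at 00:00 UTC (the time part stays zero);
# the same question becomes a single range condition.
coll.find({"date": {"$lt": datetime(2010, 10, 31)}})
```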