[07:35:10] <Lope> is there any way to flatten data when I do a query? Like I pull the data out with dot notation like this: 'foo.bar.baz'. can I make baz appear on the root level of the object?
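One way to get that shape (a sketch, assuming the aggregation framework and a hypothetical collection named things) is to promote the nested field with $project:

    db.things.aggregate([
        // copy foo.bar.baz into a top-level field named baz
        { $project: { baz: "$foo.bar.baz" } }
    ])

A plain find() projection with dot notation keeps the nesting, so the reshaping has to happen either in an aggregation $project like this or client-side.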
[09:39:26] <krion> (and it looks like the error is only for a single database)
[09:39:35] <kurushiyama> krion Ah. Uh... Have to think where the balancer errors were recorded there. You are aware that you are doing experimental archeology with 2.6?
[09:51:30] <kurushiyama> krion Here is what I would do: Stop the balancer, do a rolling update to 3.2 and see what happens.
[09:52:05] <kurushiyama> krion ofc, you would have to do 2.6 => 3.0.x => 3.2.x
[09:52:52] <krion> This is not possible, unfortunately. 3.x can't run on squeeze.
[09:53:14] <krion> From what I see in the logs, this concerns only a single chunk.
[09:53:26] <kurushiyama> Zelest A bit too fast to my taste for working and chillout (as a father of one 14-month-old, you do little more), but it definitely has some quality.
[09:53:53] <Zelest> kurushiyama, weakling! I was listening to gabber during the nighty nighty of my new born ;)
[09:54:18] <Zelest> but then again, i was younger back then
[09:54:25] <krion> So I guess it's not that critical.
[09:54:26] <kurushiyama> krion Remove one member from the replset, take that member offline, update the OS, update MongoDB, then rejoin the member.
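A rough outline of that per-member cycle (the member address is a placeholder, and the actual OS/package upgrade steps depend on the distribution):

    // on the primary: drop the member from the replica set config
    rs.remove("mongo1.example.net:27017")
    // on that host: stop mongod, upgrade the OS, install the newer MongoDB packages, start mongod
    // back on the primary: re-add the member and let it catch up
    rs.add("mongo1.example.net:27017")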
[13:07:44] <pamp> hey, we had slow performance on our bulk inserts last night; we got a lot of errors like "[QueryFailure flag was Timeout while waiting for migration commit (response was { "$err" : "Timeout while waiting for migration commit", "code" : 24 }).]"
[13:08:29] <hvemder> thanks to @kurushiyama for the help! enormous
[13:08:42] <pamp> what can be the reason for these errors?
[13:14:38] <kurushiyama> pamp Off the top of my head, I'd say your target shard is overloaded.
[13:17:38] <pamp> we had good performance for months, and then this weekend we ran into this problem.
[13:21:57] <jayjo> If I have epoch timestamps as a field in my document, can I use the aggregation framework to get a count for different time periods? Like weekly, daily, monthly counts? A GROUP BY in SQL
[13:24:43] <kurushiyama> pamp That is a bit like saying "My car ran for years! How can it be broken now?" ;)
[13:25:13] <kurushiyama> jayjo There is a SO answer on that by BlakesSeven, iirc.
[13:29:42] <jayjo> kurushiyama: I'm looking around but can't find it - is this something like "epoch group by mongodb" ... any other search term rec?
[13:31:10] <kurushiyama> jayjo Sorry, was by somebody else http://stackoverflow.com/questions/35861602/aggregate-group-by-time-of-intervals-of-15-minutes
[13:40:29] <jayjo> kurushiyama: is it right that these date aggregation tools require an ISODate()?
[13:40:43] <jayjo> Do I convert it in the pipeline? Does this add significant overhead?
[13:41:55] <kurushiyama> jayjo hence my "adjust accordingly". You do a modulo operation. It does not really matter on what kind of number you do it. And I'd first get it to run and then optimize – As Donald Knuth once said "Premature optimization is the root of all evil."
[13:42:32] <cheeser> and perl is the root of all eval
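Adapted to a raw epoch value, the modulo approach mentioned above looks roughly like this (assuming a hypothetical events collection and a ts field holding milliseconds since the epoch; change the divisor for other bucket sizes):

    db.events.aggregate([
        { $group: {
            // subtract ts mod one-day-in-ms to snap each timestamp to the start of its day
            _id: { $subtract: [ "$ts", { $mod: [ "$ts", 1000 * 60 * 60 * 24 ] } ] },
            count: { $sum: 1 }
        } }
    ])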
[13:43:55] <kurushiyama> jayjo I just wonder why you store an epoch distance instead of an ISODate – internally, an ISODate is stored as an epoch distance, anyway.
[13:44:58] <kurushiyama> jayjo So storing an epoch distance has no benefit, but a lot of drawbacks.
[13:45:49] <jayjo> OK - I didn't know that. Right now I upload this data and can make changes accordingly. In the future it sounds like converting to ISODate at upload time would be beneficial
[13:56:33] <jayjo> kurushiyama: I think you're right - being able to use the date aggregators (https://docs.mongodb.com/manual/reference/operator/aggregation-date/) is a huge help... Can I convert the time field to ISODate in place, or do I need to add a field to each document in the mongo shell?
[13:58:28] <kurushiyama> jayjo I'd probably use something like var docs = db.mycoll.find(); var bulk = db.mycoll.initializeUnorderedBulkOp(); docs.forEach(...); bulk.execute()
[14:00:09] <kurushiyama> jayjo Within the forEach, you can take the current value, convert it to an ISODate and then add an according bulk.find({ _id: doc._id }).updateOne()
[14:21:25] <jayjo> kurushiyama: so docs.forEach( bulk.find({ _id: doc._id }).updateOne() ) ? not clear on this second component
[14:22:15] <jayjo> is it not just forEach( function(myDoc) { myDoc["_t"] = Date(myDoc["_t"])})
[14:23:32] <kurushiyama> Gimme a sec, I'll write a bit of (non-checked) code
[14:27:22] <jayjo> I think I'm letting mysql (groan) influence my mongo development in very negative ways
[14:42:55] <kurushiyama> jayjo May well be – albeit there is a datetime type in MySQL as well, iirc. Here is what I hacked together. Please check against samples first!!! http://hastebin.com/izinunazef.coffee
[14:46:01] <kurushiyama> jayjo I just hacked it down from the top of my head – so please review it properly.
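The hastebin paste has since expired; a minimal sketch of that kind of one-off shell script, assuming the epoch values are stored as plain numbers of milliseconds in a field named _t on mycoll (if they are seconds, multiply by 1000 first):

    var bulk = db.mycoll.initializeUnorderedBulkOp();
    var count = 0;
    db.mycoll.find({ _t: { $exists: true } }).forEach(function(doc) {
        // queue an update that replaces the numeric value with a proper ISODate
        bulk.find({ _id: doc._id }).updateOne({ $set: { _t: new Date(doc._t) } });
        count++;
        if (count % 1000 === 0) {
            bulk.execute();                               // flush in batches
            bulk = db.mycoll.initializeUnorderedBulkOp(); // a bulk op cannot be reused after execute()
        }
    });
    if (count % 1000 !== 0) { bulk.execute(); }

As kurushiyama says, run it against a sample collection first before touching real data.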
[14:50:15] <jayjo> kurushiyama: thanks this helps a lot - I just need to get in the habit of using these 1-off scripts to accomplish this stuff
[15:00:00] <chris|> does anyone know where I can get the mongosniff tool? I cannot find any download and it is not part of the distribution
[15:00:33] <cheeser> are you sure? it's there for me.
[15:00:48] <UberDuper> I haven't seen mongosniff in a release since early/mid 2.4.x
[15:01:02] <UberDuper> You're *way* better off using wireshark
[15:01:14] <UberDuper> Their protocol analyzer for mongo is excellent.
[15:02:41] <cheeser> it's included in the homebrew builds at least.
[15:05:40] <chris|> can it read the diaglog file format?
[15:06:52] <UberDuper> I'm not familiar with diaglog.
[15:07:11] <UberDuper> I always just tcpdump -w and open the capture in wireshark
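For reference, a typical capture looks something like the line below (the interface name is an assumption, and 27017 is the default mongod port); the resulting .pcap is then opened in Wireshark, whose dissector understands the MongoDB wire protocol:

    tcpdump -i eth0 -w mongo.pcap port 27017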
[15:08:47] <chris|> well, the diaglog file is already there, I need a tool to read it :)
[15:09:40] <UberDuper> Oh that's a mongo log output?
[15:10:13] <UberDuper> mongosniff and wireshark are for inspection on the wire protocol traffic.
[15:12:06] <chris|> it is sort of a very verbose binary diagnostic log, but I am not sure it's of any use if I cannot read it :)
[16:37:39] <benjwadams> what's the preferred way of storing date/time intervals in mongo?
[16:38:01] <benjwadams> store as float in seconds?
[16:39:23] <benjwadams> something akin to the 'interval' type in postgres
[16:55:46] <jorsto> When using Wiredtiger, for timeseries, if I'm storing all the points in a single document, does it store the data in a hash table for faster updates? or should I still split up the time into smaller sections? (i.e. 0..........4071... updating doc.stats.0 is faster than doc.stats.4071)... is this still relevant?
[17:00:25] <kurushiyama> jorsto If you store every data point in the same document, you will have a rather small database: there is a 16MB size limit.
[17:00:39] <jorsto> i know this, i won't get close to hitting that
[17:00:41] <kurushiyama> benjwadams Simply store isodates.
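In other words, one common pattern is to store the two endpoints as ISODates and derive the duration at read time, e.g. (collection and field names are illustrative):

    db.bookings.insert({ start: ISODate("2016-07-09T08:00:00Z"), end: ISODate("2016-07-09T09:30:00Z") })
    // subtracting two dates yields milliseconds; divide to get seconds
    db.bookings.aggregate([ { $project: { seconds: { $divide: [ { $subtract: [ "$end", "$start" ] }, 1000 ] } } } ])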
[17:01:05] <jorsto> i'm storing 3 days worth of data at 15 minute intervals
[17:01:25] <kurushiyama> jorsto Then, you could store it any way you want, since choosing an effective or ineffective way would hardly make a difference.
[17:02:43] <jorsto> super old post, but it says it's "not as a hash table", but wasn't sure if wiredtiger optimized this.
[17:05:27] <kurushiyama> jorsto Actually, afaik, there is no such thing as a hashtable. And I do not see the point in having one. Let's take MMAPv1 as an example for the moment, to make clear why embedded documents for time series do not make sense atm.
[17:07:43] <kurushiyama> jorsto Simplified, a data file consists of documents, which in turn consist of field descriptions: the type and (if necessary) the length of the field. The default _id index holds the position of an individual document.
[17:08:42] <jorsto> yeah, the last library i made for a previous company stored time series in different collections depending on the step and it worked fine, but this new use-case has more documents where i need values such as "post title" and i would rather not do a join on all of them, so it's kind of a trade-off. i guess i can store it in each single document and assume the latest one is the title, but that's just that much more data :P
[17:09:08] <kurushiyama> jorsto However logically related individual events may be, putting them into a single document holds no benefit.
[17:09:25] <jorsto> just more annoying complexity. :P
[17:10:25] <jorsto> the minute i almost started doing, { 0: { 0...60 }, 1: { 0...60} ... 24 } is what made me cringe. back to 1 doc per event
[17:10:47] <kurushiyama> jorsto If you really want to learn about time series data, InfluxDB, and the way it treats values and tags, really is an eye opener
[17:11:25] <kurushiyama> jorsto Well, basically MongoDB is just a different means of storage
[17:11:36] <kurushiyama> jorsto The same rules and best practices apply.
[17:14:17] <kurushiyama> jorsto Not to mention that storing time series data in subdocuments makes aggregations more complicated – and aside from adding the data, I guess that is your most common use case. So the most common use cases (adding the data and aggregating over it) are both more efficient with one document per event.
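For reference, the one-document-per-event shape being argued for here can be as simple as (field names are illustrative):

    db.stats.insert({
        postId: ObjectId(),                    // reference to the parent post; look up or denormalize the title as needed
        ts: ISODate("2016-07-09T12:15:00Z"),   // when the event happened
        value: 42                              // the measured data point
    })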
[17:26:17] <zylo4747> can i use the latest version of pymongo against a 2.6 replica set?
[17:53:27] <jorsto> is there an easy way to download a dump from mongo atlas and restore it locally for debugging
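The usual route is mongodump against the Atlas cluster followed by a local mongorestore; a sketch with placeholder credentials and host names (assumes a mongodump recent enough to support --uri and SRV connection strings):

    mongodump --uri "mongodb+srv://user:password@cluster0.example.mongodb.net/mydb" --out ./dump
    mongorestore --host localhost:27017 ./dump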
[19:24:22] <phutchins> Anyone good with aggregations that can point me in the right direction with what I'm doing wrong here? db.reports.aggregate([ { $match: { $timestamp: { $gt: ISODate("2016-07-04T14:45:17.809Z"), $lt: ISODate("2016-07-04T14:55:17.809Z") } } }, { $group: { _id: "$nodeID", used: { $max: "$storage.used" } } } ] ) I'm trying to match 10 minutes worth of data, group by the 'nodeID' field, and get the max
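The likely culprit is the $ prefix on the field name inside $match: query field names are written bare, while the $ prefix is only used to reference fields in expressions such as those in $group. Assuming the field is actually named timestamp, a corrected version would be:

    db.reports.aggregate([
        { $match: { timestamp: { $gt: ISODate("2016-07-04T14:45:17.809Z"),
                                 $lt: ISODate("2016-07-04T14:55:17.809Z") } } },
        { $group: { _id: "$nodeID", used: { $max: "$storage.used" } } }
    ])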
[20:24:14] <NotBobDole> Anyone here do authentication cycling/revocation?
[21:25:20] <ctooley> Is there a way to turn off the WARNINGs that get reported on client connect for things like transparent_hugepage? We've got replicasets in docker containers and I can't turn those things off on the docker hosts. But, it's breaking the client's ability to connect cleanly.
[21:33:45] <teeka> http://stackoverflow.com/questions/38276461/mongodb-bulk-upsert-throwing-e11000-duplicate-key-error <-- can anyone take a peek ? :)
[22:34:23] <poz2k4444> hi guys, I'm trying to sync some data to elasticsearch with mongo-connector, but the oplog thread says "reading last checkpoint as None". I think this means it's not going to sync anything, or am I missing something here?
[22:36:56] <poz2k4444> but another collection syncs without problems
[22:37:17] <poz2k4444> does this mean mongo-connector hasn't found the oplog entry for the collection I'm trying to sync?