[00:00:33] <Boomtime> in your case, the information you want is trivially pre-computable
[00:02:00] <atrigent> Boomtime: it seems to me that you're blinding yourself to the fact that this query is obviously running massively slower than it needs to just because I mentioned SQL
[00:02:58] <atrigent> there is absolutely nothing SQL-y in what I'm trying to do
[00:03:01] <Boomtime> no, it's running massively slower than it needs to because it shouldn't be run at all
[00:03:37] <Boomtime> if you run this aggregation twice, will the answer be different?
[00:04:42] <Boomtime> the aggregation you quoted is trivially pre-computable, you can maintain the answer much more easily than re-computing it every time
[00:05:12] <atrigent> Boomtime: ok, are you disputing this because this query can not actually, in a purely technical sense, be made faster, or because of some idea of how mongo "should" be used?
[00:05:29] <atrigent> because if it's the latter then I'm not interested in discussing this anymore with you
[00:06:02] <Boomtime> your query as it stands cannot be made faster, it requires walking the entire collection
[00:06:19] <Boomtime> which is usually a sign of using MongoDB incorrectly
[00:06:54] <Boomtime> if you had a $match at the start that made the result more dynamic then you would have something that is not so easily pre-computable
[00:07:09] <Boomtime> but then you would also be making use of an index
[00:09:13] <Boomtime> what you are trying to achieve is absolutely trivial, but you insist on doing it in the worst possible way and then complaining that SQL does it better. you mentioned SQL, not me
[00:11:13] <atrigent> well, to answer your question about whether subsequent queries will return the same results, the answer is obviously no
[00:11:52] <atrigent> but it probably won't change a huge amount, and having this data up-to-date isn't crucial, which is why I'll probably wind up caching it somehow
[00:12:08] <Boomtime> now we're getting somewhere...
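The "maintain the answer instead of re-computing it" pattern Boomtime is arguing for can be sketched in pure Python (a dict stands in for what would be, say, a `$inc` on a hypothetical counters collection in MongoDB; all names here are invented for illustration):

```python
# Instead of re-aggregating the whole collection on every read,
# adjust a running total on the write path. Reads become O(1)
# lookups with no collection scan.

class TagCounter:
    def __init__(self):
        self.counts = {}          # the precomputed answer, e.g. {tag: n}

    def insert(self, doc):
        # write path: maintain the cached aggregate alongside the insert
        tag = doc["tag"]
        self.counts[tag] = self.counts.get(tag, 0) + 1

    def remove(self, doc):
        tag = doc["tag"]
        self.counts[tag] -= 1

    def count(self, tag):
        # read path: no scan, just a lookup
        return self.counts.get(tag, 0)

c = TagCounter()
for t in ["a", "b", "a", "a"]:
    c.insert({"tag": t})
c.remove({"tag": "a"})
print(c.count("a"))  # 2
print(c.count("b"))  # 1
```

The trade-off atrigent concedes later in the log: the stored count can briefly lag the raw data, which is fine when exact freshness isn't crucial.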
[01:09:02] <SoulBlade> i think i have to do it one by one
[01:15:51] <Boomtime> @SoulBlade: you are correct - many drivers have a multi-update in the API for 2.4, but under the bonnet they are sending the updates one by one
[01:23:39] <Boomtime> the array insert will be the single largest boost, the bulk API might get you a little more on top
[01:23:44] <SoulBlade> im kind of surprised at the array of docs actually being parallel
[01:24:30] <Boomtime> there isn't actually a huge amount different between parallel commands and the bulk API commands, at least not until you have more than, say, 100 inserts to do
[01:24:52] <Boomtime> however, the bulk API lets you do mixed commands
[01:25:10] <Boomtime> whereas the array insert means they are all definitely inserts
[01:26:04] <Boomtime> the bulk API also lets you inspect the individual result of each op, including multi-updates, whereas the array method almost always cannot
[01:27:19] <Boomtime> in short: array insert is implemented in parallel by the client, bulk API is implemented in parallel by the server
[01:27:19] <SoulBlade> so looking at the insert w/ array, it seems multiple documents are added to a single insert command
[01:27:57] <Boomtime> the trouble is that you get only a single answer
[01:28:52] <Boomtime> if a document insert fails you have a lot of fiddling to do to figure out which one and what happened.. the bulk API has a separate answer per request contained
[01:28:54] <SoulBlade> yea thats the problem w. that one, but i am ok with that limitation for the boost
[01:29:01] <SoulBlade> yea - i look forward to that :)
[01:30:46] <SoulBlade> sigh.. so yea i guess bulk update will be slow for me still on 2.4 - unfortunate
[01:30:59] <SoulBlade> but ill switch over to bulk op when i can
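The error-reporting difference Boomtime describes above can be illustrated with pure-Python stand-ins (these are not real driver calls; the function names and result shapes are invented to show the concept):

```python
# Array insert: one combined answer for the whole batch, so a single
# failure forces you to work out which document was at fault.
# Bulk-style API: one answer per operation, so failures are addressable.

def array_insert(docs):
    # one answer for the whole batch
    for d in docs:
        if "_id" not in d:
            return {"ok": False, "inserted": 0}  # ...but which doc failed?
    return {"ok": True, "inserted": len(docs)}

def bulk_insert(docs):
    # one answer per operation
    return [{"index": i, "ok": "_id" in d} for i, d in enumerate(docs)]

docs = [{"_id": 1}, {"x": 2}, {"_id": 3}]
print(array_insert(docs))              # {'ok': False, 'inserted': 0}
results = bulk_insert(docs)
failed = [r["index"] for r in results if not r["ok"]]
print(failed)                          # [1]
```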
[01:31:22] <SoulBlade> have you migrated a 2.4 to 2.6? can it be done by just adding a 2.6 member to a 2.4 replica set?
[01:36:45] <Boomtime> i may have understated the auth changes, from that doc above: "MongoDB 2.6 includes significant changes to the authorization model,.."
[01:37:23] <Boomtime> "After you begin to upgrade a MongoDB deployment that uses authentication to 2.6, you cannot modify existing user data until you complete the authorization user schema upgrade."
[01:39:06] <SoulBlade> well i dont really crud users in system.users - i just have one that i use to connect to the DB
[01:39:47] <SoulBlade> and i doubt id modify any of the data - and truth be told, i probably don't need the auth anyway since my app is colocated at the moment
[01:56:28] <atrigent> Boomtime: I am aware of ways to make this work, and I'm aware that counts can be maintained as I add and remove things, but I am still curious why the query is so much faster in mysql
[01:56:53] <atrigent> I realized it might have something to do with how the data is stored
[02:00:57] <Boomtime> you have asked a far more complicated question than you realise, and the last time i started to answer it you rejected the answer
[02:04:20] <Boomtime> "I have problem X, what is a fast solution?" versus "I have solution X, how do I make it fast?"
[02:05:02] <Boomtime> With the latter question there might be a system which implements "solution X" and is thus fast by default
[02:05:35] <Boomtime> Porting that solution to a different, unrelated system is an unfair starting point
[02:08:54] <SoulBlade> hmm i must have split during a good convo
[02:15:02] <atrigent> Boomtime: it looked to me like you were just saying "this isn't how it's done", whereas I was looking for technical reasons why this can't be implemented in mongo
[02:16:33] <atrigent> I have a problem where I find it hard to accept the status quo if it's less than ideal :)
[02:16:35] <Boomtime> "... can't be implemented in mongo" <- it can, and you know because you've done it, and there are multiple ways
[02:17:22] <atrigent> but if there's some core aspect of mongo, such as the fact that it is schemaless, that makes this SLOW (sorry, I meant slow), then that's understandable
[02:22:07] <Boomtime> what you are trying to do would be covered by this: https://jira.mongodb.org/browse/SERVER-4507
[02:23:11] <atrigent> when it says "sorted by the _id", it means sorted by the _id that you specify in the $group, right?
[02:26:46] <atrigent> oh I see, so essentially if it's sorted then when you hit another value, you know that you're done with the previous value, so you can run the accumulator operations for that group
[02:29:28] <atrigent> but if you're just doing a $sum then you can just maintain a single value for each group, can't you?
[02:30:55] <Boomtime> that is the solution i first proposed
[02:31:37] <atrigent> I mean for the $group operation
[02:32:13] <Boomtime> you really are very insistent on doing this the SQL way aren't you? you are in for a lot of pain
[02:32:23] <atrigent> I might not be understanding what this issue is saying
[02:39:31] <Boomtime> there is no way to optimize your aggregation, it trawls the entire collection, i've said that, that's the answer to why your aggregation is slow, there is nothing more to say
[02:39:33] <atrigent> Boomtime: do you know why it's so much faster in mysql or not? I'm not "insisting" anything right now, I'm just, out of genuine curiosity, looking for an answer to that question
[02:53:11] <atrigent> this definitely is the sort of thing that would work fine in a sql database
[02:53:19] <Boomtime> goodo, and how big are the documents? bson size if possible, you can use .collection.stats() to tell you averages
[02:54:13] <Boomtime> btw, with only 1501 maximum, and in particular an average of just 180 you could use an array to embed all of them in a single document unless they are quite large
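The embedding Boomtime suggests—at most ~1501 small items per owner, so store them as an array inside one parent document rather than as separate documents—can be sketched like this (pure-Python stand-in; field names and the size guard are illustrative):

```python
# Embed child items as an array inside a single parent document,
# with a guard so one document can't grow without bound.

def add_item(parent, item, max_items=1501):
    if len(parent["items"]) >= max_items:
        raise ValueError("too many embedded items; consider splitting")
    parent["items"].append(item)

parent = {"_id": "owner1", "items": []}
for i in range(3):
    add_item(parent, {"n": i})
print(len(parent["items"]))  # 3
```

Whether this is appropriate depends on the item sizes, as Boomtime notes: the whole array has to fit inside the document size limit.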
[12:38:27] <b1001> doh.. that was pretty stupid.. thanks kali
[12:45:20] <bwin> people please answer to question http://stackoverflow.com/questions/26074738/mongodb-multiple-operators
[12:52:58] <dazzled> can anyone lead me on the right path http://stackoverflow.com/questions/26072379/adding-entries-onto-a-database-document-instead-of-creating-a-new-one
[13:21:44] <b1001> I have a problem where db.tweets.find({'lang':'ar', 'translated':{'$exists':'false'}}).count() gives a different result in pymongo and mongo shell.
[13:22:44] <b1001> the mongo shell query is without '' around false.. I actually find the one post where I put a field 'translated': 'gibberish'
[13:44:20] <testerbit> If mongodb does not find a search query, will it return undefined to the node driver? Or what does it return?
[13:53:50] <edrocks> anyone have experience with infinite scroll or pagination? Specifically with a ranked view not based on time
[17:42:44] <nexact> hello, if I insert a document containing a field named _id, will it automatically be converted to an ObjectID() item ? I'm trying to import data and I want to keep the id that i've previously generated ... thanks
[17:51:23] <kali> nexact: nope, if you provide an _id field, it will be kept as is and it can be about anything. your driver will add an objectId _id if you insert a doc without any _id, that's all it does.
[18:43:18] <hydrajump> hi if you have a 3 instance mongod setup, so 1 primary and 2 secondaries, and say a nodejs app using mongoose that is set up to talk to the primary mongod instance. If the primary fails and one of the secondaries is elected the new primary, how does the nodejs app know how to read/write to that new primary? I don't understand the client config for a replica set
[18:45:42] <kali> hydrajump: when connecting to a replica set node, the driver pulls the replica set configuration and remembers it
[18:46:56] <kali> hydrajump: but in order to be fully resilient, you'd better give the driver the three hosts when it's possible, so if the expected primary is down, the client can still discover the replica set setup
[18:49:44] <hydrajump> kali: cool thanks for explaining and providing all three makes total sense!
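kali's advice—give the driver all three hosts so it can still discover the replica set when the expected primary is down—usually takes the form of a connection URI listing every member (hostnames, database name, and replica set name below are placeholders):

```
mongodb://db1.example.com:27017,db2.example.com:27017,db3.example.com:27017/mydb?replicaSet=rs0
```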
[18:52:17] <hydrajump> kali: another Q. I can't find a chart or info on the difference between mongod and mongod enterprise? Is the latter commercial and what else is different from the "regular" version?
[19:07:51] <hydrajump> Only thing i've found about enterprise is that apparently it won't work on ubuntu 14.04 :(
[19:09:10] <hydrajump> kind of strange that SSL requires compiling yourself or using enterprise, but only on ubuntu 12.04. Doesn't make sense when 14.04 is LTS
[21:13:04] <edrocks> how can I sort by a number then by time? I tried sort({"mycounter":-1,"_id":-1}) but it just ends up with the time
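The compound sort edrocks is after—descending by a counter first, then newest-first as a tie-breaker—comes down to sorting on two keys where the second key only matters when the first is equal. A pure-Python sketch of the ordering (field names are placeholders; in a driver like pymongo the key order would be preserved by passing a list of `(field, direction)` tuples):

```python
# Sort descending on "counter", breaking ties by "ts" descending.

docs = [
    {"counter": 2, "ts": 100},
    {"counter": 5, "ts": 50},
    {"counter": 2, "ts": 200},
]
docs.sort(key=lambda d: (-d["counter"], -d["ts"]))
print([(d["counter"], d["ts"]) for d in docs])
# [(5, 50), (2, 200), (2, 100)]
```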