pmxbot IRC Log Viewer

[00:40:53] <coderman2> im running db.col.aggregate([{"$group" : {_id:"$source_group", count:{$sum:1}}}]) on a collection with 5m records. i have an index on source_group, but the query is doing a colscan. is that normal?

[00:44:09] <joannac> try putting a $sort at the start?

[00:46:11] <coderman2> db.col.aggregate([{ $sort: {_id: 1}}, {"$group" : {_id:"$source_group", count:{$sum:1}}}])?

[00:46:21] <joannac> ...no

[00:47:24] <joannac> I have no idea why you think sorting on _id would help

[00:47:35] <joannac> sort on the field you're grouping on

[00:49:30] <coderman2> .aggregate([{ $sort: {source_group: 1}}, {"$group" : {_id:"$source_group", count:{$sum:1}}}]) been running over a minute

[00:52:43] <coderman2> that did make it do an index scan though

[00:53:17] <coderman2> takes much longer, so must be a case of the planner doing the right thing
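
The two pipeline shapes compared above can be written side by side. This is a minimal sketch, assuming the collection is called `col` as in coderman2's messages; `groupCountPipeline` is a hypothetical helper name, not part of MongoDB. Whether the leading `$sort` actually makes the planner pick the index, and whether that is faster, depends on the server version and the data.

```javascript
// Hypothetical helper: builds the count-per-group pipeline from the chat,
// optionally with a leading $sort on the grouped field.
function groupCountPipeline(field, sortFirst) {
  const group = { $group: { _id: "$" + field, count: { $sum: 1 } } };
  if (!sortFirst) {
    return [group];
  }
  const sort = { $sort: {} };
  sort.$sort[field] = 1; // sort on the field being grouped, not on _id
  return [sort, group];
}

// In the mongo shell (needs a running server), the plans can be compared with:
//   db.col.aggregate(groupCountPipeline("source_group", false), { explain: true })
//   db.col.aggregate(groupCountPipeline("source_group", true), { explain: true })
```

Comparing the explain output for both shapes shows which one uses the index; timing them separately shows whether the index scan pays off for this workload.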

[01:35:38] <nofxx> j. chrst , why mongo doesn't pay a github or use gitlab at least? JIRA is a waste of time, to put in kind words..

[01:39:04] <joannac> nofxx: do you have a specific complaint?

[01:41:05] <cheeser> it's definitively *not* a waste of time though I can see how some wouldn't like using it.

[01:41:15] <nofxx> joannac, do you want'em in chronological or alphabetical order? ;) kidding but, really just old ideas, doesn't compare. But some examples: status: 'on code review' ... good, where's the code? what changed? there's no repo<>issues integration, not to mention pull request concept.

[01:42:02] <cheeser> that's unrelated to jira

[01:42:19] <cheeser> pull requests are still used.

[01:44:10] <nofxx> cheeser, move issues back to GH so it'll be unrelated to JIRA.

[01:44:20] <nofxx> don't know how it is now

[01:44:58] <cheeser> not my call

[01:47:46] <nofxx> and the cherry on the top: the design, is so bad and ill-done I can't even start. And I don't mean the beauty, I mean the usability. Currently I'm trying to read a issue and there's so many panels and nonsense info the description and comments have exactly 20% of the screen

[01:57:50] <nofxx> Annoyance and despair: http://i.imgur.com/zph7EUD.png Clarity and information: http://i.imgur.com/hC2GQlL.png cheeser joannac

[02:04:24] <cheeser> nofxx: you're barking up the wrong tree here. no one here can change the issue track decision

[02:05:20] <joannac> nofxx: do you contribute to the source? issue pull requests?

[02:05:58] <joannac> Like cheeser said, no one here has the authority to change the decision to use JIRA

[02:07:24] <joannac> If you contributed, you might get some traction. If you're just a user, I don't think you're going to have much leverage here

[02:22:16] <nofxx> cheeser, joannac , no, not directly, neither claiming to. Just a 'user' rant, outside of the box view. But was naive of me to think there's devs here or that mongo devs don't have any active voice in this?

[02:23:53] <cheeser> do you have any idea how hard it is to migrate from one tool to another? apart from massaging the data from one form to another, there are processes and links aplenty. it's non-trivial and needs a bit more motivation than "just a 'user' rant"

[02:26:15] <nofxx> cheeser, with access to the jira db directly, a script to export all to gitlab or github is pretty trivial. I would use some ORM and just map to their respective API.

[02:26:32] <nofxx> but cultural changes, way easier when you're moving to something better.

[02:27:26] <nofxx> won't be surprise if jira2github jira2gitlab found some projects too ;)

[02:28:06] <nofxx> but ok, let's just stop wasting more time... till soon ;)

[04:23:18] <devians> heya, can I query a find across dbrefs in 2.6.*? ie db.mycollection.find('dbref.field', 'value'); I've tried and get no results so I'm obv missing something.

[04:26:09] <joannac> devians: pastebin an example document and an example query?

[04:27:20] <devians> joannac, essentially this http://stackoverflow.com/questions/6195286/how-to-query-mongodb-with-dbref

[04:27:55] <joannac> devians: okay, and the first answer doesn't work for you?

[04:28:25] <devians> they're querying the id thats contained in the dbref, i want to query a field thats on the referenced document.

[04:28:40] <joannac> then query the referenced document

[04:29:20] <devians> uh, well I want foo's where the foo references a bar with value 0.

[04:29:40] <joannac> devians: whatever is in the document is what you can query

[04:30:08] <joannac> if it's not in the document, it requires 2 queries

[04:30:20] <devians> ok. then how can i achieve this in a performant manner? can I query foos and bars independently and then get the product of those queries somehow?

[04:30:21] <joannac> one to find all bar=0

[04:30:35] <joannac> and then one to find all foo that match the bar in query #1

[04:31:19] <devians> ok, so the latter query, is that essentially an 'where dbref.$id in' kind of query I guess?

[04:31:28] <joannac> yes

[04:31:39] <devians> can this be done as a subquery, does mongo have a concept like that?

[04:32:03] <joannac> no

[04:32:20] <joannac> "To resolve DBRefs, your application must perform additional queries to return the referenced documents. Many drivers have helper methods that form the query for the DBRef automatically. The drivers [1] do not automatically resolve DBRefs into documents."

[04:32:41] <joannac> http://docs.mongodb.org/manual/reference/database-references/
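
The two-query approach joannac describes can be sketched roughly as follows. The collection names `foo` and `bar`, the reference field `bar`, and the field `value` are assumptions taken from devians' wording, and `fooFilterForBars` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds the filter for query #2, matching every foo
// whose DBRef points at one of the bar _ids found by query #1.
function fooFilterForBars(refField, barIds) {
  const filter = {};
  filter[refField + ".$id"] = { $in: barIds };
  return filter;
}

// In the mongo shell (needs a running server):
//   // query #1: collect the _ids of all bars with value 0
//   var ids = db.bar.find({ value: 0 }, { _id: 1 }).map(function (d) { return d._id; });
//   // query #2: find the foos that reference any of those bars
//   db.foo.find(fooFilterForBars("bar", ids));
```

The join happens in the application, so the size of the `$in` list grows with the number of matching bars.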

[04:36:39] <devians> joannac, ok, thanks :)

[04:36:43] <devians> I think this is what I need: http://doctrine-mongodb-odm.readthedocs.org/en/latest/reference/priming-references.html

[04:39:07] <joannac> devians: that's basically the same thing, just that the ODM does it for you

[04:39:19] <devians> yep :)

[04:40:19] <devians> it means I cant use count() anymore, which is a bit annoying but I'm limiting by a date range so the numbers should be low enough that the speed penalty isnt too bad.

[05:12:24] <h1fuelcell> Hello

[05:12:38] <h1fuelcell> I'm stuck using an implementation of Mongo that disallows the aggregation framework

[05:13:58] <h1fuelcell> I'm trying to use the group() function to get what I need

[05:14:45] <h1fuelcell> I'm having trouble understanding what this means: "initial (object) – initial value of the aggregation counter object."

[05:14:52] <h1fuelcell> http://mongodb.github.io/node-mongodb-native/api-generated/collection.html#group

[05:15:14] <h1fuelcell> and this one:

[05:15:14] <h1fuelcell> command (boolean) – specify if you wish to run using the internal group command or using eval, default is true.

[05:15:27] <h1fuelcell> Where can I get more information about these terms?

[05:16:18] <h1fuelcell> specifically, how can a counter be an object

[05:16:23] <h1fuelcell> isn't it usually an int?
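
On the question of how a counter can be an object: in group(), `initial` is the starting accumulator for each group, and the reduce function mutates it once per document, so it can carry several running values at once. The sketch below is an in-memory imitation in plain JavaScript to show the idea. It is not the driver's implementation, and `groupInMemory` and the sample documents are made up for illustration.

```javascript
// In-memory imitation of group() over a single key field.
function groupInMemory(docs, keyField, initial, reduce) {
  const buckets = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    if (!buckets.has(key)) {
      // every group starts from its own shallow copy of `initial`
      const start = Object.assign({}, initial);
      start[keyField] = key;
      buckets.set(key, start);
    }
    // reduce(current document, accumulator) updates the accumulator in place
    reduce(doc, buckets.get(key));
  }
  return Array.from(buckets.values());
}

const result = groupInMemory(
  [{ color: "red", n: 2 }, { color: "blue", n: 5 }, { color: "red", n: 3 }],
  "color",
  { count: 0, total: 0 }, // the "counter object": two counters in one
  function (doc, acc) { acc.count += 1; acc.total += doc.n; }
);
```

With a plain int there would be room for only one running value; an object lets the same pass keep a count and a total (or anything else) per group.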

[08:30:32] <Mia> Hi channel -- I have a mongodb document structure as {_id:"...",content:{...},color:"red"} - and I want to get a random document from my collection with a specific color --- such as "a random document that's red" -- how can I query this in mongodb?

[08:44:57] <Derick> Mia: I don't think there is a way to fetch a random document. However, if there are not too many documents in the collection, you can do something like this: 1. count( { color: "red" } ); 2. calculate random number in range 0-($count-1) 3. find( { color: "red" } ).skip($random).limit(1);

[08:45:48] <Mia> Derick, .skip is a no-no, I tried it, I have around 200k documents in the collection now and if it's a big number it takes seconds

[09:03:11] <Derick> Mia: yeah, 200k is too many

[09:13:44] <al-damiri> Hi #mongodb

[09:14:01] <al-damiri> How can I print the whole collection to a file in a pretty format?

[09:14:03] <al-damiri> Is it possible?

[09:19:02] <Derick> mongodump into a json file?

[09:21:57] <al-damiri> Derick: I got it done by doing mongoexport -c <collection_name>.

[09:22:00] <al-damiri> Derick: Thanks. :)

[09:36:24] <bartzy> Hello

[09:37:04] <bartzy> I’m currently using MongoDB 2.4 on Debian, via the 10gen repos. The package name is mongodb-10gen. I want to upgrade to 2.6, where the package name is mongodb-org. These packages conflict. How should I upgrade? (This is a production environment)

[10:32:54] <ielezovikj> How do I restart mongod?

[13:15:16] <vagelis> I'm asking something even though I already asked some time ago, because I really can't do it with mongo. Imagine we have 3 simple documents like this: {num: 1}, {num: 2}, {num: 3}. How can i get the documents that have sum of 3 (so the first 2 documents) ?

[14:06:44] <Pinkamena_D> Hello, I have a special query in mind: first a specific query is fetched and sorted, and then a less specific query is fetched and sorted. The more specific entries should be returned first. Is this possible with aggregation or should I just pull and store and combine the two queries client side?

[15:47:52] <Axy> Hey all

[15:48:03] <Axy> is there any notes for what's going to be in the next release of mongodb

[15:48:10] <Axy> I heard there were some options about getting random documents

[15:48:15] <Axy> maybe I misheard about it

[15:48:19] <Axy> I wanted to see a full list

[15:59:26] <cheeser> $sample is a new agg pipeline stage

[15:59:45] <StephenLynx> hm

[15:59:50] <StephenLynx> what will it do?
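
As a rough answer to what `$sample` does: it is an aggregation stage that returns a randomly selected set of documents of the requested size. A minimal sketch, assuming a server version that ships `$sample`; the collection name `mycollection` and the helper name `randomDocPipeline` are hypothetical.

```javascript
// Hypothetical helper: filter first, then pick `size` random documents.
function randomDocPipeline(filter, size) {
  return [{ $match: filter }, { $sample: { size: size } }];
}

// In the mongo shell (needs a server that supports $sample):
//   db.mycollection.aggregate(randomDocPipeline({ color: "red" }, 1))
```

This covers the "a random document that's red" case asked about earlier in the day without the count-and-skip workaround.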

[16:35:20] <terminal_echo> stupid question but lets say you've put a bunch of data in a mongo db

[16:35:20] <terminal_echo> how do you explore this data manually?

[16:47:14] <StephenLynx> using the terminal

[16:54:47] <baid0c> Hello, I`m getting MongoCursorTimeoutException, although the server load is very small between 1-2, the server has 8 cpus

[16:55:11] <Derick> baid0c: what does the mongo log say?

[16:59:45] <baid0c> I see all db logs there

[16:59:48] <baid0c> so I should wait for an error

[16:59:53] <baid0c> to see why it's triggered

[17:00:32] <Derick> not an error, but a slow query can trigger it

[17:00:40] <Derick> post the last 100 lines in a pastebin or something

[17:11:28] <baid0c> the collection matters, if the slow query runs on collection A, can the timeout query run on collection B?

[17:11:43] <baid0c> or there should be a query on collection B, that is running slow

[17:11:47] <baid0c> ?

[17:45:00] <stuntmachine> I'm setting up Mongo 3.0.5 from scratch on CentOS 6 and curious how to properly configure an arbiter w/ two nodes as replicas. I see a lot of the documentation talking about running commands against mongo to set up the replica set but can't I codify this into the Mongo configuration file before starting the arbiter and replicas?

[17:49:39] <Derick> stuntmachine: nope, configuration for a set is not written down in config files, as that makes adding a new member while it runs not straightforwards

[17:49:47] <Derick> just the arbiter you need to add a flag for

[17:50:24] <stuntmachine> I just don't want it to start and try replicating data to the arbiter. I think I'm thinking of this too much like a traditional RDBMS

[17:51:22] <Derick> it won't start replicating to an arbiter if you don't add it as a normal host, but an arbiter

[17:51:32] <stuntmachine> I see now why it says to specify journal.enabled to false and smallFiles to true

[17:51:44] <stuntmachine> alright it's a new build so no harm in just giving it a shot

[17:51:49] <Derick> usually on the "primary": rs.initiate(); rs.add("hostnameb"); rs.addArb("arbnodehostname");

[17:52:08] <stuntmachine> ah okay that needs to be run on each member?

[17:52:11] <Derick> (I wrote that without checking docs)

[17:52:15] <Derick> no, just on "hostnamea"

[17:52:17] <stuntmachine> sure sure, i'll verify against them

[17:52:25] <Derick> if you add a host, it syncs the config too
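
The same setup can also be expressed as a single config document passed to `rs.initiate()` on one member. This is a sketch only: the set name `rs0` and port 27017 are assumptions, the hostnames are the placeholders used in the chat, and `replicaSetConfig` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds a replica set config with data-bearing members
// followed by one arbiter (arbiterOnly: true means it holds no data).
function replicaSetConfig(setName, dataHosts, arbiterHost) {
  const members = dataHosts.map(function (host, i) {
    return { _id: i, host: host };
  });
  members.push({ _id: members.length, host: arbiterHost, arbiterOnly: true });
  return { _id: setName, members: members };
}

// In the mongo shell on hostnamea only (needs running mongod processes that
// were all started with the same replica set name):
//   rs.initiate(replicaSetConfig(
//     "rs0",
//     ["hostnamea:27017", "hostnameb:27017"],
//     "arbnodehostname:27017"
//   ));
```

The config lives in the set itself and is synced to members as they join, which is why it is not written into each node's config file.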

[17:53:41] <stuntmachine> oh, you know what i was getting confused about last time i looked at this? i install mongo from the official repo and it creates a config file that isn't YAML, but then the docs say the new method is YAML.

[17:54:05] <Derick> yeah... the packaging could do some updates :S

[17:54:13] <stuntmachine> ha

[17:54:50] <stuntmachine> so then what's the proper way of adding those two values in the old format? journal.enabled and smallFiles

[17:55:01] <stuntmachine> the docs point to a yaml config

[17:57:40] <Derick> stuntmachine: I don't know :) Never used the yaml config stuff

[17:57:55] <stuntmachine> no, i mean, what's the non-yaml config look like

[17:58:10] <Derick> smallFiles=yes

[17:58:17] <cheeser> (or true)

[17:58:25] <cheeser> and just "journal" iirc

[17:58:28] <Derick> and... you shouldn't mess with the journal setting :)

[17:58:40] <stuntmachine> but... the docs say to do it for the arbiter :P

[17:58:56] <cheeser> i wouldn't think it'd matter for the arbiter

[17:59:06] <Derick> same here...

[17:59:23] <stuntmachine> http://docs.mongodb.org/master/tutorial/add-replica-set-arbiter/

[17:59:42] <stuntmachine> "These settings are specific to arbiters. Do not set journal.enabled to false on a data-bearing node. Similarly, do not set smallFiles unless specifically indicated."

[18:48:58] <jr3> with mongo one can shard data by actual values on said data right? If property is A put into A shard, if B put into B shard

[19:02:16] <cheeser> like so? http://docs.mongodb.org/master/core/tag-aware-sharding/

[19:04:53] <jr3> danke cheeser

[19:07:39] <cheeser> bitee

[19:07:41] <cheeser> bitte

[21:23:36] <Trindaz> Looks like @namlook isn't here to help with a mongokit problem

[21:23:51] <Trindaz> followed the tutorial in the docs, getting TypeError: 'Collection' object is not callable. If you meant to call the 'register' method on a 'Database' object it is failing because no such method exists.

[21:36:05] <Trindaz> nevermind @namlook, I solved the problem.

[22:10:34] <jr3> can anyone explain like im 1 what a "working set is?"

[22:10:43] <jr3> how do I measure my working set?

[22:11:12] <jr3> my total db data size is 20GB, is that considered my working set?

[22:36:04] <jmorphio> Hey there. Should I avoid using mongo for a convential web app?

[22:36:17] <jmorphio> conventional*

[22:47:31] <movedx> jmorphio: No. I find both MongoDB and ElasticSearch to be fine for general purpose applications.

[22:48:30] <movedx> jmorphio: In fact, NoSQL databases are somewhat designed/perfect for such applications, because they're structureless (in that they don't force a structure on you or force you to define one before hand), so they can grow with your application.

[22:48:50] <jmorphio> movedx: that's what i'm most attracted to about mongo right now

[22:49:46] <movedx> jmorphio: Understandable. Personally I prefer ElasticSearch as its interface for accessing data is easier to get to grips with.

[22:50:04] <movedx> I find MongoDB's querying too complicated, but it's a very capable DB.

[22:50:25] <movedx> Possibly more mature than ES too. I've never looked up the dates/facts on that one.

[22:50:46] <jmorphio> i'm still getting used to the way relationships are defined in mongo, but it doesn't seem that bad

[22:51:13] <jmorphio> there seems to be a stigma around querying embedded documents in a collection - like it's a performance problem

[22:53:03] <movedx> For me it's just the query language. I've also never used MongoDB in any detailed capacity - I'm a Systems guy.

[22:59:46] <jmorphio> thanks for the response :) i think i'll stick with mongo for now

[23:30:22] <diegoaguilar> Hello, is it even possible to get an index based on a regex?

[23:32:05] <cheeser> say what now?

#mongodb logs for Monday the 17th of August, 2015