PMXBOT Log file Viewer


#mongodb logs for Tuesday the 16th of June, 2015

[01:11:42] <MrAmmon> is there a simple way to create a collection in mongo from a subset of documents in another collection? For example, if I just want to create a table of 1000 documents from my collection of millions so I can test an index?
[01:12:24] <GothAlice> MrAmmon: http://docs.mongodb.org/manual/reference/operator/aggregation/out/
[01:12:32] <GothAlice> Aggregation lets you transform the documents however you wish during the migration.
[01:12:41] <GothAlice> Migration/copy.
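
A minimal sketch of the $out approach GothAlice links to, assuming a hypothetical source collection named `events` (the collection names and pipeline here are illustrative, not from the log):

    // copy a 1000-document subset into a new collection for index testing
    db.events.aggregate([
        { $limit: 1000 },              // take the subset
        { $out: "events_sample" }      // write it out as its own collection
    ])
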
[01:13:24] <MrAmmon> excellent, thank you
[01:13:31] <GothAlice> It never hurts to help. :)
[01:13:54] <PeterFA> GothAlice, what if you're helping someone hurt people or things?
[01:14:23] <MrAmmon> it's ok, I promise not to use my newfound skills to assist my marketing department
[01:14:57] <PeterFA> Marketing departments are the ones who love denormalized databases.
[01:15:13] <GothAlice> Accounting, not so much. XP
[01:15:33] <PeterFA> No, they hate them.
[01:18:46] <lqez> ...lol
[01:24:54] <StephenLynx> marketing dpt doesn't even know what a db is.
[01:25:33] <GothAlice> Mine calls a 1-column CSV a database. Considering SQL, she's not wholly incorrect. (Zzing!)
[01:28:41] <PeterFA> COBOL :
[01:28:43] <PeterFA> :)
[01:29:32] <GothAlice> The size of a commercial fridge.
[01:30:11] <GothAlice> FTP is my SSH.
[01:30:39] <GothAlice> T_T
[01:31:13] <GothAlice> Irony: my highlight color is green, and your name color is yellow, on my client. Excellent combination.
[02:23:09] <MrAmmon> does mongo have to actually find something on a query in order to return an accurate explain?
[02:23:27] <joannac> no
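
That is, explain() reports the chosen plan whether or not anything matches; a quick illustration (names hypothetical):

    // returns the winning plan and index usage even when zero documents match
    db.events.find({ noSuchField: true }).explain()
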
[03:40:06] <loth> Hey all, getting an "exception: need most members up to reconfigure, not ok" error even though I can reach the node I'm adding just fine https://pastebin.mozilla.org/8836929
[03:51:42] <joannac> loth: you are piping that command to 10.0.0.1
[03:51:59] <joannac> connect to that host, and verify that host can connect to 10.0.128.107:27017
[03:57:00] <loth> joannac: which host should I connect from? 10.0.0.1 = cpl001
[03:58:29] <loth> which is where i ran rs.initiate()
[03:59:16] <madeti> why is this not working: find({'someArrayField': {$elemMatch: {$in: [{'k1': 'v1', 'k2', 'v2'}, {'k1: 'b1', 'k2': 'b2'}] } } })
[03:59:31] <madeti> can't I use $elemMatch with $in?
[04:03:46] <Boomtime> @madeti: there are a few curious things going on in your query there. First, you only have one predicate to the $elemMatch, so $elemMatch is superfluous
[04:03:57] <Boomtime> secondly, the $in looks suspiciously odd
[04:04:09] <Boomtime> perhaps you should start by explaining what you are trying to do
[04:05:21] <madeti> Boomtime: 'someArrayField' is a field in my document which has an array of objects/dicts,
[04:05:32] <madeti> so I want to find all documents which have
[04:06:35] <madeti> either one of the objects from the array [{'k1': 'v1', 'k2', 'v2'}, {'k1: 'b1', 'k2': 'b2'}] in the 'someFieldArray'
[04:07:28] <madeti> my array can have several objects
[04:08:36] <joannac> loth: open a shell to 10.0.0.1 and run db.serverStatus().host
[04:10:06] <loth> joannac: I get cpl001
[04:12:47] <joannac> loth: *eyebrows*
[04:13:19] <loth> I think I figured it out; after adding keyFile on both nodes I can add the host
[04:13:33] <joannac> oh, you have auth on?
[04:13:37] <loth> Yeah
[04:13:42] <joannac> sigh
[04:16:41] <Boomtime> @madeti: $elemMatch applies to its own parameters, not to the parameters of operators that are nested deeper in - so you actually need each 'or' clause to be: { $elemMatch: {'k1': 'v1', 'k2': 'v2'} }
[04:17:06] <Boomtime> this also implies that $in is not the right operator, since it expects to apply direct matching
[04:17:40] <Boomtime> i.e. you want an arrangement more like this: $or: [ $elemMatch: {}, $elemMatch: {} ]
[04:18:22] <madeti> Boomtime: thank you, understood
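
Assembled from Boomtime's advice, the query madeti wants is roughly the following (field names and values taken from the discussion above):

    db.collection.find({
        $or: [
            { someArrayField: { $elemMatch: { k1: 'v1', k2: 'v2' } } },
            { someArrayField: { $elemMatch: { k1: 'b1', k2: 'b2' } } }
        ]
    })
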
[06:52:36] <Jonno_FTW> hi, I'm having trouble with pymongo; I can't get an index from a cursor, e.g. readings[1234]
[09:17:54] <markand> hi guys
[09:18:04] <markand> it seems that mongo-c-driver 1.1.7 lacks the content of src/libbson
[09:18:12] <markand> and we cannot add an issue on github
[09:28:20] <circ-user-22hxs> I have a collection with 2 million documents, I'd like to move documents older than 6 months to a backup collection, how can I do this efficiently?
[09:29:51] <Lujeni> circ-user-22hxs, you can dump the documents which need a backup, then remove them with a basic query?
[09:31:23] <circ-user-22hxs> @Lujeni: it's multiple documents, around 1 million, so I'm wondering what's the best way to do this.
[09:35:15] <Lujeni> circ-user-22hxs, 1 million is not a big deal. You can simply remove them. Your database lock will increase during this operation
[09:38:15] <circ-user-22hxs> @Lujeni : I'd like to move them to a backup collection, not remove them.
[09:38:28] <mtree> guys, how do I set a key from a value in an aggregate query?
[09:38:53] <mtree> i need to store each unwound value under its specific key
[09:39:43] <mtree> $project: { '$scrapes.ota': '$scrapes.ota.price' } gives me an error:
[09:40:01] <mtree> "errmsg" : "exception: the operator must be the only field in a pipeline object (at '$scrapes.ota'",
[09:40:45] <Lujeni> circ-user-22hxs, move = delete in your case
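
A hedged sketch of the copy-then-delete Lujeni is describing, reusing the $out stage from earlier (the collection and date field names are assumptions; note that $out replaces the target collection's contents):

    // copy documents older than roughly six months into a backup collection
    var cutoff = new Date(Date.now() - 1000 * 60 * 60 * 24 * 182);
    db.events.aggregate([
        { $match: { createdAt: { $lt: cutoff } } },
        { $out: "events_backup" }   // caution: overwrites events_backup if it exists
    ]);
    // then remove the originals
    db.events.remove({ createdAt: { $lt: cutoff } });
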
[09:41:50] <mtree> $project: {
[09:41:53] <mtree> '$scrapes.ota': '$scrapes.result.price'
[09:44:01] <mtree> http://stackoverflow.com/questions/24548295/use-field-value-as-key
[09:44:07] <mtree> this describes my problem nicely
[10:19:28] <kenalex> Hello
[10:21:36] <moqstheone> hi
[10:21:43] <moqstheone> i have a question.
[10:23:01] <moqstheone> Is it possible in mongodb with mongoose to create a schema that contains an array with different schemas?
[11:00:40] <GoCrazy> hey
[11:01:29] <GoCrazy> I want to ask about a better way to implement autocomplete using MongoDB. Currently I am using a simple regex; I want to make it more efficient.
[11:31:40] <deathanchor> autocomplete?
[11:31:56] <deathanchor> oh for a UI search box right?
[11:32:25] <deathanchor> I'm sure there are a few methods/examples online
[12:08:41] <StephenLynx> hm, mongo update for me
[12:09:03] <StephenLynx> it seems to be just the client though
[12:39:36] <GoCrazy> I want to know if there is any way to use an array variable with $each in aggregation in MongoDB
[14:56:48] <synergy_> In MongoDB is a cursor an object?
[15:35:20] <ehershey> I would really like to automate this part of the release process:
[15:36:02] <ehershey> 3.0.4 is out!
[15:37:56] <GothAlice> \o/
[15:38:21] <GothAlice> ehershey: Anything worth doing twice is worth writing a tool to do it for you.
[15:38:52] <deathanchor> GothAlice: if you do it once, you'll probably have to do it again, so write the tool the first time.
[15:39:24] <GothAlice> deathanchor: Aye, my corollary is usually: Anything worth doing right is worth doing twice.
[15:39:25] <GothAlice> :)
[15:40:20] <ehershey> haha awesome
[15:41:39] <deathanchor> When you do things right, people won't be sure you've done anything at all. -Futurama Binary Galaxy
[15:41:53] <StephenLynx> aaah, that is why only my shell updated. I have the database on a VM :v
[15:42:21] <GothAlice> That'd do it. I even MMS manage my VMs, StephenLynx. >:D
[15:42:27] <GothAlice> (My local VMs, that is.)
[15:45:05] <StephenLynx> is it just me, or were the binaries reduced in size?
[15:45:17] <StephenLynx> I remember a full update would clock in around 80 MB, but it's 50 MB this time around
[17:04:07] <BadHorsie> I have a collection ipNets with entries like: {"cidrs":{"10_0_0_1/24":{"ip":{"10_0_0_2":{name:"mygw"}}}}}, I'm curious about how to search for {"name":"mygw"}, I usually do like "cidrs.<range>.ip.<ip>.name", but in this case I don't know the <range> or <ip> from the query...
[17:06:15] <StephenLynx> why did you nest it like that?
[17:06:31] <StephenLynx> instead of making cidrs an array and the ip a field of the objects inside the array?
[17:06:52] <StephenLynx> "10_0_0_1/24" what does this mean?
[17:07:08] <StephenLynx> what the
[17:07:13] <StephenLynx> why did you nest everything?
[17:08:22] <BadHorsie> We thought of that over "cidrs":[{"range":"10.0.0.0/24","ips":[{"ip":10.0.0.1,"name":"mygw"}]}] because it was kinda hard for updates, like the find would return one positional operator and you would need that trick of foreknowing the itemNumber...
[17:08:34] <BadHorsie> That doesn't mean that both are not completely wrong lol
[17:08:57] <BadHorsie> TBH we are just learning and trying to find a reasonable model and failing at doing so, part of learning I guess
[17:09:15] <StephenLynx> {cidrs:x,ip:y,name:z}
[17:09:22] <StephenLynx> that is what you need.
[17:09:33] <BadHorsie> One cidr has many IPs
[17:09:54] <StephenLynx> make it an object with an array in one of its fields
[17:10:13] <BadHorsie> You want to have the subnet repeated for each IP?
[17:10:33] <StephenLynx> {cidrs:{subnet:ips:[ip:x,name:z]}}
[17:10:50] <StephenLynx> {cidrs:{subnet:x,ips:[ip:y,name:z]}} fixed
[17:11:20] <StephenLynx> dynamic keys are one of the worst practices in any database.
[17:12:36] <StephenLynx> just whatever you do, don't use dynamic keys.
[17:12:58] <BadHorsie> Interesting, so have an array of IPs per keyed range... cidrs in your example is "10_0_0_0/24" and there are multiple cidrs expected, right?
[17:14:25] <StephenLynx> there are? I don't know enough about networking to figure out your requirements just from what I've understood so far.
[17:15:00] <StephenLynx> in that example cidrs should be an array too, when I think about it.
[17:15:18] <StephenLynx> if you can have multiples of it per document.
[17:15:40] <StephenLynx> "cidrs":[{"range":"10.0.0.0/24","ips":[{"ip":10.0.0.1,"name":"mygw"}]}] is much better
[17:15:42] <StephenLynx> and is not hard to update
[17:16:02] <StephenLynx> because you can match on the query block and use $ on the update block.
[17:16:33] <StephenLynx> cidrs.$.ips.$.name: value
[17:16:43] <BadHorsie> That's our current design
[17:16:53] <StephenLynx> given that you matched the correct cidrs and ip on the query block.
[17:17:05] <StephenLynx> you can either use subarrays or split on a separate collection.
[17:17:12] <StephenLynx> each approach has its own limitations.
[17:18:15] <StephenLynx> it depends more on how you will query it rather than how you will update it.
[17:19:30] <StephenLynx> if you want to query the ips from multiple ranges along with data that is pertinent to the range, you will have to perform an additional query to get the range documents after obtaining the ip documents and then manually append the data on the ips to fake a join.
[17:19:47] <BadHorsie> How would you build the update instruction? I have a db.ipNets.update({"cidrs.ips.ip":"10.0.0.1"},{$set:{"cidrs.$.ips.0.name":"mygw"}}); which works in my case because I know beforehand that it's item subzero...
[17:19:47] <StephenLynx> that is if you use separate documents.
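
A rough sketch of the "fake join" StephenLynx describes, assuming ranges and their IPs were split into two collections (all collection and field names here are hypothetical):

    // 1) fetch the ip documents of interest
    var ips = db.ips.find({ name: "mygw" }).toArray();
    // 2) fetch the range documents they reference
    var rangeIds = ips.map(function (ip) { return ip.rangeId; });
    var rangesById = {};
    db.ranges.find({ _id: { $in: rangeIds } }).forEach(function (r) {
        rangesById[String(r._id)] = r;
    });
    // 3) manually append the range data to each ip document
    ips.forEach(function (ip) { ip.range = rangesById[String(ip.rangeId)]; });
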
[17:20:03] <StephenLynx> why is ips.0 hardcoded?
[17:20:18] <StephenLynx> the index 0.
[17:20:35] <StephenLynx> you could just use $ there too, if I am not mistaken.
[17:21:19] <BadHorsie> So I use $ twice and it knows which array itemNumber I mean? AFAIK there is a JIRA for it to work with $1, $2 and so on... but so far it's not possible...
[17:21:33] <StephenLynx> I think I have used it.
[17:21:44] <StephenLynx> in a project where each thread has a tree representing its comments
[17:21:59] <StephenLynx> with infinite depth, each comment has its own array of subcomments.
[17:22:10] <StephenLynx> and I just build the query concatenating $'s.
[17:22:30] <StephenLynx> let me look for it.
[17:22:40] <BadHorsie> Nice
[17:23:18] <StephenLynx> damn, some sites are cripplingly slow for me today.
[17:23:20] <StephenLynx> imgur, gitlab.
[17:23:30] <GothAlice> Solar radiation.
[17:23:33] <StephenLynx> I know it's not me because youtube loads fine; I am checking on the network.
[17:23:47] <asturel> hi, how can I .find( "target" : "xx" ) case-insensitively? a regexp doesn't always work because there are sometimes special characters
[17:23:56] <asturel> sry {}
[17:24:44] <asturel> https://bpaste.net/show/074cb2a28cc6
[17:24:53] <asturel> (nodejs)
[17:25:02] <GothAlice> asturel: Due to use of extended characters, you may be forced to save a second copy of the field, lower-case'd, and query on that.
[17:25:33] <asturel> 'save a second copy'?
[17:25:41] <asturel> you mean when I insert?
[17:25:46] <GothAlice> But if you're allowing anything outside [a-zA-Z_][a-zA-Z0-9_-]+ as a username… you're gonna have issues.
[17:26:07] <GothAlice> asturel: Correct. {username: actual_username, username_lower: to_lower(actual_username)}
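
A sketch of that shadow-field pattern in the shell (collection and field names are illustrative):

    var name = "GothAlice";  // hypothetical username as entered
    // store a lowercased copy at write time, and index it…
    db.users.insert({ username: name, username_lower: name.toLowerCase() });
    db.users.ensureIndex({ username_lower: 1 });
    // …so case-insensitive lookups become indexed equality matches
    db.users.find({ username_lower: "gothalice" });
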
[17:26:44] <asturel> http://docs.mongodb.org/manual/reference/operator/aggregation/strcasecmp/ something like this doesnt work?
[17:27:00] <StephenLynx> Can't find it. But GothAlice should be able to confirm, you are able to use multiple $ to indicate indexes in an update, right?
[17:27:16] <StephenLynx> like farms.$.cows.$ : newAmount
[17:27:19] <GothAlice> asturel: That would be the least efficient thing you could possibly do.
[17:27:32] <GothAlice> asturel: Though technically it'd work… ish.
[17:27:43] <GothAlice> asturel: Would your system let me use "🚸" as a username? (Yes, there's a unicode symbol between those quotes.)
[17:27:59] <StephenLynx> like farms.$.cows.$.name : name would be a better example
[17:27:59] <asturel> only ascii
[17:28:08] <asturel> well maybe extended ascii
[17:28:30] <asturel> it doesn't have to be fast
[17:29:01] <GothAlice> https://en.wikipedia.org/wiki/Unicode_equivalence and http://www.unicode.org/faq/normalization.html should be interesting to you.
[17:29:19] <GothAlice> No, I mean, that type of query can't use an index. It has to load and iterate all possible matching records.
[17:30:55] <asturel> hmm https://jira.mongodb.org/browse/SERVER-90
[17:33:04] <GothAlice> Text normalization isn't easy. ;P
[18:46:25] <fllr> Hey guys! I'm running into a weird issue with mongo. As soon as I set up a user on mongo using db.createUser(), it doesn't authenticate me anymore
[18:46:28] <fllr> What gives?
[18:47:13] <StephenLynx> that is intended.
[18:47:18] <StephenLynx> it's the localhost exception.
[18:47:37] <StephenLynx> if you have authentication turned on but no users, you can log in from localhost without authentication.
[18:47:48] <StephenLynx> as soon as you create a user, you have to authenticate even from localhost.
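
In shell terms, the flow looks something like this (user name and password are placeholders):

    // with auth enabled but no users yet, the localhost exception permits this once:
    use admin
    db.createUser({ user: "admin", pwd: "secret", roles: ["userAdminAnyDatabase"] })
    // from then on every connection must authenticate, even from localhost:
    db.auth("admin", "secret")
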
[19:01:56] <fllr> StephenLynx: Right... But it's not working even if I run db.auth()
[19:03:27] <loth> does rs.add() have the ability to add multiple replicas comma delimited?
[19:33:42] <tehgeekmeister> is there a way to compare two collections/mongo dumps for document level equality efficiently?
[19:33:55] <tehgeekmeister> looks like dumps are arbitrarily ordered, so md5sum isn't sufficient.
[19:35:19] <StephenLynx> can you sort them before dumping?
[19:45:28] <ngl> Can I not just take a string and insert it into GFS?
[19:50:08] <StephenLynx> yes.
[19:51:14] <fllr> Hey guys! I'm running into a weird issue with mongo. As soon as I set up a user on mongo using db.createUser() with the following roles: ['clusterAdmin', 'readWriteAnyDatabase', 'userAdminAnyDatabase', 'dbAdminAnyDatabase'] (in the admin db), it doesn't authenticate me even if I run db.auth() (in the database I want to use)
[19:51:45] <StephenLynx> have you tried putting the authentication data in the connection string?
[19:52:17] <StephenLynx> like user:password@address:post/database
[19:52:28] <fllr> StephenLynx: Yeah... I still get refused...
[19:52:29] <StephenLynx> address:port*
[19:52:31] <StephenLynx> hm
[19:52:53] <GothAlice> fllr: Were you describing that the user was created in a DB other than the one the user needs access to?
[19:53:06] <GothAlice> I.e. the user was created in the admin database, but needs to use database "foo"?
[19:53:09] <cheeser> you'd need --authSource (iirc) then
[19:53:18] <cheeser> --authDb ?
[19:53:20] <fllr> GothAlice: The admin database... with readWrite any database
[19:53:21] <cheeser> something like that.
[19:53:39] <GothAlice> --http://docs.mongodb.org/manual/reference/program/mongo/#mongo-shell-authentication-options
[19:53:47] <GothAlice> :P
[19:54:14] <GothAlice> fllr: Yeah, then you need to tell MongoDB's tools which database to connect to first and authenticate with, via the --authenticationDatabase option.
[19:54:40] <GothAlice> (I.e. mongo --user foo --authenticationDatabase admin localhost/foodb --pass)
[19:54:46] <fllr> agh...
[19:55:11] <GothAlice> To avoid this, create the user in the DB they will be accessing.
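
For drivers, the equivalent of --authenticationDatabase is the authSource option in the connection string StephenLynx mentioned (credentials here are placeholders):

    mongodb://foo:secret@localhost:27017/foodb?authSource=admin
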
[19:56:04] <fllr> Omg, it worked! :~~~)
[19:56:44] <GothAlice> :)
[20:34:03] <kopasetik> QUESTION: for some reason I cannot save to my users collection with this code: https://gist.github.com/kopasetik/5cc8ea545b4179277886
[20:34:54] <kopasetik> https://gist.github.com/kopasetik/5cc8ea545b4179277886#file-route-js Lines 31-38
[20:36:00] <ngl> Oh my. I was just not calling "end". :\ I could kick me.
[20:36:48] <GothAlice> ngl: Sometimes it's the little things, eh? ;P
[20:36:57] <deathanchor> is there docs about performance issue with queries that use $all and $nin on an array field which is indexed?
[20:37:10] <ngl> Yep
[20:37:18] <GothAlice> deathanchor: http://docs.mongodb.org/manual/core/index-multikey/
[20:37:23] <deathanchor> thx
[20:40:58] <GothAlice> deathanchor: In general, issuing multi-value array queries will multiply the "difficulty" of a given plan by the number of values, i.e. worst-case it can't use an index and needs to manually compare each value passed in against the actual records. With an index, it still compares each, but these are much faster btree lookups. (Not hashtable lookups, but one step back from that in terms of performance.) The more values, the slower.
[20:41:15] <GothAlice> Getting an .explain() of the queries can be helpful. :)
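
For instance, to see which plan such a mixed $all/$nin query picks (field names hypothetical):

    // inspect the winning plan and how many keys/documents were examined
    db.events.find({ tags: { $all: ["a", "b"], $nin: ["x"] } }).explain()
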
[20:45:15] <GothAlice> deathanchor: http://codyaray.com/2014/11/mongo-multi-key-index-performance goes into some detail, though the unicode errors make it hard to read on my display. XP
[20:45:25] <GothAlice> "Â" all over the place.
[20:48:37] <StephenLynx> GothAlice, how do you sort pinned threads in your forum? If I use two sorts in the aggregation, it will mess up sorting all unpinned threads. If I use one sort with the two fields, it will not put the pinned on top.
[20:49:03] <GothAlice> Pinned threads by definition only matter on the first "page" of thread results, to my forums. Thus: separate query.
[20:49:45] <StephenLynx> what if you have more pinned threads than threads per page?
[20:49:54] <GothAlice> I technically don't have pages.
[20:50:01] <GothAlice> It's an infinite scroller that just "continues where it left off".
[20:50:17] <GothAlice> If it left off mid-query on the pinned query, it'll continue from there.
[20:51:16] <GothAlice> Having more than a "page" of pinned threads, though, utterly defeats the purpose of thread pinning and obstructs normal usage by hiding ordinary threads.
[20:52:10] <GothAlice> (But we also do extensive "new since the last time you visited" checking and highlighting on the interface, so even if you refresh the "first page" of nothing but pinned threads, if an ordinary thread was updated, you'll still be able to see that fact.)
[20:53:10] <StephenLynx> I think I don't have a choice but to perform a separate query.
[20:53:38] <StephenLynx> either way it doesn't work in one of the cases.
[20:53:40] <GothAlice> {pinned: -1, modified: -1} doesn't work?
[20:53:57] <StephenLynx> breaks when all of them are not pinned.
[20:54:15] <GothAlice> … they'd all have the same value and it'd rely entirely on the modified order, wouldn't it?
[20:54:19] <StephenLynx> wait
[20:54:22] <StephenLynx> my mistake.
[20:54:26] <GothAlice> Same pinned value, that is.
[20:54:28] <StephenLynx> it was non-reloaded code.
[20:54:36] <StephenLynx> it does work.
[20:54:46] <StephenLynx> I had tried and got a false negative because of that :v
[20:54:52] <GothAlice> ^_^
[20:57:11] <GothAlice> In my forums, pinned are rendered separately from ordinary, too, so double reason for two queries, but in the general case, yeah, sort is good.
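
The single-query version that ended up working for StephenLynx, roughly (collection name and filter are assumptions; a descending sort on a boolean puts pinned threads first):

    db.threads.find({ board: "b" })
              .sort({ pinned: -1, modified: -1 })  // pinned first, then newest first
              .limit(20)
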
[21:09:57] <kopasetik> anyone here do mongodb in nodejs with promises?
[21:12:11] <kba> kopasetik: I use mongoose, so I guess that uses promises
[21:18:46] <GothAlice> kba: It doesn't.
[21:19:19] <GothAlice> Hmm, okay, maybe it sorta does. I've never seen a support example that made overt use of it, I guess.
[21:20:09] <GothAlice> That explains that end() issue. XD
[21:21:37] <kba> It uses callbacks, which are very close to the same thing
[21:22:12] <GothAlice> kba: https://en.wikipedia.org/wiki/Futures_and_promises :)
[21:22:23] <kba> I know what promises are
[21:23:11] <GothAlice> A callback is less than a weak promise. :P
[21:50:49] <ruffyen> anyone know if there are any intentions to support raspberrypi with official MongoDB builds?
[21:53:10] <GothAlice> ruffyen: Won't happen.
[21:53:17] <ruffyen> really?
[21:53:30] <GothAlice> rPi fails on endianness and 64-bit.
[21:53:37] <GothAlice> Endianness being the big one.
[21:53:56] <rendar> GothAlice: shouldn't endianness be abstracted away?
[21:54:32] <ruffyen> hmm
[21:54:34] <ruffyen> that sucks
[21:54:44] <ruffyen> it would be pretty awesome to run on my rpi-v2
[21:55:38] <GothAlice> ruffyen: MongoDB uses memory-mapped files extensively, thus no, very difficult to abstract away while keeping performance.
[21:55:47] <ruffyen> yeah
[21:56:38] <rendar> GothAlice: you mean that MongoDB uses mmap to memory-map files and writes binary data (e.g. integers) into files, which are for example in little endian? What do you mean exactly?
[21:57:14] <GothAlice> rendar: Correct. Even things you might not expect to be integers are integers, such as ObjectIds, which contain four, I think?
[21:57:24] <rendar> yeah right
[21:57:25] <Progz> hello
[21:57:55] <rendar> GothAlice: the strange thing I can't get is: even if the rasp is big-endian, the I/O of that data will be big-endian on the rasp file system, so what? :)
[21:57:57] <GothAlice> rendar: Basically, having the wrong endianness won't just garble your data files to be non-portable off that architecture. It'll also garble the BSON wire protocol.
[21:58:07] <rendar> oh, ok
[21:58:21] <rendar> so BSON can't be big-endian, just little-endian
[21:58:44] <GothAlice> It's engineered for one, and in a single codebase supporting both is, as mentioned, not simple.
[21:58:50] <rendar> ok
[21:59:06] <rendar> so BSON is the "bottleneck" here
[21:59:30] <GothAlice> Well, the codebase is a bottleneck.
[21:59:55] <GothAlice> I'd prefer to not have the added complexity, and thus bugs, that broad support for edge case platforms would bring, too. :P
[22:00:16] <rendar> :)
[22:00:30] <rendar> GothAlice: it seems you know MongoDB internals pretty well
[22:00:45] <GothAlice> rendar: I read through it over coffee in the mornings.
[22:01:03] <GothAlice> (No joke. I do very much enjoy reading others' code.)
[22:01:16] <rendar> lol
[22:01:29] <rendar> i'll show you my code someday
[22:01:40] <rendar> i hope you'll enjoy
[22:03:23] <Progz> Hello, I've got trouble with a simple aggregate query. I $match on a domain name and sum the number of items. I have an index but it doesn't seem to help. It's pretty slow: 3 to 5 seconds.
[22:03:31] <Progz> => Index and query : http://pastebin.com/qytFhU6T
[22:04:00] <GothAlice> Progz: Is that part of a larger aggregate pipeline?
[22:04:11] <GothAlice> If not, don't use aggregates for that type of query.
[22:04:25] <Progz> GothAlice: I don't understand, I am just doing this command.
[22:04:32] <Progz> *query
[22:04:32] <GothAlice> db.collection.find({domain: "toto.com"}).count() < should give you the answer near-instantly.
[22:04:44] <GothAlice> Oh, sorry, missed the $sum.
[22:04:51] <Progz> ^^
[22:04:55] <GothAlice> (Wasn't just a $sum: 1)
[22:05:04] <GothAlice> Ah, project.
[22:05:22] <Progz> db.collection.find({domain: "toto.com"}).count() => this would return the number of pages in my website
[22:05:29] <GothAlice> [{$match: {…}}, {$project: {nb_items: 1}}, {$group: {…}}]
[22:05:38] <GothAlice> You don't want MongoDB throwing whole records around if it can avoid it.
[22:05:56] <Progz> GothAlice: I tried but nothing different.
[22:06:03] <Progz> I will retry maybe I mistaken
[22:06:03] <GothAlice> What's the result of an explain?
[22:06:12] <Progz> ??? explain on an aggregate?
[22:06:48] <GothAlice> http://docs.mongodb.org/manual/reference/method/db.collection.aggregate/#db.collection.aggregate < first documented option.
[22:06:50] <Progz> How can you explain an aggregate? I was told it wasn't possible :s
[22:07:32] <Progz> GothAlice: thanks. Trying
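
The shape of that explain call, with the elided stages filled in from fields mentioned elsewhere in the log (the Page collection name comes from later in the conversation):

    db.Page.aggregate(
        [
            { $match: { domain: "toto.com" } },
            { $project: { nb_items: 1, _id: 0 } },
            { $group: { _id: null, total: { $sum: "$nb_items" } } }
        ],
        { explain: true }
    )
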
[22:07:37] <bmillham> How many times have you had coffee coming out of your nose after reading someone else's code GothAlice ;-) (And Hi, long time no see :-) )
[22:08:02] <GothAlice> bmillham: Howdy! Indeed, long time. Rarely. It takes a truly special gem of code to faze me, these days.
[22:08:16] <GothAlice> <? include($_GET['page']); ?> would qualify. >:P
[22:09:06] <rendar> GothAlice: sometimes it happens also to me
[22:09:31] <GothAlice> Expulsion of coffee via the sinus?
[22:09:51] <bmillham> lol
[22:10:05] <rendar> if i read void *** p; ? yes.
[22:10:10] <GothAlice> Ha.
[22:10:12] <rendar> :)
[22:10:38] <rendar> lol
[22:10:57] <GothAlice> (When you have a linked list that's also a packed array that's also a ring buffer… you're in for fun times.)
[22:11:02] <rendar> if you think you abuse pointers, i just did a pointer-harassment
[22:11:58] <Progz> GothAlice: http://pastebin.com/UtiFmMSH
[22:12:02] <rendar> union { void * ptr; uintptr_t integer; } u; u.ptr = some_pointer(); u.integer &= ~mask; use_the_pointer(u.ptr);
[22:12:10] <rendar> ftw.
[22:12:14] <GothAlice> Nice.
[22:13:09] <GothAlice> Progz: So MongoDB chose a different index than the one you pasted. :)
[22:13:21] <Progz> indeed
[22:13:32] <Progz> I just thought $group was not using an index
[22:13:33] <GothAlice> In fact, it chose an index that doesn't "cover" the query, even though one is available. (Likely because it's an aggregate and can't actually benefit from index query covering.)
[22:13:40] <GothAlice> Oh, $group is certainly not using an index.
[22:13:42] <GothAlice> It can't.
[22:13:44] <Progz> so the index chosen by mongo is smaller than the other one
[22:14:01] <GothAlice> $match, $order, and $limit, if not preceded by anything other than each-other, can use an index.
[22:14:16] <Progz> ^^
[22:14:16] <GothAlice> Er, $sort, sorry.
[22:14:42] <Progz> so it's impossible to sum my items quickly.
[22:14:59] <GothAlice> Do it application-side.
[22:15:05] <GothAlice> Using a covered query. :)
[22:15:13] <Progz> The best way to do it is to calculate them asynchronously and update the value in my document
[22:15:14] <GothAlice> (Returning _only_ the counts. Not even the _id's.)
[22:16:07] <GothAlice> var count = 0; db.foo.find({}, {nb_items: 1, _id: 0}).forEach(function(doc) { count += doc.nb_items; }) — I'd see how quick that is compared to the aggregate, for you. (Running an explain, or hinting, to use the index that includes nb_items.)
[22:16:13] <Progz> Some websites can have more than 30,000 pages
[22:16:44] <GothAlice> Then consider: either you do the loop, or MongoDB is doing the loop, but someone, somewhere is spending the time to loop over all that.
[22:17:00] <Progz> I have got multiple domains in the same Page collection.
[22:17:02] <GothAlice> So pre-aggregation of the counts becomes a much more viable alternative to figuring out the total later.
[22:18:25] <Progz> GothAlice: ok GothAlice
[22:19:17] <GothAlice> Progz: The trick is that whenever you update nb_items on a document, also update the nb_total or whatever on some document somewhere that represents the "site" or other grouping you want to get stats on.
[22:19:19] <Progz> Thanks a lot
[22:20:50] <Progz> GothAlice: indeed. Alongside the Page collection, we have the Domain collection with information about each domain. I will add nb_total to the domain document.
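
A sketch of the pre-aggregation GothAlice describes, keeping a running total on the Domain document (field names follow the conversation; the page selector and delta are hypothetical):

    var delta = 5;  // hypothetical change in one page's nb_items
    db.Page.update(
        { domain: "toto.com", url: "/some/page" },  // hypothetical page selector
        { $inc: { nb_items: delta } }
    );
    // mirror the same change on the domain's running total
    db.Domain.update(
        { domain: "toto.com" },
        { $inc: { nb_total: delta } }
    );
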
[22:24:19] <Progz> Thanks
[22:24:20] <Progz> bye bye