[01:11:42] <MrAmmon> is there a simple way to create a collection in mongo from a subset of documents in another collection? For example, if I just want to create a table of 1000 documents from my collection of millions so I can test an index?
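One way to do this in the shell, sketched with hypothetical collection names: an aggregation with $limit and $out copies a slice of the source collection into a new collection you can index freely.

```javascript
// Copy the first 1000 documents into a scratch collection for index testing;
// collection and field names here are hypothetical.
db.bigCollection.aggregate([
  { $limit: 1000 },
  { $out: "testSample" }
]);
db.testSample.ensureIndex({ someField: 1 }); // now experiment with the index
```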
[03:40:06] <loth> Hey all, getting an "exception: need most members up to reconfigure, not ok" error even though I can reach the node I'm adding just fine https://pastebin.mozilla.org/8836929
[03:51:42] <joannac> loth: you are piping that command to 10.0.0.1
[03:51:59] <joannac> connect to that host, and verify that host can connect to 10.0.128.107:27017
[03:57:00] <loth> joannac: which host should I connect from? 10.0.0.1 = cpl001
[03:58:29] <loth> which is where i ran rs.initiate()
[03:59:16] <madeti> why is this not working: find({'someArrayField': {$elemMatch: {$in: [{'k1': 'v1', 'k2': 'v2'}, {'k1': 'b1', 'k2': 'b2'}] } } })
[03:59:31] <madeti> cant I use $elemMatch with $in ?
[04:03:46] <Boomtime> @madeti: there are a few curious things going on in your query there.. first, you only have 1 predicate to the $elemMatch, so $elemMatch is superfluous
[04:03:57] <Boomtime> secondly, the $in looks suspiciously odd
[04:04:09] <Boomtime> perhaps you should start by explaining what you are trying to do
[04:05:21] <madeti> Boomtime: 'someArrayField' is a field in my document which has an array of objects/dicts,
[04:05:32] <madeti> so I want to find all documents which have
[04:06:35] <madeti> either one of the objects from the array [{'k1': 'v1', 'k2': 'v2'}, {'k1': 'b1', 'k2': 'b2'}] in 'someArrayField'
[04:07:28] <madeti> my array can have several objects
[04:08:36] <joannac> loth: open a shell to 10.0.0.1 and run db.serverStatus().host
[04:16:41] <Boomtime> @madeti: $elemMatch applies to its own parameters, not to the parameters of operators nested deeper in - so you actually need each 'or' clause to be: { $elemMatch: {'k1': 'v1', 'k2': 'v2'} }
[04:17:06] <Boomtime> this also implies that $in is not the right operator, since it expects to apply direct matching
[04:17:40] <Boomtime> i.e. you want an arrangement more like this: $or: [ $elemMatch: {}, $elemMatch: {} ]
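Spelled out fully, Boomtime's suggestion looks roughly like this (collection name hypothetical, field and values taken from madeti's example):

```javascript
db.coll.find({
  $or: [
    { someArrayField: { $elemMatch: { k1: 'v1', k2: 'v2' } } },
    { someArrayField: { $elemMatch: { k1: 'b1', k2: 'b2' } } }
  ]
});
```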
[09:18:04] <markand> it seems that mongo-c-driver 1.1.7 lacks the content of src/libbson
[09:18:12] <markand> and we cannot add an issue on github
[09:28:20] <circ-user-22hxs> I have a collection with 2 million documents, and I'd like to move documents older than 6 months to a backup collection. How can I do this efficiently?
[09:29:51] <Lujeni> circ-user-22hxs, you can dump the documents which need a backup, then remove them with a basic query?
[09:31:23] <circ-user-22hxs> @Lujeni: it's multiple documents, around 1 million, so I'm wondering what's the best way to do this.
[09:35:15] <Lujeni> circ-user-22hxs, 1 million is not a big deal. You can simply remove them. Your database lock will increase during this operation
[09:38:15] <circ-user-22hxs> @Lujeni : I'd like to move them to a backup collection, not remove them.
[09:38:28] <mtree> guys, how do I set a key from a value in an aggregate query?
[09:38:53] <mtree> I need to store each unwound value under its specific key
[09:39:43] <mtree> $project: { '$scrapes.ota': '$scrapes.ota.price' } gives me an error:
[09:40:01] <mtree> "errmsg" : "exception: the operator must be the only field in a pipeline object (at '$scrapes.ota'",
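That error is because field names in a $project stage cannot be computed from document values. A common workaround, sketched here with a hypothetical collection name and field names guessed from mtree's snippet, is to emit explicit key/value pairs instead:

```javascript
db.listings.aggregate([
  { $unwind: "$scrapes" },
  { $project: { _id: 0, key: "$scrapes.ota", value: "$scrapes.ota.price" } }
]);
```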
[09:40:45] <Lujeni> circ-user-22hxs, move = delete in your case
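A minimal copy-then-remove sketch, assuming a createdAt date field and hypothetical collection names (mongodump with --query is another option for the backup half):

```javascript
var cutoff = new Date();
cutoff.setMonth(cutoff.getMonth() - 6);
db.events.find({ createdAt: { $lt: cutoff } }).forEach(function (doc) {
  db.events_backup.insert(doc); // copy first...
});
db.events.remove({ createdAt: { $lt: cutoff } }); // ...then delete
```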
[11:01:29] <GoCrazy> I want to ask about a better way to implement autocomplete using MongoDB. Currently I am using a simple regex, and I want to make it more efficient.
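One commonly cited first step, sketched with hypothetical names: a left-anchored, case-sensitive regex can use an ordinary index on the field, whereas unanchored or case-insensitive regexes have to scan every key.

```javascript
db.products.ensureIndex({ name: 1 });
db.products.find({ name: /^smartph/ }).limit(10); // prefix match, index-friendly
```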
[15:45:05] <StephenLynx> is it just me, or were the binaries reduced in size?
[15:45:17] <StephenLynx> I remember a full update would clock in around 80MB, but it's 50MB this time around
[17:04:07] <BadHorsie> I have a collection ipNets with entries like: {"cidrs":{"10_0_0_1/24":{"ip":{"10_0_0_2":{name:"mygw"}}}}}. I'm curious how to search for {"name":"mygw"}; I usually do something like "cidrs.<range>.ip.<ip>.name", but in this case I don't know the <range> or <ip> at query time...
[17:06:15] <StephenLynx> why did you nest it like that?
[17:06:31] <StephenLynx> instead of making cidrs an array and the ip a field of the objects inside the array?
[17:06:52] <StephenLynx> "10_0_0_1/24" what does this mean?
[17:07:13] <StephenLynx> why did you nest everything?
[17:08:22] <BadHorsie> We chose that over "cidrs":[{"range":"10.0.0.0/24","ips":[{"ip":"10.0.0.1","name":"mygw"}]}] because it was kinda hard for updates: the find would return only one positional operator and you would need that trick of knowing the item number beforehand...
[17:08:34] <BadHorsie> That doesn't mean that both are not completely wrong lol
[17:08:57] <BadHorsie> TBH we are just learning and trying to find a reasonable model and failing at doing so, part of learning I guess
[17:11:20] <StephenLynx> dynamic keys are one of the worst practices in any database.
[17:12:36] <StephenLynx> just whatever you do, don't use dynamic keys.
[17:12:58] <BadHorsie> Interesting, so have an array of IPs per keyed range... cidrs in your example is "10_0_0_0/24" and there are multiple cidrs expected, right?
[17:14:25] <StephenLynx> there are? I don't know enough about networking to figure out your requirements just from what I've understood so far.
[17:15:00] <StephenLynx> in that example cidrs should be an array too, when I think about it.
[17:15:18] <StephenLynx> if you can have multiples of it per document.
[17:15:40] <StephenLynx> "cidrs":[{"range":"10.0.0.0/24","ips":[{"ip":"10.0.0.1","name":"mygw"}]}] is much better
[17:15:42] <StephenLynx> and is not hard to update
[17:16:02] <StephenLynx> because you can match on the query block and use $ on the update block.
[17:16:33] <StephenLynx> cidrs.$.ips.$.name: value
[17:16:53] <StephenLynx> given that you matched the correct cidrs and ip on the query block.
[17:17:05] <StephenLynx> you can either use subarrays or split on a separate collection.
[17:17:12] <StephenLynx> each approach has its own limitations.
[17:18:15] <StephenLynx> it depends more on how you will query it rather than how you will update it.
[17:19:30] <StephenLynx> if you want to query the IPs from multiple ranges along with data pertinent to the range, you will have to perform an additional query to get the range documents after obtaining the IP documents, and then manually append the range data to the IPs to fake a join.
[17:19:47] <BadHorsie> How would you build the update instruction? I have db.ipNets.update({"cidrs.ips.ip":"10.0.0.1"},{$set:{"cidrs.$.ips.0.name":"mygw"}}); which works in my case because I know beforehand that it's item zero...
[17:19:47] <StephenLynx> that is if you use separate documents.
[17:20:35] <StephenLynx> you could just use $ there too, if I am not mistaken.
[17:21:19] <BadHorsie> So I use $ twice and it knows which array item number I mean? AFAIK there is a JIRA ticket for it to work with $1, $2 and so on... but so far it's not possible...
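BadHorsie is right here: only one positional $ is resolved per update document, so the inner array index must be known in advance. The working form, tidied up (values from the example above):

```javascript
// The single $ comes from the query match; the inner array position
// (0 here) has to be known beforehand:
db.ipNets.update(
  { "cidrs.ips.ip": "10.0.0.1" },
  { $set: { "cidrs.$.ips.0.name": "mygw" } }
);
```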
[17:25:02] <GothAlice> asturel: Due to use of extended characters, you may be forced to save a second copy of the field, lower-case'd, and query on that.
[17:26:44] <asturel> http://docs.mongodb.org/manual/reference/operator/aggregation/strcasecmp/ something like this doesn't work?
[17:27:00] <StephenLynx> Can't find it. But GothAlice should be able to confirm, you are able to use multiple $ to indicate indexes in an update, right?
[17:27:16] <StephenLynx> like farms.$.cows.$ : newAmount
[17:27:19] <GothAlice> asturel: That would be the least efficient thing you could possibly do.
[17:27:32] <GothAlice> asturel: Though technically it'd work… ish.
[17:27:43] <GothAlice> asturel: Would your system let me use "🚸" as a username? (Yes, there's a unicode symbol between those quotes.)
[17:27:59] <StephenLynx> like farms.$.cows.$.name : name would be a better example
[17:29:01] <GothAlice> https://en.wikipedia.org/wiki/Unicode_equivalence and http://www.unicode.org/faq/normalization.html should be interesting to you.
[17:29:19] <GothAlice> No, I mean, that type of query can't use an index. It has to load and iterate all possible matching records.
[17:33:04] <GothAlice> Text normalization isn't easy. ;P
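A sketch of the lower-cased shadow field GothAlice describes, with hypothetical field and collection names; note that toLowerCase() alone does not address the Unicode normalization issues linked above.

```javascript
var name = "Ωmega";
db.users.insert({ name: name, name_lower: name.toLowerCase() });
db.users.ensureIndex({ name_lower: 1 });
db.users.find({ name_lower: "ωmega" }); // exact match, can use the index
```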
[18:46:25] <fllr> Hey guys! I'm running into a weird issue with mongo. As soon as I set up a user using db.createUser(), it doesn't authenticate me anymore
[19:51:14] <fllr> Hey guys! I'm running into a weird issue with mongo. As soon as I set up a user using db.createUser() with the following roles: ['clusterAdmin', 'readWriteAnyDatabase', 'userAdminAnyDatabase', 'dbAdminAnyDatabase'] (in the admin db), it doesn't authenticate me even if I run db.auth() (in the database I want to use)
[19:51:45] <StephenLynx> have you tried putting the authentication data in the connection string?
[19:52:17] <StephenLynx> like user:password@address:port/database
[19:52:28] <fllr> StephenLynx: Yeah... I still get refused...
[19:54:14] <GothAlice> fllr: Yeah, then you need to tell MongoDB's tools which database to connect to first and authenticate with, via the --authenticationDatabase option.
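From the command line that is `mongo mydb -u myuser -p --authenticationDatabase admin`; inside the shell, the equivalent (credentials and database names hypothetical) is:

```javascript
var conn = new Mongo("localhost:27017");
conn.getDB("admin").auth("myuser", "mypassword"); // auth where the user was created
var appDb = conn.getDB("mydb"); // authorized via the *AnyDatabase roles
```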
[20:34:03] <kopasetik> QUESTION: for some reason i cannot save to my users collection with this code: https://gist.github.com/kopasetik/5cc8ea545b4179277886
[20:40:58] <GothAlice> deathanchor: In general, issuing multi-value array queries will multiply the "difficulty" of a given plan by the number of values, i.e. worst-case it can't use an index and needs to manually compare each value passed in against the actual records. With an index, it still compares each, but these are much faster btree lookups. (Not hashtable lookups, but one step back from that in terms of performance.) The more values, the slower.
[20:41:15] <GothAlice> Getting an .explain() of the queries can be helpful. :)
[20:45:15] <GothAlice> deathanchor: http://codyaray.com/2014/11/mongo-multi-key-index-performance goes into some detail, though the unicode errors make it hard to read on my display. XP
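A sketch of the check GothAlice suggests, with hypothetical names; explain() shows whether the multi-key index is chosen and how many documents are scanned as the $in list grows.

```javascript
db.records.ensureIndex({ tags: 1 });
db.records.find({ tags: { $in: ["a", "b", "c"] } }).explain();
```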
[20:48:37] <StephenLynx> GothAlice, how do you sort pinned threads in your forum? If I use two sorts in the aggregation, it will mess up the sorting of all unpinned threads. If I use one sort with the two fields, it will not put the pinned ones on top.
[20:49:03] <GothAlice> Pinned threads by definition only matter on the first "page" of thread results, to my forums. Thus: separate query.
[20:49:45] <StephenLynx> what if you have more pinned threads than threads per page?
[20:49:54] <GothAlice> I technically don't have pages.
[20:50:01] <GothAlice> It's an infinite scroller that just "continues where it left off".
[20:50:17] <GothAlice> If it left off mid-query on the pinned query, it'll continue from there.
[20:51:16] <GothAlice> Having more than a "page" of pinned threads, though, utterly defeats the purpose of thread pinning and obstructs normal usage by hiding ordinary threads.
[20:52:10] <GothAlice> (But we also do extensive "new since the last time you visited" checking and highlighting on the interface, so even if you refresh the "first page" of nothing but pinned threads, if an ordinary thread was updated, you'll still be able to see that fact.)
[20:53:10] <StephenLynx> I think I have no choice but to perform a separate query.
[20:53:38] <StephenLynx> either way it doesn't work in one of the cases.
[20:57:11] <GothAlice> In my forums, pinned are rendered separately from ordinary, too, so double reason for two queries, but in the general case, yeah, sort is good.
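The two-query approach under discussion might look like this (collection, field names, and page size all hypothetical):

```javascript
var pageSize = 20;
var pinned = db.threads.find({ pinned: true })
                       .sort({ lastPostAt: -1 }).toArray();
var room = Math.max(pageSize - pinned.length, 1); // limit(0) would mean "no limit"
var ordinary = db.threads.find({ pinned: { $ne: true } })
                         .sort({ lastPostAt: -1 }).limit(room).toArray();
```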
[21:09:57] <kopasetik> anyone here do mongodb in nodejs with promises?
[21:12:11] <kba> kopasetik: I use mongoose, so I guess that uses promises
[21:56:38] <rendar> GothAlice: you mean that MongoDB uses mmap to memory-map files and writes binary data (e.g. integers) into those files, which are for example in little endian? What do you mean exactly?
[21:57:14] <GothAlice> rendar: Correct. Even things you might not expect to be integers are integers, such as ObjectIds, which contain four, I think?
[21:57:55] <rendar> GothAlice: the strange thing I can't get is: even if the rasp is big endian, the I/O of that data will be big endian on the rasp's file system, so what? :)
[21:57:57] <GothAlice> rendar: Basically, having the wrong endianness won't just garble your data files to be non-portable off that architecture. It'll also garble the BSON wire protocol.
[22:03:23] <Progz> Hello, I've got trouble with a simple aggregate query. I match on a domain name and sum the number of items. I have an index, but it doesn't seem to help. It's pretty slow: 3 to 5 seconds.
[22:03:31] <Progz> => Index and query : http://pastebin.com/qytFhU6T
[22:04:00] <GothAlice> Progz: Is that part of a larger aggregate pipeline?
[22:04:11] <GothAlice> If not, don't use aggregates for that type of query.
[22:04:25] <Progz> GothAlice: I don't understand, I am just running this one command.
[22:07:37] <bmillham> How many times have you had coffee coming out of your nose after reading someone else's code GothAlice ;-) (And Hi, long time no see :-) )
[22:08:02] <GothAlice> bmillham: Howdy! Indeed, long time. Rarely. It takes a truly special gem of code to faze me, these days.
[22:08:16] <GothAlice> <? include($_GET['page']); ?> would qualify. >:P
[22:09:06] <rendar> GothAlice: sometimes that happens to me too
[22:09:31] <GothAlice> Expulsion of coffee via the sinus?
[22:13:32] <Progz> I just thought $group was not using an index
[22:13:33] <GothAlice> In fact, it chose an index that doesn't "cover" the query, even though one is available. (Likely because it's an aggregate and can't actually benefit from index query covering.)
[22:13:40] <GothAlice> Oh, $group is certainly not using an index.
[22:15:13] <Progz> The best way to do it is to calculate them asynchronously and update the value in my document
[22:15:14] <GothAlice> (Returning _only_ the counts. Not even the _id's.)
[22:16:07] <GothAlice> var count = 0; db.foo.find({}, {nb_items: 1}).forEach(function(doc) { count += doc.nb_items; }); — I'd see how quick that is compared to the aggregate, for you. (Running an explain, or hinting, to use the index that includes nb_items.)
[22:16:13] <Progz> Some websites can have more than 30,000 pages
[22:16:44] <GothAlice> Then consider: either you do the loop, or MongoDB is doing the loop, but someone, somewhere is spending the time to loop over all that.
[22:17:00] <Progz> I have multiple domains in the same Page collection.
[22:17:02] <GothAlice> So pre-aggregation of the counts becomes a much more viable alternative to figuring out the total later.
[22:19:17] <GothAlice> Progz: The trick is that whenever you update nb_count on a document, also update the nb_total or whatever on some document somewhere that represents the "site" or other grouping you want to get stats on.
[22:20:50] <Progz> GothAlice: indeed. Along with the Page collection, we have the Domain collection with information about each domain. I will add nb_total to the domain document.
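The pre-aggregation pattern GothAlice describes, sketched with hypothetical field names; delta is the change in a page's item count and is applied to both documents:

```javascript
var delta = 3;
db.Page.update({ url: "http://example.com/a" }, { $inc: { nb_items: delta } });
db.Domain.update({ name: "example.com" }, { $inc: { nb_total: delta } });
```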