[03:20:59] <bakhtiyor> anyone there? can someone help with an aggregation query?
[03:22:06] <Boomtime> @bakhtiyor: hi there, ask your question - if an agg is not working as you expect, put your current attempts in a pastebin or equivalent with an explanation of what you expect
[03:22:31] <Boomtime> i can't guarantee anyone will know, but you'll need to do this step before anyone can try
[03:28:00] <bakhtiyor> need to cross-check by the "key" field on the same collection, meaning the same "value" exists for both the "a" and "b" keys
[03:47:13] <bakhtiyor> is it possible with aggregation without any script?
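A minimal sketch of one way to do this kind of cross-check with the aggregation pipeline alone, assuming documents shaped like { key: "a", value: ... } in a collection called items (both are guesses): group by value and keep only the values seen under both keys.

    db.items.aggregate([
      { $match: { key: { $in: ["a", "b"] } } },
      { $group: { _id: "$value", keys: { $addToSet: "$key" } } },
      { $match: { keys: { $all: ["a", "b"] } } }
    ])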
[07:00:48] <fleetfox> Hello. I'm archiving with mongodump --quiet --archive > file.archive and getting Failed: corruption found in archive; ParserConsumer.BodyBSON() ( Document is corrupted )
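For reference, mongodump can also write the archive straight to a file with --archive=<file> instead of relying on shell redirection; whether that sidesteps this particular corruption is only a guess, but it takes the shell out of the picture.

    mongodump --quiet --archive=file.archive
    mongorestore --archive=file.archive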
[09:07:07] <ams_> Derick: Yeah that's actually a log line from when our server didn't have enough memory. But we've got a bunch more, these look like someone didn't add an index
[09:08:31] <kurushiyama> ams_: No news is good news. If a query makes it to the logs, that's bad news. My guess would have been disk IO, though, as per timeAcquiringMicros.
[09:16:38] <ams_> yeah, it's a bit confusing because when a line is too long to log it gets upgraded to a W and gets an explicit message. I did notice the "log slow queries" config item but assumed it would do the same. Not a big deal, though.
[12:02:16] <mroman> I assume that things can go wrong if you allow a user to upload a JSON document?
[12:16:07] <lipiec> Is there a way to set up mongos to read only from hidden replica members?
[12:40:57] <mroman> The use case is roughly that users can upload JSON documents which are then stored in MongoDB
[12:41:45] <kurushiyama> lipiec: Hidden members are not exposed and you can only connect to them directly.
[12:41:47] <mroman> i.e. a user uploads {"x":33,"y":38, "info" : { "r" : 99, "kind" : "CT" }}
[12:42:06] <kurushiyama> mroman: Unchecked input is the best way to get you into trouble.
[12:42:09] <mroman> which is converted to BSON and inserted.
[12:42:44] <kurushiyama> mroman: The problem is not what problems you can think of, but the problems you do not think of.
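One way to put at least a basic check on such uploads, assuming MongoDB 3.2+ document validation and the field names from the example above (the collection name is a placeholder), is a validator on the target collection. This only constrains shape; it does not replace application-side checks on size and content.

    db.createCollection("uploads", {
      validator: {
        x: { $exists: true },
        y: { $exists: true },
        "info.kind": { $type: "string" }
      },
      validationAction: "error"
    })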
[14:33:50] <Derick> MongoDB\Client comes from the PHP library - have you installed that (through composer)? You also need to require its libraries, as is shown at: http://mongodb.github.io/mongo-php-library/getting-started/
[14:34:23] <cpama> yes Derick i believe i have. but i will double check. checking phpinfo...
[14:34:33] <Derick> you can use *either* \MongoDB\Driver\Manager for low-level stuff, or \MongoDB\Client for a nicer API (recommended). The latter requires the MongoDB Library
[14:34:46] <Derick> cpama: phpinfo() will not tell you about PHP libraries installed through composer, only PHP extensions
[14:35:28] <cpama> my sys admin set up this box for us. and I was just "told" it had php 7 and latest mongodb driver.
[14:36:05] <Derick> cpama: sure, and you should be able to use composer yourself to install the MongoDB Library for PHP, as is described at http://mongodb.github.io/mongo-php-library/getting-started/
[14:36:56] <cpama> hm. This server is running Alpine Linux. Will check it out
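For reference, the library install described on that getting-started page boils down to a single composer command run in the project directory; the extension itself still has to be present (which is what phpinfo() shows).

    composer require mongodb/mongodb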
[16:59:19] <dgaff> Hey all - quick question about indexes - if I query a collection of a billion documents using four separate indexes in the query (e.g. where field1, field2, field3, field4), is it possible to get a speedup by concatenating the fields and making a single index (e.g. where field5, and field5 is just field1, field2, field3, field4 put together)?
[17:05:07] <rbpd5015> Is there any way I can avoid using aggregation by doing it in the update?
[17:05:11] <dgaff> is an $or operator worse than an $in for that case?
[17:05:18] <rbpd5015> Aggregation is killing my CPU
[17:05:27] <kurushiyama> dgaff: Still I do not understand completely. You are searching for 1k combinations of field1-4 in a single query?
[17:05:33] <Derick> so, with that index, it will be used if your query compares with parent_author alone, parent_author+reply_author, parent_author+reply_author+subreddit, or parent_author+reply_author+subreddit+time
[17:05:48] <Derick> not for, for example, just reply_author, or reply_author+subreddit+time
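A sketch of the compound index Derick is describing, using the field names from the discussion (the collection name is a guess); only queries that start at the leftmost field(s) of the index can use it.

    db.comments.createIndex({ parent_author: 1, reply_author: 1, subreddit: 1, time: 1 })

    // can use the index (left prefix):
    db.comments.find({ parent_author: "abc", reply_author: "xyz" })

    // cannot use it efficiently (does not include the leftmost field):
    db.comments.find({ reply_author: "xyz", subreddit: "blah" })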
[17:07:03] <Derick> dgaff: why are you doing this? what's the result you want to get?
[17:07:34] <kurushiyama> rbpd5015: Could you explain with an example? Please pastebin it.
[17:09:27] <dgaff> @Derick it's a long sad story. Right now, I have 300k sets, each of 1,000 ids, each corresponding to a comment on reddit, stored in a separate collection. For each of the 300k sets, I have to look up the 1,000 comments, look up the potentially up to 1,000 comments they were a reply to, and then I have 0-1,000 objects that look like {parent_author: "abc", reply_author: "xyz", subreddit: "blah", time: TimeObject}
[17:10:20] <rbpd5015> I am getting updates to player stats every second via Kafka and the players are being updated just fine
[17:10:33] <dgaff> @Derick - the output is a graph of all interactions on reddit ever
[17:10:50] <rbpd5015> the problem I am having is that the aggregation summing all players' fantasy points up to the root-level lineup fantasy points is killing our system
[17:10:50] <Derick> dgaff: sounds like you'd want a graph database instead? :-)
[17:11:10] <dgaff> I was down to 4k jobs, and then I had a brownout at my house, which fried my hard drive, and now I'm back to a backup of those tasks from two months ago
[17:11:51] <dgaff> @Derick - I looked into neo and cassandra and even sql/postgres for a few weeks before I went with mongo for this - none of them were really that great for this job
[17:12:21] <dgaff> I may move subsets of the graph into things like that once the full list of edges is finished though
[17:12:37] <rbpd5015> the document I showed a screenshot of is a fantasy sports lineup that consists of 9 players
[17:12:59] <rbpd5015> each player has fantasy points which need to be totaled in the root of the lineup after every Kafka update
[17:12:59] <dgaff> but either way - how would you optimize the chain of queries that have to happen?
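One way to cut down that chain, assuming collection and field names that are only guesses, is to batch each hop with $in so every set of 1,000 ids costs two queries instead of thousands.

    var ids = [ /* the 1,000 comment ids in one set */ ];
    var comments = db.comments.find({ _id: { $in: ids } }).toArray();
    var parentIds = comments.map(function (c) { return c.parent_id; });
    var parents = db.comments.find({ _id: { $in: parentIds } }).toArray();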
[17:13:20] <rbpd5015> I was wondering how we can do it in the MongoDB update statement, which is faster than aggregation
[17:18:25] <kurushiyama> rbpd5015: That does not help me much. Sample docs, expected output and what you want to do...
[17:19:14] <rbpd5015> sorry, I am looking for a high-level answer; I am just the PO, my devs are stuck.
[17:19:55] <rbpd5015> They are saying they can aggregate and update at the same time, and the for loop is killing them after they update each player stat. I am just trying to come up with a solution for them to research more
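One direction to research, sketched here with an assumed document shape and placeholder field names: instead of re-aggregating the whole lineup, apply the per-player change and adjust the root total with $inc in the same update.

    var delta = newPoints - oldPoints;  // change in this one player's fantasy points
    db.lineups.updateOne(
      { _id: lineupId, "players.playerId": playerId },
      {
        $set: { "players.$.fantasyPoints": newPoints },
        $inc: { totalFantasyPoints: delta }
      }
    )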
[17:20:31] <awal> New to mongo here (but not to programming). I can't find any strong opinions about mongoose on the web. It seems to be forcing the schema thing on me. I am not feeling comfortable with it. And all the "starter" apps for mongo and passport use mongoose only. I am a bit overwhelmed. Where should I go? Use mongoose or not? If not, then where can I get some good examples of usage of the mongodb-native driver with other libs?
[17:20:54] <Derick> awal: always with a new technology, use the raw things first
[17:21:05] <Derick> only that teaches you how it actually works.
[17:21:18] <Derick> If you're comfortable with it, perhaps consider abstraction layers.
[17:21:33] <awal> Derick: right, that's what I want to do. Except I can't find good example code :( I learn from examples best.
[17:22:13] <Derick> awal: https://mongodb.github.io/node-mongodb-native/ are the docs
[17:22:18] <Derick> but I realise that's not an example
[17:23:03] <awal> yeah... docs tell me how to do one very specific thing for every specific thing that can be done. but they don't show me the high-level view of things.
[17:23:35] <Derick> i've asked the author, but seems he's already back to life-mode for today
[17:24:00] <awal> I have used many other nosql and sql databases comfortably, but I am failing at starting with mongo :(
[17:24:13] <Derick> what is the thing you're not getting?
[17:27:00] <awal> Derick: for instance, we need a database of dynamic collections (a new collection is generated for some data of every user). we also keep all users in a separate "users" collection which just contains minimal information. mongoose doesn't let me do dynamic collections (or schemas, since it maps collections to schemas) (or it does, but there are no examples of how to go about structuring such a thing)
[17:29:18] <awal> and I'd also like to integrate passport with mongo, but literally every example for "passport mongo" is actually "passport mongoose". It is a bit irritating :/
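A minimal sketch with the node-mongodb-native driver of that era (2.x API); the per-user collection naming and all identifiers here are illustrations only, not a recommendation.

    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/myapp', function (err, db) {
      if (err) throw err;

      // collections do not need to be declared up front; any name works
      var userId = 'someUser';
      var userData = db.collection('data_' + userId);

      userData.insertOne({ createdAt: new Date(), payload: { x: 1 } }, function (err, result) {
        if (err) throw err;
        db.close();
      });
    });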
[17:30:37] <Derick> i don't really know what passport is
[17:31:10] <awal> passport is the battle tested user authentication middleware for nodejs/express apps.
[17:31:29] <awal> "the" because there is no other alternative :/
[17:44:54] <hipy> I wanted to know if there's a way in MongoDB to implement a customized authentication mechanism
[17:57:19] <kurushiyama> hipy: SASL not good enough for you?
[18:10:54] <hipy> basically we already have a user authentication management system (webserver) where all users, their credentials, roles etc. are registered. If we use MongoDB's auth mechanism we'll have to maintain the same set of users in MongoDB as well.
[18:11:47] <hipy> ...with preferably same credentials as in our main user management system.
[18:12:25] <hipy> We want to avoid this duplication. So we want sort of SSO/single login type of arrangement.
[18:14:00] <hipy> Hence I was thinking that if they allow implementation of a custom authentication mechanism, then I can just forward/relay the user credentials to our main user management system and get back an authentication response.
[18:14:58] <hipy> ok...reading about it....I don't have a very good security background
[18:16:06] <kurushiyama> Well, if you are not good with security, you are probably not the best choice to implement a custom auth mechanism?
[18:17:08] <hipy> kurushiyama: can you please share a link to the MongoDB documentation about what you are referring to.
[18:17:59] <kurushiyama> hipy: It is with LDAP, but you should get the picture: https://docs.mongodb.org/manual/tutorial/configure-ldap-sasl-openldap/
[18:35:12] <hipy> kurushiyama: Here's my understanding about "Authenticate Using SASL and LDAP with OpenLDAP". MongoDB stores username (not password). It'll proxy authentication requests to LDAP server. Currently in our setup there's no LDAP.
[18:35:36] <kurushiyama> saslauthd can be connected to various sources.
[18:36:31] <kurushiyama> hipy: For example PAM, which in turn can auth against... whatever.
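For reference, the proxy-auth shape from that tutorial: MongoDB stores only the username on the $external database and hands the password to saslauthd, so the shell side looks roughly like this (user, role and password are placeholders).

    // on the server, after saslauthd is configured:
    db.getSiblingDB("$external").createUser({
      user: "appUser",
      roles: [{ role: "readWrite", db: "appdb" }]
    })

    // client side, authenticating with the PLAIN mechanism:
    db.getSiblingDB("$external").auth({
      mechanism: "PLAIN",
      user: "appUser",
      pwd: "secret",
      digestPassword: false
    })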
[18:37:40] <hipy> Forgot to mention that everything's on Windows
[18:39:17] <kurushiyama> hipy: Well, running MongoDB on Windows is not exactly what I would call a good choice. The Windows version does not provide SASL anyway, iirc. Given these constraints, the answer to your original question is "Not that I am aware of".
[18:49:24] <hipy> I see. Thanks for the pointers kurushiyama. At least I have a better understanding of the various keywords we discussed.
[18:52:26] <kurushiyama> hipy: the findings in the above link are my first and foremost reason to think it is not the best idea to run MongoDB on Windows.
[19:05:25] <kurushiyama> Derick: No. It would be a bit rude, knowing that people have to migrate to 3.0 first, to skip that version in the ports and jump directly to 3.2, no?
[19:17:39] <kurushiyama> Just a few days ago, somebody tried the same stunt, and it worked. But it was a popcorn and beer session, tbh.
[19:18:22] <kurushiyama> I would not do it for one simple reason: Running production data on an operating system not officially supported by the DBMS vendor is not the best idea to begin with.
[19:19:27] <kurushiyama> The upgrade process as described in the respective docs is tested and basically guarantees a flawless procedure. Everything else... not so much.
[19:19:56] <kurushiyama> Plus: Mongodump and especially restore take AGES when compared to a rolling replset backup.
[19:20:34] <kurushiyama> And it is without downtime.
[19:20:45] <kurushiyama> That pretty much sums my reasons up.
[19:23:25] <Zelest> I'm just curious on how to do it.. Seeing I only have 2 clusters running now.. one on 2.7 and one on 3.2, in stand-by, ready to become the new one..
[19:23:45] <Zelest> Mayhaps I can downgrade it to 3.0 and go from there :)
[19:24:26] <kurushiyama> Zelest: Bad news: I am not sure whether downgrade is supported.
[19:24:43] <Zelest> it is, the new system has no data yet :)
[20:02:21] <Ryzzan> having a doc like this: {_id : 123, someArray : [{_uniqueId : 321, otherArray : ['value1', 'value2']}, {_uniqueId : 654, otherArray : ['value3', 'value4']}]}
[20:02:40] <Ryzzan> how to pull, let's say, 'value3' from otherArray
[20:58:33] <Ryzzan> kurushiyama: There's no way to pull an array element using its index!?
[20:59:00] <kurushiyama> Ryzzan: You can even pull by value. But only the first occurrence.
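For the document above, pulling 'value3' out of the inner array of the element whose _uniqueId is 654 can be done by value with the positional operator (the collection name is a placeholder).

    db.coll.updateOne(
      { _id: 123, "someArray._uniqueId": 654 },
      { $pull: { "someArray.$.otherArray": "value3" } }
    )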
[20:59:57] <Ryzzan> kurushiyama: ok, ty... gonna keep it in mind
[20:59:58] <kurushiyama> Ryzzan: I told you before and I'll say it again: You need to remodel your data. You will run into problems again and again.
[21:00:51] <Ryzzan> kurushiyama: just studying the possibilities... kinda disappointing not being able to pull an array element by index... :)
[21:02:04] <kurushiyama> Ryzzan: Well, it is a matter of attitude. If you try to slice bread with a screwdriver, you might be disappointed by the results as well ;)
[21:09:40] <kurushiyama> Ryzzan: Just doing wikipedia research. Beautiful.
[21:11:07] <Ryzzan> kurushiyama: rio de janeiro is all about "propaganda"... beaches in the northeast are as beautiful as over there... and drinks and food are way cheaper and tastier
[21:12:22] <kurushiyama> Ryzzan: Well, that is an argument. A friend of mine got robbed in Rio. Thank god she was prepared and the guys were kind of nice.
[21:13:34] <Ryzzan> kurushiyama: if you come, consider visiting the northeast, that's where "cariocas" spend their vacations
[21:50:36] <vicatcu> hey all, question - i want to get the results of a collection with a condition, and limit and skip to paginate the results, but i also want to know how many results there are in all
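A common pattern for this is two queries against the same filter, one for the total and one for the page (collection, filter and page size are placeholders).

    var filter = { status: "active" };
    var total = db.items.count(filter);
    var page = db.items.find(filter).sort({ _id: 1 }).skip(40).limit(20).toArray();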
[22:11:33] <Ryzzan> kurushiyama: help me think as a nosql developer... to make a relation between two users in sql, i would have a table with user data and another one with user ids and the relation description... then i would join these tables to deal with info about users and their relations...
[22:11:46] <Ryzzan> kurushiyama: how must I think as a nosql developer?
[22:12:15] <kurushiyama> Ryzzan: Think of the use case.
[22:12:32] <kurushiyama> Ryzzan: Let us say user A follows user B
[22:13:21] <kurushiyama> Ryzzan: You could either have {_id:"A", follows:["B"]} or {_id:"B", followed:["A"]}
[22:13:31] <kurushiyama> Ryzzan: What is the problem?
[22:15:48] <kurushiyama> Ryzzan: It is pretty obvious... ;)
[22:17:20] <Ryzzan> kurushiyama: what if user A has more than one kind of connection with B
[22:18:02] <Ryzzan> kurushiyama: let's say {_id:"A", follows:[{user:"B", connections:["lover", "killer"]}]}
[22:18:07] <kurushiyama> Ryzzan: Well, the more obvious problem is that there is a limitation on how many users A can follow or by how many users B can be followed, but your explanation is correct, too.
[22:18:45] <kurushiyama> Ryzzan: This is because of the 16MB size limit of BSON documents.
[22:25:29] <Ryzzan> kurushiyama: got it... not that easy for me not thinking the sql way... it was a long relationship... but ty for showing me the way... gonna work over it
[22:25:38] <kurushiyama> So, if you want to show user B who loves him, you simply query db.relations.find({user2:"B", relation:"loves"})
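Spelled out, the model behind that query is one document per relation edge, which also avoids the 16MB ceiling mentioned above; the field names mirror kurushiyama's example, the index is an assumption.

    db.relations.insert({ user1: "A", user2: "B", relation: "follows" })
    db.relations.insert({ user1: "A", user2: "B", relation: "loves" })
    db.relations.createIndex({ user2: 1, relation: 1 })

    // "who loves B?"
    db.relations.find({ user2: "B", relation: "loves" })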
[22:44:22] <jc3`> would skip() performance still be an issue if you are only ever querying/paginating a subset of a few thousand docs at any given time within a large collection?
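skip() still has to walk past the skipped documents, so with only a few thousand matching docs per query the cost is usually small; range-based paging on an indexed field avoids it entirely (collection, filter and page size here are placeholders).

    // first page
    var page = db.docs.find({ status: "active" }).sort({ _id: 1 }).limit(50).toArray();

    // next page: continue from the last _id of the previous page instead of skipping
    var lastId = page[page.length - 1]._id;
    db.docs.find({ status: "active", _id: { $gt: lastId } }).sort({ _id: 1 }).limit(50)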