PMXBOT Log file Viewer

#mongodb logs for Sunday the 6th of January, 2019

[20:06:57] <top22> Hello, suppose I have an object like this: { _id, dict: { key1: value1, key2: value2, ... } } (the dict can be really big) and I query like this: find({_id: id}, {"dict.key1": 1}). I know that only the dict.key1 field will be returned, but if the dict in that document is really big (say 10000 entries), will I see a performance issue?
[20:08:02] <top22> Will mongo load the whole document into memory and then extract only dict.key1, or will it read only dict.key1 directly from disk?
[21:00:29] <top22> any ideas ?
[21:15:26] <Derick> it's likely already going to be in memory anyway? But yes, documents are stored as one unit
[21:15:54] <Derick> whether that has a performance impact remains to be seen
[21:18:35] <top22> so if the document is big, even if I only ask for one field, the whole document is loaded
[21:22:24] <Derick> yes, unless it's already in memory - but then again, you ought to have your "working set" in memory most of the time anyway. There is no guarantee it *has* to be read from disk.
[21:22:29] <Derick> I wouldn't worry too much about it
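Derick's point above — an inclusion projection trims the reply, not the read — can be illustrated with a small standalone Python sketch (no MongoDB server involved; the project helper and the sample document are invented for illustration):

```python
def project(doc, fields):
    """Mimic a MongoDB inclusion projection on one document.

    The server still reads the full document from storage as one unit,
    then keeps only _id plus the requested dotted paths before replying —
    which is why a huge embedded dict still costs a full-document read.
    """
    out = {"_id": doc["_id"]}
    for path in fields:
        parts = path.split(".")
        src, dst = doc, out
        for p in parts[:-1]:
            src = src[p]
            dst = dst.setdefault(p, {})
        dst[parts[-1]] = src[parts[-1]]
    return out

doc = {"_id": 1, "dict": {"key1": "v1", "key2": "v2", "key3": "v3"}}
print(project(doc, ["dict.key1"]))  # {'_id': 1, 'dict': {'key1': 'v1'}}
```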
[21:31:05] <MonicleLewinsky> Hi guys, I'm running into an issue when trying to find the total number of unique user ids in a collection. When I try doing this: db.test.distinct("payment.target.user.id").length; I get an error about exceeding the 16MB limit
[21:31:37] <MonicleLewinsky> So I tried doing: db.test.aggregate([ { $group: { _id: 'payment.target.user.id', "count": { "$sum": 1 } } }, { "$project": { "count": "$count" } } ] ); but that just returns the total number of records in my collection
[21:31:41] <MonicleLewinsky> Which I don't think is correct
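The pipeline above returns the total record count because in $group an _id value without a $ prefix is a literal constant: 'payment.target.user.id' puts every document into a single bucket, while "$payment.target.user.id" groups by the field's value. A standalone Python simulation of the two behaviors (sample data invented for illustration):

```python
docs = [{"uid": u} for u in ["a", "b", "a", "c"]]

# _id: 'payment.target.user.id' (no $) is a constant: every doc lands in one group
constant_groups = {}
for d in docs:
    constant_groups.setdefault("payment.target.user.id", []).append(d)

# _id: '$uid' (with $) is a field path: one group per distinct value
field_groups = {}
for d in docs:
    field_groups.setdefault(d["uid"], []).append(d)

print(len(constant_groups), len(field_groups))  # 1 3
```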
[21:39:28] <top22> thanks derick
[21:48:18] <top22> MonicleLewinsky: db.getCollection('users').aggregate([ { "$group": { "_id": {payment.target.user.id: "$payment.target.user.id"}, "count": { "$sum": 1 } }}, { "$match": { "count": { "$eq": 1 } }} ])
[21:48:39] <top22> I think this will get you only the unique ids, so maybe add a length on that ?
[21:48:52] <top22> Note: I did not actually test it
[21:50:21] <MonicleLewinsky> top22: SyntaxError: missing : after property id @(shell):1:62
[21:50:22] <MonicleLewinsky> Hmm
[21:52:02] <top22> I'm not sure about the whole payment.target.user.id; I've only used it on a simple field, like 'name'
[21:55:37] <MonicleLewinsky> Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.
[21:55:45] <MonicleLewinsky> I'm querying about 6M documents
[21:59:32] <top22> it works with "_id": {"userId":"$payment.target.user.id"}
[22:03:46] <MonicleLewinsky> db.test.aggregate([ { "$group": { "_id": {"userId":"$payment.target.user.id"}, "count": { "$sum": 1 } } }, { "$match": { "count": { "$eq": 1 } } } ], {allowDiskUse: true}).length;
[22:03:59] <MonicleLewinsky> Runs without error but returns nothing
[22:08:41] <top22> maybe the document you are trying to return exceeds 16M
[22:09:47] <MonicleLewinsky> Usually it gives me that error, though. Oh well - I'll try omitting the length and count() and just try counting via the cursor in my Python script
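For reference, the pipeline in the thread counts something slightly different from what was asked: $match: { count: { $eq: 1 } } keeps only ids that occur exactly once, whereas the number of distinct ids is simply the number of documents $group emits — obtainable, for instance, by appending a { $count: "n" } stage or calling itcount() on the cursor (untested here; note an aggregation cursor has no .length property, which is likely why the shell printed nothing). The distinction, simulated in plain Python on invented in-memory data:

```python
from collections import Counter

docs = [
    {"payment": {"target": {"user": {"id": u}}}}
    for u in ["a", "b", "a", "c", "b", "a"]
]

# Like the $group stage: one bucket per distinct id, with an occurrence count
groups = Counter(d["payment"]["target"]["user"]["id"] for d in docs)

distinct_ids = len(groups)                                      # number of unique ids
seen_exactly_once = sum(1 for c in groups.values() if c == 1)   # what $match {count: {$eq: 1}} selects

print(distinct_ids, seen_exactly_once)  # 3 1
```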