PMXBOT Log file Viewer

#mongodb logs for Sunday the 6th of January, 2019

[20:06:57] <top22> Hello, suppose I have an object like this: { _id, dict: { key1: value1, key2: value2, ... } } (the dict can be really big) and I query like this: find({_id: id}, {"dict.key1": 1}). I know that only the dict.key1 field will be returned, but if the dict in that document is really big (say 10000 entries), will I see a performance issue?
[20:08:02] <top22> Will mongo load the whole document into memory and then extract only dict.key1, or will it read only dict.key1 directly from disk?
[21:00:29] <top22> any ideas ?
[21:15:26] <Derick> it's likely already going to be in memory anyway? But yes, documents are stored as one unit
[21:15:54] <Derick> whether that has a performance impact remains to be seen
[21:18:35] <top22> so if the document is big, even if I only ask for one field, the whole document is loaded
[21:22:24] <Derick> yes, unless it's already in memory - but then again, you ought to have your "working set" in memory most of the time anyway. There is no guarantee it *has* to be read from disk.
[21:22:29] <Derick> I wouldn't worry too much about it
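Derick's point above — an inclusion projection trims the reply, not the read — can be illustrated with a small standalone Python sketch (no MongoDB server involved; the project helper and the sample document are invented for illustration):

```python
def project(doc, fields):
    """Mimic a MongoDB inclusion projection on one document.

    The server still reads the full document from storage as one unit,
    then keeps only _id plus the requested dotted paths before replying —
    which is why a huge embedded dict still costs a full-document read.
    """
    out = {"_id": doc["_id"]}
    for path in fields:
        parts = path.split(".")
        src, dst = doc, out
        for p in parts[:-1]:
            src = src[p]
            dst = dst.setdefault(p, {})
        dst[parts[-1]] = src[parts[-1]]
    return out

doc = {"_id": 1, "dict": {"key1": "v1", "key2": "v2", "key3": "v3"}}
print(project(doc, ["dict.key1"]))  # {'_id': 1, 'dict': {'key1': 'v1'}}
```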
[21:31:05] <MonicleLewinsky> Hi guys, I'm running into an issue when trying to find the total number of unique user ids in a collection. When I try doing this: db.test.distinct("payment.target.user.id").length; I get an error about exceeding the 16MB limit
[21:31:37] <MonicleLewinsky> So I tried doing: db.test.aggregate([ { $group: { _id: 'payment.target.user.id', "count": { "$sum": 1 } } }, { "$project": { "count": "$count" } } ] ); but that just returns the total number of records in my collection
[21:31:41] <MonicleLewinsky> Which I don't think is correct
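The pipeline above returns the total record count because in $group an _id value without a $ prefix is a literal constant: 'payment.target.user.id' puts every document into a single bucket, while "$payment.target.user.id" groups by the field's value. A standalone Python simulation of the two behaviors (sample data invented for illustration):

```python
docs = [{"uid": u} for u in ["a", "b", "a", "c"]]

# _id: 'payment.target.user.id' (no $) is a constant: every doc lands in one group
constant_groups = {}
for d in docs:
    constant_groups.setdefault("payment.target.user.id", []).append(d)

# _id: '$uid' (with $) is a field path: one group per distinct value
field_groups = {}
for d in docs:
    field_groups.setdefault(d["uid"], []).append(d)

print(len(constant_groups), len(field_groups))  # 1 3
```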
[21:39:28] <top22> thanks derick
[21:48:18] <top22> MonicleLewinsky: db.getCollection('users').aggregate([ { "$group": { "_id": {payment.target.user.id: "$payment.target.user.id"}, "count": { "$sum": 1 } }}, { "$match": { "count": { "$eq": 1 } }} ])
[21:48:39] <top22> I think this will get you only the unique ids, so maybe add a length on that ?
[21:48:52] <top22> Note: I did not actually test it
[21:50:21] <MonicleLewinsky> top22: SyntaxError: missing : after property id @(shell):1:62
[21:50:22] <MonicleLewinsky> Hmm
[21:52:02] <top22> I'm not sure about the whole payment.target.user.id; I've only used it on a simple field, like 'name'
[21:55:37] <MonicleLewinsky> Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.
[21:55:45] <MonicleLewinsky> I'm querying about 6M documents
[21:59:32] <top22> it works with "_id": {"userId":"$payment.target.user.id"}
[22:03:46] <MonicleLewinsky> db.test.aggregate([ { "$group": { "_id": {"userId":"$payment.target.user.id"}, "count": { "$sum": 1 } } }, { "$match": { "count": { "$eq": 1 } } } ], {allowDiskUse: true}).length;
[22:03:59] <MonicleLewinsky> Runs without error but returns nothing
[22:08:41] <top22> maybe the document you are trying to return exceeds 16M
[22:09:47] <MonicleLewinsky> Usually it gives me that error, though. Oh well - I'll try omitting the length and count() and just try counting via the cursor in my Python script
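For reference, the pipeline in the thread counts something slightly different from what was asked: $match: { count: { $eq: 1 } } keeps only ids that occur exactly once, whereas the number of distinct ids is simply the number of documents $group emits — obtainable, for instance, by appending a { $count: "n" } stage or calling itcount() on the cursor (untested here; note an aggregation cursor has no .length property, which is likely why the shell printed nothing). The distinction, simulated in plain Python on invented in-memory data:

```python
from collections import Counter

docs = [
    {"payment": {"target": {"user": {"id": u}}}}
    for u in ["a", "b", "a", "c", "b", "a"]
]

# Like the $group stage: one bucket per distinct id, with an occurrence count
groups = Counter(d["payment"]["target"]["user"]["id"] for d in docs)

distinct_ids = len(groups)                                      # number of unique ids
seen_exactly_once = sum(1 for c in groups.values() if c == 1)   # what $match {count: {$eq: 1}} selects

print(distinct_ids, seen_exactly_once)  # 3 1
```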