PMXBOT Log file Viewer


#mongodb logs for Thursday the 19th of July, 2012

[00:16:47] <sinisa> hello
[00:17:26] <sinisa> aggregation in 2.2 will work only with single collection ?
[00:24:34] <sinisa> aggregation works only with single collection.. anyone
[00:30:40] <crudson> I believe so, but I only played with it when it was released.
[00:33:03] <deoxxa> sinisa: yes
[00:33:33] <deoxxa> sinisa: remember that mongodb is a non-relational database - there's no correlation or joining of different collections
[00:34:11] <sinisa> i know, but it would be nice :)
[00:34:47] <sinisa> i mean, im importing in two different collections, and have to do some "joins" on that
[00:34:55] <deoxxa> do it in code
[00:35:13] <sinisa> code
[00:35:42] <deoxxa> if you repeat everything i say, this conversation will take exactly twice as long as it needs to
[00:36:05] <sinisa> i forgot ? at the end
[00:36:44] <sinisa> what you mean with "do it in code"
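deoxxa doesn't expand on "do it in code" in the log. A rough sketch of the usual pattern, assuming hypothetical orders and customers collections linked by a customer_id field: query the first collection, gather the keys, fetch the matching documents from the second collection with $in, and stitch the results together in the application.

    // Hypothetical collections: orders reference customers by customer_id.
    var orders = db.orders.find({ status: 'open' }).toArray();

    // Collect the "foreign keys" from the first result set.
    var customerIds = orders.map(function(o) { return o.customer_id; });

    // Fetch the matching documents from the second collection in one query.
    var customersById = {};
    db.customers.find({ _id: { $in: customerIds } }).forEach(function(c) {
        customersById[c._id] = c;   // keys are the stringified ObjectIds
    });

    // "Join" the two result sets in application code.
    orders.forEach(function(o) {
        o.customer = customersById[o.customer_id];
    });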
[02:54:15] <hdm> I consistently get a corrupted database after ~150m records. I tried 2.0.6, 2.1.x, and two different storage destinations (SSD & R0 spinning disk), with errors like: "$err" : "Invalid BSONObj size: 1801675112 (0x6861636B) first element: */\\n}\\n\\n#content {\\nwidth: 700px;\\npadding: 10px;\\nmargin-t
[02:54:23] <hdm> any idea what to check next? RAM possibly?
[02:54:36] <hdm> (it's ECC, registered, basic tests come back fine)
[02:55:03] <hdm> "$err" : "Invalid BSONObj size: 1801675112 (0x6861636B) first element.. or error: { "$err" : "BSONElement: bad type 64", "code" : 10320 } or similar keep popping up
[02:58:10] <crudson> hdm: one error looks like you are trying to insert a document way over the max size limit
[02:58:27] <hdm> those are errors doing an unrelated count() query later
[02:58:49] <hdm> now all counts immediately fail since something is instantly hit that is corrupted
[02:59:07] <hdm> im about to blame linux's raid-0 implementation or something
[02:59:26] <hdm> about four days of chasing this around, data load takes 12-48 hours depending on what set
[03:02:24] <crudson> hdm: ah ok. So you are sure the size and "bad type 64" are just db corruption
[03:16:07] <hdm> crudson: yup, random other bson errors if i keep trying to insert
[03:16:14] <hdm> smells like a mongo bug
[03:16:30] <hdm> odd though - its hitting only at certain sizes of collections
[03:16:46] <hdm> first 140m are just fine, then somewhere between 140 and 150 it corrupts
[05:17:33] <Dreamer3> did something change with gridfs?
[05:18:07] <Dreamer3> i was running an integrity check and after 235k documents or so ALL the md5 sums don't match anymore
[05:19:24] <Dreamer3> perhaps an upgrade calculates them differently now or something?
[05:19:37] <Dreamer3> seems very strange
[05:19:42] <Dreamer3> was not expecting to find this kind of problem
[05:53:10] <Dreamer3> seems to have fixed itself after our upgrade to 1.8
[05:53:21] <Dreamer3> i wonder when i look at the start date what i will find
[06:06:31] <Dreamer3> and started with an upgrade from bson 1.1.5 to 1.3 perhaps?
[07:30:44] <[AD]Turbo> hi there
[08:39:07] <Bartzy> Hi
[08:39:13] <Bartzy> How much time does it take to do a query plan for monog ?
[08:39:14] <Bartzy> monog?
[08:39:18] <Bartzy> mongo*, damn :)
[08:39:57] <Bartzy> I mean, when there's no perfect index to use and there's a choice of more than 1 index, the query optimizer is choosing one of the indexes to use (or none at all) by doing query plans for all of them together, and checking nscanned versus n
[08:40:07] <Bartzy> But doing that means actually doing the queries, right ?
[08:40:24] <Bartzy> and then just kill the ones that didn't finish yet when one of them finishes?
[08:53:41] <NodeX> you can hint()
[09:08:50] <Oggu> I have a hash in a field. Say A. Which gives A.a, A.b, A.c, A.d… I want to find documents where A.a is equal to condition a or doesn't exist. And the same for A.b, A.c….
[09:08:54] <Oggu> How do I do this?
[09:09:35] <Oggu> Doing a OR on field level...
[09:17:47] <neil__g> $or: [{}, {}
[09:17:49] <neil__g> oops
[09:17:55] <neil__g> finger slipped
[09:18:31] <neil__g> $or: [{a:...}, {a:{$exists:0}}] ?
[09:20:35] <NodeX> $or is very inefficient, avoid it if you can
[09:31:53] <Oggu> The $or thing won't work. I would need or on field level. Is that possible? Should I use elemMatch then?
[09:32:40] <NodeX> do you want field=null or field doesn't exist at all ?
[09:33:10] <Oggu> So A.a = 'stringA' or exists:false and A.b = 'stringB' or exists:false
[09:33:39] <PDani> hi
[09:34:34] <PDani> how should I run those mongod instances, which are only arbiters in a sharded replicated environment? is it a good practice to start these with --noprealloc and --nojournal because they won't hold any data?
[09:35:44] <NodeX> Oggu : $elemMatch has to match both elements
[09:36:03] <NodeX> so A.a='string' or exists:false would always fail
[09:36:33] <Oggu> Ok. How can I do it?
[09:37:18] <NodeX> using $or but it's not efficient
[09:37:59] <NodeX> or change your schema slightly to always have either a value or a null value for A.a ... then use ... A.a : {$in: ['stringA', null]}
[09:38:52] <NodeX> or do 2 queries and merge the results - not sure if that's faster than an $or ... I doubt it very much
[09:39:12] <Oggu> Speed isn't very important here. Only run once a day
[09:39:16] <zenista> hi guys my client is on windows xp 32 bit... i'm trying mongodb for the first time with a new version of an app... but i learned that the 32 bit build of mongodb is only for testing and eval... and plus there are lots of posts about data corruption when using mongodb...
[09:39:18] <zenista> what are my chances
[09:39:33] <zenista> and yes mongodb will run in single server intance
[09:39:34] <NodeX> chances of what?
[09:39:47] <zenista> chances of using mongodb as primary data storage
[09:39:54] <zenista> on windows xp 32bit
[09:40:12] <zenista> max records will be around 1,00,000 rows
[09:40:13] <NodeX> as long as you don't want a database that exceeds 2gb you're fine
[09:40:34] <zenista> gr8 thanks for the positive insight
[09:40:37] <zenista> i really want to try it out
[09:40:43] <Oggu> I don't see how I can do it with $or. $or {'A.a': 'value', 'A.a': {'$exists': false}, 'A.b': 'value', 'A.b': {'$exists': false}} would match too much. It would be enough if A.b didn't exist for the element to match
[09:40:56] <zenista> i m tryng node.js also .. and mongodb is quite well go with node.js
[09:41:00] <NodeX> I would strongly recommend your client upgrade to a stable operating system that is designed for large data
[09:41:14] <zenista> k
[09:41:35] <NodeX> $or expects an array();
[09:42:02] <zenista> NodeX what would be the minimum RAM requirement for mongodb
[09:42:09] <zenista> on window xp 32 bit single server
[09:42:13] <NodeX> 42
[09:42:15] <NodeX> lol
[09:42:26] <NodeX> how much data are you expecting?
[09:42:56] <NodeX> db.foo.find( { $or : [ { 'A.a' : 'value' } , { 'A.a' : {$exists:false} } ] } )
[09:44:08] <Oggu> NodeX: Well. Still the same problem
[09:44:23] <Oggu> Or I will have to do as many cases as there are combinations
[09:44:39] <NodeX> what same problem?
[09:45:38] <Oggu> That I will match too many documents
[09:46:29] <NodeX> then add another filter
[09:46:38] <Oggu> '$or': [{'A.a': 'value'}, {'A.a': {'$exists': false}}, {'A.b': 'value'}, {'A.b': {'$exists': false}}]
[09:47:20] <Oggu> Will match all elements where A.b doesn't exist. No matter what A.a is
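One way Oggu's condition could be written without over-matching is to wrap each per-field $or in a top-level $and, so each field must individually be either equal to its value or absent (Oggu's own solution is pasted further down). A sketch, assuming a collection named coll:

    db.coll.find({
        $and: [
            { $or: [ { 'A.a': 'stringA' }, { 'A.a': { $exists: false } } ] },
            { $or: [ { 'A.b': 'stringB' }, { 'A.b': { $exists: false } } ] }
        ]
    })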
[09:48:25] <augustl> for the document {authTokens: [{token: "123abc", ...}, ...]}, will the query {authTokens: {token: "123abc"}} find that doc?
[09:48:41] <NodeX> augustl: yes
[09:49:15] <NodeX> Oggu, then you're going to need a data restructure to make it efficient
[09:49:28] <augustl> it doesn't work for some reason. Just tried {"authTokens.token": "123abc"}, that worked. Using the monger library for Clojure
[09:52:00] <NodeX> sorry, I misread. dot notation is correct
[09:52:36] <augustl> ah, I see :)
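The distinction NodeX concedes here: querying with a whole subdocument only matches array elements that are exactly equal to it, while dot notation (or $elemMatch) matches on a single field inside the element. A sketch, assuming a hypothetical users collection:

    // Given: { authTokens: [ { token: "123abc", created: ... }, ... ] }

    // Exact subdocument match: only finds elements that are exactly { token: "123abc" }
    db.users.find({ authTokens: { token: "123abc" } })

    // Dot notation: matches any element whose token field is "123abc", whatever its other fields
    db.users.find({ "authTokens.token": "123abc" })

    // $elemMatch: handy when several conditions must hold on the same element
    db.users.find({ authTokens: { $elemMatch: { token: "123abc" } } })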
[10:02:34] <PDani> any idea?
[10:03:06] <saeed_> pdani, for ?
[10:03:15] <PDani> how should I run those mongod instances, which are only arbiters in a sharded replicated environment? is it a good practice to start these with --noprealloc and --nojournal because they won't hold any data?
[10:04:34] <saeed_> you must use both at once
[10:05:36] <schlitzer|freihe> hey there
[10:06:11] <schlitzer|freihe> is there something like writeconcern majority and the likes in pymongo?
[10:06:47] <PDani> saeed_, you're saying, I must use both parameters?
[10:07:26] <saeed_> pdani, i think it's right
[10:07:30] <PDani> thx
[10:08:35] <saeed_> pdani:where you are from ?
[10:09:01] <PDani> saeed_, Hungary, why?
[10:09:21] <saeed_> pdani: anyway ;)
[10:09:30] <PDani> okay :)
[10:13:57] <augustl> albertolopez: \o/
[10:14:03] <augustl> err, algernon ^^
[10:17:34] <schlitzer|freihe> or will i have to figure out myself how many members are in a replica set and set the right number in the "w" parameter?
[10:18:55] <schlitzer|freihe> because i know that at least the java driver has these writeconcern features where you can say "majority" instead of "2" if you have 3 members in a shard.
[10:19:05] <schlitzer|freihe> sry for typos....
[10:36:17] <mephju> Hello guys. I want to do something in one query which is easier to do in two queries. But I think it would be more efficient in one. So here is what I want to do: I want the results of an $and query and also the results of an $and query. both at the same time. is it possible with some operation I might not know yet?
[10:36:38] <mephju> ....shit...there is a typo:
[10:37:03] <mephju> I want the results of an $and query and also the results of an $or query.
[10:37:49] <mephju> e.g. something like this { '$or': [ {'$and':array}, {'$or':array} ] }
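mephju's shape is valid: $and and $or clauses can be nested inside a top-level $or. A concrete sketch with hypothetical field names:

    db.coll.find({
        $or: [
            { $and: [ { status: 'active' }, { score: { $gte: 10 } } ] },
            { $or:  [ { featured: true }, { pinned: true } ] }
        ]
    })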
[10:56:27] <Oggu> NodeX: Efficiency isn't very important in this case. Should I get all documents and filter application side? Or do or cases for all different combinations?
[10:59:32] <NodeX> whatever is easiest
[11:10:08] <Oggu> I solved it =) http://pastie.org/4283227
[11:14:04] <harryhcs> hello!
[11:14:37] <harryhcs> can you copy an existing mongo db to replace another one? would that work?
[11:14:48] <harryhcs> its for a django app, and i moved it to a new server
[11:15:11] <Derick> yes, as long as mongod isn't running on either side
[11:15:32] <mephju> when I have an $or query with two conditions. will both conditions be evaluated in any case? or will the evaluation stop in case first condition is met
[11:23:19] <harryhcs> Derick: mm, i did copy it when running
[11:23:28] <harryhcs> should I stop and do it again?
[12:00:34] <Bartzy> $in is range ?
[12:06:41] <NodeX> $in is like SQL IN();
[12:07:16] <Bartzy> NodeX: And it is considered a range query ?
[12:07:17] <NodeX> a: {$in: [1,2] } - match a=1 or a=2
[12:07:31] <NodeX> no, sql in is not a range query
[12:08:06] <Bartzy> NodeX: I'm asking because I'm doing: db.things.find({uid: {$in: [1,2,3,4.....]}.sort({_id: -1}), and I have this index: {uid: 1, _id: 1} , but I still get scanAndOrder for the query...
[12:08:22] <Bartzy> Why is that ?
[12:08:22] <Bartzy> forgot to close }) there
[12:08:31] <NodeX> that will check 1 or 2 or 3 or 4 etc against uid
[12:09:04] <Bartzy> right, and then order by _id desc
[12:09:10] <Bartzy> But why it doesn't use the _id part of the uid_1__id_1 index ?
[12:09:22] <NodeX> try $natural : -1 instead
[12:09:35] <Bartzy> it's in the end, so it should be possible to use it for sorting..
[12:09:45] <Bartzy> for the sort? But the collection is not capped
[12:09:53] <NodeX> try $natural : -1 instead
[12:10:00] <Bartzy> so why natural should work ?
[12:10:09] <Bartzy> BTW there's also a limit(50) after that, if that helps
[12:10:11] <NodeX> or dont try it, carry on asking questions ;)
[12:10:34] <Bartzy> I'm trying I'm trying :)
[12:11:29] <NodeX> is it a range you want or does it skip any numbers?
[12:12:10] <Bartzy> it skips numbers
[12:12:29] <Bartzy> the uid is user_id
[12:12:29] <Bartzy> it can be 1 and 1324234 together in the same query
[12:12:36] <Bartzy> $natural takes a lot of time
[12:12:44] <Bartzy> it's 100 million documents
[12:13:06] <Bartzy> I hate it when explain() takes as much time to finish as the query itself. In MySQL EXPLAIN doesn't do that.
[12:13:18] <ron> then use mysql :)
[12:13:20] <Bartzy> that makes it very difficult to optimize non indexed queries on large datasets. You have to wait
[12:13:23] <NodeX> +1
[12:13:28] <Bartzy> But Mongo is so much better :P
[12:13:41] <NodeX> you should be sharding on that many docs tbh
[12:13:53] <Bartzy> Why, if I have the RAM to hold them
[12:13:55] <NodeX> what happens if you drop the sort?
[12:13:58] <Bartzy> or at least my working set
[12:14:02] <Bartzy> NodeX: Then it is very fast
[12:14:34] <Bartzy> NodeX: And even with the sort, with <200 uids in $in, it's less than 100ms
[12:14:37] <NodeX> err
[12:14:49] <NodeX> {uid: 1, _id: -1} .. should be your index
[12:14:51] <NodeX> not {uid: 1, _id: 1}
[12:14:58] <Bartzy> But with 1000 uids (that should be easy for mongo to handle, according to MongoDB in Action), it can take 1500 ms
[12:15:02] <Bartzy> Ah, why is that ?
[12:15:15] <NodeX> it sets up the sort to be desc
[12:15:16] <Bartzy> I tried reading on why -1 and 1 matters on indexes
[12:15:21] <Bartzy> yeah but the index is either way
[12:15:32] <Bartzy> ahhhh but because the index is first used for uid, then it can't do it reveresed??
[12:15:36] <NodeX> then you'll need both
[12:15:43] <Bartzy> reversed*
[12:16:32] <Bartzy> so the reason I need uid:1, _id: -1 is because when the index is past using the uid part of it, it can't do the sorting by _id reversed ?
[12:16:50] <NodeX> correct because you setup the index for asc not desc
[12:17:05] <Bartzy> NodeX: I still get: "scanAndOrder" : true,
[12:17:18] <NodeX> you'll probably have to hint the index now
[12:17:26] <Bartzy> I dropped the other one
[12:17:38] <NodeX> indexed 100million docs in that time?
[12:17:45] <NodeX> 50x hexacore?
[12:17:51] <Bartzy> I already had that index you mentioned
[12:17:53] <Bartzy> just dropped the other one
[12:18:16] <NodeX> pastebin : db.things.getIndexes();
[12:19:34] <Bartzy> NodeX: http://pastebin.com/6NN6sMFP
[12:19:45] <Bartzy> The uids don't exist, so it's fast
[12:19:45] <Bartzy> sec, I'll try with real UIDs
[12:21:58] <NodeX> I think possibly the _id is choking the sort
[12:22:17] <Bartzy> What do you mean choking ?
[12:23:33] <NodeX> Verb: Hinder or obstruct the breathing of (a person or animal) in such a way.
[12:24:08] <NodeX> I think you might have to hint the index to use tbh
[12:24:33] <NodeX> db.things.find()...hint({_id:-1});
[12:24:47] <Bartzy> NodeX: Why use the _id index and not my compound index?!
[12:25:01] <NodeX> or hint({uid:1, _id:-1});
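The shape NodeX is suggesting, written out as a sketch (whether it removes scanAndOrder for a large $in is exactly what the rest of the conversation is probing):

    // Compound index with _id descending, matching the sort direction of the query.
    db.things.ensureIndex({ uid: 1, _id: -1 })

    // The query, forced onto that index with hint().
    db.things.find({ uid: { $in: [1, 2, 3, 4] } })
             .sort({ _id: -1 })
             .limit(50)
             .hint({ uid: 1, _id: -1 })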
[12:25:05] <Bartzy> NodeX: See explain with 244 UIDs that really exist: http://pastebin.com/0FJ7Mk1h
[12:25:06] <Bartzy> 3 sec
[12:25:18] <Bartzy> NodeX: But the explain shows it is using that index anyway.
[12:25:22] <Bartzy> sec I'll try hint
[12:26:10] <Bartzy> seems the same, still scanAndOrder true
[12:26:13] <Bartzy> but takes 8ms
[12:26:15] <NodeX> it could be a bug maybe, that _id is choking the parser or something, I should wait for a 10gen employee to confirm you can actually index and sort on _id -1
[12:26:31] <Bartzy> I think it's because of the query cache or something - because now it takes 8ms for the regular, non-hint query ?
[12:26:39] <NodeX> what's the nscanned?
[12:27:02] <Bartzy> 787
[12:27:04] <Bartzy> exactly the same
[12:27:12] <Bartzy> I try the same query and it's extremely fast.. Mongo has a query cache ?
[12:27:52] <NodeX> the OS does the caching
[12:28:24] <NodeX> the best way to test it is duplicate your collection and add a sortable field, add the index on that and see the difference
[12:28:36] <Bartzy> So no way to check the speed now, only if I restart the server ?
[12:28:38] <NodeX> it's possible you stumbled on a bug or some non intended behaviour
[12:28:59] <NodeX> wait for LRU to eject the cache?
[12:29:05] <Bartzy> OK - how do I duplicate the collection ?
[12:29:06] <Bartzy> I have a timestamp field, I'll replace _id in the index with it
[12:29:09] <NodeX> (run a load of other queries)
[12:29:14] <Bartzy> NodeX: That's a bit optimistic
[12:29:35] <Bartzy> There's really no way ?
[12:29:57] <NodeX> I don't know of a way to force *nix to dump its FS cache without a reboot
[12:30:55] <Aim> sync
[12:32:15] <NodeX> does it eject the cache though?
[12:33:27] <Aim> it shold
[12:33:30] <Aim> should*
[12:33:55] <Aim> it should commit the buffer cache to disk
[14:18:13] <_simmons_> Hello..
[14:18:50] <_simmons_> I'm looking on mongodb.log and there is many errors like this: "Assertion failure false ./util/../db/../bson/../util/hex.h 29"
[14:19:40] <_simmons_> And in the final of the "stack trace" /usr/lib/libboost_thread.so.1.42.0(thread_proxy+0x60) [0x7f0250c0b200] /lib/libpthread.so.0(+0x68ba) [0x7f02513328ba] /lib/libc.so.6(clone+0x6d) [0x7f025006802d]
[14:20:18] <_simmons_> could be a bug in libc ?
[14:47:34] <wereHamster> no. It's an assertion failure in the mongodb code
[14:47:46] <wereHamster> The first line is relevant, not the last one
[15:17:36] <markgm> I'm currently working with embedded documents and from above it looks like they are saved in two places, inside the parent document itself and the collection which holds the embedded documents. Is this in fact the case?
[15:18:03] <wereHamster> markgm: they are saved wherever you save them.
[15:18:30] <wereHamster> and only there, and noplace else.
[15:19:35] <markgm> My understanding of embedded documents is that they are a fully initialized reference to a document and that changes in one correspond to changes in the other
[15:19:57] <wereHamster> eh?
[15:20:59] <wereHamster> you either embed the document or you use a reference to one. You can not have both.
[15:21:29] <markgm> Im using doctrine and embedding many of the same document type in a parent doc. so when I look at the parent doc, it contains the correct docuements within it
[15:21:51] <wereHamster> no idea what doctrine is or what it does or how it does it
[15:22:06] <markgm> and then there is a seperate collection which holds these embedded documents
[15:22:32] <wereHamster> I recommend using the shell to look at what is actually stored in the database
[15:22:34] <markgm> Im wondering if they are the same document or a copy saved in two different places
[15:22:38] <markgm> i am
[15:22:44] <markgm> using genghis
[15:22:48] <wereHamster> pastebin an example of such document
[15:26:23] <markgm> http://pastebin.com/kXm3xTSk the first is the parent doc, and the second is one of the embedded docs
[15:26:54] <wereHamster> I can not parse that. Can you use the mongo shell ?
[15:28:06] <markgm> don't know how to use the mongo shell. what can't you parse about it?
[15:28:29] <wereHamster> db.foo.find({ _id: ObjectId("xxxx") })
[15:53:19] <ramsey> Derick: Does the latest release of the mongo extension for PHP fix the replica set issues we discussed on that thread a while back?
[15:53:27] <Derick> no
[15:53:37] <Derick> ramsey: that's 1.3, to be released very soon™
[15:53:41] <Derick> hacking on that now
[15:53:48] <ramsey> cool... thanks
[16:07:34] <Mortah> hello
[16:07:47] <Mortah> having an issue with removing a node from a replica set... rs.remove() worked fine
[16:08:01] <Mortah> but, mongos is spamming to its log saying it cannot communicate with that server
[16:08:22] <Mortah> running db.runCommand({ listShards : 1}) still lists the removed server
[16:08:34] <Mortah> but rs.status() correctly does not list the server
[16:08:38] <Mortah> have I missed a step somewhere :)
[16:18:29] <_simmons_> wereHamster: thanks man.
[16:21:17] <NodeX> http://docs.mongodb.org/manual/tutorial/expire-data/
[16:21:18] <NodeX> +1
[16:21:25] <NodeX> top feature
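The linked tutorial covers TTL collections (new around 2.2): a single-field index created with expireAfterSeconds makes a background task delete documents once the indexed date field is old enough. A minimal sketch, using the tutorial-style log_events collection name:

    // Documents expire once createdAt is more than 3600 seconds in the past.
    db.log_events.ensureIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })

    db.log_events.insert({ createdAt: new Date(), message: "example event" })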
[16:25:18] <NodeX> Kudos on PHP driver update too
[16:28:09] <_simmons_> This error: decode failed. probably invalid utf-8 string [[[�]]]
[16:28:09] <_simmons_> Thu Jul 19 13:28:09 why: TypeError: malformed UTF-8 character sequence at offset 2
[16:28:12] <_simmons_> Thu Jul 19 13:28:09 Error: invalid utf8 shell/utils.js:1238
[16:28:37] <_simmons_> could be caused by Wed Jul 18 01:00:02 [conn13270] assertion 13297 db already exists with different case other: [utf-8] me [UTF-8] ns:UTF-8.* query:{} ???
[16:58:50] <diegok> Hi!, is it possible to retrieve last element of an array using dot notation?
[16:58:57] <diegok> ^ how?
[17:02:45] <diegok> ^ ok, it's $slice: -1
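For reference, $slice: -1 goes in the projection, not the query. A sketch with a hypothetical posts/comments schema:

    // Return only the last element of the comments array for each matched document.
    db.posts.find({}, { comments: { $slice: -1 } })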
[17:50:22] <soumya> i am trying to write a query against my db
[17:50:32] <soumya> i have a list of dictionaries
[17:51:07] <soumya> i want it to return the person if 2 keys in one of the dictionaries in the list match
[17:52:00] <soumya> i want it to return the person if 2 keys in one of the dictionaries in the list match (Note: the dictionary has a total of 4 keys)
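This is the case $elemMatch is for: every condition inside it must be satisfied by the same array element. A sketch with hypothetical collection and field names:

    // Match a person where a single element of the addresses array has BOTH
    // the requested city and the requested zip (the other two keys can be anything).
    db.people.find({
        addresses: { $elemMatch: { city: "Springfield", zip: "12345" } }
    })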
[18:15:20] <greenberet123> I want to retrieve all records in which a key matches any of 20000 values…….what is the best way to run this query from a program(I already have an index on the key)?
[18:18:00] <durodeprogramar> hi
[18:18:14] <durodeprogramar> Hi!
[18:18:25] <durodeprogramar> Someone is using Django framework?
[18:19:44] <durodeprogramar> hi
[18:23:03] <durodeprogramar> Looks like everybody has the keyboard broked =P
[18:24:10] <wereHamster> looks like somebody is impatient
[18:24:42] <durodeprogramar> ah ah ah! =P
[18:25:18] <durodeprogramar> How are you? wereHamster
[18:25:20] <durodeprogramar> =D
[18:25:29] <durodeprogramar> JOIN #mongoengine
[18:26:28] <greenberet123> is it wise to write an IN query specifying an array of 20000 values?
[18:27:13] <durodeprogramar> Are you in Django?
[18:27:16] <crudson> greenberet123: depends what you want to do with it.
[18:27:48] <greenberet123> crudson: I want to retrieve all documents where the key is IN these 20000 values from a program..
[18:28:21] <crudson> greenberet123: I found splitting into smaller chunks was more efficient, but depends whether you have the logical freedom to do that.
[18:29:10] <TubaraoSardinha> Hello! =)
[18:29:16] <greenberet123> crudson: splitting the query? so….eventid IN [1,2,3,4,5,6] becomes eventid IN [1,2,3] and eventid IN [4,5,6] ?
[18:30:32] <crudson> greenberet123: yeah for very large numbers, 10s of thousands, at least what I experienced when having to process and move many documents from one db to another depending on a list of values
[18:31:11] <crudson> greenberet123: but I'd do your own tests and see, it could be a factor of how your documents are structured, indexed and queried
[18:31:36] <greenberet123> crudson: But is specifying a list the only way? the query would become really HUGE…….
[18:32:03] <greenberet123> crudson: it seems a little inelegant to me…..
[18:34:44] <crudson> greenberet123: unless you can identify some other way to identify those documents. Perhaps look at how the list of eventids is generated and see if documents can be identified similarly.
[18:36:57] <greenberet123> crudson: ok…thanks mate
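A sketch of the chunking crudson describes, assuming the 20,000 values are already in an eventIds array and the collection is called events:

    // Query in chunks of 1,000 instead of one enormous $in, accumulating the results.
    var chunkSize = 1000;
    var results = [];
    for (var i = 0; i < eventIds.length; i += chunkSize) {
        var chunk = eventIds.slice(i, i + chunkSize);
        results = results.concat(db.events.find({ eventid: { $in: chunk } }).toArray());
    }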
[19:12:49] <j0shua> hey; i want to do an aggregate w/ group to create a sum but I only want the top 5 results. how can i do that w/o sorting the entire result set w/ sort in my app ?
[19:14:54] <j0shua> meaning i want to reduce {name: 'joe', likes: 5}, {name: 'joe', likes: 2}, {name: 'betty', likes:4} to [{name: 'joe', likes: 7}, {name:'betty', likes: 4}] but i ONLY want to get the winner
[19:25:41] <wereHamster> who is the winner?
[19:26:04] <wereHamster> use mongodb to sort and limit the results
[19:29:02] <ron> j0shua: do you ever need to retrieve the non-aggregated data? that is {name:'joe', likes: 2}?
[19:41:13] <crudson> j0shua: db.users.aggregate({ $group:{_id:'$name',likes:{$sum:'$likes'}} }, { $sort:{likes:-1} } ).result[0]
[19:41:58] <crudson> j0shua: gives { "_id" : "joe", "likes" : 7 }
[19:47:30] <crudson> j0shua: actually this will be more efficient, and it renames _id back to name db.users.aggregate({$group:{_id:'$name',likes:{$sum:'$likes'}}}, {$sort:{likes:-1}}, {$limit:1}, {$project:{name:'$_id',likes:1,_id:0}} ).result
[19:53:41] <crudson> j0shua: if you wanted "top 5" then just change the $limit accordingly
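crudson's pipeline from above, spread over several lines with $limit raised to 5 for a top-5 list:

    db.users.aggregate(
        { $group:   { _id: '$name', likes: { $sum: '$likes' } } },
        { $sort:    { likes: -1 } },
        { $limit:   5 },
        { $project: { name: '$_id', likes: 1, _id: 0 } }
    ).result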
[19:58:55] <j0shua> @crudson i cant chain limit to group
[19:59:12] <crudson> ?
[20:00:18] <crudson> j0shua: did you try copying and pasting my last command?
[20:00:42] <j0shua> db.blah.group({ key: { name : true}, initial: {csum: 0}, reduce: function(doc, out) { out.csum += doc.like; } } ).sort(function(a, b){ return b.csum - a.csum; }).sort thats not valid
[20:01:28] <j0shua> sorry i missed it .. looking
[20:02:10] <crudson> j0shua: use aggregate() rather than a direct group()
[22:09:40] <nicholasdipiazza> newbie question: Does the community version of MySQL cluster v5.5.2 use the NDB storage engine?