PMXBOT Log file Viewer


#mongodb logs for Tuesday the 30th of July, 2013

[00:18:23] <bjorn248> hey guys, new to mongo here, but I didn't find anything clear in the documentation on this question...so I decided to come here. I'm trying to set up something very simple. Mongo replication with automatic failover. I have a node app using mongoose pointing at my database server, and the database is replicated on two other boxes (minimum replica set). No sharding at the moment. Is there a load balancer or something I have to set up between the node
[00:19:24] <rafaelhbarros> bjorn248: let me know what you find out because I'm interested in that one too.
[00:20:35] <bjorn248> I mean, if I set up a sharded and replicated cluster, I understand that mongos can act as a router in between the application and the database, but in the case where I am not using sharding, I am not sure how that is handled.
[00:25:28] <cheeser> you can use mongos without sharding
[00:32:58] <bjorn248> cheeser: oh really? awesome, so I can just set up mongos on the application box then? neat
[00:36:48] <cheeser> yep
[00:37:11] <cheeser> we did that at a former gig so that all the infrastructure was in place once we decided to shard.
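A minimal sketch of the setup cheeser describes, with hypothetical hostnames: a mongos on the application box still needs the config servers, and for plain replica-set failover the driver can also take a seed-list URI directly:

    // on the application box, run (hostnames are placeholders):
    //   mongos --configdb cfg0.example.net:27019,cfg1.example.net:27019,cfg2.example.net:27019
    // the app then talks to the local mongos as if it were a single mongod.
    // without mongos at all, a replica-set seed list in the mongoose URI gives failover:
    var mongoose = require('mongoose');
    mongoose.connect('mongodb://db0.example.net,db1.example.net,db2.example.net/mydb?replicaSet=rs0');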
[00:53:39] <tpayne> can anyone help me figure out what's syntactically wrong with this line of code for aggregate?
[00:53:46] <tpayne> missing arguments for method aggregate in trait TraversableOnce;
[00:53:46] <tpayne> coll.aggregate(MongoDBObject("$group" -> MongoDBObject("_id" -> "$key", "value" -> MongoDBObject("$addToSet" -> "$value"))))
[00:56:08] <DanWilson> tpayne: looks like PHP?
[00:56:14] <tpayne> casbah scala
[00:56:24] <tpayne> i believe this translates to coll.aggregate({"$group": {"_id": "$key", "value": {"$addToSet": "$value"}}})
[00:56:53] <DanWilson> well, not exactly
[00:57:00] <DanWilson> aggregate takes an array as an argument
[00:57:01] <DanWilson> like this
[00:57:06] <DanWilson> db.cfsummit.aggregate([
[00:57:06] <DanWilson> {$match: {"cards.idList": {"$in": ["51c9aa15d0b4871a3e000075"]}}},
[00:57:06] <DanWilson> {$project: { "_id": "$cards.name"}},
[00:57:06] <DanWilson> {$group: {"_id": 1, "count": {"$sum":1 }}}
[00:57:06] <DanWilson> ])
[00:57:25] <DanWilson> then, inside the array are the various aggregate pipeline statements
[00:57:38] <tpayne> DanWilson: can you take a look at my data, and tell me how to do a group by on it?
[00:57:43] <tpayne> it's 4 lines
[00:57:52] <DanWilson> sure. put it on pastebin
[00:58:27] <tpayne> DanWilson: http://pastebin.com/eGbUkU7e
[00:59:15] <tpayne> i want to get a group by of "id"
[00:59:31] <tpayne> that puts the values for it, in a list sort of
[01:02:24] <DanWilson> so, you want to group by id,
[01:02:37] <DanWilson> and do what with the values?
[01:03:24] <tpayne> sort of create a bucket
[01:03:34] <tpayne> like for id 1, there are 2 entries in the bucket
[01:03:40] <tpayne> same for id 3
[01:03:44] <tpayne> id 2 there is only 1
[01:04:29] <DanWilson> mock up what the end result of the data should look like
[01:04:46] <DanWilson> are you trying to do a count of documents by id number?
[01:05:26] <DanWilson> because that would look like this:
[01:05:26] <DanWilson> db.whatever.aggregate([
[01:05:26] <DanWilson> {$group: {"_id": 0, "id": 1, "count": {"$sum":1 }}}
[01:05:26] <DanWilson> ])
[01:05:52] <tpayne> not a count, i want the actual data. imagine a tree with a root node with id 1, and coming out of it is 2 branches
[01:06:08] <tpayne> pointing to ["2","3"], and ["2"]
[01:06:31] <tpayne> DanWilson: I tried to explain it here: https://groups.google.com/forum/#!topic/mongodb-casbah-users/zp0izu4iaL8
[01:12:20] <DanWilson> try this:
[01:12:21] <DanWilson> db.whatever.aggregate([
[01:12:21] <DanWilson> {$group: {"_id": 0, "id": 1, "stuff": {"$addToSet":$list }}}
[01:12:21] <DanWilson> )
[01:12:28] <DanWilson> run it in your mongo shell
[01:12:57] <tpayne> ok
[01:14:22] <tpayne> ok so i ran db.droplets.aggregate([{$group: {"_id": 0, "id": 1, "stuff": {"$addToSet":$list }}}])
[01:14:25] <tpayne> db.droplets.aggregate([{$group: {"_id": 0, "id": 1, "stuff": {"$addToSet":$list }}}])
[01:14:35] <tpayne> it says $list is not defined
[01:15:38] <DanWilson> take the $ out
[01:16:16] <tpayne> and wrap it in quotes?
[01:17:41] <DanWilson> db.droplets.aggregate([{$group: {"_id": 0, "id": 1, "stuff": {"$addToSet":list }}}])
[01:17:52] <DanWilson> db.droplets.aggregate([{$group: {"_id": 0, "id": 1, "stuff": {"$addToSet":"list" }}}])
[01:18:05] <DanWilson> yeah, wrap it in quotes, I missed that
[01:18:31] <tpayne> ok now i get this: "errmsg" : "exception: the group aggregate field 'id' must be defined as an expression inside an object",
[01:21:04] <DanWilson> lemme import your data and give it a go
[01:22:32] <tpayne> ok
[01:22:41] <tpayne> any way i can make that part easier?
[01:22:51] <DanWilson> nope
[01:22:53] <DanWilson> gimme a sec
[01:24:17] <DanWilson> this is almost what you want
[01:24:18] <DanWilson> db.tpayne.aggregate([{$group: {"_id": "$id", "stuff": {"$addToSet":"$list" }}}])
[01:24:20] <DanWilson> 1 sec
[01:27:11] <DanWilson> OK
[01:27:11] <DanWilson> db.tpayne.aggregate([
[01:27:12] <DanWilson> {$project: { "_id": "$id", "list": "$list"}},
[01:27:12] <DanWilson> {$unwind: "$list"},
[01:27:12] <DanWilson> {$group: {"_id": "$_id", "newlist": {"$addToSet":"$list" }}}
[01:27:12] <DanWilson> ])
[01:27:13] <DanWilson> try that
[01:27:32] <tpayne> ok
[01:30:28] <tpayne> hmm almost
[01:30:37] <tpayne> they aren't separated
[01:31:11] <tpayne> for example for "3" there should be 2 arrays, [["5","6","7","8"],["9"]]
[01:32:19] <DanWilson> oh, you wanted different arrays?
[01:32:25] <tpayne> right
[01:32:42] <DanWilson> what about this?
[01:32:43] <DanWilson> db.tpayne.aggregate([{$group: {"_id": "$id", "stuff": {"$addToSet":"$list" }}}])
[01:33:15] <tpayne> nice!
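For the record, the difference between the two pipelines, using the shape tpayne describes (two docs with id "3" holding lists ["5","6","7","8"] and ["9"]):

    // without $unwind: $addToSet collects the arrays themselves, kept separate
    db.droplets.aggregate([{ $group: { _id: "$id", stuff: { $addToSet: "$list" } } }])
    // -> { _id: "3", stuff: [ ["5","6","7","8"], ["9"] ] }

    // with $unwind first: individual elements are merged into one flat, deduplicated set
    db.droplets.aggregate([
      { $unwind: "$list" },
      { $group: { _id: "$id", stuff: { $addToSet: "$list" } } }
    ])
    // -> { _id: "3", stuff: ["5","6","7","8","9"] } (element order not guaranteed)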
[01:33:34] <tpayne> can stuff be $id?
[01:33:53] <tpayne> hmm guess not
[01:34:00] <DanWilson> you gotta be able to group by something
[01:34:19] <tpayne> is that because my document structure is flawed?
[01:34:21] <tpayne> because i can fix that
[01:35:21] <DanWilson> do you want the keyname to be id?
[01:35:27] <DanWilson> holding the arrays?
[01:35:30] <tpayne> right
[01:35:45] <DanWilson> on
[01:36:01] <DanWilson> that's a piece of piss. It's just a label in the statement: db.tpayne.aggregate([{$group: {"_id": "$id", "id": {"$addToSet":"$list" }}}])
[01:37:10] <tpayne> piece of piss?
[01:37:50] <DanWilson> like, not very hard? No Big Deal?
[01:38:37] <DanWilson> if you look at my statement, the "id" part could be "purplemonkeyzebra"
[01:38:47] <DanWilson> it's like using an alias in SQL
[01:39:04] <tpayne> but it can't be the value? only a string i come up with
[01:39:05] <DanWilson> you can call it whatever you like, just don't collide with anything else in the statement at that level.
[01:39:14] <DanWilson> what do you mean it can't be the value?
[01:39:34] <DanWilson> what, specifically do you want mongo to spit out?
[01:40:05] <tpayne> i'll paste
[01:41:16] <tpayne> DanWilson: http://pastebin.com/VbHQiCJa
[01:45:01] <DanWilson> I'm not sure of how to do that.
[01:46:43] <tpayne> hmm ok, this is really good though, i appreciate it
[01:46:46] <tpayne> i wish i understood it
[02:23:00] <jackh> all, how to run the c++ tests in dbtests directory?
[02:36:02] <Petrochus> does anyone know of a nice python mongodb wrapper that can transform things like `mongo.db.users.save({"_id": x}, {"$push": {"array": "value"}})`into, something like `users(id=x).array.append(value)`? pymongo, mongokit, and mongoengine don't really have such features
[02:36:56] <Petrochus> an easy OOP way to modify documents in place, essentially
[03:20:42] <mscook> Hi - can anyone see something wrong with this query:
[03:20:47] <mscook> {'$and': [{'StrainID': 'S77EC'}, {'Class': 'substitution'}, {'$and': [{'Position': {'$gte': '1'}}, {'Position': {'$lte': '1000000'}}]}]}
[03:21:08] <cheeser> what error do you get?
[03:22:07] <mscook> I don't get anything back...
[03:22:47] <mscook> Position is an integer however...
[03:22:56] <cheeser> start peeling away query params until things start showing up.
[03:23:29] <mscook> lol
[03:23:35] <mscook> Learnt something.
[03:23:41] <mscook> Types are important
[03:23:51] <cheeser> just a bit. :)
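The fix implied by the exchange, assuming Position is stored as an integer: compare against numbers, not strings (the explicit $and wrappers are also redundant for a plain conjunction; the collection name is a placeholder):

    db.variants.find({
      StrainID: 'S77EC',
      Class: 'substitution',
      Position: { $gte: 1, $lte: 1000000 }   // numbers, not the strings '1' and '1000000'
    })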
[03:37:57] <tpayne> how do i write this query in casbah? db.droplets.aggregate([{$group: {"_id": "$id", "id": {"$addToSet":"$list" }}}])
[05:57:35] <LoneSoldier728> hey
[05:57:51] <LoneSoldier728> anyone here
[06:35:12] <ron> LoneSoldier728: never ask that on irc. if you have a question, just ask it and wait until someone responds.
[07:50:59] <trupheenix> anyone here know about MongoLab DB hosting service? I wanted to know if I can create a db with my free shared plan. I logged in and tried doing show dbs but it doesn't show me dbs.
[07:52:01] <[AD]Turbo> yo
[07:57:11] <cpu> My mongo server is entering a "state" I can't understand. 100% CPU, locked all the time. Can someone help me understand the results of db.serverStatus?
[07:59:42] <Garo_> cpu: don't ask if you can ask, just ask your question (that includes the output of your db.serverStatus in this case)
[08:00:07] <cpu> thanks Garo
[08:01:10] <cpu> I've got a machine running mongo on SSD (raid0); mongostat shows:
[08:01:11] <cpu> 9124 *0 *0 *0 0 1|0 0 66g 133g 21.5g 1 crayon:123.6% 0 14|0 0|1 1m 3k 18 07:39:54
[08:02:19] <cpu> in log I find: serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 1460, after globalLock: 1460, after indexCounters: 1460, after locks: 1460, after network: 1460, after opcounters: 1460, after opcountersRepl: 1460, after recordStats: 1580, at end: 1580 }
[08:02:40] <cpu> yesterday everything was running smooth
[08:03:51] <cpu> how can I find the cause for the 100% lock time
[08:04:34] <Nodex> check your mongodb.log
[08:05:35] <cpu> what to search in it
[08:06:28] <cpu> I've got inserts which take forever for no good reason:
[08:06:31] <cpu> Tue Jul 30 07:47:23.836 [conn78] insert crayon.samples_sec ninserted:628 keyUpdates:0 locks(micros) w:328706367 328706ms
[08:07:52] <trupheenix> I'm confused what kind of DB architecture I should use for my application. I'm trying to use MongoLab where I am limited by the number of databases I can have on a single connection. My current code assumes that multiple DBs can be accessed on the same connection. So is it a good idea to shift all my collections into one db and then connect to it?
[08:09:09] <Nodex> cpu, what is the operation directly before that one?
[08:15:26] <trupheenix> Nodex you got any suggestions for me?
[08:16:11] <Nodex> why are you trying to use multiple databases in one app?
[08:17:11] <trupheenix> Nodex to separate the objects and functionality
[08:17:33] <trupheenix> Nodex, I have one data set for authentication another for storing images another for storing user data and so on
[08:19:31] <Nodex> amazing but why are they not collections, why do they have to be databases?
[08:20:47] <trupheenix> Nodex, ok
[08:21:02] <trupheenix> Nodex, what if I want to take out a collection and put it on a dedicated machine?
[08:21:16] <Nodex> and why would you need to do that?
[08:21:30] <trupheenix> Nodex, like images might be hosted on a different set of machines
[08:21:49] <Nodex> why does that make a difference what database it belongs to?
[08:22:07] <trupheenix> Nodex, hmmm
[08:26:13] <trupheenix> Nodex, ok
[08:26:26] <trupheenix> Nodex, so single db and multiple collections it is then
[08:34:34] <cofeineSunshine> hello
[08:34:44] <cofeineSunshine> how do I query document with not null value
[08:34:48] <cofeineSunshine> ?
[08:35:09] <cofeineSunshine> $exist is not what I need
[08:36:24] <ron> null is a value. you'd check for 'not null' the same way you'd check 'not 0'
[08:36:46] <ron> a field with a null value 'exists', so $exists won't help.
[08:37:17] <cofeineSunshine> {"some_id": not null}
[08:37:18] <cofeineSunshine> ?
[08:37:21] <cofeineSunshine> not work
[08:37:51] <ron> cofeineSunshine: http://docs.mongodb.org/manual/reference/operator/
[08:38:00] <ron> http://docs.mongodb.org/manual/reference/operator/not/#op._S_not
[08:40:36] <Nodex> you can also use $ne
[08:41:10] <ron> of course, silly me.
[09:02:51] <cofeineSunshine> thank you
[09:02:52] <cofeineSunshine> got it
[09:02:53] <cofeineSunshine> works
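The $ne form Nodex suggests, spelled out (collection name is a placeholder; this also skips documents missing the field entirely):

    db.things.find({ some_id: { $ne: null } })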
[09:10:07] <puppeh> is there the possibility to do a range query but searching on keys and not on values?
[09:11:47] <puppeh> for ex. I have documents like this: http://pastie.org/private/el4nquexdgla65cyttiqq what I want to do is to query for documents that are between "13_07_06" and "13_07_09" in their "d" fields
[09:12:00] <puppeh> is it possible? or is my schema not suited for this operation?
[09:13:19] <Nodex> you can't search on keys
[09:13:51] <puppeh> ok thx
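A hedged schema note on puppeh's problem: since keys can't be range-searched, the usual fix is to move the date into a value. If each document carried the date as a plain field, the query becomes an ordinary range find (lexicographic order works for zero-padded dates):

    db.coll.find({ d: { $gte: "13_07_06", $lte: "13_07_09" } })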
[09:20:01] <remonvv> \o
[09:20:10] <Nodex> o/
[09:21:47] <Zelest> damn nazis
[09:25:22] <Nodex> what they done now?
[09:26:02] <Zelest> hailing ^
[09:27:04] <Nodex> oh lmfao
[09:27:10] <Zelest> :D
[09:28:52] <_Heisenberg_> Good Morning
[09:29:18] <Zelest> o/
[09:29:21] <Nodex> happy tomorrow
[09:37:16] <Zelest> happy tomorrow?
[09:37:29] <Nodex> I'm trying somehting new :P
[09:38:11] <_Heisenberg_> just recognized that I built bullshit yesterday. adding members of my replica set (1 master 2 slaves) as shards to a sharded cluster makes no sense right? A shard should be a replica set (represented by the actual master), so what I would need to build is something like two or more replica sets which consist of 2 or more replica nodes each, and then add the replica sets each as shards?!
[09:41:58] <Nodex> ran out of Sons of Anarchy :(
[09:50:34] <_Heisenberg_> This would be the way to go right? https://www.lucidchart.com/documents/view/4295-3704-51f789dd-a4c3-65470a00507e (no config servers and multiple mongos modeled)
[09:54:44] <Nodex> really depends what you're trying to achieve
[09:57:18] <_Heisenberg_> Nodex: High Availability, Horizontal Scaling, Performance and Fault Tolerance are the main aspects. I'm building a prototype which makes use of polyglot persistence, MongoDB taking the star role ;)
[09:58:02] <Nodex> I don't know what polyglot persistence is sorry
[09:58:43] <Nodex> is your app going to write to both masters?
[09:58:53] <_Heisenberg_> just means you use several datastores for your application. for example a document store for a product catalog, a kv store for sessions and an rdbms for critical operations where you need transactions
[09:59:59] <_Heisenberg_> Nodex: If you use sharding I assume that queries get routed to both masters, yes
[10:00:16] <_Heisenberg_> depending where the record resides
[10:01:03] <_Heisenberg_> that's the key of horizontal scaling, isn't it? distributing your reads and writes over all available nodes...
[10:01:50] <Nodex> sharding doesn't dupe the data
[10:02:23] <_Heisenberg_> dupe?
[10:02:23] <Nodex> are you thinking that rs0 and rs1 will both have exactly the same data?
[10:02:33] <_Heisenberg_> no, of course not
[10:02:44] <Nodex> that's what your diagram suggests
[10:03:47] <_Heisenberg_> the master/slaves in a replicaset should have the same data, the shards will have different data depending on the sharding key
[10:04:31] <Nodex> in your diagram which are the shards?
[10:04:57] <_Heisenberg_> rs0 and rs1
[10:05:07] <Nodex> right, perhaps don't name them master then :)
[10:05:44] <Nodex> rs0/1-slave = replica sets of each of the shards, meaning each side would have 50% of the data
[10:05:51] <_Heisenberg_> just wanted to show that the master handels the operations for the replica sets
[10:06:39] <Nodex> each side will have 3 copies of 50% of the data - seems rigid enough to me
[10:09:34] <_Heisenberg_> I think a multi-master architecture with replication between them would be much better :/
[10:10:13] <_Heisenberg_> if it comes to availability... of course you run into consistency problems then...
[10:11:09] <Nodex> mongodb doens't support multi master
[10:11:29] <_Heisenberg_> sadly
[10:18:00] <_Heisenberg_> so for a productive system you must guarantee to have the members of a replica set in different data centers to achieve HA, correct?
[10:18:33] <_Heisenberg_> damn new keyboard...
[10:19:21] <_Heisenberg_> because if you have a network partition between your replica sets or a datacenter fails, half of the data would be unavailable
[10:21:21] <Nodex> I don't think you will ever have 100% network partition tolerance
[10:21:45] <Nodex> but replica sets of shards should be in a separate data center to get anywhere near it
[10:24:49] <_Heisenberg_> I don't get it why in this example (http://docs.mongodb.org/manual/core/replica-set-architecture-geographically-distributed/) the Secondary in Data Center 2 should never get primary?
[10:28:19] <Nodex> priority 0
[10:28:37] <Nodex> http://docs.mongodb.org/manual/core/replica-set-priority-0-member/ <---
[10:29:11] <_Heisenberg_> yes but it makes no sense to me. if datacenter 1 goes down the secondary in data center 2 should get primary to ensure availability?!
[10:30:46] <Nodex> not if it's priority 0
[10:30:58] <Nodex> if you want it to then change its priority
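A sketch of the reconfiguration Nodex means, run from the shell against the primary (the member index is hypothetical):

    cfg = rs.conf()
    cfg.members[2].priority = 1   // let the Data Center 2 secondary stand for election
    rs.reconfig(cfg)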
[10:34:01] <_Heisenberg_> okok... thanks for your patience ;)
[10:34:57] <Nodex> no probs
[10:46:54] <Andy80> hi I'm performing this aggregation on a collection that I've: db.campaigns.aggregate({'$group': {'_id': 'budget', 'daily_budget_sum': {'$sum': '$daily_budget'}}})
[10:46:58] <Andy80> actually it works, but...
[10:47:47] <Andy80> I'd like to filter, for example, the campaign using some parameters.... example: campaign_status = 1, account_enabled = 1
[10:47:49] <Andy80> etc...
[10:47:59] <Andy80> where do I insert the "filter" in this collection?
[10:50:27] <Nodex> you need to add them to a $match iirc
[10:50:44] <Nodex> been a while since I wrote any aggregations, bear with me
[10:51:28] <Nodex> http://docs.mongodb.org/manual/tutorial/aggregation-examples/#states-with-populations-over-10-million <--- shows a $match
[10:52:00] <Andy80> Nodex, thank you so much! Going to read it now :)
[10:55:14] <neophy> my mongodb documents look something like this: http://pastebin.com/xXR8DnUh . most of my queries are based on timestamp, hostname and message. Now I want to build an index on timestamp and hostname for an existing collection.
[10:55:51] <neophy> does db.events.ensureIndex( { "timestamp": 1, "hostname": 1 } , {background: true, name: "events_time_host"} ) build the index for timestamp and hostname, right?
[10:56:37] <neophy> I'm planning to build a compound index for timestamp and hostname.
[10:57:09] <Nodex> that will build a compound index for those 2 fields
[10:58:53] <neophy> on the existing collection, right? and I have to do this through the client interface for new inserts, right?
[10:59:40] <Nodex> you do it once from the shell or your driver
[10:59:56] <Nodex> the collection doesn't even have to exist
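For completeness, the one-time shell commands, with a check afterwards:

    db.events.ensureIndex({ timestamp: 1, hostname: 1 }, { background: true, name: "events_time_host" })
    db.events.getIndexes()   // verify the compound index is listed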
[11:01:09] <Andy80> Nodex, from what I'm reading, the $match is more similar to the sql HAVING command. I would need something relative to WHERE instead.
[11:02:26] <Nodex> match will do that also
[11:06:42] <remonvv> ^
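To make the point concrete with Andy80's own pipeline and example filters: a $match placed before the $group acts like WHERE (placed after it, it acts like HAVING):

    db.campaigns.aggregate([
      { $match: { campaign_status: 1, account_enabled: 1 } },                     // WHERE-style filter
      { $group: { _id: 'budget', daily_budget_sum: { $sum: '$daily_budget' } } }
    ])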
[11:12:03] <neophy> Nodex: I have a single-node MongoDB instance which holds the last two months of syslog. The actual syslog is 20GB as a flat file, but MongoDB takes around 600GB. There are lots of dbname.xxx files of 2GB each. Does that make sense?
[11:12:56] <neophy> I need some pointers to learn how MongoDB does this internally...
[11:18:14] <Nodex> neophy : http://docs.mongodb.org/manual/faq/storage/
[11:23:07] <rspijker> neophy: if you want more specifics there was a talk I saw recently. Talking about the storage engine, pretty interesting stuff
[11:23:10] <rspijker> let me see if I can find it
[11:23:37] <rspijker> neophy: http://www.10gen.com/presentations/mongodbs-storage-engine-bit-bit
[11:23:42] <neophy> ok
[11:24:10] <rspijker> And I recently learned that in some scenarios the freelist algorithm is just utter crap :)
[11:25:07] <cpu> OK.. I now know better something I've been researching that is happening to my mongo.
[11:25:29] <cpu> I'm using nodejs as client and I do roughly 10K inserts per second (according to mongostat)
[11:25:43] <cpu> after 30 seconds, the information starts "lagging"
[11:26:08] <cpu> My nodejs service reports that it submitted the documents to mongo, but they appear in the DB only after a few minutes
[11:26:20] <cpu> after a restart, the lag disappears
[11:34:44] <_Heisenberg_> Do I have to call sh.addShard() for each member of a replica-set?
[11:36:58] <kali> no, you add the RS as a whole
[11:37:16] <rspijker> _Heisenberg_: http://docs.mongodb.org/manual/tutorial/deploy-shard-cluster/
[11:40:03] <_Heisenberg__> the docs are a little confusing on this point. because if I called sh.addShard() for each member of my replica set I would end up with 4 shards in my case (2 replica sets with two members each)
[11:41:22] <_Heisenberg__> o.O
[11:42:38] <rspijker> _Heisenberg__: the docs aren't confusing, they say exactly what you need to do
[11:42:42] <rspijker> look at the link I posted
[11:43:12] <rspijker> Add each shard to the cluster using the sh.addShard() method, as shown in the examples below. Issue sh.addShard() separately for each shard. If the shard is a replica set, specify the name of the replica set and specify a member of the set. In production deployments, all shards should be replica sets.
[11:43:27] <rspijker> there is even an example of the exact command you need
[11:45:42] <rspijker> cpu: I'm guessing you are doing the find over a different connection than the insert?
[11:45:51] <_Heisenberg__> If the shard is a replica set, specify the name of the replica set and specify _a_ member of the set -> so, which one? don't I have to specify all members of my replica-set if I want the whole replica-set to be a shard?!
[11:46:17] <rspijker> _Heisenberg__: nope, any member should do. Remember, the members know what other members are in the RS
[11:46:44] <_Heisenberg__> ok, thanks
[11:46:53] <rspijker> back in the day (not that far back, actually). you had to specify all members of the RS
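The exact form, per the docs rspijker quotes (set name and host are placeholders):

    sh.addShard("rs0/mongodb0.example.net:27017")   // one seed member; the set name identifies the whole shard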
[11:47:03] <braoru> hello .. I have a little problem
[11:48:08] <braoru> I have a collection of documents and in each document a set of included objects like ["stat1":"1","stat2":"3"]. actually I can use aggregation to count the number of occurrences of stat2, stat3 etc.. but can I have the sum of all stat2, stat1, etc ?
[11:48:22] <braoru> or do I have to code around the query and go over each document one by one ?
[11:49:13] <braoru> stats : {"stat1":"3", "stat2":"5"}
[11:49:15] <braoru> exactly
[12:42:55] <hroark> so i have voting object like question.options.votes, where options and votes are arrays.
[12:43:00] <hroark> how to atomically clear votes
[12:43:58] <hroark> {$set:{'options.votes':[]}} doesn't work.
[12:47:49] <Nodex> question.options.votes :[]
[12:47:53] <Nodex> + ''
[12:48:19] <Nodex> else you'll have to do this ... $set : {a:{b:{c:[]}}}
[12:48:35] <hroark> ah okay
[12:48:38] <hroark> let's see
[12:48:42] <Nodex> but dot notation should work fine, not sure why it's not
[12:49:09] <hroark> what's the + ''
[12:50:16] <Nodex> 'question.options.votes' :[]
[12:52:23] <hroark> question is actually the collection itself
[12:53:26] <Nodex> can you pastebin a typical document
[12:53:33] <hroark> sure
[12:54:48] <hroark> http://pastebin.com/Res8Ajkz
[12:55:54] <Derick> hroark: it's an array
[12:56:00] <Derick> you need to loop and do them one by one
[12:56:07] <Derick> or do:
[12:56:08] <Nodex> http://pastebin.com/XNVjU9Jw
[12:56:13] <Derick> 'option.0.votes' : [ ]
[12:56:14] <Nodex> works fine
[12:56:15] <Derick> (and :
[12:56:18] <Derick> 'option.1.votes' : [ ]
[12:56:26] <Derick> Nodex: are you sure? :-)
[12:56:38] <Nodex> my pastebin works fine
[12:56:44] <hroark> nodex, that removes the other properties
[12:56:52] <Nodex> you didn't say anything about those
[12:57:00] <hroark> lol
[12:57:04] <Nodex> you have to loop thm as Derick said
[12:57:22] <hroark> right, and then loop the inserts again
[12:57:25] <Nodex> sorry,I left my Crystal ball on my other machine
[12:57:27] <hroark> what a ballache
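A sketch of the loop Derick describes, clearing votes per array index with dot notation (shell syntax; assuming options is a plain array as in the pastebin):

    db.question.find().forEach(function (doc) {
      var sets = {};
      doc.options.forEach(function (opt, i) {
        sets['options.' + i + '.votes'] = [];   // 'options.0.votes', 'options.1.votes', ...
      });
      db.question.update({ _id: doc._id }, { $set: sets });
    });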
[12:57:57] <hroark> okay, so next question, say i do loop through
[12:58:15] <hroark> can i do a mass insert of the array of complete objects?
[12:58:49] <hroark> or do i have to create an async looping function to col.save(obj)
[12:59:11] <Nodex> what is a complete object?
[12:59:26] <hroark> one moment
[12:59:30] <hroark> well, including ObjectId(
[12:59:52] <hroark> AFAIK col.save() doesn't accept arrays
[12:59:58] <hroark> only single objects
[13:00:12] <hroark> so i'll have to loop through multiple saves
[13:01:06] <Nodex> col.save(doc); takes any document, I would be VERY surprised if it didn't allow arrays in the document
[13:01:16] <hroark> i mean, an array of documents
[13:01:21] <Nodex> perhaps you want col.update() ?
[13:01:29] <Nodex> with multi:true ?
[13:01:33] <hroark> one second, let me show you what i'm trying to do...
[13:01:53] <Nodex> if you mean this... col.save([doc1,doc2,doc3...]); then no
[13:02:17] <Nodex> you can build them as an array and loop the array easily enough
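One hedged correction for the record: save() is single-document, but insert() does accept an array, both in the shell and in the node driver, which avoids the manual save loop:

    db.question.insert([doc1, doc2, doc3])   // doc1..doc3 are placeholders for the rebuilt documents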
[13:02:27] <hroark> yeah
[13:02:33] <hroark> seems a massive workaround for something so simple
[13:02:43] <braoru> to clarify my earlier unclear question :) I have records like this: http://paste.fedoraproject.org/28967/37518917/ and I would like to know if it's possible with the aggregation api to get the sum of "motivations" and get something like { "motivationXX":"44", "motivationTT":"66", ..} the words are in French but if needed I can just make it completely English and generic..
[13:04:16] <braoru> I can try to change the schema too .. I think it would be a good idea..
[13:05:18] <hroark> right
[13:05:21] <hroark> http://pastebin.com/9tbGe8rE
[13:05:34] <hroark> all this work just to clear some nested arrays
[13:06:22] <Nodex> yup, scalability doesn't always come easy
[13:06:41] <Nodex> I'd forgotten how ugly coffeescript was too !!
[13:07:51] <hroark> yeah, ugly function
[13:09:33] <hroark> anyways, thanks for the help.
[13:10:25] <Nodex> no probs
[13:24:36] <aandy> having used coffeescript (not anymore, just toyed with it), it's not as bad as people try to make it out to be. everything is context
[13:24:54] <aandy> if you expect it to be a language, it's not. it's sugar and shorthand, use it as such :p
[13:25:09] <hroark> it's just javascript
[13:25:15] <aandy> right
[13:25:17] <hroark> but easier to write imo
[13:25:24] <hroark> fucking curly braces
[13:25:50] <aandy> i deemed it more useful for one of two kinds of people: 1. those who are new to javascript, and 2. those who like the syntax
[13:26:37] <hroark> i think it's easier to read
[13:26:45] <aandy> then you're 2
[13:27:26] <aandy> i write a lot of javascript, and while my cs projects are easy to read, it's not as easy to switch between the two
[13:27:47] <hroark> i agree
[13:27:58] <hroark> it's the worst when they are part of the same project
[13:28:00] <aandy> and coffee has its own quirks, but as per 1. it does help novices become better js writers
[13:28:07] <hroark> like an occasional clientScript.js
[13:28:09] <aandy> ugh, don't even
[13:28:37] <hroark> for x in y pays for itself
[13:29:15] <aandy> true, but the lack of for loops makes you write some big workarounds sometimes
[13:29:24] <hroark> so does checking.for?.nested?.objects == true
[13:29:46] <aandy> anyway, i don't use it. but it did teach me a few things about a complicated `this` nest i had :)
[13:30:15] <aandy> right, but the same can be done with checking.for && checking.for.nest, but i agree when it becomes more nested than that it's ugly and confusing
[13:30:48] <aandy> and if you go so nested, it's usually because you're *trying* to structure random data :p
[13:31:23] <hroark> true
[13:31:24] <aandy> which is yet another problem in itself. at best the checking.for?.nested?... ignores your failed initializers
[13:31:28] <aandy> i mean, in worst case
[13:32:06] <hroark> i use it mostly for checking JSON strings i get from apis
[13:32:12] <hroark> rather than breaking my app
[13:32:22] <aandy> but still, if people want to use coffee, let them. for smaller projects it can work very well
[13:32:42] <hroark> why not large?
[13:33:06] <aandy> the only example i have is github's hubot. it's way more readable, and easier to follow the plugin/interface use than the js implementation would be
[13:33:37] <aandy> i didn't say not large, i said small. it can work for larger projects as well, but i've never tried it, so can't say
[13:33:43] <Nodex> do you guys use less / sass / compass etc ?
[13:33:53] <aandy> no
[13:34:02] <hroark> i use it for some things
[13:34:12] <hroark> in general i prefer vanilla css
[13:34:13] <aandy> right tool for the job (tm)
[13:34:20] <aandy> same
[13:34:33] <hroark> i've actually only really used it for generating CSS as part of an app
[13:34:51] <Nodex> generally I find people who like shortcutting and using these silly tools all tend to stick to whatever makes them type less
[13:35:25] <hroark> saying that, i do like jade quite a lot
[13:35:35] <aandy> Nodex: i feel you for some aspects
[13:35:40] <aandy> slacker coder is bad coder
[13:35:51] <aandy> i use jade as well, where appropriate
[13:36:05] <aandy> which is again, some people use it because this is where it actually makes sense
[13:36:10] <aandy> others use it everywhere
[13:36:12] <Nodex> I try to use vanilla everything with exception to jQuery
[13:36:20] <aandy> that's the distinction i'm trying to make :)
[13:36:34] <Nodex> same with everything aandy
[13:36:41] <hroark> no MVC?
[13:36:43] <aandy> right
[13:36:49] <Nodex> mvc sucks balls
[13:36:54] <hroark> I've started using angular
[13:37:00] <hroark> and it's really, really good.
[13:37:15] <Nodex> I've looked at all these new angular type things, seems a lot of work to do simple stuff
[13:37:16] <hroark> jquery can only get you so far
[13:37:27] <hroark> it is nodex, but it's for larger apps really
[13:37:35] <hroark> it abstracts everything out of the DOM
[13:37:43] <hroark> so you don't have to worry about updating your views
[13:37:47] <Nodex> yeah, and into memory
[13:37:58] <Nodex> boom - performance goes out of the window in a VERY large app
[13:38:23] <Nodex> I'm not really a fan of these one page apps tbh
[13:38:23] <hroark> the VERY large app that wouldn't exist
[13:38:31] <hroark> SPAs are the future bro
[13:38:37] <Nodex> nah
[13:39:15] <hroark> what makes you say that?
[13:39:18] <Nodex> I've not seen any SPA that can do things I can do faster or in a less / more coherent manner
[13:39:35] <hroark> definitely faster
[13:39:44] <hroark> especially for mobile
[13:40:01] <Nodex> I don't agree but each to their own
[13:40:28] <hroark> when you've got a 1s or higher request cycle
[13:40:46] <Nodex> I never have those so it's not really relevant to me
[13:40:58] <hroark> you're not your users though
[13:41:05] <hroark> unless you only build for LANs
[13:41:27] <hroark> some dude on a phone in hong kong is gonna just move to the next one
[13:41:42] <Nodex> I don't care about him, if I did I would put a node on his edge
[13:42:27] <Nodex> using angular will not kill the latency between my server and somewhere far away
[13:42:36] <Nodex> and the latency is my only bottleneck
[13:42:43] <hroark> it's not about the actual latency, it's about the perceived latency
[13:42:55] <hroark> there is minimal perceived latency with SPAs
[13:43:21] <Nodex> because it updates the page in realtime?
[13:43:37] <Nodex> it's not difficult to mimic that in any app, you don't need to be an SPA for that
[13:43:39] <hroark> because it reacts instantly to user input
[13:44:15] <hroark> not just in a fancy way, but with actual content
[13:44:57] <hroark> ajax on steroids
[13:45:04] <Nodex> an SPA cannot kill the actual latency between my app and the server. For example, a chat app between 2 people - one cannot update the other on the other side of the world any faster than actual network latency
[13:45:19] <hroark> i agree
[13:45:23] <hroark> it depends on the app, of course
[13:45:25] <Nodex> then the point is moot
[13:45:33] <hroark> for chat apps, the point is moot
[13:45:45] <hroark> except for user navigation
[13:45:55] <Nodex> what's the point of an app with all that weight and bloat just to interact with yourself?
[13:46:11] <Nodex> it's great if you're just typing hello world to see it instantly in the page
[13:46:25] <hroark> user experience
[13:46:29] <Nodex> (again - something that doesn't require angular or the like to do)
[13:47:08] <hroark> why should i have to wait 500ms to change a page when it could be done in 100ms
[13:47:20] <hroark> just need to tweak my profile
[13:47:20] <Nodex> you don't
[13:47:35] <Nodex> I don't write apps that work like that, I only change what needs to be changed
[13:47:43] <hroark> what do you mean
[13:48:05] <Nodex> I want to tweak my profile so ONLY my profile is changed and reflected in the page
[13:48:20] <Nodex> I don't reload the entire page for that, that is Stupid
[13:48:39] <hroark> i agree
[13:48:48] <hroark> that's the point of using things like angular
[13:49:06] <hroark> oh, but then what if you have user.name in 3 locations
[13:49:08] <Nodex> but I've always written my apps like that, this is nothing new to me, I have been doing it this way for many years
[13:49:10] <saml> how can I set up proxy so that /foo/a/b/c will be http://example.com/bar/a/b/c ?
[13:49:12] <hroark> with just jquery, you have to manually bind each of those views
[13:49:29] <Nodex> saml : wrong chan
[13:49:31] <saml> <Location /foo> ProxyPass http://example.com/bar isn't it
[13:49:33] <saml> oops
[13:50:05] <Nodex> hroark : no I don't but that's my development env, I agree many people would
[13:50:30] <hroark> i think the performance issue is relatively moot anyway
[13:50:47] <hroark> you could have said the same thing about jquery several years ago
[13:50:57] <hroark> things move on
[13:51:53] <Nodex> I don't agree that SPA's are the future, they have a place for those that want to use them sure but they are not the only way to accomplish the things they do and VV with non SPA's
[13:52:07] <chostrander> Hello all! Speaking of performance... I have a mongodb that consists of 525 million records... every day we must process ~40 million records that can either exist (update) or be new (insert).
[13:52:27] <chostrander> right now the performance is very slow... and we don't have shards...
[13:52:41] <Nodex> you should probably shard that asap
[13:52:56] <hroark> time will tell nodex
[13:53:04] <hroark> anyway, laters sir
[13:53:13] <Nodex> laters
[13:53:34] <chostrander> right now processing the 40 million records takes about 14 hours..
[13:54:00] <Nodex> is there a question coming?
[13:54:30] <chostrander> yes... what are some ways to improve speed? will sharding help?
[13:54:50] <Nodex> [14:51:38] <Nodex> you should probably shard that asap
[13:55:51] <chostrander> ok... will look into sharding...
[13:56:26] <chostrander> one more thing... has anyone used TokuMX? Are their claims of tremendous speed improvement for real?
[13:56:56] <chostrander> Thank you Nodex!!!
[13:59:29] <leifw> chostrander: I'm one of the developers of TokuMX, you can ask me questions or come to the #tokutek channel if you like
[14:00:39] <chostrander> Hi leifw! I understand your response... just wanting to peoples experiences! :-)
[14:03:09] <leifw> chostrander: because of concurrent writers, we can't support a particular count() optimization that vanilla mongodb does, but for almost everything else, pretty much everyone we've heard from has seen big performance improvements
[14:04:16] <leifw> chostrander: these things are of course workload dependent, and I think most people that have tried us were already hitting the limits of mongodb's performance, so there's some selection bias
[14:15:14] <Andy80> is there a way to compare the _id against a list of ids? something like {'_id': ['id1', 'id2', 'id3']}
[14:15:48] <leifw> Andy80: I think you want an $in query: http://docs.mongodb.org/manual/reference/operator/in/#op._S_in
[14:16:28] <Andy80> leifw, yeah perfect! thanks :)
[14:42:58] <remonvv> leifw: Out of pure curiosity, can you explain a bit where the performance gains come from that are being claimed here?
[14:46:51] <Nodex> http://blog.mongodb.org/post/56876800071/the-most-popular-pub-names <--- I thought Derick would've written that tbh
[14:47:06] <aandy> Nodex: agreed. that's also why i don't try out that many of the new <fancy libraries>, as i just can't see the use case, and *worse* that it's a lot of work to do less. it's a bad trend :)
[14:47:45] <aandy> it's basically, FORGET EVERYTHING YOU THOUGHT YOU KNEW ABOUT THE WEB, then learn how they think web should be, change all your habits and then it's super easy to use
[14:47:47] <cheeser> Nodex: heh. i was just reading that, too.
[14:49:14] <Nodex> aandy : I agree, too much abstraction going on while waiting for the browser vendors to catch up
[14:49:51] <Nodex> ^^ Pretty nifty app that Ross wrote, kudos
[14:50:58] <cheeser> the pub thing?
[14:52:59] <Nodex> yeh
[14:53:22] <Rozza> thx :)
[14:53:25] <remonvv> I wonder what name would float to the top if you do that for pubs around here.
[14:53:32] <remonvv> Here being Amsterdam.
[14:53:38] <remonvv> "The Unfortunate Itch"
[14:53:45] <Rozza> Sorry I only loaded uk pubs
[14:53:47] <cheeser> real pub name?
[14:53:57] <remonvv> cheeser: Ha, no.
[14:54:14] <remonvv> Although that would be rather brilliant
[14:54:43] <cheeser> it'd be a better brothel name
[14:55:00] <remonvv> No shortage of these here.
[14:55:07] <remonvv> We have something called "The Banana Bar"
[14:55:23] <cheeser> potassium is important.
[14:55:36] <remonvv> Not that odd until I tell you that it's in the red light district and does not sell banana related beverages
[14:55:41] <remonvv> true.
[14:56:29] <remonvv> leifw: You here? Where can I find technical details on TokuMX? I'm looking through the code but it's a little tricky to see what you've changed.
[14:56:59] <Nodex> remonvv : it's on their site iirc
[14:56:59] <cheeser> oh, hai, kinabalu
[14:57:28] <remonvv> NodeX: Really? I've been looking at that and all I can find is something about fractal trees instead of b-trees and some benchmarks.
[14:57:45] <Nodex> yeh, what more do you need ? :P
[14:57:58] <Nodex> it's a new word that sounds cool so it must be fast
[14:58:03] <remonvv> I need to know if it breaks contract with vanilla MongoDB and why it's so much faster.
[14:58:17] <Nodex> fractal stemmed = fract which means smaller which must mean faster
[14:58:50] <Nodex> </end-sarcasm>
[14:58:55] <remonvv> After 15 odd years I've learned speed generally does not come for free.
[14:59:04] <remonvv> With the possible exception of an initially shitty implementation ;)
[14:59:29] <Nodex> I was going to benchmark it but got put off by requiring my details
[15:00:16] <remonvv> Doesn't require any information when I just tried
[15:00:32] <remonvv> Community edition?
[15:00:45] <Nodex> I think so, this was a few months back, perhaps they changed it
[15:00:59] <remonvv> It asks for e-mail but starts the dl anyway
[15:01:28] <remonvv> Hm, seems to focus on insert speed for a lot of the performance improvement claims.
[15:01:37] <Nodex> dang no 32 bit support :(
[15:01:39] <remonvv> That's a rather specific performance issue.
[15:01:50] <remonvv> Seriously?
[15:01:52] <Nodex> Joke
[15:02:01] <Nodex> I want 8 bit support for my NES
[15:02:02] <remonvv> That's reassuring ;)
[15:02:42] <remonvv> Ah Nintendo..I wrote a Gameboy Advance emulator in Java. A healthy mix of fun and masochism.
[15:03:08] <Nodex> :D
[15:03:19] <Nodex> I wrote a hello world app once, it was awesome
[15:03:29] <Nodex> it said "Hello" then right after "world"
[15:04:02] <deepy> I learned about threads by writing my first "world!Hello " app
[15:04:30] <Nodex> hello world apps are kinda stupid because the only person seeing it is the developer and the developer is in no way the entire world
[15:04:42] <Nodex> it should say "Hi there person in front of me"
[15:04:59] <Nodex> just sayin
[15:05:02] <deepy> You're being too literal, the greeting is extended to the world, it's a wide audience, it doesn't mean that everyone has to read it just that everyone is encouraged to :-)
[15:05:25] <Nodex> in your opinion :)
[15:05:54] <deepy> My hello worlds make it into production ;-)
[15:06:16] <Nodex> good for you :
[15:06:17] <Nodex> :)
[15:06:20] <Nodex> remonvv http://www.tokutek.com/resources/technology/
[15:06:25] <Nodex> they have a video
[15:07:00] <remonvv> I noticed.
[15:15:29] <remonvv> Hm. I'll read up on that stuff tonight.
[15:25:09] <Nodex> pretty good video tbh, makes a lot of sense
[15:26:54] <_jo_> I'm seeing this error come up on our staging servers: https://jira.mongodb.org/browse/SERVER-7768 If I pull a more recent Python driver, can that work around the problem?
[15:27:17] <_jo_> Let's assume I can't upgrade the MongoDB version because we're not hosting it in-house.
[15:43:38] <mrlami> does mongoDB have a php driver and if so what are the risks that in 6 months time it'll be invalid and unusable due to changes?
[15:44:41] <cheeser> yes, it does. and it's a pretty safe bet. the php driver author sits two desks down from me. :)
[15:45:35] <remonvv> and he has a bottle of very expensive whiskey
[16:01:57] <leifw> remonvv: sure
[16:02:19] <leifw> remonvv: if you'll be in the boston area tomorrow you can go see my coworker Zardosht give a technical talk about how we do it
[16:03:05] <leifw> remonvv: but basically what we did was we took mongodb, ripped out the storage code (b-trees, file allocation, extents, all that) and replaced it with our storage library that we use in tokudb
[16:03:39] <leifw> remonvv: it's an implementation of a fractal tree, which is a write-optimized data structure
[16:04:37] <leifw> remonvv: http://youtu.be/c-n2LGPpQEw is a description of why they're better than b-trees, we have a bunch of other posts about it if you google around
[16:05:28] <leifw> here's the talk tomorrow in boston if you're interested: http://www.meetup.com/Boston-MongoDB-User-Group/events/127576442/
[16:05:51] <leifw> there's also this: http://www.tokutek.com/2013/07/tokumx-fractal-treer-indexes-what-are-they/
[16:07:32] <leifw> the compatibility issues are: no geo/2d/2dsphere yet, no mixed vanilla/tokumx clusters, otherwise apps shouldn't notice a difference
[16:07:53] <leifw> I'll stop flooding the channel now, come ask in #tokutek if you have specific questions or want a longer lecture
[16:08:07] <leifw> (but we're going to lunch now)
[16:10:22] <remonvv> leifw: Thanks ;) I'll read up on it tonight. The only google hits on Fractal trees are to Toku so it's hard to find additional info that isn't directly from Toku ;)
[16:10:25] <remonvv> Also afk
[16:17:07] <flaper87> What does multi mean here? (.explain()'s output) "cursor" : "BtreeCursor sub multi",
[16:25:00] <flaper87> figured it out. It means there is more than one bound
[18:35:41] <braoru> hello, I have a collection of documents looking like this http://paste.fedoraproject.org/29073/37520921/ and I want to obtain a list of "motivations" containing the sum of each .. any idea how to proceed with aggregation or something else ? (I can change the schema if needed)
[18:36:44] <ninepointsix> Hey, is there a way to force an insert to behave like an upsert? (or have the update command take an array?)
[18:54:07] <braoru> ok I changed motivations to an array and I will chain $unwind ..
[18:58:37] <braoru> not working either :(
[19:03:55] <braoru> sad .. ok, will have to extract each line, bring them back to python and do the sum in code ..
[19:34:24] <braoru> can try again .. no one here to help me with a simple aggregation of array ??
[19:34:58] <DanWilson> what was your question?
[19:35:10] <braoru> I have document like that http://paste.fedoraproject.org/29089/37521283/
[19:35:26] <braoru> and I just want to obtain sum of each member of "motivation"
[19:35:51] <braoru> a final array with id:1 value:55 (if the sum of all id=1 is 55)
[19:36:34] <braoru> I've been trying for 3 days and I'm still going in circles :/
[19:37:36] <DanWilson> hang on
[19:37:39] <DanWilson> lemme import your data
[19:37:52] <braoru> :)
[19:46:58] <braoru> Everything I try, I'm still getting 0 as my sum :(
[19:50:56] <tg2> hey, any idea why batch inserts can only be 16Mb
[19:50:57] <tg2> ?
[19:50:59] <tg2> any way to increase this?
[19:51:11] <cheeser> isn't that the max document size?
[19:51:25] <tg2> yeah but on a batch insert with 1mb documents
[19:51:28] <tg2> you can't put more than 16
[19:51:34] <tg2> likewise with 512kb docs, you can't put more than 32
[19:51:45] <cheeser> right. so you do it in multiple batches
[19:51:59] <tg2> Where is this limitation
[19:52:06] <tg2> and is it configurable?
[19:52:08] <cheeser> in the kernel
[19:52:11] <cheeser> i don't think so.
[19:52:46] <tg2> what kernel limitation is it hitting
[19:53:08] <ron> the mongodb kernel limitation of 16Mb.
[19:53:12] <tg2> k
[19:53:56] <braoru> DanWilson, can it come from the fact that value and id are strings ?
[19:55:00] <tg2> on a server with 256G of ram, seems a bit of a waste to limit batch inserts to 16MB total seeing as single document size limitation is 16Mb
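A sketch of the multiple-batches workaround cheeser suggests, in shell JavaScript (the batch size is a placeholder chosen to keep each message under the 16MB cap):

    var batch = [];
    docs.forEach(function (doc) {            // docs: the full set to insert (assumed)
      batch.push(doc);
      if (batch.length === 10) {             // ~10 x 1MB docs stays well under 16MB
        db.coll.insert(batch);
        batch = [];
      }
    });
    if (batch.length > 0) db.coll.insert(batch);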
[19:55:23] <DanWilson> braoru: that's because you are trying to sum a string
[19:55:35] <DanWilson> you need to convert the data you want to sum into a number type
[19:55:41] <DanWilson> once you have that,you can use this query:
[19:55:52] <DanWilson> db.braoru.aggregate([
[19:55:52] <DanWilson> {$project: {"_id": 0, "motivations": "$motivations"}},
[19:55:52] <DanWilson> {$unwind: "$motivations"},
[19:55:52] <DanWilson> {$project: {"_id": "$motivations.id", "value":"$motivations.value"}},
[19:55:52] <DanWilson> {$group: {"_id": 1, "summedvalue": {"$sum": "$value"}}}
[19:55:52] <DanWilson> ])
[19:57:10] <braoru> DanWilson, thx .. I will try to find why my insert uses string instead of int..
[20:00:04] <braoru> DanWilson, do you know how to force the type at insert ?
[20:03:30] <braoru> found..
[20:04:55] <braoru> DanWilson, thx a lot now I just need to extend that to be able to obtain the sum by id :)
[20:06:01] <braoru> DanWilson, any tips to help me to search in the right way ?
[20:06:52] <ron> DanWilson: and for future reference, avoid pasting more than 2 lines in irc channels. that's why paste sites were invented.
[20:09:29] <braoru> found :)
[20:11:16] <braoru> DanWilson, arf no I was wrong :(
[20:12:50] <tjmehta> How does mongo consider subdocuments distinct - will it compare values of keys or does distinct only work for fields that hold a literal value?
[20:13:08] <tjmehta> (db.collection.distinct)
[20:13:09] <braoru> DanWilson, any idea on how I can get the sum by id ? instead of the global one ?
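That follow-up never gets answered in-channel; a hedged guess at the per-id version, grouping on the unwound id rather than the constant 1 (the values must already be numeric, as established above):

    db.braoru.aggregate([
      { $unwind: "$motivations" },
      { $group: { _id: "$motivations.id", summedvalue: { $sum: "$motivations.value" } } }
    ])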
[20:19:31] <cheeser> tjmehta: i believe it compares the entire document
[20:21:05] <tjmehta> cheeser: field by field, so what about indexing fields that are subdocuments -- is it sufficient to index the key of the document or should each sub key be indexed?
[20:21:11] <DanWilson> braoru: did you figure out how to reinsert the document with the right types?
[20:21:29] <braoru> DanWilson, yes
[20:21:38] <braoru> DanWilson, and your request sent back a global sum :)
[20:21:46] <cheeser> tjmehta: you should test it but I would think if you're querying by subkeys you should index those explicitly.
[20:22:01] <cheeser> since there are no implicit indexes on subdocuments
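Concretely, the explicit index cheeser means uses dot notation (names are placeholders):

    db.coll.ensureIndex({ "subdoc.field": 1 })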
[20:22:35] <braoru> DanWilson, I tried to use _id : _id in the last group
[20:24:29] <tjmehta> cheeser: okay so last question if I do a distinct query for subdocuments and I have indexed the key for the subdocument (but not the children) - is that enough indexing for the distinct query if it is doing a field by field comparison?
[20:24:44] <DanWilson> braoru: http://paste.fedoraproject.org/29101/13752158/
[20:24:58] <DanWilson> that one sends me back each id and a 0, because of the type difference
[20:25:55] <braoru> DanWilson, if I change the type I get number
[20:26:12] <tjmehta> cheeser: got to run thanks ahead - still waiting on your last response though.
[20:27:31] <DanWilson> braoru: but you only get a single record returned?
[20:27:51] <braoru> no I get multiple records but the data seems to be wrong .. I have to verify my insert
[22:05:34] <Industrial> Say I have a SQLite database with 500MB of records in 1 table with a date column with a unix timestamp (number) in it. I want to search through this data set by start/end time. Currently takes 20-30 sec. How would mongodb do a better job at this?
[22:12:48] <Goopyo> Industrial: if it's in memory it should be fairly quick
[22:13:23] <Goopyo> MongoDB IDs also have a built-in timestamp that's queryable
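A sketch of the ObjectId trick Goopyo mentions: the first four bytes of a default _id are a creation timestamp, so a range over _ids is a range over insert time (shell syntax; the dates are placeholders):

    function oidForDate(d) {   // build a boundary ObjectId from a Date
      return ObjectId(Math.floor(d.getTime() / 1000).toString(16) + "0000000000000000");
    }
    db.coll.find({ _id: { $gte: oidForDate(new Date("2013-07-01")),
                          $lt:  oidForDate(new Date("2013-07-30")) } })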
[23:45:09] <EGreg> hey guys
[23:45:18] <EGreg> how do I do the equivalent of SELECT * FROM foo WHERE a IN (1, 4, 5, 6)
[23:45:26] <EGreg> in node.js
[23:45:29] <EGreg> with mongodb