[00:10:10] <diegoaguilar> a schema represents a collection in code
[00:10:18] <diegoaguilar> a model represents one single document
[00:10:35] <diegoaguilar> as u can imagine, they're closely related
[00:10:49] <diegoaguilar> sure Boomtime, wait I'll prepare a gist
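A minimal sketch of the schema/model split being described, assuming Mongoose; the schema and field names are illustrative, not the contents of the actual gist:

```js
var mongoose = require('mongoose');

// The schema describes the shape of every document in a collection.
var lineupSchema = new mongoose.Schema({
  team: String,
  round: Number,
  players: [String]
});

// The model compiled from the schema is what you instantiate
// for one single document.
var Lineup = mongoose.model('Lineup', lineupSchema);

var doc = new Lineup({ team: 'Tigres', round: 1, players: ['A', 'B'] });
doc.save(function (err) {
  if (err) console.error(err);
});
```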
[00:11:41] <EllisTAA> diegoaguilar: for some reason this https://github.com/ellismarte/ellis.io/blob/master/server.js#L71-L79 is giving me this error ngel.co
[00:16:07] <diegoaguilar> I will correct the example, something is missing
[00:17:22] <diegoaguilar> Boomtime, this is good example http://kopy.io/WBSyS
[00:18:10] <diegoaguilar> I want to know how many teams got a doc with round 2 and empty players AND a doc with round 1 (same team) that is not empty
[00:18:26] <diegoaguilar> so the result for the example I gave u: http://kopy.io/WBSyS
[00:19:20] <diegoaguilar> EllisTAA, that is not enough to debug
[00:19:31] <diegoaguilar> the "is not defined" means that somehow the export was not well done
[00:19:43] <diegoaguilar> I probably need to see the server.js file
[00:20:21] <EllisTAA> diegoaguilar: here is the server js file https://github.com/ellismarte/ellis.io/blob/master/server.js#L71-L79
[00:20:33] <EllisTAA> and here is the models file https://github.com/ellismarte/ellis.io/blob/master/models/models.js
[00:22:43] <Boomtime> diegoaguilar: your stated query isn't possible to get that result - "how many lineups of same team" <- are you querying for a specific team?
[00:25:40] <Boomtime> well, i can see a way to do it with aggregation, but the schema kind of fights against you for this information (and it would not be fast) - i don't see how this is a good schema for you - what benefit does this schema have?
[00:26:18] <diegoaguilar> hmm, sorry to answer with a question, but why is the schema not good for my problem?
[00:27:55] <diegoaguilar> it's just a tournament, I record each lineup for each team
[00:28:05] <diegoaguilar> the lineups are distinct as there's one single lineup per round
[00:28:24] <diegoaguilar> inside matchDate I keep more information, like the date of the match
[00:28:38] <diegoaguilar> I just need to know how to use the aggregation
[00:28:49] <diegoaguilar> I read the docs, but as it's my first time using aggregations I got so confused
[00:29:01] <Boomtime> you should design a schema to hold the information you need
[00:29:21] <Boomtime> at the moment, you've clearly spread the information you need across documents - why?
[00:29:40] <diegoaguilar> trust me, this is going on production now and it was not designed by me
[00:29:50] <diegoaguilar> I need to obtain the information
[00:30:01] <Boomtime> then you need to learn aggregation
[00:30:26] <diegoaguilar> could u help me this time, and then I'd stick to the aggregation docs
[00:32:28] <Boomtime> i can see a way to do it conceptually, but i'd have to run a bunch of tests to make it work - so, no... i can provide some pointers and suggestions on specific aspects of aggregation to learn though
[00:33:37] <diegoaguilar> any suggestion is welcome
[00:33:45] <diegoaguilar> and thanks for the previous advice
[00:34:35] <Boomtime> first, you need to grasp the basic concept of aggregation: it's a stage pipeline - think of a simple query with a sort as a pipeline of a "query" stage followed by a "sort" stage - in aggregation these stages are literal
[00:34:56] <Boomtime> i.e. that example would be a $match stage then a $sort stage
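A minimal sketch of that two-stage pipeline in the mongo shell, with illustrative collection and field names:

```js
// an ordinary "query then sort" written as literal pipeline stages
db.lineups.aggregate([
  { $match: { team: "Tigres" } },  // the "query" stage
  { $sort: { round: 1 } }          // the "sort" stage
])
```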
[00:37:17] <Boomtime> $group is very powerful, it can group disparate documents based on key fields - such as "team" - and let you construct output documents that utilize specific fields of the input documents
[00:37:42] <Boomtime> you should go and play around with $group - use a pipeline that consists only of that - see what you can get it to do
[00:37:50] <diegoaguilar> poor EllisTAA I just checked his case :P
[00:38:14] <Boomtime> try grouping on your team field, then try to get your output documents to contain a count of players for example for every team
[00:38:39] <Boomtime> there are examples to try here: http://docs.mongodb.org/manual/reference/operator/aggregation/group/
[00:42:04] <Boomtime> the $group operator is almost entirely what you need - but you'll want to remember that $match exists whenever you need to cull out the results you don't want
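A sketch of the exercise Boomtime describes, assuming every doc carries a "team" field and a "players" array:

```js
db.lineups.aggregate([
  { $group: {
      _id: "$team",                                // one output doc per team
      lineups: { $sum: 1 },                        // how many lineup docs per team
      playerCount: { $sum: { $size: "$players" } } // total players across those lineups
  } }
])
```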
[00:42:53] <Boomtime> the suggestion i have is to get a mongo shell, and start playing around with $group, nothing beats seeing stuff happen to learn something new
[00:44:06] <diegoaguilar> sure, hey I'm trying it now
[00:44:19] <diegoaguilar> how can I tell the aggregation "please include this field in the output"
[00:46:42] <joannac> I don't know, figure out what's the right thing for your use case?
[00:47:42] <diegoaguilar> I posted more details on here http://stackoverflow.com/questions/31954171/how-to-obtain-difference-between-two-queries-in-mongodb
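On the 00:44 question, two common ways to carry a field into the output, again with illustrative names: $project passes fields through between stages, and inside $group an accumulator such as $first keeps one value per group:

```js
db.lineups.aggregate([
  { $project: { team: 1, round: 1, players: 1 } },          // pass these fields along
  { $group: { _id: "$team", round: { $first: "$round" } } } // keep one round value per team
])
```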
[01:32:06] <poincare101> Hi.. I'm trying to understand how NoSQL models relationships between data. If I have a bunch of documents called "Artists" and each of these can have a bunch of albums, what's the best way to model this sort of relationship?
[01:32:15] <poincare101> Do I just stick the album documents inside the artists?
[01:32:29] <poincare101> What if several artists own one album?
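The usual answer for that many-to-many case is references rather than embedding; a sketch with assumed collection and field names:

```js
// one doc per artist
db.artists.insert({ _id: 1, name: "Miles Davis" })
db.artists.insert({ _id: 2, name: "Bill Evans" })

// each album references every artist that owns it
db.albums.insert({ title: "Kind of Blue", artistIds: [1, 2] })

// all albums for a given artist
db.albums.find({ artistIds: 1 })
```

Embedding albums inside artists works while ownership is strictly one-to-many; once an album can belong to several artists, references avoid duplicating it.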
[05:00:02] <Trel> I'm not 100% sure how to read bug trackers - is that going to be in the next version, is it more of a wishlist item, or where does it sit on the Next |-----------------| wishlist scale?
[07:01:54] <leptone> I'm having trouble writing to my db. can anyone tell me what's going on? https://github.com/leptone/myGit/blob/master/myPhraser.js
[07:14:36] <diegoaguilar> leptone, the stack trace you referred to
[07:14:44] <diegoaguilar> includes files I DONT have in my view
[07:14:53] <diegoaguilar> u only shared myPhraser.js with me
[07:15:00] <diegoaguilar> not even dynamics_node.js
[07:15:45] <diegoaguilar> AND the reference to packages/mongo/mongo_driver.js:313:1 means either that a mongo js module dependency has code that isn't ready for the bad condition u coded, so it fails
[07:15:53] <diegoaguilar> OR I need to see /dynamics_nodejs.js
[07:17:36] <leptone> diegoaguilar, sorry, what file do you need me to include?
[07:18:45] <leptone> diegoaguilar, where is dynamics_node.js?
[07:19:12] <diegoaguilar> well, it's in ur error stacktrace
[07:19:29] <diegoaguilar> u can see where it fails in myPhraser.js
[09:18:22] <leptone> but it's not writing to the db
[09:49:16] <leptone> can someone please give me a hand? i have no idea where this error came from or what to make of it: I20150812-02:45:37.850(-7)? { [Error: connect ECONNREFUSED] stack: [Getter] }
[12:14:01] <deathanchor> but don't try to change db shards or collections shards/settings while one is down, those commands will fail.
[12:14:20] <aps> I don't know why mongo docs are not clear about backup strategies for huge mongo clusters :/ I can't just stop my prod instances and take a backup if they have TBs of data
[12:14:30] <deathanchor> but the config server is so small of a DB, you should just do a mongodump from the 27019 port (default port)
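Roughly what that dump looks like; the host name and output path are illustrative:

```sh
# dump the small config database straight from a config server
mongodump --host cfg1.example.com --port 27019 --out /backups/configdump
```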
[12:14:47] <d-snp> I have a question: we have a system that automatically creates collections and shards them over the cluster, and the sharding operation takes between 30 and 60 seconds, is that normal?
[12:16:40] <d-snp> we have set initial chunks to 6, could it be that allocating chunks is expensive?
[12:18:21] <deathanchor> d-snp: yeah, I believe that could be normal, but cheeser might know better
[12:23:54] <aps> deathanchor: we can't take mongodump without stopping mongo, right?
[12:24:57] <deathanchor> it's how I back up my config server
[12:25:35] <aps> deathanchor: what about mongod servers?
[12:25:53] <deathanchor> oh.. that is a much bigger discussion
[12:27:00] <deathanchor> some servers I stop secondary mongod and do a mongodump using --dbpath (faster than through mongod), others I stop mongod and take aws snapshot
[12:29:10] <aps> deathanchor: I can take an aws snapshot by doing db.fsyncLock() as well, right?
[12:29:26] <deathanchor> I don't know much about that
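A sketch of the fsyncLock-plus-snapshot approach aps is asking about; whether the lock is strictly required depends on the storage engine and on whether the journal lives on the snapshotted volume:

```js
// on the member being backed up: flush pending writes and block new ones
db.fsyncLock()

// ... take the EBS/LVM filesystem snapshot here, outside the shell ...

// resume writes
db.fsyncUnlock()
```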
[12:29:58] <cheeser> i think the official answer is to use mms/cloud backup
[12:30:14] <d-snp> at the moment our daily create collections and shard process takes over an hour, we fear it's not scaling
[12:30:29] <joannac> what? mongodump is NOT a snapshot in time
[12:30:35] <d-snp> oh hi cheeser :) do you know if it's normal that sharding a collection over 6 shards takes between 30 and 60 secs?
[12:31:21] <d-snp> we're thinking maybe we should increase the chunk size to 512mb and lower num initial chunks, to reduce the load a little
[12:31:21] <joannac> d-snp: empty? how many chunks?
[12:38:09] <joannac> d-snp: I'm not sure where you think the inefficiency is. the splits have to happen. most of the time of the moves are writing metadata
[12:38:22] <joannac> since you said the collection is empty
[12:39:10] <aps> Is it possible to backup specific collections using MMS?
[12:39:49] <joannac> just exclude the ones you don't want backed up
[12:39:51] <d-snp> yeah, but apparently the cost is quite significant, 60 seconds means our daily creation takes an hour, and that's just with ~30 customers
[12:40:07] <joannac> you create collections every day?
[12:41:14] <d-snp> well, before this scheme deletion took ages, we had huge indexes too
[12:41:27] <d-snp> though with wired tiger the indexes seem to have gotten way smaller, not sure why
[12:41:36] <deathanchor> I have daily rotating collections, but we don't need more than a week's worth, so export, cold store and purge things older than a week
[12:42:03] <deathanchor> I don't use sharding for that
[12:42:29] <deathanchor> my sharded collection is the big data store of userdata
[12:43:59] <d-snp> our bigger customers do over 14k inserts per minute, I don't know what your solution would be, but .remove was most certainly too slow
[12:44:03] <jamiel> Hi all, we are doing a massive amount of updates due to an application upgrade (300m+ records) and it appears our oplog is now 24GB - we need this space for the migration as we are changing array fields and the db is fragmenting ... is there a way to reclaim this space?
[12:45:42] <joannac> jamiel: resize the oplog when you're done?
[12:45:57] <aps> joannac: Since we haven't taken backup using MMS till now, estimated cost shown to me is ~ $3000/month for one replica set. Will this cost be lower for subsequent months since initial backup would've been taken?
[12:53:12] <aps> that is just for one RS. there are 2 others as well
[12:54:08] <deathanchor> I have 8 RS, biggest one is 80GB for me. but then again, my tokumx ones are super compressed
[13:07:05] <d-snp> joannac: when I said removing wasn't fast enough for us did you still think a collection per day is a crazy idea?
[13:48:33] <jamiel> aps: A delayed secondary replica member offsite and a daily mongodump to a cheap SATA drive off peak, reading from secondaries, should be sufficient for most use cases .. if you have the skills you will still be able to do point-in-time recovery from your secondary or replicas. We've also had to avoid MMS for backup for any DB > a few GBs as it simply doesn't
[13:48:33] <jamiel> make commercial sense ... backup costs can spiral to as much as the deployment, I imagine behind the scenes Mongo Inc are struggling to store things cheaply and still offer all the features of their service
[13:51:39] <jamiel> For a regular DB 10-50GB I find it excellent, though
[13:51:46] <deathanchor> jamiel: agreed. Cheaper for us to do exactly what you stated.
[14:28:34] <poincare101> Hi. If I have a document that looks like this: http://pastie.org/10346396, is there a way for me to update the "votes" on one of the list elements in a single query (i.e. not having to query first for the document, search for the right personID, update it and then push the update to Mongo)?
[14:42:25] <deathanchor> I think it is: .update({ "persons.personID" : "2" }, { $inc : { "persons.$.votes" : 1 } })
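That suggestion spelled out as a runnable shell command, with an assumed collection name; the positional $ operator updates only the array element the query matched:

```js
db.polls.update(
  { "persons.personID": "2" },        // find the doc via the matching array element
  { $inc: { "persons.$.votes": 1 } }  // increment votes on that element only
)
```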
[14:44:49] <diegoaguilar> Hello, could someone help me with this index (probably) related case? http://stackoverflow.com/questions/31960358
[15:23:22] <Haris> I have this article bookmarked ( http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis ), but it is old now. Is there a more recent comparison between nosql and other solutions for huge data?
[15:23:59] <Haris> that compares mongo with cassandra
[15:24:48] <Haris> also, does mongo support Thrift
[15:26:33] <Haris> with cassandra, I know when new nodes are added to the cluster, the retrieve performance increases. how is that in mongo
[15:26:51] <Haris> Correction: how is that or how does that work in mongo
[15:27:07] <mkjgore> hey folks, I was just curious, I'm looking through a sharded mongo setup that upgraded from mongo 2.x to 3.0.3 which is having some issues
[15:27:20] <mkjgore> and I'm noticing lock files on several configuration servers
[15:27:43] <mkjgore> I thought that the locks were located in the internal collection
[15:27:54] <Mez> Hi all, wondering if anyone can help me improve my indexes for a query.
[15:27:56] <mkjgore> (IE in db, not a file on a server)
[15:40:39] <Derick> Mez: i don't really know. Only thing I can think of is *one* index where you have the "to" bit first, as that seems to have the highest cardinality
[15:41:22] <Mez> so, remove the dual indexes and add a combined one with "to" first?
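What that would look like; only "to" comes from the discussion, the other field names are assumptions:

```js
// one compound index, highest-cardinality field first
db.messages.createIndex({ to: 1, from: 1, sent: -1 })
```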
[16:11:25] <iaotls> has anyone had success using mongo 2.6 on armhf?
[16:30:01] <jpfarias> hey guys, is it possible to cancel a movePrimary operation?
[17:15:43] <zcserei> hello! how can I make sure my User model (I'm using mongoose) only saves when the value of role (type is String) is either "user", "provider" or "admin"?
[17:43:55] <StephenLynx> I suggest you don't use mongoose.
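For the 17:15 question itself, Mongoose's built-in enum validator does exactly this; the schema and model names are illustrative:

```js
var userSchema = new mongoose.Schema({
  role: {
    type: String,
    enum: ['user', 'provider', 'admin']  // any other value fails validation on save
  }
});
var User = mongoose.model('User', userSchema);
```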
[18:07:08] <deathanchor> in a js script to use with the mongo commandline, is there a way to tell it to only use a secondary?
[18:12:08] <deathanchor> nevermind, can do it via the URI of the connection
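Besides the connection URI, the shell also lets the script itself pin reads to a secondary; the collection name here is illustrative:

```js
// inside the js script run by the mongo shell
db.getMongo().setReadPref('secondary');
db.mycollection.find().forEach(printjson);  // served by a secondary
```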
[18:44:53] <albertom> well here is the bug, it crashes awfully
[18:49:09] <albertom> The same environment on mongo 2.4 doesn't fail
[18:49:30] <GothAlice> Null pointer reference after an assertion failure; that null pointer is likely the collection name.
[18:49:43] <albertom> I was thinking of bugs in ceilometer but... even if the queries are not well formed, the mongo server shouldn't die that badly
[18:50:01] <GothAlice> For performance reasons, MongoDB doesn't heavily audit the structures.
[18:50:23] <GothAlice> (This lets people occasionally do bad things accidentally, like including "." in field names, or having a collection named "[object Object]".)
[18:51:18] <GothAlice> What's particularly strange is that the query planner seems to know what it's talking about (i.e. it knows which collection is being queried, and can figure out indexes on it).
[18:51:45] <GothAlice> albertom: You've tried 3.0 and 2.4, have you tried 2.6?
[18:52:21] <GothAlice> It would be greatly helpful to "bisect" the versions until you can pinpoint which build added the bug.
[18:52:46] <saml> how can I sort by a string field's length?
[18:53:02] <saml> or find the doc with longest field
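There's no direct way in a plain find(); on newer servers (3.4+) the $strLenCP aggregation operator answers both questions. Collection and field names are illustrative:

```js
db.docs.aggregate([
  { $project: { field: 1, len: { $strLenCP: "$field" } } },  // compute each string's length
  { $sort: { len: -1 } },                                    // longest first
  { $limit: 1 }                                              // the doc with the longest field
])
```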
[18:53:05] <albertom> let me see if centos has 2.6. it normally has newer packages than ubuntu
[18:54:26] <GothAlice> albertom: The installation instructions for MongoDB pretty much universally have you add their own package source, rather than relying on the upstream distro package sets.
[19:16:39] <GothAlice> albertom: I'd add a comment to the ticket mentioning the additional versions tested.
[19:21:16] <deathanchor> default oplog size says 5% of free space or 1GB (whichever is greater), but is it continually changing or set only upon initial setup or every mongod restart?
[19:36:37] <GothAlice> deathanchor: Because the oplog is a capped collection, it's preallocated once during first startup.
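Two quick ways to inspect what was allocated on a member:

```js
rs.printReplicationInfo()                          // configured oplog size and time window
db.getSiblingDB("local").oplog.rs.stats().maxSize  // capped-collection size in bytes
```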
[19:36:52] <deathanchor> yeah I figured that out after I read the doc a few times
[19:50:57] <d4rklit3> I have a collection of documents: {approved: bool, featured: bool}. I am wondering if I am able to query 20 approved:true in random order. So if there are 120, I want a random 20 of those. On top of that I also want all the ones that are featured, without a limit. So 20 approved + (n) featured.
[19:51:17] <d4rklit3> is this possible in one query
[19:51:29] <d4rklit3> without using any logic on the server ?
[19:59:17] <deathanchor> d4rklit3: looks like two queries
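A sketch of the two-query approach; $sample requires MongoDB 3.2+, and the collection name is illustrative:

```js
// a random 20 of the approved docs
var random20 = db.items.aggregate([
  { $match: { approved: true } },
  { $sample: { size: 20 } }
]).toArray();

// plus every featured doc, no limit
var featured = db.items.find({ featured: true }).toArray();
```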