[01:02:31] <titosantana> that's the result I'm getting
[01:03:04] <titosantana> so the articles aren't being set correctly
[01:03:16] <skot> your input doc doesn't seem to have players… http://pastebin.com/qxC2KFNe and your reduce function refers to things incorrectly, I think.
[01:03:34] <titosantana> sorry i grabbed a doc without one
[01:14:24] <titosantana> ok so that makes sense, it's iterating through the articles array and dumping it to the return object
[01:15:15] <titosantana> so is it best to do all the dupe checks in finalize?
[01:16:57] <titosantana> so in the finalize method you said it wasn't possible to filter out results, right?
[01:17:13] <titosantana> I'd basically need to do that after retrieving the results?
[02:13:34] <titosantana> when you save the results from map-reduce
[02:13:45] <titosantana> is there a way to remove the _id / value wrapper?
[02:14:03] <titosantana> I can't seem to sort on the map-reduce collection
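For context: mapReduce output documents always have the { _id, value } shape, and that wrapper can't be stripped by mapReduce itself; duplicate filtering belongs in reduce (finalize only reshapes each result), and sorting works when the key uses a value.* path. A rough sketch of the pattern under discussion, with made-up collection and field names:

```javascript
// hypothetical: collect each author's articles, deduping inside reduce
db.posts.mapReduce(
  function () { emit(this.author, { articles: [this.article_id] }); },
  function (key, values) {
    var seen = {}, merged = [];
    values.forEach(function (v) {
      v.articles.forEach(function (a) {
        if (!seen[a]) { seen[a] = true; merged.push(a); }  // dupe check lives here
      });
    });
    return { articles: merged };
  },
  {
    out: "author_articles",
    finalize: function (key, reduced) {
      reduced.num_articles = reduced.articles.length;  // precompute a sortable count
      return reduced;
    }
  }
);
// output docs look like { _id: <author>, value: { articles: [...], num_articles: N } }
db.author_articles.find().sort({ "value.num_articles": -1 });
```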
[03:30:47] <xxtjaxx> Could you make something like an event emitter using mongodb so if a document in a collection is changed an event with a ref to the collection / document could be emitted?
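There's no built-in event emitter, but the usual workaround is a tailable cursor on the replica-set oplog, reacting to entries for the namespace you care about. A minimal shell sketch, assuming a replica set is running and "mydb.mycoll" stands in for the real namespace:

```javascript
// tail the oplog and "emit" on every change to one namespace (placeholder: mydb.mycoll)
var oplog = db.getSiblingDB("local").oplog.rs;
var cursor = oplog.find({ ns: "mydb.mycoll" })
                  .addOption(DBQuery.Option.tailable)
                  .addOption(DBQuery.Option.awaitData);
while (cursor.hasNext()) {
  var entry = cursor.next();
  // entry.op is i/u/d (insert/update/delete); entry.o carries the document or change
  print("change: " + entry.op + " on " + entry.ns);
}
```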
[04:23:22] <ossipov> Do you use transactions in MongoDB?
[05:17:25] <asdfqwer> mongodb ruby driver questions ok here?
[05:52:46] <xxtjaxx> ossipov: Was that the question for me?
[06:06:03] <ossipov> xxtjaxx: that question was for the community :) now we're in the process of choosing a database for our project. We wanted to have nodejs+express+mongodb… but a friend from a big corporation suggested using oracle or mssql as a more stable database that has transactions...
[06:27:54] <ossipov> Is there any specific time when mongodb community is active and supportive? :)
[07:28:01] <kizzx2> if a replica's lag exceeds "log length start to end"
[07:28:08] <kizzx2> does that mean there is no choice but to do a full resync?
[07:30:01] <rspijker> if I understand you correctly, that means that your secondary has gone stale. That is, it's in a state where it can't 'catch up' to the primary anymore, because the oplog data doesn't cover the entire timespan between the state of the primary and secondary.
[07:30:06] <rspijker> Then yeah, you have to resync
[08:04:11] <rspijker> kizzx2: you supposed correctly
[08:04:23] <rspijker> your primary is the one that matters. It shows it has 26 hours of oplog
[08:04:58] <rspijker> which means any secondary can run that amount behind (be down for that amount of time) and still recover by replaying the primary oplog
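The window rspijker describes can be read straight off the shell; the first helper reports the oplog size and its "log length start to end", the second shows each secondary's lag:

```javascript
// on the primary: oplog size and the time span it covers
db.printReplicationInfo();
// on the primary: per-secondary lag ("syncedTo ... secs ago")
db.printSlaveReplicationInfo();
```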
[08:05:08] <xxtjaxx> ossipov: Okay. Well, I personally don't like express since it's not opinionated enough about structure, which means making a big mess without guidelines is easier. MSSQL is kind of wrong if you are running in a *nix env., and I highly doubt Sybase connectors work that well anymore (thinking 2012 SQLSrv.). Depending on the size of the project and predicted growth, mongodb is fine.
[08:06:34] <kizzx2> …so now I'm looping db.printSlaveReplicationInfo() on the primary, it shows that B and C's "secs ago" figure keeps increasing
[08:06:34] <xxtjaxx> ossipov: Transactions aren't necessarily what mongodb does. You have a cursor you talk to that inserts something similar to JavaScript objects or JSON data into the database. This means you have maps/objects nested as well in your documents inside the collection. Please read the docs on how that works.
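To make the nested-document point concrete, a minimal insert (all names are illustrative):

```javascript
// documents nest objects and arrays directly, no join tables involved
db.users.insert({
  name: "alice",
  address: { city: "Berlin", zip: "10115" },  // embedded object
  roles: ["admin", "beta"]                    // embedded array
});
```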
[08:07:16] <kizzx2> does that mean it's bugged and I can discard it…?
[08:07:17] <xxtjaxx> rspijker: Correct me if I'm totally off the point. And if you know better.
[08:08:28] <kizzx2> rspijker: I'm seeing this log on replica D… https://gist.github.com/kizzx2/99bb3f728107f3098b26, does that look suspicious?
[08:08:33] <rspijker> kizzx2: those figures should increase, but at some point decrease again as well, to 0...
[08:09:29] <kizzx2> rspijker: well, i meant replica C (the most lagging replica), anyways
[08:11:24] <ossipov> xxtjaxx: do you know how stable mongodb is now compared to mongodb 1.8? AFAIK it crashed with data loss...
[08:14:16] <rspijker> ossipov: mongodb is very stable now. Using it in all kinds of situations (production as well) and haven't seen any problems so far
[08:15:51] <kizzx2> rspijker: oh does that mean if "all goes well", i may suddenly discover that db.printSlaveReplicationInfo() jumps from 10000 to 0?
[08:16:50] <rspijker> yes, it says how long ago it synced, so when it eventually syncs, it will go to 0
[08:17:00] <rspijker> the figures you are seeing are *very* high though...
[08:17:17] <rspijker> let me have an actual look at the log you posted :P
[08:17:26] <kizzx2> yea, well i launched replica C and added it to the set an hour ago
[08:18:07] <kizzx2> it's running on EC2, so there's about an hour or so of lag, because I needed to snapshot the EBS, launch a new instance, wait for it to settle and then add it to the set
[08:18:53] <ossipov> rspijker: have you tried to make a shop or any kind of a billing system using mongo?
[08:18:57] <Nodex> ossipov : I have never lost data with any version of MongoDB. Obviously keeping a backup is the best way to ensure no data loss
[08:20:53] <rspijker> ossipov: a billing system, yes
[08:22:03] <ossipov> rspijker: did you have any problems while making it?
[08:22:24] <rspijker> ossipov: loads, but not necessarily due to mongo ;)
[08:22:38] <Nodex> ossipov : if you need rollbacks of any sort I suggest you implement that in a database that was made to handle those sorts of things
[08:23:40] <rspijker> kizzx2: the fact that you are seeing rollbacks is a bad sign… did the replica (C) use to be a primary?
[08:25:22] <rspijker> what happened is: C was primary, things got written to it, they were not synced to the secondaries. Then C was stepped down and is now a secondary, but it has more information (the unsynced writes) than the current primary
[08:25:37] <rspijker> this is also what the rollback is used for
[08:25:39] <kizzx2> so the fact that I accidentally ran "rs.initiate()" on it is causing issues? (it's included in the startup script I used, for some reason)
[08:42:15] <rspijker> who is that aimed at, double_p?
[09:03:38] <xxtjaxx> ossipov: I haven't used it yet in such a high demand sphere. I plan on this myself.
[09:05:19] <xxtjaxx> ossipov: failing is a harsh word; the ability to shard the server and replicate per shard should give some sort of safety. (MongoDB Devs? Please either (1) correct my false assumption or (2) make me warm and fuzzy for being right)
[09:06:15] <xxtjaxx> ossipov: I also don't know how far you want to scale at this point in time.
[09:08:45] <Nodex> you can replicate without sharding
[09:08:57] <Nodex> sharding simply allows separation and write scaling
[09:09:50] <kizzx2> rspijker: so I did the launch of the new replica member without `rs.initiate()` on it, I guess it is mandatory to do a full resync on launch
[09:10:00] <kizzx2> i'm getting "replSet initial sync clone all databases" from the log
[09:10:24] <kizzx2> s/on launch/on "proper" launch/
[09:11:14] <rspijker> hmmm, strange… I thought you could start with a dataset
[09:11:20] <rspijker> never really tried it though, tbh :)
[10:32:43] <rspijker> did you follow the tutorial on adding users?
[10:33:00] <rspijker> as in, first adding an admin user, then using that to add specific users to dbs?
[10:34:10] <rspijker> cllamas: check this: http://docs.mongodb.org/manual/tutorial/enable-authentication/ and the following 2 pages
[10:40:28] <cllamas> but my mongo is started from /etc/init.d/mongodb
[10:42:55] <remonvv> I love graylog's homepage. "Manage your logs in the dark and have lasers going and make it look like you're from space." and then it proceeds to say "used in big production deployments", only to list companies I've never even heard of.
[11:08:04] <cllamas> my user has the userAdminAnyDatabase role, but I cannot directly access the database with mongo --username grayloguser --host localhost --password 123 graylog2
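One likely cause: userAdminAnyDatabase only grants user administration, not read/write access to data, so a direct login can authenticate and still fail on reads. In the shell of this era the sequence looks roughly like this (db.addUser was the 2.4 helper; newer versions use db.createUser, and the credentials here are placeholders):

```javascript
// create the admin user on the admin db first
use admin
db.addUser({ user: "siteAdmin", pwd: "secret", roles: ["userAdminAnyDatabase"] });
db.auth("siteAdmin", "secret");
// then give the application user real data access on its own db
use graylog2
db.addUser({ user: "grayloguser", pwd: "123", roles: ["readWrite"] });
```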
[12:27:03] <remonvv> Anyone know if "numInitialChunks" parameter is supported for non-hashed indexes? It seems to be ignored.
[12:33:51] <remonvv> Ah, yeah, found it in the code.
[14:24:16] <newbie35> Hi. I am using the C API of mongodb and I want to run the command enableSharding("base_name") but it does not work with mongo_simple_str_command
[14:24:55] <newbie35> Can anyone show me the right way to execute this command on the server, please?
[14:30:08] <Routh> Hey, pretty new to MongoDB (fair warning :P ) - I'm looking at saving a new object with a many-to-many relationship in it. It's a business and the relationship is categories. I'm using NodeJS and Mongoose with MongoDB. I'm just trying to figure out how a save would work in this? Do I need to save to the category model or the business, or both collections?
[14:34:15] <remonvv> Well, a) Mongo is not relational, so any sort of relationship is enforced by the application (or mongoose in this case). b) Given a), how you save many-to-many relationships (N:M) will most likely be documented in the mongoose docs.
[14:34:59] <remonvv> Without mongoose you would have to pick one of the storage options based on how many items are typically on the left and right side of the relationship and so on.
[14:35:14] <remonvv> You could store IDs as an embedded array, make something similar to a ref table, etc.
[14:39:40] <newbie35> Hi. I am using the C API of mongodb and I want to run the command enableSharding("base_name") but it does not work with mongo_simple_str_command
[14:39:44] <newbie35> Can anyone show me the right way to execute this command on the server, please?
[14:43:53] <newbie35> Any idea about the way to write this command?
[14:44:24] <Routh> remonvv: Thanks for the response. I'll have to dig deeper into the mongo docs. I'm thinking I only have to save the relationship to one side of the many-to-many - just not sure if it should be the object or category.
[14:44:44] <newbie35> the mongos writes me: command denied: { enableSharding: "base_name" }
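That "command denied" usually means the command document wasn't sent to the admin database; enableSharding is an admin-only command. The shell equivalent below shows the command document; with the legacy C driver the same document would be run against "admin" via mongo_run_command rather than the simple-string helper:

```javascript
// enableSharding has to run against the admin database
db.adminCommand({ enableSharding: "base_name" });
// equivalent long form:
db.getSiblingDB("admin").runCommand({ enableSharding: "base_name" });
```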
[14:49:30] <remonvv> Routh, typically the relationship data is on the side where the relationship is heavy. Meaning if you have entities A and entities B and B tends to have a lot more relationships to A then add the id to B. If it's roughly equal AND the amount of relationships is high then use a dedicated collection that stores ID pairs
[14:51:01] <Routh> remonvv: So in my case, I would save the business to the category rather than the category to the business since a business will only have 1 - 3 cats while the categories could have hundreds of businesses?
[14:59:20] <remonvv> In that case it's probably best (from Mongo perspective) to embed that small array of references in the document for business.
[15:00:03] <remonvv> Rather than have the ID pair collection which is probably not needed (and is a worse solution since it makes writes a two-step process or worse)
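Concretely, the light side carries the references: each business embeds its handful of category ids, and "all businesses in a category" becomes a single indexed query on that array. A sketch with made-up names (slug strings instead of ObjectIds for readability):

```javascript
// the business embeds its 1-3 category references
db.businesses.insert({
  name: "Corner Cafe",
  category_ids: ["coffee", "breakfast"]
});
// an index on the array keeps the reverse lookup cheap
db.businesses.ensureIndex({ category_ids: 1 });
// every business in one category, in a single query
db.businesses.find({ category_ids: "coffee" });
```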
[15:36:12] <Routh> remonvv: So your approach would be a Boolean then?
[17:18:05] <hahuang65> anyone here use mongoid3 that's experiencing a shit ton of namespace queries?
[17:35:16] <titosantana> I'm trying to sort on a map-reduced collection but it doesn't seem to be working
[17:35:26] <titosantana> i tried {"num_articles" : -1 }
[17:35:27] <awpti> Okay, I'm lost. How do I increment a score value inside a sub-document… and only a specific one? eg: http://pastie.org/8156258 Let's say I only want to increment the number 35. The docs do not offer any clarity on this.
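This is what the positional $ operator does: match the array element in the query, then $inc through the $ placeholder. The pastie is gone, so the document shape below is a guess:

```javascript
// assumed shape: { _id: ..., scores: [ { num: 35, score: 7 }, { num: 36, score: 2 } ] }
// the query pins down which element matched; $ refers back to that position
db.players.update(
  { "scores.num": 35 },
  { $inc: { "scores.$.score": 1 } }
);
```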
[17:35:37] <titosantana> and {"value.num_articles" : -1 }
[17:36:49] <titosantana> this is what a document looks like in the collection
[18:08:02] <tg2> anybody running mongodb inside compressed zfs? I imagine the compression rate would be substantial due to the fact that keys are repeated?
[18:39:19] <titosantana> can you not sort on a map-reduced collection?
[18:39:25] <titosantana> I can't seem to get it to work
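Sorting the output collection does work; the catch is the { _id, value } wrapper, so the sort key needs the value prefix, matching the second attempt above (output collection name assumed):

```javascript
// the top level only has _id and value, so sort through the value.* path
db.mr_results.find().sort({ "value.num_articles": -1 });
```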
[18:47:02] <t0th_-> I am using MySQL and Sphinx, and I am trying out MongoDB. If I use Mongo, do I still need MySQL?
[19:43:22] <_tinman_`> So I have a question on migrating document structures in mongo.
[19:45:04] <_tinman_`> I know schema isn't enforced, but I need to embed some information from one collection into another collection's documents to avoid making two queries to the db.
[19:46:36] <_tinman_`> I was thinking of modifying each retrieved document as it was accessed. Is this a good plan and an idiomatic way of embedding related information within other documents after they are created?
[19:48:58] <_tinman_> Or should I just do the migrations up front like I would in SQL?
[19:50:51] <tg2> you mean like a foreign key in traditional relational db?
[19:51:42] <tg2> it would just be doing 2 queries anyway even if there were a way to reference an object in a foreign collection
[19:52:21] <_tinman_> yeah, so we have a set of forms and some formentries that previously were never displayed together. Now a customer wants the name of the form on the form entry CSVs and reports.
[19:53:15] <_tinman_> Yes, I know. What I want to do is embed the form document within the formentry document now.
[19:53:52] <_tinman_> but I don't want to update them all at once if it will bring my db to its knees.
[19:54:55] <_tinman_> so I am asking which is the correct way to do it: run migrations on deploy or do them incrementally at document access?
[20:00:00] <_tinman_> Existing Doc A relates to Doc B as 1-to-many. Requires a second query to get Doc A information when I find Doc B.
[20:01:13] <_tinman_> What I want: DocB contains DocA. This will allow me to make one query and get all the information I need when I find Doc B.
[20:02:23] <t0th_-> I am using MySQL and Sphinx, and I am trying out MongoDB. If I use Mongo, do I still need MySQL?
[20:03:16] <_tinman_> What I want to know: is it better to go through the DocB collection, embedding the related DocA in each DocB up front, or should I do it on the first access of any DocB?
[20:07:35] <tg2> that is a question of your preference
[20:07:41] <tg2> it would ultimately be best to have all data records updated
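For the upfront route, a one-off script can walk the entries and embed the related form, skipping documents already migrated; a sketch with assumed collection and field names:

```javascript
// one-off migration: embed each entry's form (names here are assumptions)
db.formentries.find({ form: { $exists: false } }).forEach(function (entry) {
  var form = db.forms.findOne({ _id: entry.form_id });
  if (form) {
    db.formentries.update(
      { _id: entry._id },
      { $set: { form: { _id: form._id, name: form.name } } }
    );
  }
});
```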
[20:20:23] <cpu> Anyone ever used more than 1 core with mongo? I've got a setup that should reach a million writes per minute (SSD / dozens of cores / 200 bytes per doc / 1 secondary index), but my single mongod service maxes out its one core.
[20:30:49] <dusts66> having issues with bulk inserts into a large replica set.... anyone have experience with this?
[20:32:29] <tg2> you inserting with a write concern?
[20:33:08] <tg2> @cpu - I'm sure index updating etc is linear and cannot scale horizontally
[20:33:23] <tg2> solution could be to run more instances per node
[20:53:56] <dusts66> should I turn off the secondaries and bulk import into the primary… then spin up the secondaries and allow the replication to sync?
[20:54:09] <tg2> no you leave the secondaries online
[20:54:15] <tg2> just make sure when you're inserting
[20:54:28] <tg2> you aren't specifying that it should be committing to N replica set members before returning
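In the shell of this era the acknowledgement check is getLastError; w: 1 returns once the primary has the write, while larger w values block until that many members have replicated it, which is what drags a bulk load down:

```javascript
// fast path for bulk loads: acknowledge on the primary only
db.stuff.insert({ payload: "example" });
db.runCommand({ getLastError: 1, w: 1 });
// w: 3 (or w: "majority") would instead wait on secondaries for every write
```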
[21:21:29] <tg2> I imagine it would be as simple as using different data directories, pid files and config files when launching; I will look into it
[21:21:41] <tg2> each instance would have to be on a different port
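That's essentially all it takes; each instance needs its own port, dbpath, logpath and pidfile. A hedged example with made-up paths:

```
# two mongod instances on one node, isolated by port and directories
mongod --port 27017 --dbpath /data/mongo-a --logpath /var/log/mongo-a.log --pidfilepath /var/run/mongo-a.pid --fork
mongod --port 27018 --dbpath /data/mongo-b --logpath /var/log/mongo-b.log --pidfilepath /var/run/mongo-b.pid --fork
```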
[21:22:59] <Vile> Thanks guys! It actually works :)
[22:00:50] <cpu> what happens if a shard goes down?
[22:07:22] <cpu> Interesting, I get "couldn't connect to new shard socket exception [CONNECT_ERROR]" but the instances are listening according to netstat
[22:23:09] <cpu> I started a shard as replica set rs0, and did rs.initiate
[22:23:20] <cpu> the shell looks now like rs0:primary>
[22:23:47] <cpu> but when I try addShard("rs0/hisAddress") I get a message that it's not part of rs0
[22:24:12] <cpu> I do the addShard on mongos by the way
[22:40:57] <cpu> do all replica set members (/shards) need to know about all their "brother" shards… why is there a "members" field in the replica set config?
[22:51:16] <joshua> A replica set is a group of servers with the same data. It's a slightly different configuration than a single mongod
[22:52:35] <joshua> Sharding is used to scale your data horizontally, so you split a collection up among more than one server/replica set
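Tying that back to the addShard error above: the set must be initiated with member host names that mongos can resolve, and the seed string given to sh.addShard has to match both the set name and one of those hosts. A sketch with a placeholder hostname:

```javascript
// on the shard's mongod: initiate with an externally resolvable host, not localhost
rs.initiate({
  _id: "rs0",
  members: [{ _id: 0, host: "shard-host-1:27017" }]  // placeholder hostname
});
// on mongos: set name and host must match the replica set's actual config
sh.addShard("rs0/shard-host-1:27017");
```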
[23:20:08] <cpu> joshua - can I msg you in private?
[23:38:48] <joshua> cpu: Sorry, I'm at work and a little busy.