[06:41:55] <afshinmeh> I wrote a library using NodeJS to watch and tail oplogs
[06:43:42] <afshinmeh> now I get an event for each insert and update operation
[06:44:46] <afshinmeh> But as you know, oplog records contain only the changed fields, not the whole record from that collection
[06:46:10] <afshinmeh> How can I apply that diff and get back the whole record? I mean, what's the best way?
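One common approach (a sketch, not from the thread, and in Python rather than afshinmeh's Node.js): keep a local cache of full documents and fold each partial oplog update into the cached copy. This assumes the classic oplog update layout, where an op `"u"` entry carries a `{"$set": ...}` / `{"$unset": ...}` modifier in its `"o"` field; only top-level fields are handled here.

```python
def apply_oplog_update(cached_doc, oplog_entry):
    """Fold a partial oplog "u" entry into a locally cached full document.

    Assumes the classic oplog layout: the update modifier lives in "o"
    as a {"$set": {...}} / {"$unset": {...}} document. Dotted (nested)
    field paths are not handled in this sketch.
    """
    modifier = oplog_entry["o"]
    doc = dict(cached_doc)  # don't mutate the cache in place
    for field, value in modifier.get("$set", {}).items():
        doc[field] = value
    for field in modifier.get("$unset", {}):
        doc.pop(field, None)
    return doc
```

The cache itself could be seeded with an initial full read of the collection, after which each oplog event keeps it current.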
[07:32:39] <Boemm> Good morning, I don't know if I'm in the right place, but I have a question about replica set configuration and hope to find the answer here :-)
[07:33:27] <Boemm> When setting up a replica set and initializing it, by default the short hostname is used for the current mongo instance
[07:33:42] <Boemm> Where does mongodb get the node name to use?
[07:33:54] <Boemm> And is it possible to override the name in the config file?
[07:37:08] <newbsduser> Hello, I need better performance for update jobs, so I'm running 10 different mongodb instances on the same machine, because mongodb works single-threaded. Is that the correct way? Or do you advise another method for better update performance?
[07:49:27] <joannac> Boemm: mongodb uses whatever you give it in the rs.config()
[07:50:39] <joannac> newbsduser: no. i'd be very surprised if you get better performance from that. i have no idea how that works
[08:04:57] <Boemm> joannac: if I understand it right I can add a node with rs.add("IP:27017"), right?
[08:05:17] <Boemm> or, instead of the IP, with the FQDN ... if resolvable
[08:06:10] <Boemm> but how can I set the FQDN for the primary node while setting up the replica set from scratch via the config file mongod.conf?
[08:07:06] <Boemm> I would like to configure each of my nodes so that they use the FQDN by default (maybe by setting it explicitly in mongod.conf, or by making the right setting system-wide ...
[08:07:26] <Boemm> Is the default taken out of the /etc/hosts file?
[08:48:22] <joannac> where conf is the config you want with whatever hostname you want
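For context, joannac's `conf` refers to an explicit replica set config document passed to `rs.initiate(conf)` or `rs.reconfig(conf)` in the mongo shell, which is how you pin each member to an FQDN instead of the short hostname. A sketch of such a document, expressed here as a Python dict (the hostnames are made up):

```python
# Replica set config with explicit FQDNs (example hostnames).
# In the mongo shell, the equivalent JSON document would be passed to
# rs.initiate(conf) when building the set, or rs.reconfig(conf) later.
conf = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "db0.example.com:27017"},
        {"_id": 1, "host": "db1.example.com:27017"},
        {"_id": 2, "host": "db2.example.com:27017"},
    ],
}
```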
[08:53:08] <newbsduser> joannac, if I increase the instance count of mongodb, does it affect the update or insert count per second?
[09:15:01] <KekSi> newbsduser: you'll have to be more precise with your question, are you talking adding more replica set members? slaves in a master-slave configuration? shards in a cluster?
[09:24:04] <newbsduser> if I increase the instance count of mongodb, does it affect the update or insert operation count per second? (I am talking about splitting the data into different parts and using different instances, not about a cluster)
[10:47:57] <cheeser> we're #notallmen here, for the record.
[10:52:01] <andrey_30> I added an entry (using $set). The entry appears at the beginning of the document. How can I add an entry at the end of the document? My mongo version is old (2.4.6)
[11:36:17] <pamp> is it normal for the same query to be faster on a standalone server than on a cluster with two shards?
[12:26:00] <jmeister> @pamp depends on countless variables (Deployment, type of query, amount of data, etc)
[12:35:53] <cheeser> querying a sharded cluster will possibly mean querying each shard primary and merging the results...
[14:33:12] <Douhan> Guys I have a question about naming fields in MongoDB: If your field is an array, do you name your field as singular or plural? Example: word: ['hello', 'hi', 'hey'] --> do you name this field "word" or "words"?
[14:35:53] <Douhan> OR If your field can be both a string and an array, what do you name it? So it can be (#1) word: 'hello' and (#2) word: ['hello', 'hi', 'hey']
[15:55:53] <pamp> Anyone with C# driver knowledge?
[15:56:28] <pamp> When inserting a batch like this: "managedObjetsToWrite.InsertBatch(batch.ToArray(), WriteConcern.WMajority);"
[15:56:45] <pamp> how can I add the option for an unordered batch insert?
[15:57:37] <pamp> this way it's inserting ordered
[16:19:32] <aendrew> I know this is probably a ludicrously simple question, but I have a collection with a bunch of records with similar structure and varying levels of completeness (i.e., most of the fields are filled out, but some are not). Some have a duplicate "name" key. How do I merge the duplicates such that it creates the most complete final record possible?
[16:20:25] <aendrew> I.e., if I have {a: 'yay', b: '', c: 'llama'} and {a: 'yay', b: 'woo', c: ''}, I end up with: {a: 'yay', b: 'woo', c: 'llama'}?
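A sketch of the client-side merge aendrew describes: fetch the duplicates, fold them together field by field, and keep the first non-empty value seen (treating `''` and `None` as missing). The function name and empty-value convention are illustrative, not from the thread.

```python
def merge_records(*docs):
    """Merge documents that share a key, preferring the first non-empty
    value seen for each field. Empty string and None count as missing."""
    merged = {}
    for doc in docs:
        for field, value in doc.items():
            # Fill the field if we have nothing useful for it yet.
            if merged.get(field) in ("", None):
                merged[field] = value
    return merged
```

The merged document could then be written back with one update and the leftover duplicates removed.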
[16:31:49] <Thinh> What's going on guys--anyone here use pymongo with a super low *TimeoutMS ?
[16:41:46] <aendrew> Nevermind my earlier query, I figured it out.
[16:42:38] <aendrew> Wait, maybe not — is there anything like $first, but that uses the first non-empty value?
[16:43:19] <GothAlice> Thinh: I do, but I do not manipulate timeoutMS. (I tried, but ran into a server-side issue relating to tailing cursors ignoring timeouts completely.)
[16:53:24] <Thinh> Do you have any suggestions as to how to test this?
[16:53:26] <GothAlice> Timeit attempts to run as many iterations as possible in a given time frame; if it can only squeeze one iteration out, the results will be noise.
[16:54:55] <GothAlice> After that, wait for the user to press enter once, then repeat it. This will give you a chance to kill the mongod server or something for testing.
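A minimal sketch of the measurement loop GothAlice describes, using Python's stdlib `timeit` with a stand-in function where the real pymongo call would go. The "wait for enter, then repeat" step is left as a comment so the sketch stays self-contained.

```python
import timeit

def query_once():
    # Stand-in for the real client call being timed (e.g. a find() with
    # a low timeoutMS). Replace with the operation under test.
    sum(range(1000))

# repeat() returns one total per run; many iterations per run keep a
# single-iteration outlier from dominating. The minimum of the samples
# is the least noisy figure, since the others include scheduler and
# cache interference.
samples = timeit.repeat(query_once, number=1000, repeat=3)
best = min(samples)

# To re-test after changing conditions (e.g. killing mongod), one could
# loop:  input("press enter to re-run"); samples = timeit.repeat(...)
```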
[16:55:52] <Thinh> yeah, it gives me the same value
[16:58:01] <Thinh> no I haven't :) I'll do that now
[16:58:34] <GothAlice> https://github.com/mongodb/mongo-python-driver/blob/master/test/test_client.py < these tests make extensive use of various timeout values, some as low as 1ms.
[17:02:10] <dcrosta> is there a recommendation on block device readahead settings for WT storage engine? the docs say 32 for MMAPv1, but don't say anything about WT. should we assume that the default is good enough?
[17:14:01] <pamp> Its possible with findAndModify make a change like this : http://dpaste.com/3RWKY2H
[20:33:02] <GothAlice> deathanchor: Let me dig something up for you. It's a script I use to spin up a 2x3 sharded replica set with authentication all on one host, for testing. It lets me play around with things and very, very quickly nuke settings and rebuild it when needed.
[20:33:55] <GothAlice> You could add your own steps after line 162 to perform any extra work you want during initial construction of the "fake cluster".
[20:34:31] <GothAlice> Once you're happy with how it's building, you've got your final set of instructions for use in production. :)
[20:36:34] <deathanchor> GothAlice: I'm doing an even crazier thing than you can imagine
[20:39:09] <deathanchor> yeah, I have 4 sets sharded, so I have 4 members in another DC staying in sync. going chop off those 4 members in other DC and create a new mongocfg there to be a new shard cluster with a copy of the live data from a point in time to use for testing env.
[20:39:20] <android6011> I have a collection with a date field. The date includes time. I need to query where the date is say 4/15/15 - 4/20/15 and the time on those days is like 08:30 - 10:45 each day. I'm not sure what the best way to write that query is. I thought about just looping the date range and doing an OR for each day on that time. thoughts?
[20:45:12] <deathanchor> GothAlice: I really liked that article you shared a while back about realtime aggregation
[20:45:27] <android6011> GothAlice: unless you think the examples in that link could compare with just doing each date with that time range?
[20:46:02] <GothAlice> You can split out the parts of the date, then query on them separately. I.e. db.collection.aggregate([{$match: {datetime: {$gte: ISODate("2015-04-15T00:00:00Z"), $lte: ISODate("2015-04-21T00:00:00Z")}}}, {$project: {hour: {$hour: "$datetime"}, minute: {$minute: "$datetime"}}}, {$match: {… use $and matching the hour and minute for the min/max cases …}}])
[20:46:17] <GothAlice> Hmm. Aggregates are hard to fit in one IRC line. XD
[20:46:54] <GothAlice> deathanchor: Indeed; we store click tracking data at work and pre-aggregation is a must to keep the data storage sizes down and query performance up.
[20:47:12] <deathanchor> android6011: if you are doing analytics, I suggest reading this, courtesy of GothAlice http://www.devsmash.com/blog/mongodb-ad-hoc-analytics-aggregation-framework
[20:47:26] <android6011> thats actually what im working with is tracking data
[20:48:00] <android6011> and need to pull stats for date range during certain times
[20:48:27] <GothAlice> If you're going to be doing a lot of split date range + time range queries, I'd store the dates and times separately.
[20:48:40] <GothAlice> I.e. 9-5 each day on week X.
[20:49:24] <GothAlice> (Pro tip: to store times as just times, use the UNIX epoch as the date component.)
[20:49:42] <GothAlice> (And make sure the timezone is UTC on those.)
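The epoch trick above, sketched out: a bare time-of-day is stored as a datetime anchored on 1970-01-01 UTC, so stored "times" sort and range-compare correctly regardless of the day they describe. The helper name is made up for illustration.

```python
from datetime import datetime, timezone

def time_of_day(hour, minute=0):
    """Represent a bare time as a datetime anchored on the Unix epoch
    (1970-01-01, UTC), per the pro tip: the date component is constant,
    so comparisons only ever see the time portion."""
    return datetime(1970, 1, 1, hour, minute, tzinfo=timezone.utc)

work_start = time_of_day(9)       # 09:00
work_end = time_of_day(17)        # 17:00
```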
[20:49:49] <android6011> I've also been considering doing that
[20:50:19] <android6011> when I send a date like 04-04-2015 to mongodb it makes it an ISODate with a timestamp; is there a way around that?
[20:51:02] <GothAlice> android6011: http://s.webcore.io/image/142o1W3U2y0x is our dashboard. It processes around 17K records in a few hundred milliseconds.
[20:51:34] <GothAlice> android6011: Not really; you always want your dates (even if they omit times; the time would just be midnight the morning of) to be stored as real dates, to preserve the ability to extract elements, range compare, etc.
[20:57:25] <GothAlice> android6011: https://gist.github.com/amcgregor/fb7471f072b5630e43be < I'd be curious to know how this performs for you.
[20:58:07] <GothAlice> (Minor optimization: include the minimum time in the datetime $gte, and the maximum time in the datetime $lte match queries, respectively.)
[20:59:25] <android6011> GothAlice: ill run it and let you know, give me a few
[20:59:47] <GothAlice> (You'll need to adjust for real field and collection names, of course.)
[21:57:36] <Xeon06> Hey guys. Can anyone tell me if it would be a good idea to make sure that queries like {foo: {$in: ["one item"]}} are changed to {foo: "one item"} or is that optimization already taken care of?
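The rewrite Xeon06 describes is mechanical on the client side; whether the server already performs this internally isn't confirmed in the thread. A hedged Python sketch of the normalization:

```python
def simplify_in(query):
    """Rewrite {field: {"$in": [x]}} to the equivalent equality form
    {field: x}. Conditions that are not a bare single-element $in are
    passed through untouched."""
    out = {}
    for field, cond in query.items():
        if (isinstance(cond, dict)
                and list(cond) == ["$in"]
                and len(cond["$in"]) == 1):
            out[field] = cond["$in"][0]
        else:
            out[field] = cond
    return out
```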
[22:04:23] <morenoh149> Xeon06: what's the latter, a find query?
[22:48:45] <Jameswnl> how much memory do I need to hold the whole dataset? is it just the storage size? or what else?
[22:54:11] <GothAlice> Jameswnl: https://jira.mongodb.org/browse/SERVER-17424 https://jira.mongodb.org/browse/SERVER-17456 https://jira.mongodb.org/browse/SERVER-16311 (See also: https://jira.mongodb.org/browse/SERVER-17386)
[23:03:10] <Jameswnl> GothAlice: I haven't seen those problems yet. I just want to size the memory to prevent mongod from needing to read from disk. (I need single-digit-millisecond latency under high throughput)
[23:04:30] <GothAlice> Yes; in theory matching the cache size to your dataset size should suffice to keep it in RAM at all times. I'm not fully aware of the nuances of how Wired Tiger _uses_ its cache; I'm more familiar with the technical aspects of mmapv1, though.
[23:05:07] <Jameswnl> right. i am not able to find doc about WT in this respect
[23:05:31] <GothAlice> Usually when I'm in doubt, I look at the source. (There is no more up-to-date documentation than the code. ;)
[23:05:46] <Jameswnl> so there's no sizing doc on WT yet?
[23:06:10] <GothAlice> You might try: http://source.wiredtiger.com/2.5.0/tune_memory_allocator.html
[23:06:23] <GothAlice> Wired Tiger was formerly a third-party project, and its documentation is still online.