[06:41:55] <afshinmeh> I wrote a library using NodeJS to watch and tail oplogs
[06:43:42] <afshinmeh> now I get an event for each insert and update operation
[06:44:46] <afshinmeh> But as you know, oplog records contain only the changed fields, not the whole record from that collection
[06:46:10] <afshinmeh> How can I apply that diff and get back the whole record? I mean, what's the best way?
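One common approach (a sketch, not from the thread, and in Python rather than afshinmeh's Node.js): keep a local cache of full documents and fold each partial oplog update into the cached copy. This assumes the classic oplog update layout, where an op `"u"` entry carries a `{"$set": ...}` / `{"$unset": ...}` modifier in its `"o"` field; only top-level fields are handled here.

```python
def apply_oplog_update(cached_doc, oplog_entry):
    """Fold a partial oplog "u" entry into a locally cached full document.

    Assumes the classic oplog layout: the update modifier lives in "o"
    as a {"$set": {...}} / {"$unset": {...}} document. Dotted (nested)
    field paths are not handled in this sketch.
    """
    modifier = oplog_entry["o"]
    doc = dict(cached_doc)  # don't mutate the cache in place
    for field, value in modifier.get("$set", {}).items():
        doc[field] = value
    for field in modifier.get("$unset", {}):
        doc.pop(field, None)
    return doc
```

The cache itself could be seeded with an initial full read of the collection, after which each oplog event keeps it current.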
[07:32:39] <Boemm> Good morning, I don't know if I'm in the right place, but I have a question about replica set configuration and hope to find the answer here :-)
[07:33:27] <Boemm> When setting up a replica set and initializing it, by default the short hostname is used for the current mongo instance
[07:33:42] <Boemm> Where does mongodb get the node name to use?
[07:33:54] <Boemm> And is it possible to override the name in the config file?
[07:37:08] <newbsduser> Hello, I need better performance for update jobs, so I'm running 10 different mongodb instances on the same machine, because mongodb works single-threaded. Is that the correct way? Or do you advise another method for better update performance?
[07:49:27] <joannac> Boemm: mongodb uses whatever you give it in the rs.config()
[07:50:39] <joannac> newbsduser: no. i'd be very surprised if you get better performance from that. i have no idea how that works
[08:04:57] <Boemm> joannac: if I understand it right I can add a node with rs.add("IP:27017"), right?
[08:05:17] <Boemm> or, instead of the IP, with the FQDN ... if resolvable
[08:06:10] <Boemm> but how can I set the FQDN for the primary node while setting up the replica set from scratch via the config file mongod.conf?
[08:07:06] <Boemm> I would like to configure each of my nodes so that they use the FQDN by default (maybe by setting it explicitly in mongod.conf, or by making the right setting system-wide ...
[08:07:26] <Boemm> Is the default taken out of the /etc/hosts file?
[08:48:22] <joannac> where conf is the config you want with whatever hostname you want
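For context, joannac's `conf` refers to an explicit replica set config document passed to `rs.initiate(conf)` or `rs.reconfig(conf)` in the mongo shell, which is how you pin each member to an FQDN instead of the short hostname. A sketch of such a document, expressed here as a Python dict (the hostnames are made up):

```python
# Replica set config with explicit FQDNs (example hostnames).
# In the mongo shell, the equivalent JSON document would be passed to
# rs.initiate(conf) when building the set, or rs.reconfig(conf) later.
conf = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "db0.example.com:27017"},
        {"_id": 1, "host": "db1.example.com:27017"},
        {"_id": 2, "host": "db2.example.com:27017"},
    ],
}
```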
[08:53:08] <newbsduser> joannac, if I increase the instance count of mongodb, does it affect the update or insert count per second?
[09:15:01] <KekSi> newbsduser: you'll have to be more precise with your question, are you talking adding more replica set members? slaves in a master-slave configuration? shards in a cluster?
[09:24:04] <newbsduser> if I increase the instance count of mongodb, does it affect the update or insert operation count per second? (I am talking about splitting the data into different parts and using different instances, not about a cluster)
[10:47:57] <cheeser> we're #notallmen here, for the record.
[10:52:01] <andrey_30> I added an entry (using $set). The entry appears at the beginning of the document. How can I add an entry at the end of the document? My mongo version is old (2.4.6)
[11:36:17] <pamp> is it normal for the same query to be faster on a standalone server than on a cluster with two shards?
[12:26:00] <jmeister> @pamp depends on countless variables (Deployment, type of query, amount of data, etc)
[12:35:53] <cheeser> querying a sharded cluster will possibly mean querying each shard primary and merging the results...
[14:33:12] <Douhan> Guys I have a question about naming fields in MongoDB: If your field is an array, do you name your field as singular or plural? Example: word: ['hello', 'hi', 'hey'] --> do you name this field "word" or "words"?
[14:35:53] <Douhan> OR If your field can be both a string and an array, what do you name it? So it can be (#1) word: 'hello' and (#2) word: ['hello', 'hi', 'hey']
[15:55:53] <pamp> Anyone with C# driver knowledge?
[15:56:28] <pamp> When inserting a batch like this: "managedObjetsToWrite.InsertBatch(batch.ToArray(), WriteConcern.WMajority);"
[15:56:45] <pamp> how can I add the option for an unordered batch insert?
[15:57:37] <pamp> this way it's inserting ordered
[16:19:32] <aendrew> I know this is probably a ludicrously simple question, but I have a collection with a bunch of records with similar structure and varying levels of completeness (i.e., most of the fields are filled out, but some are not). Some have a duplicate "name" key. How do I merge the duplicates such that it creates the most complete final record possible?
[16:20:25] <aendrew> I.e., if I have {a: 'yay', b: '', c: 'llama'} and {a: 'yay', b: 'woo', c: ''}, I end up with: {a: 'yay', b: 'woo', c: 'llama'}?
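A sketch of the client-side merge aendrew describes: fetch the duplicates, fold them together field by field, and keep the first non-empty value seen (treating `''` and `None` as missing). The function name and empty-value convention are illustrative, not from the thread.

```python
def merge_records(*docs):
    """Merge documents that share a key, preferring the first non-empty
    value seen for each field. Empty string and None count as missing."""
    merged = {}
    for doc in docs:
        for field, value in doc.items():
            # Fill the field if we have nothing useful for it yet.
            if merged.get(field) in ("", None):
                merged[field] = value
    return merged
```

The merged document could then be written back with one update and the leftover duplicates removed.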
[16:31:49] <Thinh> What's going on guys--anyone here use pymongo with a super low *TimeoutMS ?
[16:41:46] <aendrew> Nevermind my earlier query, I figured it out.
[16:42:38] <aendrew> Wait, maybe not — is there anything like $first, but that uses the first non-empty value?
[16:43:19] <GothAlice> Thinh: I do, but I do not manipulate timeoutMS. (I tried, but ran into a server-side issue relating to tailing cursors ignoring timeouts completely.)
[16:53:24] <Thinh> Do you have any suggestions as to how to test this?
[16:53:26] <GothAlice> Timeit attempts to run as many iterations as possible in a given time frame; if it can only squeeze one iteration out, the results will be noise.
[16:54:55] <GothAlice> After that, wait for the user to press enter once, then repeat it. This will give you a chance to kill the mongod server or something for testing.
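A minimal sketch of the measurement loop GothAlice describes, using Python's stdlib `timeit` with a stand-in function where the real pymongo call would go. The "wait for enter, then repeat" step is left as a comment so the sketch stays self-contained.

```python
import timeit

def query_once():
    # Stand-in for the real client call being timed (e.g. a find() with
    # a low timeoutMS). Replace with the operation under test.
    sum(range(1000))

# repeat() returns one total per run; many iterations per run keep a
# single-iteration outlier from dominating. The minimum of the samples
# is the least noisy figure, since the others include scheduler and
# cache interference.
samples = timeit.repeat(query_once, number=1000, repeat=3)
best = min(samples)

# To re-test after changing conditions (e.g. killing mongod), one could
# loop:  input("press enter to re-run"); samples = timeit.repeat(...)
```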
[16:55:52] <Thinh> yeah, it gives me the same value
[16:58:01] <Thinh> no I haven't :) I'll do that now
[16:58:34] <GothAlice> https://github.com/mongodb/mongo-python-driver/blob/master/test/test_client.py < these tests make extensive use of various timeout values, some as low as 1ms.
[17:02:10] <dcrosta> is there a recommendation on block device readahead settings for WT storage engine? the docs say 32 for MMAPv1, but don't say anything about WT. should we assume that the default is good enough?
[17:14:01] <pamp> Its possible with findAndModify make a change like this : http://dpaste.com/3RWKY2H
[20:33:02] <GothAlice> deathanchor: Let me dig something up for you. It's a script I use to spin up a 2x3 sharded replica set with authentication all on one host, for testing. It lets me play around with things and very, very quickly nuke settings and rebuild it when needed.
[20:33:55] <GothAlice> You could add your own steps after line 162 to perform any extra work you want during initial construction of the "fake cluster".
[20:34:31] <GothAlice> Once you're happy with how it's building, you've got your final set of instructions for use in production. :)
[20:36:34] <deathanchor> GothAlice: I'm doing an even crazier thing than you can imagine
[20:39:09] <deathanchor> yeah, I have 4 sets sharded, so I have 4 members in another DC staying in sync. going chop off those 4 members in other DC and create a new mongocfg there to be a new shard cluster with a copy of the live data from a point in time to use for testing env.
[20:39:20] <android6011> I have a collection with a date field. The date includes time. I need to query where the date is say 4/15/15 - 4/20/15 and the time on those days is like 08:30 - 10:45 each day. I'm not sure what the best way to write that query is. I thought about just looping the date range and doing an OR for each day on that time. thoughts?
[20:45:12] <deathanchor> GothAlice: I really liked that article you shared a while back about realtime aggregation
[20:45:27] <android6011> GothAlice: unless you think the examples in that link could compare with just doing each date with that time range?
[20:46:02] <GothAlice> You can split out the parts of the date, then query on them separately. I.e. db.collection.aggregate([{$match: {datetime: {$gte: ISODate("2015-04-15T00:00:00Z"), $lte: ISODate("2015-04-21T00:00:00Z")}}}, {$project: {hour: {$hour: "$datetime"}, minute: {$minute: "$datetime"}}}, {$match: {… use $and matching the hour and minute for the min/max cases …}}])
[20:46:17] <GothAlice> Hmm. Aggregates are hard to fit in one IRC line. XD
[20:46:54] <GothAlice> deathanchor: Indeed; we store click tracking data at work and pre-aggregation is a must to keep the data storage sizes down and query performance up.
[20:47:12] <deathanchor> android6011: if you are doing analytics, I suggest reading this, courtesy of GothAlice http://www.devsmash.com/blog/mongodb-ad-hoc-analytics-aggregation-framework
[20:47:26] <android6011> thats actually what im working with is tracking data
[20:48:00] <android6011> and need to pull stats for date range during certain times
[20:48:27] <GothAlice> If you're going to be doing a lot of split date range + time range queries, I'd store the dates and times separately.
[20:48:40] <GothAlice> I.e. 9-5 each day on week X.
[20:49:24] <GothAlice> (Pro tip: to store times as just times, use the UNIX epoch as the date component.)
[20:49:42] <GothAlice> (And make sure the timezone is UTC on those.)
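The epoch trick above, sketched out: a bare time-of-day is stored as a datetime anchored on 1970-01-01 UTC, so stored "times" sort and range-compare correctly regardless of the day they describe. The helper name is made up for illustration.

```python
from datetime import datetime, timezone

def time_of_day(hour, minute=0):
    """Represent a bare time as a datetime anchored on the Unix epoch
    (1970-01-01, UTC), per the pro tip: the date component is constant,
    so comparisons only ever see the time portion."""
    return datetime(1970, 1, 1, hour, minute, tzinfo=timezone.utc)

work_start = time_of_day(9)       # 09:00
work_end = time_of_day(17)        # 17:00
```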
[20:49:49] <android6011> I've also been considering doing that
[20:50:19] <android6011> when I send a date like 04-04-2015 to mongodb it makes it an ISODate with a timestamp; is there a way around that?
[20:51:02] <GothAlice> android6011: http://s.webcore.io/image/142o1W3U2y0x is our dashboard. It processes around 17K records in a few hundred milliseconds.
[20:51:34] <GothAlice> android6011: Not really; you always want your dates (even if they omit times; the time would just be midnight the morning of) to be stored as real dates, to preserve the ability to extract elements, range compare, etc.
[20:57:25] <GothAlice> android6011: https://gist.github.com/amcgregor/fb7471f072b5630e43be < I'd be curious to know how this performs for you.
[20:58:07] <GothAlice> (Minor optimization: include the minimum time in the datetime $gte, and the maximum time in the datetime $lte match queries, respectively.)
[20:59:25] <android6011> GothAlice: ill run it and let you know, give me a few
[20:59:47] <GothAlice> (You'll need to adjust for real field and collection names, of course.)
[21:57:36] <Xeon06> Hey guys. Can anyone tell me if it would be a good idea to make sure that queries like {foo: {$in: ["one item"]}} are changed to {foo: "one item"} or is that optimization already taken care of?
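The rewrite Xeon06 describes is mechanical on the client side; whether the server already performs this internally isn't confirmed in the thread. A hedged Python sketch of the normalization:

```python
def simplify_in(query):
    """Rewrite {field: {"$in": [x]}} to the equivalent equality form
    {field: x}. Conditions that are not a bare single-element $in are
    passed through untouched."""
    out = {}
    for field, cond in query.items():
        if (isinstance(cond, dict)
                and list(cond) == ["$in"]
                and len(cond["$in"]) == 1):
            out[field] = cond["$in"][0]
        else:
            out[field] = cond
    return out
```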
[22:04:23] <morenoh149> Xeon06: what's the latter, a find query?
[22:48:45] <Jameswnl> how much memory do I need to hold the whole dataset? is it just the storage size? or what else?
[22:54:11] <GothAlice> Jameswnl: https://jira.mongodb.org/browse/SERVER-17424 https://jira.mongodb.org/browse/SERVER-17456 https://jira.mongodb.org/browse/SERVER-16311 (See also: https://jira.mongodb.org/browse/SERVER-17386)
[23:03:10] <Jameswnl> GothAlice: I haven't seen those problems yet. I just want to size the memory to prevent mongod from needing to read from disk. (I need single-digit-millisecond latency under high throughput)
[23:04:30] <GothAlice> Yes; in theory matching the cache size to your dataset size should suffice to keep it in RAM at all times. I'm not fully aware of the nuances of how Wired Tiger _uses_ its cache; I'm more familiar with the technical aspects of mmapv1, though.
[23:05:07] <Jameswnl> right. i am not able to find doc about WT in this respect
[23:05:31] <GothAlice> Usually when I'm in doubt, I look at the source. (There is no more up-to-date documentation than the code. ;)
[23:05:46] <Jameswnl> so there's no sizing doc on WT yet?
[23:06:10] <GothAlice> You might try: http://source.wiredtiger.com/2.5.0/tune_memory_allocator.html
[23:06:23] <GothAlice> Wired Tiger was formerly a third-party project, and its documentation is still online.