#mongodb logs for Tuesday the 19th of May, 2015

[01:58:57] <sabrehagen> would really appreciate it if somebody could provide some input on my question here: http://stackoverflow.com/questions/30229872/node-js-mongodb-connection-in-master-or-forked-thread
[02:00:06] <StephenLynx> don't use mongoose.
[02:00:15] <StephenLynx> you are welcome.
[02:00:56] <StephenLynx> and AFAIK, you need to open a connection on each worker.
[02:01:09] <StephenLynx> clusters don't share memory objects.
[02:01:15] <StephenLynx> they act as independent processes.
[02:01:18] <StephenLynx> not as threads.
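
A minimal Node.js sketch of what StephenLynx describes, assuming the official mongodb driver (2.x callback API) and a hypothetical local test database; each forked worker is a separate process, so each one opens its own connection.

    var cluster = require('cluster');
    var MongoClient = require('mongodb').MongoClient;

    if (cluster.isMaster) {
        // the master only forks workers; no connection objects are shared with them
        for (var i = 0; i < 4; i += 1) {
            cluster.fork();
        }
    } else {
        // this branch runs once per worker, so every worker gets its own pool
        MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
            if (err) { throw err; }
            db.collection('docs').findOne({}, function (err, doc) {
                console.log('worker', process.pid, doc);
            });
        });
    }
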
[05:31:09] <svm_invictvs> ugh
[05:31:23] <svm_invictvs> So I made a boneheaded move and created a collection with a - in the name.
[05:31:30] <svm_invictvs> But now I can't delete it from my database
[05:36:05] <joannac> why not?
[05:41:29] <svm_invictvs> joannac: The shell says that the collection doesn't exist
[05:43:29] <Boomtime> svm_invictvs: I just tried creating a collection named - (single hyphen) and it works fine
[05:43:45] <Boomtime> what commands are you using?
[05:45:25] <Boomtime> 15:39:44@test>db.getCollection("-").drop()
[05:45:26] <Boomtime> true
[05:45:37] <svm_invictvs> Let me try that
[05:46:21] <svm_invictvs> aaah
[05:46:22] <svm_invictvs> Okay
[05:46:34] <svm_invictvs> Boomtime: I was trying db.foo-bar.remove()
[05:46:35] <svm_invictvs> derp
[05:47:55] <Boomtime> yep, shell is a javascript console, first and foremost it will be interpreted as javascript
[05:48:02] <svm_invictvs> yeah
[05:48:15] <svm_invictvs> well, I changed my java code
[05:48:28] <svm_invictvs> To not use "-" in collection names
[05:48:32] <svm_invictvs> use the . notation like you're supposed to
[05:49:51] <Boomtime> you can use . in a collection name if you like, but you'll see a similar problem in the shell most probably
[05:51:31] <Boomtime> or not.. ok, the shell seems happy with . in collection names using the shorthand format, so go for gold i guess
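
A small shell sketch of the workaround above: db.foo-bar.drop() is parsed as JavaScript (db.foo minus bar.drop()), so collections whose names the shorthand can't express are reached through getCollection().

    // hyphenated name: the dotted shorthand is interpreted as subtraction, this is not
    db.getCollection("foo-bar").drop()

    // dotted names can also be addressed explicitly if the shorthand misbehaves
    db.getCollection("foo.bar").find()
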
[06:43:27] <svm_invictvs> Does Mongo have a mocking framework for Java?
[08:30:03] <donCams> hi. is it possible that a capped collection be replicated?
[08:37:32] <Asenar> Hi, do you know if it's possible to build mongodb 3.x for ppc64el ?
[08:50:41] <KekSi> i have to update some legacy java code from 2.11 to 3.0.1 to use ssl - it uses WriteResult.getError() .. what do i do with it? the function is gone
[09:40:10] <OhMyGuru> hi
[11:21:40] <CustosLimen> hi
[11:21:49] <CustosLimen> how exactly do I call fsync: http://docs.mongodb.org/v2.2/reference/command/fsync/
[11:21:55] <CustosLimen> the documentation is not clear on this
[11:23:03] <CustosLimen> nvm
[11:23:03] <CustosLimen> db.runCommand({fsync:1});
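
For reference: fsync is an administrative command, so db.runCommand({fsync: 1}) applies when the shell's current database is admin; db.adminCommand() runs it from anywhere, and the lock variant holds off writes until explicitly released (a sketch, not tied to a particular server version).

    // flush all pending writes to disk
    db.adminCommand({fsync: 1})

    // flush and hold a write lock, e.g. around a filesystem snapshot
    db.adminCommand({fsync: 1, lock: true})
    db.fsyncUnlock()
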
[12:48:42] <soosfarm_> hi, how can I perform an incremental backup on mongodb?
[12:51:47] <deathanchor> the oplog is incremental, but doesn't go all the way back in time.
[12:52:22] <deathanchor> Ooo that would be badass, a new replica member which only stores oplogs :)
[13:07:26] <soosfarm_> deathanchor: lol, i'll just buy an enterprise advanced license :(
[13:07:42] <StephenLynx> wait wait wait
[13:08:03] <StephenLynx> why is anyone buying anything? it's FOSS
[13:08:21] <paradoxquine> Hi folks! Quick q: i have an admin interface for events, which each have a start Date. I want to do a query that gets the next one after today. is the best way to make an index on start time and ensure it uses that (how would I ensure that?) in a findOne query with $gt: <right now's date>, or is there a better way? Thanks in advance!
[13:09:21] <StephenLynx> paradoxquine: if the events have their date recorded as something different from when they were inserted into the database, you just need to store this date as a regular date object.
[13:09:25] <StephenLynx> if not
[13:09:31] <StephenLynx> you can use the creation timestamp embedded in the _id for that.
[13:09:41] <StephenLynx> I never did though, so you would have to look for that.
[13:10:17] <paradoxquine> StephenLynx: yes, the start dates aren't correlated with the creation date of the entity in the db
[13:11:08] <paradoxquine> i have the storage working fine, but retrieving the upcoming one i'm not sure about, since it says findOne uses natural ordering by default, and I'd prefer not to fetch all future events just to get the next one
[13:11:09] <StephenLynx> yeah, so you need to store a date object (optimal) or any other method you wish.
[13:11:22] <StephenLynx> wait, wait
[13:11:25] <StephenLynx> let me read it again
[13:11:47] <StephenLynx> ok
[13:12:08] <StephenLynx> so there might be more than one result for that findOne, because more than one event could fit the date?
[13:12:18] <StephenLynx> why not use a find with a sort and limit?
[13:12:51] <paradoxquine> i could do that, yea. i was hoping, since i already have an index on start date, that I could utilize that to just find the next one after today's date
[13:13:01] <StephenLynx> hm
[13:13:03] <StephenLynx> you might be.
[13:13:09] <StephenLynx> there is this thing "hint"
[13:13:20] <StephenLynx> that tells mongo which index to use.
[13:13:25] <StephenLynx> I never used it though
[13:13:28] <StephenLynx> don't know how it works.
[13:13:31] <paradoxquine> oh, that sounds like exactly what i want
[13:13:41] <paradoxquine> i will look into that right now, thank you StephenLynx!
[13:13:45] <StephenLynx> np
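
A sketch of the find/sort/limit approach suggested above, with hypothetical collection and field names (events, start); with an ascending index on the start date the sort is served by that index, and hint() only matters if the planner picks something else.

    // index the event start date (createIndex() in newer shells)
    db.events.ensureIndex({start: 1})

    // next event strictly after now: ascending sort on start, take one document
    db.events.find({start: {$gt: new Date()}}).sort({start: 1}).limit(1)

    // force the index explicitly if needed
    db.events.find({start: {$gt: new Date()}}).sort({start: 1}).limit(1).hint({start: 1})
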
[14:37:35] <CustosLimen> hi
[14:37:52] <CustosLimen> does mongodb lock database during msync ?
[15:43:19] <V10l4t3d> Hi all
[15:43:43] <V10l4t3d> i've a replset with 4 members
[15:44:11] <V10l4t3d> and the mongodb php driver says: Read timed out after reading 0 bytes
[15:44:28] <V10l4t3d> but the same query via the cli responds in 1 second
[15:44:42] <V10l4t3d> can you help me ?
[15:45:31] <Derick> V10l4t3d: make a log: see http://derickrethans.nl/mongodb-debugging.html#mongolog
[15:46:30] <V10l4t3d> could it be the firewall dropping packets?
[15:47:46] <Derick> make a log, it will tell you
[15:48:07] <Derick> also make sure you run the latest version too
[15:48:29] <V10l4t3d> Thanks Derick
[15:49:24] <saml> if my program does things based on the oplog, and it went down, and the oplog scrolled away, what do I do when the program comes back up?
[15:49:51] <saml> i think oplog processing should be part of mongod
[15:50:04] <saml> so as long as mongod is up, oplog gets processed to my liking
[15:50:34] <cheeser> what does that mean? "part of mongod?"
[15:50:53] <saml> like, write a plugin for mongod or something..
[15:51:18] <saml> or instead of a capped oplog, oplog grows without limit until an app consumes it
[15:52:10] <cheeser> apps tail the oplog all the time...
[15:52:13] <saml> as long as db is up, i want to guarantee eventual consistency of data
[15:52:37] <saml> but if the db and the app are separate processes, the app could go down while the db is running.. and the app loses those oplog entries
[15:52:44] <cheeser> consistency across replica set nodes?
[15:53:18] <cheeser> yes. that's a danger. replica sets have the same danger. it's only a problem for prolonged downtimes.
[15:53:21] <saml> consistency across different dbs. i have a db and another that's a denormalized version of it
[15:53:57] <saml> maybe i need different backend db and use mongodb as only denormalized version
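
A rough shell sketch of what tailing the oplog looks like on a replica-set member; the resume position is the application's responsibility, and if that position has already rolled off the capped oplog (saml's failure case) the only recovery is re-reading the source data.

    var local = db.getSiblingDB("local");

    // hypothetical saved position, persisted by the app after each entry it applies
    var lastTs = Timestamp(1432000000, 1);

    var cursor = local.oplog.rs.find({ts: {$gt: lastTs}})
                               .addOption(DBQuery.Option.tailable)
                               .addOption(DBQuery.Option.awaitData);

    while (cursor.hasNext()) {
        var entry = cursor.next();
        // apply the entry to the denormalized copy here, then persist entry.ts
        printjson(entry);
    }
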
[16:04:16] <V10l4t3d> Derick: there seems to be nothing strange in the logs
[17:39:39] <grazfather> is it possible to insert a new document but fake the DT part of the ObjectID? We'd like to avoid needing to keep dt as a separate field, but then we'd need to fake the first four bytes of the objectid
[17:40:22] <StephenLynx> why not just have this separate field created manually?
[17:40:56] <grazfather> sounds like i will need to, but we basically wanted to simplify queries, filtering on the foreign key being newer than a certain date
[17:41:32] <grazfather> but migrating to this new schema would have all object ids look like they were created within a few minutes of each other
[17:42:16] <grazfather> I know I can specify the object id but ideally I'd just specify the timestamp part of the field and have mongo db create the rest of it/guarantee uniqueness
[17:43:21] <StephenLynx> for one, I just don't touch the _id ever.
[17:43:30] <StephenLynx> I might read it sometimes.
[17:43:40] <StephenLynx> but I don't like to touch stuff with special rules and stuff.
[17:44:08] <grazfather> well, it wouldn't be changed, just created with a fake timestamp
[17:44:14] <cheeser> depending on your driver, you can specify the timestamp part of the ObjectId, iirc
[17:45:26] <grazfather> pymongo
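
For pymongo specifically, the bson ObjectId class has a from_datetime() helper that builds an id from a chosen timestamp (with the remaining bytes zeroed, so it is intended for range queries rather than for real document ids). A shell-level sketch of the same splice, borrowing the tail of a freshly generated id to keep it unique:

    // encode the chosen moment into the leading 4 bytes (seconds since the epoch)
    var when = new Date("2015-01-01T00:00:00Z");
    var hexSeconds = Math.floor(when.getTime() / 1000).toString(16);
    while (hexSeconds.length < 8) { hexSeconds = "0" + hexSeconds; }

    // keep the machine/pid/counter tail of a brand new ObjectId
    var fakeId = ObjectId(hexSeconds + ObjectId().str.substring(8));
    fakeId.getTimestamp()   // ISODate("2015-01-01T00:00:00Z")
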
[17:58:04] <deathanchor> anyone know why devs insist on making field names dynamic values in mongodb?
[17:58:15] <cheeser> what?
[17:58:46] <deathanchor> we have some ID value, and the devs made it a field name instead of a value. makes querying on that field a pain.
[17:59:51] <StephenLynx> so you are just complaining about your stupid co-workers?
[18:00:00] <cheeser> not sure how we're expected to explain the actions of your coworkers...
[18:01:13] <deathanchor> yeah, cause I have to query that data now and it's going to do a full scan.
[18:01:28] <deathanchor> one way thinking of devs.
[18:02:16] <StephenLynx> lolk
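
The pain described above comes down to document shape: with the id as a field name there is no single path to index or query, while storing it as a value makes both trivial. A before/after sketch with hypothetical names (stats, user-12345, userId):

    // id as a field name: every id needs its own query, and nothing general can be indexed
    db.stats.insert({"user-12345": {visits: 8}})
    db.stats.find({"user-12345": {$exists: true}})   // collection scan

    // id as a value: one indexed field covers every document
    db.stats.insert({userId: "user-12345", visits: 8})
    db.stats.ensureIndex({userId: 1})
    db.stats.find({userId: "user-12345"})            // uses the index
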
[18:51:36] <gswallow> Howdy y'all. Will this ever end? Tue May 19 18:49:38.659 [rsMgr] replSet I don't see a primary and I can't elect myself
[18:52:00] <GothAlice> gswallow: Not unless the problem is corrected, no.
[18:52:22] <gswallow> They're shiny new instances, all spun up from the same EBS snapshot.
[18:53:11] <GothAlice> Did you snapshot after running rs.initiate() or before? I.e. where in this process (http://docs.mongodb.org/manual/tutorial/deploy-replica-set/) did you snapshot?
[18:53:24] <gswallow> well after
[18:54:31] <GothAlice> Then those snapshots can't really work for you. You need to snapshot before. (rs.initiate() populates an initial replica set containing only one node, the one you are running rs.initiate() on, the rest need to be added to this cluster by running rs.add() on the node you ran rs.initiate() on.)
[18:54:31] <gswallow> My production replica set looks like this:
[18:54:32] <gswallow> https://gist.github.com/gswallow/ab1b7963fb05c717a78f
[18:54:55] <GothAlice> (And because that node configuration includes things like host names… yeah… snapshot before.)
[18:55:02] <gswallow> hrm
[18:55:12] <gswallow> So…I can force an initial sync.
[18:55:21] <gswallow> I'd rather avoid that, though. :)
[18:56:14] <GothAlice> … are all of the nodes clones of a single original configured with rs.initiate() and rs.add() for all nodes? If that's the configuration, then the error boils down to the nodes just can't talk to each other.
[18:56:19] <GothAlice> Also, oh, you're using TokuMX.
[18:56:32] <gswallow> yep
[18:57:02] <gswallow> all of the nodes create EBS volumes from snapshot, where the snapshot is taken (obviously) from the one EC2 instance I have.
[18:57:34] <gswallow> So when I spun up the new instances, I tried to just drop a new rs configuration in place with the new hostnames.
[18:57:58] <gswallow> I *thought* about setting up a private hosted zone in route53 and setting up tokumx*.prod.indigobio.com as CNAMEs.
[18:58:48] <GothAlice> Typically if one has a reason to use TokuMX, one has a reason to get a commercial support agreement from them. ;) What filesystem are you running on those volumes? You need to use one capable of snapshotting. Beyond Amazon EBS's natural snapshot capability, which will potentially hit partial writes.
[18:59:01] <GothAlice> http://docs.mongodb.org/manual/tutorial/expand-replica-set/#data-files
[18:59:05] <gswallow> ext4
[18:59:19] <GothAlice> That's not so good, then. The EBS snapshots won't actually be filesystem snapshots.
[18:59:21] <gswallow> I can (and do) fsfreeze it
[19:00:04] <gswallow> according to the tokumx documentation, so long as I fsfreeze it it should be good. I was — when we ran mongo — experimenting with xfs, and running a db.fsynclock or whatever.
[19:00:15] <gswallow> And I agree about a support contract but I don't pay the bills.
[19:00:48] <gswallow> the replication engine is atomic so if I stop the database's ability to write to the fs (using fsfreeze) it should be ok
[19:01:10] <GothAlice> … yeah, but I'd imagine that introduces a raft of new problems, depending on the duration of the snapshot process.
[19:01:25] <gswallow> sure; we can fall way behind.
[19:01:53] <gswallow> actually, I think that mongodb's idempotent replication engine is better. atomic operations back up very easily
[19:02:33] <gswallow> but we needed document level write locks and 85% storage savings isn't bad, for free. I'd probably give 3.0/wired tiger a closer look
[19:02:48] <GothAlice> It's not ready. WT should be ready in 3.2/3.3.
[19:02:58] <gswallow> good to know! :)
[19:03:44] <GothAlice> So, it still boils down to the nodes can't talk, if you're seeing what appears to be a valid rs.conf() on each of them.
[19:03:54] <MrAmmon> for the benefit of us noobs, can you elaborate slightly on the 'not ready' part?
[19:04:27] <GothAlice> MrAmmon: There are a large number of outstanding critical data loss, reliability, and performance issues.
[19:04:53] <GothAlice> I.e. unless you can throw an infinite amount of RAM at it, it _will_ eventually crash after it finishes consuming all of it.
[19:05:02] <MrAmmon> fair enough. that sounds remarkably similar to not ready
[19:05:07] <cheeser> certain use cases have issues. for my little app, it's been running fine.
[19:05:22] <cheeser> mms runs part of its infrastructure on WT
[19:05:23] <gswallow> the nodes can talk. When I shut one down because I'm impatient, I get notifications that the node has gone down / up / back into startup2
[19:05:28] <GothAlice> (Gratuitously large quantities of RAM in a machine can "practically" avoid the memory issues by simply deferring them well into the future.)
[19:05:52] <gswallow> plus I can connect to port 27017 on each host
[19:06:04] <GothAlice> By its DNS name, from another one of the hosts?
[19:06:34] <leandroa> hi, I'm preparing the schema for time series data, but I'm confused about how to store/pre-allocate some dynamic keys. What do you think of this schema? how can I make the second case work? https://gist.github.com/lardissone/96b71cf666f5c5d71e31
[19:06:42] <gswallow> yes
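
A quick diagnostic sketch for the election stalemate above: confirm each member's configuration and status, then dial a peer by the exact host name in that config (the host below is hypothetical).

    rs.conf()     // the host strings here must resolve and be reachable from every member
    rs.status()   // each member's view of the others: state, health, heartbeats

    // from the shell on one member, try reaching another by its configured name
    var peer = new Mongo("tokumx2.example.internal:27017");
    peer.getDB("admin").runCommand({ping: 1})
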
[19:06:54] <GothAlice> leandroa: Avoid dynamic keys like the plague. They are a trap for unwary travellers.
[19:07:33] <gswallow> this is purely a toku thing, but I also see a lot of disk read activity. I spun this up with a cloudformation template. I'm going to let this one sit for a while now that I'm not running on r3.4xlarges, and see what happens after….say…two or three hours.
[19:07:54] <GothAlice> Also, that compound _id is… gratuitous. MongoDB already supports multi-value _ids using a nested document. (Order of the keys matters, obv.) I.e. {_id: {uid: userID, pid: productId, ts: timestamp_hour}, …}
[19:07:56] <leandroa> GothAlice, I should, yes.. but I'm not sure how I can make it work well, even with queries using these dynamic keys
[19:08:14] <GothAlice> {values: [{name: "0", clicks: 10}, …]}
[19:08:15] <leandroa> ah, nice
[19:08:56] <leandroa> and how can I preallocate this? or does using them as values not require disk pre-allocation?
[19:08:58] <GothAlice> This lets you index on values.name to optimize queries on those names.
[19:09:41] <GothAlice> No need to pre-allocate, unless you're expecting the document to substantially increase in size across several operations.
[19:10:08] <leandroa> all "clicks" are incremental
[19:10:16] <GothAlice> Hmm.
[19:10:19] <leandroa> I'll have a lot of updates to these documents
[19:10:28] <bogomips> hello everyone. From php I needed to update an object by adding a sub-object like this: {'a':'1', arr:[]}. The problem was that the php code array('a'=>'1', 'arr'=>array()) produces {'a':'1', arr:{}}, so I had to use a workaround, declaring array('a'=>'1', 'arr'=>array(array())) in php and later emptying the array "arr". I bet a cleaner way exists! Thanks
[19:11:04] <GothAlice> leandroa: http://www.devsmash.com/blog/mongodb-ad-hoc-analytics-aggregation-framework
[19:11:31] <leandroa> GothAlice, thanks, let me see
[19:12:08] <GothAlice> leandroa: See also: https://gist.github.com/amcgregor/1ca13e5a74b2ac318017 (some extracts from my work, where we do click tracking)
[19:13:01] <GothAlice> leandroa: In my case, I don't worry about preallocation at all. The number of unique combinations of browser and OS are limited.
[19:13:36] <GothAlice> (And since not every browser / OS / etc. will be represented in each per-hour slice of statistics, pre-allocation would be excessively wasteful.)
[19:13:56] <leandroa> ah I see
[19:14:02] <leandroa> yes, that makes sense
[19:14:28] <GothAlice> (My structures here go against the advice I gave earlier on storing "dynamic attributes" as a list of named sub-documents… *because* this is a fixed set of attributes, not truly dynamic.)
[19:14:30] <leandroa> Should I preallocate these keywords, which don't occur in every time series?
[19:15:05] <GothAlice> If you scroll down to the sample record on my gist, you'll see that I explicitly don't pre-allocate. I don't list a count of zero against every possible version of Windows, Linux, Mac, Android, etc. just to have zeros there.
[19:15:19] <GothAlice> When querying, summing a zero is the same as skipping the record because the field is missing. ;)
[19:16:24] <leandroa> what's the minimal time slot you're tracking there? by the hour?
[19:16:40] <GothAlice> Hour, though that's entirely arbitrary, and it's common to have multiple time scales.
[19:16:49] <GothAlice> (I.e. replicate the statistics on a per-minute, per-hour, and per-day basis.)
[19:17:01] <leandroa> righ, perfect
[19:17:26] <leandroa> thank you, GothAlice !
[19:17:28] <GothAlice> This helps reduce the number of individual records that need to be processed to handle different scales of query. I.e. asking for aggregate stats over the course of a week? Use the daily pre-aggregated data and you only need to sum 7 records! :D
[19:17:50] <GothAlice> (Pre-aggregation makes most reporting constant-time, worst-case.)
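
A sketch of the structure being described, with hypothetical names (a clicks collection, banner names): the formerly dynamic key becomes a name value inside an array of sub-documents, one pre-aggregated document per hour, updated in place with $inc so absent names never need pre-allocating.

    // index the embedded name so per-name lookups and aggregations stay cheap
    db.clicks.ensureIndex({"values.name": 1})

    // one pre-aggregated document per (user, product, hour slice)
    db.clicks.insert({
        _id: {uid: 1, pid: 7, ts: ISODate("2015-05-19T19:00:00Z")},
        values: [{name: "banner-a", clicks: 0}]
    })

    // counters are bumped in place via the positional operator
    db.clicks.update(
        {_id: {uid: 1, pid: 7, ts: ISODate("2015-05-19T19:00:00Z")}, "values.name": "banner-a"},
        {$inc: {"values.$.clicks": 1}}
    )
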
[19:19:06] <leandroa> that makes a lot of sense
[19:19:19] <leandroa> thank you for making me rethink everything from scratch :P
[19:19:28] <GothAlice> It never hurts to help. :)
[19:22:49] <GothAlice> leandroa: I've got one last tidbit for you. :3
[19:23:14] <leandroa> sure, it's welcome :)
[19:23:45] <GothAlice> leandroa: http://s.webcore.io/image/142o1W3U2y0x < our dashboard, running across that pre-aggregated click data
[19:23:56] <GothAlice> Might give you some ideas. :)
[19:24:19] <GothAlice> (A click is an "ITA" in this dashboard.)
[19:25:01] <GothAlice> Oh, and the "Live Activity Feed" at the bottom is a live stream out of a capped collection. :)
[19:25:16] <leandroa> oh, beautiful
[19:26:18] <leandroa> great
[19:26:26] <leandroa> it's inspiring
[19:26:56] <GothAlice> The aggregate query in the gist I gave generates the top left click-through comparison chart, BTW.
[19:27:59] <gswallow> woop woop!
[19:28:12] <gswallow> the answer is, "yes, eventually it will end and an election will happen."
[19:28:27] <GothAlice> Huh.
[19:29:05] <gswallow> I thought I was crazy when I tested this earlier with a 40GB data set and it worked. I have like 1.3TB in production.
[19:30:27] <GothAlice> ^_^ I rolled my own xz-based GridFS compression before TokuMX was around, alas, and there's no way on Earth I'll migrate that 28 TiB to a different cluster at this point. It's just too big to move. XD
[19:30:34] <gswallow> heh
[19:31:25] <gswallow> where we are contractually obligated to host our app, we're on 7200 RPM spinning rust o_O
[19:31:34] <gswallow> that contract ends in September.
[19:31:45] <GothAlice> Platters do scale better than SSDs.
[19:32:01] <gswallow> there are — I think — 17 spindles total.
[19:32:19] <gswallow> and it costs me 3.5x what it costs per GB at amazon
[19:32:37] <GothAlice> Weirdly, my array is running 7200RPM WD Green Power low-speed disks.
[19:32:45] <GothAlice> Yeah. You noticed the "cloud gouging" pricing model, eh?
[19:32:53] <gswallow> I inherited it.
[19:33:25] <gswallow> Ironically I worked at the ISP I'm hosted at when they picked their pricing model and I thought they were on crack back then.
[19:33:33] <GothAlice> https://twitter.com/GothAlice/status/582920470715965440 < it'd cost me around half a million $ per month to host my array with Compose.io. ;)
[19:37:02] <gswallow> That $14,400 per month price is about what I'm paying, yeah. I can buy them a *new SAN* every six months.
[19:37:11] <GothAlice> Yup.
[19:37:40] <GothAlice> If you host > 1TB, it's actually more cost effective to buy 100% new hardware, colo it, and hire a DBA to manage that one box. ;)
[20:47:56] <dave_den> what's the equivalent of db.stats() using the new 2.x mongo ruby gem? using client.database.command(:stats) just hangs for me
[21:23:07] <dave_den> derp, nm.
[21:57:03] <ehershey> 2.6.10 is out
[21:57:09] <ehershey> less exciting than 3.0.3 was
[21:57:13] <ehershey> but still yay!
[22:18:38] <Thinh> hey guys, anyone here use maxTimeMS with pymongo?
[22:19:02] <Thinh> I'm using the latest pymongo, and this query just seems to throw me an error: col.find().max_time_ms(100)
[22:19:58] <Thinh> "OperationFailure: database error: Can't canonicalize query: BadValue unknown top level operator: $query"
[22:21:11] <Thinh> This is me going through mongos
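
For comparison, the same per-query time limit from the shell (col is a hypothetical collection); the "$query" in the error above suggests the query reached the server in the wrapped {$query: ...} form on its way through mongos, so checking the pymongo and mongos versions in play would be the next step.

    // cap server-side execution of this query at 100 milliseconds
    db.col.find().maxTimeMS(100)

    // the same limit can be attached to a command document
    db.runCommand({count: "col", maxTimeMS: 100})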