#mongodb logs for Wednesday the 10th of June, 2015

[00:00:54] <brotatochip> joannac: yes it finished
[00:01:08] <brotatochip> took a total of 4.25 hours to complete the mongorestore :/
[00:02:36] <joannac> brotatochip: other load on the disks?
[00:02:39] <brotatochip> also according to the log, background is set to true on this table
[00:02:51] <brotatochip> no, no load at all, this is a testing environment
[00:03:07] <joannac> monitored in MMS?
[00:03:14] <brotatochip> no
[00:03:23] <joannac> io data during the index build?
[00:03:59] <brotatochip> Not sure exactly what that means
[00:04:02] <joannac> background index builds take longer but are non-blocking. How large is the database and index?
[00:04:22] <joannac> did you collect io statistics during the index build to see what your disks were doing?
[00:04:44] <brotatochip> yes, I was monitoring with iostat -x
[00:04:53] <joannac> pastebin?
[00:04:57] <brotatochip> util was 100%, write operations were maxed at 900
[00:05:21] <brotatochip> for the entire time of the index
[00:06:00] <joannac> brotatochip: okay. so what are you expecting? you maxed out your disks
[00:06:55] <brotatochip> I wasn't expecting 900 iops for 4 hours
[00:07:14] <brotatochip> which would be equivalent to 4k iops for almost 1 hour
[00:07:15] <Boomtime> brotatochip: how big is your database?
[00:07:20] <brotatochip> 32gb
[00:07:43] <Boomtime> how many indexes?
[00:08:12] <Boomtime> do a db.stats() on the database you restored; this will provide the details of index size
[00:08:14] <brotatochip> Not sure, how do I find out?
[00:08:20] <brotatochip> oh ok, one sec
[00:08:55] <brotatochip> "indexSize" : 9558422608, Boomtime
[00:09:05] <brotatochip> 12 indexes
[00:09:24] <Boomtime> uh-huh
[00:09:35] <Boomtime> you built nearly 10GB of indexes
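
A quick shell sketch of the check Boomtime walks through, run against the restored database:

    var s = db.stats()
    s.dataSize                // bytes of document data (the ~32 GB figure above)
    s.indexSize               // bytes of index data; 9558422608 (~9 GB) here
    db.getCollectionNames()   // then db.<collection>.getIndexes() lists each collection's indexes
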
[00:09:50] <brotatochip> is there any way to expedite that process? if SHTF I'm fucked
[00:10:12] <brotatochip> I love administrating software that I have next to no experience with :/
[00:10:13] <Boomtime> a better schema design?
[00:10:49] <brotatochip> i doubt that will happen sadly
[00:11:01] <Boomtime> the design is the problem
[00:11:14] <Boomtime> or you live with it
[00:11:42] <brotatochip> great...
[00:11:44] <Boomtime> it isn't mongodb's fault that the design generates a huge amount of indexes, those are choices made entirely by you (or your developers)
[00:12:08] <brotatochip> not so much, by the developers of a project called nodebb
[00:12:23] <Boomtime> fine, that still isn't mongodb's fault
[00:12:26] <brotatochip> which our PM and CTO have decided to incorporate into our platform
[00:12:38] <brotatochip> right, thank you for explaining the reason it was slow
[00:12:45] <brotatochip> it makes sense
[00:13:34] <Boomtime> you could raise a bug report with nodebb and ask them to explain why they need an index ratio of 1:3 (which is absurd)
[00:14:02] <Boomtime> by that measure, a 1TB database would require 300GB of indexes
[00:14:12] <brotatochip> it's a forum, the collection that index was building for is one for searching posts
[00:14:16] <Boomtime> that isn't just bad, it's absolutely ridiculous
[00:14:40] <Boomtime> it is full text searching?
[00:14:47] <brotatochip> yep
[00:15:01] <Boomtime> welcome to the cost of full text searching
[00:15:29] <Boomtime> btw, it will become a little more efficient over time
[00:15:43] <Boomtime> but if you're not handling the load now, then that probably won't matter
[00:36:32] <brotatochip> the only time it matters Boomtime is if there is a disaster and I have to restore using mongodump (which is less likely as I have a secondary)
[00:37:14] <brotatochip> a hidden slave for backups is still a good idea, but restoration is never going to be fast as far as I can see
[00:38:03] <ianseyer> hi all. could somebody help me construct an aggregation pipeline? have been struggling to no avail
[00:38:53] <ianseyer> need to group all my documents by a field called 'code', and then find all those documents matching that code that have distinct titles
[00:41:49] <joannac> ianseyer: what do you have so far?
[00:44:05] <ianseyer> well, it's definitely worth mentioning this is my first serious exploration of mongo
[00:44:39] <ianseyer> but, here's my pipeline: [{'$group':{'_id':'$code'}}, {'$project':['title', 'code']}]
[00:44:41] <brotatochip> Boomtime: any idea as to how I can find out why my secondary keeps getting veto'd out of an initial sync?
[00:44:59] <ianseyer> grouping my code, then trying to get the title for each resulting document to see if any code groupings had multiple titles?
[00:45:09] <ianseyer> (doing this in pymongo, btw)
[00:45:28] <joannac> ianseyer: erm, get rid of the second bit (the $project) and just print out what you have
[00:45:44] <joannac> I think it'll help you understand what's actually happening
[00:45:49] <joannac> hint: not what you expect
[00:47:15] <ianseyer> joannac: all the codes?
[00:48:26] <brotatochip> oh, nevermind, I just found out why in the log 2015-06-10T00:46:14.077+0000 [rsBackgroundSync] replSet error RS102 too stale to catch up
[00:49:33] <brotatochip> is there any way to force the initial sync?
[00:49:41] <brotatochip> nvm, I can google
[00:49:42] <ianseyer> i figured it would be that, but would combine all those documents with matching codes into some sort of superdocument...
[00:50:04] <ianseyer> which i could then do some sort of $distinct title operator on
[00:50:14] <tejasmanohar> hey guys
[00:50:31] <tejasmanohar> should i host mongodb on a separate ec2 instance than my node.js server for production usage?
[00:51:05] <cheeser> you don't *have* to but mongodb does like to use as much memory as it can
[00:51:13] <tejasmanohar> fair enough
[00:51:22] <tejasmanohar> probably would be easier to scale if it was separate
[00:51:28] <tejasmanohar> makes management easier
[00:52:05] <tejasmanohar> cheeser: would it be wise to use an ACL in front of the EC2 instances then so only certain IPs can access?
[00:52:07] <tejasmanohar> especially mongo
[00:53:01] <tejasmanohar> also, anyone have experience with services like MongoLab and heroku?
[00:53:11] <brotatochip> tejasmanohar why would you ever not restrict IP access to a server to only what's explicitly needed?
[00:53:20] <tejasmanohar> is latency a big deal when im hosting them both on us east servers-- should be AWS so im guessing not?
[00:53:28] <brotatochip> no...
[00:53:29] <tejasmanohar> brotatochip: i guess my question is - is an AWS ACL the best way to do that?
[00:53:43] <brotatochip> latency is no issue between two servers in the same region tejasmanohar as far as I have seen
[00:53:45] <cheeser> there are security groups you could set up, yes.
[00:53:51] <tejasmanohar> gotcha
[00:53:54] <brotatochip> why don't you just use security groups tejasmanohar ?
[00:54:04] <tejasmanohar> in AWS, right?
[00:54:06] <brotatochip> whitelist all the things
[00:54:13] <tejasmanohar> yes
[00:54:16] <tejasmanohar> that's what i was asking about
[00:54:19] <brotatochip> yes
[00:54:28] <tejasmanohar> inbound/outbound rules for security groups yeah
[00:54:41] <brotatochip> outbound not as vital but be sure as shit to lock down inbound
[00:55:00] <tejasmanohar> yeah
[00:55:11] <tejasmanohar> exclude * except necessary ones
[00:55:20] <tejasmanohar> deny *, allow necessary ip's i mean
[00:55:23] <tejasmanohar> ok
[00:55:24] <brotatochip> that's how the security groups work by default
[00:55:30] <brotatochip> implicit deny
[00:55:40] <tejasmanohar> great
[00:55:45] <brotatochip> yup
[00:56:40] <tejasmanohar> considering mongolab, trying to do pro vs con on this
[00:56:58] <tejasmanohar> are there any other dedicated mongodb hosts in the AWS US East region that I should give strong consideration?
[00:57:25] <brotatochip> I don't know, I've been hosting it myself in AWS, but mongolab looks like it might be something I could really benefit from
[00:57:43] <cheeser> i'd just use MMS+AWS, personally.
[00:57:49] <cheeser> but i'm a bit biased.
[00:58:00] <tejasmanohar> https://mms.mongodb.com/
[00:58:01] <tejasmanohar> ah i see
[00:58:10] <tejasmanohar> cheeser: biased because you used this a lot?
[00:58:16] <tejasmanohar> or do you work at mongo or something
[00:58:23] <cheeser> well, and i work for the company :D
[00:58:25] <tejasmanohar> it looks neat
[00:58:28] <tejasmanohar> haha
[00:58:34] <cheeser> it's the bee's knees
[00:58:43] <tejasmanohar> wow it looks rly neat
[00:58:55] <tejasmanohar> especially monitoring
[00:59:20] <tejasmanohar> and the pricing is... fair cheeser
[00:59:23] <tejasmanohar> lol
[00:59:24] <ianseyer> joannac: oh hey! i think i got it. [{'$group':{'_id':'$code', 'titles':{'$addToSet': "$title"}}}]
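
A shell sketch of the follow-up ianseyer described at 00:50 (codes that map to more than one distinct title); the collection name 'things' is an assumption, and $size needs 2.6+:

    db.things.aggregate([
      { $group:   { _id: '$code', titles: { $addToSet: '$title' } } },
      { $project: { titles: 1, n: { $size: '$titles' } } },
      { $match:   { n: { $gt: 1 } } }
    ])
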
[00:59:30] <cheeser> with automation, upgrades are *super* simple.
[00:59:41] <tejasmanohar> but it's still not "managed"
[00:59:42] <cheeser> want to add a new shard? no big whoop!
[00:59:50] <cheeser> managed in what way?
[00:59:54] <tejasmanohar> well i guess it pretty much is
[00:59:58] <tejasmanohar> managed mongo hosting
[01:00:07] <cheeser> managed in what way?
[01:00:08] <cheeser> :D
[01:00:20] <tejasmanohar> if mongolab goes down i wont be the one diagnosing the issue in the server
[01:00:30] <tejasmanohar> which is a good thing
[01:00:31] <cheeser> ah, well, there's that.
[01:00:41] <cheeser> good thing mongo never goes down!
[01:00:53] <tejasmanohar> lol
[01:00:57] <GothAlice> What plan are you on?
[01:01:09] <tejasmanohar> GothAlice: Who?
[01:01:20] <GothAlice> Mongolab, you?
[01:01:24] <tejasmanohar> Don't use it
[01:01:27] <tejasmanohar> Considering it
[01:01:30] <GothAlice> Ah.
[01:01:33] <tejasmanohar> Well
[01:01:35] <tejasmanohar> I've used Sandbox
[01:01:43] <tejasmanohar> For a URL shortener app lol :P
[01:02:07] <GothAlice> https://twitter.com/GothAlice/status/582920470715965440 may be relevant. ;P
[01:02:09] <tejasmanohar> Just because it was easy on Heroku, and it works for our internal URL shortener that doesnt need that much space, 500mb is >>>>> than enough
[01:02:14] <tejasmanohar> lemme take a peek
[01:03:24] <tejasmanohar> lol wow
[01:03:34] <tejasmanohar> never heard of compose
[01:03:39] <GothAlice> Mongolab are $3,790 TB/mo.
[01:03:54] <tejasmanohar> can you explain that $3,790 per TB per mo right?
[01:04:03] <GothAlice> Printed on their pricing plan page.
[01:04:13] <GothAlice> https://mongolab.com/plans/pricing/#dedicated-cluster-plans < high storage M6.
[01:04:18] <tejasmanohar> Yeah gotcha
[01:04:35] <tejasmanohar> compose expensive my god
[01:04:45] <tejasmanohar> compose.io
[01:05:27] <tejasmanohar> seems like MongoLab is the industry leader for managed mongo tho, right? GothAlice
[01:05:37] <tejasmanohar> like the safest bet
[01:05:40] <GothAlice> Yeah. For me, a cluster of HS-M5s from mongolab would cost me… $76K/month, going with the less-RAM one. $106K/mo. for the more RAM one.
[01:05:53] <GothAlice> Don't know about leader, the yeah was to the expensive comment. ;P
[01:06:07] <GothAlice> So compose is actually only 4x as expensive for my scale.
[01:06:17] <tejasmanohar> ah
[01:06:31] <tejasmanohar> well i dont quite have that scale :P
[01:07:17] <tejasmanohar> I've heard a lot of things about MongoDB being terrible if my app scales out to multiple nodes, due to its poor distributed consistency, and especially because im charging users for things and saving that in the db, so if it doesn't save that's not good
[01:07:34] <tejasmanohar> Is that just people not configuring the DB properly, or is it a common situation out of the box?
[01:08:38] <GothAlice> https://blog.serverdensity.com/does-everyone-hate-mongodb/ covers most of the naysayers.
[01:08:49] <GothAlice> And the answer is yes, that is the most typical situation.
[01:08:57] <GothAlice> (Misunderstanding coupled with misconfiguration or misuse.)
[01:09:34] <GothAlice> The default settings used to be less… tolerant of ignorance… than they are today.
[01:10:11] <tejasmanohar> Ah ok
[01:10:24] <tejasmanohar> But w/ a hosted MongoDB platform like MongoLab, I should see less of that? GothAlice
[01:10:56] <tejasmanohar> Is there just a top notch book or somethign I can read about configuring, scaling, etc. MongoDB?
[01:11:00] <GothAlice> You'll still need to be aware of things like write concern and read preference.
[01:11:05] <tejasmanohar> Yeah
[01:11:09] <GothAlice> docs.mongodb.org :P
[01:11:19] <tejasmanohar> Write Concerned Journaled
[01:11:51] <tejasmanohar> So later, I can do Replica Acknowledged I guess if I have a multi-node setup? GothAlice
[01:12:01] <GothAlice> If that is a requirement of yours, yes.
[01:12:02] <tejasmanohar> (for extended validation that it's saved)
[01:12:07] <GothAlice> It comes at an ever-steepening performance cost.
[01:12:12] <tejasmanohar> Ah shit
[01:12:23] <GothAlice> Replica-confirmed means additional network roundtrips.
[01:12:36] <tejasmanohar> At the point that I put Replica Acknowledged, is there really the high speed of Mongo compared to other dbs?
[01:12:42] <tejasmanohar> Prob not
[01:12:47] <tejasmanohar> But that's because of configuration i guess
[01:13:13] <tejasmanohar> Journaled is defualt now, right GothAlice ?
[01:13:15] <tejasmanohar> oh nvm Acknowledged
[01:13:45] <GothAlice> tejasmanohar: https://www.mongodb.com/blog/post/high-performance-benchmarking-mongodb-and-nosql-systems
[01:14:26] <tejasmanohar> Ah ok lol
[01:14:44] <GothAlice> :)
[01:14:57] <tejasmanohar> We then tested a configuration that prevents any possible data loss. In this configuration, MongoDB outperforms Cassandra and Couchbase by more than 25x, with latency that is more than 95% better than Cassandra, and more than 99.5% better than Couchbase.
[01:14:59] <tejasmanohar> enough for me
[01:15:03] <GothAlice> Yeeeah.
[01:15:09] <tejasmanohar> not something i have to worry about right now much because i have no one using my app
[01:15:19] <tejasmanohar> but its just something i want to think about like do i need to be considering switching etc
[01:15:43] <GothAlice> Understanding how to construct your sharding keys for maximum benefit is something you'll need to worry about later, too.
[01:15:50] <tejasmanohar> GothAlice: for any financial application, would you recommend pushing up to a higher write concern than the default Acknowledged like Journaled?
[01:15:52] <GothAlice> Third-party hosted or not.
[01:15:56] <tejasmanohar> Not doing replication yet because of the super low vol of data
[01:16:01] <tejasmanohar> not multi node
[01:16:09] <tejasmanohar> With a journaled write concern, MongoDB acknowledges the write operation only after committing the data to the journal. This write concern ensures that MongoDB can recover the data following a shutdown or power interruption.
[01:16:12] <tejasmanohar> that seems ideal
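
A minimal sketch of how the write concerns under discussion look per-operation in the node.js driver of the era; the connection string, db, collection and document contents are all assumptions:

    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/lottery', function (err, db) {
      if (err) throw err;
      var purchases = db.collection('purchases');
      // w:1 (Acknowledged) is the default: the primary confirms the write.
      purchases.insert({ user: 'u1', tickets: 2 }, { w: 1 }, function (err) {
        // j:true (Journaled): also wait until the write is in the on-disk journal.
        purchases.insert({ user: 'u2', tickets: 1 }, { w: 1, j: true }, function (err) {
          // w:'majority' (Replica Acknowledged): wait for a majority of the replica set.
          purchases.insert({ user: 'u3', tickets: 5 }, { w: 'majority' }, function (err) {
            db.close();
          });
        });
      });
    });
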
[01:16:32] <GothAlice> For financial information I recommend using a transactional database (TokuMX, a fork of MongoDB, or any actual transactional DB), or figuring out two-phase commits and rollback strategies.
[01:16:35] <GothAlice> Financials are hard.
[01:16:51] <tejasmanohar> hm hard term too
[01:16:57] <tejasmanohar> what does "financial" mean to you?
[01:17:01] <tejasmanohar> any service that you charge users money for?
[01:17:06] <tejasmanohar> im not like PayPal
[01:17:18] <tejasmanohar> you can just buy stuff (lottery tickets) in my app
[01:18:11] <tejasmanohar> Shoot, is TokuMX the same interface tho? Like if i have a node mongoose express app can i still use it in the same way as mongo?
[01:18:15] <tejasmanohar> or is it a lot different
[01:18:25] <tejasmanohar> ?
[01:19:37] <GothAlice> Very, very similar.
[01:19:45] <GothAlice> Most differences only apply to back-end server-to-server connections.
[01:19:52] <GothAlice> (I.e. you can't mix replica nodes.)
[01:20:18] <GothAlice> Any system where users can potentially exploit race conditions for direct or indirect financial gain.
[01:20:50] <tejasmanohar> hm
[01:21:19] <tejasmanohar> gotta think about if we fall into that hehe
[01:21:57] <brotatochip> race condition attacks are super cool
[01:22:03] <tejasmanohar> What would be an example? Like a gambling casino app? GothAlice
[01:22:10] <GothAlice> You want to know that if the user hits "Stop" in their browser part-way through a deduction from one account and addition to another, that either both operations succeed or both operations fail, also.
[01:22:38] <GothAlice> If not, either money disappears from existence, or money can be spontaneously created.
[01:22:57] <tejasmanohar> Oh no we don't have a system like that
[01:23:00] <tejasmanohar> Not a finance application
[01:23:11] <tejasmanohar> You can just buy tickets, and Stripe handles the charges GothAlice
[01:23:11] <GothAlice> "Money" being the thing you're counting.
[01:23:14] <GothAlice> Doesn't have to be $.
[01:23:17] <tejasmanohar> Ah that's true true
[01:23:20] <tejasmanohar> We do store user credit
[01:23:22] <tejasmanohar> hm
[01:23:30] <tejasmanohar> Yeah
[01:23:37] <GothAlice> If a balance is deducted after the creation of the invoice, for example…
[01:23:38] <GothAlice> Free orders for me.
[01:23:58] <tejasmanohar> Yes
[01:24:00] <tejasmanohar> AH
[01:24:01] <GothAlice> One must learn to think like an attacker in these situations. :)
[01:24:02] <tejasmanohar> AH
[01:24:04] <tejasmanohar> Ah
[01:24:13] <tejasmanohar> *sorry i dont know why i kept doing capital h :P
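
One common guard for the balance race GothAlice describes, sketched in the shell; the 'users' collection, balance field, and the userId/price variables are assumptions. The check and the deduction become one atomic step:

    db.users.findAndModify({
      query:  { _id: userId, balance: { $gte: price } }, // deduct only if funds remain
      update: { $inc: { balance: -price } },
      new:    true
    })
    // A null result means insufficient funds (or a concurrent request won the
    // race) and nothing was deducted; only create the invoice when a document
    // actually comes back.
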
[01:24:30] <tejasmanohar> Hm GothAlice but how would changing the database, change that?
[01:24:32] <GothAlice> XD Thought you were having a eureka-seisure.
[01:24:51] <cheeser> in practice, financial transactions are never as simple as a single transaction "take from one give it to another" action
[01:25:09] <GothAlice> In TokuMX you can perform a bulk operation set that will either all succeed, or all fail, regardless of the client connection failing or even the server being written to disappearing. On startup, the incomplete transaction will be rolled back safely.
[01:25:34] <GothAlice> TokuMX is fully ACID compliant.
[01:25:56] <tejasmanohar> So you're saying I can connect to TokuMX server the same way I can connect to mongo servers etc?
[01:26:00] <tejasmanohar> Like I don't need another driver?
[01:26:04] <tejasmanohar> Because I'm using Mongoose
[01:26:13] <tejasmanohar> And I don't see any documentation of this so I'm getting wary ;)
[01:26:22] <GothAlice> Pity about you using mongoose.
[01:26:27] <GothAlice> But yeah, it's wire-protocol compatible.
[01:26:37] <GothAlice> Like MongoDB vs. MariaDB.
[01:26:41] <cheeser> toku is a mongo fork, essentially, with a different storage engine
[01:26:42] <tejasmanohar> u mean MySQL
[01:26:43] <tejasmanohar> :P
[01:26:48] <GothAlice> Er, yeah.
[01:26:49] <tejasmanohar> Gotcha cheeser GothAlice
[01:26:50] <GothAlice> Habits.
[01:26:52] <tejasmanohar> hehe np
[01:27:34] <tejasmanohar> Alright, I'll have to give it a shot I think... so how do things just magically get better and more ACID compliant without changing the way i interact with the DB?
[01:27:41] <tejasmanohar> like why doesn't Mongo just incorporate those things? GothAlice
[01:28:01] <cheeser> part technical, part product reasons
[01:28:31] <tejasmanohar> fair enough
[01:28:47] <cheeser> toku's txns, e.g., only apply to a single shard. mongo's philosophy to date has been to do nothing that can't be applied equally well in a sharded situation
[01:28:52] <StephenLynx> tejasmanohar you talking about toku?
[01:28:57] <GothAlice> TokuMX uses a very different behind-the-scenes structure that allows for point in time everything.
[01:29:08] <GothAlice> (Fractal trees instead of b-trees.)
[01:29:08] <StephenLynx> they made A LOT of compromises to achieve what they achieved.
[01:29:11] <tejasmanohar> StephenLynx: yes
[01:29:17] <tejasmanohar> StephenLynx: go on :)
[01:29:17] <StephenLynx> from what I heard, it doesn't even support unique indexes.
[01:29:19] <cheeser> StephenLynx: and not all of them good
[01:29:31] <StephenLynx> toku has a really narrow use case.
[01:29:33] <tejasmanohar> sheet so i MAY have to change things in my code :P
[01:29:33] <GothAlice> Indeed. It's a trade-off for those with slightly different needs. :)
[01:29:53] <tejasmanohar> so i do need to change things in my code from that then
[01:30:03] <GothAlice> (Those needs being über performance, compression at the time was unique to TokuMX, and the aforementioned point-in-time everything that facilitates true ACID transactions.)
[01:30:14] <StephenLynx> toku is a very advanced tool that one shouldn't adopt without studying it a lot.
[01:30:17] <GothAlice> There would be certain patterns you would have to avoid.
[01:30:47] <StephenLynx> because it will impose severe limitations on what you can do.
[01:31:22] <tejasmanohar> trying to find the limitations doc
[01:31:38] <tejasmanohar> mongodb on 1 node with journaled write concern seems safe as ever
[01:31:39] <tejasmanohar> am i right?
[01:31:41] <StephenLynx> they don't seem to be very concerned about honesty, from what I gathered from their website.
[01:31:48] <tejasmanohar> the real problem only comes out when i use multiple nodes and scale
[01:31:58] <tejasmanohar> and that's when it takes more thought into consideration
[01:32:02] <StephenLynx> I don't expect them to highlight limitations.
[01:32:18] <tejasmanohar> Ugh
[01:32:21] <tejasmanohar> Having trouble finding :P
[01:32:36] <tejasmanohar> The best I can find is like getting more out of mongo lol
[01:32:54] <StephenLynx> yeah, they just put it like they did magic and its perfect and you should TOTALLY use it.
[01:33:03] <tejasmanohar> It actually does highlight limitations in the replication set doc
[01:33:08] <tejasmanohar> But there's not a concise place
[01:33:10] <StephenLynx> always suspect stuff like that.
[01:33:19] <StephenLynx> if its too good to be true
[01:33:20] <tejasmanohar> So... for launching our beta in < 1 week, I'll pass this time
[01:33:22] <StephenLynx> its too good to be true.
[01:33:34] <tejasmanohar> I can always migrate later as/if we need to expand to multiple nodes
[01:33:38] <StephenLynx> yeah.
[01:33:49] <StephenLynx> study it, learn and adopt if its fit.
[01:33:52] <tejasmanohar> Does someone have performance stats of Journaled vs Acknowledged write concern speeds?
[01:36:00] <tejasmanohar> Is Write Concern something set at the driver level when you write to the DB or in the DB's configuration itself? Looks like a driver level but checking since I don't see it written anywhere in the docs (prob just missing something)
[01:36:04] <tejasmanohar> / cc GothAlice
[01:36:19] <tejasmanohar> Looks like driver
[01:36:46] <StephenLynx> yes, at driver.
[01:36:56] <StephenLynx> the default used to be 0, now its 1.
[01:37:10] <cheeser> has been for at least 2 years now
[01:37:50] <StephenLynx> I wouldn't be surprised if they went with 0 at launch for marketing reasons.
[01:37:58] <StephenLynx> "HEI GUISE, LOOK HOW FAST IT IS :^)"
[01:38:14] <StephenLynx> then bad PR hit and they moved to 1.
[01:39:52] <tejasmanohar> lol
[01:40:02] <tejasmanohar> im considering Journaled write concern
[01:40:16] <tejasmanohar> I just can't find the speed comparisons anywhere StephenLynx
[01:40:31] <StephenLynx> I expect toku to face some serious bad PR if they gain traction.
[01:40:35] <tejasmanohar> nvm
[01:40:37] <tejasmanohar> https://whyjava.wordpress.com/2011/12/08/how-mongodb-different-write-concern-values-affect-performance-on-a-single-node/
[01:40:42] <tejasmanohar> but this is too old
[01:40:45] <tejasmanohar> 011
[01:40:46] <cheeser> they got bought by percona so ...
[01:40:47] <tejasmanohar> 2011
[01:41:04] <StephenLynx> percona?
[01:41:33] <tejasmanohar> https://www.percona.com/
[01:41:34] <tejasmanohar> never heard of em
[01:42:02] <tejasmanohar> the biggest thing for me is thinking DIY vs managed hosting
[01:42:03] <cheeser> https://www.percona.com/blog/2015/04/14/tokutek-now-part-of-the-percona-family/
[01:42:14] <tejasmanohar> and which is going to ultimately save more money
[01:42:37] <tejasmanohar> if i screw up and cause downtime that could potentially lose money too lol
[01:42:39] <GothAlice> Ah, tejasmanohar: http://edgystuff.tumblr.com/post/93523827905/how-to-implement-robust-and-scalable-transactions
[01:43:10] <GothAlice> Heh, Percona are a somewhat large storage vendor.
[01:43:13] <tejasmanohar> GothAlice: for the stats?
[01:43:19] <tejasmanohar> nv i ll look onesec
[01:43:21] <tejasmanohar> nvm
[01:43:36] <tejasmanohar> eventual consistency scares me
[01:43:49] <tejasmanohar> when we have time sensitive data like a lottery draw closing NOW :P
[01:43:50] <GothAlice> Depending on how you structure your updates (i.e. using two-phase, synchronization, stacks, queues, etc.) they can be highly resistant to partial writes.
[01:44:19] <cheeser> tejasmanohar: eventual only applies to the secondaries. so long as you do primary reads, you're fine.
[01:44:29] <GothAlice> Critical reads you point at primaries.
[01:44:44] <tejasmanohar> aight
[01:44:53] <GothAlice> Only for queries where getting stale data is A-OK (i.e. user profile data like profile picture, etc.) do you direct them at secondaries.
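
A sketch of steering reads in the node driver, per cheeser and GothAlice above; 'users' and 'profiles' stand in for already-obtained collection objects:

    var ReadPreference = require('mongodb').ReadPreference;

    // Critical read: goes to the primary (the default), so it can never be stale.
    users.findOne({ _id: id }, function (err, doc) { /* ... */ });

    // Staleness-tolerant read (profile pictures etc.): secondaries allowed.
    profiles.find({}, { readPreference: ReadPreference.SECONDARY_PREFERRED })
            .toArray(function (err, docs) { /* ... */ });
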
[01:45:20] <cheeser> you need to be wary of rollbacks during primary elections but those are relatively rare and you can mitigate by using a higher write concern
[01:45:43] <GothAlice> Related: http://askasya.com/post/canreplicashelpscaling
[01:46:16] <tejasmanohar> ah
[01:46:16] <GothAlice> (Replica sets are more of a reliability thing than a performance thing.)
[01:46:19] <tejasmanohar> i only use primary
[01:46:24] <GothAlice> Exactly. :)
[01:46:33] <GothAlice> Most people do. It's the safest option.
[01:47:00] <cheeser> and sufficient for most use cases
[01:48:07] <cheeser> tejasmanohar: https://www.packtpub.com/packt/offers/free-learning/
[01:48:28] <tejasmanohar> oh nicee thansks
[01:48:38] <tejasmanohar> im feeling a lot better after going over all this stuff here lol
[01:48:43] <GothAlice> For things like large-scale analytics, the laggiest data we've had back was 15 seconds old, and that's perfectly acceptable for fifteen-minute to one-hour granularity graph data. The data being closer to the people querying it was more important. :)
[01:48:43] <tejasmanohar> was p scared about launching before
[01:48:49] <cheeser> that book might be terrible but it's free :)
[01:48:52] <tejasmanohar> true
[01:49:02] <GothAlice> Hard to beat free. There's also a bunch of courses available on mongodb.com, I believe.
[01:49:06] <GothAlice> Many are free.
[01:49:40] <cheeser> https://university.mongodb.com/
[01:49:44] <GothAlice> That'd be it.
[01:50:26] <GothAlice> I also find the whitepapers and whatnot that are on mongodb.com (but not .org for some reason) to be quite useful in gauging capabilities and approaches to problems.
[01:50:38] <GothAlice> Helps to have examples to follow. :)
[01:50:47] <cheeser> .org is the community site. .com is the commercial side.
[01:51:19] <tejasmanohar> ahh
[01:51:20] <cheeser> so .org is more about code, contributions, and open source woowoo! where .com is more for business interests
[01:51:25] <GothAlice> Indeed; still useful resources, though.
[01:51:25] <tejasmanohar> cheeser: not always :P but im glad it is here
[01:51:37] <cheeser> tejasmanohar: not always what?
[01:51:50] <tejasmanohar> .org -> community site
[01:51:56] <tejasmanohar> (common misconception by standardized tests)
[01:52:04] <tejasmanohar> (fkin annoys me when they say which site is reliable)
[01:52:16] <GothAlice> "Organization", typically non-profit.
[01:52:18] <cheeser> i'm not speaking philosophically. that's exactly how those two sites break down.
[01:52:27] <GothAlice> Originally required proof of non-profitness, actually.
[02:02:34] <tejasmanohar> Yeah
[02:02:48] <tejasmanohar> I understand :)
[02:07:57] <tejasmanohar> GothAlice: do you know Journaled vs Acknowledged speed?
[02:08:16] <tejasmanohar> That's the only thing I haven't been able to find
[02:08:21] <tejasmanohar> Just wanna know how much of a performance hit it is
[02:08:33] <cheeser> journaled will be slightly slower
[02:09:00] <GothAlice> Not sure of the applicability of these results: http://techidiocy.com/write-concern-mongodb-performance-comparison/
[02:09:11] <GothAlice> A drop to 25/sec seems extreme to me.
[02:09:14] <cheeser> all writes are journaled. the journaled write concern waits for that to happen vs acknowledged is just the server ack'ing the write back to the driver
[02:09:30] <cheeser> seems questionable
[02:10:20] <cheeser> even unack'd should be faster than what he's showing
[02:10:58] <GothAlice> https://blog.serverdensity.com/mongodb-benchmarks/
[02:11:08] <GothAlice> When in doubt, ServerDensity comes to the rescue.
[02:11:20] <cheeser> ah, david mytton.
[02:11:48] <cheeser> "This is true because you can only really get an idea of performance when you’re testing your own queries on your own hardware. Raw figures can seem impressive but they’re not representative of how your own application is likely to perform."
[02:11:55] <GothAlice> Truth.
[02:13:07] <tejasmanohar> ah
[02:13:15] <tejasmanohar> "all writes are journaled. the journaled write concern waits for that to happen vs acknowledged is just the server ack'ing the write back to the driver"
[02:13:22] <tejasmanohar> so it really doesnt make a difference
[02:13:29] <tejasmanohar> if the server crashes, i still lose data thats not journaled lol
[02:13:43] <tejasmanohar> oh wait nvm
[02:13:46] <tejasmanohar> i do things when things are saved
[02:13:59] <cheeser> i think journal writes are done every 10ms
[02:14:01] <cheeser> so ...
[02:14:02] <tejasmanohar> so i know if its really "Saved" safely or not based on if its journaled
[02:14:04] <tejasmanohar> ah
[02:14:06] <tejasmanohar> then fkit
[02:14:12] <cheeser> pretty much
[02:17:43] <Doyle> Hey. Does this look about right for a geo distributed replication-set with sharding? https://drive.google.com/file/d/0B5g2nsz5NekdSnB1RVoyeC1FM1k/view?usp=sharing
[03:18:09] <arussel> anyone knows of a graphite plugin for mongo ?
[03:39:37] <arussel> how do you get the lag of a secondary out of rs.status() ?
[03:41:09] <joannac> arussel: diff the optime with the primary optime
[03:41:21] <Boomtime> @arussel: or use db.printSlaveReplicationInfo()
[03:41:42] <Boomtime> (what joannac says is correct, i just find the other command much easier)
[03:43:16] <arussel> Boomtime: I have to parse it in javascript to send it to graphite
[03:43:32] <arussel> joannac: thanks
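
A shell sketch of joannac's diff approach, using the optimeDate field each member document in rs.status() carries:

    var s = rs.status();
    var primary = s.members.filter(function (m) { return m.stateStr === 'PRIMARY'; })[0];
    s.members.forEach(function (m) {
      if (m.stateStr === 'SECONDARY') {
        // subtracting two Dates yields milliseconds
        print(m.name + ' lags by ' + (primary.optimeDate - m.optimeDate) / 1000 + 's');
      }
    });
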
[03:47:04] <sabrehagen> given that objectids store a timestamp in them, is there a query to check if any documents in a collection are newer than a given document by comparing ids?
[03:48:20] <joannac> yes, but it'd only be at millisecond accuracy
[03:50:26] <sabrehagen> that's plenty. how might i structure the query?
[03:50:46] <arussel> sabrehagen: Stackoverflow has plenty of examples
[03:51:38] <arussel> iirc, just $lt, $gt on _id should do
[03:53:25] <sabrehagen> arussel: thanks, was not immediately obvious when googling. will try this now.
[03:56:31] <sabrehagen> thanks, works great!
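
Roughly the query sabrehagen ended up with; 'posts' and someId are stand-ins:

    // ObjectIds sort by their embedded timestamp, so "newer than this doc"
    // is a plain _id comparison:
    var doc = db.posts.findOne({ _id: someId });
    db.posts.find({ _id: { $gt: doc._id } }).count(); // > 0 means newer docs exist

    // A boundary _id can also be built straight from a Date:
    var hexSeconds = Math.floor(Date.now() / 1000).toString(16);
    db.posts.find({ _id: { $gt: ObjectId(hexSeconds + '0000000000000000') } });
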
[04:11:20] <sabrehagen> lil leo
[04:11:29] <sabrehagen> sorry, wrong channel...
[04:11:57] <tejasmanohar> 557668d932d0f724c8af263a Unhandled rejection Error: Invalid val: {"_bsontype"=>"ObjectID", "id"=>"UvjÒ¬\u0092\f B\u0082\u009D8"} must be a string under 500 characters
[04:11:59] <tejasmanohar> mongoose why
[04:12:37] <tejasmanohar> https://gist.github.com/tejasmanohar/f4d589fad308de8abf30
[04:19:45] <arussel> how do you compute the lock % (mongo 2.6) ?
[04:20:54] <arussel> is it globallock.totalTime / globallock.lockTime ?
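
No one answered in-channel, so hedging: the conventional figure is lockTime over totalTime (the inverse of the ordering in the question), sampled twice so it covers an interval rather than the whole uptime. A shell sketch:

    function sample() {
      var gl = db.serverStatus().globalLock; // totalTime/lockTime in microseconds on 2.6
      return { t: gl.totalTime, l: gl.lockTime };
    }
    var a = sample();
    sleep(1000); // shell helper: wait one second
    var b = sample();
    print('lock %: ' + (100 * (b.l - a.l) / (b.t - a.t)).toFixed(1));
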
[09:57:45] <roelof> Can someone help me with this problem : /msg NickServ IDENTIFY account password
[09:58:05] <roelof> sorry: I mean this problem : https://groups.google.com/forum/#!topic/mongodb-user/FnxKJnFJFik
[10:07:19] <roelof> Can someone help me with this problem : https://groups.google.com/forum/#!topic/mongodb-user/FnxKJnFJFik
[10:10:59] <joannac> roelof: looks like homework to me.
[10:11:16] <joannac> M102?
[10:11:29] <roelof> nope
[10:12:35] <joannac> roelof: then where's it from?
[10:13:02] <roelof> a try for myself
[10:15:20] <joannac> roelof: you can traverse an array in whatever language you want
[10:15:47] <joannac> it's not really related to mongodb...
[10:16:51] <roelof> oke, I will try further to find the scores.
[10:17:08] <roelof> joannac: thanks
[10:24:13] <pagios> hi all
[10:24:43] <pagios> i am looking to run a db on my rpi, which can crash at any time (by removing the power for example). is mongodb a good db to use in this case?
[10:24:54] <pagios> does it run on persistent storage or purely in volatile ram?
[10:39:46] <einyx> http://docs.mongodb.org/manual/core/write-concern/
[11:23:58] <mehdy314> I have a python script that inserts docs from a mongo collection into another db using pymongo. the collection has 306099 docs; after running the script without any error, just 33426 docs had been inserted. i use the find method, by the way
[11:49:11] <mehdy314> solved! problem was in the script
[12:11:05] <Doyle> When running a distributed mongodb setup, is it recommended to place config servers and query routers at different geographical locations?
[12:14:07] <Doyle> Say you have US-EAST and US-WEST with a sharded replication set. You place the shards in EAST (Primary/Secondary/Arb), and another secondary for each shard (priority 0) in West.
[12:14:31] <Doyle> In this setup you have all the config servers and query routers in East. No problem.
[12:16:38] <Doyle> Can you place a new shard in West (Pri/Sec/Arb) and add it to the configuration so the config servers track all the meta data for it as well? Is there anything to be concerned with when spanning config servers and query routers across different sites?
[13:05:53] <leev> i have a cluster with a primary, secondary and arbiter. when I shutdown the secondary, the primary drops back to secondary. shouldn't it stay primary as it's also connected to the arbiter?
[13:06:55] <cheeser> what do the logs say?
[13:09:19] <leev> hmm
[13:09:20] <leev> Wed Jun 10 13:06:23.362 [rsMgr] can't see a majority of the set, relinquishing primary
[13:09:20] <leev> Wed Jun 10 13:06:23.362 [rsMgr] replSet relinquishing primary state
[13:09:21] <leev> Wed Jun 10 13:06:23.362 [rsMgr] replSet SECONDARY
[13:10:06] <leev> but the arbiter is there and has health:1
[13:13:05] <leev> the arbiter sees the primary go into secondary, so they are connected
[13:13:22] <cheeser> does the primary see the arbiter? what does rs.status() show on the primary
[13:17:18] <leev> yeah looks like it loses connection to the arbiter
[13:18:33] <leev> i'm guessing this has something to do with the upgrade i'm doing at the moment ...
[13:19:03] <leev> following the guide. i was on 2.4. all mongos are now 2.6, config servers are 2.6, now trying to do the mongod's.
[13:19:34] <StephenLynx> any good reason for not using 3.0?
[13:19:53] <leev> StephenLynx: is that directed at me?
[13:20:17] <leev> i'm on my way there, have to go through 2.6 to get there though :)
[13:20:42] <StephenLynx> ah
[13:21:12] <leev> straight after the secondary goes down, the primary sees:
[13:21:13] <leev> Wed Jun 10 13:06:26.426 [ReplicaSetMonitorWatcher] Socket say send() errno:9 Bad file descriptor 10.192.8.39:10002
[13:21:17] <leev> which is the arbiter
[13:24:26] <leev> i also get "[LockPinger] Socket say send() errno:9 Bad file descriptor" to all config servers
[13:26:31] <nfo> Hi. About the reuse of disk space freed by deleted documents, is setting a TTL on documents more efficient than removing the docs with the `delete` command? http://docs.mongodb.org/manual/tutorial/expire-data/ As far as I know, with mmapv1, one has to compact() or repairDatabase() or resync the collection/DB to be able to reuse disk space freed by deleted documents, is that true?
[13:27:21] <nfo> oh, it's more or less documented here: http://docs.mongodb.org/manual/faq/storage/#faq-empty-records
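
A minimal shell sketch of the TTL route ('events' collection and 'createdAt' field are assumptions). Either way, under mmapv1 the freed space is reused by the collection but the files only shrink via compact()/repairDatabase(), as the FAQ above says:

    // Docs become eligible for deletion 3600s after their createdAt value;
    // a background task removes them roughly once a minute.
    db.events.ensureIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
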
[14:00:55] <rickardo1> I am new to mongodb and need it only to handle a large amount of products (~1gb). they are listed under root with key "Data" : [....] Is there any way I can import them directly from the cmd line?
[14:01:48] <StephenLynx> How is this data formatted and organized?
[14:02:18] <StephenLynx> I have never used mongoimport, but I think it might be able to handle some common standards.
[14:02:48] <StephenLynx> if you have any chance of automatically importing it, it will be with mongoimport.
[14:12:21] <rickardo1> StephenLynx: mongoimport -d test -c Data data/products.json it goes to 100% but then a memory error.. have 8 gb on this machine.. :/
[14:13:02] <saml> how can I drop databases with prefix test ?
[14:13:18] <saml> due to bug in test suite, it created so many test* databases
[14:13:50] <tubbo> hey fellas
[14:16:31] <Doyle> Hey. When spanning multiple sites with a replication set, are the config and query servers distributed as well?
[14:16:39] <tubbo> any mongoid users here? wondering how i can solve a timeout issue with an aggregation i have. https://gist.github.com/tubbo/aecf639c8e683b43777c
[14:16:41] <Doyle> Couldn't find any specifics on that
[14:17:04] <tubbo> actually more just looking for documentation how one would optimize it
[14:17:10] <saml> db.adminCommand('listDatabases').databases.map(function(x){return x.name;}).filter(function(x){return /^test/.exec(x);}).forEach(function(x){db.getSiblingDB(x).dropDatabase();})
[14:17:12] <saml> that worked
[14:38:05] <StephenLynx> rickardo1 you could write a simple import script.
[14:38:19] <StephenLynx> that would stream content.
[14:41:06] <StephenLynx> but then you would have to split your json in multiple parts to make that easier.
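
A sketch of the kind of streaming import script StephenLynx means, assuming the dump has first been reshaped to one JSON document per line; the path and collection name follow the mongoimport command above:

    var fs = require('fs');
    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
      if (err) throw err;
      var leftover = '';
      var stream = fs.createReadStream('data/products.json', { encoding: 'utf8' });
      stream.on('data', function (chunk) {
        var lines = (leftover + chunk).split('\n');
        leftover = lines.pop(); // the tail may be a partial line
        lines.forEach(function (line) {
          if (line.trim()) db.collection('Data').insert(JSON.parse(line));
        });
      });
      stream.on('end', function () {
        if (leftover.trim()) db.collection('Data').insert(JSON.parse(leftover));
        db.close(); // a production script would wait on the insert callbacks first
      });
    });
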
[14:42:56] <saml> what does ns not found mean? when you do db.collection.drop()
[15:19:32] <yauh> I have a collection with docs that contain highscores for a game. I want to show the top 10 plus the current player
[15:20:03] <StephenLynx> $or
[15:20:07] <StephenLynx> wait, wait
[15:20:10] <yauh> I only want to find 10+1 documents, but the current player's position in the score list (you are on position #27153) should be listed as well
[15:20:25] <StephenLynx> hm
[15:20:29] <yauh> without the position in the score list I can do it already ;)
[15:20:38] <yauh> is that what map/reduce may help with?
[15:20:47] <StephenLynx> I would pre-aggregate the position of the player.
[15:20:48] <yauh> only dabbled with aggregation so far
[15:22:14] <yauh> pre-aggregated seems like an inefficient approach because almost all documents must be changed if some of the first positions change
[15:22:22] <yauh> is that really my best choice?
[15:25:21] <StephenLynx> hm
[15:25:46] <StephenLynx> a quick search revealed that mongo can't give you the position of a document.
[15:26:02] <StephenLynx> but you could use count.
[15:26:23] <StephenLynx> http://stackoverflow.com/questions/10813908/get-position-of-selected-document-in-collection-mongodb
[15:26:41] <StephenLynx> so you can get the amount of documents with a score higher than your player.
[15:26:54] <StephenLynx> and figure the position of the player based on that.
[15:26:59] <yauh> that would work
[15:27:09] <yauh> I just need to count the players in front of the current and bam - done
[15:27:38] <StephenLynx> yep
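
The resulting queries, sketched in the shell; 'highscores', the field names, and the player value are assumptions:

    var top10 = db.highscores.find().sort({ score: -1 }).limit(10).toArray();
    var me = db.highscores.findOne({ player: 'currentPlayer' });
    var position = db.highscores.count({ score: { $gt: me.score } }) + 1; // "position #27153"
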
[15:30:30] <yauh> thanks a lot, StephenLynx
[15:30:35] <StephenLynx> np
[15:30:37] <yauh> you made my thursday :)
[15:30:40] <StephenLynx> :v
[15:53:57] <Lonesoldier728> is there something wrong with this query using mongoose http://pastebin.com/2cZ36mP1
[15:57:04] <StephenLynx> probably the value in selecQuery.
[15:57:11] <StephenLynx> but I suggest not using mongoose anyway.
[15:57:28] <StephenLynx> its the most infamous ODM I ever heard about.
[15:57:39] <tubbo> worse than mongoid? :P
[15:57:48] <StephenLynx> never heard about mongoid :v
[15:58:01] <tubbo> it's an ODM for MongoDB in Rails
[15:58:33] <diegoaguilar> Hello, I'm trying to restore some datbases
[15:58:50] <StephenLynx> it is specifically designed for the rails framework rather than just ruby?
[15:58:52] <diegoaguilar> I received multiple .json and .bson for each
[15:59:01] <diegoaguilar> but when I run mongorestore I obtain back:
[15:59:03] <Lonesoldier728> \why would you say the value is wrong StephenLynx in what sense
[15:59:05] <diegoaguilar> "Failed: error scanning filesystem: error reading root dump folder: open dump: no such file or directory"
[15:59:40] <tubbo> StephenLynx: i suppose you could use it however you want, but it's definitely intended for use in a rails app. follows ActiveRecord's lead on a lot of things.
[15:59:55] <StephenLynx> >ruby >web frameworks >ODM
[15:59:57] <StephenLynx> no thanks.
[16:00:15] <StephenLynx> that's three out of three things I wouldn't touch with a mile-long pole.
[16:00:28] <StephenLynx> ah
[16:00:32] <StephenLynx> "follows"
[16:00:37] <StephenLynx> I read "follow" :v
[16:01:03] <StephenLynx> lone, that looks plain wrong from everything I know from the standard driver.
[16:01:16] <StephenLynx> but again, mongoose might do some ass backward thing where that is right.
[16:01:23] <diegoaguilar> could anyone guide me on it?
[16:01:24] <StephenLynx> being the reason I suggest not using it.
[16:07:11] <Lonesoldier728> well there is my code and question http://stackoverflow.com/questions/30761620/mongoose-populate-not-returning-results
[16:35:49] <jecran> Hello guys. I am using mongodb with node. https://gist.github.com/anonymous/d1920d655b0c1655e03f simple code here, I cannot figure out how to use the findOne() method instead of find(). Please help!
[16:37:04] <jecran> var dbs = db.collection('users').findOne(query); i tried this and similar and keep getting errors
[16:37:25] <diegoaguilar> jecran, what does query look like?
[16:37:50] <jecran> var query = {name:data.name}; this query works fine with the find() function
[16:38:16] <diegoaguilar> ok, and what are those errors?
[16:38:44] <diegoaguilar> are u using node "mongodb" module?
[16:38:48] <jecran> https://gist.github.com/anonymous/d1920d655b0c1655e03f the sample code. Just want to apply findOne() instead of find() lol.... 'object is not a function' is the error that I get
[16:39:04] <diegoaguilar> ok, this is because what find and findOne returns
[16:39:15] <jecran> diegoaguilar: the sample works
[16:39:19] <diegoaguilar> mongodb module is quite "odd" in this sense
[16:39:42] <StephenLynx> you are not passing anything to find
[16:39:51] <StephenLynx> your query variable is unused.
[16:40:02] <svm_invictvs> In Morphia, if I have an @Reference annotated Map<String, Foo> does that basically make it a mapping of strings to object ids?
[16:40:45] <svm_invictvs> And second question, does morphia cascade the put operation when persisting the object?
[16:41:26] <jecran> var dbs = db.collection('users').findOne(query, {}, function(err, doc){}) .... this produces null value without any errors
[16:42:29] <StephenLynx> probably because didn't find anything.
[16:43:14] <diegoaguilar> aha :)
[16:43:16] <diegoaguilar> right jecran
[16:43:45] <jecran> StephenLynx: query = {name:data.name}; the sample here works fine and displays the data correctly, but I want just the 1 piece of data, not all.
[16:44:09] <StephenLynx> what you mean "1 piece of data"?
[16:44:18] <jecran> StephenLynx: findOne()
[16:44:27] <StephenLynx> what sample works?
[16:44:41] <jecran> https://gist.github.com/anonymous/d1920d655b0c1655e03f
[16:45:00] <StephenLynx> what is your code?
[16:45:20] <jecran> StephenLynx: lol https://gist.github.com/anonymous/d1920d655b0c1655e03f
[16:45:29] <StephenLynx> that is the sample that works.
[16:45:33] <StephenLynx> I want the code that doesnt.
[16:46:05] <StephenLynx> and I would like to give advice on that
[16:46:13] <StephenLynx> you can just use !doc
[16:46:21] <StephenLynx> instead of comparing it to null
[16:46:35] <StephenLynx> and you can just use doc.name instead of ['name']
[16:46:52] <StephenLynx> in general, refer to this : https://github.com/felixge/node-style-guide
[16:47:12] <jecran> StephenLynx: any change at all in the code to try to make it findOne() results in an error, except for: var dbs = db.collection('users').findOne(query, {}, function(err, doc){}) Which results in null. If I exclude the middle param {}, I get actual errors
[16:47:33] <StephenLynx> what error?
[16:47:48] <StephenLynx> and give me the full code.
[16:47:51] <StephenLynx> not just that line.
[16:48:36] <StephenLynx> you are not trying to call each after findone, are you?
[16:48:45] <StephenLynx> findone doesn't return a cursor like find, it returns null
[16:50:10] <jecran> StephenLynx: https://gist.github.com/anonymous/b52d0d47e7ad1705d82b I put the error in a comment next to the line
[16:50:41] <StephenLynx> what line is giving the error?
[16:51:03] <jecran> 6
[16:51:17] <jecran> object is not a function error
[16:51:49] <jr3> does it make sense to index a column with low cardinality?
[16:51:58] <StephenLynx> try putting findOne on a separate line.
[16:52:17] <StephenLynx> and see if the error occurs on the 6th line or the next one.
[16:53:06] <StephenLynx> ah
[16:53:20] <StephenLynx> probably its complaining because you are not passing a callback.
[16:53:47] <StephenLynx> findOne(query,function gotObject(error,object){});
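
A complete minimal version of the call under discussion, for contrast with the gist; the connection string and db name are assumptions, and `data` comes from the surrounding app:

    var MongoClient = require('mongodb').MongoClient;

    MongoClient.connect('mongodb://localhost:27017/test', function (err, db) {
      if (err) throw err;
      db.collection('users').findOne({ name: data.name }, function (err, doc) {
        if (err) throw err;
        console.log(doc ? doc.name : 'no match'); // doc is null when nothing matches
        db.close();
      });
    });
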
[16:55:57] <jecran> Im playing with it lol.. back in 2mins
[16:59:45] <jecran> StephenLynx: latest attempts..... no crashing, but still null. with or without using a cursor... https://gist.github.com/anonymous/79af1207cd4a19243355
[17:00:15] <StephenLynx> if(!doc) return db.close();
[17:00:22] <StephenLynx> where do you actually use the doc?
[17:00:36] <jecran> StephenLynx: inside the callback function
[17:00:51] <StephenLynx> you just check if it doesn't exist and close the connection pool if it doesnt.
[17:01:35] <StephenLynx> you never do anything with the returned value or output any message based on its existence.
[17:02:11] <StephenLynx> plus, add braces to your ifs.
[17:03:02] <StephenLynx> the first one isn't doing anything, the second one should have worked.
[17:03:19] <StephenLynx> so the third one.
[17:03:27] <StephenLynx> are you sure you have this data on your db?
[17:03:39] <StephenLynx> that you are using the right collection?
[17:04:09] <jecran> StephenLynx: Yes. its there, and displays when I search through the whole collection
[17:05:43] <jecran> StephenLynx: I don't know , but I just erased it all and started over, and it worked first try.... what the hey lol .... thanx for your input
[17:05:59] <StephenLynx> wait
[17:06:17] <StephenLynx> did you reloaded the code between attempts?
[17:06:28] <StephenLynx> either by restarting the application or deleting the require cache?
[17:06:58] <jecran> StephenLynx: nodemon .... probably a caching issue
[17:07:05] <StephenLynx> ugh
[17:07:17] <StephenLynx> I try to follow a KIS approach.
[17:07:23] <jecran> StephenLynx: super annoying, i know lol
[17:07:25] <StephenLynx> more dependencies, more problems.
[17:07:46] <jecran> StephenLynx: ..... better software :P
[17:07:47] <_rgn> \
[17:07:57] <StephenLynx> quantity is not quality.
[17:08:42] <StephenLynx> and when you assess the average skill of developers, you will want anything but the work of more people in your project.
[17:09:20] <cheeser> on the other hand, if your teammates are so crappy, having them write less code by using libraries is a net win.
[17:09:30] <StephenLynx> indeed.
[17:09:38] <cheeser> but if your teammates are so terrible, fire them or find a new gig
[17:09:41] <jecran> Agreed. But some of the stuff I do i would be lost without some APIs.... I have only been developing for 4 months now lol
[17:10:33] <StephenLynx> if you are learning, you should go for the most didactic approach, imo.
[17:10:50] <StephenLynx> reserve dependencies for defined standards.
[17:10:58] <cheeser> that i agree with.
[17:11:11] <cheeser> like learning math with a calculator.
[17:11:21] <cheeser> sure it's "easier" but you don't really learn anything.
[17:11:23] <jecran> much fun
[17:11:39] <StephenLynx> I am more concerned with skill than fun to be honest.
[17:12:24] <jecran> StephenLynx: turns out that all the reading I have done in my spare time has paid off. Learning concepts, not code, like inheritance and encapsulation etc
[17:12:26] <StephenLynx> you should derive enjoyment from the challenge of mastering it.
[17:13:20] <StephenLynx> not from spitting out easy and low-quality software.
[17:15:57] <jecran> agreed. And I still love it. And honestly, I learned regular java, and am more than happy to span off to all this: node, mongo, js, unity3d and c# , php.... And my code, less refactoring every week :P
[17:16:10] <svm_invictvs> HOw do I get a sequential object ID in Morphia?
[17:16:17] <cheeser> what?
[17:16:41] <svm_invictvs> http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/
[17:16:53] <StephenLynx> what is morphia?
[17:16:59] <svm_invictvs> cheeser: Also looking at this example here: https://github.com/mongodb/morphia/blob/8f70190862c0094f02f3bf27713ac0f63d1f2cd2/morphia/src/test/java/org/mongodb/morphia/utils/LongIdEntity.java
[17:17:34] <cheeser> if you want that, you could findAndModify() to emulate a sequence in the DB
[17:17:46] <svm_invictvs> I see
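
A shell sketch of the findAndModify sequence emulation cheeser mentions, in the spirit of the auto-incrementing tutorial linked above ('counters' is the tutorial's convention):

    function nextSequence(name) {
      return db.counters.findAndModify({
        query:  { _id: name },
        update: { $inc: { seq: 1 } },
        new:    true,
        upsert: true   // first call creates the counter with seq: 1
      }).seq;
    }
    db.users.insert({ _id: nextSequence('userid'), name: 'first user' });
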
[17:17:46] <cheeser> why would the _id values matter so much?
[17:18:04] <svm_invictvs> It doesn't, I suppose
[17:18:04] <GothAlice> Also, notably, you run into the Twitter problem when you start using "auto increment". I.e., it doesn't scale.
[17:18:14] <svm_invictvs> Yeah, good point.
[17:18:19] <StephenLynx> what was the twitter problem?
[17:18:20] <cheeser> IDs should be meaningless
[17:18:36] <GothAlice> StephenLynx: How do you have two machines simultaneously insert new data without a collision?
[17:18:38] <svm_invictvs> What does Morphia do when you have an @Id long annotated field?
[17:18:41] <svm_invictvs> er
[17:18:45] <StephenLynx> ah.
[17:18:46] <svm_invictvs> @Id Long id; or something
[17:18:56] <cheeser> svm_invictvs: nothing. you have to specify the id value manually.
[17:19:01] <svm_invictvs> I see
[17:19:24] <svm_invictvs> Well, then @ObjectId it is then
[17:19:25] <svm_invictvs> er
[17:19:30] <svm_invictvs> @Id ObjectId id;
[17:19:36] <StephenLynx> in some cases I need sequential ids, TBH.
[17:19:43] <svm_invictvs> My auto incrementing idea seems to be pretty half-baked anyhow.
[17:19:51] <cheeser> sometimes, sure. but rarely.
[17:19:54] <GothAlice> StephenLynx: To the point they wrote https://github.com/twitter/snowflake/tree/snowflake-2010
[17:19:56] <StephenLynx> I just keep retrying and upping the id if the error is because of a collision :v
[17:19:59] <svm_invictvs> StephenLynx: Yeah, in my case, I think I can live without it.
[17:20:06] <svm_invictvs> cheeser: Morphia doesn't cascade puts does it?
[17:20:20] <cheeser> cascade what?
[17:20:31] <svm_invictvs> @Reference annotated fields
[17:20:43] <svm_invictvs> class Foo { @Reference Bar bar; }
[17:20:46] <cheeser> no, it doesn't
[17:20:50] <svm_invictvs> Putting foo, won't cascade that, right?
[17:20:56] <svm_invictvs> I didn't think so, just wanted to be sure
[17:20:56] <GothAlice> MongoDB isn't relational.
[17:21:01] <cheeser> it will autofetch them for you but it won't save the @Referenced entity
[17:21:05] <GothAlice> It has no concept that field X points at collection Y.
[17:21:20] <cheeser> well, no enforcement of that. DBRef is a thing, though.
[17:21:27] <svm_invictvs> GothAlice: I understand that it's not relational, but that's not why I asked.
[17:21:43] <svm_invictvs> GothAlice: That wouldn't prevent somebody from trying to add something like that to Morphia
[17:22:06] <cheeser> i would prevent that, though. :)
[17:22:10] <svm_invictvs> hah
[17:22:17] <GothAlice> It is, but: no enforcement, no cascading rules, and ODMs that add such behaviour often trap users into thinking MongoDB does support it, and are surprised when the command-line tools don't do the same thing.
[17:22:31] <GothAlice> (Ending up with bad data as a result.)
[17:22:38] <svm_invictvs> GothAlice: There have been such things as poorly written tools, that's why I asked.
[17:22:50] <svm_invictvs> Though I was 99% sure that it didn't try to do anything like that.
[17:23:00] <svm_invictvs> Now I'll shut up before cheeser comes at me with a fireaxe
[17:24:07] <GothAlice> When db.collection.remove(somecriteria) in the shell and db.collection.find(somecriteria).remove() in my app do different things… there may be a problem.
[17:24:07] <GothAlice> ;P
[17:24:31] <GothAlice> (I.e. the app cascades, the shell obv. won't.)
[17:24:45] <svm_invictvs> No,I agree
[17:24:56] <svm_invictvs> Cascading is a nightmare anyhow.
[17:24:56] <jecran> StephenLynx: https://gist.github.com/anonymous/3cec295353b57cd62658 .... any recommendations on this? I get all expected results now..... coolbeans, although stupidly simple
[17:25:09] <svm_invictvs> Even in an RDBMS
[17:26:25] <ggoodman> It appears that in the 2.x series of node-mongodb-native, the semantics of `findAndModify` have been significantly changed.
[17:27:33] <ggoodman> What is now available to do a conditional update on many documents and return the modified documents resulting from the operation?
[17:27:33] <svm_invictvs> What's wrong with findAndModify?
[17:27:34] <GothAlice> The edge cases are substantial: http://docs.mongodb.org/manual/reference/method/db.collection.findAndModify/#return-data
[17:27:49] <svm_invictvs> oh jesus
[17:27:56] <svm_invictvs> Yeah, I forgot about all that bullshit
[17:28:04] <GothAlice> Like, way substantial. Not worth using level substantial. Just modify… then find.
[17:28:21] <GothAlice> The data will be in the hot cache anyway. ;)
[17:28:31] <svm_invictvs> Hot stuff. Coming through.
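
A sketch of GothAlice's "just modify… then find" with the 2.x node driver, for many documents; tagging the update with a unique batch id makes the follow-up find unambiguous ('jobs' stands in for an already-obtained collection object):

    var ObjectID = require('mongodb').ObjectID;

    var batch = new ObjectID();
    jobs.updateMany(
      { state: 'queued' },
      { $set: { state: 'claimed', batch: batch } },
      function (err, result) {
        if (err) throw err;
        jobs.find({ batch: batch }).toArray(function (err, docs) {
          // docs are exactly the documents this update touched, and per the
          // hot-cache point above, the find is cheap.
        });
      }
    );
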
[17:32:07] <jecran> thanx for the helps guys... have a good one
[17:38:03] <shlant> anyone know why MMS automation agents take 5-10 minutes before even realizing that the servers they run on don't exist anymore???
[17:38:14] <shlant> is there a check threshold or something I can modify?
[17:38:24] <shlant> seems rediculous
[17:39:38] <cheeser> "ridiculous"
[17:39:50] <cheeser> if the server is gone, the agents are too.
[17:40:01] <cheeser> so they're not really around to notice anything.
[17:40:06] <GothAlice> shlant: MMS uses a once-per-minute ping, AFAIK, and debouncing means that once-a-minute signal gets stretched for a few minutes before the web interface really notices the problem.
[17:40:27] <cheeser> the server has to account for lag and this and that.
[17:40:32] <GothAlice> (A single failure to ping might not indicate a problem. Two or more in a row missed = a problem.)
[17:40:43] <cheeser> that threshold is probably configurable. i don't recall.
[17:42:24] <shlant> fair enough
[17:42:26] <GothAlice> Debouncing is useful, but does add to the latency of alerts. Nobody wants their monitoring to flap, though. (It's down! It's back! It's down! It's back! …)
[17:43:05] <shlant> understood, just seemed like 5 minutes of lost connectivity is more than a "flap"
[17:43:26] <shlant> I still have servers showing on MMS that _had_ monitoring and backup agents on them
[17:43:39] <shlant> I chose "uninstall" before shutting down the servers
[17:43:40] <GothAlice> Oh, MMS doesn't forget about hosts automatically.
[17:43:59] <GothAlice> You have to go in and trash the host mappings manually, AFAIK.
[17:44:02] <cheeser> and you wouldn't want it to
[17:44:12] <GothAlice> Yeah.
[17:44:16] <shlant> makes sense
[17:44:20] <shlant> how do I remove hosts?
[17:44:29] <GothAlice> Under Deployment > Host Mappings.
[17:44:41] <shlant> "No host mappings"
[17:45:41] <GothAlice> Where else are you seeing the hosts, then? Under the main Deployment section?
[17:45:48] <shlant> yea
[17:45:54] <shlant> under the servers tab
[17:48:19] <GothAlice> Hmm, also check Administration > Agents and try the "…" button, "Remove from MMS", if available in the server list against each host.
[17:48:43] <GothAlice> (If still enrolled in a replica set, you'll have to "Remove from Replica Set" first.)
[17:51:10] <shlant> yea I already removed the replica
[17:51:35] <shlant> I don't see an option under Agents to remove them
[17:51:47] <shlant> even though I already uninstalled them before killing the hosts
[17:51:56] <shlant> so not sure why they are even there still
[17:53:38] <shlant> like they are on the server and yet I don't have the option to uninstall them anymore as I already did
[17:53:38] <shlant> https://github.com/phusion/baseimage-docker
[17:53:41] <shlant> oops
[17:53:46] <shlant> http://imgur.com/M0N5t7k
[18:00:58] <shlant> the agents are showing red now
[18:01:15] <shlant> so they have become self aware
[18:01:29] <cheeser> all hail our new overlords!
[18:01:32] <shlant> except they don't realize they were uninstalled I guess
[18:01:36] <shlant> haha
[18:01:47] <shlant> immutable overlords
[18:02:06] <cheeser> it's not the agents any more. it's the server saying it can't talk to them.
[18:02:11] <cheeser> the agents are gone.
[18:02:21] <shlant> indeed
[18:02:32] <shlant> so I want the server to unfriend them
[18:05:02] <mbeacom> Hello everyone! Can someone please assist me? http://pastebin.com/YwLhKGMB In the provided JSON example, you'll find three example MongoDB 2.6 results. I need to filter the results down to only return documents where the embedded messages array doesn't contain more than one distinct messages.author_id.
[18:09:29] <shlant1> so I guess I'll just wait and see if the servers go away, or contact support?
[18:10:00] <cheeser> you still see the hosts?
[18:12:15] <shlant1> cheeser: http://imgur.com/NsOTjER
[18:13:21] <cheeser> there isn't a remove option under the "..." button?
[18:14:13] <jr3> is __v a mongoose thing?
[18:14:26] <StephenLynx> never seen that before.
[18:14:28] <StephenLynx> I assume it is.
[18:14:32] <shlant1> cheeser: well, there was
[18:14:33] <jr3> the native bulk API doesn't seem to increment the __v
[18:14:38] <jr3> so I guess so
[18:14:40] <shlant1> "uninstall backup/monitoring"
[18:14:43] <shlant1> which i did
[18:14:52] <shlant1> but I guess it decided to just ignore my request
[18:15:01] <shlant1> but still remember that I chose it
[18:15:19] <shlant1> dangit robots, do what I want
[18:21:16] <mbeacom> If there is a Bueller out there? If so, does Bueller want to help me resolve a simple aggregation pipeline issue? :D http://pastebin.com/YwLhKGMB
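A sketch of one way to express that as a 2.6 aggregation pipeline, assuming a hypothetical threads collection (untested against the pastebin data):

    // Collect the distinct author_ids per document, then keep only
    // documents whose messages all share a single author.
    db.threads.aggregate([
        { $unwind: "$messages" },
        { $group: { _id: "$_id", authors: { $addToSet: "$messages.author_id" } } },
        { $match: { authors: { $size: 1 } } }
    ]);

This returns just the matching _ids; a second find() can then fetch the full documents.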
[19:18:40] <jacksnipe> Hey, I'm using mongo with python, flask, and the MongoEngine extension. Since I'm pretty sure MongoEngine is just a wrapper over the basic python mongo lib, do I have to worry about ints vs longs or other variable size stuff?
[19:20:18] <jacksnipe> nvm I need to be using an explicitly specified LongField
[19:33:02] <GothAlice> jacksnipe: As a note, there's also #mongoengine, and sometimes people even talk there. ;)
[19:33:28] <jacksnipe> GothAlice: oh! thanks :)
[19:41:51] <qrome> Let's say I wanted to start purging data where it would remove it from my mongodb after a certain time period and store it somewhere slow. Is this normal?
[19:46:27] <GothAlice> qrome: It's not unusual.
[19:46:42] <qrome> GothAlice: can you point me in the right direction?
[19:46:47] <GothAlice> Typical is to stream the data into the "elsewhere" as it comes in, and let MongoDB clean itself up.
[19:46:56] <qrome> ahh
[19:47:04] <GothAlice> https://github.com/10gen-labs/mongo-connector to sync the data out to another service
[19:47:12] <qrome> thanks!
[19:47:17] <GothAlice> http://docs.mongodb.org/manual/core/index-ttl/ for the auto-expiry bit
[19:47:24] <qrome> oh wow that's even more awesome
[19:47:39] <GothAlice> :)
[19:47:53] <qrome> I have an ephemeral app so this is perfect and will make sure queries are fast
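For reference, a TTL index sketch with hypothetical collection and field names; mongod's background task removes each document roughly an hour after its createdAt time:

    db.events.createIndex(
        { createdAt: 1 },
        { expireAfterSeconds: 3600 }
    );
    db.events.insert({ createdAt: new Date(), payload: "..." });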
[19:50:50] <deathanchor> WTF? rs_a:SECONDARY> rs.reconfig(cfg, { force : true }); "errmsg" : "exception: need most members up to reconfigure, not ok : host2:27018"
[19:51:05] <deathanchor> even for a force true?!
[19:52:10] <GothAlice> That node is in a failure state, at the moment, unable to hold an election.
[19:52:46] <GothAlice> Thus no way to propagate the reconfig to other nodes reliably, and yeah, even if you force it, it'll say no to that.
[19:53:20] <deathanchor> it's only a test setup anyway
[19:53:31] <GothAlice> One of the things I love about MongoDB, when in doubt, rm -rf. ;)
[19:53:42] <deathanchor> funny it is complaining about the new host
[19:53:54] <deathanchor> how can I check what mongodb thinks the host name is?
[19:54:24] <GothAlice> rs.conf() will hand you back the member list, with host names.
[19:54:29] <GothAlice> Those are the ones MongoDB would use to connect.
[19:55:17] <GothAlice> If DNS names are in there, DNS becomes critically important to the reliability of your DB cluster, which is why I have /etc/hosts managed by the cluster to automatically include hard references to all other hosts in the cluster. No DNS problems. :)
[19:55:45] <deathanchor> yeah, I think someone changed the box from hostname, to hostname.fully.qualified
[19:57:08] <GothAlice> Fully-qualified is good. Changing is bad. ;)
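Checking and repairing the member host names might look like this in the shell (the member index and host name are placeholders; run the reconfig on the PRIMARY while a majority of members is up):

    cfg = rs.conf();
    cfg.members.forEach(function (m) { print(m._id + ": " + m.host); });
    // Fix a wrong entry and push the new config:
    cfg.members[1].host = "host2.fully.qualified:27018";
    rs.reconfig(cfg);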
[19:57:55] <deathanchor> yeah, hence the testing, someone messed with it and didn't bounce the service
[19:58:38] <deathanchor> oh man I totally broke it now
[19:59:47] <deathanchor> ah damn DNS
[19:59:49] <GothAlice> If you have fewer than 9 hosts (the cutoff for the free tier), I can highly recommend https://mms.mongodb.com/ as a service to manage your cluster. Full disclosure: satisfied customer.
[20:00:07] <deathanchor> LOL fewer than 9 hosts...
[20:00:12] <GothAlice> You'll still need to resolve the DNS problems yourself, but everything else is made easier by MMS. :)
[20:00:42] <deathanchor> yeah I know now that too many things changed since it was last running
[20:00:47] <deathanchor> ... er.. started
[20:06:13] <jacksnipe> uh wtf is a 2dsphere -- as in a sphere with 2 dimensions to specify a point (i.e., lat/long)?
[20:06:25] <GothAlice> Effectively yes.
[20:06:50] <jacksnipe> just weird wording I guess
[20:06:56] <jacksnipe> I don't normally think of a sphere as being 2d
[20:07:04] <GothAlice> It's a non-Euclidean mapping of square grid coords onto a sphere.
[20:07:16] <TimeBobby> it's like a map, it's just the surface; you can't go and dig into the mountains
[20:07:36] <GothAlice> It also makes approximations about your height from sea level and things like that.
[20:07:44] <jacksnipe> ah thanks, good to know
[20:07:51] <jacksnipe> comparable speed to GeoSpatialIndex?
[20:08:10] <GothAlice> 2dsphere _is_ a geospatial index.
[20:08:10] <jacksnipe> or w/e the old one was called
[20:08:16] <jacksnipe> ah woops
[20:08:54] <GothAlice> The old "2d" index, meant for backwards compatibility with 2.2 and earlier, assumes a flat two-dimensional Euclidean plane.
[20:08:55] <cheeser> 2d sphere, as I understand it, is a more appropriate way to handle geo stuff than geo2d because distances and the like aren't as simplistic as a flat surface
[20:09:01] <jacksnipe> ah that's it, what's the speed difference between a Geo2D index and a 2dsphere?
[20:09:02] <cheeser> exactly
[20:09:32] <GothAlice> There's a reason planes fly in a curve across the surface of the Earth instead of taking a "straight line" approach. Calculating distances around spheres can be mind-bending, and a curved path is often shorter.
[20:09:50] <jacksnipe> yeah, but is it a huge speed hit?
[20:10:09] <GothAlice> Optimization without measurement is by definition premature. Try both, and measure. ;)
[20:10:21] <jacksnipe> fair enough
[20:10:23] <GothAlice> It may depend very heavily on your queries.
[20:10:53] <jacksnipe> yeah, thought so. ATM I'm just looking for users in a radius, and calculating the distortion from the rect mapping to the sphere mapping
[20:11:02] <jacksnipe> and then accounting for that manually
[20:11:29] <GothAlice> 2dsphere would handle that for you. You can specify a search with a central point and circular radius.
[20:11:46] <jacksnipe> yeah I guess I just have to time it
[20:11:57] <GothAlice> http://docs.mongodb.org/manual/reference/operator/query/centerSphere/#op._S_centerSphere
[20:12:27] <jacksnipe> yeah I know what functionality I'd get I'm really _only_ concerned about the speed of the queries :P
[20:13:03] <GothAlice> I have peculiar requirements not covered by MongoDB's geographic capabilities; assuming operation on the surface of a spherical planetoid is highly restrictive for my needs. ;)
[20:14:01] <jacksnipe> yeah I need very low precision/accuracy (adding or dropping ~20 miles is nbd) so I was thinking that the Geo2D was good enough but I guess I'll take a look
[20:14:22] <GothAlice> centerSphere should work on geo2d, too.
[20:14:54] <GothAlice> And the note that geo2d is meant for legacy compatibility with 2.2 (and we're in 3.0…) should be a sign that its use is flatly deprecated.
[20:15:18] <GothAlice> (For use on mostly spherical planetoids, that is.)
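A sketch of the 2dsphere route, with hypothetical collection and field names:

    db.users.createIndex({ location: "2dsphere" });
    // Users within ~20 miles of a point; $centerSphere takes
    // [ [ lng, lat ], radius-in-radians ], and Earth's radius
    // is roughly 3963.2 miles.
    db.users.find({
        location: {
            $geoWithin: {
                $centerSphere: [ [ -76.6, 39.3 ], 20 / 3963.2 ]
            }
        }
    });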
[20:16:53] <jacksnipe> hmmm spherical only supports the standard projection though right?
[20:17:18] <GothAlice> I'm not sure of the exact projection being used in MongoDB.
[20:17:28] <jacksnipe> Am I SOL if, say, I want to use the USGS Maryland projection with spherical?
[20:17:39] <GothAlice> You'd need to convert.
[20:17:47] <jacksnipe> thought so, damn
[20:17:56] <jacksnipe> god dammit why can't the earth be a plane...
[20:18:00] <GothAlice> Heh.
[20:18:21] <GothAlice> I'm a fan of ringworlds. The benefits of a plane, and 1000x the surface area of a single sphere.
[20:19:13] <GothAlice> On a ringworld, a geo2d index would be perfectly acceptable. :D
[20:20:19] <jacksnipe> mobiusstripindex
[20:20:32] <jacksnipe> for your ringworld I mean
[20:20:36] <GothAlice> That just doubles the "width" of the strip. Still a flat plane for distance calculations.
[20:20:50] <jacksnipe> damn good point
[20:21:00] <jacksnipe> you've... thought a lot about this lol
[20:22:41] <GothAlice> I write hard sci-fi.
[20:22:46] <GothAlice> :P
[20:22:58] <jacksnipe> that'd explain it
[20:23:15] <jacksnipe> XD
[20:34:40] <jacksnipe> how do I direct mongo to autotimestamp something? Just have an empty field called "ts" on insert?
[20:35:08] <cheeser> you can't
[20:35:20] <cheeser> your app will have to take care of that.
[20:35:37] <cheeser> but if you use ObjectId as your _id, that has a timestamp component built in
[20:36:45] <jacksnipe> gotcha
[20:37:10] <jacksnipe> also do field names take up space per doc? I assume so but just checking
[20:37:24] <cheeser> they do.
[20:38:07] <jacksnipe> ah well guess that's the cost of schemaless
[20:38:13] <cheeser> pretty much
[20:38:49] <cheeser> though we prefer the term "dynamic schema" ;)
[20:39:25] <jacksnipe> fair enough :P
[20:43:51] <deathanchor> ts : ISODate()
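Both options in the shell, with a hypothetical events collection:

    // Every ObjectId already embeds its creation time:
    var id = ObjectId();
    id.getTimestamp();                    // returns an ISODate
    // Or store an explicit timestamp field, as deathanchor suggests:
    db.events.insert({ ts: new Date(), msg: "..." });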
[20:45:44] <GothAlice> The more records you have, the more the field name overhead adds up. Renaming like this really hurts, though, when I choose to use non-Python tools like the mongo shell.
[20:45:48] <deathanchor> GothAlice: what ODM do you use for python?
[20:45:53] <GothAlice> MongoEngine, deathanchor.
[20:46:27] <deathanchor> gonna check it out, our devs use something for Java for the same reason
[20:47:34] <GothAlice> MongoEngine (and ODMs in general) provide a bit more functionality than just field re-naming. ;) MongoEngine is a full schema enforcement system, with triggers (signals), etc.
[20:48:19] <deathanchor> Poop to that, I like freeballing it in my documents :D
[20:48:47] <GothAlice> Also handles things like caching references (storing "remote" fields alongside the ObjectId of the reference) automatically, such that updating the referenced document updates all cached values, too.
[20:48:58] <GothAlice> Kinda handy. ;)
[20:58:32] <saml> update and then find right away gives inconsistent results
[20:59:00] <saml> i have integration tests that insert documents and remove them.. and execute count and other queries very fast. and sometimes tests fail
[21:08:26] <greyTEO> GothAlice, have you ever had an object override itself? e.g. let's say you have a web admin. Page 1 has an object and page 2 has the same object. Page 2 updates. Will page 1's update override page 2's? Similar to a race condition with "stale" data.
[21:08:56] <greyTEO> this would be if the user has 2 tabs open at the same time... all hypothetical of course.
[21:09:19] <GothAlice> "Last to save wins." is a typical approach.
[21:09:46] <GothAlice> Github, on the wiki portion of projects, will alert a user in the process of editing something that someone else managed to hit save first, and leaves resolving the conflict to the user.
[21:10:30] <greyTEO> but this isn't something Mongo would catch per se
[21:10:33] <GothAlice> The other extreme from "whoever saves last wins" is to have every field fully live. Someone changes something in one place, all other places are immediately updated, regardless of manually refreshing.
[21:10:38] <GothAlice> MongoDB doesn't care. :)
[21:11:11] <greyTEO> That is what I thought. I was wondering if there was some magical method for merging docs...lol
[21:11:54] <greyTEO> ok thanks. I can live with that.
[21:12:01] <GothAlice> Versioning would be a simple approach, and would leverage update-if-not-different queries.
[21:12:58] <GothAlice> I.e. user A and B load the form, each get version 1. User A presses save, db.foo.update({_id: …, version: 1}, …). User B presses save, the update can't find the record because the version is different, and user B is presented with the updated data, with their changes applied, and a message that someone else got to it first.
[21:14:36] <greyTEO> Yea that would be a fairly simple implementation. For my case, last to save wins works.
[21:14:50] <GothAlice> It's usually "good enough" for most apps. ;)
[21:15:01] <GothAlice> It's an edge case to handle, not usually a regular thing.
[21:15:02] <greyTEO> it would be User A overriding User A, going out of his way to be a PITA
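A sketch of the versioned update GothAlice describes, with hypothetical names; each save must match the version it loaded, and bumps it:

    var res = db.pages.update(
        { _id: pageId, version: 1 },
        { $set: { body: newBody }, $inc: { version: 1 } }
    );
    if (res.nModified === 0) {
        // someone else saved first: reload the document and
        // let the user resolve the conflict
    }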
[22:42:37] <jecran> Hi guys. Using express, I am just trying to do a simple 'post' but I can't access my data. If I do a 'get' everything is there, but I need post lol. app.post('/login', function(req, res){}); This is the line that handles the post. I make it inside this function but can't retrieve my data. Any ideas on what I'm doing wrong?