pmxbot IRC Log Viewer

[00:30:31] <zylo4747> works like a charm

[00:48:14] <cheeser> awesome!

[00:48:58] <Doyle> Hey. Is there a number of DBS that you can potentially have on an RS, that would cause perpetual 100% disk util from replication even if there's nothing being written to the dbs?

[00:49:30] <Boomtime> @Doyle: no

[00:49:43] <Boomtime> the existence of databases does not generate load - use of them does

[00:50:13] <Doyle> I've got an RS with disk IO pegged. iostat shows mongodb as the culprit, but mongo's not doing too much as far as I can tell

[00:50:38] <Boomtime> "mongo's not doing too much as far as I can tell" <- what is your method of "telling"?

[00:50:47] <Boomtime> run mongostat and find out

[00:51:26] <Boomtime> when you say you have an "RS", do you mean a replica-set?

[00:51:38] <Boomtime> do all members show this IO activity?

[00:51:39] <Doyle> Yep

[00:52:21] <Boomtime> so mongostat and if there isn't much there, then db.currenOp in a shell

[00:52:23] <Doyle> yes. Replica set, pegged io on both

[00:52:54] <Doyle> 0s's across insert... getmore, then 2, 4, 11, etc commands

[00:55:14] <Doyle> Whatever's going on doesn't have an op that lasts long enough for currentop to catch, even if I hammer it

[00:55:33] <GothAlice> Doyle: Underlying filesystem?

[00:55:49] <Doyle> Just the usual local.oplog.rs ops

[00:55:57] <GothAlice> And/or does this happen to be on Amazon EC2 using EBS volumes?

[00:56:05] <Doyle> Yes, ebs

[00:56:33] <GothAlice> Yeah. You might have a stuck EBS volume. Is the load average oddly high, yet SSH access still responsive?

[00:57:53] <Doyle> iostat -xmt 1 shows even reads across the lvm.

[00:58:14] <Doyle> It's 0... something just changed

[00:58:47] <GothAlice> The plot thickens.

[00:58:52] <Doyle> ah, it's back

[00:59:21] <Doyle> I'm going to add the disk stats to a cloudwatch dashboard and see if there's a high latency disk or something

[00:59:33] <Doyle> 8 disk in this lvm... so much clicking

[00:59:39] <GothAlice> What's the %iowait?

[00:59:58] <Doyle> 22

[01:00:27] <Doyle> mongostat's ar/aw is even at 2|0/3|0

[01:01:03] <Doyle> How expensive is a listCollections op?

[01:01:14] <GothAlice> Should be trivial.

[01:01:28] <GothAlice> Note that the monitor itself will show up as commands executed, too.

[01:02:53] <Doyle> true

[01:03:02] <Doyle> I have 21248 collections on this RS :/

[01:03:05] <Doyle> trivial?

[01:03:14] <GothAlice> Okay, so less trivial. That's a gosh-darn lot of collections.

[01:03:25] <Doyle> We have a log of data

[01:03:32] <GothAlice> That's also an unusually high iowait percentage, to my eyes. Your CPU is spending literally 22% of its time waiting on the EBS volumes to talk to it.

[01:03:32] <Doyle> lot*

[01:04:08] <Doyle> Yea, these instances have much lower iowait than the last setup we had going. 150% iowait, 200+ sometimes I remember

[01:04:20] <GothAlice> ...

[01:05:19] <Doyle> LVMs for the win

[01:06:20] <GothAlice> Heh, for me it's just a reminder of one of the reasons we aren't on AWS any more. XD

[01:06:35] <Doyle> lol

[01:06:39] <Doyle> you in a colo?

[01:06:49] <GothAlice> That and the cross-zone cascading failure in EBS controllers that made volumes inaccessible and un-controllable for 22 days straight, but…

[01:07:02] <Doyle> ew

[01:07:17] <GothAlice> Yeah. SLA says "deploy to multiple zones and you're safe" — not so much. XP

[01:07:46] <GothAlice> These days we no longer care about physical hosts.

[01:07:53] <GothAlice> Only apps.

[01:07:59] <GothAlice> (There is no Dana, only Zuul.)

[01:08:56] <Doyle> You going to see the new one?

[01:09:00] <GothAlice> Nope.

[01:09:20] <GothAlice> The lack of funny projected in the trailers encouraged me to spend money on something else. ;P

[01:09:33] <Doyle> You're a sensible person

[01:09:43] <Doyle> I'm skipping it. Same reason

[01:09:58] <Doyle> The anti-hype is crazy surrounding this movie

[01:11:03] <GothAlice> The biggest reason for that is it's a movie with an agenda / purpose. Being un-funny (as seen so far) didn't help. XD

[01:13:36] <Doyle> I'm looking forward to the social media explosion when it releases. Should be interesting.

[01:14:53] <GothAlice> I'm looking forward to the "everything wrong with" summary. Points over 9000, maybe? And don't get me started on the poor scene remakes, seeming lack of pacing, and lack of understanding suspense—such as not seeing Slimer's first attack in the original. That highschool film class was totally worth it. ;^)

[01:18:40] <GothAlice> It's… also a confusing sequel / remake hybrid. From the trailers, it's hard to tell what they were going for, in that regard. But blah. XP

[01:19:46] <Doyle> Yea, it presents very strangely. Someone mentioned that most kids growing up today will think that this movie is what Ghostbusters is, and that's what saddens me the most

[01:20:26] <GothAlice> Indeed. Potentially ruined for a generation. :'( The original still hold up, too. (Made even more comical due to the SFX differences with today.)

[01:22:33] <GothAlice> Doyle: Have you increased the namespace allocation, and/or are using the WiredTiger storage engine?

[01:22:45] <GothAlice> Upon consideration (I haven't slept) you may be running into https://docs.mongodb.com/manual/reference/limits/#Number-of-Namespaces with the sheer number of collections you have.

[01:23:40] <Doyle> GothAlice, 3.0, MMAPv1

[01:23:54] <Doyle> Have yet to make the jump to 3.2/WT

[01:24:07] <Doyle> Can't wait!

[01:26:19] <GothAlice> So yeah, 21,248 is dangerously close to 24,000, and the number is actually based on namespaces, which include indexes, not just collections. I have absolutely no idea what happens if you exceed that threshold.

[01:26:26] <Doyle> oh, that's... scary

[01:28:07] <GothAlice> Your collection count may actually be (at least partially) responsible for the load/io issues.

[01:28:35] <Doyle> I have a nagios check that counts collections. I increased the interval from the default 5m to 20

[01:32:20] <Doyle> How do I check the namespace size?

[01:33:52] <GothAlice> https://docs.mongodb.com/manual/reference/limits/#Size-of-Namespace-File < default is 16MB. nsSize configuration option. As for checking actual utilization, I do not know.

[01:37:30] <Doyle> db.system.namespaces.count()

[01:38:23] <Doyle> I get 12 :/

[01:38:51] <Doyle> Or 0 on an RS with almost nothing on it :/

[01:40:34] <Doyle> ah, that's for the db in use, duh

[01:55:41] <GothAlice> Oop, yes indeed, Doyle. Sorry, got distracted elsewhere. ^_^

[01:58:01] <Doyle> The namespace limit is per DB? or Per Mongod process?

[02:00:07] <GothAlice> Per DB, AFIK.

[02:00:28] <GothAlice> You can double-check me on this by looking in your data directory (I'm using WiredTiger, so won't have the corresponding .ns files).

[02:01:17] <Doyle> Yes, one per db

[02:01:25] <Doyle> that's a bit better

[03:28:55] <dsc_> GothAlice: still around?

[10:14:00] <robscow> via the shell, db.sc.getIndexes() returns an empty list, yet when querying via my Perl client, it's telling me what the indexes are after I create them - nearly feels like they're not being committed

[11:21:07] <kurushiyama> robscow You sure you are in the correct database when emitting db.sc.getIndexes() ?

[11:32:57] <robscow> kurushiyama, nope, rookie mistake :)

[11:33:53] <kurushiyama> robscow We all were at one point in time. I just could interpret the fact because that is a mistake I make myself – to this day...

[11:34:36] <kurushiyama> robscow Like :xa in _vi_ ;)

[11:39:39] <robscow> :)

[11:58:56] <mroman> Is there a $this in aggregate?

[11:59:07] <mroman> like all: {$addToSet : "$this"}?

[13:22:19] <Zelest> Silly question, how can I find and update a sub-document? :o

[13:23:03] <Zelest> ah, 'foo.$' :o

[13:24:57] <GothAlice> Zelest: As a very, very important (nay, critical) note on that, you can only update one array's element at a time. If you have > 1 array, you're… gonna have a bad time.

[13:25:27] <GothAlice> (> 1 array either as field siblings, or nested, either is bad for being able to address specific array elements.)

[13:25:41] <cheeser> probably the biggest wart on the mongodb data model

[13:25:52] <GothAlice> Eh, you get used to it.

[13:26:11] <Zelest> Uhm.. not entirely sure on how you mean.. :o

[13:26:39] <GothAlice> Just don't naively throw everything in a single, highly nested document like a noob. (And/or learn from that mistake.)

[13:27:05] <GothAlice> Zelest: {a: [{v: 1}, {v: 2}], b: [{v: 3}, {v: 4}]}

[13:28:07] <GothAlice> In this situation, when you use $elemMatch (or other query operator) to match a value you want to update in the query portion, such as {"a.v": 2}, the $ pointer points at the "a" array element. There's only one $ operator… so you can't also search "b" for a value to update at the same time.

[13:29:22] <Zelest> Aah

[13:29:33] <GothAlice> There are a few proposals in the ticket system about ways to improve that, but… so far all of them I've seen are worse than the problem. :(

[13:29:54] <Zelest> That's no issue for me atm though :) But thanks for the heads up :)

[13:30:19] <GothAlice> https://jira.mongodb.org/browse/SERVER-831 — nested arrays

[13:30:21] <Zelest> But, what's the "best practice" approach on replacing {v: 1} for example?

[13:30:58] <GothAlice> https://jira.mongodb.org/browse/SERVER-6866 — accessing past the $ operator

[13:32:12] <Zelest> Is there any mongodb command to make my boss extend my deadline as well? :P

[13:32:14] <GothAlice> Searching for the value to update and using the $ operator in the update itself to point your change at it. You can also do things like reference elements by index, too, but that's… less good if there's a possibility the order of the array elements may change.

[13:32:35] <GothAlice> Zelest: Alas, applying the Scotty Factor is beyond the responsibility of MongoDB. ;P

[13:32:49] <Zelest> DOh ;)

[13:34:39] <cheeser> scotty factor?

[13:35:08] <GothAlice> How can yer captain see you as a miracle worker if you give 'em accurate estimates, lad?!

[13:35:30] <Zelest> Haha

[13:35:37] <GothAlice> "Starfleet captains are like children. They want everything right now and they want it their way. But the secret is to give them only what they need, not what they want."

[13:35:49] <GothAlice> Engineering life lesson, right there.

[13:36:06] <cheeser> oh! that scotty.

[13:38:51] <GothAlice> So yeah, on any estimate, double and add half. That'll give you headroom to handle unplanned issues, and if you deliver on time, you're golden. If you deliver early (since it shouldn't actually take the full estimated time), you've worked magic. If you give accurate times and are frequently late due to outside issues, you develop a very different reputation, all because you forgot a little multiplication in your time estimates. ;)

[13:55:53] <jayjo> In my mongo shell, can I have a collection with a - in it?

[13:56:19] <GothAlice> Only if you're careful, and only if you access it as an array subscript instead of attribute of "db". I.e. db["foo-bar"]

[13:56:27] <GothAlice> Doing so makes it difficult to access, so is not recommended.

[13:56:48] <cheeser> most drivers won't even notice but the shell is a bit sensitive.

[13:57:19] <GothAlice> Well, in most languages foo-bar means "foo mathematical minus bar", not "a symbol whose name is 'foo-bar'".

[13:57:33] <jayjo> OK - I should probably rename it then, I have named my dbs and collections with '-' instead of '_'

[13:57:39] <GothAlice> For example, in Python, you wouldn't be able to "easily" access it as an attribute, either.

[13:59:59] <GothAlice> As a style thing, I avoid multi-word database, collection, and field names. A thesaurus is handy to find synonyms that are shorter.

[14:00:24] <GothAlice> (Especially if some names are reserved words in your language of choice, i.e. "from" being reserved in Python, so for e-mail records, I use "author" instead for that field. It's also better representative of what the field actually means, which is a nice bonus.)

[14:02:14] <GothAlice> Also, in cases where you have prefixes on fields, i.e. name_first, name_last, that's an indicator that a sub-document might actually be more useful. I.e. {name: {first: "…", last: "…"}}

[14:03:34] <GothAlice> That'll let you search first name—{name.first: "Bob"}—or last name—{name.last: "Dole"}—or both first and last name: {name: {first: "Bob", last: "Dole"}}

[14:07:07] <Zelest> Can't I use $set when I update a subdocument?

[14:07:29] <cheeser> sure

[14:08:25] <Zelest> db.col.update({_id: ..., sub._id: ...}, {'$set': {'sub.$': {a: 1, b:2}}})

[14:08:32] <Zelest> won't that just set a and b?

[14:08:40] <Zelest> now I seem to replace the entire sub-doc

[14:08:56] <cheeser> $set will replace the value of the key given.

[14:09:00] <GothAlice> You are telling it to set the matched element in sub.

[14:09:09] <Zelest> oh, yeah

[14:09:10] <GothAlice> So, it's doing what you're telling it to do. :P

[14:09:21] <GothAlice> {'sub.$.a': 1, 'sub.$.b': 2}

[14:09:31] <Zelest> Of course, my bad.. thanks :)

[14:13:21] <jayjo> If I create an index on a very moderately sized db (20gb), can I get verbose output to see progress?

[14:13:28] <jayjo> Or on long-running queries, like EXPLAIN in sql?

[14:16:17] <GothAlice> jayjo: https://docs.mongodb.com/manual/core/index-creation/#view-index-build-operations

[14:16:56] <GothAlice> Specifically: https://docs.mongodb.com/manual/reference/method/db.currentOp/#currentop-index-creation

[14:34:41] <GothAlice> gitgitgit: Minor note, it'd be nice if you kept your nick non-offensive. As you are using an authenticated account (GitGud) renaming shenanigans are simply annoying and add to the noise, not signal. (I'm sure there's a hilarious reason in another channel for the renames, but still.)

[14:35:09] <gitgitgit> haha ye sorry about that GothAlice i was just joking about in #freenode

[14:35:16] <GothAlice> ;)

[14:35:47] <GitGud> ;D

[14:38:20] <GothAlice> Aaaand I just realized a consequence of a design decision in my pymongo helper library. Arbitrary type storage. (Any document or sub-document with a _cls import or entry_point plugin reference will load that class and pass the document fields as keyword arguments.) Huh. I meant it to handle Document sub-classes, but really, it'll work with any callable, after thinking about it.

[15:20:22] <|RicharD|> hi

[15:20:33] <|RicharD|> what is the best way for backup a huge db ?

[15:20:44] <|RicharD|> (daily...and the db is 300gb)

[15:21:02] <|RicharD|> doing a mongodump or copy the data from the mongoldb directory ?

[15:21:13] <cheeser> https://www.mongodb.com/cloud

[15:24:08] <|RicharD|> ?

[15:29:28] <saml> to backup big data, i think you can use replication

[15:30:43] <saml> but you can pay mongodb.com and use their cloud service

[15:32:30] <deathanchor> |RicharD|: for that large, a dump would take a long time. You could do an fsynclock on the db and copy the data files and then unlock it when done.

[15:33:17] <deathanchor> https://docs.mongodb.com/manual/reference/method/db.fsyncLock/

[15:33:26] <|RicharD|> but with fsynclock the mongodb is locked ?

[15:35:18] <deathanchor> you do it on a secondary that is hidden

[15:35:40] <deathanchor> that avoids reading old data from that member

[15:36:16] <deathanchor> basically it will stop syncing/writing/etc. until you unlock it again

[15:36:41] <deathanchor> so check your oplog window and how long it takes to copy the data files

[17:43:11] <cheeser> replication is not backup

[17:46:05] <cheeser> |RicharD|: they changed up the landing page. here's the right one: https://www.mongodb.com/cloud/cloud-manager

[17:46:28] <|RicharD|> cheeser: you work at mongodb.com ?

[17:46:33] <cheeser> i do

[17:48:02] <GothAlice> Delayed replication _might_ be a backup, but only if you catch problems within the delay period. ;P

[17:48:10] <|RicharD|> btw cost too much

[17:48:34] <|RicharD|> we cavee 120k free on was

[17:48:45] <cheeser> if you think that costs too much, imagine the costs of catastrophic data loss. ;)

[17:49:05] <GothAlice> And the ongoing time of DBA maintenance.

[17:49:31] <GothAlice> (The big thing for me using the cloud manager is in maintenance automation. My word, it saves time and stress during upgrade roll-outs.)

[17:50:57] <cheeser> yeah. upgrading is awesome actually.

[17:51:01] <cheeser> clicky click!

[17:52:51] <|RicharD|> we don't use mongoldb like main db lucky :)

[17:55:09] <GothAlice> :P

[17:55:37] <GothAlice> (Just a very satisfied customer.)

[18:02:13] <|RicharD|> I understand all your points and I agree

[18:02:18] <|RicharD|> but the decision is not up to me

[18:02:19] <|RicharD|> :(

[18:02:43] <dsc_> i like how you called it 'mongoldb' :P

[18:02:54] <|RicharD|> yes auto correct on OS X that sucks :P

[18:02:54] <dsc_> my employer calls it that too

[18:03:08] <|RicharD|> nono I not call this, is auto correct by OS X

[18:03:11] <dsc_> heh' :) ok

[18:03:33] <|RicharD|> no fucking idea how do backups properly ehehehe

[18:03:40] <|RicharD|> it's complicated

[18:04:27] <StephenLynx> kek

[18:04:32] <StephenLynx> cp x y

[18:04:33] <StephenLynx> :v

[18:15:06] <saml> with mongodb, you don't need backup

[18:15:26] <saml> but 300GB is small.. you can backup however you like

[18:16:42] <GothAlice> saml: In all things, you need backups. To not have backups is irresponsible.

[18:17:01] <saml> what if your data is big and you cannot afford backup storage

[18:17:09] <saml> mongodb is big data

[18:17:15] <GothAlice> saml: I have 35TiB of data backed up off-site, cost: $5/month.

[18:17:35] <GothAlice> So… "affording" backup storage is silly, when unlimited data solutions exist.

[18:17:50] <saml> 35TB is daily ingest

[18:18:15] <saml> which vendor are you using for 35TB $5/month ?

[18:18:20] <GothAlice> Backblaze.

[18:18:37] <GothAlice> No data limits, so ingest vs. total… kinda irrelevant.

[18:18:51] <|RicharD|> has api ?

[18:19:35] <GothAlice> They do for their B2 cloud storage solution, their backup solution uses a proprietary client.

[18:20:01] <|RicharD|> btw I don't trust too much

[18:20:09] <GothAlice> Admittedly, the initial backup took three months on my pitiful home connection, which was silly, but. (And for business use, it's $50/machine/year.)

[18:20:11] <|RicharD|> a provider that cost only $5/month for 35TB

[18:20:26] <GothAlice> They… pretty much invented the highest density storage system possible.

[18:20:48] <GothAlice> And if you don't trust them, you can totally build their infrastructure yourself: https://www.backblaze.com/blog/storage-pod-4-5-tweaking-a-proven-design/

[18:20:56] <GothAlice> (Their hardware setups are open source.)

[18:21:30] <|RicharD|> nono is not that I don't trust them like company

[18:21:35] <|RicharD|> it's only that is so cheap.....

[18:21:36] <GothAlice> The backup system is bzip2'd deep storage, so recovery is a "browse the files, pick files to restore, they'll e-mail you or ship you HDDs" situation.

[18:21:43] <|RicharD|> and usually hardware costs money

[18:22:01] <GothAlice> And they offer full encryption, too. I.e. they don't know the files they have for you or what they contain.

[18:22:14] <saml> |RicharD|, why do you backup? do you want easy restoration of mongodb data in case something happens?

[18:22:45] <|RicharD|> are you kidding ? :P

[18:22:49] <|RicharD|> if the server goes on fire

[18:22:53] <|RicharD|> the business fail

[18:23:06] <saml> i usually have a replica member with low priority and do mongodump on the node

[18:23:10] <|RicharD|> if not a big deal if we loss 2-3-4 days of data

[18:23:18] <saml> so you want to recover

[18:23:26] <saml> what's wrong with mongodump?

[18:23:36] <|RicharD|> It's use too much the cpu/ram

[18:23:49] <|RicharD|> and slow down the server

[18:23:49] <GothAlice> And is tricky to get consistent results out of on a loaded cluster.

[18:23:55] <saml> that's why you put a replica set or node that's not hit by clients

[18:24:05] <GothAlice> That aggravates the consistency issue due to replication lag.

[18:24:25] <|RicharD|> I can't

[18:24:30] <|RicharD|> because we use dokku

[18:25:00] <|RicharD|> and it not supports natively (and I'm not a expert DBA)

[18:25:48] <|RicharD|> I'm doing this stuff, only because i'm forced and i'm the person with more skills

[18:25:53] <|RicharD|> in the company

[18:27:00] <saml> https://docs.mongodb.com/manual/core/replica-set-hidden-member/ can you use this with dokku ?

[18:27:49] <GothAlice> Because I'm using the personal backup client, I'm forced to mount ZFS snapshots on my Mac to have the "drive" (that is, the mongo data directory on my set of RAID arrays) appear for inclusion in the backup set on my laptop. A little more pain in setup is something I'm fully willing to accept for such inexpensive service; haven't checkout out their newer B2 storage yet, though.

[18:28:08] <GothAlice> Of course, the ZFS snapshot approach is not usable in a managed hosted environment, either.

[18:39:32] <GothAlice> Specifically, I'm not wanting to pull in every related record, but only the latest. $lookup + $unwind + $group/first… isn't going to cut it, with the amount of data that'd get pulled in during the $lookup.

[18:40:15] <GothAlice> (Also I need to preserve records with no related data, so the $unwind is out. It eats records with empty arrays.)

[18:40:54] <GothAlice> :(

[18:48:22] <|RicharD|> buy you guys

[18:48:28] <|RicharD|> use mongoldb has main db?

[18:48:36] <|RicharD|> for what exactly ?

[18:49:11] <|RicharD|> Don't hate me, but I'm using mogodb only for stores some JSON responses...but maybe next time I will just use postgresql+jsonb

[18:52:40] <deathanchor> |RicharD|: I store lots of data (location, user info, events, etc.) I just like that the values aren't constrained by a size limit other than 16MB for total doc size.

[18:53:14] <GothAlice> |RicharD|: My at-home dataset (that 35 TiB one) is a metadata filesystem based on GridFS storing every bit of digital information I've touched since September 2001, excluding anything automatically purged due to lack of access. At work we run an employment offer (job) distribution platform, applicant tracking software, and event-based analytics suite, plus reporting and a bunch of other stuff.

[18:57:46] <StephenLynx> |RicharD|, I developed lynxhub.com using mongo not only to store the data, but also files.

[18:58:09] <StephenLynx> because gridfs allows me to easily distribute files instead of having to abstract them directly on the disk.

[18:59:23] <StephenLynx> plus not having to worry about mandatory schemas makes it much easier to develop and update it.

[18:59:38] <StephenLynx> I don't have to run a script to update a table, I just update the data when I have to.

[19:00:29] <StephenLynx> also, by using mongo I don't have to write 2 languages at the same time.

[19:02:15] <GothAlice> StephenLynx: I'm beginning to grapple with the idea of writing MongoDB server-side functions and map/reduce in Python, transpiling Python code into JS. Muhahahaha.

[19:02:56] <StephenLynx> :::::::::::vvvvvvvvvvvvvvvvvvvvvvvv

[19:03:08] <StephenLynx> jesus christ, how horrifying

[19:03:35] <cheeser> GothAlice: some day i'll write my js in kotlin. ;)

[19:07:34] <GothAlice> StephenLynx: It ought to be transparent to the developer. I.e. that "just use a Python function for map/reduce" thing would let you use the same code in isolated testing as well as db-side. Optional, of course: https://github.com/marrow/mongo/blob/develop/setup.py#L96 ;)

[19:07:52] <GothAlice> "Mad Science" means never stopping to ask "what's the worst thing that could happen?"

[19:07:54] <GothAlice> >:D

[19:08:28] <cheeser> the worst part of that plan is doing map reduce in mongo

[19:08:33] <StephenLynx> and bad design means never stopping to ask "wtf"

[19:08:57] <GothAlice> Indeed, aggregates are strongly preferred. Some things simply require code, though.

[19:10:27] <GothAlice> StephenLynx: It's been several years since I've really cared about language distinctions. Code is code, they're all just different spelling and grammar for the same ideas. Thinking this way gives a lot of freedom. (The transpiled code is not substantially worse, as code, than hand-written, and the logic is identical, thus, what's the difference?)

[19:11:01] <StephenLynx> its opaque, confusing, complex.

[19:11:11] <StephenLynx> and with larger margin for error.

[19:11:21] <GothAlice> Considering Marrow Mongo is already beginning to develop a number of simulations of MongoDB behaviour (such as document matching against a query spec, projection, etc.), having code able to be run in both environments is advantageous, too.

[19:11:56] <GothAlice> https://camo.githubusercontent.com/b4777a69407ab7b6b25b9f0f500c6a023fb750e6/687474703a2f2f7777772e7472616e7363727970742e6f72672f696c6c757374726174696f6e732f636c6173735f636f6d706172652e706e67 < a demo from one of the transpiler packages.

[19:12:06] <GothAlice> (Showing source and resulting JS.)

[19:12:31] <StephenLynx> imo your project suffers from feature creep.

[19:12:43] <GothAlice> The translation process is 100% deterministic… hardly opaque or even very complex. Get source AST, run through it generating template-driven chunks of code in another language. (Basically.)

[19:13:33] <StephenLynx> when you started it, did you document which were the project's goals?

[19:13:34] <GothAlice> That goes for _any_ transpiler.

[19:13:39] <GothAlice> StephenLynx: Sure did.

[19:13:46] <StephenLynx> _any_ transpiler is cancer.

[19:14:01] <StephenLynx> and what are these goals?

[19:20:34] <GothAlice> StephenLynx: A non-middleware replacement for MongoEngine, to support and enhance native pymongo usage, with documentation on specific differences in design documented on the wiki. May I PM?

[19:21:45] <StephenLynx> what does mongoengine does?

[19:23:02] <GothAlice> ODM layer.

[19:23:09] <GothAlice> Active record DAO.

[19:25:28] <StephenLynx> how does that feature pushes your goals forward?

[19:44:08] <django_> hey all

[19:44:15] <dsc_> GothAlice: Is it sqlalchemy?

[19:44:22] <django_> for mongodb what does host do in mongoclient

[19:44:53] <cheeser> django_: it's where the database is running

[19:45:06] <django_> cheeser, is there an admin panel?

[19:45:10] <GothAlice> dsc_: Sorry, what?

[19:45:14] <cheeser> admin panel?

[19:45:23] <dsc_> GothAlice: nvm

[19:45:39] <django_> cheeser, to see the DB and what is stored

[19:46:41] <cheeser> there are several such tools. https://www.mongodb.com/products/compass e.g.

[19:48:00] <cheeser> also, https://docs.mongodb.com/ecosystem/tools/administration-interfaces/

[20:00:21] <django_> can you do this in mongoDB: db.test.insert({"arrays":[1,2,3"]})

[20:00:29] <django_> im getting syntax error

[20:00:46] <GothAlice> django_: Because you have a syntax error. :P Careful where you put quotes.

[20:01:01] <django_> lol

[20:04:09] <django_> GothAlice, how can i print it

[20:04:25] <deathanchor> db.test.find().pretty()

[20:04:43] <GothAlice> "it"? If you mean the results, then ^ would work.

[20:04:57] <django_> ohhhh

[20:05:18] <GothAlice> deathanchor: Most of the time, I prefer to ask for clarity instead of guessing. ;P

[20:05:33] <GothAlice> (Most of the time.) XP

[20:06:07] <deathanchor> true when it involves work on my part, but sometimes I provide an answer and if that is confusing, then get clarity :)

[20:06:50] <deathanchor> NYC Tourist: Where is Time Square? I just point in the direction. :D

[20:07:11] <deathanchor> no time for clarification :D

[20:07:40] <deathanchor> so I'm writing a webapp project on my own, using flask with mongoengine

[20:08:11] <deathanchor> I was surprised that the flask and mongoengine tutorials weren't combined in anyway

[20:08:21] <GothAlice> deathanchor: Ouch. You may wish to be aware of a number of outstanding issues: https://github.com/marrow/contentment/issues/12 < meta-ticket

[20:09:53] <deathanchor> it was a minor side project

[20:10:33] <deathanchor> what should I use instead?

[20:10:45] <GothAlice> … it doesn't support the $ operator during projection. (Like, basic stuff is missing or broken these days.) I'd recommend using Pymongo instead.

[20:11:34] <deathanchor> no other ORM toolkits available for mongo?

[20:11:42] <GothAlice> Consider what an ORM does for you.

[20:11:53] <GothAlice> I.e. what are the top five things you think it does for you?

[20:12:05] <deathanchor> easy object/db management

[20:12:07] <GothAlice> (Or ODM in this case.)

[20:12:29] <deathanchor> my project only does simple CRUD work.

[20:12:49] <GothAlice> That's not something it does, that's a broad category of things. If you mean "have an object representing a document where I can access fields as attributes", you can tell pymongo to use ordered attribute access dictionaries or any other desired container for documents, preserving nice attribute access that is the primary benefit of using an active record-based schema system.

[20:13:27] <GothAlice> If you mean "have a schema for my data", MongoDB itself does that for you these days, and will do a better (and way more powerful) job of it: https://docs.mongodb.com/manual/core/document-validation/

[20:13:58] <GothAlice> So for simple CRUD, pymongo's Collection.insert_one, insert_many, find_one, find_many, etc., etc. are absolutely perfect.

[20:14:10] <deathanchor> not using it for schema, it was just a simple way to reduce my code down to a few lines instead of writing my own data handlers.

[20:14:20] <GothAlice> "data handler"?

[20:15:04] <deathanchor> get this object (doc from db), and update it, done.

[20:15:57] <GothAlice> That… is a very simple operation, easily accomplished with no "extraneous" code with the bare driver. http://api.mongodb.com/python/current/api/pymongo/collection.html#pymongo.collection.Collection.find_one_and_update

[20:16:38] <GothAlice> collection.find_one_and_update({…filter…}, {…update…}, …)

[20:16:55] <GothAlice> Not sure how an ODM would simplify that. Instead of referring to a collection, you'd refer to the class, …

[20:17:03] <deathanchor> most of my code for work I use only pymongo, I was toying with mongoengine for my little side project.

[20:17:22] <GothAlice> The "simplification" is mostly an illusion.

[20:18:05] <GothAlice> Now, there are some very specific things that ORM/ODMs do that might be useful. Django-style query or update arguments, for example. (I.e. .update(set__field=value) instead of needing dictionaries)

[20:18:09] <deathanchor> yeah, so is mongoengine goign to die off or just has lots of known problems?

[20:18:54] <GothAlice> It's basically dying. Has been since 0.10 broke basically everything.

[20:19:14] <deathanchor> ho hum. back to pymongo solo I go :)

[20:19:22] <GothAlice> https://github.com/marrow/mongo/blob/develop/marrow/mongo/query/djangoish.py#L174-L196 < the "django-style filters" thing is this bit of code, BTW. 22 lines, including whitespace and docstring. ;)

[20:19:42] <GothAlice> Admittedly with a fairly large configuration: https://github.com/marrow/mongo/blob/develop/marrow/mongo/query/djangoish.py#L77-L127

[20:20:29] <deathanchor> eh, I'll write my own models for flask outside of mongoengine.

[20:21:31] <GothAlice> deathanchor: Pure schema, with optional data translation and validation, is the reason I wrote: https://github.com/marrow/schema#readme — I evaluated and dissected 194 other libraries that touch on schemas to build this generic implementation (where all 194 evaluated are supersets).

[20:21:32] <GothAlice> :)

[20:21:35] <django_> whats the purpose of the: "_id" : ObjectId("577d66041d5bf3cd5f4b706b"),

[20:21:41] <GothAlice> There are some tricks to the metaclass that most seem to forget.

[20:22:12] <deathanchor> django_: think of it as a unique field (or primary key)

[20:22:44] <django_> deathanchor, data in different IDs are not related?

[20:22:53] <GothAlice> django_: ObjectIds are structured compact objects containing four separate fields. A simple string that is the hex value of it is explicitly _not_ the same as the real binary object that the example you gave would produce. By default if _id is not set on a record when you insert it, one will be generated for you. It's the default primary key.

[20:23:23] <GothAlice> https://docs.mongodb.com/manual/reference/bson-types/#objectid

[20:23:37] <django_> right but when i do .insert it generates an ID

[20:23:45] <GothAlice> Correct.

[20:23:47] <deathanchor> django_: read link above

[20:27:58] <GothAlice> deathanchor: Oh, and if you like SQLAlchemy-style query building, marrow.mongo uses "query-aware fields" in a similar way to SQLA: https://github.com/marrow/mongo/blob/develop/marrow/mongo/query/__init__.py#L199 — m.mongo Document instances act as seamlessly dictionaries for easy passing to pymongo, for example.

[20:29:31] <GothAlice> https://gist.github.com/amcgregor/6ddbda735e6ded267d31 for a comparison. (m.mongo isn't production ready or complete, yet, but input would be greatly appreciated! I'm trying to avoid the mistakes of MongoEngine this time around: https://github.com/marrow/mongo/wiki/Design-Considerations)

[20:38:56] <GothAlice> Biggest thing yet to tackle: handling translated documents. There's all sorts of complication involved in having an object with individually accessible fields where some of the fields are actually stored in a multilingual array.

[20:43:04] <deathanchor> I'll give it a whirl with my little side project, summer is tough with all the BBQ and beach going :D

[20:48:10] <|RicharD|> where you live deathanchor ?

[20:48:19] <deathanchor> Dirty Jersey

[20:50:06] <cheeser> how gauche

[20:50:14] <deathanchor> I swear everytime I go to some places at the shore it's like watching a live episode of the Jersey Shore.

[20:51:00] <cheeser> it's either jersey shore in east or my name is earl in the west.

[20:51:06] <cheeser> stay close to princeton. then get out of state asap.

[20:51:18] <deathanchor> nothing beat when I went to Aruba, I swear it was Jersey Shore: Old People edition.

[21:04:38] <|RicharD|> jolol jersey shore

[21:05:04] <|RicharD|> is real deathanchor ?

[21:06:53] <|RicharD|> btw I'm italian and jersey shore is pretty offensive and not bullshit

[21:07:12] <|RicharD|> I don't know italians in new jersey, but here is not like jersey shore

[21:07:19] <|RicharD|> totally NO

[21:17:05] <deathanchor> I've lived in NY/NJ and it hasn't changed that much over the years. Just a little more toned down.

[21:17:15] <deathanchor> * all my life

[21:44:41] <GothAlice> ccccccevngbgefervlhunrgijtfftbrhleueuinvglkh

[21:45:38] <GothAlice> LOL, sorry, need to be more careful when grabbing my laptop to pack away. Gotta not grab the OTP token glued into the USB socket.

[22:29:55] <Doyle> GothAlice, that $5/mo plan with backblaze, is there an acceptable use policy?

[22:31:14] <GothAlice> Doyle: There's a terms of service, aye. It's per-machine licensed, attached drives. https://www.backblaze.com/company/terms.html

[22:32:19] <GothAlice> The attached drives thing seems to be a backup client-enforced limitation, not a policy-based one in the terms, or I wouldn't be doing my ZFS shenanigans. ;)

[22:32:51] <Doyle> looks good to me

[22:33:11] <Doyle> How are you doing the backups? freeze and dump?

[22:33:14] <Doyle> then upload?

[22:33:57] <GothAlice> Point-in-time snapshot, remote mount of the snapshot on my laptop with suitable FUSE options to make it appear as a physical drive instead of network attached storage.

[22:34:55] <Doyle> ahh, gotcha. What's your storage appliance?

[22:36:28] <GothAlice> https://github.com/osxfuse/osxfuse/wiki/Mount-options#fsid is one of the options, see also fsname, fssubtype, and fstypename (with related .plist configuration within the fuse.fs bundle).

[22:37:03] <GothAlice> Three iSCSI Drobo 8-something-i's, paired with one of three 1U Dell rack servers each, replica set between those three nodes.

[22:37:30] <Doyle> wow, cool

[22:37:38] <GothAlice> One array would be a single point of failure, right? ;P

[22:37:51] <GothAlice> (As would only having one host managing the array(s).)

[22:38:15] <Doyle> Well, you could get a dual controller all flash storage array

[22:38:34] <GothAlice> With redundant power supplies, redundant network connections, …

[22:38:37] <Doyle> Low price of $50k for the first 5-10 TB :P

[22:38:39] <GothAlice> Still a single physical point of failure.

[22:38:49] <Doyle> true

[22:40:14] <GothAlice> And shovelling drives is the big reason I went with Drobo arrays; initially more expensive for the enclosures, but boy howdy the zero downtime upgrades and replacements are a joy. SSD hot cache on each, too, of course. (That was a pain… Sandforce controller use isn't always advertised, and their chipsets are utterly borked by not honouring flush requests.)

[22:41:32] <GothAlice> Not to mention variable IO performance on Sandforce chips due to transparent compression use. They're terrible. XP

[22:42:47] <Doyle> Those are some good features for these little things

[22:43:22] <GothAlice> The way they're implemented internally is also really smooth. Drobos ext format the individual drives and use them for logical stripe storage, similar to how MongoDB itself does bulk disk allocations.

[22:43:51] <GothAlice> … and the filesystem/partition table empty pattern de-duplication rocks.

[22:44:01] <Doyle> They're improving on what I saw coming out of synology a few years back

[22:44:14] <Doyle> I think they termed it synology raid, allowing for mixed drive sizes, etc

[22:44:27] <GothAlice> Aye, the marketing term / trademark is "BeyondRAID" these days.

[22:45:28] <Doyle> Do you drobo DR to an offsite drobo?

[22:45:34] <GothAlice> Nope.

[22:45:44] <GothAlice> I was happy making that investment… once. :P

[22:46:17] <GothAlice> I also tend to mistrust IOT-like services.

[22:46:30] <Doyle> Fair

[22:47:56] <GothAlice> (Mostly because many include call-home functionality and firewall punching. There exists no universe where I'll enable something that potentially dangerous on one of my networks. Similarly: Skype is disallowed on my networks. ;)

[22:49:55] <Doyle> So pro

[22:50:28] <Doyle> these drobos... do look good

Log file Viewer

Help | Karma | Search:

#mongodb logs for Wednesday the 6th of July, 2016