PMXBOT Log file Viewer

#mongodb logs for Thursday the 11th of July, 2019

[01:27:21] <robertparkerx> Does anyone know about the `mongo-php-library`? I updated PHP and I am using the new driver. I'm getting errors on an application that was built for the old driver. I fixed some but I'm stuck on one: `Call to undefined method MongoDB\\Collection::update()`. Does anyone know the new driver's equivalent?
[01:27:23] <robertparkerx> I have googled and found updateOne and replaceOne but I don't really know how to use them.
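For context on the question above: the legacy driver's `update()` accepted either a document of update operators (`$set`, `$inc`, ...) or a full replacement document, plus a `multiple` option. `mongo-php-library` splits those cases into `updateOne()`, `updateMany()` and `replaceOne()`. The decision can be sketched as follows. This is a minimal illustration written in Python, and `pick_method` is a hypothetical helper, not part of any driver.

```python
def pick_method(new_obj, multiple=False):
    """Name the mongo-php-library method that matches a legacy update() call.

    new_obj  -- the document formerly passed as update()'s second argument
    multiple -- the legacy 'multiple' option (off by default)
    """
    operator_keys = [key.startswith("$") for key in new_obj]
    if operator_keys and all(operator_keys):
        # Operator document such as {"$set": {...}}: an in-place update.
        return "updateMany" if multiple else "updateOne"
    if any(operator_keys):
        raise ValueError("cannot mix update operators and plain fields")
    if multiple:
        raise ValueError("a replacement document can only target one document")
    # Plain document: the matched document is replaced wholesale.
    return "replaceOne"


print(pick_method({"$set": {"status": "done"}}))
print(pick_method({"$inc": {"hits": 1}}, multiple=True))
print(pick_method({"name": "x", "status": "done"}))
```

In PHP the first case would then become something like `$collection->updateOne($filter, ['$set' => ['status' => 'done']])`, where `$filter` is whatever criteria the old `update()` call used.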
[12:27:52] <jokke> hey there
[12:28:04] <jokke> i'm having some issues upgrading from 3.4 to 3.6
[12:28:10] <jokke> it's a sharded cluster
[12:28:44] <jokke> i'm done with updating each node and now set the feature compatibility version to 3.6 but it won't update on my arbiters
[12:29:27] <jokke> the arbiters stay on 3.4 and complain about not running journaled but writeConcernMajorityJournalDefault set to true
[12:33:35] <jokke> hmm the version of the config also differs
[12:33:44] <jokke> 4 on the primary and 3 on the arbiter
[12:34:43] <jokke> also 4 on the secondary
[12:59:17] <jokke> ok i'm a bit further
[12:59:48] <jokke> sh.status() on a mongos instance doesn't list the arbiters in the shards
[17:16:38] <cgi777> It seems mongodb still has a publicly listening port which is unauthenticated? Is there a good way to secure this? Other problems that I should be aware of?
[17:17:53] <GothAlice> cgi777: The default is NOT publicly listening. If yours is, that's a PEBCAK failure or upstream provider post MongoDB. https://docs.mongodb.com/manual/reference/configuration-options/#net.bindIp
[17:18:22] <GothAlice> https://blog.serverdensity.com/does-everyone-hate-mongodb/ has a nice summary of self-inflicted foot shots to avoid (and avoid complaining about ;)
[17:20:34] <cgi777> GothAlice, is there a link for securing a mongodb database?
[17:20:54] <cgi777> I was thinking of wrapping a python rest api on the same machine, and give my web browser js code access to the database machine
[17:21:03] <cgi777> GothAlice, any other suggestions?
[17:22:11] <GothAlice> cgi777: "and give my web browser js code access to the database machine" — please consider not doing this. Or you will be the next angry blog post other developers point and laugh at a la that "does everyone…" article.
[17:22:35] <cgi777> GothAlice, then how do i get data/queries fast from db to the browser?
[17:23:55] <GothAlice> Because that opens up your database server to use by literally everyone on the planet, for any reason (want to host child pornography? Malware? International state-directed attacks at critical infrastructure?), with no protection for your own data (it's public in its entirety; pray you do not have "accounts" with "passwords" associated with them, or you're next up on https://haveibeenpwned.com )
[17:24:40] <GothAlice> "Gotta go fast" is not a feature or goal. It's a slogan for a blue hedgehog.
[17:25:23] <cgi777> GothAlice, what is a good solution?
[17:26:56] <cgi777> GothAlice, if someone gets into https://bankofamerica.com - does it really matter if the database is on the machine or is on another machine?
[17:26:58] <GothAlice> Learn to program; properly learn, not just copy paste coding. You're wanting to provide an API or access/manipulate resources (data). Learn to do that safely. Connect whatever you're building up to your database layer, with your front-end talking to your back-end which talks to MongoDB. Then benchmark / profile and identify where performance bottlenecks have arisen. And correct those, specifically.
[17:27:19] <GothAlice> Absolutely not, and 99.995% of the time, a database service will never be running on a machine running any other service.
[17:27:42] <GothAlice> For many reasons that a college or university level database administration course can properly educate you on.
[17:27:46] <cgi777> k - so you are saying Browser -> Backend -> Database. Well, if one breaks into the Backend - it's the same problem?
[17:29:18] <GothAlice> You're throwing out meaningless hypotheticals, and I'm not sure why. If someone "breaks in"… what? Define that. As you will have written your back-end, that's on you to code properly and protect. (However, NOT having a backend API interface your front-end talks to, and having your front-end talk directly to the database, means you have literally given up any attempt at security or normalcy.)
[17:29:44] <cgi777> I think we miscommunicated
[17:30:10] <cgi777> Here is what I am thinking -> Browser -> REST API -> Database server (without going to the backend that is serving the webpage).
[17:30:24] <cgi777> So the database machine has a python rest api running that does queries on mongodb
[17:30:31] <GothAlice> "without going to the backend that is serving the webpage" ← why did you add this, and what does it mean?
[17:30:51] <cgi777> A webpage is served from a backend (a webserver)
[17:30:59] <cgi777> I am trying to avoid that machine to do db queries
[17:31:33] <GothAlice> … not typically. Your back-end application (running by itself, or in a cluster or cloud) runs in isolation, your database services run in isolation, and they communicate. No other services run on the database nodes other than database services.
[17:32:22] <cgi777> so running a mongodb - securing it - and running a https server (for REST API) on the same machine is a no-no then.
[17:34:28] <GothAlice> I think you should take a proper course on systems administration and DBA concerns. As repeatedly stated, running any other services on the same nodes as database services is a terrible idea. And… if you're thinking of having an app which the front-end talks to, and ANOTHER app which the back-end talks to, via HTTP, to make MongoDB requests, you have arbitrarily overcomplicated your process for no reason.
[17:35:19] <GothAlice> I.e. I suspect you do not understand what is meant by a "database query". When an application performs a query, it does no work. It hands the query off to the database service, which then gets back to the application with the results. There is no need or reason to put anything between the two.
[17:36:09] <GothAlice> On the other hand, there is EVERY reason to put something (i.e. your formal REST API) between the front-end and database, unless you want to run an open database for anyone on the internet to use.
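The layering described above (front-end → REST API → database) can be sketched as a back-end function that builds the database query itself from a small set of allowed parameters, rather than forwarding whatever the browser sends. This is only a minimal sketch: `ALLOWED_FIELDS`, `build_filter`, and the field names are hypothetical, and a real API would also need authentication in front of it.

```python
ALLOWED_FIELDS = {"author": str, "year": int}  # hypothetical schema


def build_filter(params):
    """Turn untrusted request parameters into a MongoDB filter document."""
    query = {}
    for name, raw in params.items():
        if name not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {name!r}")
        if not isinstance(raw, str):
            # Nested input such as {"$ne": ""} would otherwise smuggle in
            # a query operator chosen by the client.
            raise ValueError(f"field {name!r} must be a plain string")
        # Coerce to the expected type; int("abc") raises ValueError.
        query[name] = ALLOWED_FIELDS[name](raw)
    return query


print(build_filter({"author": "alice", "year": "2019"}))
```

The resulting dictionary is what the back-end would hand to its MongoDB driver; the browser never gets to choose operators or collections.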
[17:37:06] <cgi777> GothAlice, that is indeed what I was talking about - mongodb + Authenticated REST API - If I understand right - what you are suggesting is devote two separate machines for this. I wanted to do this on the same machine
[17:37:50] <GothAlice> Hmm.
[17:38:53] <GothAlice> I can only keep suggesting seeking a formal education on these things. One does not deploy web apps on single machines. One does not deploy DATABASES on single machines, unless one does not value the data, and doesn't mind losing all of it at some point.
[17:39:48] <GothAlice> I do not think you understand what is meant by "database query". When an application performs a query, it does no work. It hands the query off to the database service, which then gets back to the application with the results. Ref: https://docs.mongodb.com/manual/replication/ see also: high availability, failover.
[17:39:49] <cgi777> GothAlice, you are assuming I am not backing it up?
[17:40:10] <cgi777> GothAlice, k - let me be very specific
[17:40:12] <GothAlice> Er, sorry about the duplication, there. ¬_¬
[17:40:39] <GothAlice> (Didn't mean to hit the up arrow before adding the links. >_<)
[17:41:24] <cgi777> I've a machine X - ubuntu 16.04 - I install mongodb on it, secure it. Then I run a python program that listens on 443 - only talks to authorized users. When users send queries, the python code decides if it should hand it off to the database or not. Assume this machine is backed up.
[17:42:12] <cgi777> GothAlice, Thanks for the help. Indeed my database course has been some time already.
[17:42:48] <GothAlice> Alice's Law #107: “Do you have a backup?” means “I can’t fix this.” — replication w/ high availability (at least three nodes) and failover means in the event of the failure of up to two thirds of your database nodes, you will never need your backups. This is the "right" way to deploy a database.
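As a rough guide to the arithmetic behind the three-node recommendation: a replica set can elect a primary only while a majority of its voting members is reachable, so the number of failures it tolerates without losing write availability is smaller than the number it tolerates without losing data. A minimal sketch, with illustrative function names:

```python
def majority(voting_members):
    """Votes needed to elect a primary."""
    return voting_members // 2 + 1


def tolerated_failures(voting_members):
    """Members that can fail while a primary can still be elected."""
    return voting_members - majority(voting_members)


for n in (1, 2, 3, 5):
    print(n, majority(n), tolerated_failures(n))
```

With three members, one can fail and the set still accepts writes. If two fail, the surviving member still holds a copy of the data but remains a secondary until a majority is reachable again.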
[17:42:51] <cgi777> At this point I am not worried about high availability and failover. I am just asking if this is something that makes sense to do. On a single machine, use https+rest api to query mongodb
[17:43:37] <cgi777> GothAlice, that also means price - more work - 3 nodes - to start with for 100MB of data that I can backup - is over engineering to me.
[17:45:14] <GothAlice> Possible? Yes. Advisable, reasonable, durable, or reliable? No. Not even close. If you have so little data, and insufficient capability to handle it properly, does your hosting provider not offer a shared MongoDB database environment? That is set up correctly? Have you explored hosted options?
[17:45:31] <GothAlice> Hosted options fail for me due to the volume of my data (~40 terabytes), but 100MB costs would be far more reasonable.
[17:45:38] <cgi777> GothAlice, I was looking at hosted options actually
[17:46:27] <cgi777> I was even looking at redis - in memory - but my data can grow - to a few GB in size. So dumped redis.
[17:51:15] <GothAlice> "I run a python program that listens on 443" ← another issue. Don't run application services exposed directly to the web, ever. Always utilize a front-end load balancer (FELB) or Application Firewall, e.g. nginx. It'll handle HTTP connection management, SSL termination, etc., etc. infinitely better than a Python app. Additionally, avoid "reverse proxying" where possible, e.g. having the FELB speak to your application over HTTP, which is an inefficient text-based protocol.
[17:51:44] <GothAlice> FastCGI is binary, uWSGI is a version update to FastCGI that's even more awesome, anything other than running an actual HTTP server in Python.
[17:55:26] <cgi777> GothAlice, Thanks. I currently run nginx -> tornado (does not need uwsgi) - and i think some people use this in production (facebook?)
[17:56:35] <cgi777> so if i end up running mongodb - it would be mongodb + nginx + tornado (in python) + rest api written in tornado.
[17:57:02] <cgi777> on ubuntu lts
[17:57:11] <GothAlice> (Fun exploitable thing about HTTP servers: a simple way to denial-of-service (DoS) one is to open a connection, start to make a request, but never finish. Open another every half second, leaving each connection open and sending one byte every few seconds to each. If you don't correctly configure your application behind that FELB, i.e. bind only to 127.0.0.1, then external attackers can consume all possible connections to your back-end application.)
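The "bind only to 127.0.0.1" advice above can be sketched with the standard library. Python's built-in `http.server` is used here purely as a stand-in for the application process (it is not Tornado, and not a suggestion to serve production traffic this way); the point is only the bind address. The `Hello` handler and the choice of port 0 are assumptions made for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Hello(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


# Bind to the loopback interface only, so that only local processes (such as
# a front-end proxy on the same host) can connect. Port 0 lets the OS choose
# a free port for this demonstration.
server = ThreadingHTTPServer(("127.0.0.1", 0), Hello)
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

# A local client, standing in for the front-end, can reach the application.
conn = http.client.HTTPConnection(host, port, timeout=5)
conn.request("GET", "/")
reply = conn.getresponse().read()
conn.close()

server.shutdown()
server.server_close()
print(host, reply)
```

Binding to `0.0.0.0` instead would accept connections on every interface, which is the misconfiguration the message above warns about. The loopback bind does not by itself stop slow-request attacks; it only ensures they have to go through the front-end.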
[17:57:32] <GothAlice> "nginx -> tornado (does not need uwsgi)" → so that means you're using Tornado's HTTP server, speaking the plain text HTTP protocol through Nginx's reverse proxy process.
[17:59:09] <GothAlice> Client request → Nginx parses → Nginx resolves the request to a proxy_pass directive → Nginx reconstructs a modified request and passes it along → application worker receives HTTP request, parses it → resolves the request to a response → encodes the response and sends it back to Nginx → Nginx receives response, parses it, then re-serializes it to pass along → client gets response.
[17:59:15] <cgi777> GothAlice, https://www.tornadoweb.org/en/stable/guide/running.html
[17:59:49] <GothAlice> The proxy_pass process outlined above should be obviously sub-optimal. And yup, confirming what I suggested, you're running an HTTP server. In Python. For some reason. ;P
[18:00:36] <cgi777> GothAlice, I think tornado's http server is pretty well optimized. Let me check.
[18:00:53] <GothAlice> https://uwsgi-docs-additions.readthedocs.io/en/latest/Tornado.html
[18:01:10] <GothAlice> It's less about optimization, and more about security, vulnerability, and surface area of attack.
[18:01:10] <cgi777> GothAlice, https://klen.github.io/py-frameworks-bench/
[18:01:20] <GothAlice> Gah, I hate these benchmarks.
[18:01:36] <GothAlice> On this test, my own web framework gets sub-millisecond times.
[18:02:08] <GothAlice> "I didn't know sub-millisecond response times existed" -- djdduty
[18:04:00] <cgi777> GothAlice, https://groups.google.com/forum/#!topic/python-tornado/icV-uXiR0XQ - you should write here :)
[18:04:05] <GothAlice> https://gist.github.com/amcgregor/707936#file-terminal-2-apachebench-L35-L44 ← my own HTTP/1.1 server, written using Tornado's IOLoop subsystem (and just that subsystem… extracted and isolated from Tornado), gets 6K r/sec @ 10,000 concurrency (C10K) or 0.167 milliseconds per request (mean across all concurrent requests).
[18:04:15] <cgi777> Especially this one -> my basic setup is; nginx + tornado ( + supervisor) + mongodb (with motor)
[18:04:44] <GothAlice> There's little point in my engaging there.
[18:05:23] <cgi777> GothAlice, From the author of Tornado -> "I've never used uwsgi or gunicorn, and I haven't used pypy in production. "
[18:05:43] <GothAlice> Cool. Weight of that developer's opinion duly adjusted.
[18:07:22] <GothAlice> cgi777: https://github.com/marrow/cinje/wiki/Benchmarks#python-37 ← I write systems that are efficient. Because they need to be, and every other existing tool has failed me. (I.e. I reinvent wheels that are square, in situations where square wheels aren't actually appropriate. Yes, there are situations where a square wheel is the right solution.)
[18:09:14] <GothAlice> Notably, I wrote cinje because mako and jinja2 offered no way to sensibly flush templates as they generate, and they were taking > 30 seconds to generate anything, which meant they generated nothing. Compare cinje_flush_first vs. mako. Essentially identical templates, same output, same input data: almost literally 1000x faster.
[18:10:49] <cgi777> GothAlice, isnt this benchmark about templates more than the webserver itself?
[18:11:48] <GothAlice> This one is, but it's demonstrative that many of these benchmarks aren't really measuring the right things. My C10K test, there, wipes the floor with the contenders from the link you provided. In a web application, however, it's all tied together.
[18:12:19] <cgi777> GothAlice, In my case, I do not use templates at all from python. Only rest and websockets.
[18:12:56] <GothAlice> E.g. the maximum speed of template generation, and the maximum speed of endpoint resolution (controller lookup), … all combine to determine the absolute maximum performance of your whole application.
[18:13:01] <GothAlice> That's a potential warning sign, too.
[18:13:01] <cgi777> GothAlice, so I just need raw websocket/http speed along with ease of use to route urls - which tornado does well at - if i am not mistaken
[18:13:14] <GothAlice> cgi777: Does your front-end make "requests" over the web socket, then wait for "answers"?
[18:13:34] <cgi777> GothAlice, no - websockets are async
[18:13:49] <cgi777> GothAlice, otherwise ajax type calls over https
[18:14:25] <GothAlice> I'm aware. That has not stopped hundreds of people I've encountered from reinventing HTTP on top of WebSockets which run on top of HTTP. E.g. to make transactional requests (callbacks waiting for responses), or to reimplement caching. WebSockets are often a bad sign, and generally completely unnecessary.
[18:14:58] <GothAlice> Unless you're wiring a realtime voice or video chat system, video game with 50 tick per second update requirements, etc.
[18:15:00] <cgi777> GothAlice, indeed we have little use of websockets and might get rid of them at some point
[18:15:01] <GothAlice> Real binary socket stuff.
[18:17:40] <cgi777> GothAlice, https://www.techempower.com/benchmarks/ - another benchmark - latency looks interesting
[18:18:16] <GothAlice> cgi777: I'd be curious if you have any commentary on https://github.com/marrow/WebCore/blob/develop/example/annotation.py#L15 (framework or other examples, too). You mentioned endpoint routing. Tornado uses an O(n) router, taking the time to independently check every single registered route in the event of a 404. WebCore defaults to "object dispatch", or an O(depth) router based on attribute resolution, requiring no explicit registration of any kind.
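The two routing styles contrasted above can be sketched side by side. This is a simplified illustration only: it is neither Tornado's nor WebCore's actual code, and `ROUTES`, `Root`, `Users`, `route_linear`, and `route_object` are hypothetical names.

```python
import re

# Linear router: every pattern is tried in order, so a miss (404) costs O(n)
# in the number of registered routes.
ROUTES = [
    (re.compile(r"^/users/(\d+)$"), lambda user_id: f"user {user_id}"),
    (re.compile(r"^/posts/(\d+)$"), lambda post_id: f"post {post_id}"),
]


def route_linear(path):
    for pattern, handler in ROUTES:
        match = pattern.match(path)
        if match:
            return handler(*match.groups())
    return None  # reached only after checking every route


# Object dispatch: walk attributes one path segment at a time, so the cost
# depends on the depth of the path, and nothing is registered explicitly.
class Users:
    def __call__(self, user_id):
        return f"user {user_id}"


class Root:
    users = Users()


def route_object(root, path):
    node = root
    segments = [segment for segment in path.split("/") if segment]
    args = []
    for index, segment in enumerate(segments):
        # Never expose private or dunder attributes through the URL.
        child = None if segment.startswith("_") else getattr(node, segment, None)
        if child is None:
            # No such attribute: the remaining segments become arguments.
            args = segments[index:]
            break
        node = child
    if node is root or not callable(node):
        return None
    return node(*args)  # argument validation omitted for brevity


print(route_linear("/users/42"), route_object(Root(), "/users/42"))
print(route_linear("/nope"), route_object(Root(), "/nope"))
```

Adding an endpoint to the object-dispatch version means adding an attribute or class, while the linear version needs another entry in `ROUTES`.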
[18:18:19] <GothAlice> :)
[18:18:57] <GothAlice> (Annotation-based typecasting is also darned handy for writing APIs quickly.)
[18:19:26] <GothAlice> (WebCore, there, is a third the size of Flask, for comparison.)
[18:20:08] <cgi777> GothAlice, I will have to look at it. Thanks for the pointer, looks interesting. I would also be curious if people are using it in production?
[18:20:29] <cgi777> It looks too experimental at this stage? Perhaps better but not tested enough?
[18:21:10] <cgi777> Need to leave for an hour. Will be back :)
[18:21:15] <cgi777> GothAlice, Thanks for the conversation :)
[18:21:15] <GothAlice> cgi777: Various components have 100% test coverage, or near-100% coverage, with certain instances of packages having literally more tests than statements to test.
[18:21:27] <GothAlice> It's also been in production use and production-ready since early 2009.
[18:22:23] <GothAlice> :P A nice example of my not minding people not jumping aboard my sexier, faster train. I'm okay with being selfish. I like my train. ;)
[18:22:39] <GothAlice> (Ah, that 2009 bit also means it actually predates Flask, too.)
[18:24:44] <GothAlice> I even have one developer here on IRC translating the framework to PHP (the dispatch process really tickled his fancy, ref: https://github.com/marrow/protocols/tree/master/dispatch#readme) and another porting its concepts to JS.
[18:51:13] <robertparkerx> Does anyone know what to replace the legacy `->remove()` and `->insert()` with?
[18:51:42] <robertparkerx> in the mongo-php-library
[18:56:46] <GothAlice> Has the trend not been to include a quantity specifier? _one vs. _many suffix?
[19:01:31] <GothAlice> robertparkerx: https://docs.mongodb.com/php-library/master/reference/method/MongoDBCollection-insertMany/ + https://docs.mongodb.com/php-library/master/reference/method/MongoDBCollection-insertOne/
[19:01:37] <GothAlice> robertparkerx: https://docs.mongodb.com/php-library/master/reference/method/MongoDBCollection-deleteMany/ + https://docs.mongodb.com/php-library/master/reference/method/MongoDBCollection-deleteOne/
[19:01:46] <robertparkerx> You're the best!
[19:01:52] <GothAlice> Yup, same as virtually all drivers.
[19:02:11] <GothAlice> 👍 for consistency.
[19:04:40] <robertparkerx> GothAlice, what about ensureIndex()
[19:06:23] <GothAlice> That was an anti-pattern they realized was unfortunate to promote.
[19:06:35] <GothAlice> ensureIndex is the wrong approach to indexing.
[19:09:55] <robertparkerx> I don't think its being used anyway
[19:10:39] <robertparkerx> what about `$cursor->limit()` and `$cursor->skip()` I know `$cursor->count()` was removed
[19:11:29] <robertparkerx> $cursor = $this->collection[$collection]->find($criteria,$options);
[19:14:58] <robertparkerx> https://docs.mongodb.com/manual/reference/method/db.collection.find/
[19:15:04] <robertparkerx> it seems its still possible
[19:28:41] <cgi777> GothAlice, these managed database providers - how do they handle database queries and communication? over https?
[19:29:18] <GothAlice> … no. HTTP is not involved. MongoDB wire protocol is involved, and that can use SSL, similar to HTTPS, but not over it.
[19:29:50] <cgi777> GothAlice, these guys support postgresql/redis/mysql - so they use mongodb protocol?
[19:30:27] <GothAlice> https://docs.mongodb.com/manual/reference/mongodb-wire-protocol/ + https://docs.mongodb.com/manual/core/security-transport-encryption/index.html
[19:30:40] <GothAlice> Nope, each of those has their own wire protocol.
[19:30:44] <GothAlice> And none of those are MongoDB.
[19:31:25] <cgi777> MONGODB_CA_CERT = "/path/to/ca_cert.pem"
[19:31:26] <cgi777> MONGODB_CONN_URL = "mongodb://testuser:<pwd>@SG-example-17026.servers.mongodirector.com:27017,SG-example-17027.servers.mongodirector.com:27017,SG-example-17028.servers.mongodirector.com:27017/test?replicaSet=RS-example-0&ssl=true"
[19:31:40] <cgi777> and this is safe? to directly allow users to connect to these machines over the web + ssl?
[19:32:39] <GothAlice> Not over the web. >_< I'm not getting through to you. And it's regrettable you have a desire to paste things like that into IRC channels. At least there wasn't a password in that connection string.
[19:36:34] <cgi777> GothAlice, That was sample code from a hosted secure service (scalegrid) :)
[19:37:53] <cgi777> I would rather have MONGODB_CONN_URL="https://db.mydomain.com" - only works with a OTP and lets you do rest - that is what I was talking about
[19:38:32] <cgi777> I would think that is much better than managed service mongodb://...mongodirector.com:27017?
[19:39:43] <cgi777> here is another managed service : mongo ds012345.mlab.com:56789/dbname -u dbuser -p dbpassword
[19:43:49] <edrocks> cgi777 the point is you shouldn't be handing out your db address public or private to the general internet
[19:47:51] <cgi777> edrocks, that sounds like security by obscurity?
[19:48:10] <edrocks> cgi777 it's called secure your db and obscure it too. Zero days are a thing
[19:48:53] <edrocks> security by obscurity is just a marketing thing to get people to actually secure stuff when really in fact you should be doing both and have layers of security(with obscurity being one of them)
[19:53:47] <cgi777> edrocks, thanks
[21:09:32] <fennng_> any one know good online video course of mongodb? better with assignments.