[10:55:50] <chazapp> Hi, anyone know how to fix mongodb-compass when it's stuck on "Loading" screen at startup ?
[10:56:21] <chazapp> I've uninstalled, tried community version, but i'm still stuck at the same point, i can't use it
[13:25:02] <chazapp> db.getCollection("doc").find({"features.creationTimestamp": {"$gte": "2019-09-09T14:03:43.094Z"}}); => 0 results instead of 3. Did i miss anything ?
[13:27:59] <GothAlice> chazapp: Why are you searching and attempting to >= compare a literal string?
[13:29:00] <GothAlice> (That looks like mongo shell input, not application-side DAO, so there’s _no_ assistance or awareness of any kind of schema, because there is no schema, for casting the values used.)
[13:30:08] <chazapp> Doesn't mongo manage timestamps that way ?
[13:31:22] <GothAlice> chazapp: MongoDB expects real datatypes for certain things. You are suffering the “datetime object” version of this problem: https://stackoverflow.com/a/55386123/211827
[13:32:12] <GothAlice> In the last four days, five people have hit that exact problem; I’ve shared that link with each of them.
[13:32:22] <GothAlice> Four of them were using Mongoose. Are you? ;^P
[13:32:36] <chazapp> We use mongoengine on a flask backend
[13:32:47] <chazapp> are you sure this is related ?
[13:33:13] <chazapp> (don't hit the new backend dev on an outdated stack)
[13:33:24] <GothAlice> If you’re in Python land, use a real datetime.datetime object, yo.
[13:34:02] <chazapp> All i want is to get docs.features.creationTimestamp ; creationTimestamp looks like this in Compass: "2019-09-09T14:03:43.094Z"
[13:34:39] <chazapp> for some reason the current impl (written in 2017) uses "__raw__" in python. yikes
[13:35:12] <GothAlice> Have you _tried_ a datetime object?
[13:35:13] <chazapp> I just guessed i could throw dict() at it written like that "{"features":{"creationTimestamp":{"gte":"2019-09-09T14:03:43.094Z"}}}"
[13:36:02] <GothAlice> Additionally, by using __raw__ you are bypassing any assistance the DAO may have provided. (Who knows, it might grok ISO-style timestamps and convert them to datetime() objects for you. I doubt that, but.)
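A minimal sketch of the fix being suggested, assuming MongoEngine and keeping the project’s existing __raw__ style (the Doc class and database name are invented for illustration):

```python
from datetime import datetime
from mongoengine import Document, DictField, connect

connect("mydb")             # hypothetical database

class Doc(Document):        # hypothetical stand-in for the
    features = DictField()  # project's existing document class

# Parse the ISO-8601 string into a real datetime object. "Z" means UTC;
# it is spelled as an explicit offset here so fromisoformat() accepts it
# on Python 3.7+ (3.11+ accepts the "Z" directly).
since = datetime.fromisoformat("2019-09-09T14:03:43.094+00:00")

# Same __raw__ query as before, but comparing against a datetime
# object rather than a literal string:
docs = Doc.objects(__raw__={"features.creationTimestamp": {"$gte": since}})
```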
[13:37:02] <chazapp> got it, i'm trying to do that
[13:37:55] <GothAlice> In a mongo REPL shell, see: https://docs.mongodb.com/manual/reference/method/Date/ — each language will have its own date/time representation.
[13:40:08] <GothAlice> Apologies if I’ve linked this before (buffer history is wack locally ;), but the “yikes” was because I, too, used to use MongoEngine. Then its code quality reached “just when you think you’ve hit rock bottom, you hear knocking from below” levels. https://github.com/marrow/contentment/issues/12
[13:40:36] <GothAlice> So I wrote https://mongo.webcore.io
[13:41:37] <GothAlice> E.g. my favourite regression: if you resultset.count(), it forgets any .limit() or .skip(). Having my paginated tabular results suddenly come over as a single page of every single item (at 20 items per page, 17,000 pages…)
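Illustratively, the regression looked something like this; the with_limit_and_skip parameter is from MongoEngine/PyMongo APIs of that era, and the class is invented:

```python
from mongoengine import Document, StringField, connect

connect("mydb")            # hypothetical database

class Item(Document):      # hypothetical collection with thousands of rows
    title = StringField()

page = Item.objects.skip(40).limit(20)  # third page, 20 per page
len(list(page))                         # 20 documents, as expected
page.count()                            # buggy releases: total collection
                                        # count, ignoring .skip()/.limit()
page.count(with_limit_and_skip=True)    # workaround of the era: count with
                                        # the cursor modifiers applied
```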
[13:47:31] <chazapp> I'm the new dev on a legacy project; there have been funnier times. i don't have a clue how to do queries on that stack
[14:00:00] <GothAlice> As much as I don’t really want to encourage its use, chazapp, I grok that practicality beats purity: http://docs.mongoengine.org/tutorial.html#accessing-our-data
[14:01:13] <GothAlice> MongoEngine, like most DAO layers in Python, takes a “declarative” approach to “schema” design, an “active record” approach to the records being able to self-save, and “parametric” querying (transforming argument name prefixes and suffixes into operations).
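A minimal sketch of all three ideas together (class and field names invented for illustration):

```python
from datetime import datetime, timezone
from mongoengine import (Document, EmbeddedDocument, EmbeddedDocumentField,
                         DateTimeField, StringField, connect)

connect("mydb")  # hypothetical database

class Features(EmbeddedDocument):        # "declarative" schema: fields are
    creationTimestamp = DateTimeField()  # declared as class attributes

class Doc(Document):
    title = StringField()
    features = EmbeddedDocumentField(Features)

# "Active record": the record knows how to save itself.
Doc(title="example",
    features=Features(creationTimestamp=datetime.now(timezone.utc))).save()

# "Parametric" querying: the __gte suffix is transformed into $gte.
since = datetime(2019, 9, 9, 14, 3, 43, 94000, tzinfo=timezone.utc)
recent = Doc.objects(features__creationTimestamp__gte=since)
```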
[14:15:10] <GothAlice> A deficit MongoEngine use encourages: ignorance of “validation documents”. You get used to MongoEngine validating the data going in and never make use of MongoDB itself validating that data, thus invalid data may originate from other sources, e.g. the mongo shell, Studio 3T or other graphical tools, etc.
[14:17:02] <GothAlice> https://docs.mongodb.com/manual/core/schema-validation/ ← and this is so much more badass than relational schemas or constraints. By gawd. (Yes, you can totally use $or or $and… making the documents as conditionally strict as one needs.)
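As a hedged sketch via PyMongo, using plain query operators rather than $jsonSchema (collection and field names invented):

```python
from pymongo import MongoClient

db = MongoClient()["mydb"]  # hypothetical database

# Every inserted or updated document must either carry a real datetime
# in features.creationTimestamp, or be explicitly flagged as a draft;
# the $or is what makes the rule conditionally strict.
db.create_collection("doc", validator={
    "$or": [
        {"features.creationTimestamp": {"$type": "date"}},
        {"draft": {"$eq": True}},
    ],
})
```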
[14:53:47] <chazapp> GothAlice: changing strings to datetime object worked. I had to do ugly dictionary traversal to get to the string but now it works. Thanks a lot
[15:30:02] <ssarah> hei guys, what do i do if i have 2 fields that have too much text to be indexed by a normal index?
[16:12:39] <GothAlice> ssarah: Define “too much text”? This is not an issue I’ve ever encountered, though in general I’ve never attempted to create a normal index on large text content; that’s what full-text indexes are for.
[16:13:24] <GothAlice> (Notably, FTI can benefit from data reduction and deduplication; depluralization and stemming mean “bus”, “busses”, “bussing”, and “bussed” ➤ “bus”.)
[16:15:07] <GothAlice> Especially notably: language-aware FTI with language-aware depluralization and stemming.
[16:19:12] <GothAlice> ssarah: https://www.mongodb.com/atlas/full-text-search & https://docs.mongodb.com/manual/core/index-text/ (noting that I have just shy of a million works of fiction and non-fiction, plus hundreds of thousands of articles and papers, currently stored in my 54TiB MongoDB cluster at home… ;P
[16:19:25] <GothAlice> “Too much” always strikes me as the wrong problem. ;^P
[16:20:21] <GothAlice> (Ah, also includes a complete copy of Wikipedia, most of StackOverflow + Exchange, and all of Project Gutenberg. ;)
[16:41:12] <ssarah> GothAlice, I'm using mongodb with meteor. and i have two fields where, when i tried to create a normal index (without any parameters), it gave me the "index content too large" error
[16:41:26] <ssarah> so i'm assuming i needed two text indexes in the same collection, which is a no-no
[16:41:38] <ssarah> i tried creating a compound text index but it didn't work so well
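For what it’s worth, MongoDB allows at most one text index per collection, but that one index may span several fields; a sketch in PyMongo (field names invented):

```python
from pymongo import MongoClient, TEXT

coll = MongoClient()["mydb"]["articles"]  # hypothetical collection

# One compound text index covering both large text fields; weights and
# default_language are optional index options.
coll.create_index(
    [("body", TEXT), ("summary", TEXT)],
    default_language="english",
    weights={"body": 10, "summary": 5},
)

# Queries then go through the full text index rather than a normal
# (B-tree) index, which is what hits the key-size limit on large text.
results = coll.find({"$text": {"$search": "mongodb"}})
```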
[18:05:28] <blizzow> I'm having a helluva time connecting to mongo replicaset using robo3t through an ssh tunnel. I run: ssh myuser@myhost -L 127.0.0.1:27017:mongoa:27017 -L 127.0.0.1:27018:mongob:27017 -L 127.0.0.1:27019:mongoc:27017
[18:05:53] <blizzow> robo3t complains that it cannot find a primary member and keeps trying to connect to the hostnames of the replica set members.
[18:06:09] <blizzow> I also tried adding them into my /etc/hosts file and that didn't do any good.
[18:07:26] <GothAlice> blizzow: One needs working DNS for reliable replica set operation. Members advertise themselves by hostname as part of connection tracking and election duties (amongst other uses) within the client application; each also knows what port it is on, and will include that in the advertisement.
[18:08:02] <GothAlice> SSH tunnel re-mapping will confuse the daylights out of any client trying to connect. Fix: run a sharded replica set, even if there’s only one shard, and utilize a “mongos” query router. SSH tunnel that. Only one connection to tunnel, then.
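A sketch of the client side under that arrangement, assuming a single mongos tunnelled to the local machine (hostnames invented):

```python
# ssh myuser@myhost -L 127.0.0.1:27017:mongos-host:27017   <- one tunnel
from pymongo import MongoClient

# The client talks only to the query router, so it never needs to
# resolve the replica set members' advertised hostnames.
client = MongoClient("mongodb://127.0.0.1:27017/")
client.admin.command("ping")

# Alternatively, newer drivers can poke at a single tunnelled member
# without replica set discovery via a direct connection:
member = MongoClient("mongodb://127.0.0.1:27018/", directConnection=True)
```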
[18:08:39] <GothAlice> “ugh” or… a standard part of sysops for complex services. ¯\_(ツ)_/¯
[18:09:09] <blizzow> It's strange because studio3t handles it fine. So does the mongo CLI client. Unfortunately, it's robomongo that's giving my people hell.
[18:12:32] <GothAlice> blizzow: I’m not permitted to connect to production datasets directly. My IT department runs a local mongos router connected via VPN to the DC cluster with extensive logging enabled, so they can watch what data I access to enforce ISO certification requirements. Query routers are dang handy things!
[18:12:46] <GothAlice> (And why we run a sharded replica set with only one shard for most of our services.)
[18:37:43] <blizzow> GothAlice, I like the idea of connecting to only a mongos instance. I find standing up four extra machines, 3 configs + 1 mongos, excessive.
[18:46:02] <GothAlice> blizzow: One does not require dedicated machines for every minor component.
[18:46:39] <GothAlice> Additionally, since there is only one shard, having reliable config servers to track which shards contain which stripes is far less important. You can recover from the complete loss of that information pretty easily.
[18:47:19] <GothAlice> E.g. in a three-node replica set, run a config server on each replica, too.
[18:47:40] <GothAlice> (No collision; those run on non-client-connecting ports.)
[18:47:42] <blizzow> I guess I could colocate configs and mongos, then round robin the mongos for availability.
[18:48:04] <GothAlice> blizzow: The most extreme case I had was a mongos per app worker.
[18:48:19] <blizzow> I've seen that before too. I thought that was insane.
[18:48:19] <GothAlice> Each application process essentially had its own dedicated query router.
[18:48:36] <GothAlice> Query routers are just that; lightweight proxies.
[18:49:04] <GothAlice> They do the node tracking so your application’s client driver doesn’t need to.
[18:50:27] <blizzow> It's been a few years since I've run sharded replica sets. We had such stability problems back in the 2.x days. Heck, it may have even been 1.8. (pre-WiredTiger)
[18:50:36] <blizzow> We actually moved to tokumx for a while.
[18:51:31] <GothAlice> On the other hand, sunk-cost. ;P Before MongoDB had compression, I wrote compression for my data. Before MongoDB had FTI, I wrote full-text indexing myself. Etc.
[18:53:32] <GothAlice> Parallel Okapi-BM25 with binary pre-imaging. :mic drops at least three Microsoft patents:
[19:05:27] <blizzow> I really wonder how studio3t is doing their ssh proxy connection. 'cause it "just works"
[19:07:22] <GothAlice> blizzow: Make a connection, “ps aux” in Terminal to see if they’re actually invoking a “ssh” command. ;P Hard to hide what you’re doing on UNIX.
[19:08:40] <blizzow> I already tried that on osx and couldn't find it. I guess I could fire up a linux one. I think they may be using some spaycial java ssh client.