[08:57:32] <n3ssi3> Hey there :) Can someone help me import a large mongodump?
[09:29:47] <kurushiyama> n3ssi3 It would be helpful to describe the problem a bit.
[09:41:39] <n3ssi3> kurushiyama: I'm trying to import a dump about 17GB big... It always stops with the error "insertion error: EOF", so I googled and found this: https://jira.mongodb.org/browse/TOOLS-939
[09:42:29] <kurushiyama> n3ssi3 Have you tried to reduce the batch size, then?
[09:43:23] <n3ssi3> I tried with many different batchSize values but it doesn't work
[09:55:55] <kurushiyama> n3ssi3 That is... ...strange.
[09:57:08] <kurushiyama> n3ssi3 Is this the restore of a backup or a data migration?
[09:57:48] <n3ssi3> restore of a backup... I am a frontend dev, and Backend just sent me their dump...
[09:58:07] <n3ssi3> the problem is they are in the US and asleep right now :p
[09:58:50] <kurushiyama> n3ssi3 I guess they did not verify their dump.
[09:59:13] <kurushiyama> n3ssi3 Actually, I assume a corrupted dump file.
[09:59:49] <kurushiyama> n3ssi3 which would be... unfortunate, to say the least.
[10:00:00] <kurushiyama> n3ssi3 Let me quickly check sth.
[10:00:31] <kurushiyama> n3ssi3 with which version was the dump _made_?
[10:00:39] <n3ssi3> I have time ;) I need to do performance tests, so without the data I am going to sit here and do nothing
[10:01:27] <n3ssi3> hmm I wouldn't know... or can I check that in the dump?
[10:04:18] <kurushiyama> n3ssi3 Not sure about that. But let's say you try to restore with 2.6.X and the dump was made with 3.0.X (mongorestore and mongodump version), there might be problems.
[10:05:25] <n3ssi3> If anything they use the older version....
[10:05:42] <n3ssi3> kurushiyama: would it help if I downgrade to a 2.x version?
[10:06:06] <n3ssi3> I only use mongo for this one project, so I can uninstall/reinstall and delete everything
[10:06:06] <kurushiyama> n3ssi3 Well, you have to find out with which version the dump was made.
[10:06:36] <n3ssi3> kurushiyama: thanks :) I will check with US once they wake up
[10:06:37] <kurushiyama> n3ssi3 Then, you should install the same major.minor.X
[10:06:54] <kurushiyama> n3ssi3 Both server and tools.
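kurushiyama's rule of thumb above — restore with the same major.minor series that made the dump — can be sketched as a quick version check. The helper name is illustrative, not from the chat:

```python
def same_release_series(v1, v2):
    """True when two MongoDB version strings share the same
    major.minor release series, e.g. 3.0.12 and 3.0.4.
    Per the advice above, mongodump and mongorestore (and the
    servers) should come from the same series."""
    return v1.split(".")[:2] == v2.split(".")[:2]
```

For example, `same_release_series("3.0.12", "3.0.4")` holds, while a 2.6.x restore of a 3.0.x dump would fail the check.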
[14:18:14] <OSInet> jmikola: hello. I'm working on MongoDb for Drupal 8. IIRC when we met at the Paris MUG you told me the next ODM would be using the low-level driver and advised me to probably do the same for Drupal, but I can't see a version using it on the ODM repo. Am I missing something ?
[14:20:20] <jmikola> howdy! development hasn't started on ODM 2.0. speaking with the other devs since, they will likely use https://github.com/mongodb/mongo-php-library (or a similar library) to abstract some basic commands, like index creation
[14:21:08] <jmikola> you may want to do the same. while the library does contain Database and Collection classes, many of the command wrappers are stand-alone classes that you can use directly if you so choose (e.g. https://github.com/mongodb/mongo-php-library/blob/master/src/Operation/CreateIndexes.php)
[14:21:30] <jmikola> that would at least save you the trouble of handling different MongoDB server APIs for the same operations
[14:22:36] <jmikola> however, if you were writing some MongoDB glue to simply do write and read operations, then it'd be easy to stick to the extension. the nuances with commands and GridFS are really the things you'd want the library for
[15:39:26] <direwolf> can i use $slice along with $addToSet?
[15:57:18] <adrian_lc> hi, I'm launching a new replication slave and dunno why "REPL [ReplicationExecutor] Error in heartbeat request to core:27017; HostUnreachable: HostUnreachable" keeps getting spammed in the log
[15:57:32] <adrian_lc> that node "core" is not added to the replica set conf or status
[15:57:45] <adrian_lc> but it's still trying to heartbeat
[15:58:12] <adrian_lc> how can I reset what seems to be a residual connection or something
[16:01:42] <adrian_lc> db.runCommand({connPoolStats: 1 }) does show a core:27017 under hosts
[16:18:35] <jayjo> I've asked a similar question before. If I have a file with many JSON documents separated by newlines, and one of the lines is somehow causing problems with mongoimport, what's the best way to catch the bad line but continue with the import?
[16:21:01] <jayjo> Is mongoimport not really intended for this? When I tried to do it with pymongo, it was so painfully slow to write the documents one at a time. And even writing the documents in batches was not working quickly either
[16:21:23] <cheeser> i don't think mongoimport supports that.
[16:22:26] <jayjo> If you had a similar task, again about 15GB, would you write a script to do this data upload? At the rate I had it last time with pymongo the 15GB upload would've taken over 24 hours... not exactly sure but it was less than 1 GB an hour
[16:22:49] <jayjo> Which seems not totally reasonable
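A sketch of the kind of script discussed above: parse newline-delimited JSON, record bad lines instead of aborting (which plain mongoimport reportedly can't do), and insert in batches rather than one document at a time. Function names are illustrative; the collection is passed in as a pymongo `Collection`, and `insert_many(..., ordered=False)` lets the rest of a batch proceed even if one document is rejected server-side.

```python
import json

def parse_batches(lines, batch_size=1000, bad_lines=None):
    """Yield lists of parsed documents, batch_size at a time.
    Unparseable lines are recorded in bad_lines (if given) and
    skipped, rather than stopping the whole import."""
    batch = []
    for lineno, line in enumerate(lines, 1):
        line = line.strip()
        if not line:
            continue
        try:
            batch.append(json.loads(line))
        except ValueError:
            if bad_lines is not None:
                bad_lines.append(lineno)  # record and move on
            continue
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

def import_file(path, collection, batch_size=1000):
    """Stream a newline-delimited JSON file into `collection`
    (a pymongo Collection). Returns the bad line numbers."""
    bad = []
    with open(path) as fh:
        for batch in parse_batches(fh, batch_size, bad):
            # ordered=False: a per-document server error does not
            # abort the remainder of the batch
            collection.insert_many(batch, ordered=False)
    return bad
```

Usage would look like `bad = import_file("dump.json", client.mydb.mycoll)`; batching this way is typically much faster than per-document `insert_one` calls.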
[17:29:03] <dino82> Uhh how many times does the BTree Bottom Up index need to run? This is the third time it's been [rsSync] Index: (2/3) BTree Bottom Up
[18:00:27] <orev> hi, I've been following this guide to install an app (which requires mongodb): https://help.ubnt.com/hc/en-us/articles/205146080-UniFi-Install-controller-software-to-CentOS One thing it says is that mongo requires 35GB of disk space, but I can't find any other mentions of those kind of requirements. Is that requirement correct? maybe changed with newer mongo releases?
[18:07:49] <StephenLynx> it doesn't require 35gb of disk, though.
[18:08:01] <StephenLynx> I got less than 25gb on my server and way less on my vms.
[19:09:18] <orev> ok, thanks. I'm not sure what that web page is talking about then
[19:36:35] <kurushiyama> orev Get used to that. A lot of "specialists" with "in-depth knowledge" write a lot of bullpoo about MongoDB.
[19:46:31] <orev> yeah, that's everywhere on IT blogs. they're just guidelines, which is why I was trying to verify.
[19:46:47] <orev> I got it to work on a small vm, so I also didn't want to just assume it was ok
[19:56:01] <jayjo> I wrote a script that is super slow with pymongo to upload documents. Anyone have an idea why it's so slow? https://bpaste.net/show/cf39a7515682
[20:04:05] <GothAlice> Hmm, is there an adulterated mime type for MongoDB-flavour JSON? application/json+mongo?
[20:07:24] <GothAlice> jayjo: Wrap that whole thing in a function, slap https://github.com/rkern/line_profiler somewhere (pip install line_profiler), and add the @profile decorator to your function. Save the whole thing somewhere (example.py) and run: kernprof -l example.py, then to look at the results run python -m line_profiler example.py.lprof
[20:07:46] <GothAlice> jayjo: There's "all sorts of concerning" in that code of yours, but that'll be the fastest way to identify if it's JSON loading, or actually saving that's taking so long.
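A minimal sketch of the kernprof workflow GothAlice describes: wrap the hot path in a function and decorate it with `@profile`. That decorator only exists when the script runs under kernprof, so a no-op fallback keeps the file runnable on its own; `load_documents` is a hypothetical stand-in for jayjo's import loop, not his actual code.

```python
import json

# @profile is injected by kernprof at runtime; define a no-op
# fallback so the script also runs without it
try:
    profile
except NameError:
    def profile(func):
        return func

@profile
def load_documents(lines):
    """Stand-in for the slow loop being profiled: line_profiler
    reports per-line timings for this function body."""
    return [json.loads(line) for line in lines]
```

Then, as described above: `kernprof -l example.py` followed by `python -m line_profiler example.py.lprof` shows which lines dominate the runtime.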
[20:10:25] <StephenLynx> GothAlice, wouldn't that be bson?
[20:10:39] <StephenLynx> or mongo does something BEYOND bson?
[20:10:51] <GothAlice> StephenLynx: No, I'm speaking of https://docs.mongodb.com/manual/reference/mongodb-extended-json/
[20:11:43] <StephenLynx> that looks like valid json, though.
[20:11:54] <StephenLynx> or am I missing something?
[20:12:43] <GothAlice> It's JSON with implied meaning through use of $-prefixed keys. That's a "format" way beyond the general one JSON itself specifies. Typically one would use + and a custom format designator to mention the format in the mimetype you are delivering to clients. This preserves "yeah, this is JSON" matching, while also informing as to the structure layered on top.
[20:13:28] <GothAlice> (Or if there's an official one.)
[20:13:43] <StephenLynx> I really don't think there would be a standard for this.
[20:16:18] <GothAlice> StephenLynx: Consider, a client requests a resource with Accept: application/json+mongo, gets back {"_id": {"$oid": "…"}}. To JSON, the whole {"$oid": "…"} bit is meaningless, but given the +mongo context, it's an ObjectId instance. The same resource is requested with Accept: application/json, gets back {"_id": "…"} where the hex-encoded ID is returned as a string.
[20:16:59] <GothAlice> (Since the requesting client can't handle fancy MongoDB encoding, we "do the right thing" and return a value suitable for use as an _id in those cases.)
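What GothAlice describes amounts to content negotiation over the serialization format. A stdlib-only sketch of the idea — the `ObjectId` class here is a minimal stand-in for `bson.ObjectId` (real code would lean on `bson.json_util`), and `application/json+mongo` is her proposed, unofficial media type:

```python
import json

class ObjectId:
    """Minimal stand-in for bson.ObjectId, for illustration only."""
    def __init__(self, hex_str):
        self.hex = hex_str

def render(doc, accept):
    """Serialize `doc` as MongoDB extended JSON when the client
    Accepts the (hypothetical) application/json+mongo type,
    otherwise degrade ObjectIds to plain hex strings."""
    extended = accept == "application/json+mongo"
    def default(value):
        if isinstance(value, ObjectId):
            return {"$oid": value.hex} if extended else value.hex
        raise TypeError(type(value))
    return json.dumps(doc, default=default)
```

With `Accept: application/json+mongo` the client gets `{"_id": {"$oid": "…"}}`; with plain `application/json` it gets the bare hex string, exactly the fallback behaviour described above.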
[20:17:09] <StephenLynx> well, a client can go fuck himself.
[20:19:56] <StephenLynx> if they don't define it as a standard, it's not a standard.
[20:20:08] <GothAlice> (And to be specific, RFC is the _process_ used by the IETF.)
[20:20:54] <GothAlice> So, the document structuring standards I use at work aren't "standards"? Your definition is unduly specific to public, open, broadly accepted standards, but does nothing to invalidate use of the word for smaller systems.
[20:22:20] <StephenLynx> no matter how well-documented and consistent it is, unless it is declared as a standard by a body, it isn't a standard.
[20:22:52] <GothAlice> Standard: a level of quality or attainment; quality, guideline, principle, or, oddly, flag or banner. Sorry, the three dictionaries I have on hand disagree with you. ;P
[20:23:36] <StephenLynx> you are arguing over semantics here.
[20:24:20] <GothAlice> (That's the noun form. The adjective form is still suitably broad: 1. the standard way of doing, normal, usual, typical, customary, conventional, or established. 2. the standard work on the subject: definitive, established, recognized, accepted, authoritative. Doesn't say it has to be a worldwide standard.) Regardless, it's not like I'm exposing MongoDB's built-in REST service, here.
[20:25:11] <GothAlice> I've just been discussing announcement of serialization format / encoding.
[20:33:18] <StephenLynx> why can't you expect the client to always understand extended json?
[20:34:02] <StephenLynx> it's not like the difference between compressed and uncompressed text.
[20:34:06] <GothAlice> Because I have no control over the client, there are many, and each has different capabilities. The same reason I can't expect all browsers to have localstorage enabled: clients differ.
[20:34:33] <StephenLynx> if it understands vanilla json, it understands extended json.
[20:34:53] <StephenLynx> it won't magically implement types that work the same way mongo's do just because it's extended.
[20:35:14] <StephenLynx> it will always have to, one way or the other, work around its own types.
[20:37:43] <StephenLynx> what can happen and what is the alternative?
[20:38:15] <StephenLynx> assuming the client can clearly differentiate between vanilla and extended.
[20:38:30] <GothAlice> The alternative is to, as I've been describing, hand back content to clients in a format they explicitly request. Which they can. Via Accept.
[20:39:47] <GothAlice> My only question has revolved around the "correct" label to use to mark MongoDB extended formatting, but I'll use "+mongo" for now pending the result of that ticket.
[20:45:58] <StephenLynx> and what other format can you send than a string?
[20:46:20] <StephenLynx> what I'm trying to tell you is that it's just an implementation detail.
[20:46:28] <StephenLynx> clients can understand both.
[20:46:48] <GothAlice> I… have a lot I support, so that list can go on for a while.
[20:47:28] <StephenLynx> I am talking about picking between json and extended json.
[20:47:38] <StephenLynx> not between json and something completely unrelated to json.
[20:47:46] <GothAlice> YAML having native support for extended types, there's no particular need to inform the mimetype that an ObjectId is an ObjectId by virtue of convention, since it's explicit in the data. "This is an ObjectId, use class bson.ObjectId to instantiate this."
[20:50:50] <GothAlice> I'm writing a universal REST framework on top of my web framework. My Marrow Mongo package contains the adapter glue needed to REST-ify collections and documents, generically. It has no clue what the final use for these things will be.
[20:51:20] <GothAlice> Sure, there are "schemas" in a loose sense. It doesn't care, it _can't_ care. It must be generic.
[20:52:02] <StephenLynx> I clenched my butthole so hard it achieved critical mass on "REST framework on top of my web framework".
[20:52:13] <GothAlice> https://github.com/marrow/mongo/blob/develop/web/db/mongo/collection.py#L30 < just getting started extracting bits from work and cleaning them up, of course. There be dragons here.
[20:52:28] <StephenLynx> no, it IS a dragon in itself.
[20:52:48] <GothAlice> My web framework is a few hundred lines of code. The REST framework on top of it? https://github.com/marrow/web.dispatch.resource/blob/develop/web/dispatch/resource/helper.py#L11-L24 < that's the developer side of it.
[20:53:10] <GothAlice> The actual REST dispatcher being less than 100 lines of code: https://github.com/marrow/web.dispatch.resource/blob/develop/web/dispatch/resource/dispatch.py#L18-L108
[20:53:28] <GothAlice> I'm not working on Django or anything.
[20:53:55] <StephenLynx> I don't see a difference, though.
[20:54:07] <StephenLynx> it is a web framework abstracting a high-level surface.
[20:54:28] <GothAlice> Have a benchmark just on template generation performance: https://github.com/marrow/cinje/wiki/Benchmarks#python-34 < hint: I'm 17,000x more performant.
[20:54:41] <GothAlice> And… that's just templates.
[20:55:04] <StephenLynx> you are talking about python. no one uses python for performance and certainly the maintainers don't give a hoot about it.