PMXBOT Log file Viewer

#mongodb logs for Saturday the 20th of August, 2016

[00:00:31] <GothAlice> The tutorial does a great job of breaking the process down into understandable steps, with many links deeper into the reference documentation.
[00:03:44] <Sygin> GothAlice, one question then. my nodejs application's mongodb package supports sharding. why couldn't i use that?
[00:03:49] <Sygin> it supports shard keys and everything
[00:04:12] <Sygin> i dont need to abstract it away with mongos
[00:04:32] <GothAlice> Yes, you do.
[00:04:57] <GothAlice> Your application connects to mongos; any support it has for sharding, sharding keys, etc. is on top of that.
[00:10:00] <Sygin> GothAlice, like for example look here: http://stackoverflow.com/questions/36287097/can-i-connect-mongoose-to-a-sharded-mongodb-instance
[00:10:18] <Sygin> i did this in my system with 2 instances and it worked
[00:10:29] <Sygin> even though the binary "mongos" was not running in my system at all
[00:11:18] <Sygin> all i had to do was mongos: true. and the nodejs mongoose handled it for me. mongos binary never needed to run though
[00:14:29] <GothAlice> I do not think that means what you think it means.
[00:14:44] <GothAlice> Sharding requires knowledge of where the data resides. That option has no provision for recording that information. It's not true sharding.
[00:15:32] <GothAlice> True sharding has a "configuration server" which tracks this. Additionally, running a real mongos service means it can actively rebalance chunks.
[00:16:00] <Sygin> does the real mongos service you speak of appear as a mongod service to the application?
[00:16:07] <GothAlice> Correct.
[00:16:11] <GothAlice> It's transparent.
[00:16:22] <GothAlice> (Again, read the docs.)
[00:16:35] <Sygin> i did. they said the same thing. but i didn't believe it
[00:16:49] <GothAlice> Middleware is not an unusual technique.
[00:17:28] <Sygin> especially this link: https://docs.mongodb.com/manual/tutorial/deploy-shard-cluster/
[00:17:37] <Sygin> it makes it sound more complicated
[00:18:59] <Sygin> also GothAlice, mine is a single server. does it really need all those keys and stuff at the moment? the mongod services will not be exposed to the internet at all
[00:19:12] <Sygin> the only service that will be is the application
[00:19:33] <GothAlice> Hmm. I'm guessing you're also planning on running the application on the same server?
[00:19:49] <Sygin> yeah
[00:19:53] <GothAlice> >_<
[00:20:44] <GothAlice> So yeah, performance and reliability/redundancy can't possibly be a concern. ;) If you firewall the machine such that no mongo process is accessible externally, yes, you can safely choose to not use authentication and keys.
[00:21:07] <Sygin> ok dope
[00:21:29] <Sygin> so all mongos then becomes is just a middleware (joiner) for the two mongod processes
[00:21:38] <GothAlice> You'll only need one mongos.
[00:21:45] <Sygin> yeah
[00:21:49] <GothAlice> And basically, yes. It also handles re-balancing.
[00:21:50] <Sygin> well that part of the sentence
[00:21:53] <Sygin> not plural
[00:21:54] <Sygin> lol
[00:22:12] <GothAlice> (Since the mongos is what's aware of where all the data is.)
[00:22:46] <Sygin> yeah and the 2 mongods are going to carry different data. and the second mongod is going to route to the second hdd ya?
[00:23:22] <GothAlice> In principle, aye.
[00:24:04] <GothAlice> I hope you have three drives, though. One for the OS, two for data. You'd point the third mongod ("configuration server") at the OS drive, then the data mongod's at their respective data drives.
[00:26:18] <GothAlice> (The OS drive doesn't need to be at all beefy.)
[00:27:29] <Sygin> ok im confused
[00:27:49] <Sygin> is mongos the configuration server ?
[00:27:52] <GothAlice> No.
[00:27:55] <Sygin> i thought mongos is handling all that?
[00:27:55] <Sygin> what
[00:27:57] <GothAlice> It's the router. Reference this diagram: https://docs.mongodb.com/manual/sharding/#sharded-cluster
[00:28:47] <GothAlice> Three colours. Blue are the mongos routers, green are the mongod data services, orange is the mongod "configuration server" that the routers use to track which data server contains which data.
[00:29:13] <Sygin> and my app sends requests to the mongos servers right?
[00:29:17] <GothAlice> Correct.
[00:29:28] <GothAlice> mongos then looks up which mongod to talk to via the config server.
[00:29:57] <Sygin> and config servers have info like the shard numbers. like which shard gets more data etc
[00:30:01] <Sygin> what was that word
[00:30:01] <GothAlice> I.e. record 1 is on server 1, record 2 is on server 2, record 3 is on server 1, …
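[Editor's note] The per-record description above is a simplification; a real config server tracks chunk ranges of the shard key, not individual records. A toy Python sketch of that bookkeeping (all shard names and ranges here are hypothetical, purely for illustration):

```python
# Hypothetical routing table: each chunk is (min_key, max_key, shard).
# A real config server stores this as chunk metadata in the `config` database,
# and mongos consults it to decide which mongod should receive a query.
CHUNKS = [
    (float('-inf'), 1000, 'shard0'),
    (1000, 2000, 'shard1'),
    (2000, float('inf'), 'shard0'),
]

def route(shard_key_value):
    """Return the shard whose chunk covers this shard key value."""
    for low, high, shard in CHUNKS:
        if low <= shard_key_value < high:
            return shard
    raise LookupError('no chunk covers this key')

print(route(5))     # shard0
print(route(1500))  # shard1
```

Rebalancing, in this toy model, amounts to moving a range to a different shard and updating the table — which is why losing the config server gives the cluster "amnesia".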
[00:32:00] <Sygin> well i'm not deploying any replica sets
[00:32:19] <GothAlice> Technically you don't need to. That graphic is illustrating an optimal setup.
[00:35:28] <Sygin> i read the docs but i still dont fully understand it
[00:35:32] <Sygin> i suppose i'll have to do it
[00:36:15] <GothAlice> Try it out on your development machine. Get comfortable with the arrangement, configuration, and commands. I often spin up "mini clusters" on my development machine, test things, then just delete the data directories to try again.
[00:36:29] <GothAlice> There's no shame in needing multiple tries to get it right; with everything, practice makes perfect. :)
[00:38:37] <Sygin> ok so GothAlice here https://docs.mongodb.com/manual/reference/program/mongos/
[00:38:45] <Sygin> it shows something like
[00:38:53] <Sygin> configDB: <configReplSetName>/cfg1.example.net:27017, cfg2.example.net:27017,...
[00:39:07] <Sygin> but what if i wanted to just add a shard. not a repl shard
[00:39:48] <GothAlice> Indeed, the default recommendation is to have a replica set of configuration servers, again, because if you have only a standalone one and it fails for any reason, your mongos will get amnesia about where your data is located. However, in your setup, you can still just use a standalone.
[00:40:25] <Sygin> sure but this doc is not showing how to add a standalone (so 2 standalones) in mongos though
[00:40:33] <GothAlice> Just don't specify multiple.
[00:40:36] <GothAlice> ¬_¬
[00:41:06] <GothAlice> configDB: localhost:27019 — you're going to need to configure each of these things to run on a different port to prevent conflicts, of course.
[00:41:29] <GothAlice> (Only the mongos should run on the default mongodb port, since it's the "server" your app connects to.)
[00:42:03] <Sygin> yeah i can make the mongod run on other ports ez
[00:42:42] <GothAlice> If you're trying things out locally, don't worry about messing up commands, give things a try. :) You can always clear out the data directories to start over.
[00:43:16] <Sygin> im trying it but i seem to have an entirely different picture of this than what this actually is so for example
[00:43:30] <Sygin> i thought mongos was all about adding 2 mongod servers into 1 ?
[00:43:48] <Sygin> so but if i mention multiple wouldnt it think that the other one is a replica set ?
[00:44:32] <Sygin> if mongos is one for one for all mongod servers, then what is the point of mongos ?
[00:45:11] <GothAlice> No. mongos is about having multiple mongod (any number) storing data, and one or more configuration mongod tracking where the data is, routing queries, and combining results. You can have as many mongos as you need or want; the diagram illustrates having one per physical application server. You only have one physical server, thus only need one mongos.
[00:45:18] <GothAlice> configDB is not data
[00:45:34] <GothAlice> It's the tracking information.
[00:48:28] <Sygin> GothAlice, 'mongos is about having multiple mongod (any number) storing data, and one or more configuration mongod tracking where the data is, routing queries, and combining results.' how is that not adding bunch of mongods into 1 ?
[00:48:50] <Sygin> you are taking multiple mongod storing data, and combining results into 1
[00:51:17] <GothAlice> The "no" was in response to "if mongos is one for one for all mongod servers". It's not 1:1. Each scales independently. With each application server you add, add another mongos pointing at the same mongod data server(s) and same mongod configuration server(s).
[00:52:21] <GothAlice> And the exact numbers of servers don't matter. You can always add more of them, remove them, etc. It's not 2:1, it's many:many.
[00:52:30] <Sygin> also right now mongos is giving me some weird backtrace so i'm not able to run mongos at all
[00:54:43] <Sygin> http://pastebin.com/hRA4MVzq
[00:54:49] <Sygin> i dont even know whats going on
[00:55:24] <GothAlice> Lines two and three very clearly identify the problem.
[00:55:38] <GothAlice> You have another process running already listening to the port mongos is trying to use.
[00:55:58] <Sygin> i think im going to end myself tonight this is it for me
[00:56:14] <Sygin> lol
[00:57:28] <Sygin> "Surprised to discover that localhost:27017 does not believe it is a config server"
[00:57:31] <Sygin> thats cute btw
[00:58:30] <Sygin> ok i see i sorta get it now
[00:59:08] <GothAlice> Remember: you're going to want _everything_ to run on different ports than normal, and only have mongos listening to the normal port.
[01:00:38] <Sygin> yeah of course. i mistakenly added the config of the mongos copied from the mongod's one and never changed it
[01:00:58] <Sygin> i think that mistake was the easy part to be honest. now i have to figure out how to set up my config servers
[01:01:08] <Sygin> so that it works well with the mongods. but honestly i think i get it now
[01:01:22] <GothAlice> All of that's in the tutorial. For your situation, just set one up and ignore the replication options.
[01:03:16] <Sygin> GothAlice, is it that mongos listens for application calls then looks at config servers for the right mongod instance to send data to ?
[01:03:53] <GothAlice> Effectively yes.
[01:04:12] <GothAlice> Remembering that to your application, mongos is no different than mongod. It's transparent.
[01:09:05] <Sygin> GothAlice, thank you so much for all your help. i think i have the base concept down. so i can figure things out on my own. i mean you have been talking to me for hours so thank you so much for that :D
[01:09:30] <GothAlice> It never hurts to help. I've been hacking away on my complex data modelling problem all day, basically, so it's nice to have a distraction. :)
[01:10:29] <GothAlice> 634 lines remaining to refactor. Yay!
[01:11:11] <GothAlice> Also apologies for heading you down the symlink route; that solves a different problem.
[01:11:41] <Sygin> no honestly right now symlink route is looking more attractive
[01:11:48] <Sygin> lol
[01:11:54] <Sygin> since with that route i still know whats going on
[01:12:01] <Sygin> it will be tedious but it could be done
[01:12:08] <GothAlice> It might look sexy, but it's a lame horse in disguise. It does _not_ do what you want or need.
[01:12:16] <Sygin> and i would let my future self deal with the scaling when the time came
[01:12:18] <GothAlice> It only adds problems to your situation.
[01:12:31] <Sygin> but i get what you are saying
[01:14:39] <GothAlice> http://s.webcore.io/bw1E/%7B874A0A04-E380-466B-8BAB-C85106F4530A%7D-sexy%20pose.jpg
[01:14:53] <GothAlice> The symlink approach is ^ for your problem. ;P
[01:16:26] <GothAlice> Not a comfortable ride. :D
[02:21:31] <jiffe> 'utf8' codec can't decode byte 0xf1 in position 108: invalid continuation byte
[02:21:54] <jiffe> can I either skip or ignore that somehow?
[02:23:44] <Boomtime> that might depend on what language you are using -- what driver and where are you getting that error?
[02:24:34] <jiffe> I am using pymongo
[02:25:00] <GothAlice> jiffe: If you're getting that error, the data isn't utf-8, but is stored as a string. Is that error happening in the pymongo codebase itself, or in your own code, for example, during string manipulation? Also, Python 2 or 3?
[02:26:43] <jiffe> Python 2.7.11 and this is happening during a call to collection.find({ "message_identifier": { '$gte': searchMessageIdentifier } }).sort('message_identifier').limit(self.maxRead)
[02:27:11] <GothAlice> What's the full traceback, and type(searchMessageIdentifier)?
[02:27:43] <GothAlice> Python 3 provides better error messages when accidentally auto-casting strings to unicode, Python 2 less so.
[02:29:50] <jiffe> searchMessageIdentifier is type str and traceback is http://nsab.us/public/mongo
[02:32:05] <GothAlice> Hmm. It's failing to deserialize the response. That's not a good sign. However, the error itself seems to be in deserializing an error being returned. searchMessageIdentifier is being potentially incorrectly treated as unicode; try wrapping it in bson.Binary()
[02:32:31] <GothAlice> I.e. whatever is in searchMessageIdentifier isn't unicode-safe.
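[Editor's note] The error class being discussed is reproducible in plain Python; the byte values below are illustrative, not jiffe's actual data:

```python
# 0xf1 opens a 4-byte UTF-8 sequence, so the byte after it must be a
# continuation byte (0x80-0xBF). A stray 0xf1 followed by ASCII is invalid.
raw = b'caf\xf1e'  # illustrative bytes only

try:
    raw.decode('utf-8')
except UnicodeDecodeError as exc:
    print(exc.reason)  # invalid continuation byte

# If lossy recovery is acceptable, 'replace' substitutes U+FFFD for bad bytes:
recovered = raw.decode('utf-8', errors='replace')
print(recovered)  # caf\ufffde
```

Note this only helps where *you* control the `.decode()` call; as discussed below, when the failure happens inside the driver's response deserialization there is no place to pass `errors='replace'`.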
[02:32:54] <jiffe> this has been working for 177M documents, just hit a response that started doing this
[02:33:46] <GothAlice> Indeed. Either the input you're passing in is in some way invalid (possible), or one of the documents you managed to store contains a text BSON field containing invalid utf-8 data, which is difficult, but can happen.
[02:34:31] <GothAlice> Unless potentially expensive checks are enabled server-side to validate incoming data, it trusts the client is sending well-formed data.
[02:35:54] <jiffe> yeah well I know there's corruption within this database due to disk failures which is why I'm importing into a new clean db
[02:36:19] <jiffe> this is a pretty old db, started out with mongodb 2.0.4
[02:36:20] <GothAlice> If you're able to get an interactive traceback of that (i.e. run the query through a step debugger like pdb/pudb or web-based stack trace tool like WebError or backlash) I'd investigate what the binary data being processed on this line is, exactly: error_object = bson.BSON(response[20:]).decode()
[02:36:48] <GothAlice> Specifically, the value of `response`.
[02:37:11] <jiffe> ok
[02:41:26] <GothAlice> It should be the raw BSON data (str) that is the response, with deserialization starting 20 bytes in. If it actually contains an error message coming from the server, you may have better luck reading it with a debugger this way, given conversion to unicode is failing.
[02:42:06] <jiffe> these are the raw bytes in the response in hex http://nsab.us/public/response
[02:44:07] <GothAlice> jiffe: BSONObj size: -1976527147 (0x8A3096D5) is invalid. Size must be between 0 and 16793600(16MB) First element:
[02:44:29] <GothAlice> You have a broken record. You're going to want to run a --repair (CLI startup option) or runtime repair command on that collection/database.
[02:44:53] <jiffe> well I don't want to repair, this is too big to repair, I am trying to read the documents I can and skip the rest
[02:45:28] <GothAlice> Found by running: print repr(''.join(chr(int(i, 16)) for i in '...paste...'.split(':')))
[02:45:30] <GothAlice> FYI
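[Editor's note] BSON prefixes every document with its total length as a little-endian signed int32, which is how the hex dump above yields the nonsense size in the server error. A Python 3 sketch of both steps (the four size-field bytes shown are inferred from the reported value 0x8A3096D5, in wire order):

```python
import struct

# Python 3 equivalent of the colon-separated-hex one-liner used above:
size_field = bytes(int(i, 16) for i in 'd5:96:30:8a'.split(':'))

# BSON stores the document length as a little-endian signed 32-bit integer.
# Corrupt bytes there decode to a nonsense (here negative) size.
(size,) = struct.unpack('<i', size_field)
print(size)                     # -1976527147, the size from the server error
print(hex(size & 0xFFFFFFFF))   # 0x8a3096d5
```

A sane value must be between 5 and the 16MB document limit; anything else means the record (or the stream position) is corrupt.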
[02:45:50] <jiffe> interesting
[02:46:39] <GothAlice> So, given this is happening during deserialization, and is a server-provided error, the cursor won't survive it. I.e. you can't just try: … except: pass the error away.
[02:47:08] <jiffe> I see
[02:49:21] <GothAlice> You might be able to use an offline mongodump with --repair, or even an "online" one. That option tells mongodump to only dump valid data.
[02:49:24] <GothAlice> Ref: https://docs.mongodb.com/manual/reference/program/mongodump/#cmdoption--repair
[02:50:04] <GothAlice> That is an option if you're using mmapv1.
[02:50:56] <GothAlice> Otherwise… your options are limited, pretty much, to a full service-level --repair, which can be tricky with so much data. After an impactful service event like this, if possible, image the drive, restore elsewhere, and operate on that copy. (Not always possible, I know. T_T)
[08:02:05] <GothAlice> Huzzah. https://jira.mongodb.org/browse/SERVER-25717
[09:31:37] <olivervscreeper> Hi, I'm finding the docs very confusing - if I want database encryption (at rest) and I can't afford enterprise, what are my options?
[09:42:36] <GothAlice> olivervscreeper: OS-level full-disk encryption. Insert note about such protections being largely pointless in cloud environments, too. ;)
[09:43:31] <olivervscreeper> Would doing that basically be the best option to meet data compliance laws though? I agree it's pointless, but hey, gotta make clients happy: @gothalice
[09:43:46] <GothAlice> For sure, that's why I didn't bother elaborating. Practicality beats purity.
[09:43:49] <GothAlice> :)
[09:44:01] <GothAlice> But yeah, FDE is generally the way to go for that.
[09:44:25] <GothAlice> LUKS, for example, on Linux.
[09:45:05] <GothAlice> Also note that SSL is no longer a pure enterprise feature; anyone can enable it. (You're going to want to do that, too, and get some proper pinned certificates in there.)
[09:45:29] <GothAlice> This to secure intra-server and client communication.
[09:48:25] <olivervscreeper> Thanks @GothAlice, I look forward to setting that up :P
[09:49:02] <Sygin> GothAlice, so i talked to my boss and he said that hes ok with getting us a replica set like 3 months into the deployment :)
[09:50:06] <Sygin> that's still something
[09:50:48] <GothAlice> His business, his three months of risk. Better than nothing, for sure, though! :)
[09:51:09] <Sygin> yeah i told him
[09:51:12] <Sygin> might even be sooner
[09:51:14] <Sygin> like 1-2 months
[09:51:43] <GothAlice> olivervscreeper: Minor notes on that: if possible, get 4096-bit RSA, at a minimum 2048-bit; 1024-bit is now generally weak. Avoid certs with only SHA1 hashes; weak to forgery. Don't use self-signed. Avoid EC certs like the plague.
[09:52:07] <GothAlice> (EC = Elliptic Curve)
[09:52:40] <GothAlice> EV (Extended Validation) is probably unnecessary, given the certificate's intended use is purely cryptographic.
[09:53:21] <olivervscreeper> GothAlice: do you know if there's a way of running MongoDB with LUKS or something similar on DigitalOcean? Since it's a VPS and you can't access partitions I may be out of luck and have to move host?
[09:53:37] <GothAlice> O
[09:53:55] <olivervscreeper> I found this which may work: https://www.digitalocean.com/community/tutorials/how-to-use-dm-crypt-to-create-an-encrypted-volume-on-an-ubuntu-vps
[09:54:10] <GothAlice> Eek.
[09:54:47] <olivervscreeper> haha
[09:55:05] <GothAlice> I say that because there's all sorts of reasons to migrate off DO.
[09:55:21] <olivervscreeper> Really? Would love to know :-)
[09:55:29] <GothAlice> (UserVoice hacked and passwords leaked, hundreds of tickets outstanding for multiple years, etc.)
[09:56:59] <GothAlice> Custom images, 2012: https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/3276477-allow-custom-images — Own boot loader, 2012: https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/2814988-give-option-to-use-the-droplet-s-own-bootloader — Doesn't send power management clean shutdown, 2013: https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/4005528-implement-a-clean-shut-down-rather-than-risk-data
[09:57:20] <GothAlice> (sorry for the wrap there) Doesn't send power management clean shutdown, 2013: https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/4005528-implement-a-clean-shut-down-rather-than-risk-data — Physically separated deployment is a crapshoot, 2013: https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/3859618-deploy-to-physically-separated-hardware
[09:57:33] <GothAlice> Nested KVM for running Docker and friends, 2014: https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/3695008-enable-nested-kvm — Can't download backup snapshots, 2013: https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/3829438-download-snapshot-and-or-backup
[09:58:11] <GothAlice> Can't adjust DNS TTL, Feb 2015: https://digitalocean.uservoice.com/forums/136585-digitalocean/suggestions/3592508-dns-configurable-setting-to-raise-and-lower-ttl-v
[09:58:14] <GothAlice> Fin.
[09:58:34] <GothAlice> olivervscreeper: ^
[09:59:15] <GothAlice> And sorry, status on that last one was changed Feb 2015, but it was originally posted Jan 2013.
[10:00:58] <GothAlice> Yeah, I was looking at the wrong field on a few of those. ¬_¬ Silly eyeballs.
[10:01:17] <olivervscreeper> Ok, thanks. If I was to use a VPS anyway, would that DO tutorial be the best approach?
[10:03:07] <GothAlice> olivervscreeper: Answers like http://askubuntu.com/a/97203 would seem to indicate it's reasonable. However, "AES" is not a crypto scheme, it's a base algorithm, and choice of scheme (XTS, etc.) has a huge impact on the security. Make _sure_ it's using XTS; other schemes are vulnerable if used to store at-rest or serialized (such as disks are) data, depending.
[10:03:57] <olivervscreeper> Great, thanks. GothAlice, final question I think, any source on the UserVoice hack out of interest?
[10:04:01] <GothAlice> Serial referring to block-after-block with the same key / IV situations.
[10:05:40] <GothAlice> Happened around or shortly before May 10, 2016, according to my request for account deletion.
[10:06:13] <GothAlice> https://status.uservoice.com/incidents/fb7ml8b3nphf
[10:06:46] <GothAlice> In this particular instance I knew it was DO's UserVoice because that was the only UV site the e-mail address notified had been used on.
[10:06:55] <olivervscreeper> Ah, so a UV incident rather than a DO one. Unless DO own UV or something?
[10:07:15] <GothAlice> http://s.webcore.io/0i3w1B372e08/Screen%20Shot%202016-08-20%20at%2005.57.16.png < the original e-mail notification
[10:07:34] <GothAlice> (I never delete anything ever. ;)
[10:07:54] <olivervscreeper> Seems like that comes in handy!
[10:08:04] <GothAlice> It was a UV incident directly relating to DO's UV support site. Not directly DO, but if you were a customer directed to UV to submit issues and ideas…
[10:08:25] <GothAlice> … and were silly, and used the same e-mail address and password there …
[10:08:47] <GothAlice> The vast majority of Anonymous comments you see there are from users who chose to delete their accounts in response to the incident. ;)
[10:09:51] <GothAlice> https://news.ycombinator.com/item?id=7498861 is worrying, too, but relatively old. Unsurprising the link is dead.
[10:10:33] <GothAlice> https://news.ycombinator.com/item?id=7499299 seems to be an official response.
[10:12:59] <GothAlice> Security is hard. ;)
[12:11:57] <beygibeygi> I have mongodb running on ubuntu, and sometimes it stops working. I wanted to know how I can restart the service using normal user permissions. (I have a program which checks the service status and restarts the service if it does not respond, but it does not have root permissions.)
[16:49:36] <sifat> Hi all - I tried doing this regex escape expression but it's returning null - any ideas? db.logins.findOne({Tradable: {$regex: "DF - DDH \(Fix\)"}})
[20:06:17] <Sygin> is the mongos server the same as a query router ?
[20:06:25] <Sygin> like meaningwise
[20:21:06] <angular_mike> if pymongo receives SIGTERM during execution of insert operation will it wait for it to complete?
[20:25:45] <GothAlice> angular_mike: According to the signal handler code and use of the inShutdown invariant in the concurrency model, yes, this would seem to indicate that in-progress writes are completed prior to a clean shutdown.
[20:26:31] <GothAlice> Not necessarily all journaled writes—that's harder for me to easily dig up—but any that are in the process of actually manipulating real on-disk data. Likely the journal is flushed then simply continued on next start, but don't take my word for it. ;)
[20:27:13] <GothAlice> Similar with the oplog: https://github.com/mongodb/mongo/blob/6bfe3dce5c0722653fa3713bc1a5e77269524f0d/src/mongo/db/repl/initial_sync.cpp#L66-L130
[20:27:28] <GothAlice> (Note the inShutdown checks in the inner loop.)
[20:28:30] <angular_mike> GothAlice: does that mean that it will protect about corrupted records but might not let all records that were submitted to be inserted?
[20:28:58] <GothAlice> Sorry, I'm having difficulty parsing that question. "protect about corrupted records" — could you define what you mean by this?
[20:31:20] <GothAlice> Sygin: Yes, mongos and "query router" are one and the same.
[20:31:34] <Sygin> ok lol thanks GothAlice. btw i set it up fine ^_^
[20:31:42] <Sygin> in a simulated environment
[20:31:43] <GothAlice> Sygin: https://docs.mongodb.com/manual/reference/glossary/ < there is a handy glossary of terms, too. ;)
[20:32:29] <GothAlice> I'm glad to hear it! Valuable experience, setting up "proper" clusters. :)
[20:32:39] <Sygin> on that note GothAlice can you show me a link which shows how to set up config servers ?
[20:32:57] <angular_mike> GothAlice: records that have been created but don't completely match what was submitted by the INSERT query
[20:33:33] <Sygin> for example GothAlice this shows how to do it with mongos: https://docs.mongodb.com/manual/reference/program/mongos/. but there is no such similar link for config servers
[20:33:46] <Sygin> there is this: https://docs.mongodb.com/manual/core/sharded-cluster-config-servers/
[20:33:50] <Sygin> but it doesn't explain all flags
[20:34:15] <Sygin> so i had to use this link to learn how to do things. https://www.digitalocean.com/community/tutorials/how-to-create-a-sharded-cluster-in-mongodb-using-an-ubuntu-12-04-vps
[20:34:21] <GothAlice> Sygin: https://gist.github.com/amcgregor/c33da0d76350f7018875#file-cluster-sh-L80-L89 — this is the command used in my "build a testing cluster" script to start the config servers. https://gist.github.com/amcgregor/c33da0d76350f7018875#file-cluster-sh-L51-L53 are the configuration options used; some, such as keyFile, don't apply to your setup. Refer to the documentation page on mongod startup options to see what all these are.
[20:34:46] <GothAlice> angular_mike: I do not believe what you are suggesting is a thing. Inserts don't "match" things.
[20:35:02] <Sygin> yeah GothAlice but thats your link though. shouldn't the official mongo documentation have a flag list for config servers as well? i think that would be helpful
[20:35:30] <GothAlice> angular_mike: From what I see of the way clean shutdowns are handled, the insert will be fully applied and only on the check for the next operation will it examine inShutdown() and exit.
[20:35:42] <Sygin> i shouldn't have to use 3rd party sites to learn how to set up config servers
[20:35:43] <GothAlice> Sygin: Again, reference the mongod configuration options. They're all listed there.
[20:36:13] <GothAlice> https://docs.mongodb.com/manual/reference/configuration-options/#sharding-options
[20:36:40] <GothAlice> https://docs.mongodb.com/manual/core/sharded-cluster-config-servers/#sharding-config-server
[20:38:33] <Sygin> i see
[20:39:46] <GothAlice> And config server setup is included in the general tutorial: https://docs.mongodb.com/manual/tutorial/deploy-shard-cluster/#sharding-deploy-sharded-cluster < the only difference for your setup is a standalone config server instead of replica set.
[20:39:58] <Sygin> idk, the UI could be improved i think. it was not clear at first glance from the config server link that mongod would be the operator for setting up configsvrs
[20:40:23] <Sygin> but i had to go around to the mongod page to find the config server stuff
[20:40:27] <GothAlice> The tutorial is quite explicit on that, both up front "start each mongod…" and in the code/command examples.
[20:40:29] <Sygin> but idk, just my opinion
[20:41:59] <Sygin> anwyay thank you for all your help GothAlice couldnt have done it without yoyu :)
[20:42:01] <Sygin> you*
[20:42:02] <Sygin> anyway*
[20:42:09] <GothAlice> It never hurts to help. :)
[21:38:07] <Mattx> Hi
[21:38:36] <Mattx> I have a huge collection, and I'm getting an error "Overflow sort stage buffered data usage of ..."
[21:38:59] <Mattx> After that I created an index on the field I use to sort the results, but I'm still getting that error
[21:39:29] <Mattx> Should I do anything else given that the data was already on the collection before I created the index?
[21:39:46] <Mattx> Maybe update the index with all the existent data?
[21:44:26] <cheeser> in your aggregration?
[21:49:37] <Mattx> cheeser, it's a find command that is failing
[21:50:00] <Mattx> I would also increase the memory limit, that would work for me
[21:50:09] <Mattx> I'm trying to find out how
[21:50:43] <GothAlice> Mattx: Many memory limits are hardcoded at compile time (such as the 16MB document size limit) in order to reduce the likelihood of accidental denial of service.
[21:51:06] <Mattx> So how do I get around this?
[21:51:33] <Mattx> {"find"=>"ODP", "filter"=>{}, "projection"=>{"_id"=>0}, "sort"=>{"id"=>1}}
[21:51:36] <Mattx> This is the query ^
[21:51:56] <GothAlice> I… can't even parse that. That's a strange looking query.
[21:52:05] <GothAlice> Are you able to reproduce it using the mongo shell?
[21:52:12] <Mattx> I have multiple collections with around 10k documents each, I don't expect ODP to be bigger than that
[21:52:27] <Mattx> Don't know, I haven't tried
[21:52:30] <Mattx> Let me see
[21:52:51] <GothAlice> The mongo shell is a convenient lowest common denominator for problem solving; keeps everyone on the same page. :)
[21:57:42] <GothAlice> cheeser: Imagine my surprise: https://jira.mongodb.org/browse/SERVER-25717 ;)
[21:58:38] <GothAlice> Seems like a ball was dropped somewhere when $position was added. ^_^;
[21:59:43] <cheeser> GothAlice: i, myself, ran in to a bug in $facet. :)
[22:00:48] <GothAlice> cheeser: Jira link? I'm getting flashbacks to EAV looking up the general feature. XD
[22:02:24] <cheeser> i'm not sure there's a jira, yet. i ran into it building a demo for a talk last week.
[22:02:57] <GothAlice> Looks quite handy, though. Something to look forward to.
[22:03:03] <cheeser> very much so
[22:09:08] <Mattx> I believe the query is translated to this: db.ODP.find(null, {_id: 0, id: 1}).sort({id: -1})
[22:09:16] <Mattx> But somehow on mongodb's shell it works
[22:09:25] <Mattx> 7247 results
[22:10:51] <Mattx> after showing the first 10-20 results it says "Type 'it' for more", maybe a limit is applied and that allows the query to finish successfully
[22:11:44] <cheeser> all queries are batched. the default is 20. i think in every driver.
[22:12:10] <cheeser> in any case, it's the sort stage which is failing and that's well before any batching of results.
[22:12:54] <GothAlice> Mattx: All queries are returned in batches to the client. What language and driver is your application code using? (I've never seen a query like that, and don't know what "ODP" is.)
[22:13:28] <cheeser> GothAlice: it's because i don't use python. :D
[22:16:36] <Mattx> GothAlice, ODP is just a collection within a database. I'm using the ruby gem/driver https://github.com/mongodb/mongo-ruby-driver
[22:16:46] <Mattx> And the query I showed is what is on the log just before the error
[22:17:12] <Mattx> (It's not a query, it's the log of the query)
[22:20:41] <GothAlice> It's odd that you have both _id and id in the same document, instead of using the value of id as _id, but that shouldn't really impact this. Alas, I don't Ruby. :( What seems to be the problem, here, is that you have a very, very large document somewhere in the results, or potentially all of your records are quite large. Make sure you have an index covering the field you are sorting by.
[22:20:44] <GothAlice> Mattx: ^
[22:20:57] <Mattx> I see what happens!
[22:22:24] <Mattx> the problem is with the index, I thought I had an index for the "id" field on the ODP collection, but I created it on another collection by mistake it seems
[22:22:33] <cheeser> heh
[22:22:37] <Mattx> at least I can't find it on the system.indexes collection
[22:23:01] <Mattx> regarding the "id" issue, isn't it created by mongo by default?
[22:23:11] <GothAlice> _id is created by mongo if your application doesn't provide its own.
[22:23:13] <Mattx> I really don't use that field, I created one on my own with a sequence number
[22:23:17] <GothAlice> It's also automatically indexed.
[22:23:38] <Mattx> I would use _id if I can specify the value myself
[22:23:40] <Mattx> is that possible?
[22:24:38] <GothAlice> Having your own _id isn't uncommon, but there are important reasons ObjectIds exist which are worth understanding. Notably, unlike auto-increment IDs, ObjectIds can be created by each app worker independently without central locking or authority, whereas an integer ID requires central management to ensure the same ID isn't given out twice. It also saves you from needing any creation time field: https://docs.mongodb.com/manual/reference/method/ObjectId/
[22:24:39] <Mattx> for the first record I see the id is ObjectId("57b87009c7c5bdfe6c4b2916")
[22:24:43] <Mattx> what on earth is that?
[22:24:52] <GothAlice> See the link I just gave. :)
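The ObjectId Mattx pasted can be unpacked without a driver: per the linked reference, the first 4 bytes (8 hex characters) of an ObjectId are a big-endian Unix timestamp, which is why GothAlice says it saves you a creation-time field. A minimal Node sketch of the decoding (no MongoDB connection or package needed):

```javascript
// Extract the creation time embedded in a MongoDB ObjectId hex string.
// The first 4 bytes (8 hex characters) are a big-endian Unix timestamp.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('not a 24-character hex ObjectId');
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// The ObjectId from the log above:
const created = objectIdTimestamp('57b87009c7c5bdfe6c4b2916');
console.log(created.toISOString()); // → 2016-08-20T14:58:17.000Z
```

The remaining bytes (machine/process identifiers and a counter) are what let each app worker mint unique IDs without coordinating.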
[22:25:00] <Mattx> thanks!
[22:30:52] <Mattx> I guess I can start using that, it makes sense
[22:32:02] <GothAlice> Range querying to select records by creation time is really awesome. :3
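The range-by-creation-time trick works because a boundary ObjectId can be built from any Date and used in a query such as `{ _id: { $gte: boundary } }`. The driver's bson library provides `ObjectId.createFromTime` for this; the following is a dependency-free sketch of the same idea:

```javascript
// Build the smallest possible ObjectId hex for a given Date, suitable as
// a range-query boundary: 4 timestamp bytes, then zeroes for the
// remaining 8 bytes.
function minObjectIdFor(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, '0') + '0000000000000000';
}

// Smallest ObjectId for 2016-08-20 00:00:00 UTC:
const boundary = minObjectIdFor(new Date(Date.UTC(2016, 7, 20)));
// boundary === '57b79d800000000000000000'
```

Any document whose `_id` sorts at or above that boundary was created on or after that instant.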
[22:32:41] <umdstu> i’m trying to set up and access a database with a user. i’ve created both, but when I call `find` on one of the tables I get `not authorized on test_db to execute command { find: "users", filter: { name: null }, limit: 1, batchSize: 1, singleBatch: true }` even though the user was given roles dbOwner and readWrite
[22:34:35] <Mattx> I need to update 15GB of records if I want to migrate the current documents and start using _id T_T
[22:34:50] <Mattx> all spread across 3 databases with 100 collections each
[22:35:03] <Mattx> (more or less)
[22:39:02] <GothAlice> Mattx: Bulk update operations will be your friend. You can $set and $unset in the same operation.
[22:39:52] <GothAlice> I.e. load up the record, $set the correct field, $unset the old one, repeat, then tell the bulk update to execute. I'd definitely batch results, and only project the field you need to read from to avoid transferring that full 15GB over to your application or mongo shell.
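The load/$set/$unset/execute loop GothAlice describes can be sketched with the Node driver's bulk API. The field names below (`migratedId` as the target) are hypothetical stand-ins for Mattx's schema, and only the operation documents are built here; in a real migration you would pass `ops` to `collection.bulkWrite(ops)` in batches over a live connection:

```javascript
// Sketch: migrate a custom `id` field into a new field using bulk ops.
// Each operation copies the old value with $set and removes it with
// $unset, matching the pattern described above.
function buildMigrationOps(docs) {
  return docs.map(doc => ({
    updateOne: {
      filter: { _id: doc._id },
      update: {
        $set: { migratedId: doc.id }, // hypothetical target field
        $unset: { id: '' },
      },
    },
  }));
}
```

Note that `_id` itself is immutable, so actually adopting ObjectIds as `_id` means inserting new documents (and removing the old ones) rather than updating in place.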
[22:41:41] <Mattx> Will check all that, thank you
[22:49:57] <atomicb0mb> good night gentlemen, is there a way that i can have a callback from db.close? I tried db.close(function(err, success) { console.log(success) }) but nothing happens... is that possible? Thank you
[22:52:19] <GothAlice> atomicb0mb: close() returns a dummy promise which you can .on() with, but there's no point. It's an immediate, not asynchronous operation.
[22:52:34] <GothAlice> https://github.com/mongodb/node-mongodb-native/blob/2.2/lib/db.js#L354-L386
[22:53:07] <atomicb0mb> i just wanted to check how many seconds my app got connected... and i thought to set time out on close
[22:53:11] <atomicb0mb> i will read about it
[22:53:44] <GothAlice> atomicb0mb: As the Db object does, you can listen to the topology 'close' event yourself, seen here: https://github.com/mongodb/node-mongodb-native/blob/2.2/lib/db.js#L162
[22:54:02] <GothAlice> (But that's whole-connection close, not per-db close.)
[22:55:17] <atomicb0mb> all right :) Thanks buddy
[23:25:24] <Sygin> GothAlice, hey
[23:28:52] <Sygin> so if i have a config server with one shard attached to it that has a lot of data in it, and i add a new shard, will it automatically redistribute all the data 50/50?
[23:29:23] <Sygin> also if i have a lot of data in a system, and i add a replica set and specify it is a replica set, will it automatically replicate the data over?
[23:29:27] <Sygin> ty in advance
[23:30:43] <GothAlice> Sygin: The router (mongos) does the rebalancing, but yes. The mongod running as a config server just tracks which shard contains which record and stores the locks the router uses when doing chunk migration between shards. (Such as when you add a new one.)
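Why adding a second shard ends up roughly 50/50 can be illustrated with a toy model of the bookkeeping involved: chunks map to shards, and a rebalance moves chunks off overloaded shards until counts even out. This is emphatically not how mongos and the config server actually implement it (real balancing involves chunk ranges, migrations, and distributed locks); it only sketches the idea:

```javascript
// Toy model of balancer bookkeeping: `chunkOwners` lists which shard
// owns each chunk; rebalancing moves chunks from overloaded shards to
// the least-loaded one until no shard exceeds its fair share.
function rebalance(chunkOwners, shards) {
  const target = Math.ceil(chunkOwners.length / shards.length);
  const counts = Object.fromEntries(shards.map(s => [s, 0]));
  chunkOwners.forEach(s => { if (s in counts) counts[s] += 1; });
  return chunkOwners.map(owner => {
    if (counts[owner] > target) {
      // Migrate this chunk to the least-loaded shard.
      const dest = shards.reduce((a, b) => (counts[a] <= counts[b] ? a : b));
      counts[owner] -= 1;
      counts[dest] += 1;
      return dest;
    }
    return owner;
  });
}

// One shard holds all four chunks; add shardB and rebalance:
const after = rebalance(['shardA', 'shardA', 'shardA', 'shardA'],
                        ['shardA', 'shardB']);
// after → two chunks on each shard
```

In the real system the config server records the chunk-to-shard mapping and the balancer performs these migrations incrementally in the background.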
[23:31:57] <Sygin> ah dope. thanks GothAlice
[23:32:05] <GothAlice> No worries. :)