[03:08:00] <nictuku> hi. I have a single server with a database that I want to move to a server across the atlantic (hetzner.de == cheaper). I'm OK with some downtime, but I wouldn't want to wait for a mongodump transfer. What would you recommend? convert my servers to a replica set, then move data that way?
[03:44:17] <Boomtime> @nictuku: "I wouldn't want to wait for a mongodump transfer." -> you have to transfer the data somehow, sooner or later those bits have to traverse the atlantic - a mongodump could at least be compressed for transmission, with scp or similar
[04:14:08] <nictuku> Boomtime: but wouldn't a replication copy the data incrementally?
[06:40:15] <lyze> Hello, I'm encountering a problem with morphia, is this the right place to ask for help?
[06:48:57] <Boomtime> @nictuku: what do you mean 'incrementally'? no matter what you do, the data has got to be transferred - i'm not sure where you think the efficiency is being improved. the only way to get the data to be sent faster is to send _less_ - unless you delete half your data, the only other option is to compress it, à la scp etc
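For reference, the replica-set route nictuku is asking about would look roughly like this in the mongo shell. The hostname and member index are placeholders, and the new member's initial sync still pulls every byte across the Atlantic - it just replaces the manual dump/restore step:

    // On the existing server, after restarting mongod with --replSet rs0:
    rs.initiate()

    // Add the Hetzner box as a hidden member so it cannot become primary
    // while the initial sync copies the data over.
    rs.add({ _id: 1, host: "db.hetzner.example.com:27017", priority: 0, hidden: true })

    // Once it has caught up, promote it; the old member can be removed later.
    cfg = rs.conf()
    cfg.members[1].priority = 1
    cfg.members[1].hidden = false
    rs.reconfig(cfg)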
[06:55:59] <lyze> And if it is, I've tried creating a simple test class as an entity ( http://pastebin.com/i70SwS2p ) which I try to save into the database. However, I get an "IndexOutOfBoundsException" on the "save" method: http://pastebin.com/r65xKGs3
[06:56:56] <lyze> ( Sorry, line 21-22 should be: datastore.save(new Test("blurgh")); )
[08:21:58] <mroman> If I have multiple mongos instances running, do I have to do the sh.addShard() on each of those?
[08:22:41] <mroman> or do they somehow discover each other/replicate data?
[08:30:17] <mroman> hm, looks like they communicate through the config servers somehow.
[09:23:50] <Keksike> Hey, I need some help updating my mongo on Ubuntu. I have 2.6.12 installed now. I run sudo apt update, then sudo apt upgrade mongodb-org, but it says the newest version is already installed
[09:23:57] <kurushiyama> mroman: Set up a config server replset. Fire up a mongos. Add the standalone via sh.addShard()
[09:24:14] <kurushiyama> Keksike: So what is your problem?
[09:27:29] <kurushiyama> mroman: Though it is a Very Bad Idea (tm) to have a shard backed by a standalone instead of a replset.
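As a rough sketch of that setup, run against a mongos (replica-set and host names invented) - this also answers the earlier question, since the config servers propagate the shard list to every mongos:

    // Register each shard once, against any one mongos; the config
    // servers make it visible to all the others automatically.
    sh.addShard("shardA/db1.example.com:27017")
    sh.addShard("shardB/db2.example.com:27017")

    // Sharding is then enabled per database and per collection:
    sh.enableSharding("mydb")
    sh.shardCollection("mydb.sensors", { sensorId: 1 })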
[09:29:01] <gta99_> Hi all, I'm currently trying to install Mongo 3.3 on Debian Jessie, but an apt-get update reports "W: GPG error: http://repo.mongodb.org jessie/mongodb-org/3.3 Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY BC711F9BA15703C6" any idea where to get the correct public key?
[09:30:09] <gta99_> I'm using saltstack to install the repository btw, if that helps :)
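The usual fix for that NO_PUBKEY error is to import the key ID named in the message, assuming the Ubuntu key server carries it:

    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 \
         --recv BC711F9BA15703C6
    sudo apt-get update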
[09:30:18] <Keksike> kurushiyama: the problem is that although I only have 2.6.12 installed, it says that I have the newest version
[09:30:26] <Keksike> although the newest version is 3.2 or whatever
[09:31:55] <kurushiyama> Keksike: From the pov of your operating system that is correct
[09:32:25] <kurushiyama> Keksike: You need to add the repo for the new version.
[09:32:48] <Keksike> kurushiyama: I tried apt update before apt upgrade. Do I need to do something else to add the repo for the newer version?
[09:33:11] <kurushiyama> Keksike: It is NOT a command. You need to add another software repository...
[09:34:18] <kurushiyama> Keksike: https://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/#install-mongodb-community-edition is the result of googling "mongodb install ubuntu" ... ;P
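Condensed from that page, the repo setup for 3.2 is roughly the following (the trusty/14.04 line is assumed here; pick the one matching your release):

    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
    echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2 multiverse" \
        | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
    sudo apt-get update
    sudo apt-get install -y mongodb-org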
[09:34:46] <Keksike> I only tried googling mongodb upgrade version on ubuntu
[09:35:00] <Keksike> im not too familiar with how these things work, so thanks for the help :)
[09:35:59] <kurushiyama> gta99_: Huh? Thought the dev release was only available for OSX...
[09:36:23] <kurushiyama> gta99_: Apparently, I was mistaken
[09:46:10] <gta99_> it seems like 3.3 is the only one available
[09:46:39] <gta99_> I've tried 3.2 wheezy mongodb but it fails with a dpkg configuration error for mongodb-org-server
[09:47:05] <gta99_> and I'm assuming this is due to me using a wheezy package on jessie
[09:49:45] <Keksike> kurushiyama: ok, now I installed 3.2.4 according to the instructions you linked. But my mongodb-org-mongos, -server, -shell and -tools are still 2.6.12. How do I update them also?
[10:02:52] <kurushiyama> Keksike: Well, good question then. TL;DR A bug in the package assembly manifest, most likely.
[10:04:34] <kurushiyama> Keksike: The longer story: Files need to be marked as configuration files during package creation. There are several ways to do that – and the side effects sometimes are complicated.
[10:05:12] <kurushiyama> Derick: off the top of your head, were there major changes in packaging done lately?
[10:14:58] <kurushiyama> Keksike: You might want to stop mongod by doing a "sudo killall -TERM mongod" now, then you want to run "sudo chown -R mongodb:mongodb /var/lib/mongodb", and then you want to start mongodb via the usual means
[10:26:10] <kurushiyama> Keksike: You are welcome.
[10:47:06] <SimpleName> If I have a product with several images, should I design an images model in MongoDB, or use MySQL with columns like img1, img2 .. img10? You can upload at most 10 images per product
[10:49:01] <mroman> How does auth work in sharded clusters with replica sets?
[10:49:29] <mroman> I connect to a mongos instance and create the first admin user.
[10:50:00] <mroman> is that user replicated to all shards/replica sets?
[10:50:11] <Derick> I *believe* credentials are always stored on the primary shard.
[10:50:24] <mroman> or what happens if someone connects directly to a shard
[10:51:00] <mroman> I may be dealing with an architecture where not all servers are really under my control.
[10:51:24] <Derick> "The admin database in a sharded environment will actually live on the config servers."
[10:51:27] <mroman> some companies want to form a common database.
[10:54:58] <kurushiyama> SimpleName: That is entirely your decision, isn't it?
[10:55:34] <kurushiyama> @Derick You are correct with the primary shard thing in case the user is stored in the DB.
[10:55:53] <Derick> my sharding knowledge is a bit shaky
[10:57:25] <kurushiyama> mroman: Authorization with MongoDB's internal means (and that is the problem here) is not granular enough to guarantee separation of data
[10:57:47] <kurushiyama> mroman: Depending on the data and your jurisdiction, it might even be illegal to do so.
[10:59:04] <kurushiyama> @Derick Personally, in a sharded env, I tend to store the users in the admin database, though
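For illustration, creating that first cluster-wide user through a mongos might look like this (user name and role invented). The user lands in the admin database on the config servers, so it is visible to every mongos - but a client connecting directly to a shard member does not authenticate against it:

    // In a mongo shell connected to a mongos, not to a shard:
    use admin
    db.createUser({
      user: "clusterAdmin",
      pwd: "changeme",
      roles: [ { role: "root", db: "admin" } ]
    })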
[10:59:56] <kurushiyama> mroman: I know for sure it would be in Germany, and most likely it would be in the other EU states, as well.
[11:04:09] <kurushiyama> mroman: There are limits defined in what you can do in the BDSG (Bundesdatenschutzgesetz – German for Federal Data Protection Act). One of them is that you have to use all means necessary to make sure the data is safe. Putting the data sets of multiple companies into the same database is not exactly that. Furthermore, in case the data is compromised and multiple companies have potential access, it is very hard to find out who is responsible, making it likely that the provider of said database will be held liable. Each breach of the BDSG might be fined with €50k - €300k, or more, at the judge's discretion.
[11:05:07] <kurushiyama> mroman: And iirc, that is per incident. You better have your check book ready.
[11:06:28] <mroman> yeah but how would that be different from collecting all the data in one company?
[11:06:50] <mroman> those companies provide data from weather sensors, stuff like that
[11:07:07] <kurushiyama> mroman: Aside from that: when multiple companies share the same database, this is usually done for some integrative tasks. The industry learned the hard way that databases are a poor tool for integration, even within one company.
[11:07:25] <kurushiyama> mroman: With multiple companies, it is even worse.
[11:07:44] <mroman> so one of the ideas would be that each company hosts a shard for that data
[11:08:06] <kurushiyama> mroman: It does not sound better the more you explain what you want to do.
[11:30:27] <kurushiyama> mroman: Well, sounds reasonable. I was a bit concerned for the throughput, but unless each sensor sends _a lot_ of data/s, even a USB thumb drive should be fast enough. I am not sure whether sharding is necessary from the beginning, then, since adding shards would be about keeping the sweet spot for data storage costs. However, you can start out with a single shard to be sure everything works as expected, and add shards should the need arise to scale.
[11:32:23] <mroman> sharding would be more to distribute mapreduce jobs
[11:33:12] <mroman> so you can distribute the 0.25TB data to shards and then let mapreduce jobs run
[11:33:40] <kurushiyama> mroman: Well, distributing the data comes at a cost, too.
[11:35:08] <kurushiyama> mroman: You should be careful there. I would first identify whether an m/r is actually needed. As a rule of thumb: if you can do it with the aggregation framework, that is usually the better idea.
[11:36:55] <kurushiyama> mroman: To cut a long story short: Create a test env and try to find the best way.
[11:37:33] <mroman> that's what I'm currently doing :D
[11:38:01] <kurushiyama> mroman: Have a deep look into the aggregation framework, then. A very deep look.
[11:39:42] <mroman> It has a limit of 10% RAM consumption? o_O
[11:40:16] <mroman> that's probably not good if we end up aggregating a few million entries
[11:41:43] <kurushiyama> mroman: You are assuming that you load all entries into RAM?
[11:42:44] <kurushiyama> mroman: Aside from an early match: you can actually use disk space, as m/r does by default. But the aggregation framework offers _really_ nifty functions.
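An illustrative pipeline along those lines (collection and field names invented): the early $match shrinks the working set, and allowDiskUse lets later stages spill to disk instead of running into the memory cap:

    db.readings.aggregate(
      [
        // Filter first, so only the relevant slice is ever processed:
        { $match: { sensorId: "s-42", ts: { $gte: ISODate("2016-01-01") } } },
        // Then aggregate over that slice:
        { $group: { _id: "$sensorId", avg: { $avg: "$value" }, n: { $sum: 1 } } }
      ],
      { allowDiskUse: true }
    )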
[11:45:08] <Mumla> I got a crazy problem which is summarized here: http://stackoverflow.com/questions/36499924/cant-start-mongodb-as-a-service (I hope you don't hate me for posting such a link here...). I'm quite desperate here at my work :(
[11:46:05] <Mumla> can you help me with this problem? that would be aaaaawsome!
[11:48:53] <kurushiyama> Mumla: And you should be. With all the trial and error, your install most likely is pretty screwed.
[11:49:31] <Mumla> I can do a reinstall if that would be part of a solution
[11:50:59] <kurushiyama> Mumla: On several conditions. a) Delete the question on SO. It belongs to dba.stackexchange.com b) Reask it there c) Add OS, its version and the mongodb version d) Success not guaranteed.
[11:52:54] <Mumla> @kurushiyama could be a beginning, yes
[11:53:33] <kurushiyama> Oh, and e) Take notes and answer your own question once we have fixed it ;)
[11:53:43] <kurushiyama> Mumla: The latter is optional.
[12:32:22] <mroman> that article is a little bit short on actual information :).
[12:33:11] <mroman> embedding might be useful for m/r though
[12:33:26] <mroman> if you split your data to avoid embedding you can't just map over it anymore?
[12:33:39] <kurushiyama> mroman: Is it? ;) And no, embedding actually does more harm than good for m/r, generally speaking.
[12:35:51] <kurushiyama> mroman: Uhm. Aside from the fact that map/reduce has little to do with actual maps of the source data, the 16MB size limit is hardcoded. So regardless, you need to adapt your models.
[12:36:28] <mroman> yeah but assume I have a data set with a few rows.
[12:36:50] <kurushiyama> mroman: Read the #Conclusion of that blog post ;)
[12:36:53] <mroman> I can store them as {"dataset":[{row1},{row2},{row3}]} something like that
[12:37:05] <mroman> or I can store each row as a document
[12:37:06] <kurushiyama> mroman: Sure - until you hit the size limit.
[12:37:29] <mroman> now let's say I need to do some analysis on all rows together
[12:37:53] <kurushiyama> mroman: Uhm, that is why you use m/r and/or aggregations...
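A sketch of that contrast, with invented names: each row as its own document avoids the 16MB ceiling, and the "analysis on all rows together" becomes a server-side aggregation instead of an in-document loop:

    // One document per row, tagged with the data set it belongs to:
    db.rows.insert({ dataset: "run-7", value: 3.14 })
    db.rows.insert({ dataset: "run-7", value: 2.71 })

    // Analyze all rows of one data set together:
    db.rows.aggregate([
      { $match: { dataset: "run-7" } },
      { $group: { _id: "$dataset", n: { $sum: 1 }, avg: { $avg: "$value" } } }
    ])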
[15:19:42] <kurushiyama> May well be that some lib was updated...
[15:20:14] <basldex> still, that's a somewhat weird strategy
[15:20:35] <kurushiyama> My bet goes on libboost fs.
[15:20:50] <basldex> as I said. I really don't want to mess around with my production machines' package managing
[15:21:04] <basldex> so 2.4 is like "works if you're lucky, otherwise gtfo"?
[15:21:14] <Derick> kurushiyama: I thought we compiled that in statically - but could be wrong. If it's an *Ubuntu* packaged mongodb, then, I think you're out of luck anyway
[15:21:58] <Derick> they might do weird things too
[15:22:59] <basldex> ok, guess then there's no way to get around installing from different sources
[15:22:59] <kurushiyama> Derick: There is a reason why Ubuntu scares me off like the devil from holy water.
[15:23:29] <Derick> kurushiyama: it's good as a desktop OS, but they make interesting packaging choices
[15:24:21] <kurushiyama> basldex: You almost always want to use MongoDB Inc's official packages. They never failed me, though some of them could be reworked (and yes, I am in the process of doing so) ;)
[15:24:57] <basldex> it is an interesting packaging choice indeed to break important server software in your long term release
[15:25:08] <Derick> LTS is IMO a silly concept anyway.
[15:26:22] <kurushiyama> Derick: As a desktop OS I find it to be too bloated, and as a server OS? Yeah... "interesting" is probably the politest way to describe it. For example the "interesting" choices regarding the proc fs
[15:26:51] <Derick> kurushiyama: I run Debian unstable with xfce. Works For Me™
[15:27:24] <kurushiyama> Derick: OSX and elementaryOS as desktop, CentOS and RHEL as server.
[15:29:25] <kurushiyama> Derick: Erm. I do not want to do finger pointing, but there were _extremely_ nasty bugs in wt integration when 3.0 was released. tbh, since then, I always stay a minor version behind.
[15:29:45] <kurushiyama> basldex: No worries, those problems are fixed now.
[15:30:07] <Derick> kurushiyama: yeah, premature releases do happen.
[15:31:41] <kurushiyama> basldex: And by any chance, unless you have _very_ good reasons not to, you might want to migrate to wiredTiger.
[16:03:00] <bros> Does anybody know if there are plans to add a query/$where to $lookup on aggregation?
[16:35:00] <kurushiyama> I just do not see where we need a JOIN, yet
[16:35:14] <bros> I need to know what locations exist and which items belong to which location.
[16:35:32] <kurushiyama> bros: I'd split that in two questions
[16:36:06] <kurushiyama> bros: Even more so since it is unlikely that you want to display ALL items of ALL locations SIMULTANEOUSLY.
[16:36:20] <bros> kurushiyama: It is for cache rebuilding
[16:36:24] <bros> I have it split into two queries
[16:36:29] <bros> but then it takes 3s to map them together in the API server
[16:37:04] <kurushiyama> bros: What is the use case to have _all_ items of _all_ locations gathered together?
[16:37:36] <bros> kurushiyama: Rebuilding a cache.
[16:38:32] <kurushiyama> bros: Sounds to me like a read-through cache would be more useful to you. But setting that aside: what sort of cache? What are you caching and why?
[16:39:14] <bros> My client stores this cache client-side. It's a JSON blob. I store it in redis.
[16:39:26] <bros> I have a real-time warehouse management app that requires this data.
[16:40:14] <kurushiyama> bros: Set aside the language, but this pretty much explains it: https://github.com/golang/groupcache
[16:41:31] <kurushiyama> bros: And here is what I do not get. Basically, you tend to query either an item and want to find out where to get it. Or, you have a location and want to know what is stored there.
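For what it is worth, the one-shot version bros describes maps onto 3.2's $lookup roughly as follows (collection and field names assumed). Note there is no extra query/$where hook on the join itself in 3.2 - which is exactly what the original question was asking for:

    db.locations.aggregate([
      { $lookup: {
          from: "items",               // collection to pull in
          localField: "_id",           // key on the locations side
          foreignField: "location_id", // matching field on items
          as: "items"                  // joined docs land in this array
      } }
    ])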
[16:42:57] <jr3> is there a way to sum the number of elements in an array from all docs?
[16:43:30] <kurushiyama> jr3: like you have 10 items in doc A and 20 items in doc B, the answer should be 30?
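Assuming the array field is called items, that sum can be done with $size and $group:

    db.docs.aggregate([
      { $project: { n: { $size: "$items" } } },
      { $group: { _id: null, total: { $sum: "$n" } } }
    ])
    // 10 items in doc A and 20 in doc B -> { _id: null, total: 30 }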
[19:07:37] <bros> kurushiyama: no. i need to dump an entire collection, filtered by account_id
[19:11:11] <kurushiyama> bros: Would need detailed structure and expected results. But I am preparing dinner, so maybe tomorrow?
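mongodump's --query flag may already cover that case; something like the following, where the db, collection, and account_id value are placeholders:

    mongodump --db warehouse --collection items \
              --query '{ "account_id": 12345 }' \
              --out /backup/acct-12345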
[19:44:12] <shlant> hi all. will I run into problems dumping a 3.2 db with a 3.0 client and then importing to a 3.2 db?
[19:44:29] <shlant> as in, does the client doing the dump affect the dump itself?
[19:44:58] <shlant> and if so, is that also true the other way around? 3.0 db dumped with 3.2 client and imported into 3.0 db
[20:14:38] <mkjgore> hey folks, just did a restore of our mongo cluster and I've noticed that the machines just don't perform as they used to. Even when sitting "idle" (our app isn't accessing the cluster as far as I can tell) there are times when iotop is showing almost constant reads across all the rep sets (sometimes in the MB/s range for a few seconds)
[20:14:47] <mkjgore> has anyone else seen this when restoring from mongo cloud?
[20:22:00] <crazyphil> kurushiyama: you weren't kidding about trying to export and import the monster setup I have running right now, it's going on 4 days and the import is only at 34%