PMXBOT Log file Viewer


#mongodb logs for Thursday the 7th of April, 2016

[08:03:10] <kurushiyama> dimaj Not that I am aware of. Why do you need it that way?
[08:04:33] <dimaj> Thanks for the reply kurushiyama! at my work, we have several lab networks and a production network. One lab network has access to production, but the other does not.
[08:04:50] <dimaj> it so happened that I have to deploy my application on the network that does not have access
[08:05:19] <dimaj> so, I was hoping that I could trick my application into thinking that I'm talking to the mongodb on the lab network
[08:05:39] <dimaj> but all traffic will be forwarded to the mongodb on the production network
[08:05:43] <dimaj> does that make sense?
[08:06:31] <dimaj> oh, my web app is a node.js application and i'm using mongoose for mongodb queries
[08:06:44] <kurushiyama> dimaj Wait. What?!?
[08:06:54] <dimaj> lol
[08:06:58] <dimaj> ok
[08:07:07] <dimaj> i have 3 networks: A, B, C
[08:07:18] <dimaj> MongoDB is on a machine on network A
[08:07:37] <dimaj> Network A has bi-directional communication with network B
[08:07:52] <dimaj> Network B has bi-directional communication with network C
[08:08:15] <dimaj> My requirement is to have my application deployed on network C and read data from database on network A
[08:09:22] <dimaj> my ops department is super busy right now and they won't get to my ticket for at least another week, if not more
[08:09:42] <dimaj> so, i need to come up with a way to make my application talk to the database
[08:09:51] <dimaj> hope this makes more sense
[08:13:44] <kurushiyama> dimaj I am afraid not. At least not as far as I know.
[08:14:26] <dimaj> thanks kurushiyama. I was afraid that would be the answer :) Thanks anyway
[08:15:08] <kurushiyama> dimaj Well, it is _my_ answer. Others might come up with sth,
[08:15:31] <dimaj> kurushiyama :)
[08:15:43] <Derick> your app needs to be able to talk to every node directly
[08:16:09] <dimaj> Derick, is that a question or a statement?
[08:16:18] <kurushiyama> Derick it is stranger than that... Much stranger
[08:16:47] <Derick> statement
[08:16:48] <kurushiyama> Derick Well, on the other hand it is not.
[08:17:16] <dimaj> I'm basically hoping that one node is going to delegate all of the heavy lifting to another node and then just return the results back to the application
[08:17:29] <dimaj> Derick, ok
[08:18:05] <dimaj> i know this is probably the weirdest usecase :)
[08:18:18] <Derick> dimaj: there is no such thing like that with MongoDB - unless you can get a mongos on that network? I suppose you can't install anything either?
[08:18:27] <kurushiyama> dimaj Well, if you can fire up additional instances...
[08:18:37] <dimaj> i can install
[08:18:43] <Derick> oh, in that case
[08:18:49] <kurushiyama> dimaj Then you could declare one as a config server and one as a mongos
[08:19:06] <kurushiyama> Derick I bow in respect.
[08:19:18] <dimaj> machine on network C is a Cloud Foundry... so, can't install... can only deploy to it... machine on network B, i have root privileges
[08:20:10] <kurushiyama> dimaj But your application is on C. Is that to be taken as layered?
[08:20:20] <dimaj> Derick, kurushiyama, hang on a sec... googling _mongos_
[08:20:26] <Derick> on network B, install a mongos (sharding router). On network C, install a mongod, and 3 mongo config instances. Configure your app on network C to use the mongos on network B. Configure the sharded environment through mongos on network B to talk to things on network A
[08:20:30] <dimaj> what do you mean by 'layered'?
[08:20:55] <kurushiyama> dimaj Just thinking with my fingers.
[08:21:27] <Derick> your issue could be the 3 config servers that you'd need
[08:21:53] <dimaj> does it have to be 3?
[08:22:00] <dimaj> it can't be 1?
[08:22:07] <Derick> dimaj: is your setup a production environment?
[08:22:11] <kurushiyama> Derick Since it is a temporary solution... ... and we are talking of a single shard...
[08:22:12] <dimaj> no
[08:22:16] <Derick> then you can use one
[08:22:29] <dimaj> ok
[08:22:46] <dimaj> so, configuration is done just on network B on the mongos side, correct?
[08:22:52] <Derick> yes
[08:22:57] <dimaj> I don't need to touch mongodb on network A
[08:23:05] <kurushiyama> dimaj Not at all
[08:23:40] <dimaj> and my app won't even know which shard it is talking to? as far as it is concerned, mongodb is that mongos instance?
[08:23:44] <kurushiyama> Derick collections and databases themselves are unaware of sharding, right?
[08:23:48] <Derick> dimaj: correct
[08:23:50] <Derick> kurushiyama: correct
[08:24:23] <dimaj> alright! I know what I'll be doing tomorrow :D
[08:24:29] <Derick> yeah, I think this could work with just one mongos and one config server on network B
[08:24:39] <kurushiyama> I agree.
[08:24:42] <dimaj> Thanks!
[08:24:55] <Derick> warning: this is a hack :D
[08:25:25] <dimaj> so, again, just to clarify... my final setup is going to look like this: Network A (mongodb); Network B - mongos and mongodb; Network C - node app that is talking to mongos
[08:25:44] <kurushiyama> Well, a bit like bellwire and spit, but should do the trick
[08:25:51] <dimaj> Derick, that's fine... it's temporary, until my opts department will have time to address my ticket
[08:26:26] <dimaj> at which time, my machine on Network B gets destroyed
[08:26:39] <kurushiyama> dimaj Correct setup. However, terminology is _really_ important. In B, it will be mongos and a config server.
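For reference, the setup Derick and kurushiyama describe, sketched as commands. This is a hedged sketch for the pre-3.2 tooling discussed here; the hostnames (hostB, hostA), ports, and paths are hypothetical:

```shell
# On the network-B host (hostB):
# 1. Start a single config server (acceptable for a temporary, non-production hack):
mongod --configsvr --dbpath /data/configdb --port 27019 --fork --logpath /var/log/mongocfg.log

# 2. Start the mongos router, pointing it at that config server:
mongos --configdb hostB:27019 --port 27017 --fork --logpath /var/log/mongos.log

# 3. Register the existing mongod on network A as the one and only shard:
mongo --port 27017 --eval 'sh.addShard("hostA:27017")'

# The node app on network C then connects to hostB:27017 as if it were mongod.
```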
[08:27:13] <dimaj> kurushiyama so, mongodb and config server are different things?
[08:27:22] <dimaj> as you can tell, i'm very green with this :)
[08:27:53] <Derick> yes, they're different things
[08:28:12] <dimaj> ok
[08:28:35] <kurushiyama> dimaj mongod is the daemon, which – under certain circumstances – may be used as a config server.
[08:28:59] <dimaj> also, my DB is currently 20GB in size... will all this data be in the config server?
[08:29:04] <Derick> see a config server as the settings a proxy (mongos) needs.
[08:29:17] <Derick> the config server only stores on which shard (only one in your case) your data lives
[08:29:25] <Derick> so the config server will contain very little data in your case
[08:29:31] <Derick> if it's 16 MB, I'd be surprised
[08:29:32] <dimaj> in other words, it's a router
[08:29:39] <kurushiyama> NO
[08:29:53] <kurushiyama> dimaj It is a config server
[08:30:03] <dimaj> ok
[08:30:34] <kurushiyama> dimaj the config server stores the cluster metadata. The router, called mongos, fetches the metadata from the config server.
[08:31:01] <dimaj> ah. ok
[08:31:10] <kurushiyama> dimaj Simplified, it actually caches the metadata and reloads it under certain conditions.
[08:31:26] <dimaj> ok
[08:31:42] <dimaj> how large is the metadata? (any guestimates)?
[08:32:08] <kurushiyama> This is what Derick said: if it were 16MB, not only he would be surprised.
[08:32:27] <dimaj> oh!
[08:32:34] <dimaj> perfect!
[08:33:03] <kurushiyama> Actually, backing up cluster metadata is the only valid use case for mongodump, imvho.
[08:33:52] <dimaj> cool! I've got some reading and configuration to do :)
[08:34:31] <kurushiyama> I feel a bit uneasy with one config.
[08:34:38] <dimaj> kurushiyama and Derick, thanks a lot guys! Really appreciate the help!
[08:34:41] <dimaj> how come?
[08:34:55] <kurushiyama> dimaj Just because the way I learned it.
[08:35:50] <kurushiyama> dimaj There was a guy hammering it into my head "ALWAYS use 3 config servers". But he was referring to a proper cluster setup. You have nothing to worry about.
[08:36:19] <dimaj> ok. cool.
[08:36:37] <dimaj> yeah, my application is an internal tool that won't see the light of day
[08:37:14] <dimaj> if, and when, it'll take off, we'll set up a proper cluster to make sure that it's fast and reliable and not hacky :D
[08:37:22] <kurushiyama> And even if. The worst case scenario would be that you get corrupted metadata. Which, with only a single shard, is pretty easy to overcome.
[08:54:32] <Ange7> Someone can explain why i have error « duplicate key error » but i make an update with $inc and options upsert = true
[08:54:57] <Ange7> i just want to $inc my document, or insert it if it does not exist.
[08:58:34] <kurushiyama> Ange7 can you show some code? In general, this should work. a sample doc would not hurt, either.
[09:05:29] <Ange7> kurushiyama: http://pastebin.com/QrdBHz7H
[09:07:43] <ToMiles> Any reason that when I mongorestore a database into a freshly initiated replset of mongo servers that my db name got prefixed with "b=" ?
[09:08:10] <kurushiyama> Ange7 Most likely, your match does not match the already existing document with _id $key, hence mongod tries to insert a document, but then it finds that a document with $key exists and throws an error.
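A minimal reproduction of the failure mode kurushiyama describes, as a hedged mongo shell sketch (collection and field names hypothetical; requires a running mongod):

```javascript
// A document with _id "k" already exists:
db.counters.insert({_id: "k", flag: false})

// The filter includes _id "k" but also a condition that does not match,
// so nothing is found and the upsert path tries to *insert* _id "k":
db.counters.update({_id: "k", flag: true}, {$inc: {n: 1}}, {upsert: true})
// => E11000 duplicate key error on the _id index
```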
[09:08:28] <kurushiyama> ToMiles User error, most likely.
[09:15:37] <kurushiyama> ToMiles you might want to provide us with the command lines used to dump and restore.
[09:20:56] <Ange7> kurushiyama: i'm not sure I understand: no match, but an existing document?
[09:21:24] <kurushiyama> Ange7 You have a match that is beyond just the _id
[09:21:56] <kurushiyama> Ange7 See lines 5-9 of your code
[09:22:14] <kurushiyama> Most likely, no document matches that
[09:22:29] <kurushiyama> But there _is_ a document with the same id
[09:22:40] <kurushiyama> hence the duplicate key
[09:33:11] <diegoaguilar> Hello, is it possible to get a mongolab free instance acting as slave?
[09:33:23] <diegoaguilar> well I only want replication
[09:33:30] <diegoaguilar> and not sharding
[09:33:38] <Ange7> ok kurushiyama
[09:34:24] <Ange7> so is it possible to match only with _id field and if not exists insert other fields ?
[09:34:47] <kurushiyama> diegoaguilar Not sure if it can open a cursor to the outside world.
[09:36:36] <kurushiyama> Ange7 Well, depends on your use case. db.foo.update({_id:"asdf"},{$inc:{a:1}},{upsert:true}) should always work.
[09:37:06] <Ange7> okay, but how can i set the name field ?
[09:38:20] <kurushiyama> Ange7 In the update document?
[09:38:39] <Ange7> if the document does not exist, it will be inserted. so i need to set the name field too?
[09:38:53] <kurushiyama> db.foo.update({_id:"asdf"},{name:"bar", $inc:{a:1}},{upsert:true}) should always work. Get a spoon.
[09:39:10] <Ange7> okay
[09:39:15] <Ange7> that's cool !
[09:39:18] <Ange7> i will try
[09:39:22] <Ange7> thank you kurushiyama
[09:39:27] <kurushiyama> Ange7 You should _really_ get into the docs.
[09:40:11] <kurushiyama> Ange7 As you are told not for the first time, and not only by me. Really, it does not help you if you get every problem spoon fed. You need to learn how to fish, not how to beg for fish.
[09:44:42] <diegoaguilar> kurushiyama, do u think it would be too difficult to get automatic s3 backup in a custom server?
[09:45:07] <diegoaguilar> well Id like to keep up to 5 last backups ...
[09:45:26] <kurushiyama> diegoaguilar Tbh, if you have to ask, most likely the answer is yes.
[09:46:28] <diegoaguilar> tbh?
[09:47:52] <kurushiyama> diegoaguilar To Be Honest
[09:47:57] <diegoaguilar> oh :)
[10:01:47] <eje_> Hi guys. Is this some kind of bug?
[10:02:01] <eje_> db.HugeColl.find().sort({'$natural':-1}).limit(100).count()
[10:02:02] <eje_> 1098411165
[10:02:28] <eje_> mongo 3.2
[10:04:46] <Ange7> db.foo.update({_id:"asdf"},{name:"bar", $inc:{a:1}},{upsert:true}) should always work. Get a spoon.
[10:04:51] <Ange7> Okay, but doesn't work
[10:04:58] <kurushiyama> eje_ Well, imho it is surprising, at least.
[10:05:11] <Ange7> The dollar ($) prefixed field '$inc' in '$inc' is not valid for storage.
[10:05:36] <kurushiyama> Ange7 That is not php or something, but plain mongo shell
[10:06:14] <kurushiyama> Ange7 Read. the. docs.
[10:06:14] <Ange7> kurushiyama: you say that the problem is the PHP Driver ?
[10:06:22] <kurushiyama> Ange7 Read. The. Docs.
[10:06:45] <kurushiyama> Ange7 You can not simply copy and paste.
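What tripped Ange7 up, for anyone reading the log: a replacement-style update document cannot mix plain fields with $-operators, which is exactly the "not valid for storage" error above. The operator-only form with $setOnInsert does what was wanted. A hedged mongo shell sketch reusing the names from the earlier snippet:

```javascript
db.foo.update(
    {_id: "asdf"},
    {$inc: {a: 1}, $setOnInsert: {name: "bar"}},
    {upsert: true}
)
// If the document exists: a is incremented, name is left untouched.
// If it does not: it is inserted as {_id: "asdf", a: 1, name: "bar"}.
```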
[10:08:20] <eje_> mother of god
[10:08:25] <eje_> should use applySkipLimit for this
[10:08:36] <eje_> mongo is full of shitty surprises
[10:16:44] <kurushiyama> eje_ Well, count actually returns the number of _matched_ docs, iirc.
[10:18:40] <kurushiyama> eje_ So basically, it is correct. limit on the other hand, modifies the cursor. Which is what you want. So call the length method on the cursor, and Bob's your uncle.
[10:19:15] <kurushiyama> eje_ db.hugeColl.find().sort({'$natural':-1}).limit(100).length()
[10:19:37] <kurushiyama> eje_ Which, btw, for hugeColl, should always be == limit.
[10:23:33] <eje_> kurushiyama: thx, i'll remember that
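To summarise the surprise above as a sketch against the shell API of that era (hugeColl as in eje_'s example):

```javascript
// count() reports documents *matched*, ignoring skip/limit:
db.hugeColl.find().sort({'$natural': -1}).limit(100).count()

// Any of these respects the limit:
db.hugeColl.find().sort({'$natural': -1}).limit(100).count(true)  // applySkipLimit
db.hugeColl.find().sort({'$natural': -1}).limit(100).size()
db.hugeColl.find().sort({'$natural': -1}).limit(100).itcount()    // iterates the cursor
```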
[10:31:50] <crazyphil> anyone know if its possible to stripe mongodb data across multiple directories?
[10:33:53] <kurushiyama> crazyphil Sure it is
[10:34:00] <kurushiyama> crazyphil One directory per DB
[10:34:44] <kurushiyama> https://docs.mongodb.org/manual/reference/configuration-options/#storage-options
[10:34:45] <crazyphil> I was looking for a global way to let mongo do it itself, like I can do with Elasticsearch or Kafka (basically you can tell them use these X directories)
[10:35:02] <kurushiyama> crazyphil https://docs.mongodb.org/manual/reference/configuration-options/#storage.directoryPerDB
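The linked option in YAML config-file form (equivalent to --directoryperdb on the command line). Note it only splits storage one directory per database, not arbitrary striping, and toggling it on an existing deployment requires a resync or dump/restore; the dbPath below is hypothetical:

```yaml
storage:
  dbPath: /data/db
  directoryPerDB: true
```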
[10:35:10] <crazyphil> looking now
[10:35:38] <crazyphil> hmm
[10:35:44] <crazyphil> I think I'll just migrate to an array
[10:35:50] <kurushiyama> crazyphil Why do you want that? Unless you have _a lot_ of collections and indices, this is not necessary
[10:36:10] <crazyphil> I have 5 collections that just chewed through 1TB in about 1 month
[10:36:32] <kurushiyama> crazyphil And why do you need multiple directories, then?
[10:36:48] <crazyphil> because then I could just mount 6 drives to 6 directories
[10:36:55] <crazyphil> instead of having to create a software array
[10:37:09] <kurushiyama> You should use LVM anyway.
[10:37:32] <crazyphil> yeah, I should probably stop using mdadm
[10:37:37] <crazyphil> but old habits die hard
[10:37:38] <kurushiyama> crazyphil Since you _really_ do not want to do backups with mongodump on that scale.
[10:38:32] <crazyphil> I'm doing a dump right now just as a precaution
[10:38:36] <kurushiyama> crazyphil Well, time to start a new habit. _Use LVM_! ;)
[10:38:56] <kurushiyama> crazyphil On a TB? So, what's for dinner tomorrow? ;)
[10:39:10] <crazyphil> it's 16% done, in about 15 minutes
[10:39:25] <kurushiyama> crazyphil You sure your data size is 1TB?
[10:39:25] <crazyphil> I gotta fix a problem with one of the dbs anyhow
[10:39:29] <crazyphil> oh yes
[10:39:40] <kurushiyama> Not your file size?
[10:40:15] <crazyphil> oh, it's 925.75
[10:40:22] <crazyphil> GB
[10:41:03] <kurushiyama> crazyphil Again, data or file size? Or, to put it differently: Do you use wiredTiger or MMAPv1?
[10:41:58] <crazyphil> I have no idea, I inherited this setup, and got a crash course on mongo operation
[10:43:14] <kurushiyama> crazyphil Connect to it and get a db.version()
[10:43:30] <crazyphil> 2.6.11
[10:43:41] <kurushiyama> Ah, so it is MMAPv1
[10:44:18] <strepsils> Hey folks, I have a replica set upgrade question: going from 2.4 to 3.2 by adding a 3.2 node to the 2.4 replica set, once the data is replicated over making the 3.2 node PRIMARY, removing the 2.4 nodes and adding more 3.2 nodes. This is skipping the 2.6 upgrade the docs recommend, are there any risks in doing so?
[10:44:21] <crazyphil> it was supposed to be a sharded setup, however when I told the primary db to start replicating, it told me something about how it couldn't because some dbs were local
[10:44:22] <kurushiyama> Which most likely means that you have less than 1TB worth of data. It is just that the datafiles do not shrink.
[10:44:49] <kurushiyama> crazyphil Uhm... that does not really make sense.
[10:44:54] <crazyphil> show dbs has the db size at 950
[10:46:44] <kurushiyama> strepsils Uhm, iirc you are supposed to update to 2.6 first, and not out of fun. Skipping it may render your data unusable, so I would not.
[10:47:25] <kurushiyama> crazyphil uhm. your data size is bigger than your files? We might have a misunderstanding here... ;)
[10:47:41] <strepsils> @kurushiyama the thing is, there's no data corruption… it seems to work
[10:48:28] <kurushiyama> strepsils Well, cool if _you_ want to bet your data on being lazy. I would not. Ever.
[10:50:00] <kurushiyama> crazyphil A sharded cluster may be composed of replsets or standalones (or both). First step is to start a replica set (if you want to), then do the sharding.
[10:51:49] <crazyphil> I'm pretty sure a replica set is there, as all 3 servers show "RS0" at the mongo prompt
[10:52:34] <kurushiyama> crazyphil Please pastebin rs.status() from the primary. Do not forget to anonymize the host names, if publicly available.
[10:54:13] <strepsils> @kurushiyama I don't want to play with data, it's just that after upgrading a few replica sets using this approach I realized we missed the 2.6 step. RTFM didn't happen when it was supposed to :(
[10:55:08] <crazyphil> http://pastebin.com/MYtp2KGw
[10:57:04] <kurushiyama> crazyphil lgtm
[11:00:16] <kurushiyama> crazyphil Well, now you can do a sharding setup. However, you should be very, _very_, _VERY_ careful with choosing a shard key. It is one of the very few things you can really screw up with MongoDB, since it is _pretty_ hard to change.
[11:05:17] <avner_> Hi all, question re ReplicaSet URI setup, from driver side. docs say: "When connecting to a replica set it is important to give a seed list of at least two mongod instances.."
[11:05:18] <avner_> The "at least 2" are needed part, is it so that during first connection I get better HA, or are there runtime implications (because all servers are auto discovered in any case)?
[11:06:19] <cheeser> if you give just one server the driver (at least java's) will only talk to that one server ever. and that one might not be primary at the moment.
[11:06:55] <cheeser> by giving it at least 2, it knows to discover the topology of your cluster and find the primary/mongos machines
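In URI form, the seed list plus set name being discussed looks like this (hosts, database name, and set name are hypothetical). With the replicaSet option and at least two seeds, a conforming driver discovers the remaining members on its own:

```
mongodb://host1.example.com:27017,host2.example.com:27017/mydb?replicaSet=rs0
```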
[11:07:08] <kurushiyama> cheeser The replSet option does not force discovery?
[11:07:36] <cheeser> not in the java driver, no.
[11:09:58] <avner_> I use python, and the discovery works with a single server. I get all 5 in our case.
[11:10:29] <cheeser> do you pass replSet on the uri?
[11:10:35] <avner_> question is, if it’s all good after that in runtime? (e.g. step down scenario).
[11:10:44] <cheeser> i think the behavior for the URI isn't well spec'd atm
[11:10:45] <avner_> Yep, I do @cheeser
[11:10:55] <cheeser> do you get them all without replSet?
[11:11:35] <kephu> hi
[11:11:37] <avner_> nope as far as I remember.
[11:11:46] <cheeser> avner_: ok, good to know. thanks.
[11:12:00] <kurushiyama> kephu Hi
[11:12:11] <cheeser> kurushiyama: i'll double check that with the java driver to be sure but i don't believe it works that way.
[11:12:53] <kurushiyama> cheeser Well, I am not too sure whether I just implied it would or whether it is documented that way in the uri docs.
[11:13:16] <avner_> So, using replicaSet, and just one server in the uri, discovery of all servers works. Do I need to worry because of the docs saying “at least 2”?
[11:13:20] <cheeser> it's worth investigating either way just to be sure
[11:13:29] <kephu> got a kinda tricky question, is there any way to get a result where keys are one field's values, and those key's matching vals are from aggregation's $push?
[11:13:33] <cheeser> avner_: it makes me curious, personally
[11:14:35] <avner_> @cheeser, I suspect the “at least two” is to get better HA at startup time, e.g. first connection, but once the replica set is discovered, it’s meaningless.
[11:15:14] <kurushiyama> cheeser Well, actually it isn't too clear about that. It says that if you only set one server _and_ omit the replset option, the client creates a standalone connection. I have to admit that I would conclude that it creates a replSet aware connection (including discovery) if the replSet option is set.
[11:16:58] <kurushiyama> cheeser But that may be just me.
[11:25:09] <jdo_dk> Hello
[11:33:04] <cheeser> kurushiyama: you might be right. i'll check the source after i get my kids to school.
[11:46:31] <jdo_dk> can i use update, $inc on an array ?
[11:49:49] <kurushiyama> jdo_dk Nafaik
[11:51:19] <jdo_dk> kurushiyama: https://dpaste.de/OzJb
[11:51:23] <jdo_dk> Something like that ?
[11:52:37] <kurushiyama> jdo_dk Do it in your code.
[11:53:57] <jdo_dk> kurushiyama: then i need to do a find, (change in code) and update.
[11:54:09] <jdo_dk> Hoped i could do it directly in one call to mongo. :(
[11:54:15] <kurushiyama> jdo_dk Or rethink your data model
[12:26:12] <jdo_dk> kurushiyama: Any ideas to create a better data model? if i have a model like: id: ID
[12:26:12] <jdo_dk> stats: [
[12:26:12] <jdo_dk> {'key': 1, 'count': 2},
[12:26:12] <jdo_dk> {'key': 2, 'count': 3},
[12:26:12] <jdo_dk> ]
[12:26:37] <jdo_dk> And i want to be able to update counts on each subdoc, or add one if it does not exist
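For completeness: $inc can reach into an existing array element through the positional $ operator; it is only the "add if missing" half that needs a second statement. A hedged mongo shell sketch against jdo_dk's sample model (ID is a placeholder):

```javascript
// Increment the count of the element whose key is 2 (only if it exists):
db.coll.update(
    {_id: ID, "stats.key": 2},
    {$inc: {"stats.$.count": 1}}
)

// If nothing matched, append the element instead; the $ne guard prevents
// adding a duplicate when the first update already succeeded:
db.coll.update(
    {_id: ID, "stats.key": {$ne: 2}},
    {$push: {stats: {key: 2, count: 1}}}
)
```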
[12:54:40] <kurushiyama> jdo_dk Well, the question is whether it makes sense to have a subdoc in the first place.
[12:56:57] <cheeser> kurushiyama: the java driver doesn't seem to even acknowledge the replSet parameter on a URI
[12:57:36] <kurushiyama> cheeser :/ Well, easy to complain, but I think that should not be the case.
[12:57:55] <kurushiyama> cheeser Quite the contrary, actually.
[12:58:26] <kurushiyama> cheeser The way it is documented, a single host plus the param should force discovery. Imho.
[12:59:15] <cheeser> what's the url you're looking at?
[13:00:04] <kurushiyama> https://docs.mongodb.org/manual/reference/connection-string/#urioption.replicaSet
[13:00:38] <kurushiyama> Since the java driver accepts a URI, it should behave accordingly. What do you think?
[13:01:00] <cheeser> oh, there it is. the option name is replicaSet not replSet
[13:01:17] <kurushiyama> cheeser Sorry, was lazy.
[13:01:30] <kurushiyama> cheeser Thought it was clear.
[13:01:52] <cheeser> naming things is one of the 3 great problems in CS...
[13:02:17] <kurushiyama> cheeser Yep, followed by naming and naming, iirc ;)
[13:02:28] <cheeser> no, off by one errors.
[13:02:57] <Derick> and cache invalidation
[13:02:59] <kurushiyama> tbh, I rather debug the latter than have to do the former.
[13:08:42] <cheeser> kurushiyama: confirmed: https://github.com/mongodb/mongo-java-driver/blob/master/driver-core/src/main/com/mongodb/connection/ClusterSettings.java#L189-L189
[13:10:19] <kurushiyama> cheeser lgtm, as far as I understand the code.
[13:10:44] <cheeser> it's your expected behavior
[13:13:04] <kurushiyama> cheeser well, do not know the java driver too well, so I was cautious ;) I was actually pretty astonished that it could be otherwise, as the principle of least surprise seems to be an important one for MongoDB, and the Java driver is supported...
[13:52:06] <myaskevich> Hi guys
[13:52:42] <myaskevich> I have a collection of 46 docs and it takes me ~23 sec. to retrieve one of them by id (IDHACK)
[13:53:21] <myaskevich> When I look at the logs, the query is actually runs 2-6ms, i.e. very fast
[13:53:39] <StephenLynx> cool
[13:53:48] <myaskevich> Where the rest of the time is spent?
[13:53:56] <StephenLynx> you tell me.
[13:54:09] <myaskevich> I'm making my query from a remote machine
[13:54:26] <StephenLynx> maybe your connection is broken?
[13:54:34] <myaskevich> But when I run it locally -> it's retrieved as fast as it shows in the logs
[13:54:49] <StephenLynx> so your connection is really broken
[13:54:58] <StephenLynx> or your application code.
[13:55:19] <myaskevich> It doesn't. I have an another collection with thousands of docs and it pulls out very fast
[13:55:50] <myaskevich> There is nothing to do with my app, I shoot queries in mongo client
[13:56:14] <StephenLynx> what do you get when you ping the machine?
[13:58:16] <myaskevich> the machine is under firewall with only ports open I need
[13:58:25] <myaskevich> ping doesn't give me anything
[13:59:38] <myaskevich> But I can ssh to it without problem
[14:02:50] <StephenLynx> so you can't diagnose the quality of the connection?
[14:03:20] <myaskevich> Connection is good, I'm running site on it and it's quite responsive
[14:03:31] <myaskevich> One thing worth mentioning, I'm using strings as collection._id's
[14:03:40] <StephenLynx> wot
[14:03:40] <myaskevich> Not ObjectId
[14:03:46] <StephenLynx> why
[14:04:12] <myaskevich> To put username in there so it was kept unique
[14:04:20] <myaskevich> I know there are unique indexes
[14:04:40] <myaskevich> Can it cause this slowdown?
[14:04:58] <myaskevich> MongoDB is schemaless, you can put whatever you want in there
[14:05:38] <StephenLynx> yeah, you can, but using username named as _id is counterintuitive.
[14:05:57] <StephenLynx> you might as well name it username and remove the _id fiel.
[14:06:00] <StephenLynx> field*
[14:06:09] <myaskevich> Yeah
[14:06:19] <myaskevich> Look, I just tried querying the same collection by another field (name)
[14:06:23] <myaskevich> And it was fast
[14:06:43] <myaskevich> I think this is it
[14:06:52] <StephenLynx> even if it weren't indexed, you wouldn't notice on a collection with 20 or so documents
[14:06:58] <siddc> Hi, I am trying to troubleshoot a mongodb issue. We have one primary, 6 secondary (voting) and 4 secondary (non-voting) nodes. During replication the secondaries become unresponsive, causing mongodb to break the cluster, and the primary becomes secondary. When they reform the cluster, there is no primary, and they drop their DB and start the replication again; this is stuck in an endless loop. How do we fix this? We are using 2.4.14
[14:06:58] <StephenLynx> so is not that;
[14:07:37] <myaskevich> But why. Look, I can pull the same doc two ways, by using _id or using name fields
[14:07:48] <myaskevich> Using _id takes me 20 sec.
[14:07:56] <myaskevich> Using name is 0.12ms
[14:07:59] <myaskevich> Using name is 0.12sec.
[14:08:11] <myaskevich> _id is indexed name is not
[14:08:29] <myaskevich> How this can be that?
[14:08:58] <siddc> I see status messages like this on the secondary - http://pastebin.com/s2naixGF
[14:09:40] <pamp> Hey, Im working on a MongoDB Cluster Setup...
[14:09:56] <pamp> I've mongod and conf server in the same machine
[14:09:59] <myaskevich> @Stephen?
[14:10:47] <pamp> already created both config files, and upstart job at /etc/init/
[14:11:27] <pamp> and symlinked /lib/init/upstart-job to /etc/init.d/mongo***
[14:12:08] <pamp> but the services are not recognized at service mongod/mongodconf start
[14:12:47] <pamp> if I manual start the service works
[14:13:04] <pamp> any idea what can be the heck?
[14:41:17] <StephenLynx> ¯\_(ツ)_/¯ myaskevich
[14:41:38] <StephenLynx> I don't know what is your mongo version, mongod version, your query, your schema
[14:44:30] <kurushiyama> siddc make sure your can ping each member by name as defined in the replica set config from each member.
[14:45:20] <kurushiyama> siddc That is the first thing I would fo.
[14:45:47] <siddc> Kurushiyama: Yes, I can indeed ping them using their replica set names
[14:46:20] <kurushiyama> siddc You checked all 11 names on all 11 servers?
[14:46:46] <siddc> Kurushiyama: Yup that was one of the first things I checked when starting to TS :)
[14:47:21] <kurushiyama> siddc Ok, what is your replication headroom and oplog window?
[14:48:00] <siddc> Kurushiyama: I am not a DBA so I don't know how to get these parameters. How do i get them?
[14:48:26] <kurushiyama> siddc Uhm... MMS/CloudManager
[14:50:43] <kurushiyama> pamp That sounds more like an Ubuntu problem than like a MongoDB thing, no?
[14:53:40] <kurushiyama> siddc If you do not have it, we are going to have a problem debugging it. Running an 11 member cluster without metrics is like doing an ultra-low altitude flight at supersonic speed with a mobility cane and a street map for navigation. ;)
[14:54:26] <siddc> kurushiyama: I found the MMS tool. Just looking for metrics :). So Replication headroom on one node that I found is 1717S
[14:55:06] <siddc> I dont see a counter for Oplog Window :(
[14:55:08] <kurushiyama> siddc 25 minutes, roughly
[14:56:17] <kurushiyama> siddc Scroll down, you may have to activate the chart
[14:57:01] <siddc> kurushiyama: Right, looking at the Manage Charts thing and I don't see it. I see OpCounters = Repl
[14:58:04] <kurushiyama> Well, let's delay that. How big is your data size?
[14:58:05] <siddc> Found it -
[14:58:09] <siddc> 1717S as well
[14:58:49] <kurushiyama> Your data size?
[14:59:19] <siddc> DB storage size is 108.97GB
[15:01:54] <kurushiyama> siddc Hm... That should be sufficient headroom to do the initial sync, though depending on your setup, it might not be enough.
[15:03:17] <kurushiyama> siddc I take for granted that we are talking of a geodistributed replset?
[15:03:56] <siddc> kurushiyama: No, they are in the same datacenter
[15:04:17] <kurushiyama> o.O You are serious with redundancy, that is for sure.
[15:04:22] <siddc> :)
[15:04:50] <siddc> Our apps use Mongo heavily as well
[15:11:52] <siddc> kurushiyama: So what would be the next step to look at?
[15:12:25] <kurushiyama> Well, locks, IOwait, system integrity would be my next steps.
[15:12:44] <siddc> Yup looking at that. Hopefully something jumps out
[15:13:08] <kurushiyama> My suspicion is that it takes more time to sync than you have.
[15:13:36] <kurushiyama> Which, admittedly is not very likely, but more of a theory that fits the facts
[15:13:43] <siddc> Right
[15:15:19] <kurushiyama> And even if that theory proves to be correct, it is not exactly easy to overcome that problem. My guess is that the primary is overburdened.
[15:16:13] <siddc> And to resolve that, I tried replicating with only 6 of the 10 secondaries but still same issue
[15:17:07] <kurushiyama> siddc Well, as said – pretty hard to impossible to find a solution "remotely"
[15:17:19] <siddc> :(
[15:17:25] <siddc> Thanks for your help :)
[15:18:11] <kurushiyama> You are welcome.
[16:44:55] <Echo6> I have a foundation question about Mongo. As I understand it I can have "nested" collections. I have a customer that has a hotel booking website. If they were storing the hotels in one collection, the customer information in another collection, and the bookings in another collection, and the bookings reference the hotels and the customers, does a copy of the hotel and customer information get duplicated into the booking document?
[16:48:58] <kurushiyama> Echo6 a) No, you can not. You can have embedded documents, which are a double-edged sword: http://blog.mahlberg.io/blog/2015/11/05/data-modelling-for-mongodb/ b) You probably better use references, be they implicit or explicit (you have to decide that per use case). c) https://docs.mongodb.org/manual/applications/data-models/
[16:50:38] <Echo6> I'm not saying that I want to do it that way. I'm checking to see if I understand how Mongo works as there is conflicting information on the internet about it.
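The reference-style alternative to embedding, sketched as plain documents (all names and values hypothetical): the booking stores only _ids, so the hotel and customer data exists exactly once.

```javascript
// Hypothetical documents for a hotel-booking model using manual references.
var hotel    = { _id: "hotel42", name: "Seaside Inn", city: "Brighton" };
var customer = { _id: "cust7",   name: "Ada" };

// The booking references the other documents rather than embedding copies,
// so nothing is duplicated; resolving a reference costs a second query
// (e.g. db.hotels.findOne({_id: booking.hotelId})).
var booking = { _id: "bk1", hotelId: hotel._id, customerId: customer._id, nights: 3 };
```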
[16:57:27] <quadHelix> Is there a way for me to get the last element of a nested array? I love mongo, yet my newbishness shows with this data structure :\ I have an array like: "status" : [
[16:57:28] <quadHelix> {
[16:57:28] <quadHelix> "user" : "19",
[16:57:28] <quadHelix> "status" : "Submitted",
[16:57:28] <quadHelix> "modified" : "2016-04-01 09:24:21"
[16:57:29] <quadHelix> },
[16:57:31] <quadHelix> {
[16:57:33] <quadHelix> "user" : "1",
[16:57:35] <quadHelix> "status" : "Ready to Print",
[16:57:37] <quadHelix> "modified" : "2016-04-05 11:33:44"
[16:57:39] <quadHelix> },
[16:57:41] <quadHelix> {
[16:57:43] <quadHelix> "user" : "1",
[16:57:45] <quadHelix> "status" : "Printing",
[16:57:47] <quadHelix> "modified" : "2016-04-05 11:33:54"
[16:57:49] <quadHelix> }
[16:57:51] <quadHelix> ] I need to see "Printing"
[16:58:33] <kurushiyama> quadHelix USE PASTEBIN!
[16:58:54] <quadHelix> rodger.
[17:00:18] <quadHelix> I can get the results if I know how many entries are in the array using: db.orders.find({'status.2.status':'Printing'})
[17:00:20] <kurushiyama> Echo6 Well, here is the most valuable piece of advice I can give you: Use the official docs _only_ until you feel confident enough to judge the quality of the advice given at other locations.
[17:00:55] <kurushiyama> Echo6 _only_ in the sense of _exclusively_
[17:01:41] <kurushiyama> quadHelix http://blog.mahlberg.io/blog/2015/11/05/data-modelling-for-mongodb/ Most likely, you overembedded.
[17:02:11] <quadHelix> I am realizing this now, I will read the doc - thanks.
[17:03:17] <kurushiyama> quadHelix That is no doc, just one opinion. However, I am a stern supporter of the "be as flat as possible" faction.
[17:06:30] <quadHelix> I have been trying to use slice -1 and elemMatch without success. Flatness is my future friend.
[17:08:09] <quadHelix> I knew I would never hit the BSON limit. It is a glorified log file contained in an order. I will change my model to be flatter.
[17:18:56] <FreeSpencer> is there a way I can make certain that a user can only see certain records where a value = value? In mysql you can kinda do it with views
[17:20:22] <StephenLynx> I don't think so.
[17:20:51] <StephenLynx> others with more experience might confirm it.
[17:20:58] <StephenLynx> but I have never heard of such a feature in mongo.
[17:53:38] <quadHelix> I have been able to search in a nested array using: db.orders.find({'status':{$all:[{$elemMatch:{status:"Submitted"}} ] }} ) Does anybody know how I could limit this to only return the last element?
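Two ways to get just the last status, sketched under the assumption of a collection named orders shaped like quadHelix's paste: server-side, a $slice projection returns only the final array element; client-side it is plain array indexing.

```javascript
// Server-side (mongo shell): project only the last element of `status`:
//   db.orders.find({}, { status: { $slice: -1 } })

// Client-side, on a fetched document:
var order = {
  status: [
    { user: "19", status: "Submitted",      modified: "2016-04-01 09:24:21" },
    { user: "1",  status: "Ready to Print", modified: "2016-04-05 11:33:44" },
    { user: "1",  status: "Printing",       modified: "2016-04-05 11:33:54" }
  ]
};
var last = order.status[order.status.length - 1];
// last.status is "Printing"
```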
[20:34:19] <roonie> mongo 3.2 broke all my tests because i think i can no longer maintain as many connections to the db. any one else encounter this problem?
[21:01:02] <wayne> hi
[21:01:06] <wayne> i have a list of object ids
[21:01:15] <wayne> how can i retrieve all of the objects by id in order?
[21:01:42] <wayne> i don't believe {_id: {$in: [id1, id2]}} would do the trick, because it can be out of order
[21:03:01] <StephenLynx> the best way to do it would be to sort on application code if it's a sub-array.
[21:03:05] <StephenLynx> IMO
[21:03:18] <StephenLynx> since you would have to unwind and use a $sort
[21:03:29] <StephenLynx> which is more expensive than just sorting the array directly.
[21:03:49] <wayne> mm okay thanks
[21:03:55] <wayne> i was considering application code too
[21:04:18] <kurushiyama> wayne You could use sort, couldn't you?
[21:04:33] <wayne> kurushiyama: but i have an arbitrarily ordered list of ids
[21:04:42] <wayne> i just want to pull them in that order
[21:05:06] <kurushiyama> wayne db.coll.find( {_id: {$in: [id1, id2]}} ).sort(_id:1}) ?
[21:05:16] <wayne> that sorts by the id value
[21:05:24] <wayne> but i already have a specified order of ids
[21:05:26] <kurushiyama> wayne sorry, db.coll.find( {_id: {$in: [id1, id2]}} ).sort({_id:1}) ?
[21:05:40] <wayne> e.g. id1 may be actually greater than id2
[21:05:51] <wayne> so with that query, the id2 object would come first
[21:06:27] <kurushiyama> wayne So you want to have the documents returned in the order of the ids given? Why would you?
[21:07:06] <wayne> i have a pre-sorted document array of ids
[21:07:31] <kurushiyama> wayne I just try to understand the use case...
[21:08:09] <wayne> so the sorting is pretty expensive
[21:08:25] <wayne> and i have to use different sorts
[21:08:37] <wayne> so i have a collection of ids sorted in different fashions
[21:08:48] <wayne> and now i'm trying to go from collection containing document with id list
[21:08:52] <wayne> to the actual lists of objects
[21:30:11] <kurushiyama> wayne hmmm
[21:31:00] <kurushiyama> wayne Sorry, dont get a grasp on it
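StephenLynx's application-side suggestion, sketched in plain JavaScript (the ids are hypothetical strings): rank each id by its position in the pre-sorted list, then sort the documents fetched with $in by that rank.

```javascript
// Pre-sorted id list (the order we want back):
var ids = ["id3", "id1", "id2"];

// Documents as returned by find({_id: {$in: ids}}) -- arbitrary order:
var docs = [{ _id: "id1" }, { _id: "id2" }, { _id: "id3" }];

// Rank each id by its position, then sort the documents by rank:
var rank = {};
ids.forEach(function (id, i) { rank[id] = i; });
docs.sort(function (a, b) { return rank[a._id] - rank[b._id]; });
// docs is now ordered id3, id1, id2
```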
[23:51:05] <Freman> hey kurushiyama, just talked our boss into the enterprise support, can you link me to pricing? :D