#mongodb logs for Tuesday the 31st of May, 2016

[03:19:05] <Keverw> Hey. I'm playing around with a local cluster just to learn.
[03:19:37] <Keverw> How do I include the shardkey in the query? I attempted shardkey: '_id' and shardkey: {_id: 'hashed'}? Wondering for both updateOne and findAndModify. Trying to figure it out
[03:20:38] <Boomtime> the shard key is just a name given to the field (or fields) you used to shard the collection on
[03:21:06] <Boomtime> you don't say it explicitly, "including the shard key" just means that you include those fields in the predicates as normal
[03:21:45] <Keverw> oh. I see. So if I want to update only one record by the id, I'd include _id in my query then.
[03:21:53] <Boomtime> if your shard key is _id (often a poor choice) then your queries are better off including _id
[03:22:02] <Boomtime> right
[03:22:25] <Boomtime> update( { _id: 1 }, { $set: { some_field: "some value" } } )
[03:22:36] <Boomtime> if your shard key is _id then that will be a targeted update
[03:22:50] <Keverw> Oh i see. When I was reading the doc and the error message, I was thinking it meant including a field containing what key you used, but not the actual key. So that makes sense
[03:22:59] <Boomtime> if your shard key is anything else then it will be a scatter update
[03:23:26] <Boomtime> :D
[03:25:15] <Keverw> Sweet. That makes sense. So I guess in my idea I'd need to query the user by name to get the ID, then update the user.
[03:25:54] <Keverw> Thanks :)
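
[A minimal mongo shell sketch of the targeted operations discussed above; the collection name "users" and the ObjectId value are hypothetical, not from the log, and it assumes a collection sharded on _id.]

    // The filter includes the shard key (_id), so mongos can route the write
    // to a single shard (a "targeted" update rather than a scatter update).
    db.users.updateOne(
      { _id: ObjectId("5735a8e7e4b0c1a2b3c4d5e6") },
      { $set: { some_field: "some value" } }
    )

    // findAndModify on a sharded collection likewise needs the shard key in its query.
    db.users.findAndModify({
      query: { _id: ObjectId("5735a8e7e4b0c1a2b3c4d5e6") },
      update: { $set: { some_field: "some value" } }
    })
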
[03:56:03] <kashike> hrmph. morphia documentation says to use @Embedded, but it doesn't seem to be required (at least in 1.1.1)?
[07:10:35] <skullcrasher> can I use mongodb driver version 3 with mongodb 2.x server?
[07:10:44] <skullcrasher> or do I have to use driver v2 too
[07:16:19] <Boomtime> what language skullcrasher?
[07:16:24] <skullcrasher> java
[07:16:44] <Boomtime> https://docs.mongodb.com/ecosystem/drivers/java/#java-driver-compatibility
[07:16:47] <Boomtime> computer says yes
[07:17:10] <skullcrasher> ah :D
[07:17:13] <skullcrasher> Boomtime, thx
[10:10:42] <ren0v0> Hi, why is there an --exclude-collection but mongodump doesn't support --collection 1,2,3 or similar ?
[10:11:30] <Derick> I don't know why, but it should be easy to script in bash
[10:11:45] <Derick> for i in col1 col2 col3; do mongodump .... --collection $i; done
[10:13:07] <kurushiyama> ren0v0: mongodump is meant as a backup tool, not something for data export, iirc. So basically, you would want to include what you have.
[10:13:08] <ren0v0> Derick, yea no doubt i can write a script for it, it just seems odd
[10:13:40] <ren0v0> kurushiyama, i want to backup, but not everything. --exclude-collection is nasty imo as you have to keep your command updated if you add new collections for example.
[10:13:59] <ren0v0> surely it would have been better to do it the other way round ?
[10:14:24] <kurushiyama> ren0v0: As said: Usually, you want to make sure _all_ your data is backed up. And _especially_ if you add new collections, you want to have them included by default.
[10:14:55] <ren0v0> well that's a good assumption, but doesn't apply in my case
[10:15:12] <ren0v0> what if someone wants to use --archive and specify certain collections to store on different drives?
[10:15:29] <ren0v0> just throwing ideas out, but seems it doesn't apply to everyone
[10:15:39] <Zelest> </sarcasm>
[10:15:54] <kurushiyama> > "but doesn't apply in _my_ case" Go figure ;)
[10:15:56] <Derick> Zelest: it's not the worst idea i've heard of today :)
[10:16:00] <Zelest> haha
[10:16:24] <Zelest> better now? :D
[10:16:24] <ren0v0> kurushiyama, software is never improved by real world user suggestions, is it ;)
[10:16:31] <schlitzer> hey, is there a rsync mirror for the mongodb repository?
[10:16:57] <Derick> schlitzer: not that I know of
[10:17:04] <Derick> git clone is your friend
[10:18:00] <kurushiyama> ren0v0: Users are a necessary evil to feed developers while doing what they love ;P
[10:18:10] <ren0v0> kurushiyama, this would also make sense if mongorestore had the ability to restore certain collections
[10:18:23] <ren0v0> doesn't it make the new "--archive" pointless also?
[10:18:36] <ren0v0> loop over single --collection commands, and end up with what
[10:18:50] <ren0v0> multiple .gz, so --archive is rendered useless too
[10:19:00] <schlitzer> Derick, i am talking about rpm/deb repository, not git :-)
[10:19:20] <Derick> schlitzer: ok. I doubt it though
[10:19:40] <schlitzer> ok, thank you
[10:29:02] <uehf> Hi guys. I have a question about repairDatabase in a shard. If I start it on a slave to reclaim disk space after deleting data, will it block the master or mongos or not?
[10:34:05] <uehf> Or maybe you know the best way to reclaim space after deleting a table?
[10:38:41] <kurushiyama> schlitzer: iirc, rsync works on http urls as well.
[10:40:13] <schlitzer> kurushiyama, no, rsync won't work with http/https; rsync or ssh is required for syncing with remote resources
[10:43:20] <kurushiyama> uehf: As dumb as it may sound: if you have to reclaim disk space in a sharded environment, you most likely failed several stages earlier: proper dimensioning, proper monitoring and alerting. Using repairDatabase is quite a tricky operation. It only guarantees that your instance will be in a usable state after it finished. If there are problems with your data, there might be documents missing. If you have a replica set, simply take one secondary offline, delete the _content_ of the dbpath and let it resync, rinse and repeat for all secondaries. Issue an "rs.stepDown()" on the primary, then repeat the process there.
[10:44:39] <kurushiyama> schlitzer: wget -N ?
[10:45:10] <kurushiyama> schlitzer: Or simply use a caching proxy.
[10:46:21] <uehf> kurushiyama: yep, kinda bad planning. Also we are using WiredTiger. So can we just use compact on a slave and then make it primary, repeating it on the master?
[10:46:34] <uehf> kurushiyama: it will not block the whole shard?
[10:47:22] <kurushiyama> uehf: Not as long as you make sure you do not run stuff on the current primary. BUT: You should not have to reclaim disk space on WT
[10:48:29] <uehf> kurushiyama: why?
[10:49:03] <kurushiyama> uehf: Because it releases disk space back to the OS.
[10:49:39] <kurushiyama> uehf: Once again: repairDatabase is _not_ to be taken lightly.
[10:53:38] <uehf> kurushiyama: sorry, I've read the manual again. We don't have enough space to use repair. So compact is our only choice
[10:54:05] <kurushiyama> uehf: Why do you need to compact?
[10:55:36] <uehf> kurushiyama: to release space after reducing collection
[10:56:09] <kurushiyama> uehf: That is the cause, not the reason. Why do you need to release that space?
[10:56:10] <uehf> kurushiyama: for example we have data for two years and now we want to keep data for the last 2 months only
[10:56:17] <kurushiyama> uehf: So?
[10:57:23] <uehf> kurushiyama: because I don't want to see space alert in my monitoring
[10:58:09] <kurushiyama> uehf: Strange things happening here. But your call.
[10:58:17] <uehf> kurushiyama: and if some dev will decide to dump something - he will have enough space for it
[10:58:50] <uehf> kurushiyama: so compact will not block master?
[11:00:39] <kurushiyama> uehf: Again, there is no such thing as a master in a replica set. Make sure you do not run administrative commands on the primary at that point in time. My suggestion is to not run any administrative command _at all_ until _you_ understand the implications, which are all laid out in the docs.
[11:01:12] <uehf> kurushiyama: thank you
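
[A hedged mongo shell sketch of the one-member-at-a-time approach discussed above; the database and collection names are hypothetical.]

    // Connect directly to a SECONDARY, switch to the database that holds the
    // shrunken collection, and compact it. compact blocks operations on that
    // member while it runs, which is why it stays off the primary.
    use mydb
    db.runCommand({ compact: "events" })

    // Once every secondary has been compacted, ask the primary to step down so
    // a compacted member takes over, then compact the former primary as well.
    rs.stepDown()
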
[14:22:59] <catphish_> i'm trying to add a server to a replica set but i can't work out how to configure the new slave to actually join the cluster
[14:23:35] <catphish_> i've run rs.add() on the current master, but i can't find any docs on what needs to be run on the new member to have it connect to an existing member of the cluster and pull config
[14:24:32] <kurushiyama> catphish_: congrats, you are already done.
[14:25:04] <kurushiyama> catphish_: And there is no "master". it is a "primary". Terminology is important.
[14:25:37] <catphish_> unfortunately my new member has no way to know where the cluster is afaik, it's not joined it as far as i can see
[14:26:12] <catphish_> or should i expect the current primary to connect to the new member and initiate the process?
[14:26:59] <catphish_> ooh, i see, that's exactly what should happen!
[14:27:11] <catphish_> it only failed because my new member was not configured to listen on the right IP!
[14:27:13] <catphish_> thanks
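
[A minimal sketch of the flow catphish_ describes above; hostnames, port and the replica set name are hypothetical.]

    // On the new member, mongod only needs the same replica set name and an
    // address the existing members can reach (mongod.conf):
    //   replication:
    //     replSetName: rs0
    //   net:
    //     port: 27017
    //     bindIp: 10.0.0.13   # an interface the other members can connect to
    //
    // Nothing else is configured on the new member; on the current PRIMARY:
    rs.add("mongo3.example.com:27017")
    // The primary then contacts the new member and initial sync starts on its own.
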
[14:27:41] <kurushiyama> catphish_: You should not use IP addresses for your members.
[14:28:07] <kurushiyama> catphish_: Not even hostnames. Best practice is to use CNAMEs
[14:28:21] <catphish_> i strongly disagree with the use of cnames over hostnames
[14:28:26] <catphish_> but sure
[14:29:15] <catphish_> one still has to configure a bind IP though :)
[14:31:00] <catphish_> anyway as soon as i configured the right IP to listen on the new member, it burst into life
[14:32:23] <cheeser> CNAMEs are more stable than IPs and hostnames.
[14:33:00] <catphish_> CNAMEs make sense for services that might move about, like a config server
[14:33:34] <catphish_> but for a regular member i'd always use its hostname, and i usually maintain entries in /etc/hosts for all members to avoid reliance on DNS
[14:34:44] <cheeser> i've never quite followed the reasoning for CNAMEs but that's our official rec at any rate.
[14:35:16] <cheeser> though one reason *I* would choose them is for semantic grouping.
[14:35:24] <catphish_> i believe it's only recommended for a mirrored config server setup
[14:36:08] <catphish_> it certainly makes sense in the case where you might give the role to a different server, i think of CNAMEs as being for "roles" or "services", not for actual servers
[14:36:36] <catphish_> i'm sure it all depends on the particular environment though
[14:37:17] <Ange7> hey
[14:37:35] <catphish_> i think it's going to take quite some time to sync my 400GB database to my new member :( unfortunately i'm down to one node right now, so can't do an offline data sync
[14:37:44] <cheeser> yikes
[14:38:16] <catphish_> it's a non-critical database, normally has 2 nodes, one failed, i'm replacing it :)
[14:38:25] <Ange7> can someone help me create a query to get: ALL DOCUMENTS with field « string » containing « ABC » BUT NOT « DEF »? thank you
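
[Ange7's question goes unanswered in the log; one possible mongo shell approach, with a hypothetical collection name "docs", is to pair a regex match with a negated one.]

    // Documents whose "string" field contains "ABC" but does not contain "DEF".
    db.docs.find({
      $and: [
        { string: /ABC/ },
        { string: { $not: /DEF/ } }
      ]
    })
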
[14:39:15] <catphish_> anyway, next task calls, thanks :)
[15:07:16] <kurushiyama> cheeser: The reasoning goes as follows (iirc): by using CNAMEs you do not have to change the config of anything to replace a machine, even when the new machine happens to have a different IP and hostname, for example when one member's mongod instance goes down and you still want to investigate on said machine, but want to replace it meanwhile. Without using CNAMEs, you would have to take away the hostname from the original machine.
[15:07:36] <cheeser> that'
[15:07:42] <kurushiyama> catphish_: Uhm. 2 nodes? Master/slave repl?
[15:07:45] <cheeser> that's right. i forgot that bit.
[15:09:30] <catphish_> kurushiyama: well it's just a replset, but yes, i use one as a master and the other as a backup
[15:09:40] <kurushiyama> cheeser: Well, ofc you would have to change the target of the CNAME. Another, rather subtle disadvantage of the CNAME approach is that you should make sure the TTL is not too high for the zone... bit my neck once, where somebody had a TTL of a week.
[15:10:25] <kurushiyama> catphish_: There is _either_ a replset _OR_ a master/slave replication. A replset with only 2 members is not a good idea.
[15:11:54] <kurushiyama> cheeser: Hm, not sure, but I guess CNAMES make sense for SSL as well...
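
[A hedged illustration of the CNAME recommendation discussed above; all DNS names are hypothetical.]

    // Member documents in rs.conf() addressing nodes by stable CNAME rather
    // than raw hostname or IP (other fields omitted):
    var members = [
      { _id: 0, host: "rs0-a.db.example.com:27017" },  // CNAME -> box17.example.com
      { _id: 1, host: "rs0-b.db.example.com:27017" },  // CNAME -> box23.example.com
      { _id: 2, host: "rs0-c.db.example.com:27017" }   // CNAME -> box41.example.com
    ]
    // Replacing box23 then only means re-pointing the rs0-b CNAME (watch the
    // zone's TTL, as noted above), with no rs.reconfig() on the set itself.
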
[15:11:59] <catphish_> kurushiyama: i have a replset with 2 data-carrying members, and a 3rd member with no data for mediation
[15:12:36] <kurushiyama> catphish_: That sounds _much_ better. So you have a primary, a secondary and an arbiter. As said, terminology matters ;)
[15:28:03] <cheeser> catphish_: an arbiter you mean?
[15:28:20] <catphish_> yes, sorry
[15:28:41] <catphish_> i assumed an arbiter was implied in a 2-node cluster, sorry
[15:29:21] <catphish_> i suppose actually it's considered a 3 node cluster where only 2 are data nodes
[15:29:53] <cheeser> i've learned to never assume anything is implied ;)
[15:30:12] <catphish_> entirely true
[15:30:44] <cheeser> some get offended by that practice. but they're fewer than those helped by not assuming :D
[15:32:18] <kurushiyama> I made the mistake only once ;) "We have a two node replica set..." which I misinterpreted as "Two data bearing nodes"...
[15:35:36] <cheeser> so that's a mistake you don't want to ... replicate?
[15:35:38] <cheeser> *badump*
[15:44:23] <kurushiyama> cheeser: No. The mistake was assuming that there was an arbiter...
[15:44:57] <cheeser> ... replicate ... no?
[15:46:56] <kurushiyama> cheeser: Well, it replicated as long as both machines were up ;) They called me in, because when one node was shut down for maintenance, the replica set became unavailable. I asked about their setup, they answered "we have a two node replset" and I assumed that that would mean "2 data bearing nodes" and implied an arbiter. Bad mistake. ;)
[15:50:22] <catphish_> i did get a nasty shock once when i had a larger number of nodes in a replset, and i took several of them offline, at which point the remaining ones refused to elect a primary; it makes perfect sense in hindsight
[15:52:46] <cheeser> kurushiyama: *sigh* nevermind. i was trying to make a terrible joke and apparently it was worse than i thought. ;)
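
[A minimal sketch of the primary/secondary/arbiter layout discussed above, run on the primary; hostnames are hypothetical.]

    // Two data-bearing members plus a vote-only arbiter keep a voting majority
    // available when one data node is down.
    rs.add("rs0-b.db.example.com:27017")        // second data-bearing member
    rs.addArb("rs0-arb.db.example.com:27017")   // arbiter: votes, stores no data

    // rs.status() should then report one PRIMARY, one SECONDARY and one ARBITER.
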
[18:21:10] <shlant> anyone have a guide on how to update ssl certs being used by a replica set?
[20:14:14] <pyCasso> I am suddenly getting a Segmentation fault: 11 error. not sure why on my local machine. any help on this issue?
[20:46:01] <pyCasso> where is it documented on how to setup authentication?
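
[pyCasso's question goes unanswered in the log; the MongoDB security tutorials cover it. A minimal mongo shell sketch, with a hypothetical user name and password:]

    // 1. Create an administrative user on the admin database first.
    use admin
    db.createUser({
      user: "siteAdmin",
      pwd: "changeme",
      roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
    })

    // 2. Restart mongod with authentication enabled (mongod.conf):
    //      security:
    //        authorization: enabled

    // 3. Reconnect and authenticate:
    //      mongo -u siteAdmin -p changeme --authenticationDatabase admin
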
[23:25:14] <poz2k4444> Hi guys, I'm using mongo-connector to sync my mongo collection to elasticsearch but I cannot make mongo-connector update the index when a document in the collection is updated, can anybody help me with this?