[07:44:23] <energizer> In python I have a list of dictionaries with unique ids. What is the best way to upsert it into mongodb?
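A minimal pymongo sketch of the bulk upsert energizer is asking about, assuming each dict carries its unique id in _id; the client, database, and collection names are placeholders:

```python
# Sketch: bulk upsert with pymongo, assuming each dict carries its
# unique id in "_id". Client/db/collection names are placeholders.
from pymongo import MongoClient, ReplaceOne

coll = MongoClient().mydb.mycoll

docs = [{"_id": 1, "x": "a"}, {"_id": 2, "x": "b"}]

# One round trip: replace each matching document, insert it if missing.
result = coll.bulk_write(
    [ReplaceOne({"_id": d["_id"]}, d, upsert=True) for d in docs]
)
print(result.upserted_count, result.modified_count)
```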
[08:28:37] <Keksike> Hey. For some reason, after updating mongo from 2.6.11 to 3.2, I get this kind of error when trying to authenticate: 2016-04-08T14:58:11.907+0300 I ACCESS [conn88] Failed to authenticate thisUser@myDB with mechanism MONGODB-CR: AuthenticationFailed: MONGODB-CR credentials missing in the user document
[08:28:52] <Keksike> How should I try to debug/fix this?
[08:34:28] <Derick> Keksike: did you go through 3.0 first?
[08:34:45] <Keksike> Derick: I'm not sure actually, it was my client who updated it. I'll ask them.
[08:35:06] <Derick> the format for auth changed, and if you upgraded correctly, it should have automatically fixed that
[08:35:46] <Derick> Keksike: https://docs.mongodb.org/manual/release-notes/3.0-upgrade/#upgrade-existing-mongodb-cr-users-to-use-scram-sha-1 gives a hint (and a link) too
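The linked release notes boil down to upgrading the binaries through 3.0 and then running the authSchemaUpgrade command once as an administrator; roughly, via pymongo (the connection string is a placeholder):

```python
# Sketch: the auth schema upgrade step from the linked 3.0 release
# notes, run once against the admin database after upgrading binaries.
from pymongo import MongoClient

client = MongoClient("mongodb://admin:secret@localhost:27017/admin")
print(client.admin.command("authSchemaUpgrade"))
```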
[10:55:08] <roelj> Any hints on how to debug a segfaulting mongod?
[11:43:03] <torbjorn> I need to merge data vectors coming in from different sources, and detect possible conflicts, i.e. if { SampleID: "1234", Age: 24 } merges with { SampleID: "1234", Age: 25 }, then that's an error; however { SampleID: "1234", Age: NaN } is OK to merge, and then keep Age: 25 for that sample
[11:43:13] <torbjorn> can mongo help with that, or do i have to do that in the application layer
[12:21:06] <kurushiyama> torbjorn: That is definitely an application layer thingy, with application layer having a broad meaning. You might be able to do a diff/merge on your input data, depending on how the data is generated.
[12:21:28] <torbjorn> do you know anything that might help me
[12:21:47] <torbjorn> it's not too weird a scenario
[12:25:10] <torbjorn> I'd rather keep it as null or NA, and put all collected errors in a separate list
[12:25:17] <torbjorn> that i can then work with to sort out
[12:25:51] <kjekken> hello. I need to insert day, month, year and week records into a collection. I already have a record called creation_time that contains an ISODate. I just need to use the ISODate to make the other records.
[12:26:13] <kurushiyama> torbjorn: here is what I would do: write a merge tool which loads the data from the source, looks up the corresponding doc and does the merging/diffing/whatever. And log the errors.
[12:26:43] <kurushiyama> torbjorn: Everything else might well miss some logic.
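A sketch of the core of such a merge tool, using torbjorn's example fields; the policy here (equal values merge, NaN/None yields to a real value, two different real values are logged) is one reading of the requirements:

```python
# Sketch of a merge-with-conflict-detection pass: equal values merge,
# NaN/None yields to a real value, conflicting real values are logged.
import math

def is_missing(v):
    return v is None or (isinstance(v, float) and math.isnan(v))

def merge(existing, incoming, errors):
    merged = dict(existing)
    for key, new_val in incoming.items():
        old_val = merged.get(key)
        if is_missing(old_val) or old_val == new_val:
            merged[key] = new_val
        elif not is_missing(new_val):
            # two different real values: record a conflict, keep the old one
            errors.append({"SampleID": existing.get("SampleID"),
                           "field": key, "old": old_val, "new": new_val})
    return merged

errors = []
doc = merge({"SampleID": "1234", "Age": float("nan")},
            {"SampleID": "1234", "Age": 25}, errors)
print(doc, errors)  # {'SampleID': '1234', 'Age': 25} []
```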
[12:27:13] <kurushiyama> kjekken: What do you need that for? To query by year/month/day?
[12:27:57] <kjekken> the frontend is built to use that logic :P
[12:28:07] <kjekken> it's going to change, but right now I just need a quick fix :P
[12:29:58] <kurushiyama> kjekken: I am not aware of anything other than iterating and updating. But you may describe what you want to _do_, so that we can find an alternative.
[12:33:43] <StephenLynx> is your back-end using node?
[12:33:49] <kurushiyama> kjekken: I got that. Again, why _exactly_? And how does this query look like? And can we change it? It really, really, really does not make sense to have single fields for those values if you already have an ISODate.
[12:36:19] <kurushiyama> StephenLynx: Basically, there seem to be legacy queries/design reqs which need to be fulfilled by querying by y or m or d. Other than that, I am still interrogating the suspect ;)
[12:40:41] <StephenLynx> all while just storing a simple date object.
[12:40:45] <kjekken> hehe i might not be explaining myself very well
[12:40:49] <kurushiyama> It may look different when we talk of time series data where, for example, the average number of users for a website by day of week should be calculated.
[12:41:09] <kurushiyama> kjekken: You could simply show us the query as it is now.
[12:41:46] <kjekken> i just want day (day number in month), year (e.g. 2015), month (e.g. 4) and week (e.g. 34) as fields in my db.documents collection
[12:52:25] <kurushiyama> kjekken: The thing is that you already have a date, and you are putting the cart before the horse. It is simply extremely bad design to have this redundant data in the db. Load the data as is, and either use your controller (or whatever it is called) to split the date into the parts you need, or do it in the templates, if your template engine provides that.
[12:52:55] <kjekken> just right now i need to make a quick fix
[12:52:56] <kurushiyama> kjekken: If you really have to have it split when it is returned from the database, use an aggregation which does that for you.
[12:52:58] <StephenLynx> is it just a single date?
[13:01:52] <ren0v0> Does mongodb not have a remote sync option? I know there is copyDatabase(), but this requires that the db/table not already exist
[13:01:56] <ren0v0> if i want to sync a table, what are my options?
[13:02:26] <kurushiyama> kjekken: If you want to find the docs between two dates, it is pretty straightforward: db.date.find({date:{$gte:ISODate("2016-02-01T00:00:00Z"),$lt:ISODate("2016-05-01T00:00:00Z")}})
[13:02:28] <StephenLynx> and you can output the very same content to the front-end
[13:02:32] <kjekken> im sorry but this is just me being a mongodb noob :P
[13:02:49] <kurushiyama> kjekken: Hence, you better listen to StephenLynx – he is not ;)
[13:03:48] <StephenLynx> two links for you: one to https://gitgud.io/LynxChan/LynxChan/blob/master/src/be/graphsOps.js#L236 and the other to http://www.w3schools.com/jsref/jsref_obj_date.asp
[13:04:01] <StephenLynx> the first one shows how to query all documents within the specified range
[13:04:16] <StephenLynx> the second one shows how to extract the parts you need from the date you get back.
[13:04:23] <kjekken> i dont see why i need to do that
[13:04:37] <StephenLynx> you need to inform those to the front-end, dont you?
[13:04:39] <kjekken> i want to insert 4 records into all my db.documents
[13:04:48] <StephenLynx> why do you want to insert those?
[13:07:01] <kurushiyama> kjekken: Have a look at the pastebin I gave you.
[13:07:20] <kurushiyama> kjekken: a *CLOSE* look
[13:12:16] <kurushiyama> kjekken: The point is: regardless of where you do the conversion to the date parts you need, you do _not_ have to store them in the database. You can either use the aggregation, as shown above. Or, when you have loaded the data, you can extract the according parts in your controller. Or (and that would be the best thing, if technically possible, in order to decouple logic and presentation), do it in your presentation layer.
[13:21:48] <kurushiyama> kjekken: A bit more detailed: http://pastebin.com/rtFcp4CL
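The pastebin itself is not preserved in the log, but an aggregation along the lines described above, deriving the parts from creation_time at read time, might look like this in pymongo (the database name is a guess):

```python
# Sketch: derive day/month/year/week from the stored ISODate at read
# time instead of storing them redundantly. Database name is a guess.
from pymongo import MongoClient

coll = MongoClient().mydb.documents
pipeline = [
    {"$project": {
        "creation_time": 1,
        "day":   {"$dayOfMonth": "$creation_time"},
        "month": {"$month": "$creation_time"},
        "year":  {"$year": "$creation_time"},
        "week":  {"$week": "$creation_time"},
    }}
]
for doc in coll.aggregate(pipeline):
    print(doc)
```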
[13:27:46] <hyades> Does $lookup take into account the indexes on the from table?
[15:55:14] <ren0v0> hi, i'm trying to use copyDatabase, but i'm getting complaints about a table already being there, even though i've dropped the entire database prior to the copy?
[15:56:26] <kurushiyama> ren0v0: Terminology matters. There is no such thing as a table in MongoDB.
[15:56:45] <kurushiyama> ren0v0: So I doubt that MongoDB complains about that.
[16:01:20] <kurushiyama> ren0v0: Still, are we talking of a replica set?
[16:02:39] <ren0v0> i'm just trying to copy the database from the staging to the deployment server
[16:02:57] <ren0v0> i can't find any "sync" ability, this was the best solution i could find, copying from remote using ssh tunnel
[16:03:57] <kurushiyama> ren0v0: If you do not answer my questions, how should I be able to answer yours? ;) So I implicitly assume that we are not talking of a replica set.
[16:05:11] <StephenLynx> I'd like to make a comment. If you want to sync your staging to production and don't want to just copy what you got from one to the other, you might have deeper issues.
[16:05:24] <StephenLynx> you want to keep any data you already have on deployment
[16:10:40] <StephenLynx> or use mongodump and mongorestore?
[16:10:50] <ren0v0> i did this a few days ago with no issues
[16:10:52] <kurushiyama> ren0v0: Ok, let us assume you have none. But StephenLynx holds a valid point here. And the question is actually whether your application is still running. Because if inserts happen, databases and collections get created on the fly, which might well happen during the copy of the other data.
[16:12:23] <StephenLynx> you don't specify the database you are reading from
[16:12:33] <StephenLynx> are you sure that's how it works?
[16:12:37] <kurushiyama> ren0v0: Nah, you got that wrong. You are trying to clone the data to the prod database. But, when your application is still running, it might do inserts on said collection, in which case the collection gets created automatically.
[16:12:42] <StephenLynx> did you try using mongodump and restore?
[16:14:09] <ren0v0> i have the code running on this server too, thanks guys :D
[16:14:56] <ren0v0> StephenLynx, mongodump/restore yea this may be best, i'll need to write some scripts for it. It's a shame mongo doesn't have built-in functions for syncing etc.
[16:18:38] <StephenLynx> however if you want to completely overwrite the destination, you might not want to.
[16:19:13] <StephenLynx> afaik mongorestore will just delete everything on the destination before writing, if there's anything there
[16:22:07] <kurushiyama> StephenLynx: Uh, I am not too sure about that. There is a "--drop" option, after all. Iirc, as long as there is no dupe violation, restore will happily add to an existing collection.
[16:22:36] <kurushiyama> However, I am not sure what happens at the first dupe violation.
[16:22:53] <StephenLynx> I knew you could do both, just didn't remember the default
[16:24:02] <kurushiyama> Ah, here it is. By default, dupes are simply ignored and restore carries on, unless "--stopOnError" is set. At least this is how I interpret it.
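For reference, the dump-and-restore route being discussed looks roughly like this; host names and paths are placeholders, and --drop makes mongorestore replace existing collections instead of merging into them:

```sh
# Placeholder hosts and paths. --drop drops each collection on the
# target before restoring it, instead of merging into existing data.
mongodump --host staging.example.com --db mydb --out /tmp/dump
mongorestore --host prod.example.com --drop /tmp/dump
```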
[16:26:06] <kurushiyama> Whether this is a reasonable way to merge two datasets remains to be seen.
[16:26:57] <kurushiyama> especially when there are changes on the same _id in both datasets.
[17:06:49] <ren0v0> kurushiyama, it does seem like mongo isn't handling this the best it can
[17:07:00] <ren0v0> it's not always ideal to stop something
[17:09:00] <kurushiyama> ren0v0: You are using mongodb outside its intended purpose? What do you expect? Would you expect a Formula One car to excel at creating furrows?
[17:11:51] <kurushiyama> Well, one _could_ argue that if a software can be used aside from its intended purpose it has too many features aside from its intended purpose, I grant that.
[17:12:48] <StephenLynx> you are screwing up. that's what I meant.
[17:15:15] <kurushiyama> ren0v0: To be more precise. You have a certain use case, which, as StephenLynx pointed out, has quite a few risks and is most likely far away from what can be considered best practice. Then, you misuse a feature and complain that it does not work the way you would like it to behave in order to fulfill said flawed use case.
[17:23:09] <kurushiyama> ren0v0: Please do not get this wrong. Nobody wants to bash you. But the discussion should have told you that you may have other challenges than the one at hand.
[17:24:21] <StephenLynx> and when one blames the tool before doing a little introspection, people won't think too much before telling the equivalent of "git gud"
[18:07:30] <kurushiyama> StephenLynx: I always feel like I should implement a git subcommand when I read it ;)
[18:14:42] <keldwud> I'm following the tutorial for disabling transparent huge pages (found here: https://docs.mongodb.org/manual/tutorial/transparent-huge-pages/) on CentOS 7: I've added the init script, added the tuned.conf, then ran tuned-adm profile <profile name>. I run into an issue when verifying: enabled shows [never], but defrag still shows [always].
[18:16:13] <keldwud> manually echoing never to defrag works, of course.
[18:16:24] <keldwud> so I'm trying to figure out where I messed up
[18:19:32] <keldwud> is it because centos7 uses systemd? I guess the issue lies with tuned (which I have no experience with)
[18:20:24] <uuanton> have you set up mongod.service?
[18:23:12] <keldwud> uuanton: I haven't, that's why I suspected maybe the disconnect was there
[18:23:26] <keldwud> the tutorial used init.d script
[18:24:08] <keldwud> what's interesting is that the first echo was successful but the second echo wasn't
[18:34:06] <kurushiyama> keldwud: Oh, just saw it.
[18:35:35] <kurushiyama> keldwud: With the init script you are referring to https://docs.mongodb.org/manual/tutorial/transparent-huge-pages/#init-script ?
[18:36:01] <keldwud> I'm researching tuned now to see where maybe it is getting stuck on passing 'never' to /sys/kernel/mm/transparent_hugepage/defrag
[18:36:11] <keldwud> I'm using that same init script
[18:36:31] <keldwud> I have tested by manually echoing 'never' to both enabled and defrag and that works fine
[18:37:03] <keldwud> it *doesn't* work when I use tuned-adm to load the profile no-thp
[18:37:10] <kurushiyama> keldwud: If you are using tuned or ktune (for example, if you are running Red Hat or CentOS 6+), you must *additionally* configure them so that THP is not re-enabled. See "Using tuned and ktune".
[18:37:23] <keldwud> and I am using the profile found on the same page under the centos tuned portion
[18:38:03] <keldwud> so after I load the no-thp profile, enabled is set to never but defrag either is not set to never or gets changed back to always
[18:40:19] <keldwud> I'm digging through tuned documentation now to see if I can maybe pin it down
[18:41:59] <keldwud> does tuned access the init.d script?
[18:42:23] <keldwud> or are the changes added to tuned to make sure it doesn't change a setting back to a previous setting?
[18:43:00] <kurushiyama> keldwud: Have you checked whether you _enabled_ the init script?
[18:43:10] <keldwud> yeah, I did a chkconfig --add
[18:43:28] <keldwud> but I'm not sure if systemd adds it when you use the chkconfig command
[18:43:37] <keldwud> I am assuming that it has some backwards compatibility
[18:43:57] <kurushiyama> well, if it wouldn't, I'd have some quite severe problems ;)
[18:44:19] <keldwud> it appears that the script is running because /sys/kernel/mm/transparent_hugepage/enabled gets set to never and stays set to never
[18:44:32] <keldwud> it's just the defrag portion that is either not getting set or is not staying set
[18:44:38] <kurushiyama> keldwud: Not necessarily. It can be an effect of tuned.
[18:45:01] <kurushiyama> keldwud: Since tuned does not touch the defrag, I am pretty sure the init script is not run.
[18:45:25] <keldwud> so I want to add a line in the tuned profile to also touch defrag?
[18:45:44] <keldwud> that's what I was suspecting, wasn't sure if tuned accessed the script to make changes or if it did its own thing
[18:46:33] <kurushiyama> keldwud: Nope. What happens if you run the script manually? Maybe there was some problem during c&p. Just please check to make sure.
[18:55:14] <kurushiyama> Ok, let's try something. You can add "transparent_hugepage=never" to the kernel command line in grub.conf
[18:55:37] <keldwud> I was hoping to avoid that :)
[18:55:51] <keldwud> I didn't want to set it in grub because I didn't want to have to rely on a reboot to make sure it takes place
[18:56:59] <kurushiyama> keldwud: It is just for testing something.
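On CentOS 7 that test would look roughly like this; a reboot is required, which is exactly what keldwud wanted to avoid:

```sh
# As root: append transparent_hugepage=never to GRUB_CMDLINE_LINUX in
# /etc/default/grub, then regenerate the grub2 config and reboot.
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
# afterwards, enabled should report [never]:
cat /sys/kernel/mm/transparent_hugepage/enabled
```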
[18:57:05] <keldwud> I want to make sure I'm understanding correctly, though: /sys/kernel/mm/transparent_hugepage/enabled *does* get set to never when I follow the mongodb documentation, after I add the init.d script and the tuned profile and then run tuned-adm
[18:57:55] <keldwud> but /sys/kernel/mm/transparent_hugepage/defrag does *not* get set to never. Should it be getting set to never? I assumed it should be, since the init.d script modifies it and the documentation states that it should show as never
[18:59:06] <kurushiyama> You are correct with that. So I suspect it is being re-enabled somewhere.
[18:59:41] <keldwud> this is a fresh minimal install of CentOS 7 with only mongod installed, if that helps any
[19:02:43] <keldwud> "The tricky part is that the vm plugin can only configure the /sys/kernel/mm/transparent_hugepage/enabled setting. To disable the /sys/kernel/mm/transparent_hugepage/defrag setting too, I had to create a script that is called by the profile on start."
[19:03:08] <keldwud> so that's what I was suspecting the whole time. that tuned just wasn't changing the defrag setting
[19:03:34] <keldwud> it's the third answer down on that page
[19:04:17] <keldwud> I wonder what the mongod documentation did to get it to work using just the configuration mentioned
[19:04:27] <keldwud> or should I document my process and submit it?
[19:07:22] <keldwud> ok yeah, I see the edit symbol
[19:07:40] <keldwud> question is, what happens if defrag isn't also set to never?
[19:07:57] <keldwud> I don't understand enough about this portion of linux to know why both need to be set to never
[19:09:36] <kurushiyama> keldwud: Me neither, tbh. I do not see the point in THP, anyway. It is a feature only there for very narrow use cases, and I cannot see why it is on by default.
[19:10:45] <keldwud> but to disable it fully, both /defrag and /enabled have to be set to never?
[19:32:22] <kurushiyama> shlant: since you definitely want to deal with election times and such.
[20:43:36] <hackel> Is there a way to query multiple fields for the same value without repeating them in an $and? e.g. col.find({/a.(b|c).isActive/: true})
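The quick answer hackel received is not captured in the log; the key pattern in the question is not valid MongoDB, and the usual spelling is an explicit $or (pymongo here, field names from the question):

```python
# Sketch: MongoDB can't pattern-match field *names*, so the paths are
# spelled out under $or. Field names taken from hackel's example.
from pymongo import MongoClient

coll = MongoClient().mydb.col
matches = coll.find({"$or": [
    {"a.b.isActive": True},
    {"a.c.isActive": True},
]})
```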
[20:44:19] <hackel> Heh, a quick and direct answer...thanks.
[20:44:52] <kurushiyama> hackel: You could use text searches... ...for text values, what a surprise ;)
[20:46:35] <hackel> kurushiyama: Heh, fascinating concept! It is just a boolean I'm after in this case, and I'm just being my anally DRY self, as usual.
[20:48:38] <kurushiyama> hackel: Although, I have to admit, "true" isn't a very distinct value, and by having all field names written out, it becomes much more clear what you want to do. Much more readable, at least.
[20:50:58] <cheeser> for grins, try sharding on a boolean field sometime.
[20:51:03] <hackel> kurushiyama: That's fair. In this case, these sub-documents were added as properties of a parent object, with the type identified by the property name, instead of adding them as an array with the type as a separate field. The latter would have been a much better design, but I'm not in a place where I can do a migration.
[20:52:09] <kurushiyama> hackel: As far as I get it, neither model seems to be... well, likeable.
[20:54:55] <kurushiyama> hackel: Every time I hear or read "array of subdocs", I have to make a conscious decision so that my brain does not translate it to "overembedding".
[20:57:23] <hackel> kurushiyama: There's definitely an unnecessary level of embedding currently. Would you say that it's better to move the child documents to a separate collection instead of an array, even if the relation is always going to be 1:1?
[21:00:27] <kurushiyama> hackel: More often than not, even without knowing the use cases, the answer is a big, fat, red, blinking *YES*!
[21:02:04] <kurushiyama> hackel: There are edge cases, however. a 1:1 makes it worth a closer look, surely
[21:02:57] <hackel> Huh, that's interesting. I was told the opposite, since Mongo's not optimized for joins, blah blah. Definitely will have to keep that in mind on the next project.
[21:03:37] <kurushiyama> hackel: Wait a sec, I'll give you some links
[21:03:39] <cheeser> "not optimized for joins" is a generous description
[21:04:05] <hackel> Just reading about $lookup now...
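For reference, a $lookup of the kind being read about (MongoDB 3.2+); collection and field names here are invented:

```python
# Sketch: a left outer join via $lookup (MongoDB 3.2+). Collection
# and field names are invented for illustration.
from pymongo import MongoClient

db = MongoClient().mydb
pipeline = [
    {"$lookup": {
        "from": "profiles",          # the joined ("from") collection
        "localField": "profile_id",  # field on the users documents
        "foreignField": "_id",       # field on the profiles documents
        "as": "profile",             # matches land in this array field
    }}
]
for doc in db.users.aggregate(pipeline):
    print(doc)
```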
[21:05:41] <kurushiyama> hackel: The example section of http://dba.stackexchange.com/questions/134898/lots-of-indexes-mysql-vs-mongodb-migrating/134961#134961
[21:06:47] <hackel> kurushiyama: Thanks...adding those to my reading list for sure!
[21:10:34] <kurushiyama> hackel: All those translate to a few rules of thumb: 1) If you find yourself having an array of subdocs, stop and think again. 2) If you would need a JOIN in SQL to achieve what you want, you might be able to do it relatively cheap with some redundancy. 3) Document operations are atomic. If in doubt, _document_ events or points in time.
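As an illustration of rule 2, redundancy in place of a JOIN can be as simple as copying the fields you read together (all names invented):

```python
# Sketch for rule 2: keep a reference *and* a redundant copy of the
# fields you would otherwise JOIN for. All names are invented.
from bson import ObjectId

order = {
    "customer_id": ObjectId(),        # reference, for updates/drill-down
    "customer_name": "Ada Lovelace",  # redundant copy, read without a join
    "total": 42.0,
}
# orders can now be listed with the customer's name in a single query.
```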
[21:17:49] <hackel> Thanks for the advice. In this case, I wouldn't need the atomicity. What would be a good use case for embedding an array of subdocuments? I'm also doing it in one case where I need to store some extra data for a relation, where I might have used a pivot table before.
[21:20:47] <kurushiyama> hackel: A good use case for an array of subdocs? tbh, I have yet to find one, since one way or the other it comes with problems.
[21:21:30] <kurushiyama> hackel: Well, that is not entirely true: