[07:44:23] <energizer> In python I have a list of dictionaries with unique ids. What is the best way to upsert it into mongodb?
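A minimal pymongo sketch of the bulk upsert energizer is asking about, assuming each dict carries its unique id in _id; the client, database, and collection names are placeholders:

```python
# Sketch: bulk upsert with pymongo, assuming each dict carries its
# unique id in "_id". Client/db/collection names are placeholders.
from pymongo import MongoClient, ReplaceOne

coll = MongoClient().mydb.mycoll

docs = [{"_id": 1, "x": "a"}, {"_id": 2, "x": "b"}]

# One round trip: replace each matching document, insert it if missing.
result = coll.bulk_write(
    [ReplaceOne({"_id": d["_id"]}, d, upsert=True) for d in docs]
)
print(result.upserted_count, result.modified_count)
```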
[08:28:37] <Keksike> Hey. For some reason, after updating mongo from 2.6.11 to 3.2, I get this kind of error when trying to authenticate: 2016-04-08T14:58:11.907+0300 I ACCESS [conn88] Failed to authenticate thisUser@myDB with mechanism MONGODB-CR: AuthenticationFailed: MONGODB-CR credentials missing in the user document
[08:28:52] <Keksike> How should I try to debug/fix this?
[08:34:28] <Derick> Keksike: did you go through 3.0 first?
[08:34:45] <Keksike> Derick: I'm not sure actually, it was my client who updated it. I'll ask them.
[08:35:06] <Derick> the format for auth changed, and if you upgraded correctly, it should have automatically fixed that
[08:35:46] <Derick> Keksike: https://docs.mongodb.org/manual/release-notes/3.0-upgrade/#upgrade-existing-mongodb-cr-users-to-use-scram-sha-1 gives a hint (and a link) too
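The linked release notes boil down to upgrading the binaries through 3.0 and then running the authSchemaUpgrade command once as an administrator; roughly, via pymongo (the connection string is a placeholder):

```python
# Sketch: the auth schema upgrade step from the linked 3.0 release
# notes, run once against the admin database after upgrading binaries.
from pymongo import MongoClient

client = MongoClient("mongodb://admin:secret@localhost:27017/admin")
print(client.admin.command("authSchemaUpgrade"))
```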
[10:55:08] <roelj> Any hints on how to debug a segfaulting mongod?
[11:43:03] <torbjorn> I need to merge data vectors coming in from different sources, and detect possible conflicts, i.e. if { SampleID: "1234", Age: 24 } merges with { SampleID: "1234", Age: 25 }, then that's an error; however { SampleID: "1234", Age: NaN } is OK to merge, and then keep Age: 25 for that sample
[11:43:13] <torbjorn> can mongo help with that, or do i have to do that in the application layer
[12:21:06] <kurushiyama> torbjorn: That is definitely an application layer thingy, with application layer having a broad meaning. You might be able to do a diff/merge on your input data, depending on how the data is generated.
[12:21:28] <torbjorn> do you know anything that might help me
[12:21:47] <torbjorn> it's not too weird a scenario
[12:25:10] <torbjorn> I'd rather keep it as null or NA, and put all collected errors in a separate list
[12:25:17] <torbjorn> that i can then work with to sort out
[12:25:51] <kjekken> hello. I need to insert day, month, year and week records into a collection. I already have a record called creation_time that contains an ISODate. I just need to use the ISODate to make the other records.
[12:26:13] <kurushiyama> torbjorn: here is what I would do: write a merge tool which loads the data from the source, looks up the corresponding doc and does the merging/diffing/whatever. And log the errors.
[12:26:43] <kurushiyama> torbjorn: Everything else might well miss some logic.
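A sketch of the core of such a merge tool, using torbjorn's example fields; the policy here (equal values merge, NaN/None yields to a real value, two different real values are logged) is one reading of the requirements:

```python
# Sketch of a merge-with-conflict-detection pass: equal values merge,
# NaN/None yields to a real value, conflicting real values are logged.
import math

def is_missing(v):
    return v is None or (isinstance(v, float) and math.isnan(v))

def merge(existing, incoming, errors):
    merged = dict(existing)
    for key, new_val in incoming.items():
        old_val = merged.get(key)
        if is_missing(old_val) or old_val == new_val:
            merged[key] = new_val
        elif not is_missing(new_val):
            # two different real values: record a conflict, keep the old one
            errors.append({"SampleID": existing.get("SampleID"),
                           "field": key, "old": old_val, "new": new_val})
    return merged

errors = []
doc = merge({"SampleID": "1234", "Age": float("nan")},
            {"SampleID": "1234", "Age": 25}, errors)
print(doc, errors)  # {'SampleID': '1234', 'Age': 25} []
```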
[12:27:13] <kurushiyama> kjekken: What do you need that for? To query by year/month/day?
[12:27:57] <kjekken> the frontend is built to use that logic :P
[12:28:07] <kjekken> it's going to change, but right now I just need a quick fix :P
[12:29:58] <kurushiyama> kjekken: I am not aware of anything other than iterating and updating. But you may describe what you want to _do_, so that we can find an alternative.
[12:33:43] <StephenLynx> is your back-end using node?
[12:33:49] <kurushiyama> kjekken: I got that. Again, why _exactly_? And how does this query look like? And can we change it? It really, really, really does not make sense to have single fields for those values if you already have an ISODate.
[12:36:19] <kurushiyama> StephenLynx: Basically, there seem to be legacy queries/design reqs which need to be fulfilled by querying by y or m or d. Other than that, I am still interrogating the suspect ;)
[12:40:41] <StephenLynx> all while just storing a simple date object.
[12:40:45] <kjekken> hehe i might not be explaining myself very well
[12:40:49] <kurushiyama> It may look different when we talk of time series data where, for example, the average number of users for a website by day of week should be calculated.
[12:41:09] <kurushiyama> kjekken: You could simply show us the query as it is now.
[12:41:46] <kjekken> i just want day (day number in month), year (e.g. 2015), month (e.g. 4) and week (e.g. 34) as fields in my db.documents collection
[12:52:25] <kurushiyama> kjekken: The thing is that you already have a date, and you are putting the cart before the horse. It is simply extremely bad design to have this redundant data in the db. Load the data as is, and either use your controller (or whatever it is called) to split the date into the parts you need, or do it in the templates, if your template engine provides that.
[12:52:55] <kjekken> just right now i need to make a quick fix
[12:52:56] <kurushiyama> kjekken: If you really have to have it split when it is returned from the database, use an aggregation which does that for you.
[12:52:58] <StephenLynx> is it just a single date?
[13:01:52] <ren0v0> Does mongodb not have a remote sync option? I know there is copyDatabase(), but this requires that the db/table not already exist
[13:01:56] <ren0v0> if i want to sync a table, what are my options?
[13:02:26] <kurushiyama> kjekken: If you want to find the docs between two dates, it is pretty straightforward: db.date.find({date:{$gte:ISODate("2016-02-01T00:00:00Z"),$lt:ISODate("2016-05-01T00:00:00Z")}})
[13:02:28] <StephenLynx> and you can output the very same content to the front-end
[13:02:32] <kjekken> im sorry but this is just me being a mongodb noob :P
[13:02:49] <kurushiyama> kjekken: Hence, you better listen to StephenLynx – he is not ;)
[13:03:48] <StephenLynx> two links for you: one to https://gitgud.io/LynxChan/LynxChan/blob/master/src/be/graphsOps.js#L236 and the other to http://www.w3schools.com/jsref/jsref_obj_date.asp
[13:04:01] <StephenLynx> the first one shows how to query all documents within the specified range
[13:04:16] <StephenLynx> the second one shows how to extract the parts you need from the date you get back.
[13:04:23] <kjekken> i dont see why i need to do that
[13:04:37] <StephenLynx> you need to inform those to the front-end, dont you?
[13:04:39] <kjekken> i want to insert 4 records into all my db.documents
[13:04:48] <StephenLynx> why do you want to insert those?
[13:07:01] <kurushiyama> kjekken: Have a look at the pastebin I gave you.
[13:07:20] <kurushiyama> kjekken: a *CLOSE* look
[13:12:16] <kurushiyama> kjekken: The point is: regardless of where you do the conversion to the date parts you need, you do _not_ have to store them in the database. You can either use the aggregation, as shown above. Or, when you have loaded the data, you can extract the according parts in your controller. Or (and that would be the best thing, if technically possible, in order to decouple logic and presentation), do it in your presentation layer.
[13:21:48] <kurushiyama> kjekken: A bit more detailed: http://pastebin.com/rtFcp4CL
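The pastebin itself is not preserved in the log, but an aggregation along the lines described above, deriving the parts from creation_time at read time, might look like this in pymongo (the database name is a guess):

```python
# Sketch: derive day/month/year/week from the stored ISODate at read
# time instead of storing them redundantly. Database name is a guess.
from pymongo import MongoClient

coll = MongoClient().mydb.documents
pipeline = [
    {"$project": {
        "creation_time": 1,
        "day":   {"$dayOfMonth": "$creation_time"},
        "month": {"$month": "$creation_time"},
        "year":  {"$year": "$creation_time"},
        "week":  {"$week": "$creation_time"},
    }}
]
for doc in coll.aggregate(pipeline):
    print(doc)
```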
[13:27:46] <hyades> Does $lookup take into account the indexes on the from table?
[15:55:14] <ren0v0> hi, i'm trying to use copyDatabase, but i'm getting complaints about a table already being there, even though i've dropped the entire database prior to the copy?
[15:56:26] <kurushiyama> ren0v0: Terminology matters. There is no such thing as a table in MongoDB.
[15:56:45] <kurushiyama> ren0v0: So I doubt that MongoDB complains about that.
[16:01:20] <kurushiyama> ren0v0: Still, are we talking of a replica set?
[16:02:39] <ren0v0> i'm just trying to copy the database from the staging to the deployment server
[16:02:57] <ren0v0> i can't find any "sync" ability, this was the best solution i could find, copying from remote using ssh tunnel
[16:03:57] <kurushiyama> ren0v0: If you do not answer my questions, how should I be able to answer yours? ;) So I implicitly assume that we are not talking of a replica set.
[16:05:11] <StephenLynx> I'd like to make a comment. If you want to sync your staging to production and don't want to just copy what you got from one to the other, you might have deeper issues.
[16:05:24] <StephenLynx> you want to keep any data you already have on deployment
[16:10:40] <StephenLynx> or use mongodump and mongorestore?
[16:10:50] <ren0v0> i did this a few days ago with no issues
[16:10:52] <kurushiyama> ren0v0: Ok, let us assume you have none. But StephenLynx holds a valid point here. And the question is actually whether your application is still running. Because if inserts happen, databases and collections get created on the fly, which might well happen during the copy of the other data.
[16:12:23] <StephenLynx> you don't specify the database you are reading from
[16:12:33] <StephenLynx> are you sure that's how it works?
[16:12:37] <kurushiyama> ren0v0: Nah, you got that wrong. You are trying to clone the data to the prod database. But, when your application is still running, it might do inserts on said collection, in which case the collection gets created automatically.
[16:12:42] <StephenLynx> did you try using mongodump and restore?
[16:14:09] <ren0v0> i have the code running on this server too, thanks guys :D
[16:14:56] <ren0v0> StephenLynx, mongodump/restore yea this may be best, i'll need to write some scripts for it. It's a shame mongo doesn't have built-in functions for syncing etc.
[16:18:38] <StephenLynx> however if you want to completely overwrite the destination, you might not want to.
[16:19:13] <StephenLynx> afaik mongorestore will just delete everything on the destination before writing, if there's anything there
[16:22:07] <kurushiyama> StephenLynx: Uh, I am not too sure about that. There is a "--drop" option, after all. Iirc, as long as there is no dupe violation, restore will happily add to an existing collection.
[16:22:36] <kurushiyama> However, I am not sure what happens at the first dupe violation.
[16:22:53] <StephenLynx> I knew you could do both, just didn't remember the default
[16:24:02] <kurushiyama> Ah, here it is. By default, dupes are simply ignored and restore carries on, unless "--stopOnError" is set. At least this is how I interpret it.
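For reference, the dump-and-restore route being discussed looks roughly like this; host names and paths are placeholders, and --drop makes mongorestore replace existing collections instead of merging into them:

```sh
# Placeholder hosts and paths. --drop drops each collection on the
# target before restoring it, instead of merging into existing data.
mongodump --host staging.example.com --db mydb --out /tmp/dump
mongorestore --host prod.example.com --drop /tmp/dump
```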
[16:26:06] <kurushiyama> Whether this is a reasonable way to merge two datasets remains to be seen.
[16:26:57] <kurushiyama> especially when there are changes on the same _id in both datasets.
[17:06:49] <ren0v0> kurushiyama, it does seem like mongo isn't handling this the best it can
[17:07:00] <ren0v0> it's not always ideal to stop something
[17:09:00] <kurushiyama> ren0v0: You are using mongodb outside its intended purpose? What do you expect? Would you expect a Formula One car to excel at creating furrows?
[17:11:51] <kurushiyama> Well, one _could_ argue that if a software can be used aside from its intended purpose it has too many features aside from its intended purpose, I grant that.
[17:12:48] <StephenLynx> you are screwing up. that's what I meant.
[17:15:15] <kurushiyama> ren0v0: To be more precise. You have a certain use case, which, as StephenLynx pointed out, has quite a few risks and is most likely far away from what can be considered best practice. Then, you misuse a feature and complain that it does not work the way you would like it to behave in order to fulfill said flawed use case.
[17:23:09] <kurushiyama> ren0v0: Please do not get this wrong. Nobody wants to bash you. But the discussion should have told you that you may have other challenges than the one at hand.
[17:24:21] <StephenLynx> and when one blames the tool before doing a little introspection, people won't think too much before telling the equivalent of "git gud"
[18:07:30] <kurushiyama> StephenLynx: I always feel like I should implement a git subcommand when I read it ;)
[18:14:42] <keldwud> I'm following the tutorial for disabling transparent huge pages (found here: https://docs.mongodb.org/manual/tutorial/transparent-huge-pages/) on CentOS 7: I've added the init script, added the tuned.conf, then ran tuned-adm profile <profile name>. I run into an issue when verifying: enabled shows [never], but defrag still shows [always].
[18:16:13] <keldwud> manually echoing never to defrag works, of course.
[18:16:24] <keldwud> so I'm trying to figure out where I messed up
[18:19:32] <keldwud> is it because centos7 uses systemd? I guess the issue lies with tuned (which I have no experience with)
[18:20:24] <uuanton> have you set up mongod.service?
[18:23:12] <keldwud> uuanton: I haven't, that's why I suspected maybe the disconnect was there
[18:23:26] <keldwud> the tutorial used init.d script
[18:24:08] <keldwud> what's interesting is that the first echo was successful but the second echo wasn't
[18:34:06] <kurushiyama> keldwud: Oh, just saw it.
[18:35:35] <kurushiyama> keldwud: With the init script you are referring to https://docs.mongodb.org/manual/tutorial/transparent-huge-pages/#init-script ?
[18:36:01] <keldwud> I'm researching tuned now to see where maybe it is getting stuck on passing 'never' to /sys/kernel/mm/transparent_hugepage/defrag
[18:36:11] <keldwud> I'm using that same init script
[18:36:31] <keldwud> I have tested by manually echoing 'never' to both enabled and defrag and that works fine
[18:37:03] <keldwud> it *doesn't* work when I use tuned-adm to load the profile no-thp
[18:37:10] <kurushiyama> keldwud: If you are using tuned or ktune (for example, if you are running Red Hat or CentOS 6+), you must *additionally* configure them so that THP is not re-enabled. See "Using tuned and ktune".
[18:37:23] <keldwud> and I am using the profile found on the same page under the centos tuned portion
[18:38:03] <keldwud> so after I load the no-thp profile, enabled is set to never but defrag either is not set to never or gets changed back to always
[18:40:19] <keldwud> I'm digging through tuned documentation now to see if I can maybe pin it down
[18:41:59] <keldwud> does tuned access the init.d script?
[18:42:23] <keldwud> or are the changes added to tuned to make sure it doesn't change a setting back to a previous setting?
[18:43:00] <kurushiyama> keldwud: Have you checked whether you _enabled_ the init script?
[18:43:10] <keldwud> yeah, I did a chkconfig --add
[18:43:28] <keldwud> but I'm not sure if systemd adds it when you use the chkconfig command
[18:43:37] <keldwud> I am assuming that it has some backwards compatibility
[18:43:57] <kurushiyama> well, if it wouldn't, I'd have some quite severe problems ;)
[18:44:19] <keldwud> it appears that the script is running because /sys/kernel/mm/transparent_hugepage/enabled gets set to never and stays set to never
[18:44:32] <keldwud> it's just the defrag portion that is either not getting set or is not staying set
[18:44:38] <kurushiyama> keldwud: Not necessarily. It can be an effect of tuned.
[18:45:01] <kurushiyama> keldwud: Since tuned does not touch the defrag, I am pretty sure the init script is not run.
[18:45:25] <keldwud> so I want to add a line in the tuned profile to also touch defrag?
[18:45:44] <keldwud> that's what I was suspecting, wasn't sure if tuned accessed the script to make changes or if it did its own thing
[18:46:33] <kurushiyama> keldwud: Nope. What happens if you run the script manually? Maybe there was some problem during c&p. Just please check to make sure.
[18:55:14] <kurushiyama> Ok, let's try something. You can add "transparent_hugepage=never" to the kernel command line in grub.conf
[18:55:37] <keldwud> I was hoping to avoid that :)
[18:55:51] <keldwud> I didn't want to set it in grub because I didn't want to have to rely on a reboot to make sure it takes place
[18:56:59] <kurushiyama> keldwud: It is just for testing something.
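On CentOS 7 that test would look roughly like this; a reboot is required, which is exactly what keldwud wanted to avoid:

```sh
# As root: append transparent_hugepage=never to GRUB_CMDLINE_LINUX in
# /etc/default/grub, then regenerate the grub2 config and reboot.
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
# afterwards, enabled should report [never]:
cat /sys/kernel/mm/transparent_hugepage/enabled
```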
[18:57:05] <keldwud> I want to make sure I'm understanding correctly, though: /sys/kernel/mm/transparent_hugepage/enabled *does* get set to never when I follow the mongodb documentation, after I add the init.d script and the tuned profile and then run tuned-adm
[18:57:55] <keldwud> but /sys/kernel/mm/transparent_hugepage/defrag does *not* get set to never. Should it be getting set to never? I assumed it should be, since the init.d script modifies it and the documentation states that it should show as never
[18:59:06] <kurushiyama> You are correct with that. So I suspect it is being re-enabled somewhere.
[18:59:41] <keldwud> this is a fresh minimal install of CentOS 7 with only mongod installed, if that helps any
[19:02:43] <keldwud> "The tricky part is that the vm plugin can only configure the /sys/kernel/mm/transparent_hugepage/enabled setting. To disable the /sys/kernel/mm/transparent_hugepage/defrag setting too, I had to create a script that is called by the profile on start."
[19:03:08] <keldwud> so that's what I was suspecting the whole time. that tuned just wasn't changing the defrag setting
[19:03:34] <keldwud> it's the third answer down on that page
[19:04:17] <keldwud> I wonder what the mongod documentation did to get it to work using just the configuration mentioned
[19:04:27] <keldwud> or should I document my process and submit it?
[19:07:22] <keldwud> ok yeah, I see the edit symbol
[19:07:40] <keldwud> question is, what happens if defrag isn't also set to never?
[19:07:57] <keldwud> I don't understand enough about this portion of linux to know why both need to be set to never
[19:09:36] <kurushiyama> keldwud: Me neither, tbh. I do not see the point in THP, anyway. It is a feature only there for very narrow use cases, and I cannot see why it is on by default.
[19:10:45] <keldwud> but to disable it fully, both /defrag and /enabled have to be set to never?
[19:32:22] <kurushiyama> shlant: since you definitely want to deal with election times and such.
[20:43:36] <hackel> Is there a way to query multiple fields for the same value without repeating them in an $and? e.g. col.find({/a.(b|c).isActive/: true})
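The quick answer hackel received is not captured in the log; the key pattern in the question is not valid MongoDB, and the usual spelling is an explicit $or (pymongo here, field names from the question):

```python
# Sketch: MongoDB can't pattern-match field *names*, so the paths are
# spelled out under $or. Field names taken from hackel's example.
from pymongo import MongoClient

coll = MongoClient().mydb.col
matches = coll.find({"$or": [
    {"a.b.isActive": True},
    {"a.c.isActive": True},
]})
```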
[20:44:19] <hackel> Heh, a quick and direct answer...thanks.
[20:44:52] <kurushiyama> hackel: You could use text searches... ...for text values, what a surprise ;)
[20:46:35] <hackel> kurushiyama: Heh, fascinating concept! It is just a boolean I'm after in this case, and I'm just being my anally DRY self, as usual.
[20:48:38] <kurushiyama> hackel: Although, I have to admit, "true" isn't a very distinct value, and by having all field names written out, it becomes much more clear what you want to do. Much more readable, at least.
[20:50:58] <cheeser> for grins, try sharding on a boolean field sometime.
[20:51:03] <hackel> kurushiyama: That's fair. In this case, these sub-documents were added as properties of a parent object, with the type identified by the property name, instead of adding them as an array with the type as a separate field. The latter would have been a much better design, but I'm not in a place where I can do a migration.
[20:52:09] <kurushiyama> hackel: As far as I get it, neither model seems to be... well, likeable.
[20:54:55] <kurushiyama> hackel: Every time I hear or read "array of subdocs", I have to make a conscious decision so that my brain does not translate it to "overembedding".
[20:57:23] <hackel> kurushiyama: There's definitely an unnecessary level of embedding currently. Would you say that it's better to move the child documents to a separate collection instead of an array, even if the relation is always going to be 1:1?
[21:00:27] <kurushiyama> hackel: More often than not, even without knowing the use cases, the answer is a big, fat, red, blinking *YES*!
[21:02:04] <kurushiyama> hackel: There are edge cases, however. a 1:1 makes it worth a closer look, surely
[21:02:57] <hackel> Huh, that's interesting. I was told the opposite, since Mongo's not optimized for joins, blah blah. Definitely will have to keep that in mind on the next project.
[21:03:37] <kurushiyama> hackel: Wait a sec, I'll give you some links
[21:03:39] <cheeser> "not optimized for joins" is a generous description
[21:04:05] <hackel> Just reading about $lookup now...
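For reference, a $lookup of the kind being read about (MongoDB 3.2+); collection and field names here are invented:

```python
# Sketch: a left outer join via $lookup (MongoDB 3.2+). Collection
# and field names are invented for illustration.
from pymongo import MongoClient

db = MongoClient().mydb
pipeline = [
    {"$lookup": {
        "from": "profiles",          # the joined ("from") collection
        "localField": "profile_id",  # field on the users documents
        "foreignField": "_id",       # field on the profiles documents
        "as": "profile",             # matches land in this array field
    }}
]
for doc in db.users.aggregate(pipeline):
    print(doc)
```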
[21:05:41] <kurushiyama> hackel: The example section of http://dba.stackexchange.com/questions/134898/lots-of-indexes-mysql-vs-mongodb-migrating/134961#134961
[21:06:47] <hackel> kurushiyama: Thanks...adding those to my reading list for sure!
[21:10:34] <kurushiyama> hackel: All those translate to a few rules of thumb: 1) If you find yourself having an array of subdocs, stop and think again. 2) If you would need a JOIN in SQL to achieve what you want, you might be able to do it relatively cheap with some redundancy. 3) Document operations are atomic. If in doubt, _document_ events or points in time.
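As an illustration of rule 2, redundancy in place of a JOIN can be as simple as copying the fields you read together (all names invented):

```python
# Sketch for rule 2: keep a reference *and* a redundant copy of the
# fields you would otherwise JOIN for. All names are invented.
from bson import ObjectId

order = {
    "customer_id": ObjectId(),        # reference, for updates/drill-down
    "customer_name": "Ada Lovelace",  # redundant copy, read without a join
    "total": 42.0,
}
# orders can now be listed with the customer's name in a single query.
```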
[21:17:49] <hackel> Thanks for the advice. In this case, I wouldn't need the atomicity. What would be a good use case for embedding an array of subdocuments? I'm also doing it in one case where I need to store some extra data for a relation, where I might have used a pivot table before.
[21:20:47] <kurushiyama> hackel: A good use case for an array of subdocs? tbh, I have yet to find one, since one way or the other it comes with problems.
[21:21:30] <kurushiyama> hackel: Well, that is not entirely true: