#pypa-dev logs for Monday the 18th of November, 2019

[16:09:06] <sumanah> as I prep a talk about PyPI's security model, for use among students who almost certainly know more than I do about formal application security assessment, I am going to probably talk quite a lot about my own perspective as a project manager/product manager/community manager and give historical context on the packaging pipeline
[16:09:38] <sumanah> my goal is to have an outline about 4 hours from now... I'll then be asking folks in this channel to help double-check whether I'm wrong about any of what I am saying
[16:10:46] <sumanah> woodruffw: ^
[16:12:02] <woodruffw> sumanah: thanks for the heads up! i'll be on a plane right about then but should have internet; just ping me :-)
[16:12:07] <sumanah> Thanks :)
[16:12:39] <sumanah> Probably whatever I write should end up as a pull request to update https://warehouse.readthedocs.io/security/
[16:14:51] <sumanah> Am starting a pad https://pad.sfconservancy.org/p/warehouse-security-model-nyu-talk-20191118
[16:54:05] <sumanah> and I just got delayed by taking 20 minutes to respond to yet another "I hate your user interface and of course I must obviously be right" person https://github.com/pypa/warehouse/issues/6933#issuecomment-555106043
[19:23:27] <sumanah> I've made some progress....
[21:07:30] <sumanah> Congrats to jaraco and Brian and other Twine folks -- down to 2 open pull requests!
[21:10:37] <sumanah> ok, still working on https://pad.sfconservancy.org/p/warehouse-security-model-nyu-talk-20191118
[21:10:40] <sumanah> I have a basic outline
[21:12:55] <techalchemy> hey sumanah -- i'm gonna have a look if you dont mind
[21:13:00] <sumanah> techalchemy: please do!
[21:14:11] <techalchemy> i made a mental note earlier but didn't respond, but thanks for sharing the notes! I suspect if you wanted newer-than-2016 bandwidth numbers EWDurbin *may* be able to produce them sort of reasonably quickly
[21:14:49] <sumanah> True
[21:15:59] <sumanah> I figure I will actually be saying to these students: I don't actually pay THAT much attention to those numbers and I don't have to, because of Fastly, etc
[21:16:13] <techalchemy> nice :)
[21:17:08] <sumanah> techalchemy: I'd welcome help filling in what's currently lines 39-42
[21:17:27] <sumanah> where I try to answer the standard questions: Components and modules—What are the major divisions between the application’s components and modules?
[21:17:27] <sumanah> Intermodule relationships—At a high level, how do different modules in the application communicate?
[21:17:27] <sumanah> Fundamental security expectations—What security expectations do legitimate users of this application have?
[21:17:27] <sumanah> Major trust boundaries—What are the major boundaries that enforce security expectations?
[21:17:34] <sumanah> woodruffw: ^ this is where I am
[21:17:43] <sumanah> in https://pad.sfconservancy.org/p/warehouse-security-model-nyu-talk-20191118 .... I know you're on a plane
[21:17:52] <techalchemy> sumanah, and this is largely to discuss the security of pypi itself?
[21:18:15] <sumanah> woodruffw: I can muddle through on my own mostly! but I welcome your brief notes
[21:19:12] <sumanah> techalchemy: "something that goes into the details of PyPI's security model would be great" is what I was asked for, and this is a class in application security. I figure that, along the way, I will need to talk briefly about the packaging system as a whole
[21:19:21] <sumanah> techalchemy: so, yes, I figure, security of PyPI itself
[21:19:25] <techalchemy> ah yeah that makes good sense
[21:22:34] <techalchemy> ok some of that is best left to someone who has more warehouse background than me :p
[21:22:47] <sumanah> :)
[21:23:07] <techalchemy> i probably have less technical knowledge about it than you do
[21:24:11] <sumanah> techalchemy: I think I'm getting hung up on "Components and modules" and then " Major trust boundaries—What are the major boundaries that enforce security expectations?"
[21:24:26] <techalchemy> the components and modules one is confusing to me also
[21:24:35] <techalchemy> that's why i started at the next one :p
[21:24:55] <sumanah> techalchemy: right :) I'm re-reading https://warehouse.readthedocs.io/application/ -- much of which I wrote!
[21:25:28] <techalchemy> I was like am I supposed to like talk about a UML component vs like a python module in the __init__ sense?
[21:26:35] <techalchemy> for security boundaries I would assume that would be across the API/database -- I haven't actually looked at authentication but presumably... well that I can actually glance over let me check
[21:26:57] <sumanah> Once I put a security hat on, I think it gets a little easier for me to think about the "components/modules" part of an appsec review. Basically: what are the different abstractions that, when you forget about them, can ruin your day?
[21:28:53] <techalchemy> db access, authentication, token generation, XSS, MITM attacks/poor SSL implementation, session hijacking (see XSS), file permissions, admin passwords /2FA enforcement
[21:32:15] <techalchemy> ddos mitigation, single points of failure (for anything, including user password compromise incidents, outages, etc), and then you get into things that there isn't a great answer to for anyone yet but has caused issues like malicious uploads, typosquatting (there's an email in admin@pypi's inbox about some research and API access for typosquatting)
[21:34:31] <techalchemy> maintainers handing over the keys to their very popular packages to a malicious actor, after a long convo I think we left a discussion about this with 'who do users trust? do they trust a package just because it came from pypi? do they trust it if it consistently comes from the same uploader (e.g. verify a gpg key of a repo or a user?) Do they trust it if it undergoes some scanning? Does someone have to read the code?'
[21:34:52] <techalchemy> problem with that is people reading code also isn't that good for catching exploits...
[21:35:46] <techalchemy> i'm far from a security expert despite that being my job so I always just phrase these things as questions :p
[21:36:45] <sumanah> techalchemy: I think maybe you were using this list as a response to my "what are the different abstractions that, when you forget about them, can ruin your day?" rhetorical question. Is that correct?
[21:36:58] <techalchemy> yep
[21:37:07] <sumanah> ah. ok. I see that my phrasing was a little broad.
[21:37:18] <techalchemy> i mean they are all relevant to pypi
[21:37:21] <sumanah> What I actually meant was:
[21:38:16] <sumanah> _in the section of this review where I am enumerating modules and components of Warehouse_, one way for me to think of and name "modules" or "components" is to think of the various abstractions that will bite me, as a Warehouse developer, if I forget to account for them when I write a new feature
[21:39:01] <techalchemy> meh talks always have to be so specific
[21:39:13] <sumanah> techalchemy: well, no, they don't, and I think maybe you're being playful? but I am not sure
[21:39:19] <techalchemy> yeah i am :p
[21:39:56] <techalchemy> i'm surprised you even caught that one it was subtle :)
[21:39:58] <sumanah> I am trying to create a talk where I spend 30 minutes giving some folks an overview of the application security of PyPI, because that's what the professor asked me for, and because that's going to fit reasonably well into the course of stuff that his students already know and what they're trying to learn
[21:40:53] <techalchemy> yeah it makes sense i just misunderstood how you wanted to present the info
[21:41:09] <sumanah> In the course of that, I may well _mention_ a lot of the things you listed, but they aren't components/modules, so as I'm trying to somewhat systematically think about PyPI security, at this very moment I was trying to work out the components/modules thing
[21:41:17] <sumanah> techalchemy: There are a lot of talks out there, in academia, in industry, in hobbyist talks, etc., that can be much broader! !!Con is a good spot for those, for instance
[21:41:51] <sumanah> cool, glad we understand each other a bit better
[21:43:11] <sumanah> I do think db access, auth, and token generation are among the abstractions I should mention
[21:43:15] <sumanah> thanks for listing those
[21:43:16] <techalchemy> sumanah, is one of the goals of your communication to show people kind of the underlying organization of the code itself to make it easier for them to understand how to actually contribute if they so choose?
[21:44:30] <sumanah> techalchemy: my primary goal right now is to just not seem like a waste of time ignorant fool. Second goal is to help them see how complicated it is when you only control a part of the ecology. I think like maybe a 5th goal, if I have time, is to encourage contribution
[21:44:49] <techalchemy> ok
[21:45:19] <sumanah> I am pretty concerned with that primary goal :( usually when I do public speaking it's about something I know more deeply
[21:46:03] <techalchemy> yeah I know that feeling for sure, I suspect you might know more about security than you think though
[22:29:21] <sumanah> ok I'm calling https://pad.sfconservancy.org/p/warehouse-security-model-nyu-talk-20191118 good enough
[22:31:17] <techalchemy> looks solid, and yeah the trust boundaries and the components/module are all things where if compromised it would be very bad
[22:32:00] <techalchemy> i.e. it's bad if someone can get a user's token, substantially worse if token generation itself is compromised
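The distinction techalchemy draws above can be sketched in a few lines. This is only an illustration under stated assumptions, not how Warehouse or PyPI actually issues tokens: `SIGNING_KEY`, `issue_token`, and `verify_token` are hypothetical names, and the token format (a username plus an HMAC-SHA256 signature) is invented for the example. A stolen token impersonates one user; a stolen signing key lets an attacker mint a valid token for anyone.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical server-side secret, standing in for "token generation".
SIGNING_KEY = secrets.token_bytes(32)


def issue_token(username: str, key: bytes) -> str:
    """Return 'username.signature', signing the username with HMAC-SHA256."""
    sig = hmac.new(key, username.encode(), hashlib.sha256).hexdigest()
    return f"{username}.{sig}"


def verify_token(token: str, key: bytes) -> Optional[str]:
    """Return the username if the signature is valid for this key, else None."""
    username, _, sig = token.rpartition(".")
    if not username:
        return None
    expected = hmac.new(key, username.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(sig.encode(), expected.encode()):
        return username
    return None


# Case 1: one user's token leaks. The attacker can act as that user only.
alice_token = issue_token("alice", SIGNING_KEY)
assert verify_token(alice_token, SIGNING_KEY) == "alice"

# Without the key, a guessed signature for another user is rejected.
assert verify_token("bob." + "0" * 64, SIGNING_KEY) is None

# Case 2: the signing key leaks. The attacker can mint a token for any user.
forged_for_bob = issue_token("bob", SIGNING_KEY)
assert verify_token(forged_for_bob, SIGNING_KEY) == "bob"
```

In this toy model the first failure is contained by revoking one token, while the second requires rotating the key and invalidating every token issued under it.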
[22:32:44] <techalchemy> i'm sure you'll do great it's a ton of info already and not a ton of time! good luck :)
[22:32:47] <sumanah> di_codes: EWDurbin: woodruffw: ^ please feel free to take a look at https://pad.sfconservancy.org/p/warehouse-security-model-nyu-talk-20191118 and expand it anytime between now and like 7pm ET
[22:36:56] <sumanah> or comment on anything I have gotten wrong