PMXBOT Log file Viewer

#pypa logs for Friday the 21st of August, 2020

[13:44:37] <famubu> Can anyone tell me the names of a few projects that use flit for packaging? I couldn't find any on googling.
[13:46:06] <astronavt> good question famubu. i haven't even tried flit; poetry seems to have won the popularity contest
[13:59:02] <tos9> famubu: https://github.com/search?q=%5Btool.flit.metadata%5D&type=Code
[14:06:44] <abn> Filtering out some of the noise https://github.com/search?l=&p=3&q=%22%5Btool.flit.metadata%5D%22+-path%3Atests%2F+filename%3Apyproject.toml&ref=advsearch&type=Code
[14:13:28] <famubu> astronavt: Maybe because poetry can do anything flit accomplishes...
[14:13:46] <famubu> tos9 and abn: Thanks for the links!
[14:16:16] <famubu> But I couldn't recognize any of the projects at a quick glance. I had at least heard of some projects using poetry. But with flit, I've only seen its usage in tutorials.
[14:17:28] <abn> guess i'd share this here as well; gathering some input regarding PEP 621 dependency specification at https://forms.gle/Qmy1a5bD8fKNYKtp9; it would be great to get a wider audience participating
[14:18:13] <abn> famubu: you can also do the same search but with tool.poetry to get an idea of the difference I guess
[14:22:40] <famubu> abn: Yeah, I had done that by modifying the link. But I think there is no option to sort the results by most cloned or by repo with most commits or anything. Even for poetry I couldn't find familiar names... But it may be that I haven't spent enough time with Python yet. :-)
[14:22:45] <astronavt> why is google docs blocked on my work vpn
[14:22:46] <astronavt> wtf
[14:23:58] <abn> haha; that is new :)
[14:24:23] <abn> i'd be even more surprised if you use google docs for work too
[14:24:30] <famubu> astronavt: A lot of people share your feelings about that during the corona times...
[14:24:43] <astronavt> some orgs use exclusively google docs, but at an institutional level through g-suite
[14:25:09] <abn> custom domains?
[14:26:46] <astronavt> in bigger ones yes
[14:26:56] <astronavt> google has enterprise accounts
[14:27:13] <astronavt> anyway, what is this pep for? a tool-agnostic dependency spec in pyproject.toml?
[14:33:01] <abn> Well, 621 deals with metadata in general; but the discourse is specific to how dependencies get listed
[14:33:22] <astronavt> cool, i will have to sign up for the forum and get involved
[14:33:26] <astronavt> much better than a mailing list :D
[14:34:05] <abn> This is the specific open issue - https://www.python.org/dev/peps/pep-0621/#how-to-specify-dependencies
[14:35:22] <nanonyme> Hey, have you considered adding https://pypi.org/pypi/<project_name>/releases/json/ to https://warehouse.pypa.io/api-reference/json/#json-api so you could query just the releases?
[14:37:23] <nanonyme> The response is quite massive for the API endpoint that is currently provided. Also maybe pagination for releases
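For context, this is roughly what a client has to do against the documented JSON API today: fetch the entire per-project document and keep only the release keys. A minimal sketch (the helper name is just for illustration):

```python
# Sketch only: the per-project JSON API returns metadata plus every file of
# every release in one document, so "just the releases" currently means
# fetching it all and keeping only the keys of "releases".
import json
from urllib.request import urlopen

def list_releases(project):
    # Documented endpoint: https://warehouse.pypa.io/api-reference/json/
    with urlopen("https://pypi.org/pypi/{}/json".format(project)) as resp:
        data = json.load(resp)
    return sorted(data["releases"])

print(list_releases("virtualenv"))
```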
[15:35:20] <abn> @nanonyme https://pypi.org/rss/project/virtualenv/releases.xml maybe helps?
[16:22:51] <nanonyme> abn, sort of. I was hoping for a JSON API though
[22:33:34] <abn> Is it possible for bandersnatch to run in a "metadata-only" mode; ie. not sync artifacts?
[22:39:54] <cooperlees> abn: Not today, but I'd accept the PR
[22:40:07] <cooperlees> abn: Do you only want the JSON files saved to disk?
[22:40:35] <abn> I want to try to create a metadata mirror to expose a GraphQL API; figuring out how best to expose this.
[22:40:58] <abn> One thought was to get bandersnatch to mirror it and wrap the graphql around the storage backend.
[22:41:17] <abn> So, yes. Only JSON.
[22:41:42] <abn> cooperlees: I am happy to do the work anyway if you think that is useful
[22:43:27] <cooperlees> abn: I feel the best way is to add this endpoint to warehouse
[22:43:37] <cooperlees> Where you could cache live data out of the real PyPI database
[22:43:57] <cooperlees> But this is going to need PEPs and lots of fun
[22:43:58] <abn> That would be awesome; but is that something that the team is open to?
[22:44:12] <cooperlees> I've personally tried to replace the xmlrpc API and failed
[22:44:28] <cooperlees> I'm meant to finish a PEP for the current JSON API, need to do that this weekend
[22:44:29] <abn> What were the main reasons for that?
[22:45:00] <cooperlees> abn: I'd open an issue on the warehouse GitHub and state your reasons for wanting GraphQL particularly
[22:45:56] <abn> ack; will write that up next week I guess.
[22:46:11] <abn> cooperlees: do you know what the current data size for the metadata is by any chance?
[22:47:15] <cooperlees> abn: No idea sorry - But the PR to bandersnatch wouldn't be very many lines
[22:47:33] <abn> cooperlees: regarding "tried to replace the xmlrpc API and failed", can you point me to any relevant discussions on this?
[22:47:39] <cooperlees> I'm going to say 100s of MBs
[22:47:49] <cooperlees> abn: https://github.com/pypa/warehouse/issues/284
[22:47:59] <abn> cheers!
[22:48:15] <cooperlees> asmacdo had a working POC
[22:48:32] <abn> I was looking at one of the big ones; botocore - it had around 2.4 megs. :(
[22:48:58] <cooperlees> gzip is your friend
[22:49:20] <abn> heh; yeah I was more wondering uncompressed
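A quick way to see how much gzip helps for one project's metadata (botocore was the example above); the exact numbers drift as new releases are uploaded:

```python
# Compare raw vs. gzip-compressed size of one project's JSON metadata.
import gzip
from urllib.request import urlopen

with urlopen("https://pypi.org/pypi/botocore/json") as resp:
    raw = resp.read()

print("uncompressed: %d KiB" % (len(raw) // 1024))
print("gzipped:      %d KiB" % (len(gzip.compress(raw)) // 1024))
```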
[22:49:21] <cooperlees> we had a POC client here too: https://github.com/cooperlees/pypi-api
[22:49:53] <cooperlees> Goal was to make the API more CDN friendly
[22:50:55] <cooperlees> So due to that, GraphQL was ruled out
[22:50:57] <abn> REST works well for that I guess.
[22:51:06] <cooperlees> Ya
[22:51:27] <abn> But for read operations it makes more sense, when for example certain clients only want a limited subset etc.
[22:51:52] <abn> I would not expect it to be used for mutations etc though.
[22:52:11] <abn> But subscriptions + queries, it would be great.
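Purely as an illustration of the limited-subset point: with a graph-style endpoint a resolver could ask for only the fields it needs. Both the URL and the schema below are hypothetical; warehouse does not expose anything like this.

```python
# Hypothetical example: neither the /graphql endpoint nor this schema
# (project / releases / requiresDist fields) exists on PyPI today.
import json
from urllib.request import Request, urlopen

QUERY = """
{
  project(name: "botocore") {
    releases(last: 1) {
      version
      requiresPython
      requiresDist
    }
  }
}
"""

req = Request(
    "https://pypi.example/graphql",  # made-up endpoint for illustration
    data=json.dumps({"query": QUERY}).encode(),
    headers={"Content-Type": "application/json"},
)
with urlopen(req) as resp:
    print(json.load(resp))
```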
[22:53:02] <cooperlees> Here was the PR - https://github.com/pypa/warehouse/pull/4078
[22:53:52] <cooperlees> If you wanted to rebirth that and get test coverage it might be accepted
[22:54:14] <cooperlees> asmacdo and I were not keen on doing the 100% test coverage when people were not 100% committed to deploying it
[22:54:20] <abn> I can have a look to see how much work it's going to be; and if I really agree with the design :)
[22:54:35] <abn> haha; I can imagine.
[22:56:35] <cooperlees> abn: anyways - #bandersnatch exists
[22:56:56] <cooperlees> If you want to go down that route for metadata only, feel free to open an issue and I can state how I'd implement it
[22:57:03] <abn> ah cool will join there too
[22:57:11] <abn> I think that might be the first stop for me anyway
[22:57:12] <cooperlees> config and then just use that bool to turn off package and simple HTML generation for the packages
[22:57:29] <cooperlees> And only save the JSON
[22:57:42] <abn> sounds straightforward ... and so the famous last words were spoken
[22:58:11] <cooperlees> I wouldn't imagine this to be too hard. It might just be some passing of the variable through to the right parts, and then the unit tests might be a little painful
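As a toy model of the change being discussed (not bandersnatch's actual internals, and the flag name is made up): a single config-driven boolean skips artifact downloads and simple-index generation while the JSON metadata is still written.

```python
# Toy sketch of a "metadata-only" sync: always write the JSON metadata,
# and only fetch release files / rebuild the simple index when the
# (hypothetical) release_files flag is on.
import json
import pathlib
from urllib.request import urlopen

MIRROR_DIR = pathlib.Path("mirror")

def sync_package(project, release_files=True):
    with urlopen("https://pypi.org/pypi/{}/json".format(project)) as resp:
        metadata = json.load(resp)

    json_path = MIRROR_DIR / "json" / project
    json_path.parent.mkdir(parents=True, exist_ok=True)
    json_path.write_text(json.dumps(metadata))

    if release_files:
        # A real mirror would download every file listed in the metadata
        # and regenerate the /simple/ index here; skipped in metadata-only mode.
        pass

sync_package("virtualenv", release_files=False)
```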
[22:59:22] <abn> Yeah; considering there are filters etc. already I assumed as much.
[23:02:11] <abn> cooperlees: one last silly question; is there a pypi read-only replica of the postgres database somewhere?
[23:03:08] <cooperlees> abn: not that I know of
[23:03:35] <abn> cooperlees: suspected as much; appreciate your help though.
[23:04:05] <cooperlees> It's the internet, it would just get abused
[23:04:25] <cooperlees> Especially as we'd need it in a postgres cluster to keep it up to date
[23:04:49] <abn> ...
[23:04:57] <cooperlees> And it has credentials and token data - So that couldn't be public - I don't think it's partitioned - I could be wrong
[23:05:26] <abn> I guess it was never designed to be exposed either.
[23:05:39] <cooperlees> Sure wasn't - that's what the APIs in front of it are for :)
[23:05:42] <abn> It makes sense it is not exposed, I was trying to be an optimist.
[23:05:58] <cooperlees> What you trying to do? What do you want your new API for?
[23:07:40] <abn> Part of exploring how to get a graph API up; having data access would bypass the need for an endpoint in warehouse. The problem with working within the application is I will have to write all the resolvers etc. Access to data means I could potentially leverage something like http://hasura.io/ ... but mostly just thinking out loud right now.
[23:08:55] <abn> The need for the endpoint, at least for me, is driven by tools like poetry needing to fetch metadata when doing resolution; it's not a silver bullet but it helps towards making things consume less data.
[23:09:27] <cooperlees> The problem there is why poetry has a thought time, Python's / PyPI's metadata is not clean + enforced well.
[23:09:32] <cooperlees> *tough
[23:09:42] <abn> Yeah; but it's the first step though.
[23:09:59] <abn> Poetry still downloads the artifacts if data is ... not reliable.
[23:10:16] <abn> In order to do the inspection.
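For the resolution use case: the declared dependencies are already in the per-release JSON document as requires_dist, but the field can be null or incomplete, which is when a resolver falls back to downloading and inspecting the artifact. A small sketch against the real per-release endpoint:

```python
# Fetch the declared dependencies for one release from the JSON API.
# info["requires_dist"] may be None when a project's metadata is missing
# or unreliable; that is the case where a resolver has to download the
# sdist/wheel and inspect it instead.
import json
from urllib.request import urlopen

def requires_dist(project, version):
    url = "https://pypi.org/pypi/{}/{}/json".format(project, version)
    with urlopen(url) as resp:
        info = json.load(resp)["info"]
    return info.get("requires_dist") or []

for requirement in requires_dist("requests", "2.24.0"):
    print(requirement)
```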
[23:10:19] <cooperlees> I don't think GraphQL is going to fix the data. The first step is cleaning up the upload and enforcing more metadata validation + PEPs to back that up
[23:10:29] <abn> Oh I agree.
[23:10:48] <abn> The GraphQL bit is to reduce the API calls and data transferred for the metadata part.
[23:10:55] <abn> It is mostly an experiment.
[23:11:27] <abn> And if that works, next step is some sort of caching service and then federating two graph endpoints etc.
[23:11:54] <abn> The metadata validation problem is not going away anytime soon though, because retroactively fixing the metadata won't happen.
[23:12:04] <cooperlees> Cool - Yeah I mean by all means POC it :) Could be useful. But then making it the default etc. will be a lot more work
[23:12:36] <abn> Who knows ... it might all go out in a blaze of glory! ... or well not so much glory.
[23:15:11] <abn> I am also curious to see if something like that becomes available; how many people would have a use for it.
[23:15:35] <abn> Easier to give folks an endpoint to try out than ask.
[23:21:51] <abn> cooperlees: https://github.com/pypa/bandersnatch/issues/665
[23:25:44] <cooperlees> Cool - Totally will accept - if i get time over the weekend I'll write how I would do it and if it's super easy I'll just do it
[23:25:49] <cooperlees> If you beat me to it, so be it.
[23:26:12] <abn> I would appreciate that. I might not get to it till Sunday anyway.