[05:23:42] <McSinyx[m]> pradyunsg: please see my proposal here (mainly to judge if the approach is ok and if the scope is fitting for gsoc): https://cloud.disroot.org/s/KYPEXYqakHSsezD
[05:24:14] <McSinyx[m]> i have not drafted the timeline due to the mentioned uncertainty
[06:25:52] <gutsytechster> devesh: I wanted to know what issues you are facing around debugging tests.
[06:26:45] <gutsytechster> I mean any, it could be lack of some documentation or where you think the structure could be better. Want to get your input!
[06:28:16] <devesh> Is this about the issue I raised
[07:00:36] <McSinyx[m]> devesh: I think gutsytechster's connection is not really good, and since the channel is logged, don't worry, they will reply when they can
[07:03:41] <McSinyx[m]> you're welcome *fly away to attend my online lecture*
[07:12:16] <gutsytechster> devesh: I apologize; maybe that, or some other issue you feel you are facing. I am writing a proposal to improve pip's test helpers, so I want to know if you are facing any problems with accessibility.
[07:16:36] <devesh> Yes, I was actually trying to write a unit test for one of the PR's I raised
[07:17:25] <devesh> https://github.com/pypa/pip/pull/7891, but I found that the existing codebase is not very readable or user-friendly
[07:17:39] <devesh> One of the issues I faced is the issue I messaged before
[07:19:57] <gutsytechster> Yeah, that I found. Great. If you feel any other problem, let me know. :)
[07:20:32] <devesh> Yeah, I am currently waiting on that response on the PR on how to write the unit test itself
[07:20:42] <devesh> Once I get it, and maybe contribute to more PRs
[07:20:56] <devesh> in the future, I might find other problems
[08:11:36] <travis-ci> pypa/pip#15245 (master - 33f8722 : Paul Moore): The build was broken.
[15:07:06] <McSinyx[m]> pradyunsg: I've updated the proposal with task-line (without specific timing): https://cloud.disroot.org/s/aFcZB3H59nHQgBc/download
[15:08:12] <McSinyx[m]> do you think the direction I'm heading is suitable?
[16:09:38] <travis-ci> pypa/pip#15250 (master - 4f64052 : Paul Moore): The build was fixed.
[16:54:32] <pradyunsg> McSinyx[m]: there's 2 things that would be useful to have, in such a proposal:
[16:57:37] <pradyunsg> 1. A clear idea of *how* the process would work (as part of a dependency resolution algorithm) theoretically.
[17:00:31] <McSinyx[m]> I'm listening (am looking for a pkgmgr that resolves as it fetches, but from my experience I don't think I've used one)
[17:04:53] <McSinyx[m]> as of (1), I think I can work on that
[17:09:51] <McSinyx[m]> btw, speaking of pip's dependency resolution, when will the legacy resolver be legacy (i.e. not used by default)? I see the WIP tests to use resolvelib but haven't found the timeline
[17:13:14] <sumanah> McSinyx[m]: hi -- that is a good question
[17:14:09] <sumanah> McSinyx[m]: this is going to depend on what we find as we do further automated testing, manual testing, beta testing by users, etc
[17:14:59] <McSinyx[m]> so iiuc, not a very near future (< 6 mo)
[17:15:14] <sumanah> McSinyx[m]: if everything goes amazing, then maybe the late 2020 release .... you see https://pip.pypa.io/en/stable/development/release-process/
[17:16:06] <sumanah> McSinyx[m]: you see that pip's release months are January, April, July, October.... ideally the July release will include the resolver as a new feature but we haven't yet decided whether the new resolver will be on by default in that release
[17:16:33] <sumanah> pradyunsg might have opinions on that (mentioning in case he has time)
[17:17:03] <McSinyx[m]> got it, so it's nearer than I thought
[17:17:36] <sumanah> McSinyx[m]: well, MIGHT be.... we may have to delay till October ... we are learning stuff now
[17:18:00] <sumanah> McSinyx[m]: the more we know earlier, the better we can forecast. And so this is why we're asking all the people who have heard about the new resolver to spread the word
[17:18:32] <sumanah> that way we will hear sooner from more users -- people signing up for the user experience studies and surveys, running "pip check", and so on
[17:19:39] <McSinyx[m]> about that, this sounds dumb, but is there any convenient equivalent of autoremove in pip?
[17:19:53] <sumanah> McSinyx[m]: like "uninstall unused packages"?
[17:20:03] <sumanah> I do not think that is a dumb question :-)
[17:20:21] <McSinyx[m]> e.g. if I want to remove all dependencies of a package to be uninstalled to install it again
[17:20:34] <McSinyx[m]> yes, uninstallation of unused ones
[17:21:51] <sumanah> so "unused" here would mean something like: look at all the packages installed, and filter out the packages the user has explicitly installed, plus dependencies that those require. Whatever is left, uninstall it
[17:21:54] <sumanah> something like that McSinyx[m] ?
[17:22:50] <McSinyx[m]> sumanah: basically that, and recursively
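The "autoremove" semantics sumanah sketches above (keep the explicitly installed packages plus their recursive dependencies, uninstall everything else) could be prototyped roughly as below. This is only a sketch: pip does not record which packages were installed explicitly, so the `explicit` set and the `requires` dependency mapping are assumed inputs, not real pip data.

```python
# Hypothetical sketch of "autoremove": pip does not actually track
# which packages were explicitly installed, so `explicit` and the
# `requires` dependency mapping are assumed inputs here.

def required_closure(roots, requires):
    """Every package reachable from *roots* through `requires`."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(requires.get(name, ()))
    return seen

def autoremovable(installed, explicit, requires):
    """Installed packages that no explicitly installed package needs."""
    return set(installed) - required_closure(explicit, requires)
```

For example, with `requires = {"app": ["lib"], "lib": ["util"]}`, the call `autoremovable({"app", "lib", "util", "orphan"}, {"app"}, requires)` leaves only `"orphan"`, matching the "recursively" point McSinyx[m] makes.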
[17:22:55] <sumanah> McSinyx[m]: you might like to skim https://wiki.python.org/psf/PackagingWG#Dependency_resolver_and_user_experience_improvements_for_pip which has a list of some bugfixes and features that the resolver will make feasible
[17:23:45] <sumanah> sorry I mean https://wiki.python.org/psf/Fundable%20Packaging%20Improvements#Finish_dependency_resolver_for_pip
[17:26:21] <McSinyx[m]> sumanah: I haven't been able to find the ``autoremove'' there but maybe because nobody ever asked for it (-:
[17:26:54] <sumanah> McSinyx[m]: I understand :-) I am not saying it's in that list .... I just have a feeling that the new resolver would be important to implement such a feature properly
[17:26:58] <sumanah> that's not a comprehensive list, I'm sure
[17:27:52] <McSinyx[m]> I'm still surprised that pip is lacking so many package manager features but most of us have been using it without complaining
[17:28:36] <sumanah> McSinyx[m]: As one of the people who hears the complaints, I'm not so sure about the last part of what you said :-)
[17:28:54] <McSinyx[m]> of course, a lot of issues are also not triaged; back to the reason why I'm here, I want to make sure that the new backtracking resolver will not be too slow
[17:29:01] <sumanah> Also, if you have a matrix/checklist of package manager features, I'd be interested in objectively comparing pip to other tools
[17:29:59] <sumanah> McSinyx[m]: do you have some performance benchmarks you would need pip to meet?
[17:30:02] <McSinyx[m]> sumanah: you (as in, the maintainers here) hear complaints from users, I only have conversations on social media and sometimes with my peers
[17:31:04] <sumanah> McSinyx[m]: so I would be hesitant to either say that most pip users complain or that they do not complain; I think neither you nor I would be able to confirm or refute that claim
[17:31:18] <sumanah> I totally get that it's your experience though
[17:31:31] <McSinyx[m]> many of which would depend on metadata of all packages to be provided by pypi https://wiki.archlinux.org/index.php/Pacman/Rosetta
[17:32:43] <sumanah> Thanks for the link. I'd have to dig in to see which of these are commonly implemented by package managers for programming languages/frameworks
[17:33:03] <McSinyx[m]> about benchmarks? probably not, but where I'm from, there's no mirror for pypi and even with the legacy resolver it's a pain in the back to wait for the downloads to finish
[17:33:36] <sumanah> McSinyx[m]: could you be more specific in terms of duration? how long does it take?
[17:34:07] <sumanah> also, are you within an organization that might benefit from using a local cache/proxy like bandersnatch? cooperlees here co-maintains that and can help you start getting set up
[17:34:34] <McSinyx[m]> I think I've asked here before but my use of words confused everyone, what's the current track of dependency metadata graph to be sync'ed between pypi and clients
[17:35:06] <cooperlees> hello ... around ... so shoot any messages if required
[17:35:25] <sumanah> btw, McSinyx[m], hi, I am Sumana Harihareswara and I'm one of the people working on pip right now -- not sure whether we have met before
[17:35:38] <McSinyx[m]> I'm just a student, and by nature I'm curious, so I install a lot of packages to play with for no reason, but it's good to know that a mirror(-like?) setup is possible
[17:36:44] <McSinyx[m]> cooperlees: hi; sumanah: I don't think so, this is my first time here, to discuss my plan for gsoc (otherwise I'd be having the discussion on github)
[17:36:47] <sumanah> I recently updated a lot of the items in https://packaging.python.org/key_projects/ in case you want to take a fresh look at what other tools are available to do various things
[17:37:20] <cooperlees> McSinyx[m]: Yeah it's a mirror of the static parts of PyPI.
[17:37:46] <cooperlees> Only JSON metadata + package objects. pip search etc. still have to talk to xmlrpc dynamic APIs at pipit.org
[17:38:05] <cooperlees> WE should own pipit.org tho :D
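The JSON metadata cooperlees mentions is per-package: PyPI's JSON API (`https://pypi.org/pypi/<name>/json`) exposes declared dependencies in the `requires_dist` field of `info`, but only when the uploader provided that metadata (it can be null), which is part of why a full dependency graph can't simply be synced to clients today. A minimal sketch, with the `fetch_info` helper being illustrative:

```python
# Sketch: reading a package's declared dependencies from PyPI's
# per-package JSON API. requires_dist may be null when the uploaded
# metadata lacks it, so a complete dependency graph isn't guaranteed.
import json
from urllib.request import urlopen

def fetch_info(name):
    # Illustrative helper; real code would want caching and timeouts.
    with urlopen("https://pypi.org/pypi/%s/json" % name) as resp:
        return json.load(resp)["info"]

def declared_deps(info):
    """Requirement strings declared by the package, or [] if unknown."""
    return info.get("requires_dist") or []
```

`declared_deps(fetch_info("requests"))` would return requirement strings like `"urllib3 (...)"`; the exact contents depend on the latest upload's metadata.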
[17:38:10] <sumanah> McSinyx[m]: your college/university/educational institution might be able to run a local mirror -- maybe a local student group, even
[17:38:31] <cooperlees> Only bummer these days is you'll need > 6.0T of storage for a full copy
[17:38:54] <cooperlees> There is filtering to help lower that, but the filtering isn't intelligent and is pretty human driven unfortunately
[17:39:04] <McSinyx[m]> sumanah: I'll try to ask my uni, although the internet there is not even good
[17:39:07] <sumanah> cooperlees: there are ways to only proxy the latest version of packages, right? like "not every single nightly release of big packages"
[17:39:16] <McSinyx[m]> cooperlees: just heard that yesterday, but bandwidth is more of a concern
[17:39:56] <sumanah> btw McSinyx[m] in case you are interested in initiatives to improve the internet at your university, it might be worth looking at https://innovationfund.comcast.com/ for funding
[17:40:25] <cooperlees> I added a JSON endpoint to /stats + you can use https://github.com/cooperlees/pypistats to generate a blacklist for the largest 100 packages.
[17:41:18] <sumanah> you asked a few minutes ago about plans for "dependency metadata graph to be sync'ed between pypi and clients" -- lemme get back to that question
[17:41:44] <McSinyx[m]> sumanah: thanks, I've never heard of that fund; yes please go on
[17:41:45] <sumanah> so, if I recall correctly, Poetry does this, but also this is one reason Poetry's initial load-up is slow
[17:42:34] <sumanah> McSinyx[m]: (that plus several other grants/funding programs are under-publicized in my opinion -- I blogged in December about several funding opportunities for open source https://www.harihareswara.net/sumana/2019/12/04/0 )
[17:43:00] <McSinyx[m]> cooperlees: I will look into that if I can convince the uni to provide the facility
[17:43:18] <McSinyx[m]> sumanah: could you please elaborate on the part about the load-up being slow?
[17:43:53] <cooperlees> Had planned to @ PyCon ...
[17:44:27] <sumanah> McSinyx[m]: (by the way, a university has many parts. you can look into a student-run group -- https://www.csua.berkeley.edu/ or https://www.ocf.berkeley.edu/ for example -- and not just central IT or a CS department)
[17:45:00] <McSinyx[m]> sumanah: my university, ehm, doesn't have that many parts
[17:45:15] <McSinyx[m]> we have like 500 students or so
[17:46:48] <sumanah> McSinyx[m]: I have not yet used Poetry, either. I am having trouble finding the place in their documentation or GitHub where I can point to the thing about initial load-up
[17:47:31] <sumanah> and I'm worried I'm saying something false - people should correct me
[17:47:34] <pradyunsg> 2. A clear idea of what doing this would solve, and how it matters. For example, we could gain a 2x speedup on average, but the download+dependency resolution code would be so complicated that it would be very difficult to make changes without introducing bugs - that would be a no-go for example.
[17:47:37] <sumanah> but as I recall: at some point early on in a user's usage of Poetry, it downloads a bunch of info from PyPI so it can later use it for internal dependency resolution
[17:47:55] <sumanah> pradyunsg: We see #2 but did you have a #1 point as well? because if so we missed it
[17:48:29] <McSinyx[m]> sumanah: it might be because it updates the package index every run (like dnf by default)
[17:48:59] <McSinyx[m]> pradyunsg: my oh my, I forgot to analyse that
[17:49:22] <McSinyx[m]> sumanah: it's about my gsoc ~~proposal~~ draft
[17:50:24] <pradyunsg> re resolver: I think it's really unlikely that we don't see the new resolver become the default this year (or first release 2021, worst case).
[17:51:12] <McSinyx[m]> pradyunsg: I think (2) would be clear after (1) is done
[17:51:18] <pradyunsg> sumanah: I sent that around 10:27pm my time (so 1 hour ago).
[17:52:19] <pradyunsg> 8:37:06 PM <McSinyx[m]> pradyunsg: I've updated the proposal with task-line (without specific timing): https://cloud.disroot.org/s/aFcZB3H59nHQgBc/download
[17:52:19] <pradyunsg> 10:24:32 PM <pradyunsg> McSinyx[m]: there's 2 things that would be useful to have, in such a proposal:
[17:52:19] <pradyunsg> 10:27:37 PM <pradyunsg> 1. A clear idea of *how* the process would work (as part of a dependency resolution algorithm) theoretically.
[17:52:42] <pradyunsg> That wasn't what I intended to do, but it's happened now.
[17:57:14] <sumanah> McSinyx[m]: I took a quick look at your draft proposal -- thanks for having so much of it ready for review :-)
[17:57:56] <McSinyx[m]> sumanah: I think in pradyunsg's opinion, there is not enough (right now i feel that it's too vague)
[17:58:02] <sumanah> btw pradyunsg I don't know whether I've ever shared with you my GSoC mentoring and management philosophy https://www.mediawiki.org/wiki/Summer_of_Code_2012/management#GSoC_management_philosophy
[17:58:35] <pradyunsg> sumanah: you had not, and I'm gonna read it *now*, before I look at my TODO list.
[17:58:51] <sumanah> McSinyx[m]: I can understand that. Sure. But I also want to explicitly point out: having this early not-yet-specific-enough draft helps us review and consider it and talk about it with you ....
[17:59:12] <sumanah> and there are SO MANY applicants who do not show us something like this early enough that we can help give useful comments. So, thanks
[17:59:47] <McSinyx[m]> sumanah: meanwhile I'm a bit nervous because there're only 5 days left
[18:00:06] <sumanah> so I have a few general suggestions for you McSinyx[m]
[18:00:26] <pradyunsg> sumanah: thanks for sharing!
[18:00:39] <sumanah> one is: can you give us some contingency plans? Like, in case you get really bogged down in one of the early steps, what would you cut from your schedule?
[18:01:48] <McSinyx[m]> I don't think I understand your question, please rephrase
[18:01:49] <sumanah> McSinyx[m]: another is: do some benchmarking, or look around for statistics and benchmarking other people have done (in blog posts, slides for talks, etc.) so you can make a better case for the parallelization idea
[18:02:01] <McSinyx[m]> I'm sorry english is not my native language
[18:02:06] <sumanah> re: contingency plans: give us a Plan A, and then a Plan B in case there is a problem with your Plan A
[18:02:20] <sumanah> McSinyx[m]: your English is WAY better than any of my non-English languages so I'm totally happy to rephrase :-)
[18:05:01] <McSinyx[m]> about the benchmarking thing, this is particularly hard for me to find references for, since pip's approach (resolve-fetch-resolve-...) is quite rare, and users don't... wait, do we have the upload time data available from PyPI
[18:05:05] <McSinyx[m]> cooperlees: do we have the data on download time per session from pypi available?
[18:05:46] <sumanah> what an interesting question! Hmmm.
[18:06:03] <McSinyx[m]> re contingency plan: if the refactoring does not work out, the path I'm more confident in (but would basically touch 30% of pip's code base) is to use asyncio
[18:06:33] <sumanah> McSinyx[m]: so that raises the question: what can you actually accomplish in 12 weeks?
[18:07:38] <sumanah> to me, a good GSoC proposal is _small_ in that it only tries to do, like, 6 weeks of feature coding work, because there will also be 6 weeks of revision, responding to code review, research, documentation, and "oh wait I also need to implement foo and bar"
[18:07:50] <McSinyx[m]> **if** things go as planned, I believe I can handle the refactor (both network and resolve-related part) in 4 weeks
[18:08:38] <sumanah> McSinyx[m]: if you haven't already, you might want to explore https://packaging.python.org/guides/analyzing-pypi-package-downloads/ to see what stats are available -- I doubt that download duration is in there, though
[18:09:17] <pradyunsg> McSinyx[m]: we can't use asyncio since that'll need having a "python3 only" code in pip. :/
[18:10:13] <McSinyx[m]> sumanah: if there's session information then it'd be really helpful, I need to take a look later
[18:10:59] <sumanah> McSinyx[m]: there's a tiiiiiiiny possibility that EWDurbin, the Director of Infrastructure at the PSF, would be able to grab some session duration info for you
[18:11:22] <pradyunsg> McSinyx[m]: all of pip's code needs to be parsable as valid Python 2 code, even if it's not run on Python 2.
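Since `async`/`await` syntax would break the parse-as-Python-2 constraint pradyunsg describes, the usual workaround for parallel downloads is thread-based concurrency. A rough sketch follows; the `fetch` body is a placeholder for a real network call, and `concurrent.futures` was available on Python 2 via the `futures` backport:

```python
# Sketch: parallel downloads without any Python-3-only syntax, so the
# file still *parses* under Python 2. The real download logic would
# replace the placeholder fetch().
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Placeholder for the actual network call.
    return url, len(url)

def fetch_all(urls, workers=4):
    # pool.map runs fetches concurrently but preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))
```

This keeps the download code a drop-in synchronous-looking function, which is one reason a thread pool is a lower-risk contingency than the asyncio rewrite discussed above.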
[18:12:08] <McSinyx[m]> I need to go to bed now, I'll continue the discussion tomorrow
[18:13:14] <sumanah> EWDurbin: in order to work out the speed increase that we could implement with parallelization in pip downloads, McSinyx[m] could use download time per session data from PyPI
[18:13:42] <EWDurbin> i'm almost certain we don't have that available. just backend response time
[18:15:54] <pradyunsg> McSinyx[m]: sure! good night! :)
[19:08:17] <sumanah> btw in case anyone needs chill tunes today I'm listening to https://www.arte.tv/en/videos/081858-005-A/passengers-apparat/
[19:08:39] <sumanah> (grateful right now for https://www.npr.org/2020/03/17/816504058/a-list-of-live-virtual-concerts-to-watch-during-the-coronavirus-shutdown and https://www.stayathomefest.com/ )
[19:09:24] <pradyunsg> https://www.stayathomefest.com/ is really nice!
[19:42:35] <cooperlees> McSinyx[m]: Huh? What time metric are you after - Average download connection length for client?
[19:42:35] <sumanah> techalchemy: need any poking or are you pretty much on task? :-)
[20:42:56] <sumanah> techalchemy: actually, one more thing before I go. I figure you will want to publish a pre-release, get some testing from users, and THEN push the canonical release. In my experience, giving users a LIST of workflows to test is really helpful
[20:43:14] <sumanah> giving them the recipe to follow, you know?
[20:43:34] <sumanah> techalchemy: so I just edited the tracking issue and posted a comment https://github.com/pypa/pipenv/issues/3369#issuecomment-604674479 asking people to add suggested workflows to test to an Etherpad I made
[20:44:00] <sumanah> another way people can help while waiting for you to push the pre-release.
[20:44:01] <techalchemy> e.g. install, remove environment, recreate, re-lock, try different python versions
[20:45:35] <sumanah> right! but even more specific than that. So, I figure other people can do some of that while you work on merging that branch, plus the bleach PR for vistir if necessary
[20:45:48] <sumanah> techalchemy: hope that helps.