
#pypa-dev logs for Tuesday the 22nd of April, 2014

[00:00:19] <Ivo> but thas' coo'
[00:00:25] <dstufft> However CacheControl sits between requests and pip and will attempt to use a cached response (or a conditional GET if that fails) if possible, but in order to cache the response it has to consume the response
[00:00:54] <Ivo> CacheControl sounds important and formal, like it's working for NASA or smth
[00:00:58] <agronholm> it could just "tee" the data stream
[00:01:23] <agronholm> or provide a progress callback
[00:02:08] <dstufft> so it consumes the entire response body before passing it back to pip, which means the file has already been downloaded before pip gets access to the response, so our progress bar goes from 0 to 100% instantly because we're iterating over something in memory instead of chunks we're pulling off the wire
[00:02:45] <dstufft> There are a few ways we can solve it, the simplest way being "remove progress bars" because then it doesn't really matter if CacheControl downloads the entire file before giving it to pip or not
[00:03:11] <agronholm> I would prefer keeping the progress bars if it doesn't pose any insurmountable problems
[00:03:42] <agronholm> there are some rather large distributions (PySide?) and I would rather see how the download is going
[00:04:30] <Ivo> i don't really care about progress bars, but if you were pulling something from HDD cache I'd think you just wouldn't display it
[00:04:59] <dstufft> the more complicated ones are: send a patch to CacheControl so that the "store the response in the cache" step happens by wrapping iter_content and using that as a hook for "ok, the response has been downloaded, now save it to the cache", or implement our own Response class which has a callback that gets fired after each chunk in iter_content
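A minimal sketch of the second option dstufft describes above: a response wrapper that fires a callback per chunk and only caches once the body has been fully consumed. The class and parameter names are illustrative assumptions, not CacheControl's or pip's actual API.

    # Hypothetical sketch -- names are made up, not CacheControl/pip API.
    class CallbackResponse:
        """Wraps a requests Response and fires callbacks while it streams."""

        def __init__(self, response, on_chunk=None, on_complete=None):
            self._response = response
            self._on_chunk = on_chunk          # e.g. advance a progress bar
            self._on_complete = on_complete    # e.g. write the body to the cache

        def iter_content(self, chunk_size=4096):
            buffered = []
            for chunk in self._response.iter_content(chunk_size):
                buffered.append(chunk)
                if self._on_chunk is not None:
                    self._on_chunk(chunk)      # progress stays chunk-by-chunk
                yield chunk
            if self._on_complete is not None:
                self._on_complete(b"".join(buffered))  # cache only after the last chunk

In that arrangement pip's progress bar would hook into on_chunk while CacheControl's store step would hook into on_complete.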
[00:05:21] <dstufft> agronholm: can you comment to that effect on https://github.com/pypa/pip/issues/1732
[00:05:26] <agronholm> sure
[00:06:56] <dstufft> Ivo: this affects cached or not, but if we keep the progress bars the progress bars will be near instant for files we have cached already
[00:07:10] <Ivo> maybe we can just reimplement cachecontrol ourselves? Seems like we only need some very simple functionality?
[00:07:31] <Ivo> dstufft: ...so?
[00:07:33] <dstufft> we already reimplement parts of cachecontrol ourselves, and we do a bad job at it :)
[00:07:46] <dstufft> in fact we reimplement parts of it twice
[00:07:56] <Ivo> instant progress bars, who the fuck is complaining about instant progress bars, lemme at 'em
[00:08:24] <dstufft> Ivo: oh no one is complaining, i'm just saying there's no reason to special case progress bars when pulling from the cache
[00:08:31] <dstufft> they'll just be near instant
[00:09:00] <dstufft> and if we keep them for when we don't hit the cache, it's no additional work to keep them for when we do
[01:00:10] <Ivo> dstufft: if you're still there, care to share an opinion on https://github.com/pypa/pip/pull/1718 (what option could be called / look like?)
[01:30:05] <Ivo> fuck multipurpose code
[16:15:58] <carljm> dstufft: caching of the index pages is a significant change in the behavior of the download cache, because it means that depending on the index's server config you could fail to see newly-released packages (most likely to bite people when trying to test installation of something they just released)
[16:16:42] <carljm> I guess the most important question is what cache headers PyPI serves (though the same issue could bite people with their own local indexes too, just not nearly as many people)
[16:17:04] <carljm> in any case, I think it's a net win, should just be aware of the potential consequences
[16:52:07] <dstufft> carljm: yea
[16:52:27] <dstufft> Ideally the answer to that is "fix your cache headers"
[16:53:21] <dstufft> carljm: I'm exploring CacheControl at the moment, it wraps requests and just handles the caching portion of stuff, including doing a conditional GET if the cache has expired but it still has the stale cache item around, to try and refresh the cache without downloading the response body again
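For context, CacheControl's documented entry point wraps a plain requests session, roughly as below; the cache directory and URL are placeholders, not anything pip actually uses.

    import requests
    from cachecontrol import CacheControl
    from cachecontrol.caches import FileCache  # file-based cache backend

    # Wrap a normal requests session; CacheControl then honours Cache-Control
    # headers and issues a conditional GET (If-None-Match / If-Modified-Since)
    # when a cached entry has gone stale, refreshing it without re-downloading
    # the body if the server answers 304 Not Modified.
    sess = CacheControl(requests.Session(), cache=FileCache(".web_cache"))
    resp = sess.get("https://pypi.org/simple/")  # placeholder URL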
[16:56:59] <DanielHolth> i'm surprised the progress bar would automatically go, wouldn't the cache just return a proxy object to the base request object?
[16:57:29] <DanielHolth> however most python dists are quite small.
[17:15:57] <dstufft> DanielHolth: what do you mean "would automatically go" ?
[17:22:19] <DanielHolth> there was something on the bugtracker about losing progress when using a cache wrapper.
[17:23:33] <dstufft> ah
[17:23:41] <dstufft> right now it'll lose the progress tracker
[17:23:53] <DanielHolth> I suppose a lot of pip users don't know that it has progress bars
[17:23:56] <dstufft> because in order to cache the response, it has to consume the response
[17:24:20] <dstufft> I'm planning to submit a PR to CacheControl to make it hook into the normal method of consuming the response
[17:24:29] <DanielHolth> I'd expect to get a wrapped response object that saves itself as a side effect of reading the last byte
[17:24:38] <dstufft> so that it lazily caches once the object has been consumed
[17:24:39] <dstufft> yea
[17:24:41] <DanielHolth> OK
[17:24:44] <DanielHolth> probably easy
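A hedged sketch of the "saves itself as a side effect of reading the last byte" idea DanielHolth describes; the class and argument names are hypothetical.

    # Hypothetical sketch of lazy caching on full consumption.
    class LazyCachingBody:
        """File-like wrapper that persists the body to a cache once fully read."""

        def __init__(self, raw, save_to_cache):
            self._raw = raw                # e.g. response.raw from requests
            self._save = save_to_cache     # callable receiving the complete body
            self._chunks = []
            self._saved = False

        def read(self, amt=None):
            data = self._raw.read(amt)
            if data:
                self._chunks.append(data)
            elif not self._saved:          # EOF: lazily cache as a side effect
                self._save(b"".join(self._chunks))
                self._saved = True
            return data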
[17:24:56] <DanielHolth> then we can have urwid-pip
[17:25:18] <dstufft> Yea, I submitted a few other PRs yesterday that re-architected CacheControl a little bit to make it fit nicer with what pip does with requests
[17:25:27] <dstufft> and to fix a few security things
[17:25:37] <DanielHolth> is it just something like response.readlines()?
[17:26:20] <dstufft> is what? what pip does?
[17:26:37] <dstufft> it's response.raw.read(4096) more or less
[17:26:39] <dstufft> in a loop
[17:27:28] <dstufft> with some other tinkering to override some requests behavior that they don't have a saner way to mess with yet
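Roughly the loop dstufft is describing, not pip's actual implementation; the helper name and progress callback are assumptions.

    # Rough sketch of a chunked download loop with a progress callback.
    def download(response, fp, report_progress=None):
        # Pull the body off the wire in 4096-byte chunks so a progress bar
        # can advance as bytes arrive, rather than after the whole file has
        # already been buffered in memory.
        total = int(response.headers.get("Content-Length", 0)) or None
        downloaded = 0
        while True:
            chunk = response.raw.read(4096)
            if not chunk:
                break
            fp.write(chunk)
            downloaded += len(chunk)
            if report_progress is not None:
                report_progress(downloaded, total)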
[17:30:01] <DanielHolth> eventually we'll just vendor all of pypi in pip, solving the problem
[17:30:09] <dstufft> heh
[17:30:14] <dstufft> sooner or later!
[18:21:57] <yusuket> dstufft: I know you’re busy tackling caching in pip, but did you have a chance to look at my change?
[18:22:30] <dstufft> yusuket: oof I forgot, I'll do it in a bit, i'm working on some non-packaging stuff for work atm
[18:22:42] <yusuket> ok sounds good