[15:54:23] <dstufft> carljm: I wonder if the answer isn't to rip out pip's custom caching stuff and just use an HTTP cache that understands Cache-Control headers
[15:54:32] <dstufft> then we'll get caching of the index pages too
[18:49:01] <yusuket> dstufft: can you look at https://github.com/pypa/warehouse/pull/299 when you have a chance? I believe I’ve fixed all the outstanding issues
[18:56:01] <dstufft> yusuket: yea i'll take a look later today
[18:56:28] <dstufft> ~850 lines of diff means it takes more than a few minutes to thoroughly review it
[18:57:06] <yusuket> dstufft: awesome, thanks! Yeah sorry about the large diff, I wanted to port it all while I had the context in my head
[23:54:16] <dstufft> Ivo: When an HTTP server serves a response, it can include an ETag header, which is essentially a unique value for that response. When you go to make a request against that URL and you have a response cached, you can add an If-None-Match: <stored-etag-header> header. When a webserver sees an If-None-Match header, it compares it to the ETag it has for the URL, and if it matches return a 304 response with an empty body, if it doesn't match return a
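The ETag / If-None-Match exchange described above can be sketched without touching the network. This is a toy model, not pip's or any real server's code: `serve`, `fetch`, `make_etag`, `RESOURCE`, and the cache layout are all hypothetical names chosen for illustration, and using a truncated SHA-256 digest as the ETag is just one possible choice.

```python
import hashlib

RESOURCE = b"<html>simple index page</html>"  # hypothetical payload


def make_etag(body: bytes) -> str:
    # Assumption: derive the ETag from a hash of the body. Any value that
    # changes whenever the body changes would serve the same purpose.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def serve(request_headers: dict, body: bytes = RESOURCE):
    """Toy server: return (status, headers, body) for a GET request."""
    etag = make_etag(body)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: 304 with an empty body.
        return 304, {"ETag": etag}, b""
    # No validator sent, or it is stale: send the full response.
    return 200, {"ETag": etag}, body


def fetch(cache: dict, url: str, body: bytes = RESOURCE):
    """Toy client: revalidate the cached entry for url if there is one."""
    headers = {}
    if url in cache:
        headers["If-None-Match"] = cache[url]["etag"]
    status, resp_headers, resp_body = serve(headers, body)
    if status == 304:
        # Nothing was transferred; reuse the stored body.
        return status, cache[url]["body"]
    cache[url] = {"etag": resp_headers["ETag"], "body": resp_body}
    return status, resp_body
```

The first `fetch` for a URL gets a 200 and stores the ETag; a repeat `fetch` sends the stored value and gets a 304, so the body is served from the cache. If the resource changes on the server, the ETag no longer matches and the client gets a fresh 200.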
[23:54:23] <Ivo> dstufft: I haven't looked into HTTP's caching mechanism in any great depth before, but if there is some sensible replacement for hashes, then sure
[23:55:27] <Ivo> the main thrust of the suggestion was to use HEAD to find a unique identifier for the thing we intend to download, instead of GET, was all
[23:55:59] <dstufft> I'm not sure what your suggestion was then :)
[23:57:33] <Ivo> dstufft: you mention "which means that the entire file is downloaded prior to pip getting control of the Response object" which I take means we're sending GET request first, which could be replaced with an initial HEAD for efficiency (in case we don't have to download it).
[23:58:30] <Ivo> you're the one that asked for opinions, you didn't say whether they had to be well informed or not!
[23:59:42] <dstufft> requests doesn't download the body of the response until you consume it via response.iter_content() (resp.content calls iter_content() internally). With pip we use that to implement both download progress bars (because we can download "chunks" of the response body instead of the whole thing in one shot, allowing us to update the progress bar between chunks) and because we can implement our own iter_content (to work around some content-encoding
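The chunked download with progress updates described above might look roughly like the sketch below. It is not pip's implementation. `download` and `FakeResponse` are hypothetical names: `FakeResponse` is an offline stand-in that mimics only the `headers` and `iter_content(chunk_size=...)` parts of a `requests.Response`, so the example runs without a network. With the real library, the assumption is that the response comes from `requests.get(url, stream=True)`, since requests only defers the body download when streaming is requested.

```python
import io


class FakeResponse:
    """Offline stand-in for a streamed requests.Response (hypothetical)."""

    def __init__(self, body: bytes):
        self._body = io.BytesIO(body)
        self.headers = {"Content-Length": str(len(body))}

    def iter_content(self, chunk_size=1):
        # Yield the body piece by piece, like a streamed response would.
        while True:
            chunk = self._body.read(chunk_size)
            if not chunk:
                break
            yield chunk


def download(resp, out, chunk_size=8192, on_progress=None):
    """Copy resp's body to out chunk by chunk, reporting progress."""
    # Content-Length may be absent; 0 here means "total size unknown".
    total = int(resp.headers.get("Content-Length", 0))
    done = 0
    for chunk in resp.iter_content(chunk_size=chunk_size):
        out.write(chunk)
        done += len(chunk)
        if on_progress is not None:
            # Called between chunks, which is where a progress bar updates.
            on_progress(done, total)
    return done
```

A caller would pass a callback such as `lambda done, total: print(done, total)` as `on_progress`; because it runs once per chunk, a smaller `chunk_size` gives finer-grained progress at the cost of more iterations.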