pmxbot IRC Log Viewer

[15:54:23] <dstufft> carljm: I wonder if the answer isn't to rip out pip's custom caching stuff and just use an HTTP cache that understands Cache-Control headers

[15:54:32] <dstufft> then we'll get caching of the index pages too

[18:49:01] <yusuket> dstufft: can you look at https://github.com/pypa/warehouse/pull/299 when you have a chance? I believe I’ve fixed all the outstanding issues

[18:56:01] <dstufft> yusuket: yea i'll take a look later today

[18:56:28] <dstufft> ~850 lines of diff means it takes more than a few minutes to thoroughly review it

[18:57:06] <yusuket> dstufft: awesome, thanks! Yeah sorry about the large diff, I wanted to port it all while I had the context in my head

[18:57:13] <dstufft> yusuket: no problem!

[20:51:32] <dstufft> Thoughts please https://github.com/pypa/pip/issues/1732#issuecomment-40975613 :D

[20:52:14] <dstufft> Also thoughts on https://github.com/pypa/pip/issues/1733#issuecomment-40951518 too would be great :D

[23:51:26] <agronholm> dstufft: package checking is actually the reason I got involved with the warehouse project

[23:51:35] <Ivo> dstufft: what are these etags you're referring to?

[23:51:48] <dstufft> Ivo: serious?

[23:52:04] <agronholm> I'm sick and tired of having to sift through tons of crap on pypi when searching for useful libraries

[23:52:05] <Ivo> yes

[23:52:13] <dstufft> agronholm: heh, it's a non trivial thing, though i'd like to have it eventually :]

[23:52:25] <agronholm> did you read my comment on the subject on github?

[23:52:36] <dstufft> agronholm: not yet, I saw the email come in though

[23:52:37] <Ivo> pypi is an uncurated index tho

[23:54:16] <dstufft> Ivo: When an HTTP server serves a response, it can include an ETag header, which is essentially a unique value for that response. When you make a request against that URL again and you have a response cached, you can add an If-None-Match: <stored-etag-header> header. When a web server sees an If-None-Match header, it compares it to the ETag it has for the URL; if it matches it returns a 304 response with an empty body, and if it doesn't match it returns a normal response
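
The ETag / If-None-Match exchange described above can be sketched with nothing but the Python standard library. This is a minimal illustration, not how pip or PyPI implement it: the handler class, the `fetch` and `demo` helpers, the body text, and the choice of a SHA-256 digest as the ETag value are all assumptions made for the example. A throwaway server is started on a loopback port, fetched once without a cached ETag, then fetched again with the stored ETag sent back in If-None-Match.

```python
import hashlib
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical resource; the ETag here is simply a quoted hash of the body.
BODY = b"example package index page"
ETAG = '"' + hashlib.sha256(BODY).hexdigest() + '"'


class ETagHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # If the client's stored ETag matches ours, answer 304 with no body.
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        # Otherwise send the normal response, including the ETag to store.
        self.send_response(200)
        self.send_header("ETag", ETAG)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, cached_etag=None):
    """GET / and return (status, etag, body); send If-None-Match if we have an ETag."""
    headers = {"If-None-Match": cached_etag} if cached_etag else {}
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", "/", headers=headers)
        response = conn.getresponse()
        return response.status, response.getheader("ETag"), response.read()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), ETagHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        first = fetch(server.server_port)                         # nothing cached yet
        second = fetch(server.server_port, cached_etag=first[1])  # revalidate
        return first, second
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the two results: the first carries the full body and the ETag to store, the second is the 304 revalidation with an empty body, so the client would keep using its cached copy.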

[23:54:23] <Ivo> dstufft: I haven't looked into HTTP's caching mechanism in any great depth before, but if there is some sensible replacement for hashes, then sure

[23:54:31] <Ivo> *mechanisms

[23:55:27] <Ivo> the main thrust of the suggestion was to use HEAD to find a unique identifier for the thing we intend to download, instead of GET, was all

[23:55:53] <dstufft> Hm

[23:55:59] <dstufft> I'm not sure what your suggestion was then :)

[23:57:33] <Ivo> dstufft: you mention "which means that the entire file is downloaded prior to pip getting control of the Response object" which I take to mean we're sending a GET request first, which could be replaced with an initial HEAD for efficiency (in case we don't have to download it).

[23:57:49] <dstufft> Ivo: ohhh

[23:57:55] <dstufft> no

[23:58:30] <Ivo> you're the one that asked for opinions, you didn't say whether they had to be well informed or not!

[23:59:42] <dstufft> requests doesn't download the body of the response until you consume it via response.iter_content() (resp.content calls iter_content() internally). With pip we use that both to implement download progress bars (because we can download "chunks" of the response body instead of the whole thing in one shot, allowing us to update the progress bar between chunks) and to implement our own iter_content (to work around some content-encoding wonkiness)
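
The chunked-download idea above can be sketched without any network access. This is not pip's code and does not use requests; `iter_chunks` and `download_with_progress` are hypothetical helpers that work on any file-like object with a `read(n)` method, standing in for a response body. With requests itself, the deferred download relies on passing `stream=True` to the request, after which `response.iter_content(chunk_size=...)` yields the body piece by piece.

```python
import io


def iter_chunks(stream, chunk_size=4096):
    """Yield successive chunks from a file-like object until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def download_with_progress(stream, total_size, chunk_size=4096, report=print):
    """Read the body chunk by chunk, reporting progress between chunks."""
    received = bytearray()
    for chunk in iter_chunks(stream, chunk_size):
        received.extend(chunk)
        # A progress bar would be redrawn here; this sketch just reports counts.
        report("%d/%d bytes" % (len(received), total_size))
    return bytes(received)
```

For example, `download_with_progress(io.BytesIO(b"x" * 10), 10, chunk_size=4)` reads the ten bytes in three chunks and reports progress after each one. Because the caller owns the loop, it can also substitute its own chunk iterator, which is the second use mentioned above.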

[23:59:47] <Ivo> ETags seem to be just a place to put a hash in the header, rather than the url...


#pypa-dev logs for Monday the 21st of April, 2014