PMXBOT Log file Viewer

Help | Karma | Search:

#pypa-dev logs for Tuesday the 17th of June, 2014

(Back to #pypa-dev overview) (Back to channel listing) (Animate logs)
[00:00:05] <Alex_Gaynor> aka no one has a clue what it's for :-)
[00:00:45] <dstufft> actually I have no idea what the test is even supposed to be testing... it says it's testing --local
[00:00:54] <dstufft> but it doesn't have a global site packages environment
[00:00:56] <dstufft> so everything is local
[00:01:16] <Alex_Gaynor> gotta love tests like that
[00:01:50] <dstufft> I'm still confused why the hell it's locking up though
[00:02:13] <Alex_Gaynor> me too :/
[00:05:35] <dstufft> tbh
[00:05:46] <dstufft> I don't even think your changes even touch anything that pip freeze uses
[00:06:09] <Alex_Gaynor> So what you're saying is: computer.
[00:07:45] <dstufft> looks like it might be a problem with pip install
[00:08:01] <dstufft> https://github.com/dstufft/pip/blob/cleanup-compat/tests/functional/test_freeze.py#L284 has printed, but -> 2 hasn't
[00:11:15] <Alex_Gaynor> dstufft: you might try aplyin gthis commit https://github.com/alex/pip/commit/efc72a0db2f94774e73f0c626d9e4f7166aca6b5
[00:13:21] <dstufft> hum
[00:13:23] <dstufft> interesting
[00:13:28] <dstufft> I might have reproduced this locally
[00:13:30] <Alex_Gaynor> I applied it on my PR
[00:13:33] <Alex_Gaynor> orly?
[00:13:44] <dstufft> I don't know what I did, but my thing is hanging now
[00:13:54] <dstufft> I just added some more debugging
[00:13:58] <Alex_Gaynor> "good"?
[00:14:10] <dstufft> Successfully installed initools
[00:14:10] <dstufft> Cleaning up...
[00:14:10] <dstufft> Removing temporary dir /private/var/folders/94/79qhyrmd5y9f843ry4jnvy4w0000gn/T/pytest-15/test_freeze_with_local_option0/workspace/venv/build...
[00:14:15] <dstufft> is the last thing that pip install output
[00:14:22] <Alex_Gaynor> and then it hangs?
[00:14:33] <dstufft> it appears so yes
[00:14:39] <dstufft> I have the change to print that info pushed to travis
[00:14:43] <dstufft> to confirm
[00:15:50] <dstufft> https://github.com/pypa/pip/blob/develop/pip/util.py#L45-L47
[00:16:22] <Alex_Gaynor> hmm, what about it?
[00:16:42] <dstufft> that's the code that is called to remove the directory
[00:16:48] <Alex_Gaynor> and it hangs?
[00:17:53] <dstufft> I don't know if it's that code that hangs yet, all I know is the last thing that is output is right before rmtree is called
[00:18:05] <Alex_Gaynor> mmm
[00:18:54] <dstufft> nothing else interesting happens after rmtree is called, except for... https://github.com/pypa/pip/blob/develop/pip/basecommand.py#L137-L185
[00:19:02] <dstufft> this is all happening inside of that self.run() call
[00:19:45] <dstufft> after rmtree is called, an integer is decremented, return statement, and if statement that evaluates to false, then the def run() returns
[00:21:13] <dstufft> uh
[00:21:14] <dstufft> wat
[00:21:20] <Alex_Gaynor> ?
[00:21:31] <dstufft> instead of hanging, now PYthon 3.3 fails on travis
[00:21:37] <Alex_Gaynor> progress!
[00:21:59] <dstufft> oh wait, I lied
[00:22:15] <dstufft> it's still hanging
[00:22:19] <dstufft> it's just hanging after that test
[00:22:45] <Alex_Gaynor> this is because the binary() function on py3k silently returned None for a non-str argument
[00:22:46] <Alex_Gaynor> wtf?
[00:23:04] <dstufft> it did?
[00:23:31] <Alex_Gaynor> dstufft: https://github.com/pypa/pip/pull/1878/files#diff-3e5d1500a5ca809a766c060d6dc0fb4dL33
[00:23:38] <Alex_Gaynor> that's why I started working on this patch in the first place
[00:23:50] <dstufft> lol wtf
[00:24:03] <Alex_Gaynor> I assume that's related
[00:24:06] <Alex_Gaynor> looking at https://travis-ci.org/pypa/pip/jobs/27731870
[00:24:11] <Alex_Gaynor> I don't even see where string is called?
[00:25:35] <dstufft> that's inside scripttest.py
[00:25:56] <Alex_Gaynor> why isn't it in the traceback?
[00:26:49] <dstufft> https://github.com/pypa/scripttest/blob/master/scripttest.py#L214
[00:27:10] <Alex_Gaynor> oh fuck that
[00:27:47] <dstufft> wonder if there is a flag to turn that off in pytest
[00:27:53] <Alex_Gaynor> the things that string() is called on don't even come from the args
[00:28:32] <dstufft> h
[00:28:34] <dstufft> hm
[00:28:36] <dstufft> youre right
[00:29:07] <Alex_Gaynor> looking at this patch, top to bottom, I don't see how this change broke anything
[00:29:53] <dstufft> I'm guessing it wasn't this aptch, xafer had something similar he was complaining about like yesterday
[00:30:05] <Alex_Gaynor> but other patches work, right?
[00:30:08] <dstufft> yes
[00:31:06] <dstufft> hm
[00:31:20] <dstufft> I left out some stuff xafer said earlier
[00:32:15] <dstufft> http://bpaste.net/show/dgDWr65JkGcOSuGIuq65/
[00:32:26] <dstufft> changing howmuch stuff he was printing changes wether it got stuck or not
[00:33:51] <dstufft> I wonder if something is blocking waiting for input
[00:33:52] <Alex_Gaynor> wtf
[00:33:58] <Alex_Gaynor> seems likely
[00:34:10] <dstufft> the difference between when that test case failed on py33 and when it blocked
[00:34:25] <dstufft> is primarily subprocess.PIPE for stdout/stdin/etc
[00:34:54] <dstufft> https://github.com/pypa/scripttest/blob/master/scripttest.py#L242-L255 debug == failure (no traceback, just return code) else == blocks
[00:36:42] <dstufft> which could explain why it blocked for me but failed on travis too
[00:36:46] <dstufft> because I'm executing it in a normal shell
[00:38:55] <dstufft> good news is, I can repro it locally now
[00:39:05] <dstufft> so I don't have to play the fuck with travis game
[00:40:14] <dstufft> it's not the rmtree call
[00:43:06] <dstufft> ok
[00:43:08] <dstufft> now i'm even more confused
[00:44:03] <dstufft> Alex_Gaynor: https://github.com/pypa/pip/pull/1880/files#diff-b670e3b192038c9ffe810c1a12c0c51fR140 uncommenting that == test failure, commenting it == hangs
[00:44:27] <Alex_Gaynor> so... something to do with blocking pipes and buffer sizes... I guess?
[00:45:03] <dstufft> this is without subprocess.PIPE
[00:45:12] <dstufft> I have debug=True
[00:45:17] <dstufft> so it's just inheriting the FDs
[00:46:05] <dstufft> uhh
[00:46:15] <dstufft> if I copy that print statement and move it to after the try/except block
[00:46:19] <dstufft> then it still blocks
[01:36:28] <dstufft> huh
[01:36:35] <dstufft> the stuck pip is maxing the CPU
[01:37:20] <Alex_Gaynor> nice
[01:37:23] <Alex_Gaynor> dtruss?
[01:37:57] <dstufft> I don't know how to use that
[01:38:12] <Alex_Gaynor> dtruss -f -p <pid>
[01:42:25] <dstufft> I'm not sure this is telling me anything
[01:42:26] <dstufft> http://d.stufft.io/image/3n2h380e462A
[01:44:07] <Alex_Gaynor> apparently you're not making any syscalls
[01:44:09] <Alex_Gaynor> or something
[01:44:26] <dstufft> computers are the worst
[01:45:05] <dstufft> is dtruss supposed to slow my system down massively
[01:45:19] <Alex_Gaynor> should only slow downt aht one process
[01:47:02] <dstufft> ok now it's doing some things
[01:47:40] <dstufft> http://d.stufft.io/image/3o3Z46243p0v
[01:48:34] <Alex_Gaynor> sooo some syscalls are happening
[01:48:48] <Alex_Gaynor> I don't really know what to say :
[01:48:49] <Alex_Gaynor> :/
[01:51:21] <dstufft> it's stuck inside of a eintr handling loop
[01:51:34] <Alex_Gaynor> sounds like subprocess
[01:51:59] <dstufft> yes
[01:52:04] <dstufft> line 479 of my subprocess
[01:52:14] <dstufft> is where it is everytime I ctrl C py.test
[01:52:54] <dstufft> which is apparently a while True: loop
[01:57:20] <dstufft> so
[01:57:26] <dstufft> it's apparently stuck on os.waitpid
[01:58:19] <Alex_Gaynor> waiting for the subprocess to finish?
[01:59:45] <dstufft> apparently so
[01:59:48] <dstufft> but i'm confused
[02:00:24] <dstufft> I have output from pip that prints right before sys.exit(0) is called
[02:00:34] <dstufft> as in, the very next line is sys.exit(0)
[02:03:24] <dstufft> sooo
[02:03:32] <dstufft> it works if I replace sys.exit(0) with os._exit(0)
[02:04:33] <Alex_Gaynor> usually the only time you need os._exit() is in a fork'd process
[02:05:42] <dstufft> what does subprocess.Popen do
[02:06:14] <Alex_Gaynor> fork, but also exec
[02:06:21] <Alex_Gaynor> when you exec after forking different rules come into play
[02:10:55] <dstufft> so like
[02:10:58] <dstufft> distlib imports threading
[02:11:03] <dstufft> that couldn't fuck things up could it
[02:11:03] <dstufft> :/
[02:11:31] <Alex_Gaynor> does it do anythign with threading?
[02:11:50] <dstufft> it spawns some threads in some of it's APIs, I don't think we call any of those APIs though
[02:14:53] <dstufft> ooh
[02:14:59] <dstufft> lockfile uses threads to
[02:16:22] <Alex_Gaynor> so yeah, threads can fuck your day right up, specificaly fork+threads is a bad time
[02:16:49] <dstufft> is dummy_threading "safe"
[02:17:07] <dstufft> like can I just replace every threading import with dummy_threading and see if it works
[02:17:08] <Alex_Gaynor> I think so
[02:23:04] <dstufft> still hangs ;(
[02:23:18] <Alex_Gaynor> so probably unrelated
[02:23:28] <Alex_Gaynor> also unrelated: dear god auto-retry can't come soon enough, fuck Connection reset
[02:23:34] <dstufft> I'm considering just shoving os._exit(0) and getting real angry at computers
[02:23:50] <Alex_Gaynor> we shoudl try to understand why
[22:40:29] <dstufft> Alex_Gaynor: FML, another (unrelated) change to pip is hanging, this time on 3.2
[22:40:42] <dstufft> in a different spot
[22:40:52] <Alex_Gaynor> dstufft: "good"? Now at least we know it's not 33 related :P
[22:41:30] <dstufft> Alex_Gaynor: you have a strange definition of good
[22:41:33] <dstufft> :|
[22:42:06] <Alex_Gaynor> dstufft: we learned somethign new, that's good :-)
[22:56:09] <dstufft> so
[22:56:28] <dstufft> it looks like it hangs on travis if using subprocess.PIPE and doesn't hang if not using that
[22:58:02] <Alex_Gaynor> so something to do with pipe buffers
[22:58:46] <glyph> hi
[22:58:55] <glyph> Alex_Gaynor summoned me, I trust to discuss something horrible
[22:59:10] <Alex_Gaynor> glyph: So. pip has a test that sometimes fails on some versions of python, when using subprocess, and there are some pipes
[22:59:32] <Alex_Gaynor> s/fails/ghangs/
[22:59:47] <glyph> Alex_Gaynor: ha ha ha, ha ha.
[22:59:54] <glyph> I'm out, nice knowing you all
[23:00:02] <Alex_Gaynor> glyph: It's fine. The disk isn't full.
[23:00:15] <glyph> Alex_Gaynor: that _isn't even_ what I was thinking
[23:00:27] <glyph> Alex_Gaynor: what platform?
[23:00:47] <Alex_Gaynor> Linux/travis for sure
[23:00:53] <Alex_Gaynor> Maybe dstufft got it on his oS X though?
[23:01:08] <dstufft> I've had it happen on OSX yes
[23:01:11] <dstufft> not this time though
[23:01:50] <dstufft> pip is being called by subprocess.Popen, if called with subprocess.PIPE for stuff it hangs, if not it doesn't
[23:02:11] <dstufft> I know there is like a 65k limit and then it deadlocks, but subprocess.Popen().communicate() is supposed to avoid that
[23:02:21] <dstufft> (I thought?)
[23:02:30] <glyph> yeah
[23:02:38] <glyph> the stdlib is supposed to do a lot of things
[23:02:47] <glyph> would it surprise you to find out that sometimes... it doesn't
[23:02:59] <glyph> twisted has a rarified version of this same race condition
[23:03:02] <glyph> I'm looking up the bug
[23:03:19] <glyph> we thought we fixed it
[23:03:20] <glyph> but you can't fix it
[23:03:25] <glyph> you have to use posix_spawn
[23:03:40] <dstufft> does posix_spawn work on windows
[23:03:45] <dstufft> I'm gussing no
[23:04:08] <dstufft> this is pip's test suite, so I am 100% ok with using whatever random crap we need to
[23:04:10] <glyph> dstufft: no, but _this_ problem shouldn't happen on windows
[23:04:24] <glyph> dstufft: if you're getting hangs on windows... are you sure your disk isn't full
[23:04:26] <dstufft> ok, so I can use posix_spawn on not windows, and subprocess on windows?
[23:04:28] <glyph> have you checked, like, a lot
[23:04:32] <dstufft> No we're not getting hangs on windows
[23:04:37] <glyph> oh okay
[23:04:43] <dstufft> I jsut can't put something that does't work on windows in the pip test suite
[23:04:54] <glyph> just port everything to use twisted, we'll fix the bug in twisted and the interface won't change
[23:06:07] <dstufft> While I don't have a problem using twisted in the test runner, we need PYthon 3 support
[23:06:12] <glyph> sigh
[23:06:18] <glyph> once again stymied by the fact trial isn't ported
[23:06:36] <Alex_Gaynor> Also, porting the entire test suite to use twisted doesn't feel like the right solution to this problem
[23:06:42] <dstufft> ther's that too
[23:09:41] <glyph> nah twisted feels like the solution to all problems ever
[23:09:57] <glyph> but okay given that it wouldn't actually help maybe not specifically _this_ one
[23:10:26] <glyph> the reason I'm even talking about it is that the subprocess module is both fundamentally flawed and frozen for the rest of eternity
[23:10:38] <glyph> twisted is the former but not the latter :)
[23:11:39] <dstufft> I'm not a fan of stdlibs
[23:11:42] <dstufft> :|
[23:12:15] <dstufft> https://github.com/pypa/scripttest/blob/master/scripttest.py#L242-L260 is the code btw, debug = True, no blocking, debug = False, blocks forever (or well 10 minutes at which point travis kills the test)
[23:13:18] <glyph> dstufft: reliably!?
[23:14:09] <dstufft> for this one particular test, it's blocked everytime I had debug = False, and the test fails past the communicate call with debug = True, only on python 3.2 though
[23:14:30] <dstufft> it's an unrelated change to what this test tests though
[23:14:50] <dstufft> there was a different one yesterday with similiar circmunstances, except it was only python 3.3
[23:14:57] <dstufft> and a different set of changes
[23:15:00] <dstufft> and a different test
[23:15:04] <dstufft> but same symptoms
[23:15:13] <glyph> okay I thought you were talking about something intermittent
[23:15:18] <glyph> maybe this is not what I thought
[23:15:58] <dstufft> fwiw yesterady I shoved print statements everywhere and it got to the point where a print right before sys.exit(0) would be printed, and then nothing
[23:16:09] <dstufft> os._exit(0) did of course force it to exit
[23:17:33] <glyph> well, let's not be hasty, "of course" is a bit strong
[23:19:04] <dstufft> sorry I forgot we were dealing with computers
[23:19:39] <glyph> but okay it does cause the exit, which means you can narrow down things a bit to something the *subprocess* is doing
[23:19:45] <glyph> which means the subprocess module likely _isn't_ at fault
[23:20:03] <glyph> and we're seeing hangs on Linux and OS X but _not_ on Windows
[23:20:22] <dstufft> well, no reported hangs, but we don't run CI on Windows
[23:20:29] <dstufft> so it's more of a we don't know
[23:20:50] <glyph> dstufft: which test is it, what script is it?
[23:20:55] <glyph> "we don't run CI on Windows"
[23:20:59] <dstufft> yes I know
[23:21:00] <dstufft> it's bad
[23:21:35] <dstufft> right now it's this test -> https://github.com/pypa/pip/blob/develop/tests/functional/test_install_config.py#L20-L38
[23:21:54] <dstufft> with this change https://github.com/pypa/pip/pull/1882 (it doesn't fail w/o this change)
[23:22:28] <dstufft> yesterday it was this test -> https://github.com/pypa/pip/blob/develop/tests/functional/test_freeze.py#L279-L306 with this change https://github.com/pypa/pip/pull/1878
[23:24:23] <dstufft> the only change to today's code is wrapping stuff in a context manager that closes requests's session (and thus the TCP connections) but doesn't change any of the code inside of the context managers
[23:24:25] <dstufft> FWIW
[23:24:30] <dstufft> other then indenting them
[23:25:10] <glyph> dstufft: so perhaps I should state the obvious first: subprocess.communicate() is a bad thing to run in a test, because if the subprocess fails to exit, it's game over
[23:25:39] <dstufft> is there a better way to test the output of a sub process?
[23:26:12] <dstufft> 1/2 of pip's test suite is "create a virtual env, install pip into it, run some pip commands, assert stuff about the output and/or state of the virtual env"
[23:26:35] <dstufft> this sucks, but it's an ongoing thing to move more of our testing to unit tests
[23:27:59] <glyph> dstufft: this is why twisted is in fact a good idea. Real subprocess I/O does have the risk of a hang, but you want to receive output incrementally so that you can notice if, for example, you see a traceback in your stderr and just exit
[23:28:25] <glyph> I don't think there's a way of doing evented I/O to a subprocess normally
[23:29:55] <dstufft> the docs says communicate() uses select on unix and threads on Windows, but idk more than that
[23:30:47] <dstufft> I have to run out for a few minutes, do you mind if we finish this in a little bit glyph ?
[23:30:49] <glyph> dstufft: the thing is, it uses select() to interleave stdin and stdout
[23:31:14] <glyph> dstufft: I am going to be leaving pretty soon but I should have plenty of time tomorrow in the afternoon
[23:31:21] <dstufft> glyph: works for me
[23:31:22] <glyph> from 3(PST) onwards
[23:31:36] <dstufft> will continue then!
[23:31:42] <dstufft> thanks