Bug 43565 - new-run-webkit-tests: Every few runs, Windows Tests hang indefinitely somewhere in Python guts
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: 528+ (Nightly build)
Hardware: PC Windows Vista
Importance: P2 Normal
Assignee: Dirk Pranke
URL:
Keywords:
Depends on: 49566
Blocks: 34984 38023
Reported: 2010-08-05 10:49 PDT by Dimitri Glazkov (Google)
Modified: 2011-03-18 18:23 PDT

Description Dimitri Glazkov (Google) 2010-08-05 10:49:54 PDT
It completes a few runs, then hangs, sitting there and periodically spewing exceptions:

http://build.webkit.org/builders/Chromium%20Win%20Release%20(Tests)/builds/1004/steps/layout-test/logs/stdio

Here is the stack trace (sic on partially eaten output), which keeps repeating over and over throughout the run:

ut_tests\layout_package\dump_render_tree_thread.py", line 333, in _run
  result = self._run_test(test_info)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\layout_package\dump_render_tree_thread.py", line 437, in _run_test
  self._driver.run_test(test_info.uri, test_info.timeout, image_hash)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\port\chromium.py", line 374, in run_test
  (line, crash) = self._write_command_and_read_line(input=cmd)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\port\chromium.py", line 347, in _write_command_and_read_line
  line = self._proc.stdout.readline()

# Thread: 19804
File: "c:\depot_tools\python_bin\lib\threading.py", line 499, in __bootstrap
  self.__bootstrap_inner()
File: "c:\depot_tools\python_bin\lib\threading.py", line 527, in __bootstrap_inner
  self.run()
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\layout_package\dump_render_tree_thread.py", line 258, in run
  self._run(test_runner=None, result_summary=None)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\layout_package\dump_render_tree_thread.py", line 333, in _run
  result = self._run_test(test_info)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\layout_package\dump_render_tree_thread.py", line 437, in _run_test
  self._driver.run_test(test_info.uri, test_info.timeout, image_hash)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\port\chromium.py", line 374, in run_test
  (line, crash) = self._write_command_and_read_line(input=cmd)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\port\chromium.py", line 347, in _write_command_and_read_line
  line = self._proc.stdout.readline()

# Thread: 17884
File: "./WebKitTools/Scripts/new-run-webkit-tests", line 38, in <module>
  sys.exit(run_webkit_tests.main())
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\run_webkit_tests.py", line 1677, in main
  return run(port_obj, options, args)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\run_webkit_tests.py", line 1441, in run
  num_unexpected_results = test_runner.run(result_summary)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\run_webkit_tests.py", line 759, in run
  self._run_tests(self._test_files_list, result_summary))
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\run_webkit_tests.py", line 699, in _run_tests
  self._dump_thread_states_if_necessary()
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\run_webkit_tests.py", line 652, in _dump_thread_states_if_necessary
  self._dump_thread_states()
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\run_webkit_tests.py", line 637, in _dump_thread_states
  for filename, lineno, name, line in traceback.extract_stack(stack):

# Thread: 19528
File: "c:\depot_tools\python_bin\lib\threading.py", line 499, in __bootstrap
  self.__bootstrap_inner()
File: "c:\depot_tools\python_bin\lib\threading.py", line 527, in __bootstrap_inner
  self.run()
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\layout_package\dump_render_tree_thread.py", line 258, in run
  self._run(test_runner=None, result_summary=None)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\layout_package\dump_render_tree_thread.py", line 333, in _run
  result = self._run_test(test_info)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\layout_package\dump_render_tree_thread.py", line 444, in _run_test
  output, error)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\layout_package\dump_render_tree_thread.py", line 107, in process_output
  configuration)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\test_types\text_diff.py", line 110, in compare_output
  print_text_diffs=True)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\test_types\test_type_base.py", line 217, in write_output_files
  pretty_patch = port.pretty_patch_text(diff_filename)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\layout_tests\port\base.py", line 624, in pretty_patch_text
  return self._executive.run_command(command, decode_output=False)
File: "D:\google-windows-2\chromium-win-release-tests\build\WebKitTools\Scripts\webkitpy\common\system\executive.py", line 293, in run_command
  output = process.communicate(string_to_communicate)[0]
File: "c:\depot_tools\python_bin\lib\subprocess.py", line 660, in communicate
  self.wait()
File: "c:\depot_tools\python_bin\lib\subprocess.py", line 845, in wait
  obj = WaitForSingleObject(self._handle, INFINITE)

Comment 1 Evan Martin 2010-08-05 11:07:28 PDT
The _dump_thread_states_if_necessary bits are just the code that dumps the current state of the world for debugging purposes, right?

So it's hanging in the subprocess stdout read.
That probably indicates stdout buffering.

What is the subprocess?
If it's C code, see "man 3 setvbuf" (set it to line buffering).
If it's Python code, do something like
  sys.stdout = os.fdopen(1, 'w', 1)  # last arg of 1 means line buffering
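The Python suggestion above can be sketched as follows. This is only an illustration in Python 3 syntax, not code from new-run-webkit-tests; the child script and the `read_first_line` helper are hypothetical names made up for the example. The child reopens fd 1 with line buffering, so the parent's `readline()` returns as soon as the newline is written, even though the child keeps running.

```python
import subprocess
import sys

# Hypothetical child: reopens fd 1 with line buffering (third argument 1),
# so each complete line reaches the pipe immediately instead of waiting
# for a block buffer to fill.
CHILD = r"""
import os, sys, time
sys.stdout = os.fdopen(1, 'w', 1)
print('ready')
time.sleep(60)  # stay alive, so 'ready' can only arrive via a flush
"""


def read_first_line():
    """Start the child and return the first line it writes to stdout."""
    proc = subprocess.Popen([sys.executable, '-c', CHILD],
                            stdout=subprocess.PIPE)
    try:
        # Returns promptly because the child flushed at the newline.
        return proc.stdout.readline().decode().strip()
    finally:
        proc.kill()
        proc.wait()
        proc.stdout.close()
```

If the child used a block-buffered stdout instead, the same `readline()` call would likely sit there until the child exited or its buffer filled.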
Comment 2 Dirk Pranke 2010-08-05 11:23:48 PDT
This is almost certainly the same issue as bug 36622 and bug 38200 (which aren't actually Mac-specific, and the cause is believed to be well understood). Assigning to me.

Evan's right that the problem is caused by subprocess, but it's not so much stdout buffering as the fact that we're using PIPEs with a fixed buffer size (even on Windows), and overflowing the buffer causes the threads to deadlock.
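One way a full pipe buffer can wedge a reader, and one way around it, is sketched below. It is an assumption-laden illustration in Python 3, not the fix that landed for this bug: the child script and `run_and_drain` are hypothetical, and the actual pipe buffer size depends on the platform. The child writes more to stderr than a pipe is likely to hold before it writes to stdout, so a parent that only calls `stdout.readline()` could block forever; draining stderr on a separate thread keeps the child moving.

```python
import subprocess
import sys
import threading

# Hypothetical child: writes far more to stderr than a pipe buffer is
# likely to hold, then writes a single line to stdout.
CHILD = r"""
import sys
sys.stderr.write('x' * (1 << 20))
sys.stderr.flush()
sys.stdout.write('done\n')
"""


def run_and_drain():
    """Return (first stdout line, number of stderr bytes collected)."""
    proc = subprocess.Popen([sys.executable, '-c', CHILD],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    err_chunks = []

    def drain_stderr():
        # Keep emptying stderr so the child never blocks on a full pipe.
        for chunk in iter(lambda: proc.stderr.read(65536), b''):
            err_chunks.append(chunk)

    drainer = threading.Thread(target=drain_stderr)
    drainer.start()
    # Without the drain thread, this readline() could wait forever: the
    # child would be stuck writing stderr and never reach its stdout write.
    line = proc.stdout.readline()
    drainer.join()
    proc.wait()
    proc.stdout.close()
    proc.stderr.close()
    return line.decode().strip(), sum(len(c) for c in err_chunks)
```

When the caller only needs the complete output after the child exits, `Popen.communicate()` does the same draining internally and is usually the simpler choice.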
Comment 3 Dimitri Glazkov (Google) 2010-08-05 11:33:49 PDT
(In reply to comment #2)
Thanks Dirk! Another thing of note: when I go kill processes to clean up this mess, I also see ruby.exe running.
Comment 4 Dirk Pranke 2010-08-05 11:44:53 PDT
(In reply to comment #3)
Yeah, ruby.exe comes from the pretty diff; it is definitely one of the culprits.
Comment 5 Dimitri Glazkov (Google) 2010-08-05 12:49:54 PDT
(In reply to comment #4)
Cool. This time when it hung, it was wdiff.exe.
Comment 6 Dimitri Glazkov (Google) 2010-08-05 14:47:15 PDT
(In reply to comment #5)
BTW, killing just ruby.exe seems to have unwedged the script.
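The manual cleanup described here could in principle be automated by bounding how long the harness waits for a helper process. The sketch below is an assumption, not what webkitpy's `run_command` does: `run_with_timeout` is a hypothetical helper, and it relies on the `timeout` argument to `communicate()`, which exists only in Python 3.3 and later (the tracebacks above come from an older Python whose `communicate()` waits indefinitely).

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run command; return (output, timed_out). Kill the child on timeout."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    try:
        output, _ = proc.communicate(timeout=timeout_seconds)
        return output, False
    except subprocess.TimeoutExpired:
        proc.kill()
        # A second communicate() reaps the child and collects what it wrote.
        output, _ = proc.communicate()
        return output, True


if __name__ == '__main__':
    # Example: a child that sleeps past the deadline gets killed.
    sleeper = [sys.executable, '-c', 'import time; time.sleep(30)']
    print(run_with_timeout(sleeper, 0.5))
```

A caller can then treat a timed-out pretty diff as a missing optional artifact rather than letting a stuck helper hold up the whole run.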
Comment 7 Dimitri Glazkov (Google) 2010-08-06 15:13:58 PDT
(In reply to comment #6)
Also, installing non-cygwin ruby helped a lot. It hasn't hung in the last 12 runs.
Comment 8 Dirk Pranke 2011-03-18 18:23:28 PDT
I'm closing this ... I assume this has been fixed since we run DRT by default these days.