87803 – Layout tests often fail trying to stat nonexistent logs

RESOLVED FIXED 87803

Layout tests often fail trying to stat nonexistent logs

https://bugs.webkit.org/show_bug.cgi?id=87803

Summary Layout tests often fail trying to stat nonexistent logs

Stephanie Lewis

Reported 2012-05-29 18:20:12 PDT

On Mac crashreporter will roll over logs after a certain number of crashes. This can cause the layout tests to fail if looking for a log that doesn't exist anymore. We're tracking bugs internally to fix that problem, but the the try/catch block should include the stat and catch OSError. backtrace: 18:01:45.278 79451 worker/5 raised OSError('[Errno 2] No such file or directory: '/Users/buildbot/Library/Logs/DiagnosticReports/WebProcess_2012-05-29-161118_buildbots-Mac-Pro-40.crash''): 18:01:45.278 79492 worker/7 mathml/presentation/fenced-mi.xhtml passed 18:01:45.278 79451 layout_tests/controllers/manager_worker_broker.py:290 (in run) 18:01:45.278 79451 self._worker_connection.run_message_loop() 18:01:45.278 79451 layout_tests/controllers/manager_worker_broker.py:243 (in run_message_loop) 18:01:45.278 79451 self._broker.run_message_loop(self._run_topic, self._client, delay_secs) 18:01:45.278 79451 layout_tests/controllers/manager_worker_broker.py:175 (in run_message_loop) 18:01:45.279 79451 self._run_loop(topic_name, client, block=True, delay_secs=delay_secs) 18:01:45.279 79451 layout_tests/controllers/manager_worker_broker.py:189 (in _run_loop) 18:01:45.279 79451 self._dispatch_message(msg, client) 18:01:45.279 79451 layout_tests/controllers/manager_worker_broker.py:198 (in _dispatch_message) 18:01:45.279 79451 message_handler(message.src, *optargs) 18:01:45.279 79451 layout_tests/controllers/worker.py:144 (in handle_test_list) 18:01:45.279 79451 self._run_test(test_input) 18:01:45.279 79451 layout_tests/controllers/worker.py:171 (in _run_test) 18:01:45.279 79451 result = self.run_test_with_timeout(test_input, test_timeout_sec) 18:01:45.279 79451 layout_tests/controllers/worker.py:213 (in run_test_with_timeout) 18:01:45.279 79451 return self._run_test_in_this_thread(test_input) 18:01:45.279 79451 layout_tests/controllers/worker.py:298 (in _run_test_in_this_thread) 18:01:45.279 79451 return self.run_single_test(self._driver, test_input) 18:01:45.279 79451 layout_tests/controllers/worker.py:302 (in run_single_test) 18:01:45.279 79451 test_input, driver, self._name) 18:01:45.279 79451 layout_tests/controllers/single_test_runner.py:46 (in run_single_test) 18:01:45.279 79451 return runner.run() 18:01:45.279 79451 layout_tests/controllers/single_test_runner.py:101 (in run) 18:01:45.280 79451 return self._run_reftest() 18:01:45.280 79451 layout_tests/controllers/single_test_runner.py:291 (in _run_reftest) 18:01:45.280 79451 reference_output = self._driver.run_test(DriverInput(reference_test_name, self._timeout, test_output.image_hash, should_run_pixel_test=True)) 18:01:45.280 79451 layout_tests/port/driver.py:217 (in run_test) 18:01:45.280 79451 return self._running_drivers[cmd_line_key].run_test(driver_input) 18:01:45.280 79451 layout_tests/port/webkit.py:579 (in run_test) 18:01:45.280 79451 newer_than=start_time) 18:01:45.280 79451 layout_tests/port/mac.py:194 (in _get_crash_log) 18:01:45.280 79451 crash_log = crash_logs.find_newest_log(name, pid, include_errors=True, newer_than=newer_than) 18:01:45.280 79451 common/system/crashlogs.py:38 (in find_newest_log) 18:01:45.280 79451 return self._find_newest_log_darwin(process_name, pid, include_errors, newer_than) 18:01:45.280 79451 common/system/crashlogs.py:59 (in _find_newest_log_darwin) 18:01:45.280 79451 if not newer_than or self._host.filesystem.mtime(path) > newer_than: 18:01:45.280 79451 common/system/filesystem.py:183 (in mtime) 18:01:45.280 79451 return os.stat(path).st_mtime

Attachments
patch (2.98 KB, patch) 2012-05-29 19:50 PDT, Stephanie Lewis	no flags	Details Formatted Diff Diff
patch and test (4.49 KB, patch) 2012-05-30 19:48 PDT, Stephanie Lewis	dpranke: review+	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Dirk Pranke

Comment 1 2012-05-29 18:24:13 PDT

the list of logs is only obtained 4 lines above ... doesn't that imply that we're rolling over the logs in the time it takes to actually walk the list (presumably less than a second or so)? How many things do you have crashing at once, and what is the limit on how many logs crash reporter will keep?

Stephanie Lewis

Comment 2 2012-05-29 18:25:12 PDT

hmm, It isn't the crash log rollover causing the problem either. I just looked at the bot now, and there are older logs than the one it is looking for. I wonder how it got the name of a file that doesn't exist.

Dirk Pranke

Comment 3 2012-05-29 18:32:03 PDT

(In reply to comment #2) > hmm, It isn't the crash log rollover causing the problem either. I just looked at the bot now, and there are older logs than the one it is looking for. I wonder how it got the name of a file that doesn't exist. Interesting. That's a good question. There would either have to be a bug in our file-finding code (which I just checked and it looks correct - that's certainly a plausible looking filename, at least, but maybe I'm missing something), or the file would've had to have been created and then deleted, or you're seeing some sort of a filesystem / python inconsistency between os.walk() and os.stat().

Stephanie Lewis

Comment 4 2012-05-29 18:35:23 PDT

It looks like ReportCrash doesn't necessarily remove by oldest logs first. I just had the logs disappear out from underneath me. It must use some sort of heuristic and we are getting bitten by it.

Dirk Pranke

Comment 5 2012-05-29 18:36:50 PDT

I see. Well, catching the exception would certainly be defensively safe and correct; did you have a patch in the works, or do you want me to fix it?

Stephanie Lewis

Comment 6 2012-05-29 19:50:42 PDT

Created attachment 144683 [details] patch

Ojan Vafai

Comment 7 2012-05-29 20:10:58 PDT

Comment on attachment 144683 [details] patch This needs a unittest.

Stephanie Lewis

Comment 8 2012-05-30 19:48:45 PDT

Created attachment 144973 [details] patch and test

Stephanie Lewis

Comment 9 2012-05-30 20:07:11 PDT

committed http://trac.webkit.org/projects/webkit/changeset/119030

Dirk Pranke

Comment 10 2012-06-14 21:42:57 PDT

I'm not sure why this is still open ... closing.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Mac

OS Unspecified

Product WebKit

Component Tools / Tests

Assignee

Nobody

Reported

2012-05-29 18:20 PDT

Modified

2012-06-14 21:42 PDT History

CC List

6 users Show

URL

Keywords

Depends on

Blocks