Bug 87803 - Layout tests often fail trying to stat nonexistent logs
Summary: Layout tests often fail trying to stat nonexistent logs
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-05-29 18:20 PDT by Stephanie Lewis
Modified: 2012-06-14 21:42 PDT (History)
6 users (show)

See Also:


Attachments
patch (2.98 KB, patch)
2012-05-29 19:50 PDT, Stephanie Lewis
no flags Details | Formatted Diff | Diff
patch and test (4.49 KB, patch)
2012-05-30 19:48 PDT, Stephanie Lewis
dpranke: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stephanie Lewis 2012-05-29 18:20:12 PDT
On Mac crashreporter will roll over logs after a certain number of crashes.  This can cause the layout tests to fail if looking for a log that doesn't exist anymore.  We're tracking bugs internally to fix that problem, but the the try/catch block should include the stat and catch OSError.

backtrace:

18:01:45.278 79451 worker/5 raised OSError('[Errno 2] No such file or directory: '/Users/buildbot/Library/Logs/DiagnosticReports/WebProcess_2012-05-29-161118_buildbots-Mac-Pro-40.crash''):
18:01:45.278 79492 worker/7 mathml/presentation/fenced-mi.xhtml passed
18:01:45.278 79451   layout_tests/controllers/manager_worker_broker.py:290 (in run)
18:01:45.278 79451     self._worker_connection.run_message_loop()
18:01:45.278 79451   layout_tests/controllers/manager_worker_broker.py:243 (in run_message_loop)
18:01:45.278 79451     self._broker.run_message_loop(self._run_topic, self._client, delay_secs)
18:01:45.278 79451   layout_tests/controllers/manager_worker_broker.py:175 (in run_message_loop)
18:01:45.279 79451     self._run_loop(topic_name, client, block=True, delay_secs=delay_secs)
18:01:45.279 79451   layout_tests/controllers/manager_worker_broker.py:189 (in _run_loop)
18:01:45.279 79451     self._dispatch_message(msg, client)
18:01:45.279 79451   layout_tests/controllers/manager_worker_broker.py:198 (in _dispatch_message)
18:01:45.279 79451     message_handler(message.src, *optargs)
18:01:45.279 79451   layout_tests/controllers/worker.py:144 (in handle_test_list)
18:01:45.279 79451     self._run_test(test_input)
18:01:45.279 79451   layout_tests/controllers/worker.py:171 (in _run_test)
18:01:45.279 79451     result = self.run_test_with_timeout(test_input, test_timeout_sec)
18:01:45.279 79451   layout_tests/controllers/worker.py:213 (in run_test_with_timeout)
18:01:45.279 79451     return self._run_test_in_this_thread(test_input)
18:01:45.279 79451   layout_tests/controllers/worker.py:298 (in _run_test_in_this_thread)
18:01:45.279 79451     return self.run_single_test(self._driver, test_input)
18:01:45.279 79451   layout_tests/controllers/worker.py:302 (in run_single_test)
18:01:45.279 79451     test_input, driver, self._name)
18:01:45.279 79451   layout_tests/controllers/single_test_runner.py:46 (in run_single_test)
18:01:45.279 79451     return runner.run()
18:01:45.279 79451   layout_tests/controllers/single_test_runner.py:101 (in run)
18:01:45.280 79451     return self._run_reftest()
18:01:45.280 79451   layout_tests/controllers/single_test_runner.py:291 (in _run_reftest)
18:01:45.280 79451     reference_output = self._driver.run_test(DriverInput(reference_test_name, self._timeout, test_output.image_hash, should_run_pixel_test=True))
18:01:45.280 79451   layout_tests/port/driver.py:217 (in run_test)
18:01:45.280 79451     return self._running_drivers[cmd_line_key].run_test(driver_input)
18:01:45.280 79451   layout_tests/port/webkit.py:579 (in run_test)
18:01:45.280 79451     newer_than=start_time)
18:01:45.280 79451   layout_tests/port/mac.py:194 (in _get_crash_log)
18:01:45.280 79451     crash_log = crash_logs.find_newest_log(name, pid, include_errors=True, newer_than=newer_than)
18:01:45.280 79451   common/system/crashlogs.py:38 (in find_newest_log)
18:01:45.280 79451     return self._find_newest_log_darwin(process_name, pid, include_errors, newer_than)
18:01:45.280 79451   common/system/crashlogs.py:59 (in _find_newest_log_darwin)
18:01:45.280 79451     if not newer_than or self._host.filesystem.mtime(path) > newer_than:
18:01:45.280 79451   common/system/filesystem.py:183 (in mtime)
18:01:45.280 79451     return os.stat(path).st_mtime
Comment 1 Dirk Pranke 2012-05-29 18:24:13 PDT
the list of logs is only obtained 4 lines above ... doesn't that imply that we're rolling over the logs in the time it takes to actually walk the list (presumably less than a second or so)? How many things do you have crashing at once, and what is the limit on how many logs crash reporter will keep?
Comment 2 Stephanie Lewis 2012-05-29 18:25:12 PDT
hmm, It isn't the crash log rollover causing the problem either.  I just looked at the bot now, and there are older logs than the one it is looking for.  I wonder how it got the name of a file that doesn't exist.
Comment 3 Dirk Pranke 2012-05-29 18:32:03 PDT
(In reply to comment #2)
> hmm, It isn't the crash log rollover causing the problem either.  I just looked at the bot now, and there are older logs than the one it is looking for.  I wonder how it got the name of a file that doesn't exist.

Interesting. That's a good question. There would either have to be a bug in our file-finding code (which I just checked and it looks correct - that's certainly a plausible looking filename, at least, but maybe I'm missing something), or the file would've had to have been created and then deleted,  or you're seeing some sort of a filesystem / python inconsistency between os.walk() and os.stat().
Comment 4 Stephanie Lewis 2012-05-29 18:35:23 PDT
It looks like ReportCrash doesn't necessarily remove by oldest logs first.  I just had the logs disappear out from underneath me.  It must use some sort of heuristic and we are getting bitten by it.
Comment 5 Dirk Pranke 2012-05-29 18:36:50 PDT
I see. Well, catching the exception would certainly be defensively safe and correct; did you have a patch in the works, or do you want me to fix it?
Comment 6 Stephanie Lewis 2012-05-29 19:50:42 PDT
Created attachment 144683 [details]
patch
Comment 7 Ojan Vafai 2012-05-29 20:10:58 PDT
Comment on attachment 144683 [details]
patch

This needs a unittest.
Comment 8 Stephanie Lewis 2012-05-30 19:48:45 PDT
Created attachment 144973 [details]
patch and test
Comment 9 Stephanie Lewis 2012-05-30 20:07:11 PDT
committed http://trac.webkit.org/projects/webkit/changeset/119030
Comment 10 Dirk Pranke 2012-06-14 21:42:57 PDT
I'm not sure why this is still open ... closing.