RESOLVED FIXED 87803
Layout tests often fail trying to stat nonexistent logs
https://bugs.webkit.org/show_bug.cgi?id=87803
Summary Layout tests often fail trying to stat nonexistent logs
Stephanie Lewis
Reported 2012-05-29 18:20:12 PDT
On Mac crashreporter will roll over logs after a certain number of crashes. This can cause the layout tests to fail if looking for a log that doesn't exist anymore. We're tracking bugs internally to fix that problem, but the the try/catch block should include the stat and catch OSError. backtrace: 18:01:45.278 79451 worker/5 raised OSError('[Errno 2] No such file or directory: '/Users/buildbot/Library/Logs/DiagnosticReports/WebProcess_2012-05-29-161118_buildbots-Mac-Pro-40.crash''): 18:01:45.278 79492 worker/7 mathml/presentation/fenced-mi.xhtml passed 18:01:45.278 79451 layout_tests/controllers/manager_worker_broker.py:290 (in run) 18:01:45.278 79451 self._worker_connection.run_message_loop() 18:01:45.278 79451 layout_tests/controllers/manager_worker_broker.py:243 (in run_message_loop) 18:01:45.278 79451 self._broker.run_message_loop(self._run_topic, self._client, delay_secs) 18:01:45.278 79451 layout_tests/controllers/manager_worker_broker.py:175 (in run_message_loop) 18:01:45.279 79451 self._run_loop(topic_name, client, block=True, delay_secs=delay_secs) 18:01:45.279 79451 layout_tests/controllers/manager_worker_broker.py:189 (in _run_loop) 18:01:45.279 79451 self._dispatch_message(msg, client) 18:01:45.279 79451 layout_tests/controllers/manager_worker_broker.py:198 (in _dispatch_message) 18:01:45.279 79451 message_handler(message.src, *optargs) 18:01:45.279 79451 layout_tests/controllers/worker.py:144 (in handle_test_list) 18:01:45.279 79451 self._run_test(test_input) 18:01:45.279 79451 layout_tests/controllers/worker.py:171 (in _run_test) 18:01:45.279 79451 result = self.run_test_with_timeout(test_input, test_timeout_sec) 18:01:45.279 79451 layout_tests/controllers/worker.py:213 (in run_test_with_timeout) 18:01:45.279 79451 return self._run_test_in_this_thread(test_input) 18:01:45.279 79451 layout_tests/controllers/worker.py:298 (in _run_test_in_this_thread) 18:01:45.279 79451 return self.run_single_test(self._driver, test_input) 18:01:45.279 79451 layout_tests/controllers/worker.py:302 (in run_single_test) 18:01:45.279 79451 test_input, driver, self._name) 18:01:45.279 79451 layout_tests/controllers/single_test_runner.py:46 (in run_single_test) 18:01:45.279 79451 return runner.run() 18:01:45.279 79451 layout_tests/controllers/single_test_runner.py:101 (in run) 18:01:45.280 79451 return self._run_reftest() 18:01:45.280 79451 layout_tests/controllers/single_test_runner.py:291 (in _run_reftest) 18:01:45.280 79451 reference_output = self._driver.run_test(DriverInput(reference_test_name, self._timeout, test_output.image_hash, should_run_pixel_test=True)) 18:01:45.280 79451 layout_tests/port/driver.py:217 (in run_test) 18:01:45.280 79451 return self._running_drivers[cmd_line_key].run_test(driver_input) 18:01:45.280 79451 layout_tests/port/webkit.py:579 (in run_test) 18:01:45.280 79451 newer_than=start_time) 18:01:45.280 79451 layout_tests/port/mac.py:194 (in _get_crash_log) 18:01:45.280 79451 crash_log = crash_logs.find_newest_log(name, pid, include_errors=True, newer_than=newer_than) 18:01:45.280 79451 common/system/crashlogs.py:38 (in find_newest_log) 18:01:45.280 79451 return self._find_newest_log_darwin(process_name, pid, include_errors, newer_than) 18:01:45.280 79451 common/system/crashlogs.py:59 (in _find_newest_log_darwin) 18:01:45.280 79451 if not newer_than or self._host.filesystem.mtime(path) > newer_than: 18:01:45.280 79451 common/system/filesystem.py:183 (in mtime) 18:01:45.280 79451 return os.stat(path).st_mtime
Attachments
patch (2.98 KB, patch)
2012-05-29 19:50 PDT, Stephanie Lewis
no flags
patch and test (4.49 KB, patch)
2012-05-30 19:48 PDT, Stephanie Lewis
dpranke: review+
Dirk Pranke
Comment 1 2012-05-29 18:24:13 PDT
the list of logs is only obtained 4 lines above ... doesn't that imply that we're rolling over the logs in the time it takes to actually walk the list (presumably less than a second or so)? How many things do you have crashing at once, and what is the limit on how many logs crash reporter will keep?
Stephanie Lewis
Comment 2 2012-05-29 18:25:12 PDT
hmm, It isn't the crash log rollover causing the problem either. I just looked at the bot now, and there are older logs than the one it is looking for. I wonder how it got the name of a file that doesn't exist.
Dirk Pranke
Comment 3 2012-05-29 18:32:03 PDT
(In reply to comment #2) > hmm, It isn't the crash log rollover causing the problem either. I just looked at the bot now, and there are older logs than the one it is looking for. I wonder how it got the name of a file that doesn't exist. Interesting. That's a good question. There would either have to be a bug in our file-finding code (which I just checked and it looks correct - that's certainly a plausible looking filename, at least, but maybe I'm missing something), or the file would've had to have been created and then deleted, or you're seeing some sort of a filesystem / python inconsistency between os.walk() and os.stat().
Stephanie Lewis
Comment 4 2012-05-29 18:35:23 PDT
It looks like ReportCrash doesn't necessarily remove by oldest logs first. I just had the logs disappear out from underneath me. It must use some sort of heuristic and we are getting bitten by it.
Dirk Pranke
Comment 5 2012-05-29 18:36:50 PDT
I see. Well, catching the exception would certainly be defensively safe and correct; did you have a patch in the works, or do you want me to fix it?
Stephanie Lewis
Comment 6 2012-05-29 19:50:42 PDT
Ojan Vafai
Comment 7 2012-05-29 20:10:58 PDT
Comment on attachment 144683 [details] patch This needs a unittest.
Stephanie Lewis
Comment 8 2012-05-30 19:48:45 PDT
Created attachment 144973 [details] patch and test
Stephanie Lewis
Comment 9 2012-05-30 20:07:11 PDT
Dirk Pranke
Comment 10 2012-06-14 21:42:57 PDT
I'm not sure why this is still open ... closing.
Note You need to log in before you can comment on or make changes to this bug.