RESOLVED FIXED Bug 171976
webkitpy: Process crash-logs for iOS devices
https://bugs.webkit.org/show_bug.cgi?id=171976
Summary webkitpy: Process crash-logs for iOS devices
Jonathan Bedard
Reported 2017-05-11 09:42:05 PDT
We should process crash logs for iOS devices, sharing as much code as possible with other Darwin ports.
Attachments
Patch (16.75 KB, patch)
2017-05-11 10:00 PDT, Jonathan Bedard
no flags
Patch (17.77 KB, patch)
2017-05-11 16:12 PDT, Jonathan Bedard
no flags
Patch (22.41 KB, patch)
2017-05-15 15:23 PDT, Jonathan Bedard
no flags
Patch (22.80 KB, patch)
2017-05-18 13:14 PDT, Jonathan Bedard
no flags
Patch (24.31 KB, patch)
2017-06-01 11:15 PDT, Jonathan Bedard
no flags
Patch (27.19 KB, patch)
2017-06-01 16:03 PDT, Jonathan Bedard
no flags
Patch (28.10 KB, patch)
2017-06-02 10:12 PDT, Jonathan Bedard
no flags
Patch (30.30 KB, patch)
2017-06-02 11:45 PDT, Jonathan Bedard
no flags
Patch (29.15 KB, patch)
2017-06-02 15:16 PDT, Jonathan Bedard
no flags
Patch (29.87 KB, patch)
2017-06-05 09:51 PDT, Jonathan Bedard
no flags
Patch (33.19 KB, patch)
2017-06-05 11:12 PDT, Jonathan Bedard
no flags
Patch (26.16 KB, patch)
2017-06-06 09:24 PDT, Jonathan Bedard
no flags
Patch (26.13 KB, patch)
2017-06-06 09:31 PDT, Jonathan Bedard
ddkilzer: review+
Radar WebKit Bug Importer
Comment 1 2017-05-11 09:42:48 PDT
Jonathan Bedard
Comment 2 2017-05-11 10:00:42 PDT
Daniel Bates
Comment 3 2017-05-11 10:38:15 PDT
Comment on attachment 309719 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=309719&action=review I am not convinced that this is the correct approach. I understand that we may need to symbolicate device crash logs and that reading crash logs may require more filesystem operations to be performed on device more than the number of file system operations required to run a test. Can we symbolicate device crash logs on the host? Can we use efficient idioms to make the filesystem operations fast such that we can avoid the need to have a separate code path for iOS? > Tools/Scripts/webkitpy/common/system/crashlogs.py:79 > + self._host.list_crashlogs(newer_than), process_name, pid, Can we symbolicate crash logs on the host? And avoid the need to add list_crashlogs()?
Daniel Bates
Comment 4 2017-05-11 14:08:15 PDT
Jonathan and I discussed this bug some more today (05/11). Jonathan is concerned that calling FileSystem.mtime() for each crash log would cause a significant slowdown when the device has many crash logs. For 1000 crash logs Jonathan estimated it would take 30s to a minute to call FileSystem.mtime() on each log. Note that these crash logs may have existed before the test run. One idea I mentioned to reduce the number of invocations of FileSystem.mtime() is to have some sort of cache of last seen files. When run-webkit-tests is run and before the first test is executed we compute and store all the files seen in the crash log directory on the target host (device). We could modify FileSystem.file_under() to take this cache (in fact the cache may even be the output of this same function) and return the difference between the actual listing and the cache.
Daniel Bates
Comment 5 2017-05-11 14:10:11 PDT
I forgot to mention, Jonathan and I also talked about adding a new function to the Host? object, called symbolicate_crash(), that takes the path to an unsymbolicated crash log and returns the path to the symbolicated crash log on the target device. For non-iOS ports symbolicate_crash() would return the path it was given. For iOS device, it would do something more meaningful.
Daniel Bates
Comment 6 2017-05-11 14:16:43 PDT
(In reply to Daniel Bates from comment #5) > I forgot to mention, Jonathan and I also talked about adding a new function > to the Host? object, called symbolicate_crash(), that takes the path to an > unsymbolicated crash log and returns the path to the symbolicated crash log > on the target device. For non-iOS ports symbolicate_crash() would return the > path it was given. For iOS device, it would do something more meaningful. We should still consider doing symbolication on the host machine (Mac) if possible instead of on the device as this is more efficient. Implementing symbolicate_crash() as described above and where the iOS device would do the actual symbolication is a first step. If it turns out that doing this computation on the device is a bottleneck for running tests and on-device caching is non-trivial it may naturally lead us to consider host symbolication as well as other optimization options.
Jonathan Bedard
Comment 7 2017-05-11 16:12:22 PDT
Jonathan Bedard
Comment 8 2017-05-11 16:21:48 PDT
Still iterating on this (attachment 309808 [details]). As Dan and I discussed, this implementation re-uses more code between iOS devices and other Darwin ports. It uses a cached array of paths to ensure that we don't waste time checking crashes which occurred before the test run began. The change that this does not contain, which Dan has suggested, is that the logs are not symbolicated on the host, they are symbolicated on the device. Symbolicating logs on the host is possible, but it's more difficult and depends on the host having the binary which generated the crash. One consideration here is if the host machine does not have the iOS SDK installed, or has a different SDK from the device. I don't think we will see much of a performance benefit from symbolicating crashes on the host. It will likely be faster to symbolicate on the host, but symbolication occurs once per crash and should only be applied to crashes which will be archived with the results. I don't think that symbolicating crash-logs on the host is worth the trouble.
Daniel Bates
Comment 9 2017-05-11 16:52:03 PDT
Comment on attachment 309808 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=309808&action=review > Tools/Scripts/webkitpy/common/system/crashlogs.py:165 > + count = 1 > + original_result = result_name > while result_name in crash_logs: > - result_name = result_name + "-1" > + result_name = original_result + '-{}'.format(count) > + count += 1 I suggest that we fix this in a separate bug and add a test.
Daniel Bates
Comment 10 2017-05-11 16:58:15 PDT
Comment on attachment 309808 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=309808&action=review > Tools/Scripts/webkitpy/common/system/filesystem.py:54 > + def __init__(self): > + self.cached_crashlogs = [] > + This does not seem the appropriate place for this knowledge. The purpose of this class is to abstract file system operations. Maybe PlatformInfo would be a more appropriate place for this? > Tools/Scripts/webkitpy/common/system/filesystem.py:349 > + def symbolicated_crashlog(self, path): > + return path I would call this symbolicate_crash_log() or symbolicate_crash_log_if_needed() This does not seem the appropriate place for this knowledge. The purpose of this class is to abstract file system operations. Maybe Host would be a more appropriate place for this? > Tools/Scripts/webkitpy/port/darwin.py:111 > + crash_logs = CrashLogs((target_host or self.host)) Are the parentheses necessary? If not, then I suggest we remove them. > Tools/Scripts/webkitpy/port/win.py:380 > + crash_logs = CrashLogs((target_host or self.host), self.results_directory()) Ditto.
Jonathan Bedard
Comment 11 2017-05-15 15:23:30 PDT
Jonathan Bedard
Comment 12 2017-05-18 13:14:57 PDT
Jonathan Bedard
Comment 13 2017-05-18 13:15:28 PDT
Comment on attachment 310542 [details] Patch This patch shouldn't be committed until the Internal bits land.
Jonathan Bedard
Comment 14 2017-06-01 11:15:53 PDT
Jonathan Bedard
Comment 15 2017-06-01 16:03:41 PDT
Daniel Bates
Comment 16 2017-06-02 09:34:26 PDT
Comment on attachment 311771 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=311771&action=review > Tools/ChangeLog:12 > + * Scripts/webkitpy/common/system/crashlogs.py: This ChangeLog entry is out-of-date or there is a bug in prepare-ChangeLog. In particular, this ChangeLog is missing an entry for the modification of function CrashLogs.is_crash_log(). Please regenerate this ChangeLog file or file a bug about prepare-ChangeLog. > Tools/Scripts/webkitpy/common/system/crashlogs.py:58 > + return False Is this correct? > Tools/Scripts/webkitpy/common/system/crashlogs.py:88 > + return False Is this correct? > Tools/Scripts/webkitpy/common/system/crashlogs.py:151 > + for line in log_contents.splitlines(): > + match = process_regex.match(line) > + if match: > + break > + Ewww. We should not duplicate code. I suggest that we extract out the process_regex and this parsing logic into a common helper method, say parse_darwin_crash_log(), that takes a path and symbolicates the crash log, parses it, and returns a (process name, pid) tuple on success and None if parsing fails. > Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:203 > + # Cache crash logs already on host. > + try: > + crash_log_directory = self._port.path_to_crash_logs() > + except NotImplementedError: > + return True > + for i in xrange(self._options.child_processes): > + target_host = self._port.target_host(i) > + if not target_host.cached_crash_logs: > + target_host.cached_crash_logs = target_host.filesystem.files_under(crash_log_directory) > + This is not the appropriate place to do port-specific logic. Port specific logic should live in the Port class. Can we move this logic into Port.setup_test_run()? > Tools/Scripts/webkitpy/port/device.py:70 > + return path This is not correct. The code in CrashLogs.py assumes symbolicate_crash_log_if_needed() returns the contents of the crash log. Either this code needs to change or the code in CrashLogs.py needs to change. > Tools/Scripts/webkitpy/port/ios_device.py:73 > + raise NotImplementedError Raising a NotImplementedError exception is inconsistent with the behavior described in other functions when the Apple Additions module is unavailable. For example, in _device_for_worker_number_map() we raise RuntimeError(self.NO_ON_DEVICE_TESTING) when the Apple Additions module is unavailable.
Jonathan Bedard
Comment 17 2017-06-02 10:12:09 PDT
Jonathan Bedard
Comment 18 2017-06-02 10:40:25 PDT
Comment on attachment 311771 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=311771&action=review I will update to address Dan's comments that I did not respond to. >> Tools/Scripts/webkitpy/common/system/crashlogs.py:58 >> + return False > > Is this correct? Yes. This is an optimization so that we do not parse crash logs already on the host before testing began. Perhaps cached_crash_logs needs a new name to make this more clear. >> Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:203 >> + > > This is not the appropriate place to do port-specific logic. Port specific logic should live in the Port class. Can we move this logic into Port.setup_test_run()? Per your suggestion a few weeks ago, this actually isn't port-specific. This is an optimization that all ports can use to avoid parsing crash logs not associated with layout tests. For most ports, this will only calculate the list of cached crash logs once since the target host for each process is the same. If a port specifies a different target host for a specific thread (as iOS does) this will calculate a list of cached crash logs for each target host.
Jonathan Bedard
Comment 19 2017-06-02 11:45:57 PDT
Jonathan Bedard
Comment 20 2017-06-02 15:16:50 PDT
Jonathan Bedard
Comment 21 2017-06-05 09:51:04 PDT
Jonathan Bedard
Comment 22 2017-06-05 11:12:18 PDT
Daniel Bates
Comment 23 2017-06-05 13:06:08 PDT
Comment on attachment 312018 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312018&action=review I suggest that we split this patch into at least two patches. One patch to fix up ServerProcess.process_name() and one patch to add the logic to symbolicate crash logs for iOS device. > Tools/Scripts/webkitpy/common/system/crashlogs.py:61 > + def _parse_darwin_crashlog(self, path): Crash logs is two words. So, we should call this _parse_darwin_crash_log(). > Tools/Scripts/webkitpy/common/system/crashlogs.py:83 > + parsed_crash_log = self._parse_darwin_crashlog(path) I suggest we destructure/unpack the tuple here so as to avoid the need to need to know that the first and second arguments of the tuple are the process name and pid, respectively. > Tools/Scripts/webkitpy/port/apple.py:57 > + _host_to_existing_crashlogs = {} Maybe a better name for this would be: crash_logs_to_skip_for_host. This dictionary is only meaningful for the iOS device (simulator?) port at the moment that has multiple target devices. It would better if we could move this to the the iOS device (simulator?) port. What is our naming convention for protected variables? > Tools/Scripts/webkitpy/port/apple.py:99 > + def host_to_crash_logs_to_skip(self, host): > + if host in self._host_to_existing_crashlogs: > + return self._host_to_existing_crashlogs[host] > + return [] I do not see the need to add such a getter-like function since it is only called from this class and derived classes. Instead I suggest caller just access the protected instance variable directly. > Tools/Scripts/webkitpy/port/darwin.py:241 > + def plist_data_from_bundle(self, app_bundle, entry): This should be private. > Tools/Scripts/webkitpy/port/darwin.py:248 > + return self._executive.run_command( > + ['/usr/libexec/PlistBuddy', '-c', 'Print {}'.format(entry), plist_path]).rstrip() I do not see the need to split this across two lines. I would write this on one line.
Jonathan Bedard
Comment 24 2017-06-06 09:24:46 PDT
Jonathan Bedard
Comment 25 2017-06-06 09:31:46 PDT
Jonathan Bedard
Comment 26 2017-06-06 09:32:04 PDT
Talking to Dan yesterday, this bug has been split. Dan's comments regarding the plist code will be addressed in bug 172940. The approach in attachment 312079 [details] keeps the host-to-crash-log-list dictionary on the Apple port, although the population of this dictionary is no longer generalized.
David Kilzer (:ddkilzer)
Comment 27 2017-06-06 13:18:05 PDT
Comment on attachment 312079 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=312079&action=review r=me, but please consider the feedback noted. > Tools/Scripts/webkitpy/common/system/crashlogs.py:74 > + if fs.join(dirpath, basename) in self._crash_logs_to_skip: > + return False Since iOS is the only port that uses skipped crash logs, you could add a check to see if self._crash_logs_to_skip is not empty before calling fs.join() (which creates a string just to see if it's in the list) [UNTESTED]: if self._crash_logs_to_skip and fs.join(dirpath, basename) in self._crash_logs_to_skip: If you do this, please add this code everywhere this check is added. > Tools/Scripts/webkitpy/common/system/crashlogs_unittest.py:37 > - return """Process: {process_name} [{pid}] > + return """May have garbage at the beginning I'd suggest something other than "garbage" here like: return """Crash log may not start with Process line
Jonathan Bedard
Comment 28 2017-06-06 13:25:17 PDT
Note You need to log in before you can comment on or make changes to this bug.