Bug 171976 - webkitpy: Process crash-logs for iOS devices
Summary: webkitpy: Process crash-logs for iOS devices
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Jonathan Bedard
URL:
Keywords: InRadar
Depends on: 172033
Blocks:
  Show dependency treegraph
 
Reported: 2017-05-11 09:42 PDT by Jonathan Bedard
Modified: 2017-06-06 13:25 PDT (History)
6 users (show)

See Also:


Attachments
Patch (16.75 KB, patch)
2017-05-11 10:00 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (17.77 KB, patch)
2017-05-11 16:12 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (22.41 KB, patch)
2017-05-15 15:23 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (22.80 KB, patch)
2017-05-18 13:14 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (24.31 KB, patch)
2017-06-01 11:15 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (27.19 KB, patch)
2017-06-01 16:03 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (28.10 KB, patch)
2017-06-02 10:12 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (30.30 KB, patch)
2017-06-02 11:45 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (29.15 KB, patch)
2017-06-02 15:16 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (29.87 KB, patch)
2017-06-05 09:51 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (33.19 KB, patch)
2017-06-05 11:12 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (26.16 KB, patch)
2017-06-06 09:24 PDT, Jonathan Bedard
no flags Details | Formatted Diff | Diff
Patch (26.13 KB, patch)
2017-06-06 09:31 PDT, Jonathan Bedard
ddkilzer: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jonathan Bedard 2017-05-11 09:42:05 PDT
We should process crash logs for iOS devices, sharing as much code as possible with other Darwin ports.
Comment 1 Radar WebKit Bug Importer 2017-05-11 09:42:48 PDT
<rdar://problem/32134551>
Comment 2 Jonathan Bedard 2017-05-11 10:00:42 PDT
Created attachment 309719 [details]
Patch
Comment 3 Daniel Bates 2017-05-11 10:38:15 PDT
Comment on attachment 309719 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=309719&action=review

I am not convinced that this is the correct approach. I understand that we may need to symbolicate device crash logs and that reading crash logs may require more filesystem operations to be performed on device more than the number of file system operations required to run a test. Can we symbolicate device crash logs on the host? Can we use efficient idioms to make the filesystem operations fast such that we can avoid the need to have a separate code path for iOS?

> Tools/Scripts/webkitpy/common/system/crashlogs.py:79
> +            self._host.list_crashlogs(newer_than), process_name, pid,

Can we symbolicate crash logs on the host? And avoid the need to add list_crashlogs()?
Comment 4 Daniel Bates 2017-05-11 14:08:15 PDT
Jonathan and I discussed this bug some more today (05/11). Jonathan is concerned that calling FileSystem.mtime() for each crash log would cause a significant slowdown when the device has many crash logs. For 1000 crash logs Jonathan estimated it would take 30s to a minute to call FileSystem.mtime() on each log. Note that these crash logs may have existed before the test run. One idea I mentioned to reduce the number of invocations of FileSystem.mtime() is to have some sort of cache of last seen files. When run-webkit-tests is run and before the first test is executed we compute and store all the files seen in the crash log directory on the target host (device). We could modify FileSystem.file_under() to take this cache (in fact the cache may even be the output of this same function) and return the difference between the actual listing and the cache.
Comment 5 Daniel Bates 2017-05-11 14:10:11 PDT
I forgot to mention, Jonathan and I also talked about adding a new function to the Host? object, called symbolicate_crash(), that takes the path to an unsymbolicated crash log and returns the path to the symbolicated crash log on the target device. For non-iOS ports symbolicate_crash() would return the path it was given. For iOS device, it would do something more meaningful.
Comment 6 Daniel Bates 2017-05-11 14:16:43 PDT
(In reply to Daniel Bates from comment #5)
> I forgot to mention, Jonathan and I also talked about adding a new function
> to the Host? object, called symbolicate_crash(), that takes the path to an
> unsymbolicated crash log and returns the path to the symbolicated crash log
> on the target device. For non-iOS ports symbolicate_crash() would return the
> path it was given. For iOS device, it would do something more meaningful.

We should still consider doing symbolication on the host machine (Mac) if possible instead of on the device as this is more efficient. Implementing symbolicate_crash() as described above and where the iOS device would do the actual symbolication is a first step. If it turns out that doing this computation on the device is a bottleneck for running tests and on-device caching is non-trivial it may naturally lead us to consider host symbolication as well as other optimization options.
Comment 7 Jonathan Bedard 2017-05-11 16:12:22 PDT
Created attachment 309808 [details]
Patch
Comment 8 Jonathan Bedard 2017-05-11 16:21:48 PDT
Still iterating on this (attachment 309808 [details]).

As Dan and I discussed, this implementation re-uses more code between iOS devices and other Darwin ports.  It uses a cached array of paths to ensure that we don't waste time checking crashes which occurred before the test run began.

The change that this does not contain, which Dan has suggested, is that the logs are not symbolicated on the host, they are symbolicated on the device.  Symbolicating logs on the host is possible, but it's more difficult and depends on the host having the binary which generated the crash.  One consideration here is if the host machine does not have the iOS SDK installed, or has a different SDK from the device.

I don't think we will see much of a performance benefit from symbolicating crashes on the host.  It will likely be faster to symbolicate on the host, but symbolication occurs once per crash and should only be applied to crashes which will be archived with the results.  I don't think that symbolicating crash-logs on the host is worth the trouble.
Comment 9 Daniel Bates 2017-05-11 16:52:03 PDT
Comment on attachment 309808 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=309808&action=review

> Tools/Scripts/webkitpy/common/system/crashlogs.py:165
> +                    count = 1
> +                    original_result = result_name
>                      while result_name in crash_logs:
> -                        result_name = result_name + "-1"
> +                        result_name = original_result + '-{}'.format(count)
> +                        count += 1

I suggest that we fix this in a separate bug and add a test.
Comment 10 Daniel Bates 2017-05-11 16:58:15 PDT
Comment on attachment 309808 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=309808&action=review

> Tools/Scripts/webkitpy/common/system/filesystem.py:54
> +    def __init__(self):
> +        self.cached_crashlogs = []
> +

This does not seem the appropriate place for this knowledge. The purpose of this class is to abstract file system operations. Maybe PlatformInfo would be a more appropriate place for this?

> Tools/Scripts/webkitpy/common/system/filesystem.py:349
> +    def symbolicated_crashlog(self, path):
> +        return path

I would call this symbolicate_crash_log() or symbolicate_crash_log_if_needed()

This does not seem the appropriate place for this knowledge. The purpose of this class is to abstract file system operations. Maybe Host would be a more appropriate place for this?

> Tools/Scripts/webkitpy/port/darwin.py:111
> +        crash_logs = CrashLogs((target_host or self.host))

Are the parentheses necessary? If not, then I suggest we remove them.

> Tools/Scripts/webkitpy/port/win.py:380
> +        crash_logs = CrashLogs((target_host or self.host), self.results_directory())

Ditto.
Comment 11 Jonathan Bedard 2017-05-15 15:23:30 PDT
Created attachment 310175 [details]
Patch
Comment 12 Jonathan Bedard 2017-05-18 13:14:57 PDT
Created attachment 310542 [details]
Patch
Comment 13 Jonathan Bedard 2017-05-18 13:15:28 PDT
Comment on attachment 310542 [details]
Patch

This patch shouldn't be committed until the Internal bits land.
Comment 14 Jonathan Bedard 2017-06-01 11:15:53 PDT
Created attachment 311732 [details]
Patch
Comment 15 Jonathan Bedard 2017-06-01 16:03:41 PDT
Created attachment 311771 [details]
Patch
Comment 16 Daniel Bates 2017-06-02 09:34:26 PDT
Comment on attachment 311771 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311771&action=review

> Tools/ChangeLog:12
> +        * Scripts/webkitpy/common/system/crashlogs.py:

This ChangeLog entry is out-of-date or there is a bug in prepare-ChangeLog. In particular, this ChangeLog is missing an entry for the modification of function CrashLogs.is_crash_log(). Please regenerate this ChangeLog file or file a bug about prepare-ChangeLog.

> Tools/Scripts/webkitpy/common/system/crashlogs.py:58
> +                return False

Is this correct?

> Tools/Scripts/webkitpy/common/system/crashlogs.py:88
> +                return False

Is this correct?

> Tools/Scripts/webkitpy/common/system/crashlogs.py:151
> +                    for line in log_contents.splitlines():
> +                        match = process_regex.match(line)
> +                        if match:
> +                            break
> +

Ewww. We should not duplicate code. I suggest that we extract out the process_regex and this parsing logic into a common helper method, say parse_darwin_crash_log(), that takes a path and symbolicates the crash log, parses it, and returns a (process name, pid) tuple on success and None if parsing fails.

> Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:203
> +        # Cache crash logs already on host.
> +        try:
> +            crash_log_directory = self._port.path_to_crash_logs()
> +        except NotImplementedError:
> +            return True
> +        for i in xrange(self._options.child_processes):
> +            target_host = self._port.target_host(i)
> +            if not target_host.cached_crash_logs:
> +                target_host.cached_crash_logs = target_host.filesystem.files_under(crash_log_directory)
> +

This is not the appropriate place to do port-specific logic. Port specific logic should live in the Port class. Can we move this logic into Port.setup_test_run()?

> Tools/Scripts/webkitpy/port/device.py:70
> +        return path

This is not correct. The code in CrashLogs.py assumes symbolicate_crash_log_if_needed() returns the contents of the crash log. Either this code needs to change or the code in CrashLogs.py needs to change.

> Tools/Scripts/webkitpy/port/ios_device.py:73
> +            raise NotImplementedError

Raising a NotImplementedError exception is inconsistent with the behavior described in other functions when the Apple Additions module is unavailable. For example, in _device_for_worker_number_map() we raise RuntimeError(self.NO_ON_DEVICE_TESTING) when the Apple Additions module is unavailable.
Comment 17 Jonathan Bedard 2017-06-02 10:12:09 PDT
Created attachment 311835 [details]
Patch
Comment 18 Jonathan Bedard 2017-06-02 10:40:25 PDT
Comment on attachment 311771 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=311771&action=review

I will update to address Dan's comments that I did not respond to.

>> Tools/Scripts/webkitpy/common/system/crashlogs.py:58
>> +                return False
> 
> Is this correct?

Yes.

This is an optimization so that we do not parse crash logs already on the host before testing began.  Perhaps cached_crash_logs needs a new name to make this more clear.

>> Tools/Scripts/webkitpy/layout_tests/controllers/manager.py:203
>> +
> 
> This is not the appropriate place to do port-specific logic. Port specific logic should live in the Port class. Can we move this logic into Port.setup_test_run()?

Per your suggestion a few weeks ago, this actually isn't port-specific.  This is an optimization that all ports can use to avoid parsing crash logs not associated with layout tests.

For most ports, this will only calculate the list of cached crash logs once since the target host for each process is the same.  If a port specifies a different target host for a specific thread (as iOS does) this will calculate a list of cached crash logs for each target host.
Comment 19 Jonathan Bedard 2017-06-02 11:45:57 PDT
Created attachment 311841 [details]
Patch
Comment 20 Jonathan Bedard 2017-06-02 15:16:50 PDT
Created attachment 311873 [details]
Patch
Comment 21 Jonathan Bedard 2017-06-05 09:51:04 PDT
Created attachment 312013 [details]
Patch
Comment 22 Jonathan Bedard 2017-06-05 11:12:18 PDT
Created attachment 312018 [details]
Patch
Comment 23 Daniel Bates 2017-06-05 13:06:08 PDT
Comment on attachment 312018 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=312018&action=review

I suggest that we split this patch into at least two patches. One patch to fix up ServerProcess.process_name() and one patch to add the logic to symbolicate crash logs for iOS device.

> Tools/Scripts/webkitpy/common/system/crashlogs.py:61
> +    def _parse_darwin_crashlog(self, path):

Crash logs is two words. So, we should call this _parse_darwin_crash_log().

> Tools/Scripts/webkitpy/common/system/crashlogs.py:83
> +                    parsed_crash_log = self._parse_darwin_crashlog(path)

I suggest we destructure/unpack the tuple here so as to avoid the need to need to know that the first and second arguments of the tuple are the process name and pid, respectively.

> Tools/Scripts/webkitpy/port/apple.py:57
> +    _host_to_existing_crashlogs = {}

Maybe a better name for this would be: crash_logs_to_skip_for_host.

This dictionary is only meaningful for the iOS device (simulator?) port at the moment that has multiple target devices. It would better if we could move this to the the iOS device (simulator?) port.

What is our naming convention for protected variables?

> Tools/Scripts/webkitpy/port/apple.py:99
> +    def host_to_crash_logs_to_skip(self, host):
> +        if host in self._host_to_existing_crashlogs:
> +            return self._host_to_existing_crashlogs[host]
> +        return []

I do not see  the need to add such a getter-like function since it is only called from this class and derived classes. Instead I suggest caller just access the protected instance variable directly.

> Tools/Scripts/webkitpy/port/darwin.py:241
> +    def plist_data_from_bundle(self, app_bundle, entry):

This should be private.

> Tools/Scripts/webkitpy/port/darwin.py:248
> +        return self._executive.run_command(
> +            ['/usr/libexec/PlistBuddy', '-c', 'Print {}'.format(entry), plist_path]).rstrip()

I do not see the need to split this across two lines. I would write this on one line.
Comment 24 Jonathan Bedard 2017-06-06 09:24:46 PDT
Created attachment 312078 [details]
Patch
Comment 25 Jonathan Bedard 2017-06-06 09:31:46 PDT
Created attachment 312079 [details]
Patch
Comment 26 Jonathan Bedard 2017-06-06 09:32:04 PDT
Talking to Dan yesterday, this bug has been split.  Dan's comments regarding the plist code will be addressed in bug 172940.

The approach in attachment 312079 [details] keeps the host-to-crash-log-list dictionary on the Apple port, although the population of this dictionary is no longer generalized.
Comment 27 David Kilzer (:ddkilzer) 2017-06-06 13:18:05 PDT
Comment on attachment 312079 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=312079&action=review

r=me, but please consider the feedback noted.

> Tools/Scripts/webkitpy/common/system/crashlogs.py:74
> +            if fs.join(dirpath, basename) in self._crash_logs_to_skip:
> +                return False

Since iOS is the only port that uses skipped crash logs, you could add a check to see if self._crash_logs_to_skip is not empty before calling fs.join() (which creates a string just to see if it's in the list) [UNTESTED]:

    if self._crash_logs_to_skip and fs.join(dirpath, basename) in self._crash_logs_to_skip:

If you do this, please add this code everywhere this check is added.

> Tools/Scripts/webkitpy/common/system/crashlogs_unittest.py:37
> -    return """Process:         {process_name} [{pid}]
> +    return """May have garbage at the beginning

I'd suggest something other than "garbage" here like:

    return """Crash log may not start with Process line
Comment 28 Jonathan Bedard 2017-06-06 13:25:17 PDT
Committed r217853: <http://trac.webkit.org/changeset/217853>