Bug 172002

Summary: webkitpy: Properly number duplicated crashlogs
Product: WebKit Reporter: Jonathan Bedard <jbedard>
Component: Tools / TestsAssignee: Jonathan Bedard <jbedard>
Status: RESOLVED FIXED    
Severity: Normal CC: aakash_jain, buildbot, commit-queue, dbates, glenn, lforschler
Priority: P2    
Version: Other   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=171976
https://bugs.webkit.org/show_bug.cgi?id=172039
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch for landing none

Description Jonathan Bedard 2017-05-11 17:05:28 PDT
Currently, if a single process has multiple crashlogs, we will number them like so:

crash.log
crash-1.log
crash-1-1.log
crash-1-1-1.log

This doesn't make much since, these should be increasing integer numbers.
Comment 1 Jonathan Bedard 2017-05-11 17:08:40 PDT
Created attachment 309827 [details]
Patch
Comment 2 Daniel Bates 2017-05-12 10:31:07 PDT
Comment on attachment 309827 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=309827&action=review

> Tools/Scripts/webkitpy/common/system/crashlogs.py:157
> +                    original_result = result_name

Maybe a better name for this variable would be original_result_name?

Maybe it would be more idiomatic to generate the unique name in a new local variable instead of mutating result_name? Like:

count = 1
unique_name = result_name
while unique_name in crash_logs:
    unique_name = result_name + '-' count
    count += 1
crash_logs[unique_name] = errors + log_contents

> Tools/Scripts/webkitpy/common/system/crashlogs.py:158
>                      while result_name in crash_logs:

I am assuming the intention of this code is to handle process ID reuse when requesting all logs regardless of modification time. Such a situation is most likely to occur as a result of running run-webkit-tests between machine restarts. I suggest that we add a comment above line 156 to explain the purpose of this code. We should also add a comment to explain that the formatting of result_name must be kept synchronized with the parsing logic in DarwinPort._merge_crash_logs(). On that note, we could change crash_logs to be a different data structure (say, an array to tuples of the form (process name, process id, crash log contents)) and avoid the need to compute this unique encoded string solely to pass the process id and process name info to DarwinPort._merge_crash_logs().

> Tools/Scripts/webkitpy/common/system/crashlogs.py:159
> +                        result_name = original_result + '-{}'.format(count)

It seems sufficient and more concise to use string concatenation to compute the new candidate name:

result_name = original_result_name + '-' + count

> Tools/Scripts/webkitpy/common/system/crashlogs_unittest.py:250
> +        self.files['/Users/mock/Library/Logs/DiagnosticReports/DumpRenderTree_2011-06-13-150718_double_quadzen.crash'] = self.older_mock_crash_report
> +        self.files['/Users/mock/Library/Logs/DiagnosticReports/DumpRenderTree_2011-06-13-150718_triple_quadzen.crash'] = self.older_mock_crash_report

I suggest that we try to use filenames that convey the scenario we want to handle. Maybe substitute DumpRenderTree_2011-06-13-150718_quadzen_after_reboot_1.crash and DumpRenderTree_2011-06-13-150718_quadzen_after_reboot_2.crash for DumpRenderTree_2011-06-13-150718_double_quadzen.crash and DumpRenderTree_2011-06-13-150718_triple_quadzen.crash, respectively.

> Tools/Scripts/webkitpy/common/system/crashlogs_unittest.py:284
> +        self.assertTrue('DumpRenderTree-28528' in all_logs)
> +        self.assertTrue('DumpRenderTree-28528-1' in all_logs)
> +        self.assertTrue('DumpRenderTree-28528-2' in all_logs)

For your consideration, I suggest that we assert that all of the keys in all_logs match some hardcoded expected array of keys regardless of key ordering. The benefit of this approach is that it ensures that our uniqueness algorithm does not generate a bad name (e.g. DumpRenderTree-28528-0).

> Tools/Scripts/webkitpy/common/system/crashlogs_unittest.py:287
> +        self.assertTrue('DumpRenderTree-28528-0' not in all_logs)
> +        self.assertTrue('DumpRenderTree-28528-1-1' not in all_logs)

This tests do not seem meaningful as singularities as many values are not in all_logs. We can accomplish the same testing as part of asserting all the keys in all_logs. See my above remark for more details.
Comment 3 Jonathan Bedard 2017-05-12 10:36:53 PDT
Comment on attachment 309827 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=309827&action=review

>> Tools/Scripts/webkitpy/common/system/crashlogs.py:158
>>                      while result_name in crash_logs:
> 
> I am assuming the intention of this code is to handle process ID reuse when requesting all logs regardless of modification time. Such a situation is most likely to occur as a result of running run-webkit-tests between machine restarts. I suggest that we add a comment above line 156 to explain the purpose of this code. We should also add a comment to explain that the formatting of result_name must be kept synchronized with the parsing logic in DarwinPort._merge_crash_logs(). On that note, we could change crash_logs to be a different data structure (say, an array to tuples of the form (process name, process id, crash log contents)) and avoid the need to compute this unique encoded string solely to pass the process id and process name info to DarwinPort._merge_crash_logs().

I've seen this in practice with Sandbox violations on device (it's actually pretty common).  Often, a sandbox violation will generate a 'crash' log, but the process will keep running.  As a result, we can get multiple logs from a single process. Since the system is classifying them as crash logs, I think we should too. Given that this is not intuitive, I will place a comment here.
Comment 4 Jonathan Bedard 2017-05-12 10:57:20 PDT
Created attachment 309917 [details]
Patch
Comment 5 Jonathan Bedard 2017-05-12 11:22:02 PDT
(In reply to Jonathan Bedard from comment #4)
> Created attachment 309917 [details]
> Patch

Updated the patch for Dan's comments.

I did not mention that we are trying to prevent breakage in DarwinPort._merge_crash_logs().  I don't think that detail is relevant, it's up to _merge_crash_logs() to conform with what is returned from this function, not the other way around.
Comment 6 Daniel Bates 2017-05-12 11:40:42 PDT
(In reply to Jonathan Bedard from comment #5)
> (In reply to Jonathan Bedard from comment #4)
> > Created attachment 309917 [details]
> > Patch
> 
> Updated the patch for Dan's comments.
> 
> I did not mention that we are trying to prevent breakage in
> DarwinPort._merge_crash_logs().  I don't think that detail is relevant, it's
> up to _merge_crash_logs() to conform with what is returned from this
> function, not the other way around.

Do we have unit tests for _merge_crash_logs()? If not, we should add some to catch a regression in the format of the crash_log key in _find_all_logs_darwin().
Comment 7 Jonathan Bedard 2017-05-12 11:57:34 PDT
(In reply to Daniel Bates from comment #6)
> (In reply to Jonathan Bedard from comment #5)
> > (In reply to Jonathan Bedard from comment #4)
> > > Created attachment 309917 [details]
> > > Patch
> > 
> > Updated the patch for Dan's comments.
> > 
> > I did not mention that we are trying to prevent breakage in
> > DarwinPort._merge_crash_logs().  I don't think that detail is relevant, it's
> > up to _merge_crash_logs() to conform with what is returned from this
> > function, not the other way around.
> 
> Do we have unit tests for _merge_crash_logs()? If not, we should add some to
> catch a regression in the format of the crash_log key in
> _find_all_logs_darwin().

We do not have unit tests for this at the moment.

I filed https://bugs.webkit.org/show_bug.cgi?id=172039 to track this.

Dan pointed out that until we test _merge_crash_logs, we should mention how _merge_crash_logs parses names.
Comment 8 Jonathan Bedard 2017-05-12 12:17:42 PDT
Created attachment 309926 [details]
Patch
Comment 9 Aakash Jain 2017-07-03 09:43:48 PDT
Comment on attachment 309926 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=309926&action=review

looks good to me.

> Tools/Scripts/webkitpy/common/system/crashlogs.py:156
> +                    # Processes can remain running after Sanbox violations, which generate crash logs.

Typo "Sanbox" => "Sandbox"
Comment 10 Jonathan Bedard 2017-07-03 10:50:23 PDT
Created attachment 314496 [details]
Patch for landing
Comment 11 WebKit Commit Bot 2017-07-03 13:14:46 PDT
The commit-queue encountered the following flaky tests while processing attachment 314496 [details]:

workers/bomb.html bug 171985 (author: fpizlo@apple.com)
The commit-queue is continuing to process your patch.
Comment 12 WebKit Commit Bot 2017-07-03 13:15:07 PDT
Comment on attachment 314496 [details]
Patch for landing

Clearing flags on attachment: 314496

Committed r219088: <http://trac.webkit.org/changeset/219088>
Comment 13 WebKit Commit Bot 2017-07-03 13:15:09 PDT
All reviewed patches have been landed.  Closing bug.