Bug 172002

Summary:

webkitpy: Properly number duplicated crashlogs

Product:

WebKit

Reporter:

Jonathan Bedard <jbedard>

Component:

Tools / Tests

Assignee:

Jonathan Bedard <jbedard>

Status:

RESOLVED FIXED

Severity:

Normal

CC:

aakash_jain, buildbot, commit-queue, dbates, glenn, lforschler

Priority:

Version:

Other

Hardware:

Unspecified

OS:

Unspecified

See Also:

https://bugs.webkit.org/show_bug.cgi?id=171976
https://bugs.webkit.org/show_bug.cgi?id=172039

Attachments:

Description	Flags
Patch	none
Patch	none
Patch	none
Patch for landing	none

Jonathan Bedard

Reported 2017-05-11 17:05:28 PDT

Currently, if a single process has multiple crashlogs, we will number them like so: crash.log crash-1.log crash-1-1.log crash-1-1-1.log This doesn't make much since, these should be increasing integer numbers.

Attachments
Patch (4.37 KB, patch) 2017-05-11 17:08 PDT, Jonathan Bedard	no flags	Details Formatted Diff Diff
Patch (5.00 KB, patch) 2017-05-12 10:57 PDT, Jonathan Bedard	no flags	Details Formatted Diff Diff
Patch (5.20 KB, patch) 2017-05-12 12:17 PDT, Jonathan Bedard	no flags	Details Formatted Diff Diff
Patch for landing (5.21 KB, patch) 2017-07-03 10:50 PDT, Jonathan Bedard	no flags	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Jonathan Bedard

Comment 1 2017-05-11 17:08:40 PDT

Created attachment 309827 [details] Patch

Daniel Bates

Comment 2 2017-05-12 10:31:07 PDT

Comment on attachment 309827 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=309827&action=review > Tools/Scripts/webkitpy/common/system/crashlogs.py:157 > + original_result = result_name Maybe a better name for this variable would be original_result_name? Maybe it would be more idiomatic to generate the unique name in a new local variable instead of mutating result_name? Like: count = 1 unique_name = result_name while unique_name in crash_logs: unique_name = result_name + '-' count count += 1 crash_logs[unique_name] = errors + log_contents > Tools/Scripts/webkitpy/common/system/crashlogs.py:158 > while result_name in crash_logs: I am assuming the intention of this code is to handle process ID reuse when requesting all logs regardless of modification time. Such a situation is most likely to occur as a result of running run-webkit-tests between machine restarts. I suggest that we add a comment above line 156 to explain the purpose of this code. We should also add a comment to explain that the formatting of result_name must be kept synchronized with the parsing logic in DarwinPort._merge_crash_logs(). On that note, we could change crash_logs to be a different data structure (say, an array to tuples of the form (process name, process id, crash log contents)) and avoid the need to compute this unique encoded string solely to pass the process id and process name info to DarwinPort._merge_crash_logs(). > Tools/Scripts/webkitpy/common/system/crashlogs.py:159 > + result_name = original_result + '-{}'.format(count) It seems sufficient and more concise to use string concatenation to compute the new candidate name: result_name = original_result_name + '-' + count > Tools/Scripts/webkitpy/common/system/crashlogs_unittest.py:250 > + self.files['/Users/mock/Library/Logs/DiagnosticReports/DumpRenderTree_2011-06-13-150718_double_quadzen.crash'] = self.older_mock_crash_report > + self.files['/Users/mock/Library/Logs/DiagnosticReports/DumpRenderTree_2011-06-13-150718_triple_quadzen.crash'] = self.older_mock_crash_report I suggest that we try to use filenames that convey the scenario we want to handle. Maybe substitute DumpRenderTree_2011-06-13-150718_quadzen_after_reboot_1.crash and DumpRenderTree_2011-06-13-150718_quadzen_after_reboot_2.crash for DumpRenderTree_2011-06-13-150718_double_quadzen.crash and DumpRenderTree_2011-06-13-150718_triple_quadzen.crash, respectively. > Tools/Scripts/webkitpy/common/system/crashlogs_unittest.py:284 > + self.assertTrue('DumpRenderTree-28528' in all_logs) > + self.assertTrue('DumpRenderTree-28528-1' in all_logs) > + self.assertTrue('DumpRenderTree-28528-2' in all_logs) For your consideration, I suggest that we assert that all of the keys in all_logs match some hardcoded expected array of keys regardless of key ordering. The benefit of this approach is that it ensures that our uniqueness algorithm does not generate a bad name (e.g. DumpRenderTree-28528-0). > Tools/Scripts/webkitpy/common/system/crashlogs_unittest.py:287 > + self.assertTrue('DumpRenderTree-28528-0' not in all_logs) > + self.assertTrue('DumpRenderTree-28528-1-1' not in all_logs) This tests do not seem meaningful as singularities as many values are not in all_logs. We can accomplish the same testing as part of asserting all the keys in all_logs. See my above remark for more details.

Jonathan Bedard

Comment 3 2017-05-12 10:36:53 PDT

Comment on attachment 309827 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=309827&action=review >> Tools/Scripts/webkitpy/common/system/crashlogs.py:158 >> while result_name in crash_logs: > > I am assuming the intention of this code is to handle process ID reuse when requesting all logs regardless of modification time. Such a situation is most likely to occur as a result of running run-webkit-tests between machine restarts. I suggest that we add a comment above line 156 to explain the purpose of this code. We should also add a comment to explain that the formatting of result_name must be kept synchronized with the parsing logic in DarwinPort._merge_crash_logs(). On that note, we could change crash_logs to be a different data structure (say, an array to tuples of the form (process name, process id, crash log contents)) and avoid the need to compute this unique encoded string solely to pass the process id and process name info to DarwinPort._merge_crash_logs(). I've seen this in practice with Sandbox violations on device (it's actually pretty common). Often, a sandbox violation will generate a 'crash' log, but the process will keep running. As a result, we can get multiple logs from a single process. Since the system is classifying them as crash logs, I think we should too. Given that this is not intuitive, I will place a comment here.

Jonathan Bedard

Comment 4 2017-05-12 10:57:20 PDT

Created attachment 309917 [details] Patch

Jonathan Bedard

Comment 5 2017-05-12 11:22:02 PDT

(In reply to Jonathan Bedard from comment #4) > Created attachment 309917 [details] > Patch Updated the patch for Dan's comments. I did not mention that we are trying to prevent breakage in DarwinPort._merge_crash_logs(). I don't think that detail is relevant, it's up to _merge_crash_logs() to conform with what is returned from this function, not the other way around.

Daniel Bates

Comment 6 2017-05-12 11:40:42 PDT

(In reply to Jonathan Bedard from comment #5) > (In reply to Jonathan Bedard from comment #4) > > Created attachment 309917 [details] > > Patch > > Updated the patch for Dan's comments. > > I did not mention that we are trying to prevent breakage in > DarwinPort._merge_crash_logs(). I don't think that detail is relevant, it's > up to _merge_crash_logs() to conform with what is returned from this > function, not the other way around. Do we have unit tests for _merge_crash_logs()? If not, we should add some to catch a regression in the format of the crash_log key in _find_all_logs_darwin().

Jonathan Bedard

Comment 7 2017-05-12 11:57:34 PDT

(In reply to Daniel Bates from comment #6) > (In reply to Jonathan Bedard from comment #5) > > (In reply to Jonathan Bedard from comment #4) > > > Created attachment 309917 [details] > > > Patch > > > > Updated the patch for Dan's comments. > > > > I did not mention that we are trying to prevent breakage in > > DarwinPort._merge_crash_logs(). I don't think that detail is relevant, it's > > up to _merge_crash_logs() to conform with what is returned from this > > function, not the other way around. > > Do we have unit tests for _merge_crash_logs()? If not, we should add some to > catch a regression in the format of the crash_log key in > _find_all_logs_darwin(). We do not have unit tests for this at the moment. I filed https://bugs.webkit.org/show_bug.cgi?id=172039 to track this. Dan pointed out that until we test _merge_crash_logs, we should mention how _merge_crash_logs parses names.

Jonathan Bedard

Comment 8 2017-05-12 12:17:42 PDT

Created attachment 309926 [details] Patch

Aakash Jain

Comment 9 2017-07-03 09:43:48 PDT

Comment on attachment 309926 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=309926&action=review looks good to me. > Tools/Scripts/webkitpy/common/system/crashlogs.py:156 > + # Processes can remain running after Sanbox violations, which generate crash logs. Typo "Sanbox" => "Sandbox"

Jonathan Bedard

Comment 10 2017-07-03 10:50:23 PDT

Created attachment 314496 [details] Patch for landing

WebKit Commit Bot

Comment 11 2017-07-03 13:14:46 PDT

The commit-queue encountered the following flaky tests while processing attachment 314496 [details]: workers/bomb.html bug 171985 (author: fpizlo@apple.com) The commit-queue is continuing to process your patch.

WebKit Commit Bot

Comment 12 2017-07-03 13:15:07 PDT

Comment on attachment 314496 [details] Patch for landing Clearing flags on attachment: 314496 Committed r219088: <http://trac.webkit.org/changeset/219088>

WebKit Commit Bot

Comment 13 2017-07-03 13:15:09 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.