Bug 199709 - [ews-build] Parse and display layout test failures
Summary: [ews-build] Parse and display layout test failures
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Aakash Jain
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-07-11 06:31 PDT by Aakash Jain
Modified: 2019-07-15 09:55 PDT (History)
5 users (show)

See Also:


Attachments
Patch (7.90 KB, patch)
2019-07-11 06:45 PDT, Aakash Jain
aakash_jain: review?
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Aakash Jain 2019-07-11 06:31:07 PDT
Parse layout test failures and update step and build summary accordingly.
Comment 1 Aakash Jain 2019-07-11 06:45:17 PDT
Created attachment 373921 [details]
Patch
Comment 2 EWS Watchlist 2019-07-11 06:48:38 PDT Comment hidden (obsolete)
Comment 3 Aakash Jain 2019-07-11 07:58:55 PDT
Sample runs:

Pass: https://ews-build.webkit-uat.org/#/builders/40/builds/95
Retry in case of unexpected failure: https://ews-build.webkit-uat.org/#/builders/40/builds/91
Comment 4 Jonathan Bedard 2019-07-11 09:15:50 PDT
(In reply to Aakash Jain from comment #3)
> Sample runs:
> 
> Pass: https://ews-build.webkit-uat.org/#/builders/40/builds/95
> Retry in case of unexpected failure:
> https://ews-build.webkit-uat.org/#/builders/40/builds/91

Looking a† build 91...am I correct in understanding that 92, 93 and 94 are all retries of the same patch, right?

Do we have an early exit at any point, if too many retries are attempted?
Comment 5 Aakash Jain 2019-07-11 09:28:02 PDT
> Looking a† build 91...am I correct in understanding that 92, 93 and 94 are all retries of the same patch, right?
Yeah.

> Do we have an early exit at any point, if too many retries are attempted?
Nope. This most likely indicate some infrastructure issue, and we don't want to burn through build-requests. Old EWS also has infinite retry logic for various scenarios.
Comment 6 Jonathan Bedard 2019-07-11 11:18:14 PDT
Comment on attachment 373921 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=373921&action=review

> Tools/BuildSlaveSupport/ews-build/steps.py:842
> +    # FIXME: This will break if run-webkit-tests changes its default log formatter.

What exactly is there to 'FIX' here? Maybe an example of a line we're matching would be helpful. I'm familiar enough with run-webkit-tests that I know that this ragex is correct, but others might not be.

> Tools/BuildSlaveSupport/ews-build/steps.py:845
> +    def _strip_python_logging_prefix(self, line):

Should this just be '_strip_logging_prefix(...)'?

> Tools/BuildSlaveSupport/ews-build/steps.py:853
> +        expressions = [

Why a list of pairs and not a dictionary, does order matter?

> Tools/BuildSlaveSupport/ews-build/steps.py:872
> +                # FIXME: Parse file names and put them in results

Do we have a bug number for this?

> Tools/BuildSlaveSupport/ews-build/steps.py:897
> +                    return FAILURE

So we return a failure if we have incorrectLayoutLines, even if the cmd.rc is 254?

> Tools/BuildSlaveSupport/ews-build/steps.py:902
> +        if cmd.rc == 254:

There is a danger here.

What happens if I upload a bad patch that causes run-webkit-tests to raise this exception? How do we stop the retry?
Comment 7 Aakash Jain 2019-07-15 08:26:42 PDT
Comment on attachment 373921 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=373921&action=review

>> Tools/BuildSlaveSupport/ews-build/steps.py:842
>> +    # FIXME: This will break if run-webkit-tests changes its default log formatter.
> 
> What exactly is there to 'FIX' here? Maybe an example of a line we're matching would be helpful. I'm familiar enough with run-webkit-tests that I know that this ragex is correct, but others might not be.

This code is copied from similar code in build.webkit.org buildbot config (and other internal buildbot configs). Ditto for most other comments. I prefer to not spend much time in it if we know it works.

>> Tools/BuildSlaveSupport/ews-build/steps.py:845
>> +    def _strip_python_logging_prefix(self, line):
> 
> Should this just be '_strip_logging_prefix(...)'?

Ditto. Copied from build.webkit.org, let's keep it same for consistency.

>> Tools/BuildSlaveSupport/ews-build/steps.py:853
>> +        expressions = [
> 
> Why a list of pairs and not a dictionary, does order matter?

Ditto, copied from build.webkit.org code.

>> Tools/BuildSlaveSupport/ews-build/steps.py:872
>> +                # FIXME: Parse file names and put them in results
> 
> Do we have a bug number for this?

Ditto, copied from build.webkit.org code.

>> Tools/BuildSlaveSupport/ews-build/steps.py:897
>> +                    return FAILURE
> 
> So we return a failure if we have incorrectLayoutLines, even if the cmd.rc is 254?

If cmd.rc is 254, that's probably an infrastructure failure, or an exception/hard-failure in which case the stdio wouldn't have required regexes (like 'Regressions: Unexpected.') and so self.incorrectLayoutLines would be empty. 

Also, this code is copied from internal buildbot config in which it has been working fine for a long time.

>> Tools/BuildSlaveSupport/ews-build/steps.py:902
>> +        if cmd.rc == 254:
> 
> There is a danger here.
> 
> What happens if I upload a bad patch that causes run-webkit-tests to raise this exception? How do we stop the retry?

It would stop once the patch is obsolete, else me/bot-watcher would have to manually click 'Stop' button. When we add authentication for all the users, webkit developers might be able to click 'Stop' button.

I am not sure what's the best thing to do in such scenario, should we RETRY here assuming it's an infrastructure issue (just like internal buildbot) or just fail the patch. 

Old EWS probably fails the patch in case of infrastructure issue, we are trying to do better here.
Comment 8 Jonathan Bedard 2019-07-15 09:21:26 PDT
I'll rubber-stamp this since it's mostly code we're moving.

I think that infinitely retrying on an exit code of 254 could be something we end up regretting. Looking through run-webkit-tests, we use that return code when either the port arguments are wrong (which seems fair) or an exception is raised and never caught. The problem lies in that second case. When booting simulators, for example, we will raise an exception if this booting fails. It's fair to call this sort of failure an 'infrastructure' failure, but my experience with a problem like this is that infinite-retry probably isn't going to help.

rs=me
Comment 9 Aakash Jain 2019-07-15 09:54:31 PDT
Committed r247433: <https://trac.webkit.org/changeset/247433>
Comment 10 Radar WebKit Bug Importer 2019-07-15 09:55:49 PDT
<rdar://problem/53108789>