Bug 181166

Summary: Layout Test imported/w3c/web-platform-tests/service-workers/service-worker/register-same-scope-different-script-url.https.html is flaky
Product: WebKit Reporter: Matt Lewis <jlewis3>
Component: Service Workers Assignee: Chris Dumez <cdumez>
Status: RESOLVED FIXED    
Severity: Normal CC: beidson, cdumez, commit-queue, rniwa, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=180702
https://bugs.webkit.org/show_bug.cgi?id=181167
Attachments:
Description Flags
Patch none
Patch none

Description Matt Lewis 2017-12-26 17:46:23 PST
The following layout test is flaky on WK2:

imported/w3c/web-platform-tests/service-workers/service-worker/register-same-scope-different-script-url.https.html

Probable cause:

https://trac.webkit.org/changeset/226274/webkit

Flakiness Dashboard:

https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=imported%2Fw3c%2Fweb-platform-tests%2Fservice-workers%2Fservice-worker%2Fregister-same-scope-different-script-url.https.html


This test recently became extremely flaky. 

https://build.webkit.org/results/Apple%20High%20Sierra%20Release%20WK2%20(Tests)/r226294%20(1953)/results.html
https://build.webkit.org/builders/Apple%20High%20Sierra%20Release%20WK2%20(Tests)/builds/1953

--- /Volumes/Data/slave/highsierra-release-tests-wk2/build/layout-test-results/imported/w3c/web-platform-tests/service-workers/service-worker/register-same-scope-different-script-url.https-expected.txt
+++ /Volumes/Data/slave/highsierra-release-tests-wk2/build/layout-test-results/imported/w3c/web-platform-tests/service-workers/service-worker/register-same-scope-different-script-url.https-actual.txt
@@ -1,7 +1,9 @@
+
+Harness Error (TIMEOUT), message = null
 
 PASS Register different scripts concurrently 
 PASS Register then register new script URL 
 PASS Register then register new script URL that 404s 
 FAIL Register then register new script that does not install assert_unreached: unexpected rejection: assert_equals: on redundant, installing should be null expected null but got object "[object ServiceWorker]" Reached unreachable code
-PASS Register same-scope new script url effect on controller 
+TIMEOUT Register same-scope new script url effect on controller Test timed out
Comment 1 Matt Lewis 2017-12-26 17:53:33 PST
Marked as flaky in:
https://trac.webkit.org/changeset/226296/webkit
Comment 2 Chris Dumez 2018-02-02 13:58:23 PST
I managed to reproduce the flakiness like so:
run-webkit-tests -gf --repeat-each=25 imported/w3c/web-platform-tests/service-workers/service-worker/register-same-scope-different-script-url.https.html
Comment 3 Radar WebKit Bug Importer 2018-02-02 13:59:55 PST
<rdar://problem/37169508>
Comment 4 Chris Dumez 2018-02-02 14:25:12 PST
Looks like the registration.unregister() promise at the end of the last test sometimes does not get resolved.
Comment 5 Chris Dumez 2018-02-02 15:30:23 PST
If I disable soft-update, then the test is no longer flaky. It seems that SoftUpdate jobs sometimes "hang" the jobQueue, so that it stops processing new jobs.
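To illustrate the kind of hang described here, a minimal toy model follows. It is only a sketch under assumptions, not the actual SWServer code: ToyJobQueue, Job, WorkerID, canComplete and demo() are hypothetical names, and only the method name cancelJobsFromServiceWorker is taken from the patch under review. The model assumes a serial queue in which a job scheduled by a worker may never complete, so later jobs (such as an unregister) wait behind it until that worker's jobs are cancelled.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the real WebKit types.
using WorkerID = std::uint64_t;

struct Job {
    std::string name;
    WorkerID sourceWorker; // 0 models a job scheduled by a page rather than a worker.
    bool canComplete;      // false models a job whose completion is never signalled.
};

class ToyJobQueue {
public:
    void enqueue(Job job) { m_jobs.push_back(std::move(job)); }

    // Runs jobs strictly in order and stops at the first job that cannot
    // complete, which models the queue "hanging" behind that job.
    void run()
    {
        while (!m_jobs.empty() && m_jobs.front().canComplete) {
            m_completed.push_back(m_jobs.front().name);
            m_jobs.pop_front();
        }
    }

    // Drops every pending job scheduled by the given worker, then resumes
    // processing so that jobs queued behind them can make progress.
    void cancelJobsFromServiceWorker(WorkerID worker)
    {
        std::deque<Job> kept;
        for (auto& job : m_jobs) {
            if (job.sourceWorker != worker)
                kept.push_back(std::move(job));
        }
        m_jobs = std::move(kept);
        run();
    }

    const std::vector<std::string>& completed() const { return m_completed; }
    std::size_t pending() const { return m_jobs.size(); }

private:
    std::deque<Job> m_jobs;
    std::vector<std::string> m_completed;
};

// Hypothetical scenario: a soft-update job from worker 7 never completes and
// blocks an unregister job until the worker's jobs are cancelled.
std::vector<std::string> demo()
{
    ToyJobQueue queue;
    queue.enqueue({ "soft-update", 7, false });
    queue.enqueue({ "unregister", 0, true });
    queue.run();                          // Nothing completes: the queue is stuck.
    queue.cancelJobsFromServiceWorker(7); // The unregister job can now run.
    return queue.completed();
}
```

In this toy model the unregister job only completes after the stuck worker's jobs are dropped, which would match the symptom of the unregister() promise never resolving. Whether the real fix works this way is an assumption.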
Comment 6 Chris Dumez 2018-02-04 13:05:34 PST
Created attachment 333054 [details]
Patch
Comment 7 youenn fablet 2018-02-05 09:27:31 PST
Comment on attachment 333054 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=333054&action=review

> Source/WebCore/workers/service/ServiceWorkerJobData.cpp:39
>      : m_identifier { connectionIdentifier, generateThreadSafeObjectIdentifier<ServiceWorkerJobIdentifierType>() }

Preexisting issue but seems a bit weird to have m_identifier here and sourceContext which are both member fields of ServiceWorkerJobData.

> Source/WebCore/workers/service/server/SWServer.cpp:503
> +        // Abort if the job that scheduled this has not been cancelled.

s/has not/has/

> Source/WebCore/workers/service/server/SWServer.cpp:633
> +        jobQueue->cancelJobsFromServiceWorker(worker.identifier());

Maybe just me but I would read it more easily with something like cancelServiceWorkerJobs. Ditto below.
Comment 8 Chris Dumez 2018-02-05 09:35:49 PST
(In reply to youenn fablet from comment #7)
> Comment on attachment 333054 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=333054&action=review
> 
> > Source/WebCore/workers/service/ServiceWorkerJobData.cpp:39
> >      : m_identifier { connectionIdentifier, generateThreadSafeObjectIdentifier<ServiceWorkerJobIdentifierType>() }
> 
> Preexisting issue but seems a bit weird to have m_identifier here and
> sourceContext which are both member fields of ServiceWorkerJobData.
> 
> > Source/WebCore/workers/service/server/SWServer.cpp:503
> > +        // Abort if the job that scheduled this has not been cancelled.
> 
> s/has not/has/
> 
> > Source/WebCore/workers/service/server/SWServer.cpp:633
> > +        jobQueue->cancelJobsFromServiceWorker(worker.identifier());
> 
> Maybe just me but I would read it more easily with something like
> cancelServiceWorkerJobs. Ditto below.

I personally prefer cancelJobsFromServiceWorker to cancelServiceWorkerJobs. "cancelServiceWorkerJobs" sounds very generic given that we are in service worker code.
Comment 9 Chris Dumez 2018-02-05 09:36:55 PST
Created attachment 333086 [details]
Patch
Comment 10 WebKit Commit Bot 2018-02-05 10:12:12 PST
Comment on attachment 333086 [details]
Patch

Clearing flags on attachment: 333086

Committed r228101: <https://trac.webkit.org/changeset/228101>
Comment 11 WebKit Commit Bot 2018-02-05 10:12:14 PST
All reviewed patches have been landed.  Closing bug.