Bug 183705

Summary: [ WK2 ] http/tests/workers/service/client-*-page-cache.html LayoutTests are flaky
Product: WebKit Reporter: Ryan Haddad <ryanhaddad>
Component: Service WorkersAssignee: Chris Dumez <cdumez>
Status: RESOLVED FIXED    
Severity: Normal CC: beidson, cdumez, commit-queue, jlewis3, tsavell, webkit-bug-importer, wenson_hsieh, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
Attachments:
Description Flags
Patch
none
Patch
none
Patch
none
Patch none

Description Ryan Haddad 2018-03-16 13:03:58 PDT
The following two LayoutTests are flaky failures on iOS and macOS WK2:

http/tests/workers/service/client-added-to-clients-when-restored-from-page-cache.html	
http/tests/workers/service/client-removed-from-clients-while-in-page-cache.html

https://build.webkit.org/results/Apple%20Sierra%20Debug%20WK2%20(Tests)/r229663%20(5649)/results.html

https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=http%2Ftests%2Fworkers%2Fservice%2Fclient


Both tests fail with a similar diff:

--- /Volumes/Data/slave/sierra-debug-tests-wk2/build/layout-test-results/http/tests/workers/service/client-added-to-clients-when-restored-from-page-cache-expected.txt
+++ /Volumes/Data/slave/sierra-debug-tests-wk2/build/layout-test-results/http/tests/workers/service/client-added-to-clients-when-restored-from-page-cache-actual.txt
@@ -1,6 +1,4 @@
 * Tests that a client is re-added to the list of service worker clients when it is restored from the page cache
 
-PASS: service worker has initially 2 clients
-PASS: page is about to enter page cache
-PASS: service worker now has 2 clients again after restoring the second one from page cache
+FAIL: Wrong initial number of clients: 3
Comment 1 Truitt Savell 2018-07-18 09:47:17 PDT
These two tests currently became extremely flaky along with a third. 

http/tests/workers/service/client-added-to-clients-when-restored-from-page-cache.html
http/tests/workers/service/client-removed-from-clients-while-in-page-cache.html

http/tests/workers/service/serviceworkerclients-matchAll.https.html

Combined History:
https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=http%2Ftests%2Fworkers%2Fservice%2Fserviceworkerclients-matchAll.https.html%20%20http%2Ftests%2Fworkers%2Fservice%2Fclient-added-to-clients-when-restored-from-page-cache.html%20%20http%2Ftests%2Fworkers%2Fservice%2Fclient-removed-from-clients-while-in-page-cache.html

They all began failing around r233895. 

serviceworkerclients-matchAll.https.html Text Diff:

--- /Volumes/Data/slave/highsierra-release-tests-wk2/build/layout-test-results/http/tests/workers/service/serviceworkerclients-matchAll.https-expected.txt
+++ /Volumes/Data/slave/highsierra-release-tests-wk2/build/layout-test-results/http/tests/workers/service/serviceworkerclients-matchAll.https-actual.txt
@@ -1,4 +1,4 @@
 
 PASS Setup worker 
-PASS Test self.clients.matchAll 
+FAIL Test self.clients.matchAll assert_equals: expected "PASS" but got "FAIL: expected one matched client, got 2"

These issues seem to be due to the test receiving one additional client than they expected.
Comment 2 Truitt Savell 2018-07-19 11:47:34 PDT
Issue seems to have began between r233891 -- 233897. 

r233891 has specific changes to Clients: https://trac.webkit.org/changeset/233891/webkit
Comment 3 Wenson Hsieh 2018-07-19 12:00:58 PDT
(In reply to Truitt Savell from comment #2)
> Issue seems to have began between r233891 -- 233897. 
> 
> r233891 has specific changes to Clients:
> https://trac.webkit.org/changeset/233891/webkit

I really doubt this is the cause. This patch has no behavior change unless clients opt into the new SPI, which WebKitTestRunner does not.
Comment 4 Chris Dumez 2018-07-19 12:02:13 PDT
(In reply to Wenson Hsieh from comment #3)
> (In reply to Truitt Savell from comment #2)
> > Issue seems to have began between r233891 -- 233897. 
> > 
> > r233891 has specific changes to Clients:
> > https://trac.webkit.org/changeset/233891/webkit
> 
> I really doubt this is the cause. This patch has no behavior change unless
> clients opt into the new SPI, which WebKitTestRunner does not.

I am working on a bad crasher right now but I am happy to look into this bug right after that.
Comment 5 Truitt Savell 2018-07-19 15:27:28 PDT
I noticed that 
http/tests/workers/service/client-added-to-clients-when-restored-from-page-cache.html
http/tests/workers/service/client-removed-from-clients-while-in-page-cache.html

always fail together. They may be getting stuck in a state together. Or because they getting more clients than they expect maybe something is providing too many clients earlier in the run.
Comment 6 Chris Dumez 2018-07-20 09:01:10 PDT
Ok, I have some spare cycles to look into this.
Comment 7 Chris Dumez 2018-07-20 09:02:35 PDT
> Issue seems to have began between r233891 -- 233897. 

Likely a regression from process-swap-on-navigation then, which was r233897.
Comment 8 Chris Dumez 2018-07-20 10:05:36 PDT
(In reply to Chris Dumez from comment #7)
> > Issue seems to have began between r233891 -- 233897. 
> 
> Likely a regression from process-swap-on-navigation then, which was r233897.

Unable to reproduce so far:
Tools/Scripts/run-webkit-tests http/tests/workers/service/client-added-to-clients-when-restored-from-page-cache.html http/tests/workers/service/client-removed-from-clients-while-in-page-cache.html --repeat-each=100 -f

Also tried running the whole folder.
Comment 9 Chris Dumez 2018-07-20 11:05:47 PDT
Created attachment 345463 [details]
Patch
Comment 10 youenn fablet 2018-07-20 11:16:19 PDT
Comment on attachment 345463 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=345463&action=review

> LayoutTests/http/tests/workers/service/client-added-to-clients-when-restored-from-page-cache.html:17
> +      worker.postMessage("getClientCount");

Shouldn't we do the usual pattern:
if (++counter < 20) {
  log("FAIL...")
  finishSWTest();
}
await waitFor(50);
worker.postMessage("getClientCount");

> LayoutTests/http/tests/workers/service/serviceworkerclients-matchAll-worker.js:26
> +            result = await matchAllPromise1;

This loop does not seem useful as is.
We might want to change matchAllPromise1 to a function which would return a new promise every time it is called.
Again a waitFor(50) might help.
Comment 11 Chris Dumez 2018-07-20 12:06:43 PDT
Created attachment 345470 [details]
Patch
Comment 12 WebKit Commit Bot 2018-07-20 12:34:16 PDT
Comment on attachment 345470 [details]
Patch

Clearing flags on attachment: 345470

Committed r234061: <https://trac.webkit.org/changeset/234061>
Comment 13 WebKit Commit Bot 2018-07-20 12:34:17 PDT
All reviewed patches have been landed.  Closing bug.
Comment 14 Radar WebKit Bug Importer 2018-07-20 12:35:19 PDT
<rdar://problem/42440606>
Comment 15 Truitt Savell 2018-07-20 16:44:05 PDT
(In reply to WebKit Commit Bot from comment #12)
> Comment on attachment 345470 [details]
> Patch
> 
> Clearing flags on attachment: 345470
> 
> Committed r234061: <https://trac.webkit.org/changeset/234061>

It does not look like this change made a difference. Test is still flakey and behavior seems to be the same. failure message is also the same.
Comment 16 Truitt Savell 2018-07-20 16:44:35 PDT
reopening for visibility
Comment 17 Chris Dumez 2018-07-20 16:47:52 PDT
(In reply to Truitt Savell from comment #15)
> (In reply to WebKit Commit Bot from comment #12)
> > Comment on attachment 345470 [details]
> > Patch
> > 
> > Clearing flags on attachment: 345470
> > 
> > Committed r234061: <https://trac.webkit.org/changeset/234061>
> 
> It does not look like this change made a difference. Test is still flakey
> and behavior seems to be the same. failure message is also the same.

Which test? What is the output?
Comment 18 Chris Dumez 2018-07-20 16:49:38 PDT
(In reply to Chris Dumez from comment #17)
> (In reply to Truitt Savell from comment #15)
> > (In reply to WebKit Commit Bot from comment #12)
> > > Comment on attachment 345470 [details]
> > > Patch
> > > 
> > > Clearing flags on attachment: 345470
> > > 
> > > Committed r234061: <https://trac.webkit.org/changeset/234061>
> > 
> > It does not look like this change made a difference. Test is still flakey
> > and behavior seems to be the same. failure message is also the same.
> 
> Which test? What is the output?

The 2 page cache tests:
--- /Volumes/Data/slave/sierra-release-tests-wk2/build/layout-test-results/http/tests/workers/service/client-removed-from-clients-while-in-page-cache-expected.txt
+++ /Volumes/Data/slave/sierra-release-tests-wk2/build/layout-test-results/http/tests/workers/service/client-removed-from-clients-while-in-page-cache-actual.txt
@@ -1,6 +1,4 @@
 * Tests that a client is removed from the list of service worker clients while it is in the page cache
 
-PASS: service worker has initially 2 clients
-PASS: page is about to enter page cache
-PASS: service worker has only 1 client after 1 entered page cache
+FAIL: Wrong initial number of clients: 3
Comment 20 Chris Dumez 2018-07-20 16:56:40 PDT
(In reply to Truitt Savell from comment #19)
> (In reply to Chris Dumez from comment #17)
> > (In reply to Truitt Savell from comment #15)
> > > (In reply to WebKit Commit Bot from comment #12)
> > > > Comment on attachment 345470 [details]
> > > > Patch
> > > > 
> > > > Clearing flags on attachment: 345470
> > > > 
> > > > Committed r234061: <https://trac.webkit.org/changeset/234061>
> > > 
> > > It does not look like this change made a difference. Test is still flakey
> > > and behavior seems to be the same. failure message is also the same.
> > 
> > Which test? What is the output?
> 
> Sorry I meant tests. All three tests:
> https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.
> html#showAllRuns=true&tests=http%2Ftests%2Fworkers%2Fservice%2Fserviceworkerc
> lients-matchAll.https.html%20%20http%2Ftests%2Fworkers%2Fservice%2Fclient-
> added-to-clients-when-restored-from-page-cache.
> html%20%20http%2Ftests%2Fworkers%2Fservice%2Fclient-removed-from-clients-
> while-in-page-cache.html
> 
> are still having flakey failures.
> 
> Test Outputs:
> https://build.webkit.org/results/
> Apple%20iOS%2011%20Simulator%20Debug%20WK2%20(Tests)/r234067%20(5511)/http/
> tests/workers/service/serviceworkerclients-matchAll.https-pretty-diff.html
> 
> https://build.webkit.org/results/Apple%20Sierra%20Release%20WK2%20(Tests)/
> r234064%20(10678)/http/tests/workers/service/client-added-to-clients-when-
> restored-from-page-cache-pretty-diff.html

So maybe the clients are actually leaking in some cases? :/
Comment 21 Ryan Haddad 2018-07-23 10:40:32 PDT
Marked tests as flaky in https://trac.webkit.org/r234100
Comment 22 Chris Dumez 2018-07-23 12:33:23 PDT
Committed r234107: <https://trac.webkit.org/changeset/234107>
Comment 23 Chris Dumez 2018-07-23 12:33:52 PDT
Reopening as this latest fix was only for http/tests/workers/service/serviceworkerclients-matchAll.https.html. It is works, I'll apply the same fix to the other 2 tests.
Comment 24 Ryan Haddad 2018-07-23 14:39:13 PDT
(In reply to Chris Dumez from comment #23)
> Reopening as this latest fix was only for
> http/tests/workers/service/serviceworkerclients-matchAll.https.html. It is
> works, I'll apply the same fix to the other 2 tests.
Here is a failure on iOS Simulator after r234107:
https://build.webkit.org/results/Apple%20iOS%2011%20Simulator%20Release%20WK2%20(Tests)/r234109%20(6374)/results.html
Comment 25 Ryan Haddad 2018-07-24 13:30:55 PDT
Marked http/tests/workers/service/serviceworkerclients-matchAll.https.html as flaky again in https://trac.webkit.org/r234170.
Comment 26 Chris Dumez 2018-07-24 15:44:32 PDT
(In reply to Ryan Haddad from comment #25)
> Marked http/tests/workers/service/serviceworkerclients-matchAll.https.html
> as flaky again in https://trac.webkit.org/r234170.

Ok thanks. I will look into this again when I am done with more urgent matters.
Comment 27 Chris Dumez 2018-08-31 14:18:26 PDT
Created attachment 348671 [details]
Patch
Comment 28 youenn fablet 2018-08-31 14:39:34 PDT
Comment on attachment 348671 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=348671&action=review

> Source/WebCore/testing/Internals.cpp:2331
> +String Internals::serviceWorkerClientIdentifier(Document& document) const

const Document& maybe?

> LayoutTests/ChangeLog:10
> +        identifiers to see if they are present or not.

Were these tests testable in a regular browser before?
Would it make sense to use internals API if available and rely on client count otherwise?

Looking further for some tests, for controlled/uncontrolled clients, it makes sense to use identifiers.
Comment 29 Chris Dumez 2018-08-31 14:43:17 PDT
Comment on attachment 348671 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=348671&action=review

>> Source/WebCore/testing/Internals.cpp:2331
>> +String Internals::serviceWorkerClientIdentifier(Document& document) const
> 
> const Document& maybe?

Canno
Comment 30 Chris Dumez 2018-08-31 14:44:02 PDT
Comment on attachment 348671 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=348671&action=review

>>> Source/WebCore/testing/Internals.cpp:2331
>>> +String Internals::serviceWorkerClientIdentifier(Document& document) const
>> 
>> const Document& maybe?
> 
> Canno

Oh, my previous implementation needed this non-const but the current implementation seems like it would allow it. Will check.
Comment 31 Chris Dumez 2018-08-31 14:46:32 PDT
Created attachment 348674 [details]
Patch
Comment 32 Chris Dumez 2018-08-31 14:48:23 PDT
Comment on attachment 348674 [details]
Patch

Clearing flags on attachment: 348674

Committed r235570: <https://trac.webkit.org/changeset/235570>
Comment 33 Chris Dumez 2018-08-31 14:48:25 PDT
All reviewed patches have been landed.  Closing bug.