Bug 202852 - http/tests/resourceLoadStatistics/switch-session-on-navigation-to-prevalent-with-interaction-database.html is a flaky failure
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Tools / Tests
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Kate Cheney
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-10-11 10:44 PDT by Kate Cheney
Modified: 2019-10-21 17:43 PDT

See Also:


Attachments
Patch (46.84 KB, patch)
2019-10-17 16:54 PDT, Kate Cheney

Description Kate Cheney 2019-10-11 10:44:02 PDT
The following layout test is a flaky failure on macOS WK2 and iOS WK2:

http/tests/resourceLoadStatistics/switch-session-on-navigation-to-prevalent-with-interaction-database.html 

Reproducible with:

run-webkit-tests http/tests/resourceLoadStatistics/switch-session-on-navigation-to-prevalent-with-interaction-database.html --iterations 2

Flakiness Dashboard:

https://webkit-test-results.webkit.org/dashboards/flakiness_dashboard.html#showAllRuns=true&tests=http%2Ftests%2FresourceLoadStatistics%2Fswitch-session-on-navigation-to-prevalent-with-interaction-database.html%20

Diff:
--- /Volumes/Data/slave/highsierra-debug-tests-wk2/build/layout-test-results/http/tests/resourceLoadStatistics/switch-session-on-navigation-to-prevalent-with-interaction-expected.txt
+++ /Volumes/Data/slave/highsierra-debug-tests-wk2/build/layout-test-results/http/tests/resourceLoadStatistics/switch-session-on-navigation-to-prevalent-with-interaction-actual.txt
@@ -5,8 +5,9 @@
 
 PASS Should have and has the session cookie.
 PASS Should have and has the persistent cookie.
-PASS Origin has isolated session.
+FAIL Origin has no isolated session.
 PASS successfullyParsed is true
+Some tests failed.
 
 TEST COMPLETE
 

Probable cause:

The failures go away locally when I clear the NetworkLoad cache and all pending write operations between tests. I think this is because clearIsolatedSessions() is called between tests, but a new isolated session is only added by a call to startNetworkLoad() in NetworkResourceLoader, which doesn't happen if the resource is served from the cache. On a second iteration the resource is already cached, so no isolated session is created and the check fails.
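
To illustrate the suspected sequence, here is a toy model in Python. It is not WebKit code: the class ToyNetworkProcess, its methods, and the URL are hypothetical stand-ins, and it assumes that starting a network load is the only place an isolated session gets created.

```python
class ToyNetworkProcess:
    """Toy model only; names loosely mirror the ones in this report."""

    def __init__(self):
        self.cache = set()              # URLs currently in the cache
        self.isolated_sessions = set()  # domains that have an isolated session

    def load(self, url, domain):
        if url in self.cache:
            # Cache hit: the network load is skipped, so no session is added.
            return "cache"
        self.start_network_load(domain)
        self.cache.add(url)
        return "network"

    def start_network_load(self, domain):
        # Assumption: this is the only place an isolated session is created.
        self.isolated_sessions.add(domain)

    def clear_isolated_sessions(self):
        self.isolated_sessions.clear()

    def clear_cache(self):
        self.cache.clear()


def run_test(process, url="http://localhost:8000/test.html", domain="localhost"):
    """Return True when the domain ends up with an isolated session."""
    process.load(url, domain)
    return domain in process.isolated_sessions


process = ToyNetworkProcess()
first = run_test(process)          # network load, session created
process.clear_isolated_sessions()  # reset between tests; the cache survives
second = run_test(process)         # cache hit, no session created
process.clear_cache()              # the manual workaround
third = run_test(process)          # network load again, session created
print(first, second, third)
```

In this model the second run fails only because the cache outlives the session reset, which matches the behavior seen with --iterations 2.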

Before posting a fix, my questions are:

1. Should the clear() function in the cache also be clearing pending write operations? (I had to do it manually)
2. Should the cache be cleared between tests automatically?
3. The entry in the cache was being stored with a "#step4" on the end of it. Is this expected behavior?
Comment 1 Radar WebKit Bug Importer 2019-10-11 10:44:58 PDT
<rdar://problem/56195888>
Comment 2 Alexey Proskuryakov 2019-10-11 14:47:40 PDT
> 2. Should the cache be cleared between tests automatically?

That may have a substantial performance cost (which is worth verifying).

Another consequence is loss of incidental testing. In other words, if we clear all caches, then tests will all be running with 2-3 elements in the cache, which is not very useful. I'm not sure what the right balance between test stability and incidental testing is in this particular case.
Comment 3 Kate Cheney 2019-10-11 16:42:20 PDT
(In reply to Alexey Proskuryakov from comment #2)
> > 2. Should the cache be cleared between tests automatically?
> 
> That may have a substantial performance cost (which is worth verifying).
> 
> Another consequence is loss of incidental testing. In other words, if we
> clear all caches, then tests will all be running with 2-3 elements in the
> cache, which is not very useful. I'm not sure what the right balance between
> test stability and incidental testing is in this particular case.

I don't know if this is the best solution, but I can add a testRunner function that clears the cache. This test, and any others found later with the same problem, can then call the function before running to make sure the cache is empty when the test depends on that.
Comment 4 Chris Dumez 2019-10-17 13:32:50 PDT
(In reply to Katherine_cheney from comment #0)
> Probable cause:
> 
> The failures are fixed locally when I clear the NetworkLoad cache and all
> pending write operations in between the tests. I think this is because
> clearIsolatedSessions() is called between tests, but a new isolated session
> is only added with a call to startNetworkLoad() in the
> NetworkResourceLoader, which doesn't happen if the resource is in the cache. 

So how about you make sure this particular resource does not enter the disk cache by serving the "Cache-Control: no-store" HTTP header? It is pretty common in our tests.
Comment 5 Chris Dumez 2019-10-17 13:33:55 PDT
(In reply to Chris Dumez from comment #4)
> So how about you make sure this particular resource does not enter the disk
> cache by serving the "Cache-Control: no-store" HTTP header. It is pretty
> common in our tests.

See, for example, LayoutTests/http/tests/misc/resources/random-no-store.php
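
As a rough illustration of the idea, the sketch below serves a response with "Cache-Control: no-store" and reads the header back. It uses Python's standard http.server as a stand-in; the handler name NoStoreHandler and the helper fetch_cache_control are hypothetical, and the actual layout test resources are PHP files, so this shows only the header, not how the WebKit resource is written.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class NoStoreHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: every response is marked as not storable."""

    def do_GET(self):
        body = b"<!DOCTYPE html><title>no-store</title>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        # Tells caches not to store this response at all.
        self.send_header("Cache-Control", "no-store")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def fetch_cache_control():
    """Start a local server on a free port and return the header it sends."""
    server = HTTPServer(("127.0.0.1", 0), NoStoreHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        # Bypass any configured proxy so the request stays local.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url) as response:
            return response.headers["Cache-Control"]
    finally:
        server.shutdown()
        server.server_close()


cache_control = fetch_cache_control()
print(cache_control)
```

With the header in place, each iteration should go to the network instead of being answered from the disk cache.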
Comment 6 Kate Cheney 2019-10-17 16:54:05 PDT
Created attachment 381251 [details]
Patch
Comment 7 WebKit Commit Bot 2019-10-21 17:43:14 PDT
Comment on attachment 381251 [details]
Patch

Clearing flags on attachment: 381251

Committed r251402: <https://trac.webkit.org/changeset/251402>
Comment 8 WebKit Commit Bot 2019-10-21 17:43:15 PDT
All reviewed patches have been landed.  Closing bug.