Bug 243410

Summary: REGRESSION(250037@main): wpt /service-workers/service-worker/registration-updateviacache.https.html has become flaky
Product: WebKit Reporter: Sam Sneddon [:gsnedders] <gsnedders>
Component: Service WorkersAssignee: Chris Dumez <cdumez>
Status: RESOLVED FIXED    
Severity: Normal CC: cdumez, rackler, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=239657
https://bugs.webkit.org/show_bug.cgi?id=240074

Sam Sneddon [:gsnedders]
Reported 2022-08-01 10:27:27 PDT
See https://wpt.fyi/results/service-workers/service-worker/registration-updateviacache.https.html?label=master&product=safari-15%5Bstable%5D&product=safari%5Bexperimental%5D&aligned&view=subtest A variety of subtests have regressed: register-with-updateViaCache-all register-with-updateViaCache-undefined-then-all register-with-updateViaCache-imports-then-all register-with-updateViaCache-none-then-all https://github.com/WebKit/WebKit/blob/main/LayoutTests/imported/w3c/web-platform-tests/service-workers/service-worker/registration-updateviacache.https-expected.txt makes it look like our test configuration passes there.
Attachments
Radar WebKit Bug Importer
Comment 1 2022-08-01 10:27:41 PDT
Chris Dumez
Comment 2 2022-08-04 11:53:05 PDT
The test is marked as flaky on iOS: LayoutTests/platform/ios/TestExpectations:webkit.org/b/240074 imported/w3c/web-platform-tests/service-workers/service-worker/registration-updateviacache.https.html [ Pass Failure ] But seems to be running on macOS.
Chris Dumez
Comment 3 2022-08-04 12:02:50 PDT
Similarly, in STP 150, I see all subtests passing expect the last one. Not sure why the wpt.fyi dashboard is saying differently.
Chris Dumez
Comment 4 2022-08-04 12:14:05 PDT
The flaky failure on iOS (https://bugs.webkit.org/show_bug.cgi?id=240074) looks a lot like the failure here. So I think the test is just flaky on iOS and macOS. I guess I can try to reproduce the flakiness with iOS instead since our bots seem to indicate it is more prevalent on that platform.
Chris Dumez
Comment 5 2022-08-04 12:26:26 PDT
Previous attempt to fix it by Youenn (Bug 240074) was unsuccessful based on the flakiness dashboard and wpt.fyi results.
Chris Dumez
Comment 6 2022-08-04 13:01:13 PDT
I do see one weirdness in our code here: ``` // If newestWorker is not null, newestWorker's script url equals job's script url with the exclude fragments // flag set, and script's source text is a byte-for-byte match with newestWorker's script resource's source // text, then: if (newestWorker && equalIgnoringFragmentIdentifier(newestWorker->scriptURL(), job.scriptURL) && newestWorker->type() == job.workerType && result.script == newestWorker->script() && doCertificatesMatch(result.certificateInfo, newestWorker->certificateInfo())) { ``` But the spec says at https://w3c.github.io/ServiceWorker/#start-register (step 5.2): ``` If newestWorker is not null, job’s script url equals newestWorker’s script url, job’s worker type equals newestWorker’s type, and job’s update via cache mode's value equals registration’s update via cache mode, then: Invoke Resolve Job Promise with job and registration. Invoke Finish Job with job and abort these steps. ``` It seems we are missing the "update via cache mode" check. I'll try to reproduce the flakiness locally and see if this fixes it.
Chris Dumez
Comment 7 2022-08-04 13:30:11 PDT
(In reply to Chris Dumez from comment #6) > I do see one weirdness in our code here: > ``` > // If newestWorker is not null, newestWorker's script url equals job's > script url with the exclude fragments > // flag set, and script's source text is a byte-for-byte match with > newestWorker's script resource's source > // text, then: > if (newestWorker && > equalIgnoringFragmentIdentifier(newestWorker->scriptURL(), job.scriptURL) && > newestWorker->type() == job.workerType && result.script == > newestWorker->script() && doCertificatesMatch(result.certificateInfo, > newestWorker->certificateInfo())) { > ``` > > But the spec says at https://w3c.github.io/ServiceWorker/#start-register > (step 5.2): > ``` > If newestWorker is not null, job’s script url equals newestWorker’s script > url, job’s worker type equals newestWorker’s type, and job’s update via > cache mode's value equals registration’s update via cache mode, then: > > Invoke Resolve Job Promise with job and registration. > Invoke Finish Job with job and abort these steps. > ``` > > It seems we are missing the "update via cache mode" check. I'll try to > reproduce the flakiness locally and see if this fixes it. Actually, we do have the updateViaCache check in runRegisterJob() already: ``` auto* newestWorker = registration->getNewestWorker(); if (newestWorker && equalIgnoringFragmentIdentifier(job.scriptURL, newestWorker->scriptURL()) && job.workerType == newestWorker->type() && job.registrationOptions.updateViaCache == registration->updateViaCache()) { RELEASE_LOG(ServiceWorker, "%p - SWServerJobQueue::runRegisterJob: Found directly reusable registration %llu for job %s (DONE)", this, registration->identifier().toUInt64(), job.identifier().loggingString().utf8().data()); m_server.resolveRegistrationJob(job, registration->data(), ShouldNotifyWhenResolved::No); finishCurrentJob(); return; } ```
Chris Dumez
Comment 8 2022-08-04 16:16:32 PDT
I can't reproduce the flakiness on either macOS or iOS so I am kind of stuck. I have tried both via WebKitTestRunner and in Safari and did many iterations.
Chris Dumez
Comment 9 2022-08-04 16:48:39 PDT
(In reply to Chris Dumez from comment #8) > I can't reproduce the flakiness on either macOS or iOS so I am kind of stuck. > I have tried both via WebKitTestRunner and in Safari and did many iterations. Ok, I think I have a flaky reproduction now when running the test in Safari using a local http server instead of wpt.fyi. Since Youenn's original patch was suspecting the memory cache, I tried disabling the memory cache but it didn't fix it.
Chris Dumez
Comment 10 2022-08-04 18:31:11 PDT
(In reply to Chris Dumez from comment #9) > (In reply to Chris Dumez from comment #8) > > I can't reproduce the flakiness on either macOS or iOS so I am kind of stuck. > > I have tried both via WebKitTestRunner and in Safari and did many iterations. > > Ok, I think I have a flaky reproduction now when running the test in Safari > using a local http server instead of wpt.fyi. > > Since Youenn's original patch was suspecting the memory cache, I tried > disabling the memory cache but it didn't fix it. Never mind, I must have tested it wrong earlier. Disabling the memory cache definitely fixes it so Youenn was likely right about the issue. However, it seems his fix was incomplete.
Chris Dumez
Comment 11 2022-08-04 19:20:07 PDT
EWS
Comment 12 2022-08-04 21:44:22 PDT
Committed 253140@main (3f019cf4b5d2): <https://commits.webkit.org/253140@main> Reviewed commits have been landed. Closing PR #3034 and removing active labels.
Karl Rackler
Comment 13 2022-08-05 09:42:51 PDT
(In reply to Chris Dumez from comment #10) > (In reply to Chris Dumez from comment #9) > > (In reply to Chris Dumez from comment #8) > > > I can't reproduce the flakiness on either macOS or iOS so I am kind of stuck. > > > I have tried both via WebKitTestRunner and in Safari and did many iterations. > > > > Ok, I think I have a flaky reproduction now when running the test in Safari > > using a local http server instead of wpt.fyi. > > > > Since Youenn's original patch was suspecting the memory cache, I tried > > disabling the memory cache but it didn't fix it. > > Never mind, I must have tested it wrong earlier. Disabling the memory cache > definitely fixes it so Youenn was likely right about the issue. However, it > seems his fix was incomplete. imported/w3c/web-platform-tests/service-workers/service-worker/registration-updateviacache.https.html is consistently passing after landing 251742@main with the exception of a one-off on 8/3/2022, 5:10:36 PM at 253089@main for iOS. Here is the history with 'Filter expected results' off: https://results.webkit.org/?suite=layout-tests&test=imported%2Fw3c%2Fweb-platform-tests%2Fservice-workers%2Fservice-worker%2Fregistration-updateviacache.https.html&platform=ios&platform=mac&limit=50000
Chris Dumez
Comment 15 2022-08-05 09:47:01 PDT
(In reply to Karl Rackler from comment #14) > (In reply to Chris Dumez from comment #11) > > Pull request: https://github.com/WebKit/WebKit/pull/3034 > > After landing 253140@main, Debug is consistently crashing. > > History: > https://results.webkit.org/?suite=layout-tests&test=imported%2Fw3c%2Fweb- > platform-tests%2Fservice-workers%2Fservice-worker%2Fregistration- > updateviacache.https.html&platform=ios&platform=mac&limit=50000 ASSERTION FAILED: jobData.connectionIdentifier() == Process::identifier() Looking now.
Chris Dumez
Comment 16 2022-08-05 09:49:56 PDT
(In reply to Chris Dumez from comment #15) > (In reply to Karl Rackler from comment #14) > > (In reply to Chris Dumez from comment #11) > > > Pull request: https://github.com/WebKit/WebKit/pull/3034 > > > > After landing 253140@main, Debug is consistently crashing. > > > > History: > > https://results.webkit.org/?suite=layout-tests&test=imported%2Fw3c%2Fweb- > > platform-tests%2Fservice-workers%2Fservice-worker%2Fregistration- > > updateviacache.https.html&platform=ios&platform=mac&limit=50000 > > ASSERTION FAILED: jobData.connectionIdentifier() == Process::identifier() > > Looking now. Looks like an outdated assertion now that we do the refresh loads in the network process unconditionally. I'll fix shortly.
Chris Dumez
Comment 17 2022-08-05 10:09:25 PDT
(In reply to Chris Dumez from comment #16) > (In reply to Chris Dumez from comment #15) > > (In reply to Karl Rackler from comment #14) > > > (In reply to Chris Dumez from comment #11) > > > > Pull request: https://github.com/WebKit/WebKit/pull/3034 > > > > > > After landing 253140@main, Debug is consistently crashing. > > > > > > History: > > > https://results.webkit.org/?suite=layout-tests&test=imported%2Fw3c%2Fweb- > > > platform-tests%2Fservice-workers%2Fservice-worker%2Fregistration- > > > updateviacache.https.html&platform=ios&platform=mac&limit=50000 > > > > ASSERTION FAILED: jobData.connectionIdentifier() == Process::identifier() > > > > Looking now. > > Looks like an outdated assertion now that we do the refresh loads in the > network process unconditionally. I'll fix shortly. Follow-up fix in https://commits.webkit.org/253148@main.
Note You need to log in before you can comment on or make changes to this bug.