Bug 201169

Summary: Add support for postMessage buffering between the service worker and window
Product: WebKit Reporter: Philip Walton <philip>
Component: Service WorkersAssignee: Chris Dumez <cdumez>
Status: RESOLVED FIXED    
Severity: Normal CC: beidson, cdumez, commit-queue, dbates, esprehn+autocc, ews-watchlist, kangil.han, nicolas, ryanhaddad, webkit-bug-importer, youennf
Priority: P2 Keywords: InRadar
Version: Safari 12   
Hardware: Mac   
OS: macOS 10.14   
See Also: https://bugs.webkit.org/show_bug.cgi?id=201429
Bug Depends on: 201584    
Bug Blocks:    
Attachments:
Description Flags
WIP Patch
none
WIP Patch
none
WIP Patch
none
Patch
none
Archive of layout-test-results from ews212 for win-future
none
Patch
none
Patch
none
Archive of layout-test-results from webkit-cq-03 for mac-highsierra
none
Archive of layout-test-results from ews210 for win-future
none
Patch
none
Archive of layout-test-results from webkit-cq-01 for mac-highsierra
none
Patch
none
Archive of layout-test-results from webkit-cq-03 for mac-highsierra
none
Patch
none
Archive of layout-test-results from webkit-cq-03 for mac-highsierra
none
Patch
none
Patch none

Description Philip Walton 2019-08-26 19:58:04 PDT
Safari 12 implements the `ServiceWorkerContainer.startMessages()` API [1], but it does not actually implement the underlying buffering of messages sent from the service worker to the window in the `fetch` event for navigation requests [2].

Firefox has supported buffering for a while, and Chrome added support in M74 [3]. Prior to Chrome supporting this feature, we (on the Chrome team) recommended feature detecting support via the presence of the `startMessages()` method (since its purpose is to allow the buffer to start clearing before the DOMContentLoaded event):

```
if (navigator.serviceWorker && navigator.serviceWorker.startMessages) {
  // Assuming the service worker supports message buffering
}
```

But given that Safari has the `startMessages()` method but does not implement the underlying buffering behavior, the above will result in a false positive, and it will likely mean lots of missed message from the service worker after navigation requests.

The ability to reliably send a message from the service worker to the window after new navigations is important, as developers commonly rely on postMessage to inform the window that a newer version of the page exists [4]. If these messages are being dropped in Safari, then Safari users may be seeing outdated content and not receiving the message that new content is available.


[1] https://w3c.github.io/ServiceWorker/#dom-serviceworkercontainer-startmessages

[2] https://twitter.com/chris_dumez/status/1166170761125490689

[3] https://www.chromestatus.com/features/5751307839209472

[4] https://developers.google.com/web/tools/workbox/modules/workbox-broadcast-cache-update
Comment 1 Chris Dumez 2019-08-27 12:16:29 PDT
Created attachment 377361 [details]
WIP Patch

Completely untested.
Comment 2 Chris Dumez 2019-08-27 14:56:28 PDT
Created attachment 377389 [details]
WIP Patch

Still trying to get imported/w3c/web-platform-tests/service-workers/service-worker/postmessage-to-client-message-queue.https.html  to pass.
Comment 3 Chris Dumez 2019-08-27 15:27:22 PDT
It is weird, the test in question is passing in Safari but fails in WebKitTestRunner.
Comment 4 Chris Dumez 2019-08-27 15:32:43 PDT
(In reply to Chris Dumez from comment #3)
> It is weird, the test in question is passing in Safari but fails in
> WebKitTestRunner.

Also passes in MiniBrowser. I suspect a testharness issue here.
Comment 5 Chris Dumez 2019-08-27 15:53:35 PDT
(In reply to Chris Dumez from comment #4)
> (In reply to Chris Dumez from comment #3)
> > It is weird, the test in question is passing in Safari but fails in
> > WebKitTestRunner.
> 
> Also passes in MiniBrowser. I suspect a testharness issue here.

TestRunner::notifyDone() gets called but it does not seem to end the test. I think the test runner may be waiting for the page to finish loading but it never does.
Comment 6 Chris Dumez 2019-08-27 16:14:27 PDT
Created attachment 377398 [details]
WIP Patch
Comment 7 Chris Dumez 2019-08-27 16:57:17 PDT
Created attachment 377402 [details]
Patch
Comment 8 EWS Watchlist 2019-08-27 18:29:43 PDT
Comment on attachment 377402 [details]
Patch

Attachment 377402 [details] did not pass win-ews (win):
Output: https://webkit-queues.webkit.org/results/12975185

New failing tests:
http/wpt/html/semantics/text-level-semantics/the-a-element/a-download-click-404.html
Comment 9 EWS Watchlist 2019-08-27 18:29:46 PDT
Created attachment 377414 [details]
Archive of layout-test-results from ews212 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews212  Port: win-future  Platform: CYGWIN_NT-10.0-17763-3.0.5-338.x86_64-x86_64-64bit
Comment 10 youenn fablet 2019-08-28 02:16:27 PDT
Comment on attachment 377402 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377402&action=review

> Source/WebCore/ChangeLog:16
> +        While the message queue is disabled, messages from posted by the service worker

s/posted by//

> Source/WebCore/ChangeLog:17
> +        to the client simply get queued and we only get processed once the queue gets

s/we//

> Source/WebCore/workers/service/ServiceWorkerContainer.cpp:485
> +    m_messageQueue.enqueueEvent(WTFMove(messageEvent));

This enqueues an event instead of dispatching it, probably delaying the event dispatch to the next run loop.
I wonder whether this might create some issues with event ordering like we had in the past.
It seems that we could remove the postTask call in SWClientConnection::postMessageToServiceWorkerClient to fix that.
Comment 11 Chris Dumez 2019-08-28 08:29:18 PDT
(In reply to youenn fablet from comment #10)
> Comment on attachment 377402 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=377402&action=review
> 
> > Source/WebCore/ChangeLog:16
> > +        While the message queue is disabled, messages from posted by the service worker
> 
> s/posted by//
> 
> > Source/WebCore/ChangeLog:17
> > +        to the client simply get queued and we only get processed once the queue gets
> 
> s/we//
> 
> > Source/WebCore/workers/service/ServiceWorkerContainer.cpp:485
> > +    m_messageQueue.enqueueEvent(WTFMove(messageEvent));
> 
> This enqueues an event instead of dispatching it, probably delaying the
> event dispatch to the next run loop.
> I wonder whether this might create some issues with event ordering like we
> had in the past.
> It seems that we could remove the postTask call in
> SWClientConnection::postMessageToServiceWorkerClient to fix that.

The postTask in SWClientConnection::postMessageToServiceWorkerClient() will be useful once we support workers being clients though, but OK for now.
Comment 12 Chris Dumez 2019-08-28 08:32:26 PDT
Created attachment 377442 [details]
Patch
Comment 13 youenn fablet 2019-08-28 09:33:11 PDT
Comment on attachment 377442 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377442&action=review

> Source/WebCore/dom/Document.cpp:324
> +#if ENABLE(SERVICE_WORKER)

No need for ENABLE(SERVICE_WORKER), ServiceWorkerContainer.h is doing it itself.

> Source/WebCore/workers/service/SWClientConnection.cpp:129
> +        container->postMessage(WTFMove(message), WTFMove(sourceData), WTFMove(sourceOrigin));

If we want to keep the fact that we are actually posting a task to dispatch a message event, we could rename this postMessage method.
Comment 14 Chris Dumez 2019-08-28 10:04:58 PDT
Created attachment 377455 [details]
Patch
Comment 15 WebKit Commit Bot 2019-08-28 11:01:16 PDT
Comment on attachment 377455 [details]
Patch

Rejecting attachment 377455 [details] from commit-queue.

Number of test failures exceeded the failure limit.
Full output: https://webkit-queues.webkit.org/results/12977015
Comment 16 WebKit Commit Bot 2019-08-28 11:01:18 PDT
Created attachment 377462 [details]
Archive of layout-test-results from webkit-cq-03 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: webkit-cq-03  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 17 EWS Watchlist 2019-08-28 12:30:51 PDT
Comment on attachment 377455 [details]
Patch

Attachment 377455 [details] did not pass win-ews (win):
Output: https://webkit-queues.webkit.org/results/12977176

New failing tests:
js/error-should-not-strong-reference-global-object.html
http/wpt/html/semantics/text-level-semantics/the-a-element/a-download-click-404.html
Comment 18 EWS Watchlist 2019-08-28 12:30:53 PDT
Created attachment 377468 [details]
Archive of layout-test-results from ews210 for win-future

The attached test failures were seen while running run-webkit-tests on the win-ews.
Bot: ews210  Port: win-future  Platform: CYGWIN_NT-10.0-17763-3.0.5-338.x86_64-x86_64-64bit
Comment 19 Chris Dumez 2019-08-28 14:08:52 PDT
Created attachment 377480 [details]
Patch
Comment 20 WebKit Commit Bot 2019-08-28 15:06:52 PDT
Comment on attachment 377480 [details]
Patch

Rejecting attachment 377480 [details] from commit-queue.

Number of test failures exceeded the failure limit.
Full output: https://webkit-queues.webkit.org/results/12977857
Comment 21 WebKit Commit Bot 2019-08-28 15:06:54 PDT
Created attachment 377491 [details]
Archive of layout-test-results from webkit-cq-01 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: webkit-cq-01  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 22 Chris Dumez 2019-08-28 15:46:52 PDT
Created attachment 377498 [details]
Patch
Comment 23 WebKit Commit Bot 2019-08-28 17:24:07 PDT
Comment on attachment 377498 [details]
Patch

Rejecting attachment 377498 [details] from commit-queue.

Number of test failures exceeded the failure limit.
Full output: https://webkit-queues.webkit.org/results/12978406
Comment 24 WebKit Commit Bot 2019-08-28 17:24:09 PDT
Created attachment 377514 [details]
Archive of layout-test-results from webkit-cq-03 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: webkit-cq-03  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 25 Chris Dumez 2019-08-30 10:03:18 PDT
Created attachment 377721 [details]
Patch
Comment 26 WebKit Commit Bot 2019-08-30 10:58:24 PDT
Comment on attachment 377721 [details]
Patch

Rejecting attachment 377721 [details] from commit-queue.

Number of test failures exceeded the failure limit.
Full output: https://webkit-queues.webkit.org/results/12984563
Comment 27 WebKit Commit Bot 2019-08-30 10:58:26 PDT
Created attachment 377727 [details]
Archive of layout-test-results from webkit-cq-03 for mac-highsierra

The attached test failures were seen while running run-webkit-tests on the commit-queue.
Bot: webkit-cq-03  Port: mac-highsierra  Platform: Mac OS X 10.13.6
Comment 28 Chris Dumez 2019-08-30 12:03:39 PDT
Comment on attachment 377721 [details]
Patch

Clearing flags on attachment: 377721

Committed r249338: <https://trac.webkit.org/changeset/249338>
Comment 29 Chris Dumez 2019-08-30 12:03:40 PDT
All reviewed patches have been landed.  Closing bug.
Comment 30 Radar WebKit Bug Importer 2019-08-30 12:04:36 PDT
<rdar://problem/54891287>
Comment 31 Ryan Haddad 2019-08-30 13:20:15 PDT
https://trac.webkit.org/changeset/249338 seems to have caused hundreds of test failures on macOS WK1: https://build.webkit.org/builders/Apple%20High%20Sierra%20Release%20WK1%20(Tests)/builds/15233
Comment 32 Ryan Haddad 2019-08-30 13:33:52 PDT
Reverted r249338 for reason:

Caused 500+ layout test failures on WK1

Committed r249344: <https://trac.webkit.org/changeset/249344>
Comment 33 Chris Dumez 2019-08-30 13:34:26 PDT
(In reply to Ryan Haddad from comment #32)
> Reverted r249338 for reason:
> 
> Caused 500+ layout test failures on WK1
> 
> Committed r249344: <https://trac.webkit.org/changeset/249344>

Thanks, will investigate.
Comment 34 Chris Dumez 2019-08-30 13:36:42 PDT
Comment on attachment 377721 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=377721&action=review

> LayoutTests/resources/testharnessreport.js:101
> +            testRunner.forceImmediateCompletion();

It must be this change that is somehow confusing DumpRenderTree (tests output gets compared with the test expectations of the previous or next test).
Comment 35 Chris Dumez 2019-08-30 14:04:40 PDT
Created attachment 377748 [details]
Patch
Comment 36 Chris Dumez 2019-08-30 15:51:48 PDT
Created attachment 377759 [details]
Patch
Comment 37 WebKit Commit Bot 2019-08-30 17:34:14 PDT
Comment on attachment 377759 [details]
Patch

Clearing flags on attachment: 377759

Committed r249353: <https://trac.webkit.org/changeset/249353>
Comment 38 WebKit Commit Bot 2019-08-30 17:34:16 PDT
All reviewed patches have been landed.  Closing bug.
Comment 39 Ryan Haddad 2019-09-07 13:20:57 PDT
Reverted r249353 for reason:

The test for this change is a flaky failure.

Committed r249615: <https://trac.webkit.org/changeset/249615>
Comment 40 Ryan Haddad 2019-09-07 13:21:53 PDT
(In reply to Ryan Haddad from comment #39)
> Reverted r249353 for reason:
> 
> The test for this change is a flaky failure.
> 
> Committed r249615: <https://trac.webkit.org/changeset/249615>
See https://bugs.webkit.org/show_bug.cgi?id=201429#c0
Comment 41 Chris Dumez 2019-09-07 15:53:36 PDT
The reason the layout test is flaky is the same as for a lot of our other layout tests. When the test unregisters then re-registers a service worker, they sometimes get the same registration because unregistering merely does a "tryClear" not a "clear" of the registration.

If I do this, then the flakiness goes away:
diff --git a/Source/WebCore/workers/service/server/SWServerJobQueue.cpp b/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
index 4ac244aee65..5fe62356e8f 100644
--- a/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
+++ b/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
@@ -320,7 +320,7 @@ void SWServerJobQueue::runUnregisterJob(const ServiceWorkerJobData& job)
     m_server.resolveUnregistrationJob(job, m_registrationKey, true);
 
     // Invoke Try Clear Registration with registration.
-    registration->tryClear();
+    registration->clear();
     finishCurrentJob();
 }


But then we no longer match the spec. I will look into it more.
Comment 42 Chris Dumez 2019-09-07 16:03:14 PDT
(In reply to Chris Dumez from comment #41)
> The reason the layout test is flaky is the same as for a lot of our other
> layout tests. When the test unregisters then re-registers a service worker,
> they sometimes get the same registration because unregistering merely does a
> "tryClear" not a "clear" of the registration.
> 
> If I do this, then the flakiness goes away:
> diff --git a/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
> b/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
> index 4ac244aee65..5fe62356e8f 100644
> --- a/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
> +++ b/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
> @@ -320,7 +320,7 @@ void SWServerJobQueue::runUnregisterJob(const
> ServiceWorkerJobData& job)
>      m_server.resolveUnregistrationJob(job, m_registrationKey, true);
>  
>      // Invoke Try Clear Registration with registration.
> -    registration->tryClear();
> +    registration->clear();
>      finishCurrentJob();
>  }
> 
> 
> But then we no longer match the spec. I will look into it more.

I think I found where we do not match the spec and why we are flaky here. I am working on fixing it.
Comment 43 Chris Dumez 2019-09-07 17:55:14 PDT
(In reply to Chris Dumez from comment #42)
> (In reply to Chris Dumez from comment #41)
> > The reason the layout test is flaky is the same as for a lot of our other
> > layout tests. When the test unregisters then re-registers a service worker,
> > they sometimes get the same registration because unregistering merely does a
> > "tryClear" not a "clear" of the registration.
> > 
> > If I do this, then the flakiness goes away:
> > diff --git a/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
> > b/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
> > index 4ac244aee65..5fe62356e8f 100644
> > --- a/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
> > +++ b/Source/WebCore/workers/service/server/SWServerJobQueue.cpp
> > @@ -320,7 +320,7 @@ void SWServerJobQueue::runUnregisterJob(const
> > ServiceWorkerJobData& job)
> >      m_server.resolveUnregistrationJob(job, m_registrationKey, true);
> >  
> >      // Invoke Try Clear Registration with registration.
> > -    registration->tryClear();
> > +    registration->clear();
> >      finishCurrentJob();
> >  }
> > 
> > 
> > But then we no longer match the spec. I will look into it more.
> 
> I think I found where we do not match the spec and why we are flaky here. I
> am working on fixing it.

Addressing flakiness via Bug 201584.
Comment 44 Chris Dumez 2019-09-07 23:10:32 PDT
Comment on attachment 377759 [details]
Patch

Clearing flags on attachment: 377759

Committed r249629: <https://trac.webkit.org/changeset/249629>
Comment 45 Chris Dumez 2019-09-07 23:10:34 PDT
All reviewed patches have been landed.  Closing bug.