Bug 202188 - Handle the case of a service worker taking too much time to process a fetch event
Summary: Handle the case of a service worker taking too much time to process a fetch e...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Service Workers (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: youenn fablet
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-09-25 00:56 PDT by youenn fablet
Modified: 2019-10-10 04:30 PDT (History)
8 users (show)

See Also:


Attachments
Patch (23.34 KB, patch)
2019-09-25 01:05 PDT, youenn fablet
no flags Details | Formatted Diff | Diff
Patch (23.34 KB, patch)
2019-09-25 01:08 PDT, youenn fablet
beidson: review-
Details | Formatted Diff | Diff
WIP fix (30.99 KB, patch)
2019-10-02 16:57 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Another WIP (36.31 KB, patch)
2019-10-04 16:59 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Patch (43.42 KB, patch)
2019-10-07 12:53 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Patch (43.55 KB, patch)
2019-10-07 14:32 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Patch (43.86 KB, patch)
2019-10-08 10:13 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description youenn fablet 2019-09-25 00:56:22 PDT
Handle the case of a service worker taking too much time to process a fetch event
Comment 1 youenn fablet 2019-09-25 01:05:52 PDT
Created attachment 379531 [details]
Patch
Comment 2 youenn fablet 2019-09-25 01:08:06 PDT
Created attachment 379532 [details]
Patch
Comment 3 Brady Eidson 2019-09-25 11:08:36 PDT
Comment on attachment 379532 [details]
Patch

As discussed in email, I'm working on this in parallel, have a different much more conservative approach, and we think this is the wrong time for refactoring tasks unrelated to bug fixing.
Comment 4 Brady Eidson 2019-10-02 16:57:05 PDT
Created attachment 380072 [details]
WIP fix

A lot of discussion IRL outside of this bug, and this is my resulting WIP patch. Obviously the test isn't done and there's more testing I want to do, but the code change can undergo preliminary review
Comment 5 youenn fablet 2019-10-02 23:54:57 PDT
Comment on attachment 380072 [details]
WIP fix

View in context: https://bugs.webkit.org/attachment.cgi?id=380072&action=review

> Source/WebCore/loader/FetchOptions.h:61
> +    Seconds serviceWorkerTimeout { defaultServiceWorkerFetchTimeout };

A global value either at NetworkProcess or NetworkSession is probably sufficient.
This should remove the need to serialize this value for every request.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:54
> +    , m_timeoutTimer { *this, &ServiceWorkerFetchTask::timeoutTimerFired }

There are two main cases for timeouts:
- The service worker script is spinning when processing the fetch handler. This should be a short timeout. The result of this task is either that the fetch is didnothandle or fetch is to be handled by the service worker. If this timeout kicks in, this probably means the worker is mal-functioning and we should avoid this service worker.
- In case the service worker handles the fetch, the fetch response (or body) takes a long time to be received. In that case, it is not clear whether the service worker is mal-functioning. Maybe the network connection is very slow. We could try to mimick how CFNetwork is timing out. It is unclear whether we should actually avoid the service worker in that case.

We might want to add a test for the second case:
1. A scenario where the service worker is caching any response for future reuse. A while(1) is happening shortly after the fetch response promise is fulfilled, for instance just after adding the response to DOM cache
2. A scenario where service worker is doing the fetch and the fetch is taking a long long time to complete.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:56
> +    m_timeoutTimer.startOneShot(m_timeout);

A ServiceWorkerFetchTask is only created if there is a context connection.
This timer will not catch any bug in our implementation resulting in time outs. For instance bugs in creating the context connection, pending job queue, running the service worker.
This seems ok to me, future refactoring will make the creation of ServiceWorkerFetchTask happen much sooner to actually catch these issues.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.h:93
> +    WebSWServerConnection& m_connection;

Could use WeakPtr<WebSWServerConnection> for safety.
I don't think m_connection is guaranteed to stay valid.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.h:94
> +    WebSWServerToContextConnection& m_contextConnection;

This seems a bit fragile, especially since ServiceWorkerFetchTask is refcounted.
Patch in https://bugs.webkit.org/show_bug.cgi?id=202309 is making ServiceWorkerFetchTask no longer refcounted which simplifies things.
Comment 6 Chris Dumez 2019-10-03 08:45:06 PDT
(In reply to youenn fablet from comment #5)
> Comment on attachment 380072 [details]
> WIP fix
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=380072&action=review
> 
> > Source/WebCore/loader/FetchOptions.h:61
> > +    Seconds serviceWorkerTimeout { defaultServiceWorkerFetchTimeout };
> 
> A global value either at NetworkProcess or NetworkSession is probably
> sufficient.
> This should remove the need to serialize this value for every request.

+1: This adds unnecessary complexity which is unfortunate, especially if we want to cherry-pick this to a branch.
Comment 7 Chris Dumez 2019-10-03 09:08:29 PDT
Comment on attachment 380072 [details]
WIP fix

View in context: https://bugs.webkit.org/attachment.cgi?id=380072&action=review

>>> Source/WebCore/loader/FetchOptions.h:61
>>> +    Seconds serviceWorkerTimeout { defaultServiceWorkerFetchTimeout };
>> 
>> A global value either at NetworkProcess or NetworkSession is probably sufficient.
>> This should remove the need to serialize this value for every request.
> 
> +1: This adds unnecessary complexity which is unfortunate, especially if we want to cherry-pick this to a branch.

To be clear, I realize that you need to change the value for layout testing but what we usually do then is have the testRunner utility function send an IPC to the NetworkProcess to update the timeout value.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:63
> +    m_connection.ipcConnection().send(Messages::ServiceWorkerClientFetch::DidReceiveRedirectResponse { response }, m_identifier.fetchIdentifier);

WebSWServerConnection is an IPC::MessageSender so I do not think you need to get the ipcConnection(), you can call send() on m_connection directly.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.h:97
> +    Seconds m_timeout;

As commented above, I think a global timeout value would suffice. Having one per fetch task adds unnecessary complexity.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.h:99
> +    bool m_handled { false };

Boolean variables should have a prefix as per coding style.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.h:100
> +    bool m_messagedTerminalState { false };

Boolean variables should have a prefix as per coding style. Also, the current naming is confusing to me, no idea what it means by simply looking at it.

> Source/WebKit/NetworkProcess/ServiceWorker/WebSWServerToContextConnection.cpp:219
> +    // Gather all other fetches in this service worker

Missing period at the end.

> Source/WebKit/NetworkProcess/ServiceWorker/WebSWServerToContextConnection.cpp:220
> +    HashSet<Ref<ServiceWorkerFetchTask>> otherFetches;

Why does this need to be a HashSet? Why cannot it be a Vector?

> Source/WebKit/NetworkProcess/ServiceWorker/WebSWServerToContextConnection.cpp:226
> +    // Signal load failure for them

Missing period at the end.
Comment 8 Brady Eidson 2019-10-03 13:41:56 PDT
(In reply to youenn fablet from comment #5)
> Comment on attachment 380072 [details]
> WIP fix
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=380072&action=review
> 
> > Source/WebCore/loader/FetchOptions.h:61
> > +    Seconds serviceWorkerTimeout { defaultServiceWorkerFetchTimeout };
> 
> A global value either at NetworkProcess or NetworkSession is probably
> sufficient.
> This should remove the need to serialize this value for every request.

A global value makes testing at best flaky, and at worst not possible.

I don't think attaching a timeout to the request increase complexity in an intractable manner.

> 
> > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:54
> > +    , m_timeoutTimer { *this, &ServiceWorkerFetchTask::timeoutTimerFired }
> 
> There are two main cases for timeouts:
> - The service worker script is spinning when processing the fetch handler.
> This should be a short timeout. The result of this task is either that the
> fetch is didnothandle or fetch is to be handled by the service worker. If
> this timeout kicks in, this probably means the worker is mal-functioning and
> we should avoid this service worker.

That's what this patch implements.

For compatibility reasons, I disagree that it should be shorter than 60s, because that's what Chrome does.
Firefox does 5.5 minutes, which I would agree is too long.


> - In case the service worker handles the fetch, the fetch response (or body)
> takes a long time to be received. In that case, it is not clear whether the
> service worker is mal-functioning. Maybe the network connection is very
> slow. We could try to mimick how CFNetwork is timing out. It is unclear
> whether we should actually avoid the service worker in that case.

CFNetwork either:
A - delivers at least one event every 60 seconds
B - Times out after 60 seconds.

I think ServiceWorkers that are dealing with a network failure lasting longer than 60 seconds need to signal the fetch failed.

> We might want to add a test for the second case:

Working on such a test is why this patch is not ready for review.

> 1. A scenario where the service worker is caching any response for future
> reuse. A while(1) is happening shortly after the fetch response promise is
> fulfilled, for instance just after adding the response to DOM cache
> 2. A scenario where service worker is doing the fetch and the fetch is
> taking a long long time to complete.
> 
> > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:56
> > +    m_timeoutTimer.startOneShot(m_timeout);
> 
> A ServiceWorkerFetchTask is only created if there is a context connection.
> This timer will not catch any bug in our implementation resulting in time
> outs. For instance bugs in creating the context connection, pending job
> queue, running the service worker.
> This seems ok to me, future refactoring will make the creation of
> ServiceWorkerFetchTask happen much sooner to actually catch these issues.

Combined with your work on the process launch race, I think this is fine.

> 
> > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.h:93
> > +    WebSWServerConnection& m_connection;
> 
> Could use WeakPtr<WebSWServerConnection> for safety.
> I don't think m_connection is guaranteed to stay valid.

The connection object soup is complicated. I reasoned through and figured it would. But a WeakPtr is safer.

> 
> > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.h:94
> > +    WebSWServerToContextConnection& m_contextConnection;
> 
> This seems a bit fragile, especially since ServiceWorkerFetchTask is
> refcounted.

ServiceWorkerFetchTasks are owned by WebSWServerToContextConnection.

Fragile in theory, agreed. I don't think it's fragile in context.

> Patch in https://bugs.webkit.org/show_bug.cgi?id=202309 is making
> ServiceWorkerFetchTask no longer refcounted which simplifies things.

As we've discussed, there's tons of great refactoring to do here.
Also as discussed we're trying to implement good timeouts for the branch.
Comment 9 Brady Eidson 2019-10-03 13:43:17 PDT
(In reply to Chris Dumez from comment #6)
> (In reply to youenn fablet from comment #5)
> > Comment on attachment 380072 [details]
> > WIP fix
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=380072&action=review
> > 
> > > Source/WebCore/loader/FetchOptions.h:61
> > > +    Seconds serviceWorkerTimeout { defaultServiceWorkerFetchTimeout };
> > 
> > A global value either at NetworkProcess or NetworkSession is probably
> > sufficient.
> > This should remove the need to serialize this value for every request.
> 
> +1: This adds unnecessary complexity which is unfortunate, especially if we
> want to cherry-pick this to a branch.

To be clear, I first implemented this as a global because it was the obvious thing to do.

It became apparent that per-fetch timeouts were necessary for testing within our testing infrastructure

We are not landing anything without appropriate testing - trunk or branch - per my r-.
Comment 10 Brady Eidson 2019-10-03 13:46:46 PDT
> To be clear, I realize that you need to change the value for layout testing but what we usually do then is have the testRunner utility function send an IPC to the NetworkProcess to update the timeout value.

I missed that you pointed this out. To followup on it, doing so would've made this patch more complicated by spreading the timeout implementation over two projects and way more files, since WebCore Internals has no concept of "the Network Process"
Comment 11 Chris Dumez 2019-10-03 13:47:57 PDT
(In reply to Brady Eidson from comment #10)
> > To be clear, I realize that you need to change the value for layout testing but what we usually do then is have the testRunner utility function send an IPC to the NetworkProcess to update the timeout value.
> 
> I missed that you pointed this out. To followup on it, doing so would've
> made this patch more complicated by spreading the timeout implementation
> over two projects and way more files, since WebCore Internals has no concept
> of "the Network Process"

I said testRunner, not internals.
Comment 12 Chris Dumez 2019-10-03 13:54:28 PDT
(In reply to Chris Dumez from comment #11)
> (In reply to Brady Eidson from comment #10)
> > > To be clear, I realize that you need to change the value for layout testing but what we usually do then is have the testRunner utility function send an IPC to the NetworkProcess to update the timeout value.
> > 
> > I missed that you pointed this out. To followup on it, doing so would've
> > made this patch more complicated by spreading the timeout implementation
> > over two projects and way more files, since WebCore Internals has no concept
> > of "the Network Process"
> 
> I said testRunner, not internals.

testRunner already has plenty of utility functions that send IPC to the network process to update some state. I don't think internals is the right place to do this.
Comment 13 Chris Dumez 2019-10-03 14:10:26 PDT
(In reply to Chris Dumez from comment #12)
> (In reply to Chris Dumez from comment #11)
> > (In reply to Brady Eidson from comment #10)
> > > > To be clear, I realize that you need to change the value for layout testing but what we usually do then is have the testRunner utility function send an IPC to the NetworkProcess to update the timeout value.
> > > 
> > > I missed that you pointed this out. To followup on it, doing so would've
> > > made this patch more complicated by spreading the timeout implementation
> > > over two projects and way more files, since WebCore Internals has no concept
> > > of "the Network Process"
> > 
> > I said testRunner, not internals.
> 
> testRunner already has plenty of utility functions that send IPC to the
> network process to update some state. I don't think internals is the right
> place to do this.

E.g. testRunner.setStorageAccessAPIEnabled() sends an IPC to the network process to update state with minimal code.
Comment 14 Brady Eidson 2019-10-03 15:27:43 PDT
(In reply to Chris Dumez from comment #13)
> (In reply to Chris Dumez from comment #12)
> > (In reply to Chris Dumez from comment #11)
> > > (In reply to Brady Eidson from comment #10)
> > > > > To be clear, I realize that you need to change the value for layout testing but what we usually do then is have the testRunner utility function send an IPC to the NetworkProcess to update the timeout value.
> > > > 
> > > > I missed that you pointed this out. To followup on it, doing so would've
> > > > made this patch more complicated by spreading the timeout implementation
> > > > over two projects and way more files, since WebCore Internals has no concept
> > > > of "the Network Process"
> > > 
> > > I said testRunner, not internals.
> > 
> > testRunner already has plenty of utility functions that send IPC to the
> > network process to update some state. I don't think internals is the right
> > place to do this.
> 
> E.g. testRunner.setStorageAccessAPIEnabled() sends an IPC to the network
> process to update state with minimal code.

We would have to add SPI for this, correct?

Which means the code to change it is in the binary WebKit framework (unlike WebCore internals)?
Comment 15 Brady Eidson 2019-10-04 16:59:18 PDT
Created attachment 380269 [details]
Another WIP

Addresses the feedback, and stashing it so I can take it home to work on more later
Comment 16 youenn fablet 2019-10-07 11:55:25 PDT
> > - In case the service worker handles the fetch, the fetch response (or body)
> > takes a long time to be received. In that case, it is not clear whether the
> > service worker is mal-functioning. Maybe the network connection is very
> > slow. We could try to mimick how CFNetwork is timing out. It is unclear
> > whether we should actually avoid the service worker in that case.
> 
> CFNetwork either:
> A - delivers at least one event every 60 seconds
> B - Times out after 60 seconds.
> 
> I think ServiceWorkers that are dealing with a network failure lasting
> longer than 60 seconds need to signal the fetch failed.

Service workers that are fetching+caching a response should have at least the same 60 seconds policy between events.
I would tend to increase a little bit since service worker adds some overhead.
Comment 17 Brady Eidson 2019-10-07 12:53:29 PDT
Created attachment 380346 [details]
Patch
Comment 18 Brady Eidson 2019-10-07 12:55:23 PDT
(In reply to youenn fablet from comment #16)
> > > - In case the service worker handles the fetch, the fetch response (or body)
> > > takes a long time to be received. In that case, it is not clear whether the
> > > service worker is mal-functioning. Maybe the network connection is very
> > > slow. We could try to mimick how CFNetwork is timing out. It is unclear
> > > whether we should actually avoid the service worker in that case.
> > 
> > CFNetwork either:
> > A - delivers at least one event every 60 seconds
> > B - Times out after 60 seconds.
> > 
> > I think ServiceWorkers that are dealing with a network failure lasting
> > longer than 60 seconds need to signal the fetch failed.
> 
> Service workers that are fetching+caching a response should have at least
> the same 60 seconds policy between events.
> I would tend to increase a little bit since service worker adds some
> overhead.

That's fair. 70s?

Let's hash out a good number, and I'll update the constant in the patch before landing. Everything else is still ready for review
Comment 19 Brady Eidson 2019-10-07 14:32:50 PDT
Created attachment 380362 [details]
Patch
Comment 20 Alex Christensen 2019-10-07 14:51:06 PDT
Comment on attachment 380362 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=380362&action=review

> LayoutTests/http/tests/workers/service/resources/basic-timeout.js:42
> +            testRunner.setServiceWorkerFetchTimeout(60);

This should probably reset or be consistent with the default.
Comment 21 Chris Dumez 2019-10-07 14:59:12 PDT
Comment on attachment 380362 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=380362&action=review

> Source/WebKit/NetworkProcess/ServiceWorker/WebSWServerToContextConnection.cpp:165
>      send(Messages::WebSWContextManagerConnection::StartFetch { serverConnectionIdentifier, serviceWorkerIdentifier, fetchIdentifier, request, options, data, referrer });

I'd feel a lot better if we dealt with send() returning false here.
Comment 22 youenn fablet 2019-10-07 23:59:30 PDT
Comment on attachment 380362 [details]
Patch

Overall patch looks good to me too.

I am concerned by the case of restarting m_timeoutTimer in didReceiveResponse/didReceiveData.
This does not seem necessary to cover the bug reports we know of and might potentially create regressions in existing websites (since we fail the load in that case while otherwise we just go to the network process).

I would tend to land the restarting of m_timeoutTimer in receive response/receive data as a separate follow-up patch.
We should add a test covering these cases (see for instance LayoutTests/imported/w3c/web-platform-tests/service-workers/service-worker/fetch-event-respond-with-readable-stream.https.html and LayoutTests/imported/w3c/web-platform-tests/service-workers/service-worker/fetch-event-respond-with-readable-stream-chunk.https.html), and check what other browsers are doing in that area.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:114
> +    m_didReachTerminalState = true;

I do not think that m_didReachTerminalState is necessary.
By design, if we have didFinish/didFail/didNotHandle, service worker process should not send any IPC for that particular fetch task.
This is also guaranteed by the fact that ServiceWorkerFetchTask will no longer receive any IPC message after receiving any of these messages.
See WebSWServerToContextConnection::didReceiveFetchTaskMessage.
Comment 23 Brady Eidson 2019-10-08 10:05:13 PDT
(In reply to Alex Christensen from comment #20)
> Comment on attachment 380362 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=380362&action=review
> 
> > LayoutTests/http/tests/workers/service/resources/basic-timeout.js:42
> > +            testRunner.setServiceWorkerFetchTimeout(60);
> 
> This should probably reset or be consistent with the default.

TBH it just needs to be "big enough to never time out"

The value is reset between tests no matter what.
Comment 24 Brady Eidson 2019-10-08 10:08:01 PDT
(In reply to youenn fablet from comment #22)
> Comment on attachment 380362 [details]
> Patch
> 
> Overall patch looks good to me too.
> 
> I am concerned by the case of restarting m_timeoutTimer in
> didReceiveResponse/didReceiveData.
> This does not seem necessary to cover the bug reports we know of and might
> potentially create regressions in existing websites (since we fail the load
> in that case while otherwise we just go to the network process).
> 
> I would tend to land the restarting of m_timeoutTimer in receive
> response/receive data as a separate follow-up patch.
> We should add a test covering these cases (see for instance
> LayoutTests/imported/w3c/web-platform-tests/service-workers/service-worker/
> fetch-event-respond-with-readable-stream.https.html and
> LayoutTests/imported/w3c/web-platform-tests/service-workers/service-worker/
> fetch-event-respond-with-readable-stream-chunk.https.html), and check what
> other browsers are doing in that area.

Fair point, will leave that out!

> > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:114
> > +    m_didReachTerminalState = true;
> 
> I do not think that m_didReachTerminalState is necessary.
> By design, if we have didFinish/didFail/didNotHandle, service worker process
> should not send any IPC for that particular fetch task.
> This is also guaranteed by the fact that ServiceWorkerFetchTask will no
> longer receive any IPC message after receiving any of these messages.
> See WebSWServerToContextConnection::didReceiveFetchTaskMessage.

I added that because during development a very basic showed double messaging.

I do not recall details.
Comment 25 Brady Eidson 2019-10-08 10:09:05 PDT
(In reply to Chris Dumez from comment #21)
> Comment on attachment 380362 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=380362&action=review
> 
> > Source/WebKit/NetworkProcess/ServiceWorker/WebSWServerToContextConnection.cpp:165
> >      send(Messages::WebSWContextManagerConnection::StartFetch { serverConnectionIdentifier, serviceWorkerIdentifier, fetchIdentifier, request, options, data, referrer });
> 
> I'd feel a lot better if we dealt with send() returning false here.

If send() failed, the timeout timer would catch those cases.

I agree it'd be a nice optimization to make that failure be immediate..
Comment 26 Brady Eidson 2019-10-08 10:10:04 PDT
(In reply to Brady Eidson from comment #25)
> (In reply to Chris Dumez from comment #21)
> > Comment on attachment 380362 [details]
> > Patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=380362&action=review
> > 
> > > Source/WebKit/NetworkProcess/ServiceWorker/WebSWServerToContextConnection.cpp:165
> > >      send(Messages::WebSWContextManagerConnection::StartFetch { serverConnectionIdentifier, serviceWorkerIdentifier, fetchIdentifier, request, options, data, referrer });
> > 
> > I'd feel a lot better if we dealt with send() returning false here.
> 
> If send() failed, the timeout timer would catch those cases.
> 
> I agree it'd be a nice optimization to make that failure be immediate..

I'll make that change.
Comment 27 Brady Eidson 2019-10-08 10:13:00 PDT
Created attachment 380439 [details]
Patch
Comment 28 Brady Eidson 2019-10-08 10:16:04 PDT
(In reply to Brady Eidson from comment #24)
> (In reply to youenn fablet from comment #22)
> > Comment on attachment 380362 [details]
> > Patch
> > 
> > Overall patch looks good to me too.
> > 
> > I am concerned by the case of restarting m_timeoutTimer in
> > didReceiveResponse/didReceiveData.
> > This does not seem necessary to cover the bug reports we know of and might
> > potentially create regressions in existing websites (since we fail the load
> > in that case while otherwise we just go to the network process).
> > 
> > I would tend to land the restarting of m_timeoutTimer in receive
> > response/receive data as a separate follow-up patch.
> > We should add a test covering these cases (see for instance
> > LayoutTests/imported/w3c/web-platform-tests/service-workers/service-worker/
> > fetch-event-respond-with-readable-stream.https.html and
> > LayoutTests/imported/w3c/web-platform-tests/service-workers/service-worker/
> > fetch-event-respond-with-readable-stream-chunk.https.html), and check what
> > other browsers are doing in that area.
> 
> Fair point, will leave that out!
> 
> > > Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:114
> > > +    m_didReachTerminalState = true;
> > 
> > I do not think that m_didReachTerminalState is necessary.
> > By design, if we have didFinish/didFail/didNotHandle, service worker process
> > should not send any IPC for that particular fetch task.
> > This is also guaranteed by the fact that ServiceWorkerFetchTask will no
> > longer receive any IPC message after receiving any of these messages.
> > See WebSWServerToContextConnection::didReceiveFetchTaskMessage.
> 
> I added that because during development a very basic showed double messaging.

Very basic test*
Comment 29 WebKit Commit Bot 2019-10-08 12:50:20 PDT
Comment on attachment 380439 [details]
Patch

Clearing flags on attachment: 380439

Committed r250852: <https://trac.webkit.org/changeset/250852>
Comment 30 WebKit Commit Bot 2019-10-08 12:50:21 PDT
All reviewed patches have been landed.  Closing bug.
Comment 31 Radar WebKit Bug Importer 2019-10-08 12:51:17 PDT
<rdar://problem/56084814>
Comment 32 youenn fablet 2019-10-09 00:25:47 PDT
Comment on attachment 380439 [details]
Patch

Timer in didReceiveResponse should be stopped otherwise we will fail intercepted loads of more than 70 seconds.
Some nits below. 

View in context: https://bugs.webkit.org/attachment.cgi?id=380439&action=review

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:62
> +    m_wasHandled = true;

We could stop the timer here.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:70
> +    m_wasHandled = true;

We NEED to stop the timer here.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:-58
> -    m_connection->send(Messages::ServiceWorkerClientFetch::DidReceiveData { data, encodedDataLength }, m_identifier.fetchIdentifier);

We could add an ASSERT that the timer is not active.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.cpp:83
> +    if (m_connection)

We could add an ASSERT that the timer is not active.

> Source/WebKit/NetworkProcess/ServiceWorker/ServiceWorkerFetchTask.h:78
> +    const WebCore::ServiceWorkerIdentifier& serviceWorkerIdentifier() const { return m_serviceWorkerIdentifier; }

We could return directly by value, that is what the constructor is taking as input.

> Source/WebKit/NetworkProcess/ServiceWorker/WebSWServerToContextConnection.cpp:221
> +    HashSet<Ref<ServiceWorkerFetchTask>> otherFetches;

A Vector is probably more efficient.
Comment 33 youenn fablet 2019-10-10 04:30:13 PDT
(In reply to youenn fablet from comment #32)
> Comment on attachment 380439 [details]
> Patch
> 
> Timer in didReceiveResponse should be stopped otherwise we will fail
> intercepted loads of more than 70 seconds.

Patch up for review at bug 202787