RESOLVED FIXED 180477
We should be able to recover after a Service Worker process crash
https://bugs.webkit.org/show_bug.cgi?id=180477
Summary We should be able to recover after a Service Worker process crash
Chris Dumez
Reported 2017-12-06 08:58:57 PST
We should be able to recover after a Service Worker process crash.
Attachments
WIP Patch (32.75 KB, patch)
2017-12-06 17:01 PST, Chris Dumez
no flags
WIP Patch (32.88 KB, patch)
2017-12-06 17:04 PST, Chris Dumez
no flags
Patch (35.24 KB, patch)
2017-12-06 20:04 PST, Chris Dumez
no flags
Patch (38.81 KB, patch)
2017-12-06 22:24 PST, Chris Dumez
no flags
Patch (38.88 KB, patch)
2017-12-06 22:31 PST, Chris Dumez
no flags
Patch (38.90 KB, patch)
2017-12-06 22:59 PST, Chris Dumez
no flags
Chris Dumez
Comment 1 2017-12-06 17:01:48 PST
Created attachment 328654 [details] WIP Patch
Chris Dumez
Comment 2 2017-12-06 17:04:02 PST
Created attachment 328656 [details] WIP Patch Will clean up tonight.
Chris Dumez
Comment 3 2017-12-06 17:04:12 PST
*** Bug 180496 has been marked as a duplicate of this bug. ***
Chris Dumez
Comment 4 2017-12-06 20:04:54 PST
Brady Eidson
Comment 5 2017-12-06 21:27:31 PST
Comment on attachment 328675 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328675&action=review > Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.h:98 > + template<typename U> static void sendToContextProcess(U&& message); Why static? In a future world of multiple context processes, how can this possibly be static? If it's a convenience because of the lambda captures... can we capture the connection, or a way of looking the connection up?
Chris Dumez
Comment 6 2017-12-06 21:43:21 PST
Comment on attachment 328675 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328675&action=review >> Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.h:98 >> + template<typename U> static void sendToContextProcess(U&& message); > > Why static? > > In a future world of multiple context processes, how can this possibly be static? > > If it's a convenience because of the lambda captures... can we capture the connection, or a way of looking the connection up? Yes, it is because of the lambda captures. It would be unsafe to capture |this| was I have no way to make sure the connection is still alive when the lambda is called. I'll try and find another way without making this static.
Chris Dumez
Comment 7 2017-12-06 21:48:24 PST
Comment on attachment 328675 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328675&action=review >>> Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.h:98 >>> + template<typename U> static void sendToContextProcess(U&& message); >> >> Why static? >> >> In a future world of multiple context processes, how can this possibly be static? >> >> If it's a convenience because of the lambda captures... can we capture the connection, or a way of looking the connection up? > > Yes, it is because of the lambda captures. It would be unsafe to capture |this| was I have no way to make sure the connection is still alive when the lambda is called. I'll try and find another way without making this static. WebSWServerConnection is not refcounted so I cannot protect it when I capture.
youenn fablet
Comment 8 2017-12-06 21:51:58 PST
Comment on attachment 328675 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328675&action=review > Source/WebCore/ChangeLog:12 > + Once the connection is the context is established, process "run service worker" s/is/in > Source/WebCore/ChangeLog:17 > + if processed. We used to assert that we had a connection to the context process. s/if/is > Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.cpp:123 > + server().runServiceWorkerIfNecessary(*serviceWorkerIdentifier, [connectionIdentifier = identifier(), fetchIdentifier, serviceWorkerIdentifier, request, options, formData](bool success) { We should probably have WebSWServerConnection::startFetch take rvalue references for request, options and formData to take benefit of these here. Space between ] ( although I am not a fan either. Ditto in below WebSWServerConnection::XX methods. > Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.cpp:125 > + sendToContextProcess(Messages::WebSWContextManagerConnection::StartFetch { connectionIdentifier, fetchIdentifier, serviceWorkerIdentifier, request, options, formData }); In case success is false, we need to return a DidNotHandle message on the web content process. > Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.h:55 > + IPC::Connection& ipcConnection() const { return m_contentConnection.get(); } Not sure ipcConnection is a great name since there is also a service worker process connection. Should it be something like contentConnection() or webContentConnection() instead? > Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.h:101 > Ref<IPC::Connection> m_contentConnection; Hum, m_contextConnection is never used so we could remove it, or start using it. Probably better to just remove it for now. We are probably not ready to split service workers into separate processes based on origins. > Source/WebKit/StorageProcess/StorageProcess.cpp:112 > + // Relaunch the context process if necessary. Is this comment needed? > Source/WebKit/StorageProcess/StorageProcess.cpp:124 > + if (m_swServerConnections.size() == 1 && m_serverToContextConnection && &m_swServerConnections.begin()->value->ipcConnection() == m_serverToContextConnection->ipcConnection()) Instead, should we have WebSWServerConnection has a bool isServiceWorkerProcessConnection()? > LayoutTests/http/tests/workers/service/resources/postmessage-after-sw-process-crash.js:19 > + }, 0); Are we sure 0 is sufficient to ensure it will crash before the messages? Should we not try to make the test a little bit more robust with something like: - a first message to the SW to set a value inside the worker, the SW having a default value. - a second message to get from the SW the stored value, which should be the default value if the worker process was crashed and the service worker rerun.
Chris Dumez
Comment 9 2017-12-06 21:58:56 PST
Comment on attachment 328675 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328675&action=review >> LayoutTests/http/tests/workers/service/resources/postmessage-after-sw-process-crash.js:19 >> + }, 0); > > Are we sure 0 is sufficient to ensure it will crash before the messages? > Should we not try to make the test a little bit more robust with something like: > - a first message to the SW to set a value inside the worker, the SW having a default value. > - a second message to get from the SW the stored value, which should be the default value if the worker process was crashed and the service worker rerun. I am sure. See https://bugs.webkit.org/show_bug.cgi?id=180496 where the test was failing before I implemented recovery.
Chris Dumez
Comment 10 2017-12-06 22:06:04 PST
Comment on attachment 328675 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=328675&action=review >> Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.h:55 >> + IPC::Connection& ipcConnection() const { return m_contentConnection.get(); } > > Not sure ipcConnection is a great name since there is also a service worker process connection. > Should it be something like contentConnection() or webContentConnection() instead? Just trying to be consistent with Brady here who used WebSWServerToContextConnection::ipcConnection().
Brady Eidson
Comment 11 2017-12-06 22:19:30 PST
(In reply to Chris Dumez from comment #10) > Comment on attachment 328675 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=328675&action=review > > >> Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.h:55 > >> + IPC::Connection& ipcConnection() const { return m_contentConnection.get(); } > > > > Not sure ipcConnection is a great name since there is also a service worker process connection. > > Should it be something like contentConnection() or webContentConnection() instead? > > Just trying to be consistent with Brady here who used > WebSWServerToContextConnection::ipcConnection(). Meh - Use something more sensical if such a thing exists.
Chris Dumez
Comment 12 2017-12-06 22:23:22 PST
(In reply to Brady Eidson from comment #11) > (In reply to Chris Dumez from comment #10) > > Comment on attachment 328675 [details] > > Patch > > > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=328675&action=review > > > > >> Source/WebKit/StorageProcess/ServiceWorker/WebSWServerConnection.h:55 > > >> + IPC::Connection& ipcConnection() const { return m_contentConnection.get(); } > > > > > > Not sure ipcConnection is a great name since there is also a service worker process connection. > > > Should it be something like contentConnection() or webContentConnection() instead? > > > > Just trying to be consistent with Brady here who used > > WebSWServerToContextConnection::ipcConnection(). > > Meh - Use something more sensical if such a thing exists. We have so many "connections", I am personally fine with ipcConnection() as it makes it clear it is the underlying IPC::Connection. contentConnection() or webContentConnection() are super vague and I would have no idea what they return from their name.
Chris Dumez
Comment 13 2017-12-06 22:24:53 PST
WebKit Commit Bot
Comment 14 2017-12-06 22:27:45 PST
Comment on attachment 328679 [details] Patch Rejecting attachment 328679 [details] from commit-queue. Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-02', 'validate-changelog', '--check-oops', '--non-interactive', 328679, '--port=mac']" exit_code: 1 cwd: /Volumes/Data/EWS/WebKit ChangeLog entry in LayoutTests/ChangeLog contains OOPS!. Full output: http://webkit-queues.webkit.org/results/5525882
Chris Dumez
Comment 15 2017-12-06 22:31:45 PST
WebKit Commit Bot
Comment 16 2017-12-06 22:53:35 PST
Comment on attachment 328680 [details] Patch Rejecting attachment 328680 [details] from commit-queue. Failed to run "['/Volumes/Data/EWS/WebKit/Tools/Scripts/webkit-patch', '--status-host=webkit-queues.webkit.org', '--bot-id=webkit-cq-02', 'build', '--no-clean', '--no-update', '--build-style=release', '--port=mac']" exit_code: 2 cwd: /Volumes/Data/EWS/WebKit Last 500 characters of output: unsigned long long, std::optional<WTF::ObjectIdentifier<WebCore::ServiceWorkerIdentifierType> >, WebCore::ResourceRequest&&, WebCore::FetchOptions&&, IPC::FormDataReference&&) in WebSWServerConnection.o ld: symbol(s) not found for architecture x86_64 clang: error: linker command failed with exit code 1 (use -v to see invocation) ** BUILD FAILED ** The following build commands failed: Ld /Volumes/Data/EWS/WebKit/WebKitBuild/Release/WebKit.framework/Versions/A/WebKit normal x86_64 (1 failure) Full output: http://webkit-queues.webkit.org/results/5526070
Chris Dumez
Comment 17 2017-12-06 22:59:08 PST
youenn fablet
Comment 18 2017-12-06 23:08:03 PST
But do we really need this ipcConnection() if it is only used to know which server connection we are talking to, in which case a bool getter would be better? Also ipcConnection may return a const ref if possible.
Chris Dumez
Comment 19 2017-12-06 23:10:16 PST
(In reply to youenn fablet from comment #18) > But do we really need this ipcConnection() if it is only used to know which > server connection we are talking to, in which case a bool getter would be > better? > Also ipcConnection may return a const ref if possible. We do not have this information at the moment so adding the bool getter was not trivial. I can look into this in a follow up.
WebKit Commit Bot
Comment 20 2017-12-06 23:37:53 PST
Comment on attachment 328681 [details] Patch Clearing flags on attachment: 328681 Committed r225622: <https://trac.webkit.org/changeset/225622>
WebKit Commit Bot
Comment 21 2017-12-06 23:37:55 PST
All reviewed patches have been landed. Closing bug.
Radar WebKit Bug Importer
Comment 22 2017-12-06 23:38:22 PST
Note You need to log in before you can comment on or make changes to this bug.