Summary: | [WK2] WebPageProxy loadURL() won't work when called just after terminateProcess() | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | WebKit | Reporter: | Adenilson Cavalcanti Silva <savagobr> | ||||||||||||||
Component: | WebKit2 | Assignee: | Benjamin Poulain <benjamin> | ||||||||||||||
Status: | RESOLVED FIXED | ||||||||||||||||
Severity: | Normal | CC: | andersca, ap, benjamin, cgarcia, gyuyoung.kim, hausmann, jesus, rafael.lobo, rakuco, sergio, webkit.review.bot, zan | ||||||||||||||
Priority: | P2 | ||||||||||||||||
Version: | 528+ (Nightly build) | ||||||||||||||||
Hardware: | All | ||||||||||||||||
OS: | All | ||||||||||||||||
Bug Depends on: | 109980 | ||||||||||||||||
Bug Blocks: | |||||||||||||||||
Attachments: |
|
Description
Adenilson Cavalcanti Silva
2013-02-25 03:31:09 PST
An update on this issue: I started by moving most of the code in WebPageProxy processDidCrash() to a private function named processTeardown(). It is used by both processDidCrash() and at terminateProcess(). The basic idea here is that when terminateProcess() was called, it should do the cleanup and processDidCrash() should *not* execute in a later step. Further investigation revealed that WebProcessProxy will call WebPageProxy::processDidCrash() when the connection to WebProcess is closed. So it seems that is required to know at WebProcessProxy the 2 cases: a) When WebPageProxy::terminateProcess() has executed: because in this case, we will do the process cleanup just after calling ChildProcesProxy::terminate() and WebPageProxy::processDidCrash() should *not* be executed, otherwise cleanup would be done twice. and b) When the process crashed by external factors: in this case no process cleanup has being done so far and *there is* the need to execute WebPageProxy::processDidCrash(). There is an added complication I've observed. Some ports (e.g. EFL) after being notified by WebPageProxy::processDidCrash() that the process has exited, will call WebPageProxy::loadURL(). This happens exactly in the middle of WebProcess associated resources tear down. I think it makes sense to notify the port of WebProcess unavailability only in cases of real crash, not in the case of explicit termination (i.e. when WebPageProxy::terminateProcess() was called). Finally, it would be safer to notify the port of WebProcess crash only *after* cleaning up the associated resources (so that if the port calls WebPageProxy::loadURL() or reload(), we wouldn't be in the an undefined state). It looks like we need terminateProcess() to: 1) Ensure any future message from the WebProcess will be discarded. 2) Kill the Process 3) Reset to a clean state. 4) Maybe call processDidCrash, or maybe schedule its call, I am not sure. When terminateProcess() finishes, any command on the PageProxy should be valid. (In reply to comment #2) > It looks like we need terminateProcess() to: > 4) Maybe call processDidCrash, or maybe schedule its call, I am not sure. We shouldn’t call processDidCrash - if the client explicitly asked for the process to be terminated, we shouldn’t consider it to be a crash. Created attachment 190867 [details]
Makes WebPageProxy cleanup at terminateProcess() step
An explanation on this patch... first, the good: it does the resources cleanup associated with a client call to process termination and implements a way for WebProcessProxy to know if it is a crash or if it is an explicit termination, thus notifying the client when needed. The bad: it won't make the tests on #109980 to succeed because when when the WebPageProxy::reattachToWebProcess() is executed, WebProcessProxy *hasn't* completed its own cleanup. Comment on attachment 190867 [details] Makes WebPageProxy cleanup at terminateProcess() step View in context: https://bugs.webkit.org/attachment.cgi?id=190867&action=review > Source/WebKit2/UIProcess/WebProcessProxy.cpp:404 > + m_isTerminated = false; if !m_isTerminated, then m_isTerminated is already false. Created attachment 196821 [details]
Do all cleanup on terminate (both WebPageProxy and WebProcessProxy).
The good: does all cleanup (including WebProcessProxy) so that WebPageProxy::reattachToWebProcess() will work (and so loadURL()). It will break the other unit tests that simulate crashes calling WebPageProxy::terminateProcess(), since we no longer notify the client of a 'crash' (since it really is not a crash, but was requested by the user). Question: what to do with the other unit tests that depend on execution of a crash handler? Created attachment 196846 [details]
With new test (load, crash, load)
WebProcess termination now is done using a new function 'userRequestedTerminate' that performs complete shutdown of WebProcessProxy before returning. This ensures that at time reattachToWebProcess() is executed, WebProcessProxy is in a sane state.
Next, cleanup of WebPageProxy is done at terminateProcess(), using a new function named 'processTeardown()', that is shared with processDidCrash(). Finally, client is not notified of process crash, since it was terminated per client's request.
This changes makes the new test succeed, but will cause a regression in all tests that depend on a callback to be executed at process termination.
Comment on attachment 196846 [details] With new test (load, crash, load) View in context: https://bugs.webkit.org/attachment.cgi?id=196846&action=review > will cause a regression in all tests that depend on a callback to be executed at process termination. What are these other tests? We clearly need to do something about them, can't just make them fail. > Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:59 > + void crashWebProcess() > + { > + WKPageTerminate(webView.page()); > + } Unsure why you called this function crashWebProcess. It doesn't make it crash, does it? Alexey Thanks for commenting. Answering the questions: a) What are the tests? All that use WKPageTerminate API, which are 3 tests (MouseMoveAfterCrash, ReloadPageAfterCrash, ResizeWindowAfterCrash). I think they can work *if* we execute the code in the crash callback within the scope of primary test function. b) why 'crashWebProcess': it terminates WebProcess thus simulating a crash (the other 3 tests follow a similar nomenclature). If you wish, I can change the function name. Created attachment 196945 [details]
Fixed failing unit test
Fixed failing unit test.
Comment on attachment 196945 [details] Fixed failing unit test View in context: https://bugs.webkit.org/attachment.cgi?id=196945&action=review > Source/WebKit2/ChangeLog:14 > + WebPageProxy should do its resources clean up as soon as executing the call to terminate. > + > + Its state will be undefined if depending on the execution of processDidCrash(), thus this > + patch moves the cleanup code to a shared private function that is used for both the cases > + i.e. user termination and real crash. WebProcess shutdown is done using a new method that > + ensures that all cleanup was done before returning. Finally, for user requested termination, > + clients are no longer notified of crash. I think the changelog should be clearer. You are fixing a very specific issue. You first state what was the bug, then explain why it happened, then you put the text explaining how you fixed the problem. > Source/WebKit2/UIProcess/WebPageProxy.cpp:1571 > + m_process->userRequestedTerminate(); > + processTeardown(); Why are those called in this order? You should first clean up the state, then invoke any callback. > Source/WebKit2/UIProcess/WebPageProxy.cpp:3803 > + if (!m_isValid) > + return; When does that happen? 1) terminateProcess() 2) processTeardown() 3) processDidCrash() ? Because I would think you cannot get processDidCrash() after processTeardown() (because of m_process->removeMessageReceiver). Can this be tested? > Source/WebKit2/UIProcess/WebPageProxy.cpp:3810 > + // After removing the message receivers and cleaning up resources, we > + // notify the client/port that a crash happened. This ensures > + // that we are in a defined state and calls to loadURL() or reload() > + // should work. I don't think this comment is necessary. > Source/WebKit2/UIProcess/WebPageProxy.h:604 > + void processTeardown(); This should be private. > Source/WebKit2/UIProcess/WebProcessProxy.h:119 > + void userRequestedTerminate(); > + The name should not have "user", as it is a callback for any programmatic termination. Not only the one triggered by the user. Why not terminateProcess()? Please ping me when you have an update. Let's iterate quickly on this until fixed. Benjamin Thanks for reviewing this patch. I will try to address the comments bellow. > I think the changelog should be clearer. You are fixing a very specific >issue. You first state what was the bug, then explain why it happened, then >you put the text explaining how you fixed the problem. > Concerning the CL... agreed. The fix is one thing (i.e. load fails after call to terminate) and the points of failure are 2: a) Cleaning up of WebPageProxy b) Reattach executing *before* complete WebProcessProxy cleanup. I will upload a new version of this patch with a hopefully better explanation in the CL. > > Source/WebKit2/UIProcess/WebPageProxy.cpp:1571 > > + m_process->userRequestedTerminate(); > > + processTeardown(); > > Why are those called in this order? > Executing WebProcess termination must be done *before* any cleanup of WebPageProxy. If you think about it, in normal crashes, this is exactly what has happened already. This ensures that reattach/respawn of WebProcess will succeed, since when we return to WebPageProxy, WebProcessProxy is in a well defined state (and not in the middle of its own cleanup). > > When does that happen? > 1) terminateProcess() > 2) processTeardown() > 3) processDidCrash() The previous behavior was to execute processDidCrash(), called from WebProcessProxy. With this patch, using the new member function 'userRequestedTerminate', WebProcessProxy will *not* callback any method into WebPageProxy. This includes processDidCrash(). So, the execution flow for user requested termination was: WebPageProxy................ WebProcessProxy terminateProcess() ---------> processTerminate() processDidCrash() <---------- DidClose() *Now* the execution flow for this case is: WebPageProxy................ WebProcessProxy terminateProcess() ---------> userRequestedTerminate() //kill everything processTearDown() For normal crashes (i.e. not user requested termination), the execution flow hasn't changed. WebProcessProxy will call processDidCrash() into WebPageProxy which will then notify the client/port of the crash. Another thing that this patch changes is the ordering of steps for normal crashes. Before it was: notify the client/port of a crash and next try to do the cleanup. Now it is: first do the cleanup, next notify the client/port of a crash. This helps to ensure that WebPageProxy is in a defined state if someone calls its member functions (e.g. loadURL()). In those cases, since the cleanup hasn't executed yet, the test for WebProcess state would fail (and reattach is never executed). > > > Source/WebKit2/UIProcess/WebPageProxy.cpp:3810 > > + // After removing the message receivers and cleaning up resources, we > > + // notify the client/port that a crash happened. This ensures > > + // that we are in a defined state and calls to loadURL() or reload() > > + // should work. > > I don't think this comment is necessary. > Ack. > > Source/WebKit2/UIProcess/WebPageProxy.h:604 > > + void processTeardown(); > > This should be private. > Agreed. > > Source/WebKit2/UIProcess/WebProcessProxy.h:119 > > + void userRequestedTerminate(); > > + > > The name should not have "user", as it is a callback for any programmatic termination. Not only the one triggered by the user. > Isn't this executed when the user/client code call WKPageTerminate()? Created attachment 197640 [details]
Addressing reviewers comments: CL, member function naming, using private access, etc.
Tested on Qt/Linux, waiting to finish compile on Mac to run unit tests before asking for cq.
Comment on attachment 197640 [details] Addressing reviewers comments: CL, member function naming, using private access, etc. View in context: https://bugs.webkit.org/attachment.cgi?id=197640&action=review > Source/WebKit2/ChangeLog:17 > + closing connections. Otherwise, WebPageProxy can even be able to > + detect that WebProcess is no longer running, but a call to respawn > + the process will fail. "Otherwise, WebPageProxy can even be able to detect that WebProcess [...]" > Source/WebKit2/ChangeLog:19 > + To fix this issues, this patch moves the cleanup code to a shared private function "this issue" or "these issues". > Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:83 > + EXPECT_TRUE(testHelper->firstSuccessfulLoad); This can never be false. Comment on attachment 197640 [details] Addressing reviewers comments: CL, member function naming, using private access, etc. View in context: https://bugs.webkit.org/attachment.cgi?id=197640&action=review > Source/WebKit2/UIProcess/WebPageProxy.cpp:3810 > +void WebPageProxy::processTeardown() This function name is ambiguous; it sounds like it will process the teardown. Heads up on test results: Adenilsons-Mac-mini:Webkit adenilsoncavalcanti$ egrep -n Crash ../vanilla.txt | grep After 292: Test: MouseMoveAfterCrash -> Passed 302: Test: ReloadPageAfterCrash -> Passed 303: Test: ResizeWindowAfterCrash -> Passed Adenilsons-Mac-mini:Webkit adenilsoncavalcanti$ egrep -n Crash ../patched.txt | grep After 293: Test: LoadPageAfterCrash -> Passed 294: Test: MouseMoveAfterCrash -> Passed 304: Test: ReloadPageAfterCrash -> Passed 305: Test: ResizeWindowAfterCrash -> Passed All other API tests are ok too. Comment on attachment 197640 [details] Addressing reviewers comments: CL, member function naming, using private access, etc. View in context: https://bugs.webkit.org/attachment.cgi?id=197640&action=review >> Source/WebKit2/UIProcess/WebPageProxy.cpp:3810 >> +void WebPageProxy::processTeardown() > > This function name is ambiguous; it sounds like it will process the teardown. processTeardown-> resetStateAfterProcessExit? Anders, any suggestion? > Source/WebKit2/UIProcess/WebPageProxy.cpp:3899 > - > if (!m_isValid) { Can m_isValid be true at all at this point? Created attachment 197886 [details]
Patch
Landed: https://trac.webkit.org/r148312 (In reply to comment #22) > Landed: https://trac.webkit.org/r148312 So now ReloadPageAfterCrash is useless. The "didCrash" callback is never called and the test is now simply tests if the page reloads. In fact, if you simply comment "WKPageTerminate" from that test, it will also pass. (In reply to comment #23) > (In reply to comment #22) > > Landed: https://trac.webkit.org/r148312 > > So now ReloadPageAfterCrash is useless. The "didCrash" callback is never called and the test is now simply tests if the page reloads. In fact, if you simply comment "WKPageTerminate" from that test, it will also pass. Yep. I suggested to Adenilson to make a injected bundle to crash the WebProcess. That way we can start making process-crash tests again. How are we going to test WKPageTerminate if we never get a callback such as "didTerminate"? What if instead of having "processDidCrash" callback we had a single "processDidTerminate" or something like that with a boolean argument confirming if it was a crash or not? Given that the user of this API can figure out by himself if it has crashed by purpose or not checking whether he has used WKPageTerminate I think we don't even need that argument at all: he could just ignore "processDidTerminate" if he is indeed expecting the page to terminate or act as a "processDidCrash" otherwise at user level without changing the behavior we had before. (In reply to comment #25) > How are we going to test WKPageTerminate if we never get a callback such as "didTerminate"? What if instead of having "processDidCrash" callback we had a single "processDidTerminate" or something like that with a boolean argument confirming if it was a crash or not? Given that the user of this API can figure out by himself if it has crashed by purpose or not checking whether he has used WKPageTerminate I think we don't even need that argument at all: he could just ignore "processDidTerminate" if he is indeed expecting the page to terminate or act as a "processDidCrash" otherwise at user level without changing the behavior we had before. It is not clear to me how that would help WKPageTerminate. What kind of tests do you have in mind. A callback processDidTerminate does not really fit into the API. (In reply to comment #26) > (In reply to comment #25) > > How are we going to test WKPageTerminate if we never get a callback such as "didTerminate"? What if instead of having "processDidCrash" callback we had a single "processDidTerminate" or something like that with a boolean argument confirming if it was a crash or not? Given that the user of this API can figure out by himself if it has crashed by purpose or not checking whether he has used WKPageTerminate I think we don't even need that argument at all: he could just ignore "processDidTerminate" if he is indeed expecting the page to terminate or act as a "processDidCrash" otherwise at user level without changing the behavior we had before. > > It is not clear to me how that would help WKPageTerminate. What kind of tests do you have in mind. > > A callback processDidTerminate does not really fit into the API. I agree with Benjamin here. Plus, from an API perspective, both options look a bit messy. A boolean that will ending up as a state representation, or a few assumptions around using or ignoring callbacks. If what we want in the end is only a way to test a page-crashed-and-reload-something-else flow, the Injected Bundle path looks cleaner. (In reply to comment #27) > (In reply to comment #26) > > (In reply to comment #25) > > > How are we going to test WKPageTerminate if we never get a callback such as "didTerminate"? What if instead of having "processDidCrash" callback we had a single "processDidTerminate" or something like that with a boolean argument confirming if it was a crash or not? Given that the user of this API can figure out by himself if it has crashed by purpose or not checking whether he has used WKPageTerminate I think we don't even need that argument at all: he could just ignore "processDidTerminate" if he is indeed expecting the page to terminate or act as a "processDidCrash" otherwise at user level without changing the behavior we had before. > > > > It is not clear to me how that would help WKPageTerminate. What kind of tests do you have in mind. > > > > A callback processDidTerminate does not really fit into the API. > > > I agree with Benjamin here. Plus, from an API perspective, both options look a bit messy. A boolean that will ending up as a state representation, or a few assumptions around using or ignoring callbacks. > > If what we want in the end is only a way to test a page-crashed-and-reload-something-else flow, the Injected Bundle path looks cleaner. It's not only about whether we have a crash or not, but how can we test WKPageTerminate. The callback looks like a straight-forward solution. It's still not clear to me how WKPageTerminate can be tested today... or maybe we just don't want to test it. (In reply to comment #28) > It's not only about whether we have a crash or not, but how can we test WKPageTerminate. The callback looks like a straight-forward solution. It's still not clear to me how WKPageTerminate can be tested today... or maybe we just don't want to test it. Seriously I have trouble following you. To make thing easier, what kind of tests do you envision if we had a callback? (In reply to comment #28) > It's not only about whether we have a crash or not, but how can we test WKPageTerminate. The callback looks like a straight-forward solution. It's still not clear to me how WKPageTerminate can be tested today... or maybe we just don't want to test it. Seriously I have trouble following you. To make thing easier, what kind of tests do you envision if we had a callback? |