Bug 109980 - [WK2] Calls to WKPageLoadURL() when WebProcess was terminated may fail
Summary: [WK2] Calls to WKPageLoadURL() when WebProcess was terminated may fail
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 109842
Blocks: 110743
  Show dependency treegraph
 
Reported: 2013-02-15 15:59 PST by Adenilson Cavalcanti Silva
Modified: 2013-04-15 18:14 PDT (History)
12 users (show)

See Also:


Attachments
First version (not for commit, tests are timing out ATM) (12.23 KB, patch)
2013-02-18 17:10 PST, Adenilson Cavalcanti Silva
benjamin: review-
Details | Formatted Diff | Diff
Changes: comments, empty line, no longer using gtest magick. (12.62 KB, patch)
2013-02-19 12:09 PST, Adenilson Cavalcanti Silva
benjamin: review+
Details | Formatted Diff | Diff
Fixes on comments (12.54 KB, patch)
2013-02-22 14:18 PST, Adenilson Cavalcanti Silva
benjamin: review+
savagobr: commit-queue-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adenilson Cavalcanti Silva 2013-02-15 15:59:54 PST
Steps:
- load a page with WKPageLoadURL()
- kill WebProcess with WKPageTerminate()
- try loading a page using again WKPageLoadURL()

The expected result would be WebProcess being re-created and page loading succeed. But in a test, it fails to re-create WebProcess and thus loading the page.

This sounds like a good opportunity to create a new test to stress the following scenarios:
a) Load -> Crash -> Load
b) Load -> Crash -> Load (but from the crash handler)
Comment 1 Adenilson Cavalcanti Silva 2013-02-18 17:10:52 PST
Created attachment 188966 [details]
First version (not for commit, tests are timing out ATM)

Not for commit, tests are timing out ATM.
Comment 2 Benjamin Poulain 2013-02-18 18:32:45 PST
Comment on attachment 188966 [details]
First version (not for commit, tests are timing out ATM)

View in context: https://bugs.webkit.org/attachment.cgi?id=188966&action=review

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:39
> +

No need for the blank line.

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:97
> +void didCrash(WKPageRef page, const void* clientInfo)
> +{
> +    WebKit2CrashLoader* testHelper = const_cast<WebKit2CrashLoader*>(static_cast<const WebKit2CrashLoader*>(clientInfo));
> +
> +    // On second crash, we try to load from the crash handler.
> +    EXPECT_TRUE(testHelper->firstSuccessfulLoad);
> +    testHelper->loadUrl();
> +}

I would move this definition to just above LoadPageOnCrashHandler.
This would reduce the span between definition and usage.

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:99
> +TEST_F(WebKit2CrashLoader, LoadPageAfterCrash)

I am not a fan of TEST_F. I'd prefer something explicit with less GoogleTest magic.

The test case name should also be strictly WebKit2 in order to easily select tests for running.

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:110
> +TEST_F(WebKit2CrashLoader, LoadPageOnCrashHandler)

LoadPageInCrashHandler?

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:118
> +{
> +    // We install a crash handler, it should load a page.
> +    loaderClient.processDidCrash = didCrash;
> +    loadUrl();
> +    Util::run(&firstSuccessfulLoad);
> +    crashWebProcess();
> +    Util::run(&secondSuccessfulLoad);
> +}

This tests requires a bit more comments. It is not obvious how the test work.
Comment 3 Adenilson Cavalcanti Silva 2013-02-19 12:09:58 PST
Created attachment 189135 [details]
Changes: comments, empty line, no longer using gtest magick.

Implemented reviewer's suggestions. The tests are timing out (not sure if the bug is in the tests itself or in WK2).

After the first load succeeded, next page load will fail and WebProcess won't be re-created.
Comment 4 Benjamin Poulain 2013-02-19 13:28:45 PST
Comment on attachment 189135 [details]
Changes: comments, empty line, no longer using gtest magick.

View in context: https://bugs.webkit.org/attachment.cgi?id=189135&action=review

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:92
> +// First test will load a blank page and next kill WebProcess, the expected
> +// result is that a call to page load should spawn a new WebProcess.

You should say "This test" instead of "First test".
Rationale:
-If someone add a test in this file, the comments get out of date.
-We should not imply there is an order of testing. Testing the test in any order should be the same.

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:103
> +// Crash handler, used only by the second test.

"the second test" -> LoadPageInCrashHandler

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:108
> +    // In the second test, we try to load from the crash handler.

ditto.

> Tools/TestWebKitAPI/Tests/WebKit2/LoadPageOnCrash.cpp:113
> +// This is similar to the first test (i.e. we will load a page and

"the first test"->LoadPageAfterCrash
Comment 5 Adenilson Cavalcanti Silva 2013-02-22 14:18:37 PST
Created attachment 189826 [details]
Fixes on comments

Fixes on comments, not to land.
Comment 6 Adenilson Cavalcanti Silva 2013-02-22 18:35:59 PST
I think I figured it out what is happening. What would be expected is:

terminateProcess() --> loadURL() --> reattachtoWebProcess()

But what really is happening:

terminateProcess() --> loadURL() --> processDidCrash()


At the time that loadURL() is executed, the teardown of WebProcess is not completed yet (i.e. processDidCrash() hasn't executed).

While loading a URL, the test for the state of webProcess  is done calling isValid() that will return true (after all, it is set to false only at a later stage in processDidCrash()).

Simply forcing the execution of reattachToWebProcess() is not a solution, because execution flow will eventually return to processDidCrash(). Plus, it would fail in several points thanks to the fact that there is already associated receivers to the webProcess data structures.

I see at least 2 ways to deal with this scenario:

a) Have a flag to mark that we are in a undefined state (i.e. between terminateProcess() and processDidCrash(). That could be used by loadURL() to decide if/when to reattachToWebProcess() and later used by processDidCrash() to change the execution flow.

b) Move some (or most) of the teardown code in processDidCrash() to terminateProcess(). Not sure about the deep consequences of this (i.e. how long does it take to close connections, kill processes, etc) and how that would impact UIProcess.


Any ideas about this?
Comment 7 Benjamin Poulain 2013-02-22 18:53:28 PST
Interesting findings.

Having terminateProcess() perform all the necessary cleaning before getting processDidCrash() seems reasonable to me.
Alexey, do you foresee any problem with this?

The good news is that should not affect users. The bad news is, it will make testing a pain.
Comment 8 Adenilson Cavalcanti Silva 2013-02-23 06:51:24 PST
Maybe an alternative would be to move the code in processDidCrash() to a private function (i.e. processTearDown()) and have it be called by loadURL() if we are in the undefined state.

It could use a flag to know the state (i.e. m_isAboutToCrash) that is set at terminateProcess() exit.

In normal cases, it would work like:
a) terminateProcess() sets flag to true
b) processDidCrash() reset flag and calls processTearDown().


And in the case stressed by the test:
a) terminateProcess() sets the flag to true; 
b) loadURL() tests for the flag and calls processTearDown() so reattachToWebprocess() will work fine
c) processDidCrash() tests for the flag and returns after reseting it (and will *not need* to call processTearDown())

What you guys think?
Comment 9 Adenilson Cavalcanti Silva 2013-02-23 06:56:02 PST
I forgot to comment that at 'b' step (i.e. loadURL()), if we are in the undefined state, after calling processTearDown(), it will reset the control flag.

Otherwise, processDidCrash() won't be able to know if it can avoid calling processTearDown().
Comment 10 Alexey Proskuryakov 2013-02-23 14:50:19 PST
I think that processDidCrash() should not be called after WKPageTerminate(), simply because there is no crash. WKPageTerminate() should perform all necessary cleanup.
Comment 11 Adenilson Cavalcanti Silva 2013-02-25 03:33:08 PST
Opening bug #110743 for writing the fix.
Comment 12 Adenilson Cavalcanti Silva 2013-02-25 03:36:32 PST
Alexey

We can change the code to not execute processDidCrash() at calls to terminateProcess().

But, it still needs access to a way for cleaning up WebProcess associated resources, as it will be executed when the process indeed crashed (but was not terminated by a API call). i.e. kill -9 $(pidof WebProcess)
Comment 13 Adenilson Cavalcanti Silva 2013-04-15 18:14:39 PDT
This test was used in the final patch of issue 110743 (landed in revision 148312).