Bug 60875

Summary: [SOUP] Abnormal operation if server sends 5xx status code without HTTP body
Product: WebKit Reporter: Keunsoon Lee <keunsoon.lee>
Component: PlatformAssignee: James Simonsen <simonjam>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, alex, cmarcelo, dglazkov, fishd, gustavo, gyuyoung.kim, gyuyoung.kim, mrobinson, ossy, rniwa, robert, sangseok.lim, simonjam, svillar, webkit.review.bot, xan.lopez
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: Other   
OS: Linux   
Bug Depends on: 62835    
Bug Blocks:    
Attachments:
Description Flags
Handle status code 5xx after receiving HTTP body
none
add details on ChangeLog
mrobinson: review-
modify ChangeLog, add a new helper function, check handle->client() again
none
add test case on LayoutTests/http/tests/loading
mrobinson: review-
modify some codes according to review
mrobinson: review-
Archive of layout-test-results from ec2-cr-linux-03
none
add the layout test on chromium's layout test_expectations.txt to skip test
none
add bug number on chromium test_expectations.txt
none
Patch none

Description Keunsoon Lee 2011-05-16 01:31:48 PDT
* Background
   - Almost all web servers sends its own error page when they sends 5xx status code. But some web servers and proxy servers send no error page. 

* Reproduce scenario
1) Visit any website
2) The web site's server or proxy server send 5xx status code without error page (i.e. HTTP body)
   - Usually the HTTP response header does not have Content-Type in this case.
3) Webkit soup port accepts MIME-Type "inode/directory" from libsoup's content sniff callback
4) Webkit soup port throws the HTTP response to WebCore.
5) WebCore recognizes the HTTP response as download because it cannot handle the MIME-Type "inode/directory"

* Tips for reproducing on actual world (in my specific case)
1) Run Webkit Browser on smartphone
2) Use 3G network via Vodafone proxy
3) Try to visit "http://www.kimyuna.zoa.to"
4) Then, the proxy server cannot find the web page, and sends "504 Gateway Timeout" without HTTP body.
Comment 1 Keunsoon Lee 2011-05-16 01:48:25 PDT
Created attachment 93625 [details]
Handle status code 5xx after receiving HTTP body

Handle status code 5xx after receiving HTTP body
    - Background
        1) Webkit soup port premises that server always sends error page in case of status code 5xx.
        2) But, sometimes there is no HTTP body for error page in that case.
    - Modified algorithm;
        1) libsoup sends status code 5xx (server error)
        2) Webkit soup port orders to accumulate HTTP body chunks to libsoup.
        3) Webkit soup port ignores gotHeadersCallback, contentSniffedCallback and gotChunkCallback.
        4) Webkit soup port checks there is HTTP body or not on sendRequestCallback and;
            4-1) if there is HTTP body, calls didReceiveResponse()
            4-2) if there is no HTTP body, calls didFail()
Comment 2 Martin Robinson 2011-05-16 07:57:22 PDT
Comment on attachment 93625 [details]
Handle status code 5xx after receiving HTTP body

View in context: https://bugs.webkit.org/attachment.cgi?id=93625&action=review

> Source/WebCore/ChangeLog:9
> +        [SOUP] Abnormal operation if server sends 5xx status code without HTTP body
> +        https://bugs.webkit.org/show_bug.cgi?id=60875
> +
> +        Handle status code 5xx after receiving HTTP body
> +

This doesn't really say what this fixes other than "abnormal behavior."

> Source/WebCore/ChangeLog:13
> +        * platform/network/soup/ResourceHandleSoup.cpp:
> +        (WebCore::statusWillBeHandledBySoup):
> +        (WebCore::soupErrorShouldCauseLoadFailure):
> +        (WebCore::convertSoupErrorToResourceError):

It makes sense to fill these out.
Comment 3 Martin Robinson 2011-05-16 07:57:41 PDT
Sergio, maybe you can check this one out?
Comment 4 Keunsoon Lee 2011-05-16 21:46:22 PDT
Created attachment 93736 [details]
add details on ChangeLog
Comment 5 Martin Robinson 2011-05-18 13:46:16 PDT
Comment on attachment 93736 [details]
add details on ChangeLog

View in context: https://bugs.webkit.org/attachment.cgi?id=93736&action=review

Do you have an example of a site this breaks? Ideally this change would come with a test case. Is the issue that when there is a 5xx error and no body, soup doesn't fail the load and it should? What do other network stacks do?

> Source/WebCore/ChangeLog:23
> +        Handle status code 5xx after receiving HTTP body
> +           - Background
> +            1) Webkit soup port premises that server always sends error page in case of status code 5xx.
> +            2) But, sometimes there is no HTTP body for error page. 
> +            3) In that case, Webkit tries to download the HTTP response even if there is no HTTP body. (abnormal operation)
> +           - Modified algorithm;
> +            1) libsoup sends status code 5xx (server error)
> +            2) Webkit soup port orders to accumulate HTTP body chunks to libsoup.
> +            3) Webkit soup port ignores gotHeadersCallback, contentSniffedCallback and gotChunkCallback.
> +            4) Webkit soup port checks there is HTTP body or not on sendRequestCallback and;
> +                4-1) if there is HTTP body, calls didReceiveResponse()
> +                4-2) if there is no HTTP body, calls didFail() - prevent the abnormal operation
> +           - With this modification;
> +            1) If server sends error page with HTTP body, Browser will show the received error page.
> +            2) If server sends no error page, that is, no HTTP body, Browser will show its own error page from each port.
> +

This seems okay, but typically ChangeLogs are in paragraph form. Take a look at trac.webkit.org for inspiration.

> Source/WebCore/platform/network/soup/ResourceHandleSoup.cpp:411
> +            || ((message && SOUP_STATUS_IS_SERVER_ERROR(message->status_code)) && (!message->response_body || !message->response_body->length))

Do you mind splitting this off into a helper function called something like statusIsServerErrorAndHaveNotReceivedBody?

> Source/WebCore/platform/network/soup/ResourceHandleSoup.cpp:422
> +        || ((message && SOUP_STATUS_IS_SERVER_ERROR(message->status_code)) && (!message->response_body || !message->response_body->length))) {

You can also use it here.
Comment 6 Keunsoon Lee 2011-05-19 04:33:20 PDT
Created attachment 94059 [details]
modify ChangeLog, add a new helper function, check handle->client() again

You can test the problematic situation on following test site.
    http://spe.mobilephone.net/wit/witguide.html
    WIT Root -> Common -> HTTP -> httperror.xhtml

If you are working with EFL port, you cannot find anything different between attached patch and original code.
Because EFL port does not support download flow and does not have its own error page.
But, you can see trying download on another port which is supporting download or showing own error page.

I also found 4xx status code cause same problem, so add it on a new patch.

And check handle->client() again after calling didReceiveResponse() on sendRequestCallback().
Because, WebCore makes handle->client() zero after didReceiveResponse() if something wrong on received response.
(I guess it means cancel.)
For example, received response's Content-Type is "foo/foo", which you can test on the test site.

Thank you.
Comment 7 Keunsoon Lee 2011-05-19 05:54:44 PDT
This issue is not about libsoup, but Webkit soup port.

In the problematic case, libsoup does almost correct behavior, except for sends contentSniffCallback even if there is no received HTTP body (the sniffed MIME Type is "inode/directory"). 

The real problem is that Webkit soup port does not check existence of HTTP body and throws it to WebCore unconditionally. WebCore shows, ignores or downloads the HTTP response according to MIME Type.

Because of libsoup's above bug, HTTP response which has not received body but Content-Type field is exist can be delivered to WebCore. Then WebCore tries to download the http response.

To prevent this, Webkit soup port needs to check if received body is exist or not.
Comment 8 Sergio Villar Senin 2011-05-27 01:53:05 PDT
From my understanding, the actual issue is that for empty body 4xx/5xx responses with non-textual content types, webkitgtk+ tries to download the content instead of showing an error page. The patch could indeed fix this particular situation but I'm afraid that it could cause side effects, as it will start to handle all the 4xx/5xx error codes in WebKit, something that was not needed so far. IMO this change should come with a proper test case.
Comment 9 Keunsoon Lee 2011-05-27 04:08:08 PDT
You are right. This patch handles all 4xx/5xx responses unlike current code. 
But, I think there is no difference from a WebCore's viewpoint. WebCore (i.e. ResourceHandle) still receives didReceiveResponse on current code for 4xx/5xx responses. 
And it can cause small delay because of waiting for data, but I think it is tolerable for wrong operation.

For side effects, I did my best to test this patch, but it still needs to be tested from others.
What do you think of the best way to test this?
Comment 10 Martin Robinson 2011-05-27 05:40:09 PDT
(In reply to comment #9)
> You are right. This patch handles all 4xx/5xx responses unlike current code. 
> But, I think there is no difference from a WebCore's viewpoint. WebCore (i.e. ResourceHandle) still receives didReceiveResponse on current code for 4xx/5xx responses. 
> And it can cause small delay because of waiting for data, but I think it is tolerable for wrong operation.
> 
> For side effects, I did my best to test this patch, but it still needs to be tested from others.
> What do you think of the best way to test this?

You should write a test like those in LayoutTests/http/tests.
Comment 11 Martin Robinson 2011-05-27 05:40:44 PDT
(In reply to comment #8)
> From my understanding, the actual issue is that for empty body 4xx/5xx responses with non-textual content types, webkitgtk+ tries to download the content instead of showing an error page. The patch could indeed fix this particular situation but I'm afraid that it could cause side effects, as it will start to handle all the 4xx/5xx error codes in WebKit, something that was not needed so far. IMO this change should come with a proper test case.

Do you have a suggestion for a better approach?
Comment 12 Keunsoon Lee 2011-06-02 00:55:50 PDT
Created attachment 95734 [details]
add test case on LayoutTests/http/tests/loading

I added test case.
It sends XMLHttpRequest for all 4xx and 5xx status code, and check if loaded successfully or failed.
For regression test, both cases which receiving response body and no response body are tested.

Thank you.
Comment 13 Martin Robinson 2011-06-15 14:02:44 PDT
Comment on attachment 95734 [details]
add test case on LayoutTests/http/tests/loading

View in context: https://bugs.webkit.org/attachment.cgi?id=95734&action=review

Sorry for the late review. This looks pretty good, but I have a few nits. See below.

> Source/WebCore/platform/network/soup/ResourceHandleSoup.cpp:408
> +static bool soupErrorAndHaveNotReceivedBody(SoupMessage* message)

I think this could be renamed to something like: receivedClientOrServerErrorWithoutBody.

> Source/WebCore/platform/network/soup/ResourceHandleSoup.cpp:434
> +    if ((message && SOUP_STATUS_IS_TRANSPORT_ERROR(message->status_code)) 
> +        || soupErrorAndHaveNotReceivedBody(message)) {

Please make this one line.

> Source/WebCore/platform/network/soup/ResourceHandleSoup.cpp:491
> +            if (d->m_cancelled || !(client = handle->client())) {
> +                cleanupSoupRequestOperation(handle.get());
> +                return;
> +            }

I'm not sure I understand why you are resetting client here. Why is that? Can the client change?

> LayoutTests/http/tests/loading/status-code-error-without-response-body.html:4
> +This test checks how FrameLoader acts if status code error (i.e. 4xx and 5xx) is come with no response body.<br>

Should read "This test verifies that FrameLoader properly handles 4xx and 5xx errors without response bodies."

> LayoutTests/http/tests/loading/status-code-error-without-response-body.html:5
> +You can see series of PASS and DONE, if success.<br><br>

Should read "If this test succeeds, you should see a series of "PASS" messages and then "DONE""

> LayoutTests/http/tests/loading/status-code-error-without-response-body.html:49
> +var curStatusCode = 0;
> +var curTest = 0;

Should be currentStatusCode and currentTest.

> LayoutTests/http/tests/loading/status-code-error-without-response-body.html:70
> +    if (curTest < tests.length) {
> +        if (curStatusCode < statusCodes.length) {
> +            runTest(tests[curTest][0]+statusCodes[curStatusCode++], tests[curTest][1]);
> +        } else {
> +            curStatusCode = 0;
> +            curTest++;
> +            nextTest();
> +        }
> +    } else {
> +         done();
> +    }

Typically in WebKit we use early returns. So this would be something like:

if (currentTest >= tests.length) {
    done();
   return;
}

if (currentStatusCode >= statusCodes.length) {
    runTest(tests[currentTest][0] + statusCodes[currentStatusCode++], tests[currentTest][1]);
    return;
}

 currentStatusCode = 0;
 currentTest++;
 nextTest();

> LayoutTests/http/tests/loading/status-code-error-without-response-body.html:79
> +    log("It is only useful inside the regression test tool.\n");

You should probably mention DumpRenderTree here explicitly. "This test needs to be run in DumpRenderTree."
Comment 14 Keunsoon Lee 2011-06-16 01:52:09 PDT
Comment on attachment 95734 [details]
add test case on LayoutTests/http/tests/loading

View in context: https://bugs.webkit.org/attachment.cgi?id=95734&action=review

Thank you for detailed review for my humble code.
I modified and uploaded the patch, but have some questions below.

Thank you in advance.

>> Source/WebCore/platform/network/soup/ResourceHandleSoup.cpp:434
>> +        || soupErrorAndHaveNotReceivedBody(message)) {
> 
> Please make this one line.

I just put those two lines on one physical line.
Did you happen to mean to make one function, and not to use "||"?

>> Source/WebCore/platform/network/soup/ResourceHandleSoup.cpp:491
>> +            }
> 
> I'm not sure I understand why you are resetting client here. Why is that? Can the client change?

I agree that client does not need to be reset from your advice. Checking handle->client() would be sufficient.
Do you happen to  not agree this checking itself?
Comment 15 Keunsoon Lee 2011-06-16 01:53:51 PDT
Created attachment 97423 [details]
modify some codes according to review
Comment 16 WebKit Review Bot 2011-06-16 02:10:49 PDT
Comment on attachment 97423 [details]
modify some codes according to review

Attachment 97423 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://queues.webkit.org/results/8876137

New failing tests:
http/tests/loading/status-code-error-without-response-body.html
Comment 17 WebKit Review Bot 2011-06-16 02:10:55 PDT
Created attachment 97425 [details]
Archive of layout-test-results from ec2-cr-linux-03

The attached test failures were seen while running run-webkit-tests on the chromium-ews.
Bot: ec2-cr-linux-03  Port: Chromium  Platform: Linux-2.6.35-28-virtual-x86_64-with-Ubuntu-10.10-maverick
Comment 18 Keunsoon Lee 2011-06-16 03:43:32 PDT
The patch does not take effect on chromium port. So, I think the red box on "cr-linux" is not because of my patch.
And I double checked that the problem test case is not causing any problem on gtk port.
(Webkit bot said inspector/syntax-highlight-css.html is crashed on DumpRenderTree.)

What do I have to do?
Comment 19 Martin Robinson 2011-06-16 07:33:33 PDT
(In reply to comment #18)
> The patch does not take effect on chromium port. So, I think the red box on "cr-linux" is not because of my patch.
> And I double checked that the problem test case is not causing any problem on gtk port.
> (Webkit bot said inspector/syntax-highlight-css.html is crashed on DumpRenderTree.)
> 
> What do I have to do?

You patch looks good, but the red cr-linux bubble is because the bot failed a test.

New failing tests:
http/tests/loading/status-code-error-without-response-body.html

You can download the actual test results by going to: https://bugs.webkit.org/attachment.cgi?id=97425
Comment 20 Keunsoon Lee 2011-06-16 07:46:36 PDT
It seems chromium also cannot handle 4xx/5xx error code without body.
Then should I fix it on chromium port?

Sorry for my shallow knowledge.
Comment 21 Martin Robinson 2011-06-16 07:50:02 PDT
(In reply to comment #20)
> It seems chromium also cannot handle 4xx/5xx error code without body.
> Then should I fix it on chromium port?
> 
> Sorry for my shallow knowledge.

I recommend trying to find one of them on IRC and asking. They may just want to skip the test for now.
Comment 22 Keunsoon Lee 2011-06-16 08:45:41 PDT
Hi!

I asked about this to IRC #webkit, and they said we can ignore this red bubble.
(WildFox - Nikolas Zimmermann - said.)

Thank you in advance.
Comment 23 Martin Robinson 2011-06-16 09:04:12 PDT
(In reply to comment #22)
> Hi!
> 
> I asked about this to IRC #webkit, and they said we can ignore this red bubble.
> (WildFox - Nikolas Zimmermann - said.)
> 
> Thank you in advance.

I'm not sure he understood that the test failure was valid. Please double check. At the very least, I think you want to skip the test on Chromium.
Comment 24 Keunsoon Lee 2011-06-16 09:17:33 PDT
I'll ask it again.

But, I have a doubt. Why does only this patch cause red bubble?
Third patch has almost same LayoutTest, and it is passed.

Do you have an idea?
Comment 25 Keunsoon Lee 2011-06-16 10:56:07 PDT
To skip test for chromium, the layout test's path should be registered on LayoutTests/platform/chromium/test_expectations.txt. 

But, this file is theirs, so, I'm a little bit reluctant to modify it.
To skip this problem, we can move out this layout test to LayoutTests/platform/gtk and LayoutTests/platform/efl.

Furthermore, I still think this problem is not only for gtk and soup, and they also should handle this problem case.
In this case, we can let them know their port has this bug.

Or, we can do both of them.
We can register this test case skip on LayoutTests/platform/chromium/test_expectations.txt, and let them know this. 
Besides, I think chromium's review is necessary if I modify their layout test file.

What do you think so?
Comment 26 Keunsoon Lee 2011-06-16 11:08:09 PDT
Created attachment 97468 [details]
add the layout test on chromium's layout test_expectations.txt to skip test
Comment 27 Martin Robinson 2011-06-16 11:09:24 PDT
(In reply to comment #25)

> What do you think so?

I think that skipping the test and informing the Chromium team is the way to go here. Please take a look at who has been working in Source/WebCore/platform/network/chromium and CC them on this bug. Even better if you can also ping them on IRC and get their input.

Since, as you say, this test is useful for finding bugs in other ports, I think we should not put it in LayoutTests/platform.
Comment 28 Martin Robinson 2011-06-16 11:11:12 PDT
Comment on attachment 97468 [details]
add the layout test on chromium's layout test_expectations.txt to skip test

View in context: https://bugs.webkit.org/attachment.cgi?id=97468&action=review

Thanks for updating the patch! There are a few minor issues which I've pointed out below.

> LayoutTests/ChangeLog:15
> +        * http/tests/loading/resources/status-code-error-with-response-body.php: Added.
> +            Accepting status code and creating response having body with the received status code
> +        * http/tests/loading/resources/status-code-error-without-response-body.php: Added.
> +            Accepting status code and creating response having no body with the received status code
> +        * http/tests/loading/status-code-error-without-response-body-expected.txt: Added.
> +        * http/tests/loading/status-code-error-without-response-body.html: Added.
> +            Sending XMLHttpRequest and check if return callback is onerror or onload for all 4xx and 5xx status codes.
> +

You forgot to mention the change to the Skipped file in your ChangeLog.

> LayoutTests/platform/chromium/test_expectations.txt:216
> +// This test fails, but does not have bug number
> +WONTFIX SKIP : http/tests/loading/status-code-error-without-response-body.html = FAIL
> +

Please open a bug for this issue. I don't think WONTFIX is the right flag for this sort of issue, but you can double-check with the Chromium people
Comment 29 Keunsoon Lee 2011-06-16 11:37:21 PDT
Created attachment 97472 [details]
add bug number on chromium test_expectations.txt

Chromium port people said they will create a new thread for this problem case after we finishing this thread.

Thank you.
Comment 30 Martin Robinson 2011-06-16 11:41:27 PDT
Comment on attachment 97472 [details]
add bug number on chromium test_expectations.txt

Great. Thank you!
Comment 31 WebKit Review Bot 2011-06-16 11:54:21 PDT
Comment on attachment 97472 [details]
add bug number on chromium test_expectations.txt

Clearing flags on attachment: 97472

Committed r89055: <http://trac.webkit.org/changeset/89055>
Comment 32 WebKit Review Bot 2011-06-16 11:54:31 PDT
All reviewed patches have been landed.  Closing bug.
Comment 33 Robert Hogan 2011-06-16 12:07:59 PDT
I've noticed at https://bugs.webkit.org/show_bug.cgi?id=62515 that calling didFail() on a 412 response to an XMLHTTPRequest will cause the error code to get reset to zero. It seems to me 'didFail()' is for network-layer errors rather than server or client errors. Am I wrong in thinking this? How does GTK handle the test at the above bug?
Comment 34 Ryosuke Niwa 2011-06-16 14:18:43 PDT
The test is failing on Qt: http://build.webkit.org/results/Qt%20Linux%20Release/r89060%20(34236)/results.html
Comment 35 James Simonsen 2011-06-16 15:29:11 PDT
I disagree with the behavior of the new test. Following the XHR specs here:

http://www.w3.org/TR/XMLHttpRequest/ "Infrastructure for the send() method"
http://www.w3.org/TR/XMLHttpRequest2/ "Infrastructure for the send() method"

"If there is a network error
In case of DNS errors, TLS negotiation failure, or other type of network errors, this is a network error. Do not request any kind of end user interaction.

Note: This does not include HTTP responses that indicate some type of error, such as HTTP status code 410."

Instead, we should work our way through the state machine to here:

"Switch to the DONE state."

There is no special handling of status codes for responses without bodies. In all cases, we should fire onload.

I'd still like to keep this XHR test, since it's a worthwhile thing to test (and it's already caught bugs). I will fix it up to match the spec and move it to the xmlhttprequest directory.

It seems the original bug is still untested though. It's about the main resource, not XHRs. There ought to be a test specifically for that.
Comment 36 James Simonsen 2011-06-16 17:28:46 PDT
Created attachment 97526 [details]
Patch
Comment 38 Ryosuke Niwa 2011-06-16 17:33:05 PDT
I'm sorry but it seems like we should roll this patch out because:
1. James points out that new behavior may not be conforming to HTTP spec
2. The added test is failing on all ports but GTK.
Comment 39 James Simonsen 2011-06-16 17:39:11 PDT
(In reply to comment #38)
> I'm sorry but it seems like we should roll this patch out because:
> 1. James points out that new behavior may not be conforming to HTTP spec
> 2. The added test is failing on all ports but GTK.

Yeah, sounds good to me.

We can land the fixed test sometime later, which will pass on windows and mac. That's in the patch I uploaded.
Comment 40 Eric Seidel (no email) 2011-06-18 14:08:52 PDT
Comment on attachment 97526 [details]
Patch

Cleared review? from attachment 97526 [details] so that this bug does not appear in http://webkit.org/pending-review.  If you would like this patch reviewed, please attach it to a new bug (or re-open this bug before marking it for review again).