Bug 111380 - Handle CRLF in MIME types of Blobs appended to multipart FormData
Summary: Handle CRLF in MIME types of Blobs appended to multipart FormData
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 111603
Blocks:
  Show dependency treegraph
 
Reported: 2013-03-04 16:33 PST by Victor Costan
Modified: 2013-04-10 10:12 PDT (History)
11 users (show)

See Also:


Attachments
Patch (6.94 KB, patch)
2013-03-04 16:41 PST, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (7.20 KB, patch)
2013-03-04 20:00 PST, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (7.88 KB, patch)
2013-03-04 21:18 PST, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (10.11 KB, patch)
2013-03-05 02:45 PST, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (10.15 KB, patch)
2013-03-05 10:01 PST, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (13.38 KB, patch)
2013-03-06 00:03 PST, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (12.88 KB, patch)
2013-03-09 17:46 PST, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (12.08 KB, patch)
2013-04-05 17:20 PDT, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (15.20 KB, patch)
2013-04-06 17:39 PDT, Victor Costan
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from webkit-ews-02 for mac-mountainlion (883.61 KB, application/zip)
2013-04-06 20:07 PDT, Build Bot
no flags Details
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion (897.81 KB, application/zip)
2013-04-06 21:11 PDT, Build Bot
no flags Details
Archive of layout-test-results from webkit-ews-14 for mac-mountainlion-wk2 (613.47 KB, application/zip)
2013-04-07 03:32 PDT, Build Bot
no flags Details
Patch (19.43 KB, patch)
2013-04-08 13:20 PDT, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (27.10 KB, patch)
2013-04-09 04:23 PDT, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (30.29 KB, patch)
2013-04-09 21:47 PDT, Victor Costan
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2 (913.58 KB, application/zip)
2013-04-09 22:16 PDT, Build Bot
no flags Details
Patch (32.76 KB, patch)
2013-04-09 22:48 PDT, Victor Costan
no flags Details | Formatted Diff | Diff
Patch (34.56 KB, patch)
2013-04-10 00:22 PDT, Victor Costan
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Victor Costan 2013-03-04 16:33:02 PST
This is a follow-up to the conversation in https://bugs.webkit.org/show_bug.cgi?id=111255

I am working on a patch that adds a test case verifying that FormData filenames and MIME types with special characters are escaped properly in HTTP headers.
Comment 1 Victor Costan 2013-03-04 16:41:19 PST
Created attachment 191349 [details]
Patch
Comment 2 Build Bot 2013-03-04 19:04:14 PST
Comment on attachment 191349 [details]
Patch

Attachment 191349 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-commit-queue.appspot.com/results/16986005

New failing tests:
http/tests/local/formdata/send-form-data-filename-escaping.html
Comment 3 Victor Costan 2013-03-04 20:00:40 PST
Created attachment 191383 [details]
Patch
Comment 4 Darin Adler 2013-03-04 21:03:39 PST
Comment on attachment 191383 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=191383&action=review

> LayoutTests/ChangeLog:13
> +        * http/tests/local/formdata/send-form-data-with-filename.html: Better test description.

Expected results for this test need to be updated if you change the test description. I don’t see those changes in the patch.
Comment 5 Darin Adler 2013-03-04 21:04:11 PST
Comment on attachment 191383 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=191383&action=review

> LayoutTests/ChangeLog:11
> +        * http/tests/local/formdata/send-form-data-filename-escaping-expected.html: Added.

Filename here is wrong. Should be txt not html.
Comment 6 Victor Costan 2013-03-04 21:18:21 PST
Created attachment 191393 [details]
Patch
Comment 7 Victor Costan 2013-03-04 21:21:50 PST
@Darin Adler: Thank you so much for the quick answer, and I'm really sorry for wasting your time with those mistakes! I'm still getting used to the flow. I hope the patch I just uploaded addresses your feedback.

Can you please take another look? I'll also watch the buildbot status.
Comment 8 Alexey Proskuryakov 2013-03-04 23:26:28 PST
Comment on attachment 191393 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=191393&action=review

> LayoutTests/http/tests/local/formdata/send-form-data-filename-escaping-expected.txt:8
> +file1=not-hax.txt:text/plain:1234567890&file3=more%22hax%0D%0A.txt:application/h\"ax:
> +1234567890&file4=also-not-hax.txt:text/plain:1234567890

Tests for dangerous characters in Content-Type were added in bug 99983, so you probably don't need to add them here, unless you noticed an opportunity to extend test coverage.

This test output is not very clear. What are the real pass/fail criteria? I can see a newline in the output - why is that not bad?
Comment 9 WebKit Review Bot 2013-03-05 00:01:34 PST
Comment on attachment 191393 [details]
Patch

Attachment 191393 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://webkit-commit-queue.appspot.com/results/16990087

New failing tests:
http/tests/local/formdata/send-form-data-filename-escaping.html
Comment 10 WebKit Review Bot 2013-03-05 00:41:37 PST
Comment on attachment 191393 [details]
Patch

Attachment 191393 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://webkit-commit-queue.appspot.com/results/17046004

New failing tests:
http/tests/local/formdata/send-form-data-filename-escaping.html
Comment 11 Victor Costan 2013-03-05 02:45:27 PST
Created attachment 191445 [details]
Patch
Comment 12 Victor Costan 2013-03-05 02:54:26 PST
@Alexey Proskuryakov: I thought the newline is caused by some line wrapping somewhere, and I realized that's not true. In fact, this was showing that the newline leaks from the MIME type.

The tests that you pointed me to seem to cover the code in XMLHttpRequest::send(Blob* body, ExceptionCode& ec), whereas my tests cover FormDataBuilder::addContentTypeToMultiPartHeader(Vector<char>& buffer, const CString& mimeType)

I changed the test case to sneak in a Content-Length header. Without the FormDataBuilder patch, that header doesn't show up in the MIME type or file contents, so I'm guessing it is sent as a header.
Comment 13 Victor Costan 2013-03-05 03:05:17 PST
Also, for the record, explanation for my approach to fixing this problem.

The File API spec [1] says that the type parameter must be used as the Content-Type header value even if it's not a valid MIME type. At the same time, the HTTP spec [2] says that a HTTP receiver may replace any LWS (CR+LF in a header value) by one SP (space) without changing the semantics of the HTTP header value. So I'm taking advantage of this and replacing CRLF -> SP in the Content-Type header value.

[1] http://www.w3.org/TR/FileAPI/#constructorParams
[2] http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
Comment 14 WebKit Review Bot 2013-03-05 04:38:04 PST
Comment on attachment 191445 [details]
Patch

Attachment 191445 [details] did not pass chromium-ews (chromium-xvfb):
Output: http://webkit-commit-queue.appspot.com/results/17045097

New failing tests:
http/tests/local/formdata/send-form-data-filename-escaping.html
Comment 15 Victor Costan 2013-03-05 10:01:34 PST
Created attachment 191509 [details]
Patch
Comment 16 Victor Costan 2013-03-05 10:05:58 PST
I managed to build and test on Chromium for Mac and Linux. It seems like the quote (") in Content-Type shows up as \" everywhere except on Chromium/Linux.

I'm not sure if this is due to the testing environment or some piece of code that completely eludes me, but 1) I think it doesn't really matter, only CRLF can mislead HTTP header parsers and 2) if it does matter, I think it can be the subject of another patch.
Comment 17 Alexey Proskuryakov 2013-03-05 22:13:02 PST
> The tests that you pointed me to seem to cover the code in XMLHttpRequest::send(Blob* body, ExceptionCode& ec), whereas my tests cover FormDataBuilder::addContentTypeToMultiPartHeader(Vector<char>& buffer, const CString& mimeType)

Ah, I finally see. So that was about HTTP headers, and this is about MIME headers in multipart form submission. Neat.
Comment 18 Alexey Proskuryakov 2013-03-05 22:45:38 PST
Comment on attachment 191509 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=191509&action=review

I do not see anything in relevant specs that says how to handle invalid types for serialization. What I suggest doing is to check what Firefox does, probably match its behavior, and likely file a bug against XMLHttpRequest specification.

> Source/WebCore/platform/network/FormDataBuilder.cpp:82
> +static void appendStringWithoutCrLf(Vector<char>& buffer, const CString& string)

WebKit style would be "appendStringReplacingCRLF".

> Source/WebCore/platform/network/FormDataBuilder.cpp:84
> +    unsigned length = string.length();

CString::length() returns a size_t, not an unsigned.

> Source/WebCore/platform/network/FormDataBuilder.cpp:85
> +    for (unsigned i = 0; i < length; ++i) {

So index should be size_t too.

> Source/WebCore/platform/network/FormDataBuilder.cpp:86
> +        unsigned char c = string.data()[i];

CString::data() returns a char, not an unsigned char.

> Source/WebCore/platform/network/FormDataBuilder.cpp:87
> +        if (c == 0x0d && i + 1 < length && string.data()[i + 1] == 0x0a) {

It is not sufficient to replace CRLF sequences. Receivers are likely to treat isolated CR and LF bytes as line separators (they certainly do that in HTTP headers).

> Source/WebCore/platform/network/FormDataBuilder.cpp:88
> +            // HTTP receivers may replace any LWS with a single SP.

I don't think that HTTP rules apply here, as these are MIME headers.

Generation of MIME headers is specified by HTML5 "multipart/form-data encoding algorithm", which quickly defers to RFC 2388. Said RFC doesn't consider the possibility of invalid type, just saying that Content-Type defaults to text/plain if not known.

> LayoutTests/http/tests/local/formdata/send-form-data-filename-escaping-expected.txt:7
> +file1=before-hax.txt:text/plain:1234567890&file2=more%22hax%0D%0A.txt%0D%0A:application/hax content-length: 20 :1234567890&file3=after-hax.txt:text/plain:1234567890

My previous comment still stands - the test does not have clear pass/fail criteria. If something changes in the output five years later, how will we know if the test really fails, or it's a benign change?

I suggest staring with splitting the test into multiple subtests, each with its own output and pass criteria.

> LayoutTests/http/tests/local/formdata/send-form-data-filename-escaping.html:42
> +    testFailed("This test is not interactive, please run using DumpRenderTree");

Is this actually true? What prevents the test from passing?

For Content-Type, we don't need to have a real file, we can build the blob dynamically. If some subtests can be run interactively and others can not, please split this into multiple .html tests.
Comment 19 Victor Costan 2013-03-06 00:03:43 PST
Created attachment 191664 [details]
Patch
Comment 20 Victor Costan 2013-03-06 00:07:33 PST
@Alexey Proskuryakov: I tried this out on Firefox Aurora, and it seems that they replace CR and LF with one space, and replace a CRLF sequence with one space.

I uploaded a patch that implements the same behavior, and addresses your other feedback. I also applied your feedback to the chunk of code that I had used as an example.

I restructured the test case. Is this getting close to what you'd like to see here?

Thank you very much for your patience and explanations!
Comment 21 Alexey Proskuryakov 2013-03-06 08:28:28 PST
I suggested filing a bug against XMLHttpRequest, however Anne thinks that Blob itself should be restricting characters in MIME types, and posted a suggestion to public-webapps mailing list: <http://lists.w3.org/Archives/Public/public-webapps/2013JanMar/0633.html>. It was then pointed out that WebKit already implements non-compliant restrictions on the type, and perhaps that should be extended to cover newlines, and standardized.

Either matching Firefox or doing this is fine with me. Let's wait a few days to see what the conclusion on the list is.
Comment 22 Victor Costan 2013-03-06 08:53:59 PST
@Alexey Proskuryakov: Would it be OK to land this change today, if my promise that I will file a patch implementing the outcome of the public-webapps thread when it comes out?

I think the changes based on your feedback to appendQuotedString in FormDataBuilder.cpp make sense even if the CR/LF filtering happens in the Blob constructor.
Comment 23 Alexey Proskuryakov 2013-03-06 10:07:20 PST
If there is a reason to land this earlier, please say so. I cannot tell from your e-mail address if you work for a company that ships a WebKit based product, and could need this in an upcoming update.

There is an active discussion on the mail thread, and not waiting for it could mean extra work for us all.

You have improved the test case based on my suggestions, but I would really like to have an interactive test if possible, and to have pass/fail criteria. I still don't see how we would triage changes in this test's output in the future, as the output is quite cryptic.

> I think the changes based on your feedback to appendQuotedString in FormDataBuilder.cpp make sense even if the CR/LF filtering happens in the Blob constructor.

Could you please clarify? Keeping unreachable code in the tree makes it harder to learn how WebKit works, because one would assume that the code is reachable somehow.
Comment 24 Victor Costan 2013-03-06 10:19:18 PST
Sorry, I was unclear when I wrote the previous comment. appendStringReplacingCRLF is the method I wrote, and that will be unreachable (actually, I will remove it) if the CR/LF prevention ends up being done in Blob.

However, I wrote that method trying to match the implementation in appendQuotedString (an existing method) as much as possible. I believe that your feedback applies to appendQuotedString as well, and that method is used to process the Blob filename, so it will not be dead code, no matter what ends up happening with the MIME type.


Regarding the tests: I'm sorry, I think I'm not getting it. Can you please point me to some examples of good interactive tests and/or documentation on what makes a good interactive test?
Comment 25 Alexey Proskuryakov 2013-03-06 10:32:30 PST
An interactive test may not be a very good name for it, I was just using it because of the comment in test, saying "This test is not interactive, please run using DumpRenderTree".

You removed the check in the latest patch (which I admittedly didn't notice at first), but does the test actually work when dragged into a browser window? The idea is that it's easier to debug failing tests in a browser, and it's also valuable to have tests that can be easily run inside other browsers for comparison.

A test that does not run in browser should print an explanation, like this one used to do.

> I believe that your feedback applies to appendQuotedString as well, and that method is used to process the Blob filename, so it will not be dead code, no matter what ends up happening with the MIME type.

This is correct. It does not make sense to wait for Blob discussion to do this.

Ideally, you would file a new bug just for cleanup, which you have a lot of in this patch (changing description of another test, cleaning up shared test script to use dataList[i].type instead of dataList[i]['type']).
Comment 26 Victor Costan 2013-03-06 10:59:09 PST
Thank you for the guidance, I will file a cleanup bug.

The test "almost" works in a browser. It currently fails due to CORS restrictions when ran from the file:// URL provided by the test result.

I tried serving it off of the http server started by Tools/Scripts/run-webkit-httpd, but it won't work because it uses some files outside of LayoutTests/http (LayoutTest/fast/js/resources/js-test-{pre,post}.js)

On the other hand, I did use this test case to figure out Firefox's behavior, so I think it can be made to work properly in a browser with proper infrastructure, e.g. a variant of multipart-post-echo.php that emits CORS headers. Would it make sense for me to write that?
Comment 27 Victor Costan 2013-03-06 11:29:52 PST
@Alexey Proskuryakov: I filed a new bug for the cleanup. Can you please take a look?
https://bugs.webkit.org/show_bug.cgi?id=111603
Comment 28 Alexey Proskuryakov 2013-03-06 11:53:55 PST
> I tried serving it off of the http server started by Tools/Scripts/run-webkit-httpd, but it won't work because it uses some files outside of LayoutTests/http (LayoutTest/fast/js/resources/js-test-{pre,post}.js)

These are available to http tests as well. Just use src="/js-test-resources/js-test-pre.js".
Comment 29 Benjamin Poulain 2013-03-07 01:18:55 PST
Comment on attachment 191664 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=191664&action=review

> Source/WebCore/platform/network/FormDataBuilder.cpp:89
> +            if (i + 1 < length && string.data()[i + 1] == 0x0a)

Unlike WTF::String, CString is zero terminated.

Which mean you can do i + 1 on the last character, you will just get null. Thus the code could be:
    ASSERT_WITH_MESSAGE(i + 1 <= length, "Do not read past CString terminating null.");
    if (string.data()[i + 1] == 0x0a)
Comment 30 Victor Costan 2013-03-09 17:46:29 PST
Created attachment 192358 [details]
Patch
Comment 31 Victor Costan 2013-03-09 17:52:42 PST
@Benjamin Poulain: thank you very much for catching that!

@Alexey Proskuryakov: /js-test-resources/js-test-{pre, post}.js works on Tools/Scripts/run-webkit-httpd, thank you very much for telling me about this!

Unfortunately, that path doesn't seem to work when I run tests with Tools/Scripts/run-webkit-tests. Also, I noticed you're not using that path in your empty filename test.

I used a horrible hack to get tests working in both places -- I have a <script> with the relative path, and one with the /js-test-resources path. I submitted the patch like this to get feedback from you and the buildbots. What do you think I should do?

Also, I reworked the tests. Am I heading in the right direction?
Comment 32 Alexey Proskuryakov 2013-03-09 19:33:05 PST
> Also, I noticed you're not using that path in your empty filename test.

This is because these tests are in http/tests/local. This directory is special - http server is available when they are run, but they are loaded from file:// URLs.

HTTP tests should use "/js-test-resources". http/tests/local tests should be avoided unless absolutely necessary, and then they should use relative paths to fast/js/resources.
Comment 33 Victor Costan 2013-03-09 21:16:33 PST
@Alexey Proskuryakov: thank you for clarifying!

In that case, why don't the synchronous XHRs fail due to CORS? And, if I can't open a test in a browser, how is it interactive?

Sorry I keep asking questions, and thank you for your patience!
Comment 34 Alexey Proskuryakov 2013-03-09 22:02:10 PST
Local files have universal access when run as tests (like when disabling restrictions for local files in Safari's Develop menu).

In hindsight, one of my tests in bug 111687 shouldn't have been in http/tests/local.
Comment 35 Build Bot 2013-03-10 08:22:55 PDT
Comment on attachment 192358 [details]
Patch

Attachment 192358 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-commit-queue.appspot.com/results/17158002

New failing tests:
fast/css/sticky/inline-sticky.html
Comment 36 Alexey Proskuryakov 2013-04-05 11:42:59 PDT
Looks like the spec has been updated. Victor, would you be willing to check if we are happy with the updated text, and update the patch accordingly?
Comment 37 Victor Costan 2013-04-05 13:59:40 PDT
@Alexey Proskuryakov: Thank you for the heads-up! I will do that today.
Comment 38 Victor Costan 2013-04-05 16:45:15 PDT
@Alexey Proskuryakov: I looked at the specification. I know what to do, in terms of the patch, and I'll start working on it.

At the same time, I think the specifications are not as clear as they could be, and I'd like to ask for the following changes on the mailing list. Can you please take a look and tell me if this looks reasonable?


The File API specification describes a good method for interpreting a Blob's type property as MIME type, in 6.4.1 Paragraph 4. http://dev.w3.org/2006/webapi/FileAPI/#methodsandparams-blob

I think this method should be referenced in the HTML 5.1 specification, in Section 4.10.22.7 (Multipart form data), in a new paragraph describing the handling of Blob form data. 
http://www.w3.org/html/wg/drafts/html/master/forms.html#multipart/form-data-encoding-algorithm

I think the method should also be referenced in the XMLHttpRequest specification, in Section 4.6.6, Paragraph 4, in the sub-section describing Blob handling.
https://dvcs.w3.org/hg/xhr/raw-file/tip/Overview.html#the-send()-method
Comment 39 Victor Costan 2013-04-05 17:20:39 PDT
Created attachment 196705 [details]
Patch
Comment 40 Victor Costan 2013-04-06 17:39:04 PDT
Created attachment 196765 [details]
Patch
Comment 41 Build Bot 2013-04-06 20:07:16 PDT
Comment on attachment 196765 [details]
Patch

Attachment 196765 [details] did not pass mac-ews (mac):
Output: http://webkit-commit-queue.appspot.com/results/17438694

New failing tests:
fast/files/blob-slice-overflow.html
http/tests/local/blob/send-hybrid-blob.html
http/tests/misc/empty-file-formdata.html
http/tests/local/formdata/send-form-data-with-sliced-file.html
http/tests/local/blob/send-sliced-data-blob.html
fast/files/blob-slice-test.html
fast/files/read-blob-async.html
http/tests/local/formdata/form-data-with-unknown-file-extension.html
http/tests/local/formdata/send-form-data-with-empty-blob-filename.html
Comment 42 Build Bot 2013-04-06 20:07:19 PDT
Created attachment 196767 [details]
Archive of layout-test-results from webkit-ews-02 for mac-mountainlion

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-02  Port: mac-mountainlion  Platform: Mac OS X 10.8.2
Comment 43 Build Bot 2013-04-06 21:11:06 PDT
Comment on attachment 196765 [details]
Patch

Attachment 196765 [details] did not pass mac-ews (mac):
Output: http://webkit-commit-queue.appspot.com/results/17442667

New failing tests:
fast/files/blob-slice-overflow.html
http/tests/local/blob/send-hybrid-blob.html
http/tests/misc/empty-file-formdata.html
http/tests/local/formdata/send-form-data-with-sliced-file.html
http/tests/local/blob/send-sliced-data-blob.html
fast/files/blob-slice-test.html
fast/files/read-blob-async.html
http/tests/local/formdata/form-data-with-unknown-file-extension.html
http/tests/local/formdata/send-form-data-with-empty-blob-filename.html
Comment 44 Build Bot 2013-04-06 21:11:08 PDT
Created attachment 196768 [details]
Archive of layout-test-results from webkit-ews-01 for mac-mountainlion

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-01  Port: mac-mountainlion  Platform: Mac OS X 10.8.2
Comment 45 Build Bot 2013-04-07 03:32:26 PDT
Comment on attachment 196765 [details]
Patch

Attachment 196765 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-commit-queue.appspot.com/results/17527648

New failing tests:
http/tests/local/blob/send-sliced-data-blob.html
fast/files/blob-slice-test.html
fast/files/blob-slice-overflow.html
http/tests/local/formdata/send-form-data-with-empty-blob-filename.html
http/tests/misc/empty-file-formdata.html
Comment 46 Build Bot 2013-04-07 03:32:30 PDT
Created attachment 196774 [details]
Archive of layout-test-results from webkit-ews-14 for mac-mountainlion-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-14  Port: mac-mountainlion-wk2  Platform: Mac OS X 10.8.2
Comment 47 Victor Costan 2013-04-07 09:28:30 PDT
The crashes seem to be caused by the "return emptyString()". They don't appear if I replace that with "return contentType" or "return contentType.lower()". Yet, for the life of me, I can't figure out what I'm doing wrong. I've seen other methods with a String type return emptyString() just fine. I'll read more code today or tomorrow.
Comment 48 Alexey Proskuryakov 2013-04-08 10:36:15 PDT
The crash is actually here:

bool Blob::isValidContentType(const String& contentType)
{
    size_t length = contentType.length();
    if (contentType.is8Bit()) {

The is8bit function cannot be called on null strings. You should probably add an early return for this case.

> The crashes seem to be caused by the "return emptyString()". They don't appear if I replace that with "return contentType" or "return contentType.lower()".

That's surprising.
Comment 49 Victor Costan 2013-04-08 13:20:05 PDT
Created attachment 196947 [details]
Patch
Comment 50 Victor Costan 2013-04-08 13:35:44 PDT
Alexey, thank you so much for getting me unstuck!

I tried to add some comments to WTFString, covering your answer and what I learned on my own. I hope this patch works out, and I'll work on a test for Blob.slice().

What do you think of my comments for the spec, do they seem reasonable?
Comment 51 Alexey Proskuryakov 2013-04-08 16:59:40 PDT
Comment on attachment 196947 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=196947&action=review

The spec says that the type should be normalized when Blob is created (with constructor or slice()). But in this patch, normalization is not done in the constructor. Is this intentional?

This can be observed through (new Blob([], {type:'<invalid string>'})).type value. Please make this match the spec, and add a new test case (in a separate file) for this.

> Source/WTF/ChangeLog:7
> +        * wtf/text/WTFString.h: usage comments for characters{,8,16}() and is8Bit()
> +        (WTF::String::is8Bit): now ASSERTs that the String is not null, so failures are more expressive

Please split this into a separate patch. This is not really related to the change you are making.

> Source/WebCore/ChangeLog:10
> +        (WebCore):

It's best to remove useless parts of generated ChangeLog. The part generated by prepare-ChangeLog is meant to be helpful to humans, not prescriptive.

> Source/WebCore/ChangeLog:14
> +        (Blob):

Ditto.

> Source/WebCore/fileapi/Blob.h:77
> +    // The checks described in the File API spec for Blob.slice().

I think that it's helpful to have this comment  - there are many definitions of "valid" indeed.

But it's not just about Blob.slice() - Blob constructor algorithm in the spec says the same. So, perhaps the comment should say something less specific, like maybe "// As defined by File API specification."

> Source/WebCore/fileapi/Blob.h:79
> +    static bool isValidContentType(const String& type);
> +    static String normalizedContentType(const String& contentType);

The argument names don't seem helpful, please remove them.

> Source/WebCore/platform/network/FormData.cpp:266
> +                String normalizedContentType = Blob::normalizedContentType(value.blob()->type());
>                  String contentType;
> -                if (value.blob()->type().isEmpty())
> +                if (normalizedContentType.isEmpty())
>                      contentType = "application/octet-stream";
>                  else
> -                    contentType = value.blob()->type();
> +                    contentType = normalizedContentType;

This can be written in a simpler way:

String contentType = Blob::normalizedContentType(value.blob()->type());
if (contentType.isEmpty())
    contentType = "application/octet-stream";

> LayoutTests/http/tests/local/formdata/send-form-data-filename-escaping.html:5
> +<script src="/js-test-resources/js-test-pre.js"></script>
> +<script src="../../../../fast/js/resources/js-test-pre.js"></script>

Why do you need both?

> LayoutTests/http/tests/local/formdata/send-form-data-filename-escaping.html:68
> +<script src="/js-test-resources/js-test-post.js"></script>
> +<script src="../../../../fast/js/resources/js-test-post.js"></script>

Ditto.

> LayoutTests/http/tests/local/formdata/send-form-data-mimetype-normalization.html:5
> +<script src="/js-test-resources/js-test-pre.js"></script>
> +<script src="../../../../fast/js/resources/js-test-pre.js"></script>

Ditto.

> LayoutTests/http/tests/local/formdata/send-form-data-mimetype-normalization.html:57
> +<script src="/js-test-resources/js-test-post.js"></script>
> +<script src="../../../../fast/js/resources/js-test-post.js"></script>

Ditto.

> LayoutTests/http/tests/local/formdata/resources/send-form-data-common.js:26
>  function dumpResponse(xhr, fileSliced)
>  {
>      debug(xhr.responseText);
> +    return xhr.responseText;
>  }

I don't quite understand the changes in this file. Could you please explain them in ChangeLog?
Comment 52 Alexey Proskuryakov 2013-04-08 17:06:59 PDT
> The File API specification describes a good method for interpreting a Blob's type property as MIME type, in 6.4.1 Paragraph 4. http://dev.w3.org/2006/webapi/FileAPI/#methodsandparams-blob

I'm not sure if this is what this text says, perhaps we are looking at different versions of it. Currently, 6.4.1.4 is about converting an input string to a value for the type attribute, not about using it afterwards. As such, it's entirely internal to the File API spec.

There is certainly a gap in specifications, as they never say that Blob's type should be used in form submission. But I'm not sure how to best address that.
Comment 53 Victor Costan 2013-04-08 17:43:26 PDT
You're totally right, I was looking at an older spec. I will re-read the spec and re-do the patch, it really makes a lot more sense to do the filtering in the constructor.
Comment 54 Victor Costan 2013-04-09 04:23:20 PDT
Created attachment 197025 [details]
Patch
Comment 55 Alexey Proskuryakov 2013-04-09 10:41:59 PDT
Comment on attachment 197025 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=197025&action=review

I like the new behavior, but I think that another round is needed to improve style.

> Source/WebCore/fileapi/Blob.h:48
> +// Utilities for handling a Blob's type attribute.
> +namespace BlobType {

It is unusual in WebKit code base to put helpers in a namespace. And we try to not define two entities in one header file.

What you had in the previous iteration was perfectly fine, these functions can be in Blob class.

> Source/WebCore/platform/network/BlobData.h:169
> +// Declared in Blob.h, which includes this header.
> +namespace BlobType {
> +bool isNormalized(const String&);
> +} // namespace WebCore::BlobType

I understand why you needed this, but it's not a great thing to do. The correct things to do would be:

- put the functions in a separate header file,
- or just move setContentType() to BlobData.cpp. It's not hot code, and the function does not need to be inline.

> Source/WebCore/platform/network/BlobData.h:183
> +        // FIXME: This would be useful, but it causes a link error in WebKit2.framework
> +        // ASSERT(BlobType::isNormalized(contentType));

The reason why this fails to link is that the function needs to be exported using WebCore.exp.in file.

> LayoutTests/http/tests/local/formdata/send-form-data-mimetype-normalization.html:57
> +<script src="/js-test-resources/js-test-post.js"></script>
> +<script src="../../../../fast/js/resources/js-test-post.js"></script>

I still don't understand why both are needed.

> LayoutTests/http/tests/local/formdata/resources/send-form-data-common.js:48
> +    if (sendAsAsync)
> +        return null;
> +    else
> +        return dumpResponse(xhr, fileSliced);

WebKit style is to not have an "else" after "return".

I'd still like to have an explanation of the changes to this file in a ChageLog.
Comment 56 Victor Costan 2013-04-09 21:47:23 PDT
Created attachment 197204 [details]
Patch
Comment 57 Victor Costan 2013-04-09 22:00:41 PDT
Alexey, thank you very much for your patience and feedback!

I hope I addressed all the feedback points this time around.

The double <script>s were hacks to get tests in http/tests/local to work when served from the webkit-httpd server. The right solution was to move them out of local. Is http/tests/xmlhttprequest a reasonable place for them to live in?

The JS "returns" pass the HTTP response text returned from the synchronous XHR back to the main JS test file, so I can check it. I needed this to make the tests interactive.

Does this make sense?
Comment 58 Build Bot 2013-04-09 22:16:53 PDT
Comment on attachment 197204 [details]
Patch

Attachment 197204 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-commit-queue.appspot.com/results/17739014

New failing tests:
http/tests/xmlhttprequest/post-blob-content-type-async.html
fast/files/blob-constructor.html
http/tests/xmlhttprequest/post-blob-content-type-sync.html
fast/repaint/japanese-rl-selection-repaint-in-regions.html
Comment 59 Build Bot 2013-04-09 22:16:56 PDT
Created attachment 197207 [details]
Archive of layout-test-results from webkit-ews-09 for mac-mountainlion-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-09  Port: mac-mountainlion-wk2  Platform: Mac OS X 10.8.2
Comment 60 Alexey Proskuryakov 2013-04-09 22:19:03 PDT
> Is http/tests/xmlhttprequest a reasonable place for them to live in?

http/tests/fileapi might be more appropriate. What do you think?

I'll take a deeper look at the patch tomorrow.
Comment 61 Victor Costan 2013-04-09 22:48:08 PDT
Created attachment 197210 [details]
Patch
Comment 62 Victor Costan 2013-04-09 22:50:40 PDT
Alexey, thank you for the suggestion!

I moved the XHR-FormData-Blob tests to http/tests/fileapi and I fixed the tests that the buildbot complained about, except for fast/repaint/japanese-rl-selection-repaint-in-regions.html. I didn't see anything related to blobs there.
Comment 63 Victor Costan 2013-04-10 00:22:20 PDT
Created attachment 197217 [details]
Patch
Comment 64 Alexey Proskuryakov 2013-04-10 09:35:42 PDT
Comment on attachment 197217 [details]
Patch

r=me
Comment 65 WebKit Commit Bot 2013-04-10 10:12:14 PDT
Comment on attachment 197217 [details]
Patch

Clearing flags on attachment: 197217

Committed r148105: <http://trac.webkit.org/changeset/148105>
Comment 66 WebKit Commit Bot 2013-04-10 10:12:21 PDT
All reviewed patches have been landed.  Closing bug.