193909 – Implement serializing in MIME type parser

RESOLVED FIXED193909

Implement serializing in MIME type parser

https://bugs.webkit.org/show_bug.cgi?id=193909

Summary Implement serializing in MIME type parser

Rob Buis

Reported 2019-01-28 08:09:36 PST

Implement: https://mimesniff.spec.whatwg.org/#serialize-a-mime-type

Attachments
Patch (9.49 KB, patch) 2019-02-07 12:49 PST, Rob Buis	no flags	Details Formatted Diff Diff
Patch (15.56 KB, patch) 2019-02-10 05:23 PST, Rob Buis	no flags	Details Formatted Diff Diff
Patch (16.95 KB, patch) 2019-02-11 09:35 PST, Rob Buis	no flags	Details Formatted Diff Diff
Patch (16.60 KB, patch) 2019-02-17 12:12 PST, Rob Buis	ews-watchlist: commit-queue-	Details Formatted Diff Diff
Archive of layout-test-results from ews101 for mac-highsierra (2.42 MB, application/zip) 2019-02-17 13:13 PST, EWS Watchlist	no flags	Details
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Rob Buis

Comment 1 2019-02-07 12:49:08 PST

Created attachment 361430 [details] Patch

Darin Adler

Comment 2 2019-02-09 12:25:47 PST

Comment on attachment 361430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=361430&action=review Looks great. Could probably pile on a few more test cases. > Source/WebCore/platform/network/ParsedContentType.cpp:332 > - m_mimeType.convertToASCIILowercase(); > + m_mimeType = m_mimeType.convertToASCIILowercase(); Wow, this seems like a major bug. The old code didn’t do anything at all. I guess your new regression tests cover this. > Source/WebCore/platform/network/ParsedContentType.cpp:351 > - keyName.convertToASCIILowercase(); > + name = name.convertToASCIILowercase(); Again! > Source/WebCore/platform/network/ParsedContentType.cpp:361 > + for (auto paramName : m_parameterNames) { Using auto here causes unnecessary reference count churn. You could use auto& instead and there would not be any. Goes against WebKit coding style to call this "paramName". Writing out the whole word "parameter" doesn’t make it that much longer. But also in the context of this sort function, I think just "name" would be even better than "parameterName" (or "paramName"). > Source/WebCore/platform/network/ParsedContentType.cpp:366 > + if (value.isEmpty() || value.find(isNonTokenCharacter) != notFound) { Unfortunate that we don’t have a "contains" overload, because find(x) != notFound seems a little harder to read. > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:105 > + return String(); Might be nice to return something like "ERROR" here so it’s easier to see the difference between a parse failure, empty string, and null string in test cases below. Might want to do similar for null vs. empty string? > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:110 > + EXPECT_EQ(serializeIfValid("text/plain"), "text/plain"); When tests are about strings, I find it better to use EXPECT_STREQ so that the errors are easier to understand. That requires a strategy for extending the lifetime of a const char*; some of our tests use global buffers since there is no need for the tests to work multi-threaded, others call utf8().data(), others even use std::string and c_str(). You can see examples of multiple approaches in various test files such as TextCodec.cpp, HashMap.cpp and many others. > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:120 > + EXPECT_EQ(serializeIfValid("/plain"), String()); I typically always include test cases like empty string and null string, even if they don't make any sense, just to make sure the function doesn’t cash in those cases. Similarly I’d suggest having test that is just ";". > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:123 > + EXPECT_EQ(serializeIfValid("text/plain;test"), "text/plain"); I don’t see any test cases covering uppercase in parameter names. I’d like to see failure case tests.

Darin Adler

Comment 3 2019-02-09 12:26:17 PST

Comment on attachment 361430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=361430&action=review >> Source/WebCore/platform/network/ParsedContentType.cpp:361 >> + for (auto paramName : m_parameterNames) { > > Using auto here causes unnecessary reference count churn. You could use auto& instead and there would not be any. > > Goes against WebKit coding style to call this "paramName". Writing out the whole word "parameter" doesn’t make it that much longer. But also in the context of this sort function, I think just "name" would be even better than "parameterName" (or "paramName"). short function

Darin Adler

Comment 4 2019-02-09 12:27:18 PST

Comment on attachment 361430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=361430&action=review >> Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:120 >> + EXPECT_EQ(serializeIfValid("/plain"), String()); > > I typically always include test cases like empty string and null string, even if they don't make any sense, just to make sure the function doesn’t cash in those cases. Similarly I’d suggest having test that is just ";". More worried about functions that "crash" than ones that "cash" ;)

Darin Adler

Comment 5 2019-02-09 12:28:05 PST

Comment on attachment 361430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=361430&action=review >> Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:123 >> + EXPECT_EQ(serializeIfValid("text/plain;test"), "text/plain"); > > I don’t see any test cases covering uppercase in parameter names. I’d like to see failure case tests. Other test cases to add: non-ASCII characters, null characters, other control characters, leading spaces, not just trailing spaces.

Darin Adler

Comment 6 2019-02-09 12:28:31 PST

Comment on attachment 361430 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=361430&action=review >>> Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:123 >>> + EXPECT_EQ(serializeIfValid("text/plain;test"), "text/plain"); >> >> I don’t see any test cases covering uppercase in parameter names. I’d like to see failure case tests. > > Other test cases to add: non-ASCII characters, null characters, other control characters, leading spaces, not just trailing spaces. Single quotes, unpaired surrogates.

Rob Buis

Comment 7 2019-02-10 05:23:11 PST

Created attachment 361627 [details] Patch

Rob Buis

Comment 8 2019-02-10 12:50:35 PST

(In reply to Darin Adler from comment #2) > Comment on attachment 361430 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=361430&action=review > > Looks great. Could probably pile on a few more test cases. > > > Source/WebCore/platform/network/ParsedContentType.cpp:332 > > - m_mimeType.convertToASCIILowercase(); > > + m_mimeType = m_mimeType.convertToASCIILowercase(); > > Wow, this seems like a major bug. The old code didn’t do anything at all. I > guess your new regression tests cover this. Yes, that was unfortunate, I simply did not look closely enough at the String API. No way to catch this when we just used ParsedContentType for validation, but when serializing it became quickly obvious. > > Source/WebCore/platform/network/ParsedContentType.cpp:361 > > + for (auto paramName : m_parameterNames) { > > Using auto here causes unnecessary reference count churn. You could use > auto& instead and there would not be any. Done. > Goes against WebKit coding style to call this "paramName". Writing out the > whole word "parameter" doesn’t make it that much longer. But also in the > context of this sort function, I think just "name" would be even better than > "parameterName" (or "paramName"). I like name, also since I use value for the parameter value a bit below. > > Source/WebCore/platform/network/ParsedContentType.cpp:366 > > + if (value.isEmpty() || value.find(isNonTokenCharacter) != notFound) { > > Unfortunate that we don’t have a "contains" overload, because find(x) != > notFound seems a little harder to read. Agreed, not pretty. I don't have time for that right now but maybe later if somebody makes a bug for this. > > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:105 > > + return String(); > > Might be nice to return something like "ERROR" here so it’s easier to see > the difference between a parse failure, empty string, and null string in > test cases below. Might want to do similar for null vs. empty string? Good idea, I chose NOTVALID. > > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:110 > > + EXPECT_EQ(serializeIfValid("text/plain"), "text/plain"); > > When tests are about strings, I find it better to use EXPECT_STREQ so that > the errors are easier to understand. That requires a strategy for extending > the lifetime of a const char*; some of our tests use global buffers since > there is no need for the tests to work multi-threaded, others call > utf8().data(), others even use std::string and c_str(). You can see examples > of multiple approaches in various test files such as TextCodec.cpp, > HashMap.cpp and many others. I did not know EXPECT_STREQ, done. > > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:120 > > + EXPECT_EQ(serializeIfValid("/plain"), String()); > > I typically always include test cases like empty string and null string, > even if they don't make any sense, just to make sure the function doesn’t > cash in those cases. Similarly I’d suggest having test that is just ";". Done. > > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:123 > > + EXPECT_EQ(serializeIfValid("text/plain;test"), "text/plain"); > > I don’t see any test cases covering uppercase in parameter names. I’d like > to see failure case tests. In general I now tried to add as many tests, taking your suggestions into account.

Darin Adler

Comment 9 2019-02-10 16:05:51 PST

Comment on attachment 361627 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=361627&action=review > Tools/TestWebKitAPI/Tests/WebCore/ParsedContentType.cpp:104 > + if (Optional<ParsedContentType> parsedContentType = ParsedContentType::create(input)) > + return parsedContentType->serialize().utf8().data(); This isn’t right. The return value will be deallocated and the test won‘t work if run under memory test tools like Address Sanitizer. Need to put the return value somewhere that won’t be deallocated. See patterns in various other tests for the various approaches. I suggest the "have a global char array and copy the result into it inside this function" approach.

Rob Buis

Comment 10 2019-02-11 09:35:00 PST

Created attachment 361686 [details] Patch

Rob Buis

Comment 11 2019-02-12 01:38:52 PST

Comment on attachment 361686 [details] Patch Fixed the deallocation problem by copying TextCoded.cpp approach.

WebKit Commit Bot

Comment 12 2019-02-12 02:04:44 PST

Comment on attachment 361686 [details] Patch Clearing flags on attachment: 361686 Committed r241291: <https://trac.webkit.org/changeset/241291>

WebKit Commit Bot

Comment 13 2019-02-12 02:04:45 PST

All reviewed patches have been landed. Closing bug.

Rob Buis

Comment 14 2019-02-17 12:11:59 PST

Reopening to attach new patch.

Rob Buis

Comment 15 2019-02-17 12:12:02 PST

Created attachment 362241 [details] Patch

Radar WebKit Bug Importer

Comment 16 2019-02-17 12:12:17 PST

<rdar://problem/48148464>

Radar WebKit Bug Importer

Comment 17 2019-02-17 12:12:18 PST

<rdar://problem/48148465>

EWS Watchlist

Comment 18 2019-02-17 13:13:17 PST

Comment on attachment 362241 [details] Patch Attachment 362241 [details] did not pass mac-ews (mac): Output: https://webkit-queues.webkit.org/results/11183743 New failing tests: imported/w3c/web-platform-tests/xhr/overridemimetype-blob.html

EWS Watchlist

Comment 19 2019-02-17 13:13:20 PST

Created attachment 362244 [details] Archive of layout-test-results from ews101 for mac-highsierra The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews101 Port: mac-highsierra Platform: Mac OS X 10.13.6

Rob Buis

Comment 20 2019-02-17 13:22:19 PST

Wrong bug...

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version Other

Hardware Unspecified

OS Unspecified

Product WebKit

Component DOM

Assignee

Rob Buis

Reported

2019-01-28 08:09 PST

Modified

2019-02-17 13:22 PST History

CC List

6 users Show

URL

Keywords InRadar

Depends on

Blocks

180526

Dependencies

tree graph