193756 – Simplify and streamline code that creates an appropriate document based on MIME type

RESOLVED FIXED193756

Simplify and streamline code that creates an appropriate document based on MIME type

https://bugs.webkit.org/show_bug.cgi?id=193756

Summary Simplify and streamline code that creates an appropriate document based on MI...

Darin Adler

Reported 2019-01-23 19:55:34 PST

Simplify and streamline code that creates an appropriate document based on MIME type

Attachments
Patch (6.94 KB, patch) 2019-01-23 20:00 PST, Darin Adler	no flags	Details Formatted Diff Diff
Patch (6.78 KB, patch) 2019-01-30 08:55 PST, Darin Adler	no flags	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

Darin Adler

Comment 1 2019-01-23 20:00:44 PST

Created attachment 359989 [details] Patch

Darin Adler

Comment 2 2019-01-24 07:31:36 PST

Can’t figure out how this patch caused the WinCairo build failure. Does anyone have a theory?

Alex Christensen

Comment 3 2019-01-24 09:13:07 PST

That build failure looks like it was caused by not cleaning up from having run https://bugs.webkit.org/show_bug.cgi?id=193602

Darin Adler

Comment 4 2019-01-24 09:33:05 PST

Thanks, Chris! So I think this is ready to review, then.

Chris Dumez

Comment 5 2019-01-24 09:43:17 PST

(In reply to Darin Adler from comment #4) > Thanks, Chris! So I think this is ready to review, then. You're welcome but it was Alex :)

Chris Dumez

Comment 6 2019-01-24 09:51:52 PST

Comment on attachment 359989 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=359989&action=review > Source/WebCore/dom/DOMImplementation.cpp:141 > + if (equalLettersIgnoringASCIICase(type, "text/html")) Why is this OK? I thought MIME type checks were usually case-sensitive. Looking at the call sites: - DOMParser::parseFromString() already validates the MIME type in a case-sensitive manner before calling this. The spec seems to indicate case-sensitive checks should be done (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless otherwise stated, string comparisons are done in a case-sensitive manner." in Conformance section). - DocumentWriter::createDocument() -> unclear yet, would need to look more into this one.

Darin Adler

Comment 7 2019-01-24 13:01:46 PST

(In reply to Chris Dumez from comment #6) > I thought MIME type checks were usually case-sensitive. I was almost certain they are *not* until you questioned it. There are many non-case-sensitive checks in the MIME type registry. Almost all the tables in there are ASCII case folding. Note comments like this one: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types "MIME types are case-insensitive but traditionally written in lowercase."' > - DOMParser::parseFromString() already validates the MIME type in a > case-sensitive manner before calling this. The spec seems to indicate > case-sensitive checks should be done > (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless > otherwise stated, string comparisons are done in a case-sensitive manner." > in Conformance section). I will do some research. I don’t think DOMParser is the best place to look for this information. But it seems like we may need to push things in the opposite direction, with some compatibility risk. Or perhaps there is an "ASCII lowercasing" step somewhere else in the algorithm.

Chris Dumez

Comment 8 2019-01-24 13:14:30 PST

(In reply to Darin Adler from comment #7) > (In reply to Chris Dumez from comment #6) > > I thought MIME type checks were usually case-sensitive. > > I was almost certain they are *not* until you questioned it. There are many > non-case-sensitive checks in the MIME type registry. Almost all the tables > in there are ASCII case folding. > > Note comments like this one: > > https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types > > "MIME types are case-insensitive but traditionally written in lowercase."' > > > - DOMParser::parseFromString() already validates the MIME type in a > > case-sensitive manner before calling this. The spec seems to indicate > > case-sensitive checks should be done > > (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless > > otherwise stated, string comparisons are done in a case-sensitive manner." > > in Conformance section). > > I will do some research. I don’t think DOMParser is the best place to look > for this information. But it seems like we may need to push things in the > opposite direction, with some compatibility risk. Or perhaps there is an > "ASCII lowercasing" step somewhere else in the algorithm. My understanding is that MIME types coming from HTTP are case-insensitive. However, when it comes to content types passed to Web API (Like DOMParser.parseFromString()), it is up to the specification. In this particular case, DOMParser.parseFromString() is case-sensitive. Technically your patch does not change behavior for DOMParser.parseFromString() though since that method would have already thrown if the input type was not lowercase. createDocument() would merely do case-insensitive checks unnecessarily for the purpose of this particular caller.

Darin Adler

Comment 9 2019-01-24 14:56:02 PST

(In reply to Chris Dumez from comment #8) > My understanding is that MIME types coming from HTTP are case-insensitive. > However, when it comes to content types passed to Web API (Like > DOMParser.parseFromString()), it is up to the specification. > In this particular case, DOMParser.parseFromString() is case-sensitive. I suppose that might create a demand inside WebKit for case-sensitive MIME type helper functions, which is unfortunate. On the other hand, if the logic is all right there in line in the DOMParser class, the the helper functions don’t matter. > Technically your patch does not change behavior for > DOMParser.parseFromString() though since that method would have already > thrown if the input type was not lowercase. I think that goes beyond just "technically", I think that’s an important point. The main purpose of this DOMImplementation::createDocument function is, and always has been, creating documents as part of normal page loading. If DOMParser also shares the function, but is stricter, that would not mean we should change createDocument’s behavior. I think the real question is how we want to structure code inside WebKit. Do we have a lot of call for case-sensitive MIME type checks, or do we want to continue to make most of our MIME logic be ASCII case-insensitive. Another design is to have callers convert the type to ASCII lowercase before calling this function, or not, depending on what behavior they need. Which is the best way to do things in our code? And of course we want to make sure we have the correct behavior for standards like DOMParser that require case sensitive checks. I guess if my "fix" to do ASCII case-insensitive checks here is important, then I would technically need to construct test cases to show what was broken before. It’s possible there is some code that lower-cases the MIME type before we even get here so then I’m unnecessarily making this code slower. And also many of our MIME type helper functions might also be unnecessarily slow because they do ASCII case-insensitive operations. Do you have a preference for how I proceed? If you think it’s a big deal, I could make a version of this patch that makes no changes to case sensitivity and decide whether to make that change separate from the rest of the restructuring here. It does rub me the wrong way that the function currently does a combination of case sensitive checks (mostly here in the function) and ASCII case-insensitive checks (mostly in the MIMETypeRegistry functions) and I felt good about making it consistent.

Darin Adler

Comment 10 2019-01-26 11:14:11 PST

Chris, because of the case sensitivity thing you raised, I don’t think anyone else is going to review this patch, so please do one of these when you have the chance: 1) Ask me to post a new patch that makes no case sensitivity changes. 2) Review this patch. 3) Confirm that the case sensitivity changes are OK to let other reviewers know it’s OK to do the rest of the reviewing.

Darin Adler

Comment 11 2019-01-30 08:41:22 PST

Comment on attachment 359989 [details] Patch Oops, this isn’t right for PDF. Need one more ImageDocument::create case.

Darin Adler

Comment 12 2019-01-30 08:55:42 PST

Created attachment 360583 [details] Patch

WebKit Commit Bot

Comment 13 2019-01-31 10:10:55 PST

Comment on attachment 360583 [details] Patch Clearing flags on attachment: 360583 Committed r240795: <https://trac.webkit.org/changeset/240795>

WebKit Commit Bot

Comment 14 2019-01-31 10:10:57 PST

All reviewed patches have been landed. Closing bug.

Radar WebKit Bug Importer

Comment 15 2019-01-31 10:11:29 PST

<rdar://problem/47706409>

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware All

OS All

Product WebKit

Component Page Loading

Assignee

Darin Adler

Reported

2019-01-23 19:55 PST

Modified

2019-01-31 10:11 PST History

CC List

9 users Show

URL

Keywords InRadar

Depends on

Blocks