Bug 193756 - Simplify and streamline code that creates an appropriate document based on MIME type
Summary: Simplify and streamline code that creates an appropriate document based on MI...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Page Loading (show other bugs)
Version: WebKit Nightly Build
Hardware: All All
: P2 Normal
Assignee: Darin Adler
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-01-23 19:55 PST by Darin Adler
Modified: 2019-01-31 10:11 PST (History)
9 users (show)

See Also:


Attachments
Patch (6.94 KB, patch)
2019-01-23 20:00 PST, Darin Adler
no flags Details | Formatted Diff | Diff
Patch (6.78 KB, patch)
2019-01-30 08:55 PST, Darin Adler
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Darin Adler 2019-01-23 19:55:34 PST
Simplify and streamline code that creates an appropriate document based on MIME type
Comment 1 Darin Adler 2019-01-23 20:00:44 PST
Created attachment 359989 [details]
Patch
Comment 2 Darin Adler 2019-01-24 07:31:36 PST
Can’t figure out how this patch caused the WinCairo build failure. Does anyone have a theory?
Comment 3 Alex Christensen 2019-01-24 09:13:07 PST
That build failure looks like it was caused by not cleaning up from having run https://bugs.webkit.org/show_bug.cgi?id=193602
Comment 4 Darin Adler 2019-01-24 09:33:05 PST
Thanks, Chris! So I think this is ready to review, then.
Comment 5 Chris Dumez 2019-01-24 09:43:17 PST
(In reply to Darin Adler from comment #4)
> Thanks, Chris! So I think this is ready to review, then.

You're welcome but it was Alex :)
Comment 6 Chris Dumez 2019-01-24 09:51:52 PST
Comment on attachment 359989 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=359989&action=review

> Source/WebCore/dom/DOMImplementation.cpp:141
> +    if (equalLettersIgnoringASCIICase(type, "text/html"))

Why is this OK? I thought MIME type checks were usually case-sensitive. Looking at the call sites:
- DOMParser::parseFromString() already validates the MIME type in a case-sensitive manner before calling this. The spec seems to indicate case-sensitive checks should be done (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless otherwise stated, string comparisons are done in a case-sensitive manner." in Conformance section).
- DocumentWriter::createDocument() -> unclear yet, would need to look more into this one.
Comment 7 Darin Adler 2019-01-24 13:01:46 PST
(In reply to Chris Dumez from comment #6)
> I thought MIME type checks were usually case-sensitive.

I was almost certain they are *not* until you questioned it. There are many non-case-sensitive checks in the MIME type registry. Almost all the tables in there are ASCII case folding.

Note comments like this one:

https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types

"MIME types are case-insensitive but traditionally written in lowercase."'

> - DOMParser::parseFromString() already validates the MIME type in a
> case-sensitive manner before calling this. The spec seems to indicate
> case-sensitive checks should be done
> (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless
> otherwise stated, string comparisons are done in a case-sensitive manner."
> in Conformance section).

I will do some research. I don’t think DOMParser is the best place to look for this information. But it seems like we may need to push things in the opposite direction, with some compatibility risk. Or perhaps there is an "ASCII lowercasing" step somewhere else in the algorithm.
Comment 8 Chris Dumez 2019-01-24 13:14:30 PST
(In reply to Darin Adler from comment #7)
> (In reply to Chris Dumez from comment #6)
> > I thought MIME type checks were usually case-sensitive.
> 
> I was almost certain they are *not* until you questioned it. There are many
> non-case-sensitive checks in the MIME type registry. Almost all the tables
> in there are ASCII case folding.
> 
> Note comments like this one:
> 
> https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types
> 
> "MIME types are case-insensitive but traditionally written in lowercase."'
> 
> > - DOMParser::parseFromString() already validates the MIME type in a
> > case-sensitive manner before calling this. The spec seems to indicate
> > case-sensitive checks should be done
> > (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless
> > otherwise stated, string comparisons are done in a case-sensitive manner."
> > in Conformance section).
> 
> I will do some research. I don’t think DOMParser is the best place to look
> for this information. But it seems like we may need to push things in the
> opposite direction, with some compatibility risk. Or perhaps there is an
> "ASCII lowercasing" step somewhere else in the algorithm.

My understanding is that MIME types coming from HTTP are case-insensitive. However, when it comes to content types passed to Web API (Like DOMParser.parseFromString()), it is up to the specification.
In this particular case, DOMParser.parseFromString() is case-sensitive. Technically your patch does not change behavior for DOMParser.parseFromString() though since that method would have already thrown if the input type was not lowercase.
createDocument() would merely do case-insensitive checks unnecessarily for the purpose of this particular caller.
Comment 9 Darin Adler 2019-01-24 14:56:02 PST
(In reply to Chris Dumez from comment #8)
> My understanding is that MIME types coming from HTTP are case-insensitive.
> However, when it comes to content types passed to Web API (Like
> DOMParser.parseFromString()), it is up to the specification.
> In this particular case, DOMParser.parseFromString() is case-sensitive.

I suppose that might create a demand inside WebKit for case-sensitive MIME type helper functions, which is unfortunate. On the other hand, if the logic is all right there in line in the DOMParser class, the the helper functions don’t matter.

> Technically your patch does not change behavior for
> DOMParser.parseFromString() though since that method would have already
> thrown if the input type was not lowercase.

I think that goes beyond just "technically", I think that’s an important point.

The main purpose of this DOMImplementation::createDocument function is, and always has been, creating documents as part of normal page loading. If DOMParser also shares the function, but is stricter, that would not mean we should change createDocument’s behavior.

I think the real question is how we want to structure code inside WebKit. Do we have a lot of call for case-sensitive MIME type checks, or do we want to continue to make most of our MIME logic be ASCII case-insensitive. Another design is to have callers convert the type to ASCII lowercase before calling this function, or not, depending on what behavior they need. Which is the best way to do things in our code?

And of course we want to make sure we have the correct behavior for standards like DOMParser that require case sensitive checks.

I guess if my "fix" to do ASCII case-insensitive checks here is important, then I would technically need to construct test cases to show what was broken before. It’s possible there is some code that lower-cases the MIME type before we even get here so then I’m unnecessarily making this code slower. And also many of our MIME type helper functions might also be unnecessarily slow because they do ASCII case-insensitive operations.

Do you have a preference for how I proceed? If you think it’s a big deal, I could make a version of this patch that makes no changes to case sensitivity and decide whether to make that change separate from the rest of the restructuring here. It does rub me the wrong way that the function currently does a combination of case sensitive checks (mostly here in the function) and ASCII case-insensitive checks (mostly in the MIMETypeRegistry functions) and I felt good about making it consistent.
Comment 10 Darin Adler 2019-01-26 11:14:11 PST
Chris, because of the case sensitivity thing you raised, I don’t think anyone else is going to review this patch, so please do one of these when you have the chance:

1) Ask me to post a new patch that makes no case sensitivity changes.

2) Review this patch.

3) Confirm that the case sensitivity changes are OK to let other reviewers know it’s OK to do the rest of the reviewing.
Comment 11 Darin Adler 2019-01-30 08:41:22 PST
Comment on attachment 359989 [details]
Patch

Oops, this isn’t right for PDF. Need one more ImageDocument::create case.
Comment 12 Darin Adler 2019-01-30 08:55:42 PST
Created attachment 360583 [details]
Patch
Comment 13 WebKit Commit Bot 2019-01-31 10:10:55 PST
Comment on attachment 360583 [details]
Patch

Clearing flags on attachment: 360583

Committed r240795: <https://trac.webkit.org/changeset/240795>
Comment 14 WebKit Commit Bot 2019-01-31 10:10:57 PST
All reviewed patches have been landed.  Closing bug.
Comment 15 Radar WebKit Bug Importer 2019-01-31 10:11:29 PST
<rdar://problem/47706409>