Simplify and streamline code that creates an appropriate document based on MIME type
Created attachment 359989 [details] Patch
Can’t figure out how this patch caused the WinCairo build failure. Does anyone have a theory?
That build failure looks like it was caused by not cleaning up from having run https://bugs.webkit.org/show_bug.cgi?id=193602
Thanks, Chris! So I think this is ready to review, then.
(In reply to Darin Adler from comment #4) > Thanks, Chris! So I think this is ready to review, then. You're welcome but it was Alex :)
Comment on attachment 359989 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=359989&action=review > Source/WebCore/dom/DOMImplementation.cpp:141 > + if (equalLettersIgnoringASCIICase(type, "text/html")) Why is this OK? I thought MIME type checks were usually case-sensitive. Looking at the call sites: - DOMParser::parseFromString() already validates the MIME type in a case-sensitive manner before calling this. The spec seems to indicate case-sensitive checks should be done (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless otherwise stated, string comparisons are done in a case-sensitive manner." in Conformance section). - DocumentWriter::createDocument() -> unclear yet, would need to look more into this one.
(In reply to Chris Dumez from comment #6) > I thought MIME type checks were usually case-sensitive. I was almost certain they are *not* until you questioned it. There are many non-case-sensitive checks in the MIME type registry. Almost all the tables in there are ASCII case folding. Note comments like this one: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types "MIME types are case-insensitive but traditionally written in lowercase."' > - DOMParser::parseFromString() already validates the MIME type in a > case-sensitive manner before calling this. The spec seems to indicate > case-sensitive checks should be done > (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless > otherwise stated, string comparisons are done in a case-sensitive manner." > in Conformance section). I will do some research. I don’t think DOMParser is the best place to look for this information. But it seems like we may need to push things in the opposite direction, with some compatibility risk. Or perhaps there is an "ASCII lowercasing" step somewhere else in the algorithm.
(In reply to Darin Adler from comment #7) > (In reply to Chris Dumez from comment #6) > > I thought MIME type checks were usually case-sensitive. > > I was almost certain they are *not* until you questioned it. There are many > non-case-sensitive checks in the MIME type registry. Almost all the tables > in there are ASCII case folding. > > Note comments like this one: > > https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types > > "MIME types are case-insensitive but traditionally written in lowercase."' > > > - DOMParser::parseFromString() already validates the MIME type in a > > case-sensitive manner before calling this. The spec seems to indicate > > case-sensitive checks should be done > > (https://w3c.github.io/DOM-Parsing/#dom-domparser-parsefromstring & "Unless > > otherwise stated, string comparisons are done in a case-sensitive manner." > > in Conformance section). > > I will do some research. I don’t think DOMParser is the best place to look > for this information. But it seems like we may need to push things in the > opposite direction, with some compatibility risk. Or perhaps there is an > "ASCII lowercasing" step somewhere else in the algorithm. My understanding is that MIME types coming from HTTP are case-insensitive. However, when it comes to content types passed to Web API (Like DOMParser.parseFromString()), it is up to the specification. In this particular case, DOMParser.parseFromString() is case-sensitive. Technically your patch does not change behavior for DOMParser.parseFromString() though since that method would have already thrown if the input type was not lowercase. createDocument() would merely do case-insensitive checks unnecessarily for the purpose of this particular caller.
(In reply to Chris Dumez from comment #8) > My understanding is that MIME types coming from HTTP are case-insensitive. > However, when it comes to content types passed to Web API (Like > DOMParser.parseFromString()), it is up to the specification. > In this particular case, DOMParser.parseFromString() is case-sensitive. I suppose that might create a demand inside WebKit for case-sensitive MIME type helper functions, which is unfortunate. On the other hand, if the logic is all right there in line in the DOMParser class, the the helper functions don’t matter. > Technically your patch does not change behavior for > DOMParser.parseFromString() though since that method would have already > thrown if the input type was not lowercase. I think that goes beyond just "technically", I think that’s an important point. The main purpose of this DOMImplementation::createDocument function is, and always has been, creating documents as part of normal page loading. If DOMParser also shares the function, but is stricter, that would not mean we should change createDocument’s behavior. I think the real question is how we want to structure code inside WebKit. Do we have a lot of call for case-sensitive MIME type checks, or do we want to continue to make most of our MIME logic be ASCII case-insensitive. Another design is to have callers convert the type to ASCII lowercase before calling this function, or not, depending on what behavior they need. Which is the best way to do things in our code? And of course we want to make sure we have the correct behavior for standards like DOMParser that require case sensitive checks. I guess if my "fix" to do ASCII case-insensitive checks here is important, then I would technically need to construct test cases to show what was broken before. It’s possible there is some code that lower-cases the MIME type before we even get here so then I’m unnecessarily making this code slower. And also many of our MIME type helper functions might also be unnecessarily slow because they do ASCII case-insensitive operations. Do you have a preference for how I proceed? If you think it’s a big deal, I could make a version of this patch that makes no changes to case sensitivity and decide whether to make that change separate from the rest of the restructuring here. It does rub me the wrong way that the function currently does a combination of case sensitive checks (mostly here in the function) and ASCII case-insensitive checks (mostly in the MIMETypeRegistry functions) and I felt good about making it consistent.
Chris, because of the case sensitivity thing you raised, I don’t think anyone else is going to review this patch, so please do one of these when you have the chance: 1) Ask me to post a new patch that makes no case sensitivity changes. 2) Review this patch. 3) Confirm that the case sensitivity changes are OK to let other reviewers know it’s OK to do the rest of the reviewing.
Comment on attachment 359989 [details] Patch Oops, this isn’t right for PDF. Need one more ImageDocument::create case.
Created attachment 360583 [details] Patch
Comment on attachment 360583 [details] Patch Clearing flags on attachment: 360583 Committed r240795: <https://trac.webkit.org/changeset/240795>
All reviewed patches have been landed. Closing bug.
<rdar://problem/47706409>