Bug 178618

Summary: [iOS] DocumentWriter::createDocument can spend ~100ms unnecessarily converting image UTIs to MIME types
Product: WebKit Reporter: Wenson Hsieh <wenson_hsieh>
Component: WebCore Misc.Assignee: Wenson Hsieh <wenson_hsieh>
Status: RESOLVED FIXED    
Severity: Normal CC: commit-queue, jlewis3, mcatanzaro, sabouhallawa, thorton
Priority: P2 Keywords: InRadar
Version: WebKit Nightly Build   
Hardware: Unspecified   
OS: Unspecified   
See Also: https://bugs.webkit.org/show_bug.cgi?id=184161
Attachments:
Description Flags
Patch
none
Patch none

Description Wenson Hsieh 2017-10-20 23:37:24 PDT
DocumentWriter::createDocument can spend ~100ms unnecessarily converting image UTIs to MIME types.
Comment 1 Wenson Hsieh 2017-10-20 23:55:48 PDT
Created attachment 324492 [details]
Patch
Comment 2 Wenson Hsieh 2017-10-21 00:02:16 PDT
<rdar://problem/35108852>
Comment 3 Said Abou-Hallawa 2017-10-23 12:16:23 PDT
Comment on attachment 324492 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=324492&action=review

> Source/WebCore/platform/MIMETypeRegistry.cpp:78
> +    static const char* const knownImageMIMETypes[] = {
> +        "image/tiff", "image/gif", "image/jpeg", "image/vnd.microsoft.icon", "image/jp2", "image/png", "image/bmp",
> +        // Favicons don't have a MIME type in the registry either.
> +        "image/x-icon",
> +        // We only get one MIME type per UTI, hence our need to add this manually.
> +        "image/pjpeg"
> +    };

Now we will have three lists to maintain: the allowedImageUTIs() in UTIRegistry.cpp and supportedImageMIMETypes and supportedImageResourceMIMETypes in this file. If we want to add support for another image format, for example, we will need to change two completely independent places: the s_allowedImageUTIs[] in allowedImageUTIs() and knownImageMIMETypes[] in this file.

I would suggest having only one list, either UTIs or MIME types, it does not matter,  but have a conversion function from one type to the other. The conversion  function will have a lookup table from UTI to MIME type or vice versa. It can also assert that the return value matches the return of MIMETypeForImageSourceType() if it converts from UTI to MIME type. So it will tell the developer who changes s_allowedImageUTIs[] that he needs to change lookup table in the conversion function as well.

> Source/WebCore/platform/MIMETypeRegistry.cpp:600
> -        initializeMIMETypeRegistry();
> +        initializePDFAndPostScriptMIMETypes();

I do not think this is clean implementation. Can't we have something like this:

bool MIMETypeRegistry::isPDFOrPostScriptMIMEType(const String& mimeType)
{
    return isPDFMIMEType(mimeType) || isPostScriptMIMEType(mimeType);
}

bool MIMETypeRegistry::isPostScriptMIMEType(const String& mimeType)
{
    return mimeType == "application/postscript";
}

> Source/WebCore/platform/MIMETypeRegistry.cpp:675
>  HashSet<String, ASCIICaseInsensitiveHash>& MIMETypeRegistry::getPDFAndPostScriptMIMETypes()
>  {
>      if (!pdfAndPostScriptMIMETypes)
> -        initializeMIMETypeRegistry();
> +        initializePDFAndPostScriptMIMETypes();
>      return *pdfAndPostScriptMIMETypes;
>  }

Who is calling this function? It is not an exported function and I could not find a place in WebCore which calls it.
Comment 4 Wenson Hsieh 2017-10-23 12:30:29 PDT
Comment on attachment 324492 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=324492&action=review

>> Source/WebCore/platform/MIMETypeRegistry.cpp:78
>> +    };
> 
> Now we will have three lists to maintain: the allowedImageUTIs() in UTIRegistry.cpp and supportedImageMIMETypes and supportedImageResourceMIMETypes in this file. If we want to add support for another image format, for example, we will need to change two completely independent places: the s_allowedImageUTIs[] in allowedImageUTIs() and knownImageMIMETypes[] in this file.
> 
> I would suggest having only one list, either UTIs or MIME types, it does not matter,  but have a conversion function from one type to the other. The conversion  function will have a lookup table from UTI to MIME type or vice versa. It can also assert that the return value matches the return of MIMETypeForImageSourceType() if it converts from UTI to MIME type. So it will tell the developer who changes s_allowedImageUTIs[] that he needs to change lookup table in the conversion function as well.

The conversion functions MIMETypeFromUTI and UTIFromMIMEType already exist; are you proposing that we tweak them to check a hard-coded lookup table of UTI <=> MIME type before hitting UTTypeCopyPreferredTagWithClass?

>> Source/WebCore/platform/MIMETypeRegistry.cpp:600
>> +        initializePDFAndPostScriptMIMETypes();
> 
> I do not think this is clean implementation. Can't we have something like this:
> 
> bool MIMETypeRegistry::isPDFOrPostScriptMIMEType(const String& mimeType)
> {
>     return isPDFMIMEType(mimeType) || isPostScriptMIMEType(mimeType);
> }
> 
> bool MIMETypeRegistry::isPostScriptMIMEType(const String& mimeType)
> {
>     return mimeType == "application/postscript";
> }

Good point -- this also allows us to just get rid of pdfAndPostScriptMIMETypes altogether!

>> Source/WebCore/platform/MIMETypeRegistry.cpp:675
>>  }
> 
> Who is calling this function? It is not an exported function and I could not find a place in WebCore which calls it.

Yeah...it looks like nobody calls this. We should probably just remove it.
Comment 5 Wenson Hsieh 2017-10-23 13:57:17 PDT
(In reply to Wenson Hsieh from comment #4)
> Comment on attachment 324492 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=324492&action=review
> 
> >> Source/WebCore/platform/MIMETypeRegistry.cpp:78
> >> +    };
> > 
> > Now we will have three lists to maintain: the allowedImageUTIs() in UTIRegistry.cpp and supportedImageMIMETypes and supportedImageResourceMIMETypes in this file. If we want to add support for another image format, for example, we will need to change two completely independent places: the s_allowedImageUTIs[] in allowedImageUTIs() and knownImageMIMETypes[] in this file.
> > 
> > I would suggest having only one list, either UTIs or MIME types, it does not matter,  but have a conversion function from one type to the other. The conversion  function will have a lookup table from UTI to MIME type or vice versa. It can also assert that the return value matches the return of MIMETypeForImageSourceType() if it converts from UTI to MIME type. So it will tell the developer who changes s_allowedImageUTIs[] that he needs to change lookup table in the conversion function as well.
> 
> The conversion functions MIMETypeFromUTI and UTIFromMIMEType already exist;
> are you proposing that we tweak them to check a hard-coded lookup table of
> UTI <=> MIME type before hitting UTTypeCopyPreferredTagWithClass?

Talked to Said in person -- we agree that there should be a mechanism to enforce keeping supported MIME types in sync with allowedImageUTIs. Patch forthcoming.

> 
> >> Source/WebCore/platform/MIMETypeRegistry.cpp:600
> >> +        initializePDFAndPostScriptMIMETypes();
> > 
> > I do not think this is clean implementation. Can't we have something like this:
> > 
> > bool MIMETypeRegistry::isPDFOrPostScriptMIMEType(const String& mimeType)
> > {
> >     return isPDFMIMEType(mimeType) || isPostScriptMIMEType(mimeType);
> > }
> > 
> > bool MIMETypeRegistry::isPostScriptMIMEType(const String& mimeType)
> > {
> >     return mimeType == "application/postscript";
> > }
> 
> Good point -- this also allows us to just get rid of
> pdfAndPostScriptMIMETypes altogether!
> 
> >> Source/WebCore/platform/MIMETypeRegistry.cpp:675
> >>  }
> > 
> > Who is calling this function? It is not an exported function and I could not find a place in WebCore which calls it.
> 
> Yeah...it looks like nobody calls this. We should probably just remove it.
Comment 6 Wenson Hsieh 2017-10-23 14:12:07 PDT
Created attachment 324585 [details]
Patch
Comment 7 WebKit Commit Bot 2017-10-23 15:37:36 PDT
Comment on attachment 324585 [details]
Patch

Clearing flags on attachment: 324585

Committed r223860: <https://trac.webkit.org/changeset/223860>
Comment 8 WebKit Commit Bot 2017-10-23 15:37:38 PDT
All reviewed patches have been landed.  Closing bug.
Comment 9 Matt Lewis 2017-10-23 16:23:31 PDT
This revision broke the windows build:

https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/5458/steps/compile-webkit/logs/stdio

error:

C:\cygwin\home\buildbot\slave\win-release\build\Source\WebCore\platform\MIMETypeRegistry.cpp(39): fatal error C1083: Cannot open include file: 'UTIUtilities.h': No such file or directory [C:\cygwin\home\buildbot\slave\win-release\build\WebKitBuild\Release\Source\WebCore\WebCore.vcxproj]
Comment 10 Wenson Hsieh 2017-10-23 16:59:52 PDT
(In reply to Matt Lewis from comment #9)
> This revision broke the windows build:
> 
> https://build.webkit.org/builders/Apple%20Win%20Release%20%28Build%29/builds/
> 5458/steps/compile-webkit/logs/stdio
> 
> error:
> 
> C:\cygwin\home\buildbot\slave\win-
> release\build\Source\WebCore\platform\MIMETypeRegistry.cpp(39): fatal error
> C1083: Cannot open include file: 'UTIUtilities.h': No such file or directory
> [C:\cygwin\home\buildbot\slave\win-
> release\build\WebKitBuild\Release\Source\WebCore\WebCore.vcxproj]

Build fix in https://trac.webkit.org/changeset/223862/webkit
Comment 11 Michael Catanzaro 2017-10-23 18:42:59 PDT
Committed r223872: <https://trac.webkit.org/changeset/223872>