Bug 213253 - Fix the case about NFC normalization in wpt/FileAPI/unicode.html
Summary: Fix the case about NFC normalization in wpt/FileAPI/unicode.html
Status: RESOLVED DUPLICATE of bug 217327
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on: 213254
Blocks:
  Show dependency treegraph
 
Reported: 2020-06-16 09:36 PDT by Tetsuharu Ohzeki [UTC+9]
Modified: 2020-10-06 05:07 PDT (History)
2 users (show)

See Also:


Attachments
Patch (4.77 KB, patch)
2020-06-16 09:50 PDT, Tetsuharu Ohzeki [UTC+9]
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tetsuharu Ohzeki [UTC+9] 2020-06-16 09:36:57 PDT
I'll attach a patch
Comment 1 Tetsuharu Ohzeki [UTC+9] 2020-06-16 09:50:19 PDT
Created attachment 402016 [details]
Patch
Comment 2 Alexey Proskuryakov 2020-06-17 18:12:03 PDT
Comment on attachment 402016 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402016&action=review

> Source/WebCore/fileapi/BlobBuilder.cpp:74
> -    auto bytes = UTF8Encoding().encodeWithNormalization(text, UnencodableHandling::Entities);
> +    auto bytes = UTF8Encoding().encode(text, UnencodableHandling::Entities);

While this is correct in abstract, I don't think that we can do this before the normalization is performed in all code paths where decomposed text from macOS can enter WebKit. And once we do that, encode can stop normalizing.
Comment 3 Tetsuharu Ohzeki [UTC+9] 2020-06-17 18:36:24 PDT
(In reply to Alexey Proskuryakov from comment #2)
> Comment on attachment 402016 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=402016&action=review
> 
> > Source/WebCore/fileapi/BlobBuilder.cpp:74
> > -    auto bytes = UTF8Encoding().encodeWithNormalization(text, UnencodableHandling::Entities);
> > +    auto bytes = UTF8Encoding().encode(text, UnencodableHandling::Entities);
> 
> While this is correct in abstract, I don't think that we can do this before
> the normalization is performed in all code paths where decomposed text from
> macOS can enter WebKit. And once we do that, encode can stop normalizing.


This effect only for constructing `Blob` API from string on JavaScript.

The exist `TextEncoding::encode()` do normalization in it and it is widely used in evenrywhere in WebKit.
https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/platform/text/TextEncoding.cpp#L77

I had planned to add `TextEncoding::encodeWithoutNormalization()` but I also thought from the comment that it might be better to rename the exist implementation once, I filed bug 213254 and I created this patch built on it.
Comment 4 Alexey Proskuryakov 2020-06-18 09:24:17 PDT
Since blobs can be sent over the wire and/or processed by algorithms that were only tested on Windows, their content should not include decomposed strings coming from macOS. It is true that the test is accurate and we need to make it pass, but disabling normalization would be the last step for making it happen.
Comment 5 Tetsuharu Ohzeki [UTC+9] 2020-06-18 11:41:04 PDT
Umm..... it's a long way. Thank you for your explanation.
Comment 6 Tetsuharu Ohzeki [UTC+9] 2020-10-06 05:07:37 PDT

*** This bug has been marked as a duplicate of bug 217327 ***