Bug 213254 - Rename TextEncoding::encode() to TextEncoding::encodeWithNormalization()
Summary: Rename TextEncoding::encode() to TextEncoding::encodeWithNormalization()
Status: NEW
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 213253
  Show dependency treegraph
 
Reported: 2020-06-16 09:42 PDT by Tetsuharu Ohzeki [UTC+9]
Modified: 2020-06-19 15:26 PDT (History)
13 users (show)

See Also:


Attachments
Patch (18.18 KB, patch)
2020-06-16 09:46 PDT, Tetsuharu Ohzeki [UTC+9]
ap: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tetsuharu Ohzeki [UTC+9] 2020-06-16 09:42:53 PDT
I'm planning to add variants of TextEncoding::encode() which does not normalize in bug 213253.
To prepare it, I rename the exist implementation to TextEncoding::encodeWithNormalization() to clarify what it do and separate a impact range because TextEncoding::encode()  is marked as `WEBCORE_EXPORT`.
Comment 1 Tetsuharu Ohzeki [UTC+9] 2020-06-16 09:46:37 PDT
Created attachment 402015 [details]
Patch
Comment 2 Alexey Proskuryakov 2020-06-17 18:14:10 PDT
Comment on attachment 402015 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=402015&action=review

> Source/WebCore/ChangeLog:9
> +        I'm planning to add a variant of `TextEncoding::encode()` which does not
> +        normalize to fix bug 213253.

I don't think that this is necessary. Once we perform normalization on ingress (macOS file names, keyboard input etc.), TextEncoding::encode() won't need to normalize ever. There isn't a need for a separate code path.
Comment 3 Tetsuharu Ohzeki [UTC+9] 2020-06-17 18:36:41 PDT
Comment on attachment 402015 [details]
Patch

(In reply to Alexey Proskuryakov from comment #2)
> Comment on attachment 402015 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=402015&action=review
> 
> > Source/WebCore/ChangeLog:9
> > +        I'm planning to add a variant of `TextEncoding::encode()` which does not
> > +        normalize to fix bug 213253.
> 
> I don't think that this is necessary. Once we perform normalization on
> ingress (macOS file names, keyboard input etc.), TextEncoding::encode()
> won't need to normalize ever. There isn't a need for a separate code path.

The exist `TextEncoding::encode()` do normalization in it and it is widely used in evenrywhere in WebKit.
https://trac.webkit.org/browser/webkit/trunk/Source/WebCore/platform/text/TextEncoding.cpp#L77

To fix bug 213253, we need construcing `Blob` object on JavaScript without unicode normalization. I also had a plan to add the function which does not do unicode nomalization as `TextEncoding::encodeWithoutNormalization()` but I thought it's better to rename the exist implementation and I create this separated patch.
Comment 4 Alexey Proskuryakov 2020-06-18 09:25:42 PDT
Comment on attachment 402015 [details]
Patch

Blobs are not an exception here, they can't have normalization disabled before we disable it in TextEncoding::encode() entirely.
Comment 5 Huáng Jùnliàng 2020-06-19 15:26:01 PDT
I want to add some real world cases here.

Babel REPL wraps babel sources into a blob and send it to the WebWorker. This issue causes semantic changes to the sources and error is thrown. https://github.com/babel/website/issues/2254

The blob wrapping is done in a WebPack plugin called worker-loader 

https://github.com/webpack-contrib/worker-loader/blob/3500f14083b5ca2bb953b2919ec1ffc7bc373b2b/src/workers/InlineWorker.js#L28

so besides Babel REPL, users of worker-loader is also affected by this issue.