Bug 119320: RESOLVED DUPLICATE of bug 113001
Do not normalize into NFC the values of form fields
https://bugs.webkit.org/show_bug.cgi?id=119320
Ryosuke Niwa
Reported 2013-07-30 21:53:22 PDT
Consider merging https://chromium.googlesource.com/chromium/blink/+/c167d111dd8a44da334d74e4b6608811b465945d

This commit renames the TextEncoding::encode(const String&, UnencodableHandling) const API function to normalizeAndEncode() to make clear that the function first applies Unicode NFC normalization before encoding the string. All existing references to encode() except for one were updated to call normalizeAndEncode() instead.

TextEncoding::encode(const String&, UnencodableHandling) const is added back as a new API function which only encodes the string, but does not normalize the string before encoding.

The one call to the old encode() function that was not updated is in FormDataList::appendString(const String&). This was left as a call to the new encode() to fix Issue 117128.

Chrome and Safari are unlike other browsers in that they apply Unicode NFC normalization to form values when submitting a form; in particular, the following browsers were tested and found not to normalize the form values:
- Firefox 22.0
- Firefox ESR 17.0.7
- Opera 12.16
- IE 6
- IE 7
- IE 8
- IE 9
- IE 10
- Amaya 11.4.7

NFC normalization actually changes the meaning of text in certain scripts. Notably, there are certain Biblical Hebrew words for which normalization causes the word to be erroneously encoded. One example is given on page 9 of the SBL Hebrew Font User Manual version 1.5: http://www.sbl-site.org/Fonts/SBLHebrewManualv1.5.pdf

This example is added as the new form-data-encoding-3.html layout test.
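For readers who want to see the effect concretely, here is a minimal Python sketch of the encode() / normalizeAndEncode() split described in the commit message. It is not the WebKit or Blink code: the function names only mirror the C++ ones, and the sample string (LAMED followed by PATAH and HIRIQ) is an assumption chosen to illustrate the kind of reordering at issue, not necessarily the exact word from the SBL manual.

```python
import unicodedata


def encode(text: str, encoding: str = "utf-8") -> bytes:
    """Encode only: the code points are sent exactly as typed."""
    return text.encode(encoding)


def normalize_and_encode(text: str, encoding: str = "utf-8") -> bytes:
    """Apply Unicode NFC normalization first, then encode."""
    return unicodedata.normalize("NFC", text).encode(encoding)


def code_points(text: str) -> list:
    """Helper for display: list the code points as U+XXXX strings."""
    return ["U+%04X" % ord(ch) for ch in text]


# Illustrative input (assumption): HEBREW LETTER LAMED, then POINT PATAH
# (combining class 17), then POINT HIRIQ (combining class 14).
value = "\u05DC\u05B7\u05B4"

# Canonical ordering sorts adjacent combining marks by combining class,
# so NFC moves HIRIQ in front of PATAH.
normalized = unicodedata.normalize("NFC", value)

if __name__ == "__main__":
    print("as typed:  ", code_points(value))
    print("after NFC: ", code_points(normalized))
    print("same bytes:", encode(value) == normalize_and_encode(value))
```

With this input the two functions produce different byte sequences, which is the difference a server would observe between a normalizing and a non-normalizing browser.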
Attachments
Patch (42.38 KB, patch)
2013-08-11 06:28 PDT, Daniel Trebbien
no flags
Alexey Proskuryakov
Comment 1 2013-07-31 09:46:15 PDT
> NFC normalization actually changes the meaning of text in certain scripts.

This is a mistake and should never be relied upon.

*** This bug has been marked as a duplicate of bug 113001 ***
Daniel Trebbien
Comment 2 2013-08-11 06:28:17 PDT
Created attachment 208491
Patch

I think that this bug should be re-opened now that Safari is the only major browser which normalizes form values.

Bug 8769 cites the charmod-norm spec. Note that not normalizing form submission values is "perfectly acceptable" according to C309 (http://www.w3.org/TR/charmod-norm/#C309) because the browser can be viewed as the "producer", and the server to which the form data is being submitted can be viewed as the "remote component ... to which normalization is delegated". The server should be able to decide whether normalization is performed or not, and to which values.
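As a rough illustration of normalization being delegated to the server, the sketch below decodes a urlencoded form body and applies NFC only to fields the server has chosen. The field names (`username`, `verse`), the `NORMALIZED_FIELDS` set, and the `decode_form` helper are all hypothetical; this is one possible server-side policy under those assumptions, not anything taken from the patch.

```python
import unicodedata
from urllib.parse import parse_qs

# Hypothetical policy: only these fields are normalized on receipt.
NORMALIZED_FIELDS = {"username"}


def decode_form(body: str, encoding: str = "utf-8") -> dict:
    """Parse an application/x-www-form-urlencoded body.

    Values of fields named in NORMALIZED_FIELDS are converted to NFC;
    every other value is kept exactly as the browser submitted it.
    """
    fields = parse_qs(body, keep_blank_values=True, encoding=encoding)
    result = {}
    for name, values in fields.items():
        if name in NORMALIZED_FIELDS:
            values = [unicodedata.normalize("NFC", v) for v in values]
        result[name] = values
    return result


if __name__ == "__main__":
    # "e" + COMBINING ACUTE ACCENT in one field,
    # LAMED + PATAH + HIRIQ in the other.
    body = "username=e%CC%81&verse=%D7%9C%D6%B7%D6%B4"
    for name, values in decode_form(body).items():
        print(name, [["U+%04X" % ord(c) for c in v] for v in values])
```

This choice is only possible if the browser submits the value unmodified; once the browser has normalized it, the server cannot recover the original code point order.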
Alexey Proskuryakov
Comment 3 2013-08-11 09:27:49 PDT
WebKit has always been the only browser engine to do this, it's not a new development.
Daniel Trebbien
Comment 4 2013-08-11 09:52:44 PDT
(In reply to comment #3)
> WebKit has always been the only browser engine to do this, it's not a new development.

Well, I mean that Chrome/Blink will soon not have this bug. In fact, the latest Chromium nightlies and Chrome Canary builds do not have this bug.

What is the reason for keeping this behavior? It's not mandated by a spec, and it shouldn't be required for compatibility with Windows because Internet Explorer 6+ does not normalize. I just checked IE 11 Preview and form values are not normalized.
Alexey Proskuryakov
Comment 5 2013-08-12 08:41:26 PDT
This is already discussed in the original. In any case, please keep the discussion to the original bug, to keep it in one place.