WebKit Bugzilla
Bug 179307: RESOLVED DUPLICATE of bug 216016
WebKit treats Big5-HKSCS as a distinct encoding from Big5, Encoding standard says it's the same
https://bugs.webkit.org/show_bug.cgi?id=179307
Maciej Stachowiak
Reported
2017-11-05 16:38:29 PST
WebKit treats Big5-HKSCS as a distinct encoding from Big5, but the Encoding standard says it's the same. Chrome and Firefox report Big5 as the canonical name when using the TextDecoder API. It's not clear to me whether they actually decode it differently, though, and I am not sure how to make a test for that.
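One possible way to test whether two Big5 flavors decode differently is to feed every candidate two-byte sequence to both decoders and collect the sequences whose output differs. The sketch below only illustrates that technique. It uses Python's built-in `big5` and `big5hkscs` codecs as stand-ins, which is an assumption made for illustration: they are neither ICU's converters nor the decoder defined by the Encoding standard, and `differing_pairs` is a hypothetical helper name.

```python
def differing_pairs(codec_a, codec_b):
    """Return the two-byte sequences that the two codecs decode differently.

    Sequences a codec rejects are decoded with errors="replace", so an
    unmapped sequence in one codec still compares unequal to a mapped one
    in the other.
    """
    diffs = []
    for lead in range(0x81, 0xFF):        # candidate lead bytes 0x81..0xFE
        for trail in range(0x40, 0xFF):   # candidate trail bytes 0x40..0xFE
            raw = bytes([lead, trail])
            decoded_a = raw.decode(codec_a, errors="replace")
            decoded_b = raw.decode(codec_b, errors="replace")
            if decoded_a != decoded_b:
                diffs.append((raw, decoded_a, decoded_b))
    return diffs


# Stand-in codecs (assumption): Python's own Big5 and Big5-HKSCS tables.
diffs = differing_pairs("big5", "big5hkscs")
print(len(diffs), "two-byte sequences decode differently")
```

In a browser the same loop could be written against two TextDecoder instances created from the labels `big5` and `big5-hkscs`. An empty result would suggest the engine treats both labels as one decoder.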
Attachments
Test case for (lack of) WebKit's Big5 quirks, meant to go in LayoutTests/fast/encodings
(853 bytes, text/html), 2017-11-05 20:41 PST, Maciej Stachowiak
Maciej Stachowiak
Comment 1
2017-11-05 18:29:18 PST
Here are some past revisions that may explain why we have this behavior (pointed out by Darin):
https://trac.webkit.org/changeset/3611/webkit
We changed to treat all Big5 as an alias for the Windows version (like the latest Encoding spec does)
https://trac.webkit.org/changeset/4054/webkit
We changed to treat most Big5 character sets as Big5_HKSCS_1999, unless they were explicitly Microsoft-specific.
https://trac.webkit.org/changeset/4689/webkit
We changed to treat most Big5 character sets as the DOS/Windows version, but left Big5-HKSCS alone. It's not totally clear why Big5-HKSCS was left alone in that last change. I don't think this is compatible with what other browsers do, so we should probably abandon this direction. But I need to make some tests.
Alexey Proskuryakov
Comment 2
2017-11-05 19:32:48 PST
Big5 is a large family of standards governed by various entities, and we basically never got to check if ICU supported the variant(s) that other browsers used. This is likely moot now, as Chrome also uses ICU.
Maciej Stachowiak
Comment 3
2017-11-05 20:36:22 PST
These are our differences from the standard on Big5-related encodings:
MISMATCH: encoding big5-hkscs is Big5 in the standard, but Big5-HKSCS in WebKit
EXTRA NAME: WebKit knows extra nonstandard name x-windows-950 for Big5
EXTRA NAME: WebKit knows extra nonstandard name windows-950 for Big5
EXTRA NAME: WebKit knows extra nonstandard name x-big5 for Big5
EXTRA NAME: WebKit knows extra nonstandard name ms950 for Big5
EXTRA NAME: WebKit knows extra nonstandard name windows-950-2000 for Big5
EXTRA ENCODING: WebKit knows nonstandard encoding Big5-HKSCS with names ['big5-hkscs', 'big5hk', 'hkscs-big5', 'ibm-1375', 'ibm-1375_p100-2008']
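A report like the one above can be produced by comparing two label tables. The following is a minimal sketch, not the script actually used for this bug. The `STANDARD` labels for Big5 are the ones listed in the Encoding standard. The `ENGINE` table is an assumption reconstructed from the report itself: it takes every label not reported as a difference to match the standard. The names `invert` and `compare` are hypothetical.

```python
# Labels the Encoding standard assigns to Big5.
STANDARD = {"Big5": {"big5", "big5-hkscs", "cn-big5", "csbig5", "x-x-big5"}}

# Assumed engine table, rebuilt from the differences reported in this comment.
ENGINE = {
    "Big5": (STANDARD["Big5"] - {"big5-hkscs"})
    | {"x-windows-950", "windows-950", "x-big5", "ms950", "windows-950-2000"},
    "Big5-HKSCS": {"big5-hkscs", "big5hk", "hkscs-big5",
                   "ibm-1375", "ibm-1375_p100-2008"},
}


def invert(table):
    """Map each label to the canonical encoding name that owns it."""
    return {label: name for name, labels in table.items() for label in labels}


def compare(standard, engine):
    """List the engine's deviations from the standard's label table."""
    report = []
    std_by_label = invert(standard)
    eng_by_label = invert(engine)
    # Standard labels that the engine resolves to a different encoding.
    for label, std_name in sorted(std_by_label.items()):
        eng_name = eng_by_label.get(label)
        if eng_name is not None and eng_name != std_name:
            report.append(f"MISMATCH: encoding {label} is {std_name} "
                          f"in the standard, but {eng_name} in the engine")
    for name, labels in sorted(engine.items()):
        if name not in standard:
            # The whole encoding is unknown to the standard.
            report.append(f"EXTRA ENCODING: engine knows nonstandard encoding "
                          f"{name} with names {sorted(labels)}")
            continue
        # Labels the engine accepts that the standard does not define at all.
        for label in sorted(labels - standard[name]):
            if label not in std_by_label:
                report.append(f"EXTRA NAME: engine knows extra nonstandard "
                              f"name {label} for {name}")
    return report


for line in compare(STANDARD, ENGINE):
    print(line)
```

The sketch does not report standard labels that the engine lacks, since the comment above lists none for Big5.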
Maciej Stachowiak
Comment 4
2017-11-05 20:41:52 PST
Created attachment 326098
Test case for (lack of) WebKit's Big5 quirks, meant to go in LayoutTests/fast/encodings
This test case gives exactly the spec-mandated results in Firefox and Chrome. Safari has the differences described above.
Maciej Stachowiak
Comment 5
2017-11-05 20:58:37 PST
Here's the Gecko bug from when they did the merge:
https://bugzilla.mozilla.org/show_bug.cgi?id=912470
It seems like their Big5 supports HKSCS character sequences. But I'm not sure if that's the same as our Big5-HKSCS or something larger that combines that and Windows-flavored Big5.
Maciej Stachowiak
Comment 6
2017-11-05 21:56:40 PST
Based on http://w3c-test.org/encoding/big5-encoder.html, it doesn't look like either Big5 or Big5_HKSCS encodings from ICU quite match what the Encoding standard requires, and their failures are not the same either, so merging down to one of the two is bound to cause bugs. We might need a custom Big5 codec. ICU seems to support several apparent Big5 variants:
ibm-1373_P100-2002
windows-950-2000
ibm-950_P110-1999
ibm-1375_P100-2008
ibm-5471_P100-2006
I'm not sure if any of these are the proper web variant.
Anne van Kesteren
Comment 7
2020-05-06 07:11:42 PDT
*** Bug 159890 has been marked as a duplicate of this bug. ***
Anne van Kesteren
Comment 8
2022-09-27 06:27:24 PDT
According to https://wpt.fyi/results/encoding?label=master&label=experimental&aligned&view=subtest&q=big5 we pass all the tests, so this was fixed at some point. Probably by Alex?
Anne van Kesteren
Comment 9
2022-09-27 06:44:37 PDT
Confirmed: https://github.com/WebKit/WebKit/commit/70a5c3285eca476faa66c6e6055d615c26c78fc4
*** This bug has been marked as a duplicate of bug 216016 ***