Bug 160931 - Align our encoding labels with the encoding specification
Summary: Align our encoding labels with the encoding specification
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: DOM (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Chris Dumez
URL: https://encoding.spec.whatwg.org/#nam...
Keywords: WebExposed
Depends on: 171128
Blocks: 161219
  Show dependency treegraph
 
Reported: 2016-08-16 20:11 PDT by Chris Dumez
Modified: 2017-04-21 12:18 PDT (History)
9 users (show)

See Also:


Attachments
Patch (64.29 KB, patch)
2016-08-16 20:19 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews113 for mac-yosemite (1.17 MB, application/zip)
2016-08-16 20:43 PDT, Build Bot
no flags Details
Patch (64.29 KB, patch)
2016-08-16 20:56 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews117 for mac-yosemite (428.08 KB, application/zip)
2016-08-16 21:34 PDT, Build Bot
no flags Details
Patch (65.01 KB, patch)
2016-08-16 21:36 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (68.78 KB, patch)
2016-08-16 22:11 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews101 for mac-yosemite (1.46 MB, application/zip)
2016-08-16 22:39 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews116 for mac-yosemite (890.46 KB, application/zip)
2016-08-16 22:57 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews106 for mac-yosemite-wk2 (1.33 MB, application/zip)
2016-08-16 23:01 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews124 for ios-simulator-elcapitan-wk2 (1.75 MB, application/zip)
2016-08-17 00:01 PDT, Build Bot
no flags Details
Patch (69.02 KB, patch)
2016-08-17 12:49 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Archive of layout-test-results from ews103 for mac-yosemite (1.40 MB, application/zip)
2016-08-17 13:18 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews104 for mac-yosemite-wk2 (1.36 MB, application/zip)
2016-08-17 13:40 PDT, Build Bot
no flags Details
Archive of layout-test-results from ews112 for mac-yosemite (1.96 MB, application/zip)
2016-08-17 13:49 PDT, Build Bot
no flags Details
Patch (70.25 KB, patch)
2016-08-17 14:06 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Test page to check ICU encodings support (11.67 KB, text/html)
2016-08-17 16:04 PDT, Chris Dumez
no flags Details
Safari 10 ICU encodings support (28.98 KB, text/plain)
2016-08-17 16:04 PDT, Chris Dumez
no flags Details
Firefox 48 ICU encodings support (32.31 KB, text/plain)
2016-08-17 16:05 PDT, Chris Dumez
no flags Details
WebKit w/ patch ICU encodings support (32.11 KB, text/plain)
2016-08-17 16:05 PDT, Chris Dumez
no flags Details
ICU changes due to patch (side by side diff) (67.59 KB, text/plain)
2016-08-17 16:15 PDT, Chris Dumez
no flags Details
ICU changes due to patch (side by side diff) (53.33 KB, text/plain)
2016-08-17 16:22 PDT, Chris Dumez
no flags Details
Archive of layout-test-results from ews122 for ios-simulator-elcapitan-wk2 (693.46 KB, application/zip)
2016-08-17 16:25 PDT, Build Bot
no flags Details
ICU labels support comparison (WK, WK w/ patch, Firefox) (330.29 KB, application/x-iwork-keynote-sffnumbers)
2016-08-17 17:02 PDT, Chris Dumez
no flags Details
Test page to check ICU encodings support (11.77 KB, text/html)
2016-08-17 17:03 PDT, Chris Dumez
no flags Details
ICU labels support comparison (WK, WK w/ patch, Firefox) (322.23 KB, text/plain)
2016-08-17 19:02 PDT, Chris Dumez
no flags Details
Test page to check ICU encodings support (11.53 KB, text/html)
2016-08-17 19:09 PDT, Chris Dumez
no flags Details
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome) (349.21 KB, patch)
2016-08-17 19:45 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
Patch (70.37 KB, patch)
2016-08-17 19:58 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome) (349.24 KB, application/x-iwork-keynote-sffnumbers)
2016-08-17 19:59 PDT, Chris Dumez
no flags Details
Patch (69.20 KB, patch)
2016-08-18 09:27 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome) (349.15 KB, application/x-iwork-keynote-sffnumbers)
2016-08-18 09:28 PDT, Chris Dumez
no flags Details
Patch (68.81 KB, patch)
2016-08-18 10:39 PDT, Chris Dumez
darin: review+
Details | Formatted Diff | Diff
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome) (349.35 KB, application/x-iwork-keynote-sffnumbers)
2016-08-18 10:41 PDT, Chris Dumez
no flags Details
Patch for landing (68.04 KB, patch)
2016-08-18 11:39 PDT, Chris Dumez
no flags Details | Formatted Diff | Diff
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome) (349.46 KB, application/x-iwork-keynote-sffnumbers)
2016-08-18 11:39 PDT, Chris Dumez
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Dumez 2016-08-16 20:11:42 PDT
Align our encoding labels with the encoding specification:
- https://encoding.spec.whatwg.org/#names-and-labels

This also aligns with Firefox and Chrome.
Comment 1 Chris Dumez 2016-08-16 20:19:42 PDT
Created attachment 286260 [details]
Patch
Comment 2 Build Bot 2016-08-16 20:43:30 PDT
Comment on attachment 286260 [details]
Patch

Attachment 286260 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1884877

Number of test failures exceeded the failure limit.
Comment 3 Build Bot 2016-08-16 20:43:34 PDT
Created attachment 286264 [details]
Archive of layout-test-results from ews113 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews113  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 4 Chris Dumez 2016-08-16 20:56:15 PDT
Created attachment 286265 [details]
Patch
Comment 5 Alexey Proskuryakov 2016-08-16 21:08:50 PDT
Comment on attachment 286265 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286265&action=review

> Source/WebCore/ChangeLog:9
> +        Align our encoding labels with the encoding specification:
> +        - https://encoding.spec.whatwg.org/#names-and-labels

The best approach here would be to have our own encoding name table, and never iterate ICU codecs.
Comment 6 Chris Dumez 2016-08-16 21:20:52 PDT
(In reply to comment #5)
> Comment on attachment 286265 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=286265&action=review
> 
> > Source/WebCore/ChangeLog:9
> > +        Align our encoding labels with the encoding specification:
> > +        - https://encoding.spec.whatwg.org/#names-and-labels
> 
> The best approach here would be to have our own encoding name table, and
> never iterate ICU codecs.

Ok, I will try and refactor tomorrow then.
Comment 7 Build Bot 2016-08-16 21:34:53 PDT
Comment on attachment 286265 [details]
Patch

Attachment 286265 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1885077

Number of test failures exceeded the failure limit.
Comment 8 Build Bot 2016-08-16 21:34:57 PDT
Created attachment 286268 [details]
Archive of layout-test-results from ews117 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews117  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 9 Chris Dumez 2016-08-16 21:36:49 PDT
Created attachment 286269 [details]
Patch
Comment 10 Chris Dumez 2016-08-16 22:11:13 PDT
Created attachment 286274 [details]
Patch
Comment 11 Build Bot 2016-08-16 22:39:34 PDT
Comment on attachment 286274 [details]
Patch

Attachment 286274 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1885316

New failing tests:
fast/encoding/utf-32-big-endian-bom.html
http/tests/misc/submit-get-in-utf32le.html
http/tests/misc/url-in-utf32le.html
http/tests/misc/submit-post-in-utf32be.html
fast/encoding/char-decoding.html
fast/encoding/char-decoding-mac.html
http/tests/misc/submit-get-in-utf32be.html
fast/encoding/utf-32-big-endian-nobom.xml
fast/encoding/char-encoding.html
fast/encoding/GBK/EUC-CN.html
http/tests/misc/submit-post-in-utf32le.html
fast/encoding/utf-32-little-endian-bom.html
fast/encoding/utf-32-little-endian-nobom.xml
fast/encoding/char-encoding-mac.html
http/tests/misc/url-in-utf32be.html
Comment 12 Build Bot 2016-08-16 22:39:39 PDT
Created attachment 286275 [details]
Archive of layout-test-results from ews101 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews101  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 13 Build Bot 2016-08-16 22:56:58 PDT
Comment on attachment 286274 [details]
Patch

Attachment 286274 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1885366

Number of test failures exceeded the failure limit.
Comment 14 Build Bot 2016-08-16 22:57:03 PDT
Created attachment 286277 [details]
Archive of layout-test-results from ews116 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews116  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 15 Build Bot 2016-08-16 23:01:25 PDT
Comment on attachment 286274 [details]
Patch

Attachment 286274 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1885394

New failing tests:
fast/encoding/utf-32-big-endian-bom.html
http/tests/misc/submit-get-in-utf32le.html
http/tests/misc/url-in-utf32le.html
http/tests/misc/submit-post-in-utf32be.html
fast/encoding/char-decoding.html
fast/encoding/char-decoding-mac.html
fast/encoding/char-encoding-mac.html
fast/encoding/utf-32-little-endian-nobom.xml
fast/encoding/char-encoding.html
fast/encoding/GBK/EUC-CN.html
http/tests/misc/submit-post-in-utf32le.html
fast/encoding/utf-32-little-endian-bom.html
fast/encoding/utf-32-big-endian-nobom.xml
http/tests/misc/submit-get-in-utf32be.html
http/tests/misc/url-in-utf32be.html
Comment 16 Build Bot 2016-08-16 23:01:31 PDT
Created attachment 286279 [details]
Archive of layout-test-results from ews106 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews106  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 17 Build Bot 2016-08-17 00:01:13 PDT
Comment on attachment 286274 [details]
Patch

Attachment 286274 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/1885471

New failing tests:
fast/encoding/utf-32-big-endian-bom.html
http/tests/misc/submit-get-in-utf32le.html
http/tests/misc/url-in-utf32le.html
http/tests/misc/submit-post-in-utf32be.html
fast/encoding/char-decoding.html
fast/encoding/char-decoding-mac.html
http/tests/misc/submit-get-in-utf32be.html
fast/encoding/utf-32-big-endian-nobom.xml
fast/encoding/char-encoding.html
fast/encoding/GBK/EUC-CN.html
http/tests/misc/submit-post-in-utf32le.html
fast/encoding/utf-32-little-endian-bom.html
fast/encoding/utf-32-little-endian-nobom.xml
fast/encoding/char-encoding-mac.html
http/tests/misc/url-in-utf32be.html
Comment 18 Build Bot 2016-08-17 00:01:19 PDT
Created attachment 286285 [details]
Archive of layout-test-results from ews124 for ios-simulator-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews124  Port: ios-simulator-elcapitan-wk2  Platform: Mac OS X 10.11.5
Comment 19 Chris Dumez 2016-08-17 12:49:51 PDT
Created attachment 286317 [details]
Patch
Comment 20 Build Bot 2016-08-17 13:18:02 PDT
Comment on attachment 286317 [details]
Patch

Attachment 286317 [details] did not pass mac-ews (mac):
Output: http://webkit-queues.webkit.org/results/1888764

New failing tests:
fast/encoding/utf-32-big-endian-bom.html
http/tests/misc/submit-get-in-utf32le.html
http/tests/misc/url-in-utf32le.html
http/tests/misc/submit-post-in-utf32be.html
fast/encoding/char-decoding.html
fast/encoding/char-decoding-mac.html
http/tests/misc/submit-get-in-utf32be.html
fast/encoding/utf-32-big-endian-nobom.xml
fast/encoding/char-encoding.html
fast/encoding/GBK/EUC-CN.html
http/tests/misc/submit-post-in-utf32le.html
fast/encoding/utf-32-little-endian-bom.html
fast/encoding/utf-32-little-endian-nobom.xml
fast/encoding/char-encoding-mac.html
http/tests/misc/url-in-utf32be.html
Comment 21 Build Bot 2016-08-17 13:18:07 PDT
Created attachment 286320 [details]
Archive of layout-test-results from ews103 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: ews103  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 22 Build Bot 2016-08-17 13:40:55 PDT
Comment on attachment 286317 [details]
Patch

Attachment 286317 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.webkit.org/results/1888848

New failing tests:
fast/encoding/utf-32-big-endian-bom.html
http/tests/misc/submit-get-in-utf32le.html
http/tests/misc/url-in-utf32le.html
http/tests/misc/submit-post-in-utf32be.html
fast/encoding/char-decoding.html
fast/encoding/char-decoding-mac.html
http/tests/misc/submit-get-in-utf32be.html
fast/encoding/utf-32-big-endian-nobom.xml
fast/encoding/char-encoding.html
fast/encoding/GBK/EUC-CN.html
http/tests/misc/submit-post-in-utf32le.html
fast/encoding/utf-32-little-endian-bom.html
fast/encoding/utf-32-little-endian-nobom.xml
fast/encoding/char-encoding-mac.html
http/tests/misc/url-in-utf32be.html
Comment 23 Build Bot 2016-08-17 13:40:59 PDT
Created attachment 286322 [details]
Archive of layout-test-results from ews104 for mac-yosemite-wk2

The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: ews104  Port: mac-yosemite-wk2  Platform: Mac OS X 10.10.5
Comment 24 Build Bot 2016-08-17 13:49:46 PDT
Comment on attachment 286317 [details]
Patch

Attachment 286317 [details] did not pass mac-debug-ews (mac):
Output: http://webkit-queues.webkit.org/results/1888859

New failing tests:
fast/encoding/utf-32-big-endian-bom.html
http/tests/misc/submit-get-in-utf32le.html
http/tests/misc/url-in-utf32le.html
http/tests/misc/submit-post-in-utf32be.html
fast/encoding/char-decoding.html
fast/encoding/char-decoding-mac.html
http/tests/misc/submit-get-in-utf32be.html
fast/encoding/utf-32-big-endian-nobom.xml
fast/encoding/char-encoding.html
fast/encoding/GBK/EUC-CN.html
http/tests/misc/submit-post-in-utf32le.html
fast/encoding/utf-32-little-endian-bom.html
fast/encoding/utf-32-little-endian-nobom.xml
fast/encoding/char-encoding-mac.html
http/tests/misc/url-in-utf32be.html
Comment 25 Build Bot 2016-08-17 13:49:50 PDT
Created attachment 286324 [details]
Archive of layout-test-results from ews112 for mac-yosemite

The attached test failures were seen while running run-webkit-tests on the mac-debug-ews.
Bot: ews112  Port: mac-yosemite  Platform: Mac OS X 10.10.5
Comment 26 Chris Dumez 2016-08-17 14:06:09 PDT
Created attachment 286327 [details]
Patch
Comment 27 Chris Dumez 2016-08-17 16:04:17 PDT
Created attachment 286335 [details]
Test page to check ICU encodings support
Comment 28 Chris Dumez 2016-08-17 16:04:50 PDT
Created attachment 286336 [details]
Safari 10 ICU encodings support
Comment 29 Chris Dumez 2016-08-17 16:05:13 PDT
Created attachment 286337 [details]
Firefox 48 ICU encodings support
Comment 30 Chris Dumez 2016-08-17 16:05:36 PDT
Created attachment 286338 [details]
WebKit w/ patch ICU encodings support
Comment 31 Chris Dumez 2016-08-17 16:15:50 PDT
Created attachment 286340 [details]
ICU changes due to patch (side by side diff)
Comment 32 Chris Dumez 2016-08-17 16:22:46 PDT
Created attachment 286341 [details]
ICU changes due to patch (side by side diff)
Comment 33 Build Bot 2016-08-17 16:25:08 PDT
Comment on attachment 286327 [details]
Patch

Attachment 286327 [details] did not pass ios-sim-ews (ios-simulator-wk2):
Output: http://webkit-queues.webkit.org/results/1889304

New failing tests:
fast/scrolling/ios/scrollTo-at-page-load.html
Comment 34 Build Bot 2016-08-17 16:25:12 PDT
Created attachment 286342 [details]
Archive of layout-test-results from ews122 for ios-simulator-elcapitan-wk2

The attached test failures were seen while running run-webkit-tests on the ios-sim-ews.
Bot: ews122  Port: ios-simulator-elcapitan-wk2  Platform: Mac OS X 10.11.5
Comment 35 Chris Dumez 2016-08-17 17:02:35 PDT
Created attachment 286345 [details]
ICU labels support comparison (WK, WK w/ patch, Firefox)
Comment 36 Chris Dumez 2016-08-17 17:03:04 PDT
Created attachment 286346 [details]
Test page to check ICU encodings support
Comment 37 Alexey Proskuryakov 2016-08-17 17:17:17 PDT
Are all the "SUPPORTED (got: windows-1252)" results actually "unsupported"?
Comment 38 Alexey Proskuryakov 2016-08-17 17:18:58 PDT
I think that non-ASCII compatible encodings like ISO_2022 need to be tested separately, they won't work with a data: URL.
Comment 39 Alexey Proskuryakov 2016-08-17 17:20:26 PDT
I'm actually not sure what "ISO_2022,locale=ja,version=0" is and whether it is ASCII compatible.
Comment 40 Chris Dumez 2016-08-17 18:24:25 PDT
(In reply to comment #37)
> Are all the "SUPPORTED (got: windows-1252)" results actually "unsupported"?

I guess it depends what you mean by "unsupported". If we failed to recognize the charset, we would fallback to "UTF-8". If we use "windows-1252" then I believe it means we recognized the label and decided to use windows-1252.
Comment 41 Chris Dumez 2016-08-17 19:02:51 PDT
Created attachment 286355 [details]
ICU labels support comparison (WK, WK w/ patch, Firefox)

Improved the excel sheet by:
- coloring browsers that agree with each other
- Dropping SUPPORTED / UNSUPPORTED as it was confusing
Comment 42 Chris Dumez 2016-08-17 19:08:02 PDT
As you can see from the spreadsheet, when it comes to ICU labels:
- We are mostly aligned with Firefox
- In cases we are not aligned with Firefox, we maintain our previous behavior (the one in Safari 10).
- A lot of ICU labels give different results after my patch. However, in ALL cases, we are now consistent with Firefox.

All in all, this looks pretty good to me but I am no expert:
- We pass all W3C tests
- The new behavior is mostly aligned with Firefox and is aligned with Safari 10 otherwise.

If you see any encoding that is no longer supported with my patch but you think is important, I am happy to re-introduce them.
Comment 43 Chris Dumez 2016-08-17 19:09:36 PDT
Created attachment 286357 [details]
Test page to check ICU encodings support
Comment 44 Chris Dumez 2016-08-17 19:33:42 PDT
I am updating the spreadsheet with Chrome results too.
Comment 45 Chris Dumez 2016-08-17 19:45:58 PDT
Created attachment 286358 [details]
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome)

Spreadsheet now includes Chrome results too. Chrome is super close to Firefox.
There are however a few cases (marked in red) where Chrome agrees with our old behavior. Therefore, I'll update my patch to maintain our old behavior in this case.
Comment 46 Chris Dumez 2016-08-17 19:58:35 PDT
Created attachment 286360 [details]
Patch
Comment 47 Chris Dumez 2016-08-17 19:59:20 PDT
Created attachment 286361 [details]
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome)
Comment 48 Chris Dumez 2016-08-17 20:03:50 PDT
Comment on attachment 286360 [details]
Patch

I think the results are even more convincing now (with the Chrome results added to the spreadsheet and the couple of fixes to align better with Chrome).
Comment 49 Alexey Proskuryakov 2016-08-17 22:19:38 PDT
Looks pretty convincing to me too.

I feel like windows-1252 is an artifact of testing approach, so maybe testing with data: subframes wasn't sufficiently accurate (sorry about that!). Unknown encoding names are ignored, and I expect that to be the case for all browsers.

Should probably use HTTP to generate Content-Type with a charset. That may also help with some other surprising results, such as ISO-2022, HZ and other complex Asian encodings.

I'm confused by what's going on with aliases that become "replacement" - it seems wrong, but I do not have a theory about what's going on.

I'm not sure if we have test coverage for visual vs. logical RTL encodings - this table alone doesn't tell the full story for encodings in ISO-8859-6 and ISO-8859-8 groups.
Comment 50 Alexey Proskuryakov 2016-08-17 22:24:36 PDT
Comment on attachment 286360 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286360&action=review

> Source/WebCore/platform/text/TextCodecICU.cpp:241
> +        if (!strcmp(encodingName.name, "x-mac-turkish")) {

Are these Mac encodings also supported by TextCodecMac, using TEC?

> Source/WebCore/platform/text/TextCodecLatin1.cpp:-115
> -    // ASCII and Latin-1 both decode as Windows Latin-1 although they retain unique identities.

Is this still accurate?
Comment 51 Chris Dumez 2016-08-18 07:39:39 PDT
(In reply to comment #49)
> Looks pretty convincing to me too.
> 
> I feel like windows-1252 is an artifact of testing approach, so maybe
> testing with data: subframes wasn't sufficiently accurate (sorry about
> that!). Unknown encoding names are ignored, and I expect that to be the case
> for all browsers.
> 
> Should probably use HTTP to generate Content-Type with a charset. That may
> also help with some other surprising results, such as ISO-2022, HZ and other
> complex Asian encodings.
> 
> I'm confused by what's going on with aliases that become "replacement" - it
> seems wrong, but I do not have a theory about what's going on.
> 
> I'm not sure if we have test coverage for visual vs. logical RTL encodings -
> this table alone doesn't tell the full story for encodings in ISO-8859-6 and
> ISO-8859-8 groups.

I actually used http for the latest results because I could not get the data uri I frame to work in Chrome.
Comment 52 Chris Dumez 2016-08-18 07:41:19 PDT
(In reply to comment #50)
> Comment on attachment 286360 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=286360&action=review
> 
> > Source/WebCore/platform/text/TextCodecICU.cpp:241
> > +        if (!strcmp(encodingName.name, "x-mac-turkish")) {
> 
> Are these Mac encodings also supported by TextCodecMac, using TEC?
> 

I will check but I remember I had to add them to make the layout tests pass.

> > Source/WebCore/platform/text/TextCodecLatin1.cpp:-115
> > -    // ASCII and Latin-1 both decode as Windows Latin-1 although they retain unique identities.
> 
> Is this still accurate?

I don't think so, I will drop the comment.
Comment 53 Chris Dumez 2016-08-18 09:27:58 PDT
Created attachment 286371 [details]
Patch

Added back support for "x-macroman" -> "macintosh" alias.
Comment 54 Chris Dumez 2016-08-18 09:28:23 PDT
Created attachment 286372 [details]
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome)
Comment 55 Chris Dumez 2016-08-18 09:43:21 PDT
Relevant mozilla bug for Big5-HKSCS -> Big5 merge:
https://bugzilla.mozilla.org/show_bug.cgi?id=912470
Comment 56 Chris Dumez 2016-08-18 09:48:59 PDT
(In reply to comment #55)
> Relevant mozilla bug for Big5-HKSCS -> Big5 merge:
> https://bugzilla.mozilla.org/show_bug.cgi?id=912470

http://app.midland.com.hk/residential_ebook/%A4%F1%B5%D8%A7Q%A4s-beverly-hill is an example of page using Big5-HKSCS charset.
Comment 57 Chris Dumez 2016-08-18 09:51:58 PDT
(In reply to comment #56)
> (In reply to comment #55)
> > Relevant mozilla bug for Big5-HKSCS -> Big5 merge:
> > https://bugzilla.mozilla.org/show_bug.cgi?id=912470
> 
> http://app.midland.com.hk/residential_ebook/%A4%F1%B5%D8%A7Q%A4s-beverly-
> hill is an example of page using Big5-HKSCS charset.

I personally cannot spot any difference with our without my change on this Big5-HKSCS page.
Comment 58 Chris Dumez 2016-08-18 10:13:32 PDT
(In reply to comment #57)
> (In reply to comment #56)
> > (In reply to comment #55)
> > > Relevant mozilla bug for Big5-HKSCS -> Big5 merge:
> > > https://bugzilla.mozilla.org/show_bug.cgi?id=912470
> > 
> > http://app.midland.com.hk/residential_ebook/%A4%F1%B5%D8%A7Q%A4s-beverly-
> > hill is an example of page using Big5-HKSCS charset.
> 
> I personally cannot spot any difference with our without my change on this
> Big5-HKSCS page.

I do see some differences on http://www.nwstbus.com.hk/fareConcession/concession_detail.aspx?id=49&intLangID=2 though. The character coming after "厚德" seems to be missing with my change.
Comment 59 Chris Dumez 2016-08-18 10:34:08 PDT
(In reply to comment #58)
> (In reply to comment #57)
> > (In reply to comment #56)
> > > (In reply to comment #55)
> > > > Relevant mozilla bug for Big5-HKSCS -> Big5 merge:
> > > > https://bugzilla.mozilla.org/show_bug.cgi?id=912470
> > > 
> > > http://app.midland.com.hk/residential_ebook/%A4%F1%B5%D8%A7Q%A4s-beverly-
> > > hill is an example of page using Big5-HKSCS charset.
> > 
> > I personally cannot spot any difference with our without my change on this
> > Big5-HKSCS page.
> 
> I do see some differences on
> http://www.nwstbus.com.hk/fareConcession/concession_detail.
> aspx?id=49&intLangID=2 though. The character coming after "厚德" seems to be
> missing with my change.

I'll stop marking Big5-HKSCS be an alias for Big5 as it seems to break rendering for some sites. I'll go back to the old behavior of using Big5-HKSCS codec.
Comment 60 Chris Dumez 2016-08-18 10:39:32 PDT
Created attachment 286376 [details]
Patch

Big5-HKSCS is no longer an alias to Big5 to maintain our previous behavior and avoid breakage.
Comment 61 Chris Dumez 2016-08-18 10:41:51 PDT
Created attachment 286377 [details]
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome)
Comment 62 Darin Adler 2016-08-18 10:51:24 PDT
Comment on attachment 286376 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286376&action=review

> LayoutTests/imported/w3c/ChangeLog:12
> +        651. The only checks we're failing is due to "Big5-HKSCS" not being an alias
> +        to "Big5".

What prevents us from fixing this at this time?

> Source/WebCore/platform/text/TextCodecICU.cpp:79
> +DECLARE_ALIASES(UTF_8, "unicode-1-1-utf-8", "utf8", "unicode11utf8", "unicode20utf8", "x-unicode20utf8");

We don’t use ICU for UTF-8, so it’s peculiar to have this inside TextCodecICU.

> Source/WebCore/platform/text/TextCodecICU.cpp:100
> +DECLARE_ALIASES(windows_1252, "ansi_x3.4-1968", "ascii", "cp1252", "cp819", "csisolatin1", "ibm819", "iso-8859-1", "iso-ir-100", "iso8859-1", "iso88591", "iso_8859-1", "iso_8859-1:1987", "l1", "latin1", "us-ascii", "x-cp1252");

We don’t use ICU for this encoding, so it’s peculiar to have this inside TextCodecICU.

> Source/WebCore/platform/text/TextCodecICU.cpp:134
> +    size_t aliasCount;

Seems overkill to use size_t for this. Could be uint8_t, for example. If there’s no actual cost to using 64-bits for these small counts, maybe this is fine.
Comment 63 Darin Adler 2016-08-18 10:52:43 PDT
Comment on attachment 286376 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=286376&action=review

>> LayoutTests/imported/w3c/ChangeLog:12
>> +        to "Big5".
> 
> What prevents us from fixing this at this time?

Oh, I see your earlier comments about that now.
Comment 64 Alexey Proskuryakov 2016-08-18 11:02:24 PDT
I don't feel very good about the Big-5 situation. To just keep the current behavior, we should probably also keep windows-950 group aliases. But longer term, I just can't tell which of the variants is a superset of which, which is supported in Apple fonts, and whether the super naive algorithm in the encoding spec is actually the way to go.
Comment 65 Chris Dumez 2016-08-18 11:39:18 PDT
Created attachment 286380 [details]
Patch for landing
Comment 66 Chris Dumez 2016-08-18 11:39:42 PDT
Created attachment 286381 [details]
ICU labels support comparison (WK, WK w/ patch, Firefox, Chrome)
Comment 67 Chris Dumez 2016-08-18 11:40:26 PDT
(In reply to comment #64)
> I don't feel very good about the Big-5 situation. To just keep the current
> behavior, we should probably also keep windows-950 group aliases. But longer
> term, I just can't tell which of the variants is a superset of which, which
> is supported in Apple fonts, and whether the super naive algorithm in the
> encoding spec is actually the way to go.

Brought back windows-950 in the last iteration and updated the spreadsheet accordingly.
Comment 68 Chris Dumez 2016-08-18 11:41:43 PDT
(In reply to comment #62)
> Comment on attachment 286376 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=286376&action=review
> 
> > LayoutTests/imported/w3c/ChangeLog:12
> > +        651. The only checks we're failing is due to "Big5-HKSCS" not being an alias
> > +        to "Big5".
> 
> What prevents us from fixing this at this time?
> 
> > Source/WebCore/platform/text/TextCodecICU.cpp:79
> > +DECLARE_ALIASES(UTF_8, "unicode-1-1-utf-8", "utf8", "unicode11utf8", "unicode20utf8", "x-unicode20utf8");
> 
> We don’t use ICU for UTF-8, so it’s peculiar to have this inside
> TextCodecICU.
> 
> > Source/WebCore/platform/text/TextCodecICU.cpp:100
> > +DECLARE_ALIASES(windows_1252, "ansi_x3.4-1968", "ascii", "cp1252", "cp819", "csisolatin1", "ibm819", "iso-8859-1", "iso-ir-100", "iso8859-1", "iso88591", "iso_8859-1", "iso_8859-1:1987", "l1", "latin1", "us-ascii", "x-cp1252");
> 
> We don’t use ICU for this encoding, so it’s peculiar to have this inside
> TextCodecICU.
> 
> > Source/WebCore/platform/text/TextCodecICU.cpp:134
> > +    size_t aliasCount;
> 
> Seems overkill to use size_t for this. Could be uint8_t, for example. If
> there’s no actual cost to using 64-bits for these small counts, maybe this
> is fine.

I dropped the encodings for which we don't use ICU and used unsigned for aliasCount.
Comment 69 WebKit Commit Bot 2016-08-18 12:55:13 PDT
Comment on attachment 286380 [details]
Patch for landing

Clearing flags on attachment: 286380

Committed r204605: <http://trac.webkit.org/changeset/204605>
Comment 70 WebKit Commit Bot 2016-08-18 12:55:22 PDT
All reviewed patches have been landed.  Closing bug.
Comment 71 Darin Adler 2016-08-18 14:13:52 PDT
Comment on attachment 286380 [details]
Patch for landing

View in context: https://bugs.webkit.org/attachment.cgi?id=286380&action=review

> Source/WebCore/platform/text/TextCodecICU.cpp:128
> +    const char* const name;

Just noticed that the second const here is redundant and not needed since this is a member of a struct which itself is only used const. The other two members are not marked const, nor need they be.
Comment 72 Chris Dumez 2016-08-18 16:05:41 PDT
(In reply to comment #71)
> Comment on attachment 286380 [details]
> Patch for landing
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=286380&action=review
> 
> > Source/WebCore/platform/text/TextCodecICU.cpp:128
> > +    const char* const name;
> 
> Just noticed that the second const here is redundant and not needed since
> this is a member of a struct which itself is only used const. The other two
> members are not marked const, nor need they be.

Fixed in <http://trac.webkit.org/changeset/204613>.