Bug 179460

Summary: Remove TEC decoders that duplicate ICU decoders
Product: WebKit Reporter: Maciej Stachowiak <mjs>
Component: TextAssignee: Myles C. Maxfield <mmaxfield>
Status: RESOLVED FIXED    
Severity: Normal CC: achristensen, ap, commit-queue, darin, mmaxfield, ryanhaddad, webkit-bug-importer
Priority: P2 Keywords: InRadar
Version: Safari Technology Preview   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on:    
Bug Blocks: 179303    
Attachments:
Description Flags
Patch
none
Patch for landing none

Maciej Stachowiak
Reported 2017-11-08 17:17:14 PST
Remove TEC decoders that duplicate ICU decoders. Some of them define additional aliases, but we can just copy those. There's no need to implement encodings twice.
Attachments
Patch (9.24 KB, patch)
2017-11-09 21:13 PST, Maciej Stachowiak
no flags
Patch for landing (9.99 KB, patch)
2017-11-10 10:55 PST, Maciej Stachowiak
no flags
Myles C. Maxfield
Comment 1 2017-11-09 11:31:51 PST
I would love to do this.
Maciej Stachowiak
Comment 2 2017-11-09 19:14:08 PST
This is the relevant data for when Mac has duplicate encodings and possible extra names: EXTRA NAME: macOS WebKit knows extra nonstandard name isoceltic for ISO-8859-14 EXTRA NAME: macOS WebKit knows extra nonstandard name iso8859101992 for ISO-8859-10 EXTRA NAME: macOS WebKit knows extra nonstandard name isoir226 for ISO-8859-16 EXTRA NAME: macOS WebKit knows extra nonstandard name iso8859141998 for ISO-8859-14 EXTRA NAME: macOS WebKit knows extra nonstandard name iso8859162001 for ISO-8859-16 EXTRA NAME: macOS WebKit knows extra nonstandard name l10 for ISO-8859-16 EXTRA NAME: macOS WebKit knows extra nonstandard name isoir157 for ISO-8859-10 EXTRA NAME: macOS WebKit knows extra nonstandard name isoir199 for ISO-8859-14 EXTRA NAME: macOS WebKit knows extra nonstandard name latin10 for ISO-8859-16 EXTRA NAME: macOS WebKit knows extra nonstandard name latin8 for ISO-8859-14 EXTRA NAME: macOS WebKit knows extra nonstandard name l8 for ISO-8859-14 MULTIPLE CODECS: macOS WebKit supports encoding csisolatin6 through multiple implementations ['TEC', 'ICU'] MULTIPLE CODECS: macOS WebKit supports encoding iso-8859-16 through multiple implementations ['TEC', 'ICU'] MULTIPLE CODECS: macOS WebKit supports encoding iso-8859-14 through multiple implementations ['TEC', 'ICU'] MULTIPLE CODECS: macOS WebKit supports encoding iso-8859-11 through multiple implementations ['TEC', 'ICU'] MULTIPLE CODECS: macOS WebKit supports encoding iso-8859-10 through multiple implementations ['TEC', 'ICU'] MULTIPLE CODECS: macOS WebKit supports encoding latin6 through multiple implementations ['TEC', 'ICU'] MULTIPLE CODECS: macOS WebKit supports encoding koi8-u through multiple implementations ['TEC', 'ICU'] MULTIPLE CODECS: macOS WebKit supports encoding l6 through multiple implementations ['TEC', 'ICU']
Maciej Stachowiak
Comment 3 2017-11-09 21:13:36 PST
Darin Adler
Comment 4 2017-11-10 09:21:13 PST
Comment on attachment 326548 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=326548&action=review Fewer encodings in this category than I expected, but I understand that this is just one step on the path. > Source/WebCore/platform/text/TextEncodingRegistry.cpp:125 > + ASSERT_WITH_MESSAGE(!oldAtomicName, "Duplicate text encoding name %s for %s (previously registered as %s)", alias, atomicName, oldAtomicName); Technically, making this an assertion requires changing the #if above to ASSERT_DISABLED rather than ERROR_DISABLED. Makes me think our Assertions.h macros are a bit too complex.
Darin Adler
Comment 5 2017-11-10 09:35:45 PST
Comment on attachment 326548 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=326548&action=review >> Source/WebCore/platform/text/TextEncodingRegistry.cpp:125 >> + ASSERT_WITH_MESSAGE(!oldAtomicName, "Duplicate text encoding name %s for %s (previously registered as %s)", alias, atomicName, oldAtomicName); > > Technically, making this an assertion requires changing the #if above to ASSERT_DISABLED rather than ERROR_DISABLED. Makes me think our Assertions.h macros are a bit too complex. Another way to do it is to get rid of this function and at the single call site put this one long line of code: ASSERT_WITH_MESSAGE(!textEncodingNameMap->contains(alias), "Duplicate text encoding name %s for %s (previously registered as %s)", alias, atomicName, textEncodingNameMap->get(alias)); We needed a function before only because the code was not a simple one-liner.
Maciej Stachowiak
Comment 6 2017-11-10 10:40:49 PST
(In reply to Darin Adler from comment #5) > > Another way to do it is to get rid of this function and at the single call > site put this one long line of code: > > ASSERT_WITH_MESSAGE(!textEncodingNameMap->contains(alias), "Duplicate > text encoding name %s for %s (previously registered as %s)", alias, > atomicName, textEncodingNameMap->get(alias)); > > We needed a function before only because the code was not a simple one-liner. I like that suggestion, I'll do it that way.
Maciej Stachowiak
Comment 7 2017-11-10 10:55:01 PST
Created attachment 326605 [details] Patch for landing
WebKit Commit Bot
Comment 8 2017-11-10 11:26:28 PST
Comment on attachment 326605 [details] Patch for landing Clearing flags on attachment: 326605 Committed r224700: <https://trac.webkit.org/changeset/224700>
WebKit Commit Bot
Comment 9 2017-11-10 11:26:30 PST
All reviewed patches have been landed. Closing bug.
Ryan Haddad
Comment 10 2017-11-10 12:11:16 PST
This change broke the debug builds: /Volumes/Data/slave/highsierra-debug/build/Source/WebCore/platform/text/TextEncodingRegistry.cpp:140:150: error: format specifies type 'char *' but the argument has type 'bool' [-Werror,-Wformat] https://build.webkit.org/builders/Apple%20High%20Sierra%20Debug%20%28Build%29/builds/1292
Alex Christensen
Comment 11 2017-11-10 12:11:28 PST
Maciej Stachowiak
Comment 12 2017-11-10 18:46:30 PST
(In reply to Alex Christensen from comment #11) > http://trac.webkit.org/r224701 Thanks for fixing it for me. I probably should have let the final path run through EWS.
Radar WebKit Bug Importer
Comment 13 2017-11-15 09:38:45 PST
Note You need to log in before you can comment on or make changes to this bug.