Remove support for iOS-only softbank-sjis encoding if possible. I'm not entirely sure how to check. There are apparently also docomo-sjis and kddi-sjis, but we don't support those in WebKit. See bug #19309 for more info.
I meant bug #179309
By my reading of the code, it seems like the softbank-sjis text codec can't possibly actually work, which might be evidence that it's actually not needed. The code looks like this: #if PLATFORM(IOS) // See comment above in registerEncodingNames(). int32_t i = 0; for (const char* alias = textCodecMacAliases[i]; alias; alias = textCodecMacAliases[++i]) registrar(alias, create, 0); #endif But that will result in a TextCodecICU with a null m_canonicalConverterName, which will therefore always fail to create the ICU converter, because it will pass null to ucnv_open.
(In reply to Maciej Stachowiak from comment #2) > But that will result in a TextCodecICU with a null m_canonicalConverterName, > which will therefore always fail to create the ICU converter, because it > will pass null to ucnv_open. I think you are right that it doesn’t work. From my reading of the code, it won’t fail, but also won’t do anything useful. From the ucnv_open documentation: "If this parameter is NULL, the default converter will be used." So it will create ICU's "default converter", whatever that is; reading the documentation doesn’t really make it all that clear to me what it is, although it can be changed with ucnv_setDefaultName. And also, it seems that TextCodecICU::createICUConverter will crash with a null dereference calling strcmp, if this is tried again after the converter is cached. So, in summary, should almost certainly remove, highly like it does not do anything useful!
Indeed, <meta charset="softbank-sjis"> doesn't work! May be good to also check if it cannot be specified using WebKit API or SPI by a client.
I can’t see any way it would work given the way the code is written.
(In reply to Alexey Proskuryakov from comment #4) > Indeed, <meta charset="softbank-sjis"> doesn't work! May be good to also > check if it cannot be specified using WebKit API or SPI by a client. Do you know how I could make a test case for this? I'm not sure what the softbank-sjis encoding is supposed to look like.
Created attachment 326408 [details] Patch
Comment on attachment 326408 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=326408&action=review This patch only has LayoutTests changes, nothing in WebCore. > LayoutTests/fast/encoding/charset-softbank-sjis.html:5 > + <meta charset=utf-8> Pretty sure that default encoding is well defined in DRT/WKTR, so I'd omit this line. I feel like we are unnecessarily relying on parser details here. The small downside is that the test will fail locally. One way to correct for that is to check that document.charset is not softbank-sjis (or doesn't include "jis", for some resilience). Or you could print the actual charset and have text explaining that the test passes as long as it's not softbank-sjis. > LayoutTests/fast/encoding/charset-softbank-sjis.html:9 > + if (window.testRunner) > + testRunner.dumpAsText(); js-test-pre.js enables dumpAsText. I recommend using js-test.js in new tests instead, as is saves one line from boilerplate.
(In reply to Alexey Proskuryakov from comment #8) > Comment on attachment 326408 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=326408&action=review > > This patch only has LayoutTests changes, nothing in WebCore. Oops, I'll upload a new one with the actual code change. > > > LayoutTests/fast/encoding/charset-softbank-sjis.html:5 > > + <meta charset=utf-8> > > Pretty sure that default encoding is well defined in DRT/WKTR, so I'd omit > this line. I feel like we are unnecessarily relying on parser details here. > > The small downside is that the test will fail locally. One way to correct > for that is to check that document.charset is not softbank-sjis (or doesn't > include "jis", for some resilience). Or you could print the actual charset > and have text explaining that the test passes as long as it's not > softbank-sjis. This was the only way I could figure out to get the test to behave consistently between DRT/WKTR and when opened in Safari. DTR/WKTR defaults to windows-1252, while Safari defaults to utf-8. I could make a test for encoding name that doesn't depend on the exact encoding chosen as you suggest. But I am not sure how I could do the specific character test, where I check that a UTF-8 e with acute accent is decoded as such. Can you think of a way to do that that's less weird than depending on HTML rules for parsing <meta charset>? > > > LayoutTests/fast/encoding/charset-softbank-sjis.html:9 > > + if (window.testRunner) > > + testRunner.dumpAsText(); > > js-test-pre.js enables dumpAsText. > > I recommend using js-test.js in new tests instead, as is saves one line from > boilerplate. Will do.
Created attachment 326415 [details] Patch, doesn't yet address request to avoid depending on parsing details
> But I am not sure how I could do the specific character test, where I check that a UTF-8 e with acute accent is decoded as such. What is the reason why you want to decode a character? I think that checking document.charset is a good test for this fix.
(In reply to Alexey Proskuryakov from comment #11) > > But I am not sure how I could do the specific character test, where I check that a UTF-8 e with acute accent is decoded as such. > > What is the reason why you want to decode a character? I think that checking > document.charset is a good test for this fix. Some of our other encoding tests do that, and it seems worthwhile to check that the document decodes as expected in addition to checking the exposed encoding name.
Comment on attachment 326415 [details] Patch, doesn't yet address request to avoid depending on parsing details Attachment 326415 [details] did not pass mac-ews (mac): Output: http://webkit-queues.webkit.org/results/5157050 New failing tests: fast/encoding/charset-softbank-sjis.html
Created attachment 326416 [details] Archive of layout-test-results from ews103 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-ews. Bot: ews103 Port: mac-elcapitan Platform: Mac OS X 10.11.6
Comment on attachment 326415 [details] Patch, doesn't yet address request to avoid depending on parsing details Attachment 326415 [details] did not pass mac-wk2-ews (mac-wk2): Output: http://webkit-queues.webkit.org/results/5157002 New failing tests: fast/encoding/charset-softbank-sjis.html
Created attachment 326417 [details] Archive of layout-test-results from ews107 for mac-elcapitan-wk2 The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews. Bot: ews107 Port: mac-elcapitan-wk2 Platform: Mac OS X 10.11.6
I personally think that it's enough to check the exposed name, but if you want to test end to end, the workaround is to test in a subframe - the encoding of the top frame is inherited as default for the child frame, as long as they are same origin.
Comment on attachment 326415 [details] Patch, doesn't yet address request to avoid depending on parsing details Attachment 326415 [details] did not pass ios-sim-ews (ios-simulator-wk2): Output: http://webkit-queues.webkit.org/results/5157104 New failing tests: fast/encoding/charset-softbank-sjis.html
Created attachment 326421 [details] Archive of layout-test-results from ews126 for ios-simulator-wk2 The attached test failures were seen while running run-webkit-tests on the ios-sim-ews. Bot: ews126 Port: ios-simulator-wk2 Platform: Mac OS X 10.12.6
Comment on attachment 326415 [details] Patch, doesn't yet address request to avoid depending on parsing details Attachment 326415 [details] did not pass mac-debug-ews (mac): Output: http://webkit-queues.webkit.org/results/5157100 New failing tests: fast/encoding/charset-softbank-sjis.html
Created attachment 326423 [details] Archive of layout-test-results from ews117 for mac-elcapitan The attached test failures were seen while running run-webkit-tests on the mac-debug-ews. Bot: ews117 Port: mac-elcapitan Platform: Mac OS X 10.11.6
I accidentally took out the test for encoding name. I'll upload a new patch with that fixed, plus Alexey's suggestion to use an iframe for consistent encoding results.
Created attachment 326431 [details] Patch with test that works, and done the way Alexey suggested
Comment on attachment 326431 [details] Patch with test that works, and done the way Alexey suggested View in context: https://bugs.webkit.org/attachment.cgi?id=326431&action=review > LayoutTests/fast/encoding/charset-softbank-sjis-expected.txt:5 > +TEST COMPLETE > +Test PASSED if this renders as an accented e: âéâ. This is not right. > LayoutTests/fast/encoding/resources/softbank-sjis-iframe.html:5 > +<script src="../../../resources/js-test.js"></script> I think that this should be in the main frame. Otherwise, it will call notifyDone once the subframe is done loading, which may be before the main frame is done. > LayoutTests/fast/encoding/resources/softbank-sjis-iframe.html:11 > +Test PASSED if this renders as an accented e: âéâ. I think that it would work better if this were at the top of <body> (and I'd make it an explicit element). But also, why not put it into a script as a shouldBe? I think that this should do it: shouldBe("'\xe9'", "'é'");
Created attachment 326433 [details] Patch with further tweaks to the regression test
Comment on attachment 326433 [details] Patch with further tweaks to the regression test View in context: https://bugs.webkit.org/attachment.cgi?id=326433&action=review Great! > Source/WebCore/ChangeLog:9 > + (WebCore::TextCodecICU::registerEncodingNames): Remove support for softbanj-sjis "softbank"
Created attachment 326461 [details] Patch for landing
Comment on attachment 326461 [details] Patch for landing Clearing flags on attachment: 326461 Committed r224636: <https://trac.webkit.org/changeset/224636>
All reviewed patches have been landed. Closing bug.
<rdar://problem/35562183>