Align EUC-KR encoding with Chrome, Firefox, and specification
Created attachment 407776 [details] Patch
Created attachment 407807 [details] Patch
Created attachment 407815 [details] Patch
Comment on attachment 407815 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407815&action=review > Source/WebCore/ChangeLog:3 > + Align EUC-KR encoding with Chrome, Firefox, and specification Will we be using ICU codecs at all once you are done? :-O
Yes. Most of them are compliant with modern standards, and https://bugs.webkit.org/show_bug.cgi?id=216094 shows evidence that they are making a nonzero amount of changes to comply with modern standards (windows-1255 changed to be "correct" on Big Sur and iOS 14), but I'm not going to just wait for them to move at the speed of browsers.
I think we might want to create an automated way to try ICU in the future so we it’s easy enough to know when to turn these back off and rely on ICU again. Worth thinking about making this very easy to notice and fix.
Comment on attachment 407815 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407815&action=review > Source/WebCore/platform/text/TextCodecCJK.cpp:340 > + static auto* table = [] { > + auto table = new EUCKREncodingIndex(); > + for (size_t i = 0; i < WTF_ARRAY_LENGTH(eucKRDecodingIndex); i++) > + (*table)[i] = { eucKRDecodingIndex[i].second, eucKRDecodingIndex[i].first }; > + std::sort(table->begin(), table->end(), [] (auto& a, auto& b) { > + return a.first < b.first; > + }); > + return table; > + }(); > + return *table; Since we allocate these tables on the heap, can we figure out a way to make these smaller, some kind of simple compression scheme? Not about this particular one, but all the giant tables. > Source/WebCore/platform/text/TextCodecCJK.cpp:372 > + result.append(pointer / 190 + 0x81); > + result.append(pointer % 190 + 0x41); Where do these magic numbers come from? > Source/WebCore/platform/text/TextCodecCJK.cpp:385 > + uint16_t pointer = (lead - 0x81) * 190 + byte - 0x41; Where do these magic numbers come from?
Comment on attachment 407815 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407815&action=review >> Source/WebCore/platform/text/TextCodecCJK.cpp:340 >> + return *table; > > Since we allocate these tables on the heap, can we figure out a way to make these smaller, some kind of simple compression scheme? Not about this particular one, but all the giant tables. If we're using them then we want them in memory somewhere in this form. I think we should compress them then decompress them if needed to reduce the size of the WebCore dylib >> Source/WebCore/platform/text/TextCodecCJK.cpp:372 >> + result.append(pointer % 190 + 0x41); > > Where do these magic numbers come from? https://encoding.spec.whatwg.org/#euc-kr-encoder >> Source/WebCore/platform/text/TextCodecCJK.cpp:385 >> + uint16_t pointer = (lead - 0x81) * 190 + byte - 0x41; > > Where do these magic numbers come from? https://encoding.spec.whatwg.org/#euc-kr-decoder
Comment on attachment 407815 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=407815&action=review >>> Source/WebCore/platform/text/TextCodecCJK.cpp:340 >>> + return *table; >> >> Since we allocate these tables on the heap, can we figure out a way to make these smaller, some kind of simple compression scheme? Not about this particular one, but all the giant tables. > > If we're using them then we want them in memory somewhere in this form. I think we should compress them then decompress them if needed to reduce the size of the WebCore dylib Yes, that’s what was talking about. The tables in the binary, the ones we use to as input to a function to allocate the tables we actually use. I was suggesting that we compress those.
fast/block/margin-collapse/103.html did not change with this patch, but some of the iOS failures (also not from this patch) are things I need to rebase.
http://trac.webkit.org/r266520
<rdar://problem/68282300>