Consider merging https://chromium.googlesource.com/chromium/blink/+/3b87a35b8ccb719156c4af78968915de96e23517
1. Remove Japanese encoding detector. Often, it misdetects. More importantly, it's not used at all (it's a dead code unless the default encoding is Japanese [1]).
2. Try to sniff meta tag for charset before auto-detecting.
[1] That's because 'shouldAutoDetect' and isJapaneseEncoding() cannot be satisfied at the same time except when the default encoding is Japanese (e.g. Shift_JIS).
The only downside is that we'll misdecode a page in ISO-2022-JP without any charset declaration (HTTP Content-Type header, HTML meta, XML head, CSS, etc.) even when the default encoding is Japanese. However, the percentage of ISO-2022-JP pages without charset declaration is minuscule and it's not worth having this code.
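For readers unfamiliar with the ordering in point 2, here is a minimal Python sketch of "sniff the meta tag before auto-detecting". It is not the Blink or WebKit implementation: the names `sniff_meta_charset` and `choose_encoding`, the 1024-byte scan window, the simplified regex, and the `windows-1252` default are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified pattern: matches both <meta charset="..."> and
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
META_CHARSET_RE = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?\s*([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def sniff_meta_charset(data: bytes, limit: int = 1024):
    """Return the lowercased charset label from a <meta> tag found in the
    first `limit` bytes, or None. The window size is an assumption."""
    match = META_CHARSET_RE.search(data[:limit])
    return match.group(1).decode("ascii").lower() if match else None


def choose_encoding(data, http_charset=None, default="windows-1252",
                    auto_detect=None):
    """Pick an encoding label: explicit declarations win, then the
    optional detector, then the default encoding."""
    if http_charset:                  # charset from the HTTP Content-Type header
        return http_charset.lower()
    declared = sniff_meta_charset(data)
    if declared:                      # meta declaration beats auto-detection
        return declared
    if auto_detect is not None:       # detector runs only when nothing is declared
        guessed = auto_detect(data)
        if guessed:
            return guessed
    return default
```

With this ordering, a page that declares its charset never reaches the detector, so a misdetection can only affect pages with no declaration at all.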
I need to figure out whether the claims made in the commit message are actually true and apply to WebKit. But if they are, deleting this big chunk of code seems like a good idea.
> it's a dead code unless the default encoding is Japanese
This part is correct, and by design. Note that Chromium has ICU encoding auto-detection enabled, I think, so the consequences of removing this alternative detector are much smaller for them.
I believe this is still a valuable feature for Safari users in Japan. Is that wrong, Ryosuke?
(In reply to comment #3)
> I believe this is still a valuable feature for Safari users in Japan. Is that wrong, Ryosuke?
I need to figure that out.
Created attachment 295076 /Applications/Safari Technology Preview.app
Note that Chrome has since reversed course and gone back to doing content detection. See https://github.com/whatwg/encoding/issues/68, and https://hsivonen.fi/chardetng/ for what Firefox migrated to, along with a write-up of the status quo.
Sounds like we won't fix this. At least, we're not going to merge this old Blink patch.