Bug 123974 - Remove Japanese encoding detector
Summary: Remove Japanese encoding detector
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: Text
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
Importance: P2 Normal
Assignee: Nobody
URL:
Keywords: BlinkMergeCandidate
Depends on:
Blocks:
 
Reported: 2013-11-07 01:17 PST by Ryosuke Niwa
Modified: 2022-05-05 23:07 PDT
CC List: 7 users

See Also:


Attachments
/Applications/Safari Technology Preview.app (deleted)
2016-11-17 13:22 PST, rurumi663@gmail.com

Description Ryosuke Niwa 2013-11-07 01:17:23 PST
Consider merging https://chromium.googlesource.com/chromium/blink/+/3b87a35b8ccb719156c4af78968915de96e23517

1. Remove the Japanese encoding detector. It often misdetects. More importantly, it's not used at all (it's dead code unless the default encoding is Japanese [1]).
2. Try to sniff the meta tag for a charset before auto-detecting.

[1] That's because 'shouldAutoDetect' and isJapaneseEncoding() cannot be satisfied at the same time except when the default encoding is Japanese (e.g. Shift_JIS). 
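A rough sketch of the ordering that item 2 describes: honor an explicit declaration first, then sniff an early meta tag, and only then fall back to a detector or the default encoding. This is an illustration in Python, not the Blink or WebKit code. The names `sniff_meta_charset` and `choose_encoding`, the 1024-byte window, and the regular expression are all assumptions made for the example, and the regex is a much-simplified stand-in for a real HTML prescan.

```python
import re

# Hypothetical, simplified pattern: finds charset=... inside an early <meta> tag.
# A real HTML prescan handles many more cases (comments, attributes, BOMs).
_META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def sniff_meta_charset(data, limit=1024):
    """Return the lowercased charset label from an early <meta> tag, or None."""
    match = _META_CHARSET.search(data[:limit])
    return match.group(1).decode("ascii").lower() if match else None


def choose_encoding(http_charset, data, default_encoding, detector=None):
    """Pick an encoding label and report where it came from.

    Order: HTTP header, then <meta>, then the optional detector, then default.
    """
    if http_charset:
        return http_charset.lower(), "http"
    meta = sniff_meta_charset(data)
    if meta:
        return meta, "meta"
    if detector is not None:
        guessed = detector(data)
        if guessed:
            return guessed, "detected"
    return default_encoding, "default"


page = b'<html><head><meta charset="Shift_JIS"></head><body></body></html>'
print(choose_encoding(None, page, "windows-1252"))
```

With this ordering, a detector only ever sees pages that carry no declaration at all, which is the case footnote [1] is concerned with.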

The only downside is that we'll misdecode a page in ISO-2022-JP without any charset declaration (HTTP Content-Type header, HTML meta, XML head, CSS, etc.) even when the default encoding is Japanese. However, the percentage of ISO-2022-JP pages without a charset declaration is minuscule, and it's not worth keeping this code for them.
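To make the downside concrete, here is a small Python sketch of what happens when undeclared ISO-2022-JP bytes are decoded with a Shift_JIS default. It uses only Python's built-in codecs. The helper `looks_like_iso2022_jp` is hypothetical and is not the detector the Blink patch removes; it only checks for the escape sequences that ISO-2022-JP uses to switch character sets.

```python
# Escape sequences ISO-2022-JP uses to switch character sets.
ISO2022_JP_ESCAPES = (
    b"\x1b$@",  # JIS C 6226-1978
    b"\x1b$B",  # JIS X 0208-1983
    b"\x1b(B",  # ASCII
    b"\x1b(J",  # JIS X 0201 Roman
)


def looks_like_iso2022_jp(data):
    """Hypothetical check: True if any ISO-2022-JP escape sequence appears."""
    return any(esc in data for esc in ISO2022_JP_ESCAPES)


text = "日本語のページ"
raw = text.encode("iso2022_jp")      # 7-bit bytes with ESC-based switching

wrong = raw.decode("shift_jis")      # what a Shift_JIS default would produce
right = raw.decode("iso2022_jp")     # what the page author intended

print(repr(wrong))                   # mojibake: ESC plus ASCII-range characters
print(right == text)
print(looks_like_iso2022_jp(raw))
```

Because ISO-2022-JP is a 7-bit encoding, the Shift_JIS decoder does not fail on it. It silently yields ASCII-range garbage, so the page is misdecoded rather than rejected.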
Comment 1 Ryosuke Niwa 2013-11-07 01:18:17 PST
I need to figure out whether the claims made in the commit message are actually true and apply to WebKit.

But if they are, deleting the big chunk of code seems like a good idea.
Comment 2 Alexey Proskuryakov 2013-11-07 09:12:41 PST
> it's dead code unless the default encoding is Japanese

This part is correct, and by design.

Note that Chromium has ICU encoding auto-detection enabled, I think, so the consequences of removing this alternative detector are much smaller for them.
Comment 3 Darin Adler 2013-11-07 11:48:10 PST
I believe this is still a valuable feature for Safari users in Japan. Is that wrong, Ryosuke?
Comment 4 Ryosuke Niwa 2013-11-07 13:23:50 PST
(In reply to comment #3)
> I believe this is still a valuable feature for Safari users in Japan. Is that wrong, Ryosuke?

I need to figure that out.
Comment 5 rurumi663@gmail.com 2016-11-17 13:22:35 PST
Created attachment 295076 [details]
/Applications/Safari Technology Preview.app
Comment 6 Sam Sneddon [:gsnedders] 2022-05-05 22:13:18 PDT
Note that Chrome has since reverted course and gone back to doing content detection again.

See https://github.com/whatwg/encoding/issues/68 and https://hsivonen.fi/chardetng/ for what Firefox migrated to, along with a write-up of the status quo here.
Comment 7 Ryosuke Niwa 2022-05-05 23:07:29 PDT
Sounds like WONTFIX. At least we're not going to merge this old Blink patch.