123974 – Remove Japanese encoding detector

RESOLVED WONTFIX 123974


https://bugs.webkit.org/show_bug.cgi?id=123974

Summary Remove Japanese encoding detector

Ryosuke Niwa

Reported 2013-11-07 01:17:23 PST

Consider merging https://chromium.googlesource.com/chromium/blink/+/3b87a35b8ccb719156c4af78968915de96e23517

1. Remove Japanese encoding detector. Often, it misdetects. More importantly, it's not used at all (it's a dead code unless the default encoding is Japanese [1]).
2. Try to sniff meta tag for charset before auto-detecting.

[1] That's because 'shouldAutoDetect' and isJapaneseEncoding() cannot be satisfied at the same time except when the default encoding is Japanese (e.g. Shift_JIS).

The only downside is that we'll misdecode a page in ISO-2022-JP without any charset declaration (HTTP Content-Type header, HTML meta, XML head, CSS, etc.) even when the default encoding is Japanese. However, the percentage of ISO-2022-JP pages without charset declaration is minuscule and it's not worth having this code.
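The gating described in footnote [1] and the ISO-2022-JP downside can be sketched roughly as follows. This is a minimal Python model of the claim, not WebKit's or Blink's actual decoder: `choose_encoding`, `sniff_meta_charset`, `detect_iso2022jp`, the encoding-name set, and the 1024-byte sniff window are all hypothetical names and assumptions introduced for illustration.

```python
import re

# Hypothetical model of the behaviour described above; none of these names
# come from WebKit or Blink.
JAPANESE_ENCODINGS = {"shift_jis", "euc_jp", "iso2022_jp"}

META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def sniff_meta_charset(body):
    """Return the charset named in a <meta> tag near the top of the page, if any."""
    match = META_CHARSET.search(body[:1024])  # window size is an assumption
    return match.group(1).decode("ascii").lower() if match else None


def detect_iso2022jp(body):
    """Toy detector: ISO-2022-JP switches to JIS X 0208 with ESC $ B or ESC $ @."""
    return "iso2022_jp" if b"\x1b$B" in body or b"\x1b$@" in body else None


def choose_encoding(body, http_charset, default_encoding, detector=None):
    # An explicit declaration wins, so auto-detection is switched off.
    declared = http_charset or sniff_meta_charset(body)
    if declared:
        return declared
    # No declaration: the current encoding is the default encoding, so the
    # Japanese detector is reachable only when that default is Japanese.
    if detector and default_encoding in JAPANESE_ENCODINGS:
        return detector(body) or default_encoding
    return default_encoding


page = "<p>日本語</p>".encode("iso2022_jp")  # undeclared ISO-2022-JP page
with_detector = choose_encoding(page, None, "shift_jis", detect_iso2022jp)
without_detector = choose_encoding(page, None, "shift_jis")
print(with_detector, repr(page.decode(with_detector)))
print(without_detector, repr(page.decode(without_detector, errors="replace")))
```

Under these assumptions, the detector only changes the outcome for an undeclared page when the default encoding is already Japanese; with a non-Japanese default or any declared charset it is never consulted, which is the "dead code" observation. Dropping the detector leaves the undeclared ISO-2022-JP page decoded as the default encoding, which is the downside the commit message accepts.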

Attachments
/Applications/Safari Technology Preview.app (deleted), 2016-11-17 13:22 PST, rurumi663@gmail.com

Ryosuke Niwa

Comment 1 2013-11-07 01:18:17 PST

I need to figure out whether the claims made in the commit message are actually true and apply to WebKit. But if they are, deleting this big chunk of code seems like a good idea.

Alexey Proskuryakov

Comment 2 2013-11-07 09:12:41 PST

> it's a dead code unless the default encoding is Japanese

This part is correct, and by design. Note that, I think, Chromium has ICU encoding auto-detection enabled, so the consequences of removing this alternative detector are much smaller for them.

Darin Adler

Comment 3 2013-11-07 11:48:10 PST

I believe this is still a valuable feature for Safari users in Japan. Is that wrong, Ryosuke?

Ryosuke Niwa

Comment 4 2013-11-07 13:23:50 PST

(In reply to comment #3)
> I believe this is still a valuable feature for Safari users in Japan. Is that wrong, Ryosuke?

I need to figure that out.

rurumi663@gmail.com

Comment 5 2016-11-17 13:22:35 PST

Created attachment 295076 /Applications/Safari Technology Preview.app

Sam Sneddon [:gsnedders]

Comment 6 2022-05-05 22:13:18 PDT

Note that Chrome has since reversed course and gone back to doing content detection. See https://github.com/whatwg/encoding/issues/68 and https://hsivonen.fi/chardetng/ for what Firefox migrated to, along with a write-up of the status quo here.

Ryosuke Niwa

Comment 7 2022-05-05 23:07:29 PDT

Sounds like WONTFIX. At least we're not gonna merge this old Blink patch.


Status RESOLVED

Resolution WONTFIX

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component Text

Assignee Nobody

Reported 2013-11-07 01:17 PST

Modified 2022-05-05 23:07 PDT

CC List 7 users

Keywords BlinkMergeCandidate