WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED WONTFIX
75594
Adding feature that suggests turning on the encoding detector.
https://bugs.webkit.org/show_bug.cgi?id=75594
Summary
Adding feature that suggests turning on the encoding detector.
Keishi Hattori
Reported
2012-01-04 18:12:20 PST
We will be adding a new feature for Chrome that will rescue users from garbled text (character encoding mismatch). Many CJK users come across sites with garbled text, but the encoding menu is placed deep inside the menu so they have little chance of finding and fixing it. We are working on a new infobar that offers to turn on the encoding detector when necessary.
Attachments
Patch
(10.53 KB, patch)
2012-01-04 23:45 PST
,
Keishi Hattori
no flags
Details
Formatted Diff
Diff
Patch
(10.50 KB, patch)
2012-01-06 03:42 PST
,
Keishi Hattori
no flags
Details
Formatted Diff
Diff
Garbled text info bar
(179.74 KB, image/png)
2012-01-09 21:34 PST
,
Keishi Hattori
no flags
Details
Patch
(10.42 KB, patch)
2012-01-16 04:20 PST
,
Keishi Hattori
no flags
Details
Formatted Diff
Diff
Patch
(76.25 KB, patch)
2012-02-15 01:52 PST
,
Keishi Hattori
no flags
Details
Formatted Diff
Diff
Patch
(87.42 KB, patch)
2012-02-15 02:58 PST
,
Keishi Hattori
no flags
Details
Formatted Diff
Diff
Patch
(87.34 KB, patch)
2012-02-22 09:52 PST
,
Keishi Hattori
tkent
: review-
Details
Formatted Diff
Diff
Show Obsolete
(5)
View All
Add attachment
proposed patch, testcase, etc.
Keishi Hattori
Comment 1
2012-01-04 23:45:47 PST
Created
attachment 121225
[details]
Patch
Keishi Hattori
Comment 2
2012-01-04 23:49:56 PST
(In reply to
comment #1
)
> Created an attachment (id=121225) [details] > Patch
This patch adds two methods shouldUseEncodingDetector and reloadFromCache. shouldUseEncodingDetector() returns true when the current document encoding and auto detected encoding don't match. If this returns true we will show the infobar. reloadFromCache() reloads the page from the backForwardCache. This is used to quickly reload the page with auto detect turned on. I tried hard to make the change to WebCore minimal and there will be no behavior change to ports other than chromium.
WebKit Review Bot
Comment 3
2012-01-04 23:55:36 PST
Please wait for approval from
fishd@chromium.org
before submitting because this patch contains changes to the Chromium public API.
Alexey Proskuryakov
Comment 4
2012-01-05 08:14:09 PST
Comment on
attachment 121225
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=121225&action=review
You might want to consider following up on Jungshik's promise and addressing
bug 24420
prior to further extending detector functionality. The current behavior is simply horrible, as we end up with parts of document being decoded using different decoders.
> Source/WebCore/loader/TextResourceDecoder.cpp:589 > +bool TextResourceDecoder::detectJapaneseEncoding(const char* data, size_t len, TextEncoding* detectedEncoding) const
detectedEncoding should be a reference, not a pointer. This function can not and does not want to deal with a null argument.
> Source/WebCore/loader/TextResourceDecoder.h:57 > + void setUsesEncodingDetector(bool usesEncodingDetector) { m_usesEncodingDetector = usesEncodingDetector; }
This appears unused.
Keishi Hattori
Comment 5
2012-01-06 03:42:11 PST
Created
attachment 121421
[details]
Patch
Keishi Hattori
Comment 6
2012-01-06 03:43:20 PST
(In reply to
comment #4
)
> (From update of
attachment 121225
[details]
) > View in context:
https://bugs.webkit.org/attachment.cgi?id=121225&action=review
> > You might want to consider following up on Jungshik's promise and addressing
bug 24420
prior to further extending detector functionality. The current behavior is simply horrible, as we end up with parts of document being decoded using different decoders.
I feel the same way. The decoding part needs fixing before doing anything to it. So I tried not to touch that part yet and place all of the code in WebKit/chromium. I would like to fix those bugs and refactor the decoding process, once this project is done. We want to deliver a solution for Japanese users right now. We realized that turning on auto detection would solve most Japanese garbled text problems. But turning on auto detection (by default) is difficult because of performance. And fixing the decoder bugs won't help with turning on the auto detector.
> > Source/WebCore/loader/TextResourceDecoder.cpp:589 > > +bool TextResourceDecoder::detectJapaneseEncoding(const char* data, size_t len, TextEncoding* detectedEncoding) const > > detectedEncoding should be a reference, not a pointer. This function can not and does not want to deal with a null argument.
Done.
> > Source/WebCore/loader/TextResourceDecoder.h:57 > > + void setUsesEncodingDetector(bool usesEncodingDetector) { m_usesEncodingDetector = usesEncodingDetector; } > > This appears unused.
Sorry. This was for the next patch. Removed.
Keishi Hattori
Comment 7
2012-01-09 21:34:43 PST
Created
attachment 121791
[details]
Garbled text info bar Here is a screenshot of the infobar in action. Pressing the "Fix It" button turns on the auto encoding detector for this site. We are already talking with the Chrome UX team about this. Darin, this adds WebKit APIs so please take a look at the patch.
Darin Fisher (:fishd, Google)
Comment 8
2012-01-12 11:09:03 PST
Comment on
attachment 121421
[details]
Patch Some quick comments: This change is a bit complex for two reasons: 1- Adding a big method on WebDocument that doesn't have an analogy on WebCore::Document. WebDocument is supposed to just be a thin API wrapper. This might suggest exposing other primitives in the API or something. I need to think about this. 2- Adding a variation of reload will need to be reviewed very carefully. There are daemons in the navigation system, and this needs to be reviewed very carefully.
Keishi Hattori
Comment 9
2012-01-13 02:55:40 PST
(In reply to
comment #8
)
> (From update of
attachment 121421
[details]
) > Some quick comments:
Thanks for the reply! We realized that we won't make it into M18 but we are still in a rush to deliver this feature.
> This change is a bit complex for two reasons: > > 1- Adding a big method on WebDocument that doesn't have an analogy on WebCore::Document. WebDocument is supposed to just be a thin API wrapper. This might suggest exposing other primitives in the API or something. I need to think about this.
The reason I put it there is 1. We wanted chromium to check each frame if it has garbled text. So we didn't want to add it to WebView 2. I didn't need to use the Frame object inside shouldUseEncodingDetector() 3. Its a const method and won't modify the Document. 4. We thought about adding accessor APIs and implementing the logic on the chromium side. But we need access to TextResourceDecoder and that didn't feel like something that should be exposed to chromium. And it might change drastically when we refactor the decoding process. How about moving shouldUseEncodingDetector() to WebFrame? Will that help?
> 2- Adding a variation of reload will need to be reviewed very carefully. There are daemons in the navigation system, and this needs to be reviewed very carefully.
I'm not entirely sure what your concern about daemons is, but reloadFromCache is basically WebViewImpl::setPageEncoding without setting a new encoding.
http://trac.webkit.org/browser/trunk/Source/WebKit/chromium/src/WebViewImpl.cpp#L1767
So this type of navigation is already happening when a user chooses an override encoding from the wrench menu.
Hajime Morrita
Comment 10
2012-01-16 01:27:13 PST
> > 2- Adding a variation of reload will need to be reviewed very carefully. There are daemons in the navigation system, and this needs to be reviewed very carefully. > > I'm not entirely sure what your concern about daemons is, > but reloadFromCache is basically WebViewImpl::setPageEncoding without setting a new encoding. >
http://trac.webkit.org/browser/trunk/Source/WebKit/chromium/src/WebViewImpl.cpp#L1767
> > So this type of navigation is already happening when a user chooses an override encoding from the wrench menu.
You can clarify the fact by calling WebView::setPageEncoding() from reloadFromCache() or vice versa. I guess saveDocumentAndScrollState() does make less sense because if the encoding changes, the text representation in the page will also change.
Keishi Hattori
Comment 11
2012-01-16 04:20:59 PST
Created
attachment 122615
[details]
Patch
Keishi Hattori
Comment 12
2012-01-16 04:22:52 PST
(In reply to
comment #10
)
> You can clarify the fact by calling WebView::setPageEncoding() from reloadFromCache() or vice versa.
Done.
> I guess saveDocumentAndScrollState() does make less sense because if the encoding changes, > the text representation in the page will also change.
I removed it but the scroll position was remembered anyway. FrameLoader::reloadWithOverrideEncoding() must be saving it. Saving scroll state does make less sense but it can be useful for most pages where the text layout doesn't change drastically and that's why I left it in the first place.
Keishi Hattori
Comment 13
2012-02-15 01:52:15 PST
Created
attachment 127138
[details]
Patch
Keishi Hattori
Comment 14
2012-02-15 01:57:37 PST
In this new patch I refactored the TextResourceDecoder by introducing a TextEncodingDetector class. The logic part, that fishd said didn't belong in the WebKit layer, was moved to the chromium side, and only added the necessary accessors to the WebKit API.
Keishi Hattori
Comment 15
2012-02-15 02:58:20 PST
Created
attachment 127146
[details]
Patch
WebKit Review Bot
Comment 16
2012-02-15 04:07:35 PST
Attachment 127138
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/update-webkit']" exit_code: 9 Updating OpenSource Index mismatch: 0dfd183742a71cb5de5dadc3ae177fc72b63a194 != 9cdcda984def14b8bf8a32b6da6784c8a6ef7b3a rereading 8567f8d3c2539a28a496edaf1048483e973975c2 M LayoutTests/fast/forms/radio-nested-labels.html M LayoutTests/ChangeLog 107798 = 3671b2d23de7ade4cb1d1e78a3f6f7673db6a6c9 already exists! Why are we refetching it? at /usr/lib/git-core/git-svn line 5210 Died at Tools/Scripts/update-webkit line 164. If any of these errors are false positives, please file a bug against check-webkit-style.
WebKit Review Bot
Comment 17
2012-02-15 04:12:47 PST
Attachment 127146
[details]
did not pass style-queue: Failed to run "['Tools/Scripts/update-webkit']" exit_code: 9 Updating OpenSource Index mismatch: 0dfd183742a71cb5de5dadc3ae177fc72b63a194 != 9cdcda984def14b8bf8a32b6da6784c8a6ef7b3a rereading 8567f8d3c2539a28a496edaf1048483e973975c2 M LayoutTests/fast/forms/radio-nested-labels.html M LayoutTests/ChangeLog 107798 = 3671b2d23de7ade4cb1d1e78a3f6f7673db6a6c9 already exists! Why are we refetching it? at /usr/lib/git-core/git-svn line 5210 Died at Tools/Scripts/update-webkit line 164. If any of these errors are false positives, please file a bug against check-webkit-style.
Keishi Hattori
Comment 18
2012-02-22 09:52:21 PST
Created
attachment 128233
[details]
Patch
Keishi Hattori
Comment 19
2012-02-22 09:57:00 PST
Could ap or someone review the WebCore side of this patch?
Alexey Proskuryakov
Comment 20
2012-02-22 10:29:13 PST
Comment on
attachment 128233
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=128233&action=review
I'm confused... Who enabled TextEncodingDetectorICU on Mac, and when?
> Source/WebCore/loader/TextResourceDecoder.cpp:-641 > - // FIXME: It is wrong to change the encoding downstream after we have already done some decoding.
Where did the FIXME go?
> Source/WebCore/platform/text/TextEncodingDetectorWithICUAndKanjiCode.h:54 > + bool isAscii;
Style: isASCII.
Kent Tamura
Comment 21
2012-04-19 17:34:59 PDT
Comment on
attachment 128233
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=128233&action=review
I don't understand the context at all. Anyway the patch is large and EWSs are purple. Creating KanjiCode.{cpp,h} should be a separated patch.
> Source/WebCore/platform/text/KanjiCode.cpp:3 > +/* > + * Copyright (C) 2012 Google Inc. All rights reserved. > + *
We must respect the copyright notice of the original code.
Sam Sneddon [:gsnedders]
Comment 22
2022-09-16 14:49:34 PDT
AFAIK, this was only ever likely to land in the Chromium port, which is now long gone from WebKit.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug