RESOLVED FIXED 53398
Use Document::encoding() instead of DocumentWriter::encoding()
https://bugs.webkit.org/show_bug.cgi?id=53398
Summary Use Document::encoding() instead of DocumentWriter::encoding()
Patrick R. Gansterer
Reported 2011-01-30 09:05:10 PST
Use Document::encoding() instead of DocumentWriter::encoding()
Attachments
Patch (4.80 KB, patch)
2011-01-30 09:11 PST, Patrick R. Gansterer
no flags
Patch (4.81 KB, patch)
2011-01-30 09:24 PST, Patrick R. Gansterer
no flags
Patch (4.34 KB, patch)
2011-02-05 07:51 PST, Patrick R. Gansterer
no flags
Patrick R. Gansterer
Comment 1 2011-01-30 09:11:39 PST
WebKit Review Bot
Comment 2 2011-01-30 09:14:36 PST
Patrick R. Gansterer
Comment 3 2011-01-30 09:24:47 PST
Build Bot
Comment 4 2011-01-30 09:28:32 PST
Adam Barth
Comment 5 2011-01-30 10:34:31 PST
Comment on attachment 80594 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=80594&action=review R- for using a virtual method when a non-virtual one does the same job. Also, one question: > Source/WebCore/ChangeLog:6 > + Use Document::encoding() instead of DocumentWriter::encoding() > + https://bugs.webkit.org/show_bug.cgi?id=53398 Do you mean Document::charset? Document.h says: "Use Document::encoding() to avoid virtual dispatch" but you're not doing that... Are these two always the same?
Alexey Proskuryakov
Comment 6 2011-01-30 16:29:20 PST
Note that Document::encoding() can be affected by assigning to document.charset. It's not clear what exactly we want to change in that case - it's particularly questionable that we change document.inputEncoding.
Patrick R. Gansterer
Comment 7 2011-02-03 13:49:06 PST
Comment on attachment 80594 [details] Patch Document::encoding() and DocumentWriter::encoding() are always the same. Document::setDecoder() will be called at http://trac.webkit.org/browser/trunk/Source/WebCore/loader/DocumentWriter.cpp?rev=76872#L133 http://trac.webkit.org/browser/trunk/Source/WebCore/loader/DocumentWriter.cpp?rev=76872#L177 where Document::encoding will be set to DocumentWriter::encoding There is one place where this happens in the other direction: http://trac.webkit.org/browser/trunk/Source/WebCore/loader/FrameLoader.cpp?rev=76872#L2166 And there is one call where we set Document::encoding and don't have a DocumentWriter: http://trac.webkit.org/browser/trunk/Source/WebCore/xml/XSLTProcessor.cpp?rev=76872#L92
Patrick R. Gansterer
Comment 8 2011-02-04 13:52:52 PST
(In reply to comment #6) > Note that Document::encoding() can be affected by assigning to document.charset. It's not clear what exactly we want to change in that case - it's particularly questionable that we change document.inputEncoding. Document::setCharset will set encoding of Document::m_decoder. The decoder of Document and DocumentWriter are the same: see comment #7. AFAICS this patch does not change behavior of the existing code.
Alexey Proskuryakov
Comment 9 2011-02-04 14:08:20 PST
Comment 7 lists all places where encoding is set. Could you please describe how DocumentWriter encoding becomes equal to document encoding in the particular case of document.charset being set from JS, for ease of comprehension?
Patrick R. Gansterer
Comment 10 2011-02-04 14:20:14 PST
(In reply to comment #6) > Note that Document::encoding() can be affected by assigning to document.charset. It's not clear what exactly we want to change in that case - it's particularly questionable that we change document.inputEncoding. charset and inputEncoding are always the same. http://trac.webkit.org/browser/trunk/LayoutTests/fast/dom/Document/document-charset.html?rev=13706 tests for this. (In reply to comment #9) > Comment 7 lists all places where encoding is set. Could you please describe how DocumentWriter encoding becomes equal to document encoding in the particular case of document.charset being set from JS, for ease of comprehension? DocumentWriter and Document share the same TextResourceDecoder (m_decoder). But two other cases: a) User selects an ecoding: We generate a new Document, where the TextResourceDecoder has the encoding selected by the user. So decoder() returns the correct encoding. b) When we fall back to the defaultEncoding we return null string in Document::encoding(), but we will use the defaultEncoding when creating a TextResourceDecoder with no valid encoding.
Alexey Proskuryakov
Comment 11 2011-02-04 14:37:59 PST
> charset and inputEncoding are always the same. http://trac.webkit.org/browser/trunk/LayoutTests/fast/dom/Document/document-charset.html?rev=13706 tests for this. Yes, that's current behavior, that was chosen in absence of any spec or compatibility info. Now that IE9 has inputEncoding, we should test what it does.
Patrick R. Gansterer
Comment 12 2011-02-04 14:52:31 PST
(In reply to comment #11) > > charset and inputEncoding are always the same. http://trac.webkit.org/browser/trunk/LayoutTests/fast/dom/Document/document-charset.html?rev=13706 tests for this. > > Yes, that's current behavior, that was chosen in absence of any spec or compatibility info. Now that IE9 has inputEncoding, we should test what it does. IMHO that's a different task, since this patch does not change the behavior. Matching the IE9 behavior is a separate task we can do after this is landed too. But if you think we need to ensure IE9 compatibility _before_ I need a IE9, which I don't have and can't install on XP :-(.
Alexey Proskuryakov
Comment 13 2011-02-04 15:10:05 PST
There is a chance of being counter-productive if we clean up implementation before deciding what the desired behavior is. That's all I'm concerned about.
Patrick R. Gansterer
Comment 14 2011-02-05 01:52:46 PST
(In reply to comment #13) > There is a chance of being counter-productive if we clean up implementation before deciding what the desired behavior is. That's all I'm concerned about. I agree with you! Opening LayoutTests/fast/dom/Document/document-charset.html with IE9beta on Windows Server 2008 R2 gives the following result: Initial document.charset: koi8-r document.defaultCharset: defined document.characterSet: undefined document.inputEncoding: undefined Setting charset to UTF-8... document.charset: utf-8 document.defaultCharset: defined document.characterSet: undefined document.inputEncoding: undefined Setting characterSet to KOI8-R (expected to fail, matching Firefox)... document.charset: utf-8 document.defaultCharset: defined document.characterSet: KOI8-R document.inputEncoding: undefined It seams inputEncoding isn't working in IE9! Matching the current behavior of IE9 does not seam to go in the right direction. :-)
Patrick R. Gansterer
Comment 15 2011-02-05 01:57:32 PST
comment #14 was with Quirks mode :-/ With IE9 standards mode: Initial document.charset: utf-8 document.defaultCharset: defined document.characterSet: utf-8 document.inputEncoding: UTF-8 Setting charset to UTF-8... document.charset: utf-8 document.defaultCharset: defined document.characterSet: utf-8 document.inputEncoding: UTF-8 Setting characterSet to KOI8-R (expected to fail, matching Firefox)... Failure (expected)
Patrick R. Gansterer
Comment 16 2011-02-05 07:51:58 PST
Created attachment 81360 [details] Patch After document.charset is set, the given encoding will be used in IE9. AFAIKS we match the IE9 behavior.
Alexey Proskuryakov
Comment 17 2011-02-05 11:36:10 PST
Comment on attachment 80594 [details] Patch OK. Please consider adding a doctype to document-charset.html, to make the test pass in IE9 out of the box.
Alexey Proskuryakov
Comment 18 2011-02-05 11:40:41 PST
Comment on attachment 81360 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=81360&action=review Another thing to test is whether the main document decoding starts using the new encoding after assignment to charset. That's obviously tricky, as decoding happens ahead of execution - but as far as I can tell, that's completely broken in WebKit now. Changing TextResourceDecoder charset on the fly resets it, dropping buffered characters, so assigning to document.charset means that a decoding error will occur at some later point. > LayoutTests/fast/encoding/external-script-charset-expected.txt:4 > PASS > +PASS Would be nice to explain what these subtests are, like it's done in other tests.
Patrick R. Gansterer
Comment 19 2011-02-05 12:30:50 PST
Patrick R. Gansterer
Comment 20 2011-02-05 12:34:42 PST
Comment on attachment 80594 [details] Patch Clearing flags on attachment: 80594 Committed r77750: <http://trac.webkit.org/changeset/77750>
Note You need to log in before you can comment on or make changes to this bug.