Bug 17732

Summary: An external JavaScript, after changing its src attribute, which has not charset declaration in its script tag and encoding info in corresponding http response, can not be correctly decoded.
Product: WebKit Reporter: johnnyding <johnnyding.webkit>
Component: Page LoadingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, darin
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: PC   
OS: OS X 10.5   
Attachments:
Description Flags
error encoding content is showed in right side
none
html file for this test case
none
script file for this test case
none
patch v1 for fixing this problem
none
patch v2 for fixing this problem. The change is according to Alexey's comments
none
patch v3, based on patch v2, move method scriptCharset from Document class to HTMLScriptElement class
none
patch v3, based on patch v2, move method scriptCharset from Document class to HTMLScriptElement class ap: review+

johnnyding
Reported 2008-03-09 12:28:35 PDT
Open http://money.163.com(it is sub domain of www.163.com. www.163.com is one of three largest market web portal in China), there's some encoding error on the right side of the web page. IE & FF handles it correctly. See attached image.
Attachments
error encoding content is showed in right side (104.04 KB, image/jpeg)
2008-03-09 12:30 PDT, johnnyding
no flags
html file for this test case (715 bytes, text/html)
2008-03-09 12:31 PDT, johnnyding
no flags
script file for this test case (180 bytes, application/octet-stream)
2008-03-09 12:32 PDT, johnnyding
no flags
patch v1 for fixing this problem (5.00 KB, patch)
2008-03-09 23:02 PDT, johnnyding
no flags
patch v2 for fixing this problem. The change is according to Alexey's comments (7.72 KB, patch)
2008-03-10 03:37 PDT, johnnyding
no flags
patch v3, based on patch v2, move method scriptCharset from Document class to HTMLScriptElement class (8.46 KB, patch)
2008-03-10 07:22 PDT, johnnyding
no flags
patch v3, based on patch v2, move method scriptCharset from Document class to HTMLScriptElement class (8.42 KB, patch)
2008-03-10 07:30 PDT, johnnyding
ap: review+
johnnyding
Comment 1 2008-03-09 12:30:23 PDT
Created attachment 19618 [details] error encoding content is showed in right side
johnnyding
Comment 2 2008-03-09 12:31:39 PDT
Created attachment 19619 [details] html file for this test case
johnnyding
Comment 3 2008-03-09 12:32:19 PDT
Created attachment 19620 [details] script file for this test case
johnnyding
Comment 4 2008-03-09 12:36:15 PDT
Download both html file and script file to your local file system. The expected behavior is content after "value which is encoded by gb2312 in html file is: " and "same value which is encoded by gb2312 in extern javascript file is: " should have same display, but they are not. Please see above attached testcase. After digging in WebKit, I found the error encoding content was generated by an external JavaScript after changing its src attribute, which had not charset declaration in its script tag and encoding info in http response when downloading it. So CachedScript will use "latin1" as encoding charset. See methods CachedScript::CachedScript and Loader::didReceiveResponse. That is why content of the external Javascript had been incorrectly decoded. I think we can fix this problem by using document()->frame()->loader()->encoding() to create CacheScript obj when the external Javascript has not charset declaration in its script tag(getAttribute(charsetAttr) return null string). See method HTMLScriptElement::parseMappedAttribute. This behavior makes sense just like getting correct scriptSrcCharset in HTMLTokenizer::parseTag. I have changed code in my local build, it works. A patch is coming soon.
johnnyding
Comment 5 2008-03-09 23:02:44 PDT
Created attachment 19627 [details] patch v1 for fixing this problem
Alexey Proskuryakov
Comment 6 2008-03-10 00:01:00 PDT
Looking great! The code in parseMappedAttribute() still doesn't quite match what we have in insertedIntoDocument(). The latter also calls stripWhiteSpace(). I suggest creating a helper function that would be called from both places, rather than duplicating the logic. + script_obj.src = "./resources/script-decoding-error-after-setting-src.js"; I'm surprised that this works without layoutTestController.waitUntilDone()/notifyDone(). + This case is for testing script decoding error after setting its src attribute.<br> A link to this bug would be helpful. Also, I would omit the word "error" here. In the future, please set review flag on patches - this ensures that they get into review queue and don't get lost.
johnnyding
Comment 7 2008-03-10 03:37:07 PDT
Created attachment 19630 [details] patch v2 for fixing this problem. The change is according to Alexey's comments
Alexey Proskuryakov
Comment 8 2008-03-10 04:41:33 PDT
Comment on attachment 19630 [details] patch v2 for fixing this problem. The change is according to Alexey's comments Hmm, I would have added scriptCharset() to HTMLScriptElement, not Document - or is there a reason why this isn't possible? Document is so large that one would not want to add stuff to it unless absolutely necessary.
johnnyding
Comment 9 2008-03-10 07:22:54 PDT
Created attachment 19633 [details] patch v3, based on patch v2, move method scriptCharset from Document class to HTMLScriptElement class
johnnyding
Comment 10 2008-03-10 07:30:24 PDT
Created attachment 19634 [details] patch v3, based on patch v2, move method scriptCharset from Document class to HTMLScriptElement class
Alexey Proskuryakov
Comment 11 2008-03-10 08:18:04 PDT
Comment on attachment 19634 [details] patch v3, based on patch v2, move method scriptCharset from Document class to HTMLScriptElement class r=me I'm going to tweak wording of some comments a bit when landing.
johnnyding
Comment 12 2008-03-10 08:28:22 PDT
(In reply to comment #11) > (From update of attachment 19634 [details] [edit]) > r=me > I'm going to tweak wording of some comments a bit when landing. Thanks.
Alexey Proskuryakov
Comment 13 2008-03-10 08:39:49 PDT
Committed revision 30928.
Note You need to log in before you can comment on or make changes to this bug.