WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
6452
KURL::appendEscapingBadChars() doesn't know about %u-escaping.
https://bugs.webkit.org/show_bug.cgi?id=6452
Summary
KURL::appendEscapingBadChars() doesn't know about %u-escaping.
Henrik Grubbström
Reported
2006-01-09 08:54:03 PST
Sending a %u-escaped string (eg as generated by escape() for unicode data) to XMLHttpRequest::open() will cause any %u's to be escaped an extra time (ie the webserver will receive %25u2014 instead of %u2014). The likely culprit seems to be WebCore/kwq/KWQKURL.mm::appendEscapingBadChars(), which doesn't know about %u-style escaping. A patch similar to the following ought to fix the problem: --- KWQKURL.mm.orig Mon Jan 9 17:43:02 2006 +++ KWQKURL.mm Mon Jan 9 17:46:04 2006 @@ -981,6 +981,16 @@ *p++ = c; *p++ = *str++; *p++ = *str++; + } else if (c == '%' && strEnd - str >= 5 && + (str[0] == 'u' || str[0] == 'U') && + isHexDigit(str[0]) && isHexDigit(str[1]) && + isHexDigit(str[2]) && isHexDigit(str[3])) { + *p++ = c; + *p++ = *str++; + *p++ = *str++; + *p++ = *str++; + *p++ = *str++; + *p++ = *str++; } else if (c == '?') { *p++ = c; } else {
Attachments
Suggested patch for KWQKURL.mm::appendEscapingBadChars()
(631 bytes, patch)
2006-01-10 10:01 PST
,
Henrik Grubbström
darin
: review-
Details
Formatted Diff
Diff
Now with ChangeLog entry.
(1.04 KB, patch)
2006-01-10 10:15 PST
,
Henrik Grubbström
darin
: review-
Details
Formatted Diff
Diff
don't escape % characters
(2.47 KB, patch)
2006-02-14 12:21 PST
,
Alexey Proskuryakov
mjs
: review+
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Alexey Proskuryakov
Comment 1
2006-01-10 05:14:22 PST
Do you have a test case that behaves differently in Safari and Firefox (or Internet Explorer)? The problem is not entirely clear to me from the description, sorry.
Henrik Grubbström
Comment 2
2006-01-10 05:47:26 PST
I've verified that my javascript code works as expected in Firefox 1.5. It's hard to do a portable test case, since the typical use is in AJAX code, but here's an extract: index.js: var isSafari = navigator.appVersion.match("AppleWebKit"); function xmlhttpRequest(filepath, callback) { var xmlhttp; if(window.XMLHttpRequest) { // Mozilla, Safari, ... xmlhttp = new XMLHttpRequest(); } else if (window.ActiveXObject) { // IE xmlhttp = new ActiveXObject("Microsoft.XMLHTTP"); } xmlhttp.open("GET", filepath, true); xmlhttp.onreadystatechange = function() { var value = callback(xmlhttp); return value; }; xmlhttp.send(null) } function getResponseXML(xmlhttp) { var doc = document.importNode(xmlhttp.responseXML.documentElement, true); if(isSafari) { // Safari doesn't initialise all node attributes (name, class etc.) // properly without this workaround. doc.innerHTML = doc.innerHTML; } return doc; } function editGetFields(tag_list, data) { for(var i = 0; i < tag_list.length; i++) { var name = tag_list.item(i).getAttribute("name"); if(name) { data[name] = tag_list.item(i).value; } } } function editTextSave(button, close) { var div_edit = document.getElementById("document-editor"); var document_id = div_edit.getAttribute("name"); var article_id = getArticleId(div_edit); var article_tr = document.getElementById("article-open-" + article_id); var data = new Object; editGetFields(div_edit.getElementsByTagName("input"), data); editGetFields(div_edit.getElementsByTagName("textarea"), data); var callback = function(xmlhttp) { if(xmlhttp.readyState != 4) return; var table_db = getResponseXML(xmlhttp); var tr_db = table_db.getElementsByTagName("tr").item(0); article_tr.parentNode.replaceChild(tr_db, article_tr); if(close) { displayArticleList(); } } xmlhttpRequest("actions/edit-text-save.xml?" + "article_id="+escape(article_id) + "&" + "document_id="+escape(document_id) + "&" + "finished="+escape(data.finished) + "&" + "headline="+escape(data.headline) + "&" + "subheadline="+escape(data.subheadline) + "&" + "keywords="+escape(data.keywords) + "&" + "blurb="+escape(data.blurb) + "&" + "byline="+escape(data.byline) + "&" + "notes="+escape(data.notes) + "&" + "story="+escape(data.story), callback); return false; } If one of data.x contains a unicode character outside ISO-8859-1 (like eg \u2014 (EM_DASH)), escape() above will encode it with %u-style encoding (eg %u2014). This is as expected. However when the resulting request is received by the http server the DWIM in XMLHttpRequest::open() will have quoted it an extra time (to eg %25u2014) when sent by Safari: ceylon.roxen.com - - [09/Jan/2006:17:15:44 +0100] "GET /es/actions/edit-text-sav e.xml?article_id=441&document_id=324&finished=on&headline=Katastrofhj%E4lp%20_%2 0ny%20exportsatsning&subheadline=&keywords=&blurb=&byline=¬es=&story=%E5%E4%F 6%20%25uACA1%25uAC94%25uAC84%0A%0AKatastrofhj%E4lp%20%25u2014%20ny%20exportsatsn ing%20%E5%E4%F6%20%BFSpanska%20fr%E5gor%3F%2075%20%25u2031%20%25u260E%0A%0AStora %20katastrofer%20skapar%20lidande%20och%20n%F6d%20%25u2014%20men%20ocks%E5%20exp [...] Note that there are several cases of double-escaped unicode characters above. But not with Firefox: shipon.roxen.com - - [09/Jan/2006:16:23:33 +0100] "GET /es/actions/edit-text-sav e.xml?article_id=441&document_id=324&finished=on&headline=Katastrofhj%E4lp%20_%2 0ny%20exportsatsning&subheadline=&keywords=&blurb=&byline=¬es=&story=%u6CE8%u 6587%u65B9%u6CD5%u304C%u3072%u3068%u76EE%u3067%u308F%u304B%u308B%u30AC%u30A4%u30 C9%u30C4%u30A2%u30FC%u306F%u3053%u3061%u3089%u3002%0A%u260E%20%E5%E4%F6%2075%20% [...] Above unicode characters have been escaped a single time (as expected).
Alexey Proskuryakov
Comment 3
2006-01-10 09:52:55 PST
Yes, I can see this with a simple test html and tcpflow. Would you like to submit the patch for review (by adding ChangeLog notes, attaching the patch as a file and setting a review flag)?
Henrik Grubbström
Comment 4
2006-01-10 10:01:50 PST
Created
attachment 5599
[details]
Suggested patch for KWQKURL.mm::appendEscapingBadChars()
Henrik Grubbström
Comment 5
2006-01-10 10:15:39 PST
Created
attachment 5600
[details]
Now with ChangeLog entry.
Alexey Proskuryakov
Comment 6
2006-01-10 10:33:06 PST
My test case: <
http://nypop.com/~ap/webkit/xmlhttp-url-encoding.html
> (too lazy to make one that doesn't require tcpflow, and no idea about how to make an automated one).
Darin Adler
Comment 7
2006-01-10 15:08:06 PST
Comment on
attachment 5599
[details]
Suggested patch for KWQKURL.mm::appendEscapingBadChars() First, thanks for submitting this patch. It seems quite important to fix this bug! But I don't think this fix is correct. I think we should either be escaping all % characters in the string or not escaping any. It doesn't seem right to decide whether to escape the % character based on whether it's followed by u201F (for example) or not. If we shouldn't be escaping "%" characters in this case, then we need to change KURL behavior, perhaps adding a parameter if we don't always want that behavior.
Darin Adler
Comment 8
2006-01-10 19:49:46 PST
Comment on
attachment 5600
[details]
Now with ChangeLog entry. OK, but my review comment stands (see
comment 7
above). I don't think this is the correct fix.
Henrik Grubbström
Comment 9
2006-01-11 01:39:04 PST
I agree that having DWIM's like appendEscapingBadChars() is generally a bad idea to begin with, but now that it's there, it should either be fixed or removed entirely (which might break code that doesn't perform escaping properly, and thus relies on it being there). The alternative of escaping all "%" characters would break compatibility with eg Firefox even more.
Alexey Proskuryakov
Comment 10
2006-01-11 05:05:32 PST
I have updated the test case a bit to show more different cases. In the test, both MSIE and Firefox never escape % characters (MSIE doesn't even escape non-ASCII characters, actually). Looks like the DWIM code first appeared in a massive KURL rewrite between revisions 1905 and 1906; so it's not entirely obvious if it serves any purpose.
Alexey Proskuryakov
Comment 11
2006-02-14 12:21:52 PST
Created
attachment 6486
[details]
don't escape % characters I don't really see how to check all code paths that end up in appendEscapingBadChars()... There is a regression test included in the patch, and "everything else" still seems to work fine - which is admittedly less than a perfect evidence.
Maciej Stachowiak
Comment 12
2006-02-14 18:05:24 PST
Comment on
attachment 6486
[details]
don't escape % characters Based on your test case I think never escaping % is the right thing. I think I am the one who did the rewrite and I didn't have a specific reason for escaping %. r=me
Brett Wilson (Google)
Comment 13
2007-09-26 13:25:54 PDT
Note that IE7 changes this behavior. It rejects all invalid escape sequences and will not load the URL. This includes the %uXXXX form.
Lucas Forschler
Comment 14
2019-02-06 09:03:07 PST
Mass moving XML DOM bugs to the "DOM" Component.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug