Bug 30241 - Inconsistent URL encoding/decoding of JavaScript URLs.
Summary: Inconsistent URL encoding/decoding of JavaScript URLs.
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks: 37641
  Show dependency treegraph
 
Reported: 2009-10-08 17:12 PDT by Daniel Bates
Modified: 2011-08-19 13:39 PDT (History)
3 users (show)

See Also:


Attachments
Example (698 bytes, text/html)
2009-10-08 17:12 PDT, Daniel Bates
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Daniel Bates 2009-10-08 17:12:32 PDT
Created attachment 40919 [details]
Example

JavaScript URLs that are URL encoded via FrameLoader::completeURL are not properly decoded before eventually being passed to both the XSSAuditor and ScriptController::evaluate, because the method KURL::decodeURLEscapeSequences is NOT the inverse function of KURL::parse().

In particular, this occurs in FrameLoader::requestFrame:
http://trac.webkit.org/browser/trunk/WebCore/loader/FrameLoader.cpp#L348
where the completeURL() is called on |scriptURL| before it is passed to frame->loader()->executeIfJavaScriptURL().

Remarks:
The call flow of FrameLoader::completeURL is:
FrameLoader::completeURL -> Document::completeURL -> KURL::KURL(const KURL& base, const String& relative, ...) -> KURL::init - > KURL::parse

The issue is that KURL::parse uses the method KURL::appendEscapingBadChars, which as its name implies escapes only bad characters.

One such bad character is the space character. Consider the JavaScript URL, "javascript: '%0A'" (*). Calling KURL::parse on this (directly or implicitly via one of the functions in the above call chain) will result in a KURL object that represents the URL, "javascript:%20'%46'" (**). Notice, this result differs from the fully URL encoded result of "javascript:%20%27%2546%27". Decoding the string form of (**) using KURL::decodeURLEscapeSequences produces the result: "javascript: 'F'". Clearly, this is not the inverse of the (**).
Comment 1 Daniel Bates 2009-10-08 17:21:22 PDT
(*) should be "javascript: '%46'"

(In reply to comment #0)
> Created an attachment (id=40919) [details]
> Example
> 
> JavaScript URLs that are URL encoded via FrameLoader::completeURL are not
> properly decoded before eventually being passed to both the XSSAuditor and
> ScriptController::evaluate, because the method KURL::decodeURLEscapeSequences
> is NOT the inverse function of KURL::parse().
> 
> In particular, this occurs in FrameLoader::requestFrame:
> http://trac.webkit.org/browser/trunk/WebCore/loader/FrameLoader.cpp#L348
> where the completeURL() is called on |scriptURL| before it is passed to
> frame->loader()->executeIfJavaScriptURL().
> 
> Remarks:
> The call flow of FrameLoader::completeURL is:
> FrameLoader::completeURL -> Document::completeURL -> KURL::KURL(const KURL&
> base, const String& relative, ...) -> KURL::init - > KURL::parse
> 
> The issue is that KURL::parse uses the method KURL::appendEscapingBadChars,
> which as its name implies escapes only bad characters.
> 
> One such bad character is the space character. Consider the JavaScript URL,
> "javascript: '%0A'" (*). Calling KURL::parse on this (directly or implicitly
> via one of the functions in the above call chain) will result in a KURL object
> that represents the URL, "javascript:%20'%46'" (**). Notice, this result
> differs from the fully URL encoded result of "javascript:%20%27%2546%27".
> Decoding the string form of (**) using KURL::decodeURLEscapeSequences produces
> the result: "javascript: 'F'". Clearly, this is not the inverse of the (**).
Comment 2 Adam Barth 2011-08-19 13:39:15 PDT
This bug is fixed in the new architecture.