Bug 64870 - [chromium] FileAPI doesn't work properly with filenames containing '#'
Summary: [chromium] FileAPI doesn't work properly with filenames containing '#'
Status: RESOLVED WONTFIX
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit API (show other bugs)
Version: 528+ (Nightly build)
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-07-20 07:47 PDT by Sergey Ryazanov
Modified: 2013-04-11 14:46 PDT (History)
8 users (show)

See Also:


Attachments
Fix that escapes filenames in KURLGoogle.cpp (KURL.cpp already do it). (3.46 KB, patch)
2011-07-25 08:30 PDT, Sergey Ryazanov
no flags Details | Formatted Diff | Diff
Fixed code style (5.14 KB, patch)
2011-07-26 02:53 PDT, Sergey Ryazanov
gyuyoung.kim: commit-queue-
Details | Formatted Diff | Diff
Removed accidentally included files. (3.54 KB, patch)
2011-07-26 04:44 PDT, Sergey Ryazanov
levin: review-
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sergey Ryazanov 2011-07-20 07:47:28 PDT
While constructing FileAPI URLs special charactes are not escaped and the URL gets invalid.

Outside Chromium this issue has been fixed in https://bugs.webkit.org/show_bug.cgi?id=60218.
Comment 1 Sergey Ryazanov 2011-07-25 08:30:43 PDT
Created attachment 101871 [details]
Fix that escapes filenames in KURLGoogle.cpp (KURL.cpp already do it).
Comment 2 WebKit Review Bot 2011-07-25 08:34:51 PDT
Attachment 101871 [details] did not pass style-queue:

Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/fast/filesystem/op-get-entry-e..." exit_code: 1

Source/WebCore/platform/KURLGoogle.cpp:792:  One line control clauses should not use braces.  [whitespace/braces] [4]
Total errors found: 1 in 3 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Sergey Ryazanov 2011-07-26 02:53:00 PDT
Created attachment 101981 [details]
Fixed code style
Comment 4 Gyuyoung Kim 2011-07-26 03:08:12 PDT
Comment on attachment 101981 [details]
Fixed code style

Attachment 101981 [details] did not pass efl-ews (efl):
Output: http://queues.webkit.org/results/9244396
Comment 5 Early Warning System Bot 2011-07-26 03:42:20 PDT
Comment on attachment 101981 [details]
Fixed code style

Attachment 101981 [details] did not pass qt-ews (qt):
Output: http://queues.webkit.org/results/9247362
Comment 6 Collabora GTK+ EWS bot 2011-07-26 03:50:07 PDT
Comment on attachment 101981 [details]
Fixed code style

Attachment 101981 [details] did not pass gtk-ews (gtk):
Output: http://queues.webkit.org/results/9248325
Comment 7 Sergey Ryazanov 2011-07-26 04:44:06 PDT
Created attachment 101986 [details]
Removed accidentally included files.
Comment 8 Darin Fisher (:fishd, Google) 2011-07-28 10:02:25 PDT
Comment on attachment 101986 [details]
Removed accidentally included files.

View in context: https://bugs.webkit.org/attachment.cgi?id=101986&action=review

> Source/WebCore/platform/KURLGoogle.cpp:769
> +    const UChar* pos = path.characters();

KURLGoogle is meant to just be a lightweight adapter to the
google-url library.  I suspect the proper fix should be a
patch to google-url instead, but I defer to Brett Wilson's
judgement on that.
Comment 9 Brett Wilson (Google) 2011-07-28 10:54:39 PDT
setPath is definitely the wrong place for this. What happens when somebody constructs a full URL without going through setPath? You'll get different canonicalization behavior.

I presume your file refs are going through the "path URL" case of googleurl because you're using some random scheme? In this case, how much modification of these URLs do you need?

In the case of blob URLs, we decided not to add support to googleurl for them. In this way, they're like data URLs. Data URLs have a specific format & requirements, but it's not googleurl's (or KURL's) job to interpret it, it's the code that's generating or interpreting the data URL's job to make sure it's well-formed & escaped properly.

If your file stuff is similar, then your component should be generating properly escaped paths and we shouldn't need this. On the other hand, if we're going to be doing a lot of stuff, like doing relative URL resolution and fragment navigation on these URLs, it would be better to teach googleurl about them so the normal loader behavior works as expected.
Comment 10 Darin Fisher (:fishd, Google) 2011-07-28 11:10:55 PDT
File URLs have the format filesystem:http://webkit.org/temporary/path/to/file.png

They should support resolving relative URLs, query strings, and hash fragments.
Comment 11 Sergey Ryazanov 2011-08-02 05:49:15 PDT
WebKit's KURL::setPath is supposed to work properly with any argument including having '#' and '?' symbols (AsyncFileSystemChromium::virtualPathToFileSystemURL relies on it). Google's url_util::ReplaceComponents apparently is't. Consequently adaptor's code (that's the KURLGoogle.ccp, KURL::setPath or KURLGooglePrivate::replaceComponents) looks to be the right place to handle the difference.

As Darin said the scheme is filesystem:[securuty-origin]/[path]. [path] is not as restricted as in the BLOB scheme.

Instead of forcing googleurl to escaping all input of url_util::ReplaceComponents I would suggest to use encodeWithURLEscapeSequences/decodeURLEscapeSequences in each KURL getter and setter of KURLGoogle.cpp whose KURL.cpp counterpart does.

(In reply to comment #9)
> setPath is definitely the wrong place for this. What happens when somebody constructs a full URL without going through setPath? You'll get different canonicalization behavior.
> 
> I presume your file refs are going through the "path URL" case of googleurl because you're using some random scheme? In this case, how much modification of these URLs do you need?
> 
> In the case of blob URLs, we decided not to add support to googleurl for them. In this way, they're like data URLs. Data URLs have a specific format & requirements, but it's not googleurl's (or KURL's) job to interpret it, it's the code that's generating or interpreting the data URL's job to make sure it's well-formed & escaped properly.
> 
> If your file stuff is similar, then your component should be generating properly escaped paths and we shouldn't need this. On the other hand, if we're going to be doing a lot of stuff, like doing relative URL resolution and fragment navigation on these URLs, it would be better to teach googleurl about them so the normal loader behavior works as expected.
Comment 12 Brett Wilson (Google) 2011-08-02 10:56:42 PDT
ReplaceComponents does the appropriate escaping for the type of URL you have. If you have an http URL, it will escape the ? and #. The kind of URL you have (a path URL) doesn't have ? or # components, so it doesn't escape them.

We need to teach googleurl about these FileAPI files for this reason, and for relative URL resolution problems as well. When this is done, it shouldn't be necessary to make any changes to setPath.
Comment 13 Eric U. 2011-08-30 16:47:18 PDT
(In reply to comment #12)
> ReplaceComponents does the appropriate escaping for the type of URL you have. If you have an http URL, it will escape the ? and #. The kind of URL you have (a path URL) doesn't have ? or # components, so it doesn't escape them.
> 
> We need to teach googleurl about these FileAPI files for this reason, and for relative URL resolution problems as well. When this is done, it shouldn't be necessary to make any changes to setPath.

I'm deep into the GURL code working on this right now [and all the chromium code that calls it], but we will still need to add escaping to EntryBase::toURL() in EntryBase.cpp.
Comment 14 Sergey Ryazanov 2011-08-31 05:15:02 PDT
Well, WebKit's KURL does escaping but it's not necessary required. Escaping needed because of the way such an URL is parsed to extract a file path. In Chromium everything after '#' goes into the hash part (not into the 'path'). But as far as I understand the filesystem URL scheme is not supposed to have that part.
Comment 15 David Levin 2011-11-11 08:16:04 PST
Comment on attachment 101986 [details]
Removed accidentally included files.

I know that this needs a ChangeLog before it can be committed (but I don't know about the change itself).

Probably best to ping brettw about that if more input is needed.
Comment 16 Eric U. 2011-11-11 09:05:13 PST
(In reply to comment #14)
> Well, WebKit's KURL does escaping but it's not necessary required. Escaping needed because of the way such an URL is parsed to extract a file path. In Chromium everything after '#' goes into the hash part (not into the 'path'). But as far as I understand the filesystem URL scheme is not supposed to have that part.

Actually, filesystem URLs are supposed to have # and ? parts--they just don't because I haven't finished the work on GURL yet.