78648 – FileReader is dysfunctional in documents with "null" origin string

RESOLVED FIXED 78648

FileReader is dysfunctional in documents with "null" origin string

https://bugs.webkit.org/show_bug.cgi?id=78648

Summary FileReader is dysfunctional in documents with "null" origin string

Alexey Proskuryakov

Reported 2012-02-14 16:14:16 PST

FileReader doesn't work in documents with unique security origin, or in file documents when file path separation is enforced. The reason is that: 1. Origin string for such is null. 2. FileReaderLoader::start tries to create a blob URL. 3. BlobURL::createBlobURL bails out when it sees a "null" origin string. As an example of configuration where this is broken, try opening any FileReader regression test from fast/files in Google Chrome. Why does FileReader need a blob URL to read a file, in the first place?

Attachments
Patch (4.00 KB, patch) 2012-06-08 17:39 PDT, Jian Li	abarth: review- jianli: commit-queue-	Details Formatted Diff Diff
Patch (13.20 KB, patch) 2012-06-14 15:59 PDT, Jian Li	no flags	Details Formatted Diff Diff
Fix styles (12.99 KB, patch) 2012-06-14 16:06 PDT, Jian Li	no flags	Details Formatted Diff Diff
Patch for landing (13.87 KB, patch) 2012-06-15 01:02 PDT, Hajime Morrita	no flags	Details Formatted Diff Diff
Show Obsolete (3) View All Add attachment proposed patch, testcase, etc.

Jian Li

Comment 1 2012-02-14 16:44:21 PST

(In reply to comment #0) > FileReader doesn't work in documents with unique security origin, or in file documents when file path separation is enforced. The reason is that: > 1. Origin string for such is null. > 2. FileReaderLoader::start tries to create a blob URL. > 3. BlobURL::createBlobURL bails out when it sees a "null" origin string. > > As an example of configuration where this is broken, try opening any FileReader regression test from fast/files in Google Chrome. > > Why does FileReader need a blob URL to read a file, in the first place? This is because we want to share the same code for both file/blob reading and blob URL request fetching in multi-process environment. When a file/blob is created for the first time, it is identified by an internal used blob URL, that can allow it to be read by either FileReader or blob URL request.

Alexey Proskuryakov

Comment 2 2012-02-14 21:08:41 PST

Why do you want to share file and blob reading code, but not http and blob, for example? This is still not making any sense to me.

Jian Li

Comment 3 2012-02-16 10:52:23 PST

(In reply to comment #2) > Why do you want to share file and blob reading code, but not http and blob, for example? This is still not making any sense to me. Not sure I understand what you mean. Which http do you mean? We want to share the common code between blob reading and blob: request handling.

Alexey Proskuryakov

Comment 4 2012-02-16 11:02:28 PST

Let me put it in a different way. Do you have a suggestion for how to fix this bug?

Jian Li

Comment 5 2012-02-16 11:19:33 PST

(In reply to comment #4) > Let me put it in a different way. Do you have a suggestion for how to fix this bug? FileReader used to work in file: documents. But due to the tightening of security origin, it did not work any more. I would prefer a security expert, like Adam Barth, to confirm if we do want to make FileReader work in file: or unique security origin context. If you launch Google Chrome with the command line option "--allow-file-access-from-files", you can run those blob/file regression tests from fast/files. I am not sure if Safari/WebKit has such option to set. If we decide to support this case, I can work on a fix on this.

Adam Barth

Comment 6 2012-02-16 11:36:39 PST

> FileReader used to work in file: documents. But due to the tightening of security origin, it did not work any more. I would prefer a security expert, like Adam Barth, to confirm if we do want to make FileReader work in file: or unique security origin context. FileReader should work regardless of the security origin. As far as I can tell, our current behavior violates the specs.

Jian Li

Comment 7 2012-02-16 11:41:07 PST

(In reply to comment #6) > > FileReader used to work in file: documents. But due to the tightening of security origin, it did not work any more. I would prefer a security expert, like Adam Barth, to confirm if we do want to make FileReader work in file: or unique security origin context. > > FileReader should work regardless of the security origin. As far as I can tell, our current behavior violates the specs. Sounds good. I am going to work on a patch for this.

Alexey Proskuryakov

Comment 8 2012-06-06 17:16:08 PDT

Jian, did you have a chance to look into this?

Jian Li

Comment 9 2012-06-06 17:49:02 PDT

(In reply to comment #8) > Jian, did you have a chance to look into this? Sorry for late response. I will pick up this one to see if I can do sth about it.

Jian Li

Comment 10 2012-06-08 17:39:15 PDT

Created attachment 146668 [details] Patch

Alexey Proskuryakov

Comment 11 2012-06-08 23:58:10 PDT

Comment on attachment 146668 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=146668&action=review > Source/WebCore/ChangeLog:8 > + We only fix the problem with using FileReader from a file document. The What approach would you take to fix this in general case? File URLs are of course most important in this scenario, but this does not look like a step in the direction of a general fix.

Adam Barth

Comment 12 2012-06-09 10:42:10 PDT

Comment on attachment 146668 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=146668&action=review This patch looks like the wrong approach. Rather than trying to hack together a blob URL that works for file URLs, it seems like a better approach is to teach FileReader to work with Blobs directly without needing to use URLs at all. > Source/WebCore/fileapi/BlobURL.cpp:52 > - return createBlobURL(securityOrigin->toString()); > + String originString; > + if (securityOrigin->protocol() == "file") > + originString = "file://" + securityOrigin->filePath(); > + else > + originString = securityOrigin->toString(); > + return createBlobURL(originString); This isn't right. We shouldn't be special-casing protocol() == "file". It also won't work right depending on various bools in Settings. > Source/WebCore/fileapi/BlobURL.cpp:67 > +KURL BlobURL::extractInnerURL(const KURL& url) > +{ > + String unescapedUrl = decodeURLEscapeSequences(url.path()); > + size_t lastSlashPos = unescapedUrl.reverseFind('/'); > + if (lastSlashPos != notFound) > + unescapedUrl = unescapedUrl.substring(0, lastSlashPos); > + return KURL(ParsedURLString, unescapedUrl); > +} This is wrong. KURL has a function for doing this correctly these days. > Source/WebCore/page/SecurityOrigin.cpp:92 > +#if ENABLE(BLOB) > + if (url.protocolIs("blob")) > + return BlobURL::extractInnerURL(url); > +#endif Why isn't KURL able to extract this inner URL properly? > Source/WebCore/page/SecurityOrigin.h:68 > + String filePath() const { return m_filePath; } Please don't add an accessor for this information. It's private data for SecurityOrigin that no code outside this class should need.

Jian Li

Comment 13 2012-06-12 17:48:00 PDT

The reason for using Blob URL to read the blob data is to share the code in both FileReader and blob URL fetching scenarios for all supported platforms, including both single-process and multi-process platforms. For single-process platform, we could try to make FileReader interact with the blob directly. However, for multi-process platform, the blob data is supposed to live in the browser process and we do not want to invent another protocol to fetch the data from the browser process. If you want the direct hook with the blob data for single-process platform, I could make the change now. But if we want to also support multi-process platform, I am afraid that making FileReader read via blob URL is the simplest solution. If we want to support more than file URL, we can create a special URL for FileReader and twist SecurityOrigin check to work for it. How do you think? If you know about better solution, please let me know. (In reply to comment #12) > (From update of attachment 146668 [details]) > View in context: https://bugs.webkit.org/attachment.cgi?id=146668&action=review > > This patch looks like the wrong approach. Rather than trying to hack together a blob URL that works for file URLs, it seems like a better approach is to teach FileReader to work with Blobs directly without needing to use URLs at all. > > > Source/WebCore/fileapi/BlobURL.cpp:52 > > - return createBlobURL(securityOrigin->toString()); > > + String originString; > > + if (securityOrigin->protocol() == "file") > > + originString = "file://" + securityOrigin->filePath(); > > + else > > + originString = securityOrigin->toString(); > > + return createBlobURL(originString); > > This isn't right. We shouldn't be special-casing protocol() == "file". It also won't work right depending on various bools in Settings. > > > Source/WebCore/fileapi/BlobURL.cpp:67 > > +KURL BlobURL::extractInnerURL(const KURL& url) > > +{ > > + String unescapedUrl = decodeURLEscapeSequences(url.path()); > > + size_t lastSlashPos = unescapedUrl.reverseFind('/'); > > + if (lastSlashPos != notFound) > > + unescapedUrl = unescapedUrl.substring(0, lastSlashPos); > > + return KURL(ParsedURLString, unescapedUrl); > > +} > > This is wrong. KURL has a function for doing this correctly these days. > > > Source/WebCore/page/SecurityOrigin.cpp:92 > > +#if ENABLE(BLOB) > > + if (url.protocolIs("blob")) > > + return BlobURL::extractInnerURL(url); > > +#endif > > Why isn't KURL able to extract this inner URL properly? > > > Source/WebCore/page/SecurityOrigin.h:68 > > + String filePath() const { return m_filePath; } > > Please don't add an accessor for this information. It's private data for SecurityOrigin that no code outside this class should need.

Adam Barth

Comment 14 2012-06-12 19:56:18 PDT

Running a simple experiment, it looks like we're using blob URLs of the following form: blob:https%3A//bugs.webkit.org/2e92ba1a-89b9-45b0-90a5-1cba0d901540 Would it be possible to use blob URLs of one of the following forms: blob:2e92ba1a-89b9-45b0-90a5-1cba0d901540 blob:null/2e92ba1a-89b9-45b0-90a5-1cba0d901540 blob:about%3Ablank/2e92ba1a-89b9-45b0-90a5-1cba0d901540 If we did that, we could avoid serializing the origin into the blob URL. There's a larger problem of the syntax of blob URLs not interoperating between browsers, which is a problem in the long run but not something we need to solve in this bug.

Jian Li

Comment 15 2012-06-13 11:34:17 PDT

(In reply to comment #14) > Running a simple experiment, it looks like we're using blob URLs of the following form: > > blob:https%3A//bugs.webkit.org/2e92ba1a-89b9-45b0-90a5-1cba0d901540 > > Would it be possible to use blob URLs of one of the following forms: > > blob:2e92ba1a-89b9-45b0-90a5-1cba0d901540 > blob:null/2e92ba1a-89b9-45b0-90a5-1cba0d901540 > blob:about%3Ablank/2e92ba1a-89b9-45b0-90a5-1cba0d901540 > > If we did that, we could avoid serializing the origin into the blob URL. > > There's a larger problem of the syntax of blob URLs not interoperating between browsers, which is a problem in the long run but not something we need to solve in this bug. I think the blob URL should only be used in the 2nd form. That is, SecurityOrigin::toString returns null when the document has unique security origin or the file url. This is the reason I tried to fix the most common case: file URL. I am not sure what kind of user case we want to support for the document with unique security origin. We have to embed the origin in the blob URL because we need to perform security origin check when fetching an URL. If we do not embed the origin, we will have to get the origin of the page when the blob is spawned. For single-process platform, this should not be a problem. For multi-process platform, this means that an expensive and time-consuming IPC needs to be invoked to get the origin information from the browser process before the origin check could be performed. This security origin check is only needed for using blob URL. Currently reading blob via FileReader is routed through blob URL. I think we can skip origin check for FileReader since we create a one-time used blob URL internally. How do you think about this? Another idea is to let FileReader read from blob data directly but this approach will only work single-process platform. If you think the latter is better, I can start to produce a patch for this.

Adam Barth

Comment 16 2012-06-13 17:41:09 PDT

> I am not sure what kind of user case we want to support for the document with unique security origin. The main use case I'm interested in is reading an object out of a FileSystem and then postMessaging the Blob into an sandboxed iframe. However, an easier case is probably something like the following: var b = new Blob(["...."], { type="image/png" }); var u = webkitURL.createObjectURL(b); var i = document.createElement("img"); i.src = u; document.appendChild(i); > We have to embed the origin in the blob URL because we need to perform security origin check when fetching an URL. If we do not embed the origin, we will have to get the origin of the page when the blob is spawned. For single-process platform, this should not be a problem. For multi-process platform, this means that an expensive and time-consuming IPC needs to be invoked to get the origin information from the browser process before the origin check could be performed. In the case of a unique origin, the URL is only usable in one process anyway because it's not possible for a document in another process to have the same origin. Perhaps we should just keep a map from blob URLs to SecurityOrigins in memory for the unique origin case? > This security origin check is only needed for using blob URL. Currently reading blob via FileReader is routed through blob URL. I think we can skip origin check for FileReader since we create a one-time used blob URL internally. How do you think about this? Another idea is to let FileReader read from blob data directly but this approach will only work single-process platform. If you think the latter is better, I can start to produce a patch for this. I'd rather solve the general problem if we can. What do you think about keeping an in-memory map as described above? If the blob URL isn't in the map and has a "null" origin, we can conclude that the access check has failed.

Jian Li

Comment 17 2012-06-13 17:50:40 PDT

(In reply to comment #16) > > I am not sure what kind of user case we want to support for the document with unique security origin. > > The main use case I'm interested in is reading an object out of a FileSystem and then postMessaging the Blob into an sandboxed iframe. However, an easier case is probably something like the following: > > var b = new Blob(["...."], { type="image/png" }); > var u = webkitURL.createObjectURL(b); > var i = document.createElement("img"); > i.src = u; > document.appendChild(i); > > > We have to embed the origin in the blob URL because we need to perform security origin check when fetching an URL. If we do not embed the origin, we will have to get the origin of the page when the blob is spawned. For single-process platform, this should not be a problem. For multi-process platform, this means that an expensive and time-consuming IPC needs to be invoked to get the origin information from the browser process before the origin check could be performed. > > In the case of a unique origin, the URL is only usable in one process anyway because it's not possible for a document in another process to have the same origin. Perhaps we should just keep a map from blob URLs to SecurityOrigins in memory for the unique origin case? > > > This security origin check is only needed for using blob URL. Currently reading blob via FileReader is routed through blob URL. I think we can skip origin check for FileReader since we create a one-time used blob URL internally. How do you think about this? Another idea is to let FileReader read from blob data directly but this approach will only work single-process platform. If you think the latter is better, I can start to produce a patch for this. > > I'd rather solve the general problem if we can. What do you think about keeping an in-memory map as described above? If the blob URL isn't in the map and has a "null" origin, we can conclude that the access check has failed. Using in-memory map for unique origin case sounds like a good idea. I will proceed with this approach.

Adam Barth

Comment 18 2012-06-13 18:04:47 PDT

Thanks. I appreciate your talking though the design with me.

Jian Li

Comment 19 2012-06-14 15:59:16 PDT

Created attachment 147669 [details] Patch

WebKit Review Bot

Comment 20 2012-06-14 16:00:58 PDT

Attachment 147669 [details] did not pass style-queue: Failed to run "['Tools/Scripts/check-webkit-style', '--diff-files', u'LayoutTests/ChangeLog', u'LayoutTests/fast..." exit_code: 1 Source/WebCore/page/SecurityOrigin.cpp:99: Use 0 instead of NULL. [readability/null] [5] Source/WebCore/fileapi/ThreadableBlobRegistry.cpp:106: Use 0 or null instead of NULL (even in *comments*). [readability/null] [4] Source/WebCore/fileapi/ThreadableBlobRegistry.cpp:112: One line control clauses should not use braces. [whitespace/braces] [4] Total errors found: 3 in 13 files If any of these errors are false positives, please file a bug against check-webkit-style.

Jian Li

Comment 21 2012-06-14 16:06:07 PDT

Created attachment 147671 [details] Fix styles

Adam Barth

Comment 22 2012-06-14 16:08:21 PDT

Comment on attachment 147669 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=147669&action=review > LayoutTests/fast/files/file-reader-file-url.html:13 > +<iframe src="resources/file-reader-file-url-iframe.html"></iframe> Can you add a test in an iframe with the sandbox attribute? See the tests in http://trac.webkit.org/browser/trunk/LayoutTests/http/tests/security/ with *sandbox* for examples. >> Source/WebCore/fileapi/ThreadableBlobRegistry.cpp:106 >> + // If the blob URL contains NULL origin, as in the context with unique security origin or file URL, save the mapping between url and origin so that the origin can be retrived when doing security origin check. > > Use 0 or null instead of NULL (even in *comments*). [readability/null] [4] NULL -> "null" should make the style checker happy. >> Source/WebCore/fileapi/ThreadableBlobRegistry.cpp:112 >> + } else { > > One line control clauses should not use braces. [whitespace/braces] [4] The style checker is right here. > Source/WebCore/fileapi/ThreadableBlobRegistry.h:51 > + static PassRefPtr<SecurityOrigin> getCachedOrigin(const KURL&); getCachedOrigin -> cachedOrigin We usually skip the "get" prefix because get is such a weak verb. >> Source/WebCore/page/SecurityOrigin.cpp:99 >> + return NULL; > > Use 0 instead of NULL. [readability/null] [5] The style checker is right here.

Adam Barth

Comment 23 2012-06-14 16:09:37 PDT

Comment on attachment 147671 [details] Fix styles My only real substantial comment is to add a test for the sandboxed case. If you're feeling up to it, you might also want to add a test for postMessaging a Blob into a sandboxed origin. I suspect that will work fine too after this patch.

Adam Barth

Comment 24 2012-06-14 16:09:55 PDT

Thanks for fixing this bug!

Build Bot

Comment 25 2012-06-14 16:36:58 PDT

Comment on attachment 147671 [details] Fix styles Attachment 147671 [details] did not pass win-ews (win): Output: http://queues.webkit.org/results/12954535

Jian Li

Comment 26 2012-06-14 17:59:27 PDT

Committed as http://trac.webkit.org/changeset/120384.

WebKit Review Bot

Comment 27 2012-06-14 19:24:43 PDT

Re-opened since this is blocked by 89156

Hajime Morrita

Comment 28 2012-06-15 01:02:08 PDT

Created attachment 147768 [details] Patch for landing

Hajime Morrita

Comment 29 2012-06-15 01:03:09 PDT

My rollout was wrong. Let me land it again.

WebKit Review Bot

Comment 30 2012-06-15 02:57:31 PDT

Comment on attachment 147768 [details] Patch for landing Clearing flags on attachment: 147768 Committed r120433: <http://trac.webkit.org/changeset/120433>

WebKit Review Bot

Comment 31 2012-06-15 02:57:37 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version 528+ (Nightly build)

Hardware Unspecified

OS Unspecified

Product WebKit

Component WebCore Misc.

Assignee

Jian Li

Reported

2012-02-14 16:14 PST

Modified

2012-06-15 02:57 PDT History

CC List

9 users Show

URL

Keywords

Depends on

89156 89157

Blocks

Dependencies

tree graph