Bug 260771 - REGRESSION(266247@main): PDF "Save" button does nothing, "Print" function also broken
Summary: REGRESSION(266247@main): PDF "Save" button does nothing, "Print" function als...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: PDF (show other bugs)
Version: Other
Hardware: PC Linux
: P3 Normal
Assignee: Michael Catanzaro
URL:
Keywords: Gtk, InRadar
: 265186 (view as bug list)
Depends on:
Blocks:
 
Reported: 2023-08-26 09:07 PDT by Paul Bryan
Modified: 2023-11-21 05:29 PST (History)
8 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Paul Bryan 2023-08-26 09:07:30 PDT
When viewing a PDF in Epiphany and MiniBrowser, clicking on the "Save" button does nothing.

WebKitGTK version (from About Web -> Troubleshooting -> Debugging Information): 2.41.91
Distributor (Linux operating system, Flathub, Epiphany Tech Preview, etc.): Epiphany Tech Preview via Flathub
Comment 1 Michael Catanzaro 2023-09-26 12:42:35 PDT
There are hints in the web console:

[Warning] The download attribute on anchor was ignored because its href URL has a different security origin. (viewer.js, line 12388)
[Error] Not allowed to load local resource: blob:webkit-pdfjs-viewer://pdfjs/6e197ddb-9da6-43c4-8fe6-c0710b24c4f8
	download (viewer.js:12388)
	download (viewer.js:12433)
	(anonymous function) (viewer.js:835)
Comment 2 Michael Catanzaro 2023-09-26 16:29:24 PDT
Hm, it broke somewhere between 265737@main "Update to PDF.js v3.8.162" and 266427@main "Update to PDF.js v3.9.179". I will bisect this.
Comment 3 Michael Catanzaro 2023-09-26 17:38:30 PDT
Printing is broken too:

[Error] Not allowed to load local resource: blob:webkit-pdfjs-viewer://pdfjs/6921a4a0-4716-45bc-8fe6-df18b3bafb15
	(anonymous function) (viewer.js:13401)
Comment 4 Michael Catanzaro 2023-09-28 12:51:37 PDT
Well I found the commit to blame. We're going to have to think about how to handle this.

b1a659b38f09d05fd3e7b3b112066894ab099b8c is the first bad commit
commit b1a659b38f09d05fd3e7b3b112066894ab099b8c
Author: Anne van Kesteren <annevk@annevk.nl>
Date:   Mon Jul 24 08:42:00 2023 -0700

    Return opaque origin for blob: URL containing inner non-http(s): URL
    https://bugs.webkit.org/show_bug.cgi?id=257262
    rdar://109781193
    
    Reviewed by Alex Christensen and Darin Adler.
    
    Refactor SecurityOrigin so it is more clear blob: URLs are the sole
    special case. And change how we derive the blob: URL origin to align
    with the URL standard:
    
    * No longer perform percent-decoding (matches other browsers).
    * Restrict non-opaque origins to HTTP(S) URLs (will soon match other
      browsers). However:
    
      * Still give blob: URLs derived from file: origins an origin for now
        as removing that ability needs a bit more care. This currently goes
        against the URL standard, but that might change.
      * Also give registered schemes a pass to allow embedders to continue
        to use blob: URLs as they see fit.
    
    Also change BlobURL to rely more directly on SecurityOrigin.
    
    * LayoutTests/imported/w3c/web-platform-tests/url/a-element-origin-expected.txt:
    * LayoutTests/imported/w3c/web-platform-tests/url/a-element-origin-xhtml-expected.txt:
    * LayoutTests/imported/w3c/web-platform-tests/url/url-origin.any-expected.txt:
    * LayoutTests/imported/w3c/web-platform-tests/url/url-origin.any.worker-expected.txt:
    * Source/WebCore/fileapi/BlobURL.cpp:
    (WebCore::BlobURL::getOriginURL):
    (WebCore::BlobURL::isSecureBlobURL):
    * Source/WebCore/fileapi/ThreadableBlobRegistry.cpp:
    (WebCore::ThreadableBlobRegistry::getCachedOrigin):
    * Source/WebCore/page/SecurityOrigin.cpp:
    (WebCore::SecurityOrigin::create):
    (WebCore::SecurityOrigin::forBlobURL):
    (WebCore::SecurityOrigin::isSecure):
    (WebCore::SecurityOrigin::shouldUseInnerURL): Deleted.
    (WebCore::SecurityOrigin::extractInnerURL): Deleted.
    * Source/WebCore/page/SecurityOrigin.h:
    * Source/WebCore/page/SecurityOriginData.cpp:
    (WebCore::SecurityOriginData::shouldTreatAsOpaqueOrigin):
    
    Canonical link: https://commits.webkit.org/266247@main

 .../url/a-element-origin-expected.txt              | 10 ++--
 .../url/a-element-origin-xhtml-expected.txt        | 10 ++--
 .../url/url-origin.any-expected.txt                | 10 ++--
 .../url/url-origin.any.worker-expected.txt         | 10 ++--
 Source/WebCore/fileapi/BlobURL.cpp                 | 14 ++----
 Source/WebCore/fileapi/ThreadableBlobRegistry.cpp  |  3 +-
 Source/WebCore/page/SecurityOrigin.cpp             | 57 ++++++++++++----------
 Source/WebCore/page/SecurityOrigin.h               | 18 ++-----
 Source/WebCore/page/SecurityOriginData.cpp         | 13 ++---
 9 files changed, 66 insertions(+), 79 deletions(-)
Comment 5 Michael Catanzaro 2023-09-28 15:41:37 PDT
Hey Anne, this is really simple but fixes the bug. Does it look OK?

diff --git a/Source/WebCore/page/SecurityOrigin.cpp b/Source/WebCore/page/SecurityOrigin.cpp
index 605e0523d462..bdeba62901d8 100644
--- a/Source/WebCore/page/SecurityOrigin.cpp
+++ b/Source/WebCore/page/SecurityOrigin.cpp
@@ -178,6 +178,7 @@ inline bool isSafelistedBlobProtocol(const URL& url)
     // except that assert gets hit on certain tests.
     return url.protocolIsInHTTPFamily()
         || url.protocolIsFile()
+        || url.protocol() == "webkit-pdfjs-viewer"_s
         || LegacySchemeRegistry::schemeIsHandledBySchemeHandler(url.protocol());
 }

I'm thinking that's even correct rather than a hack. Will propose a pull request tomorrow probably.
Comment 6 Anne van Kesteren 2023-09-29 01:25:02 PDT
The problem with that fix is that then

> new URL("webkit-pdfjs-viewer:blah").origin

would not return "null" as I understand it. So an implementation detail of a WebKit port would become web-exposed (and non-compliant with the URL standard).

I'd prefer not shipping that in Safari.
Comment 7 Anne van Kesteren 2023-09-29 01:27:24 PDT
To elaborate, it's probably acceptable, but if we can scope it with an ifdef or feature flag that would be preferable. And if it's not too much to ask maybe add a test for the specific scheme so it's clear what we don't want to regress?
Comment 8 Michael Catanzaro 2023-09-29 06:54:28 PDT
(In reply to Anne van Kesteren from comment #6)
> I'd prefer not shipping that in Safari.

I'll guard it behind ENABLE(PDFJS). If you use PDF.js, you're going to need it. I think you *do* build with that enabled, though? It was originally added for Apple platforms.

(In reply to Anne van Kesteren from comment #7)
> And if it's not too much to ask
> maybe add a test for the specific scheme so it's clear what we don't want to
> regress?

I'll think about this....
Comment 9 Anne van Kesteren 2023-09-29 07:31:09 PDT
Bug 242263 suggests we don't.

The test could be as simple as asserting that

  new URL("webkit-pdfjs-viewer:blah").origin

is

  "null"

and PDFJS builds would mark that as failure.
Comment 10 Michael Catanzaro 2023-09-29 07:57:04 PDT
I'd rather do it the other way around and mark it as a failure for ports that don't use PDF.js, since the goal is to ensure we don't break PDF.js. :P
Comment 11 Anne van Kesteren 2023-09-29 08:28:55 PDT
Well, my goal is we don't break the URL Standard, which the current architecture of PDFJS apparently requires (perhaps worth a follow-up bug). I guess you can have the same test twice.
Comment 12 Michael Catanzaro 2023-09-29 08:53:13 PDT
I'm making PDF.js function the same as all URL schemes registered by applications, so I guess we already violate this standard for all such schemes?
Comment 13 Anne van Kesteren 2023-09-29 09:47:17 PDT
That's right, but the standards really only cover browsers, not a browser engine embedded in an arbitrary app.
Comment 14 Michael Catanzaro 2023-09-29 10:50:47 PDT
Browsers register lots of URL schemes; at least Epiphany has URL schemes for view source mode, reader mode, and about handlers. (There's also the built-in webkit:// scheme, although that one probably doesn't use LegacySchemeRegistry and so wouldn't hit this codepath.)

Anyway, since Apple is clearly planning to turn on PDF.js support, I don't think we need the same test twice with different expectations.
Comment 15 Tim Nguyen (:ntim) 2023-09-29 11:16:57 PDT
(In reply to Michael Catanzaro from comment #14)
> Browsers register lots of URL schemes; at least Epiphany has URL schemes for
> view source mode, reader mode, and about handlers. (There's also the
> built-in webkit:// scheme, although that one probably doesn't use
> LegacySchemeRegistry and so wouldn't hit this codepath.)
> 
> Anyway, since Apple is clearly planning to turn on PDF.js support, I don't
> think we need the same test twice with different expectations.

There are no plans at this time. I agree we shouldn't regress the standard.

If we can somehow use checks against PDFDocument to workaround the origin issues, that might work.
Comment 16 Michael Catanzaro 2023-09-29 12:26:09 PDT
(In reply to Anne van Kesteren from comment #6)
> The problem with that fix is that then
> 
> > new URL("webkit-pdfjs-viewer:blah").origin
> 
> would not return "null" as I understand it. So an implementation detail of a
> WebKit port would become web-exposed (and non-compliant with the URL
> standard).
> 
> I'd prefer not shipping that in Safari.

Actually wait, when testing this I noticed that the origin is already not null without making changes. Testing unmodified WebKitGTK in the inspector console:

> new URL("webkit-pdfjs-viewer:blah").origin
< "webkit-pdfjs-viewer://"
> new URL("webkit-pdfjs-viewerr:blah").origin
< "null"

That is, it's already treated specially without any changes. I think the change here is only concerned with the security origin of blob URLs specifically.

A few more tests, just to see:

> new URL("foo:bar").origin
< "null"
> new URL("ephy-about://blank").origin
< "ephy-about://blank"
> new URL("webkit://gpu").origin
< "webkit://gpu"
Comment 17 Michael Catanzaro 2023-09-29 12:43:33 PDT
So I think following the spec is impossible and WebKit already does not attempt to do so. I'm looking at https://url.spec.whatwg.org/#origin (currently section 4.7 "Origin") and it's clear that the security origin of URLs for most protocols is defined to be always opaque. The only exceptions are blob, ftp, http, https, ws, wss, and file.

But if we were to do that, then there's just no way that custom protocols can ever work. WebKit's SecurityOriginData::shouldTreatAsOpaqueOrigin already has a big list of WebKit protocols that should receive non-opaque security origins, including a bunch of Apple-specific ones (applewebdata, x-apple-ql-id, x-apple-ql-id2, x-apple-ql-magic) and the Linux-specific resource:// protocol. The list is there because the alternative to make the protocols work would be to disable same origin policy entirely, which would be much worse. For example, webkit-pdfjs-viewer pages should be able to load other webkit-pdfjs-viewer pages with the same security origin. resource pages should be able to load other resource pages. If the origins are opaque, it cannot work.

So I think this is OK. The spec handles standard web protocols. We can say we're extending it for the benefit of WebKit-specific protocols and custom protocols.

I'm now I'm wondering if isSafelistedBlobProtocol can be replaced by SecurityOriginData::shouldTreatAsOpaqueOrigin:

diff --git a/Source/WebCore/page/SecurityOrigin.cpp b/Source/WebCore/page/SecurityOrigin.cpp
index 605e0523d462..488d13c5befa 100644
--- a/Source/WebCore/page/SecurityOrigin.cpp
+++ b/Source/WebCore/page/SecurityOrigin.cpp
@@ -169,18 +169,6 @@ Ref<SecurityOrigin> SecurityOrigin::create(const URL& url)
     return adoptRef(*new SecurityOrigin(url));
 }
 
-inline bool isSafelistedBlobProtocol(const URL& url)
-{
-    if (!url.isValid())
-        return false;
-
-    // FIXME: we ought to assert we're in WebKitLegacy or a web content process as per 263652@main,
-    // except that assert gets hit on certain tests.
-    return url.protocolIsInHTTPFamily()
-        || url.protocolIsFile()
-        || LegacySchemeRegistry::schemeIsHandledBySchemeHandler(url.protocol());
-}
-
 Ref<SecurityOrigin> SecurityOrigin::createForBlobURL(const URL& url)
 {
     ASSERT(url.protocolIsBlob());
@@ -189,7 +177,7 @@ Ref<SecurityOrigin> SecurityOrigin::createForBlobURL(const URL& url)
         return origin.releaseNonNull();
 
     URL pathURL { url.path().toString() };
-    if (isSafelistedBlobProtocol(pathURL))
+    if (!SecurityOriginData::shouldTreatAsOpaqueOrigin(pathURL))
         return adoptRef(*new SecurityOrigin(pathURL));
 
     return createOpaque();

I think I'll try this....
Comment 18 Michael Catanzaro 2023-09-29 12:48:32 PDT
I think we're also missing the first step in the algorithm for computing the origin of a blob URL: "If url’s blob URL entry is non-null, then return url’s blob URL entry’s environment’s origin." But I also don't understand what that means. :)
Comment 19 Michael Catanzaro 2023-09-29 14:16:59 PDT
(In reply to Michael Catanzaro from comment #17)
> So I think following the spec is impossible and WebKit already does not
> attempt to do so. I'm looking at https://url.spec.whatwg.org/#origin
> (currently section 4.7 "Origin") and it's clear that the security origin of
> URLs for most protocols is defined to be always opaque. The only exceptions
> are blob, ftp, http, https, ws, wss, and file.

...
 
> I'm now I'm wondering if isSafelistedBlobProtocol can be replaced by
> SecurityOriginData::shouldTreatAsOpaqueOrigin:

OK, SecurityOriginData::shouldTreatAsOpaqueOrigin says it implements "https://url.spec.whatwg.org/#origin with some additions" added by Alex in 233783@main "Non-special URLs should have an opaque origin" so he already read this section of the spec and decided to handle special protocols specially. So I'm increasingly confident this makes sense.
Comment 20 Michael Catanzaro 2023-09-29 14:26:43 PDT
Pull request: https://github.com/WebKit/WebKit/pull/18438
Comment 21 EWS 2023-11-06 10:07:44 PST
Committed 270274@main (58df23fe5ac5): <https://commits.webkit.org/270274@main>

Reviewed commits have been landed. Closing PR #18438 and removing active labels.
Comment 22 Radar WebKit Bug Importer 2023-11-06 10:08:17 PST
<rdar://problem/118005690>
Comment 23 Michael Catanzaro 2023-11-21 05:29:39 PST
*** Bug 265186 has been marked as a duplicate of this bug. ***