<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://bugs.webkit.org/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4.1"
          urlbase="https://bugs.webkit.org/"
          
          maintainer="admin@webkit.org"
>

    <bug>
          <bug_id>260771</bug_id>
          
          <creation_ts>2023-08-26 09:07:30 -0700</creation_ts>
          <short_desc>REGRESSION(266247@main): PDF &quot;Save&quot; button does nothing, &quot;Print&quot; function also broken</short_desc>
          <delta_ts>2023-11-21 05:29:39 -0800</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WebKit</product>
          <component>PDF</component>
          <version>Other</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords>Gtk, InRadar</keywords>
          <priority>P3</priority>
          <bug_severity>Normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Paul Bryan">pbryan</reporter>
          <assigned_to name="Michael Catanzaro">mcatanzaro</assigned_to>
          <cc>abrarsl2002</cc>
    
    <cc>achristensen</cc>
    
    <cc>annevk</cc>
    
    <cc>bugs-noreply</cc>
    
    <cc>mcatanzaro</cc>
    
    <cc>ntim</cc>
    
    <cc>thorton</cc>
    
    <cc>webkit-bug-importer</cc>
          

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>1973778</commentid>
    <comment_count>0</comment_count>
    <who name="Paul Bryan">pbryan</who>
    <bug_when>2023-08-26 09:07:30 -0700</bug_when>
    <thetext>When viewing a PDF in Epiphany and MiniBrowser, clicking on the &quot;Save&quot; button does nothing.

WebKitGTK version (from About Web -&gt; Troubleshooting -&gt; Debugging Information): 2.41.91
Distributor (Linux operating system, Flathub, Epiphany Tech Preview, etc.): Epiphany Tech Preview via Flathub</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1980257</commentid>
    <comment_count>1</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-26 12:42:35 -0700</bug_when>
    <thetext>There are hints in the web console:

[Warning] The download attribute on anchor was ignored because its href URL has a different security origin. (viewer.js, line 12388)
[Error] Not allowed to load local resource: blob:webkit-pdfjs-viewer://pdfjs/6e197ddb-9da6-43c4-8fe6-c0710b24c4f8
	download (viewer.js:12388)
	download (viewer.js:12433)
	(anonymous function) (viewer.js:835)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1980335</commentid>
    <comment_count>2</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-26 16:29:24 -0700</bug_when>
    <thetext>Hm, it broke somewhere between 265737@main &quot;Update to PDF.js v3.8.162&quot; and 266427@main &quot;Update to PDF.js v3.9.179&quot;. I will bisect this.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1980357</commentid>
    <comment_count>3</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-26 17:38:30 -0700</bug_when>
    <thetext>Printing is broken too:

[Error] Not allowed to load local resource: blob:webkit-pdfjs-viewer://pdfjs/6921a4a0-4716-45bc-8fe6-df18b3bafb15
	(anonymous function) (viewer.js:13401)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981065</commentid>
    <comment_count>4</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-28 12:51:37 -0700</bug_when>
    <thetext>Well I found the commit to blame. We&apos;re going to have to think about how to handle this.

b1a659b38f09d05fd3e7b3b112066894ab099b8c is the first bad commit
commit b1a659b38f09d05fd3e7b3b112066894ab099b8c
Author: Anne van Kesteren &lt;annevk@annevk.nl&gt;
Date:   Mon Jul 24 08:42:00 2023 -0700

    Return opaque origin for blob: URL containing inner non-http(s): URL
    https://bugs.webkit.org/show_bug.cgi?id=257262
    rdar://109781193
    
    Reviewed by Alex Christensen and Darin Adler.
    
    Refactor SecurityOrigin so it is more clear blob: URLs are the sole
    special case. And change how we derive the blob: URL origin to align
    with the URL standard:
    
    * No longer perform percent-decoding (matches other browsers).
    * Restrict non-opaque origins to HTTP(S) URLs (will soon match other
      browsers). However:
    
      * Still give blob: URLs derived from file: origins an origin for now
        as removing that ability needs a bit more care. This currently goes
        against the URL standard, but that might change.
      * Also give registered schemes a pass to allow embedders to continue
        to use blob: URLs as they see fit.
    
    Also change BlobURL to rely more directly on SecurityOrigin.
    
    * LayoutTests/imported/w3c/web-platform-tests/url/a-element-origin-expected.txt:
    * LayoutTests/imported/w3c/web-platform-tests/url/a-element-origin-xhtml-expected.txt:
    * LayoutTests/imported/w3c/web-platform-tests/url/url-origin.any-expected.txt:
    * LayoutTests/imported/w3c/web-platform-tests/url/url-origin.any.worker-expected.txt:
    * Source/WebCore/fileapi/BlobURL.cpp:
    (WebCore::BlobURL::getOriginURL):
    (WebCore::BlobURL::isSecureBlobURL):
    * Source/WebCore/fileapi/ThreadableBlobRegistry.cpp:
    (WebCore::ThreadableBlobRegistry::getCachedOrigin):
    * Source/WebCore/page/SecurityOrigin.cpp:
    (WebCore::SecurityOrigin::create):
    (WebCore::SecurityOrigin::forBlobURL):
    (WebCore::SecurityOrigin::isSecure):
    (WebCore::SecurityOrigin::shouldUseInnerURL): Deleted.
    (WebCore::SecurityOrigin::extractInnerURL): Deleted.
    * Source/WebCore/page/SecurityOrigin.h:
    * Source/WebCore/page/SecurityOriginData.cpp:
    (WebCore::SecurityOriginData::shouldTreatAsOpaqueOrigin):
    
    Canonical link: https://commits.webkit.org/266247@main

 .../url/a-element-origin-expected.txt              | 10 ++--
 .../url/a-element-origin-xhtml-expected.txt        | 10 ++--
 .../url/url-origin.any-expected.txt                | 10 ++--
 .../url/url-origin.any.worker-expected.txt         | 10 ++--
 Source/WebCore/fileapi/BlobURL.cpp                 | 14 ++----
 Source/WebCore/fileapi/ThreadableBlobRegistry.cpp  |  3 +-
 Source/WebCore/page/SecurityOrigin.cpp             | 57 ++++++++++++----------
 Source/WebCore/page/SecurityOrigin.h               | 18 ++-----
 Source/WebCore/page/SecurityOriginData.cpp         | 13 ++---
 9 files changed, 66 insertions(+), 79 deletions(-)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981121</commentid>
    <comment_count>5</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-28 15:41:37 -0700</bug_when>
    <thetext>Hey Anne, this is really simple but fixes the bug. Does it look OK?

diff --git a/Source/WebCore/page/SecurityOrigin.cpp b/Source/WebCore/page/SecurityOrigin.cpp
index 605e0523d462..bdeba62901d8 100644
--- a/Source/WebCore/page/SecurityOrigin.cpp
+++ b/Source/WebCore/page/SecurityOrigin.cpp
@@ -178,6 +178,7 @@ inline bool isSafelistedBlobProtocol(const URL&amp; url)
     // except that assert gets hit on certain tests.
     return url.protocolIsInHTTPFamily()
         || url.protocolIsFile()
+        || url.protocol() == &quot;webkit-pdfjs-viewer&quot;_s
         || LegacySchemeRegistry::schemeIsHandledBySchemeHandler(url.protocol());
 }

I&apos;m thinking that&apos;s even correct rather than a hack. Will propose a pull request tomorrow probably.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981219</commentid>
    <comment_count>6</comment_count>
    <who name="Anne van Kesteren">annevk</who>
    <bug_when>2023-09-29 01:25:02 -0700</bug_when>
    <thetext>The problem with that fix is that then

&gt; new URL(&quot;webkit-pdfjs-viewer:blah&quot;).origin

would not return &quot;null&quot; as I understand it. So an implementation detail of a WebKit port would become web-exposed (and non-compliant with the URL standard).

I&apos;d prefer not shipping that in Safari.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981220</commentid>
    <comment_count>7</comment_count>
    <who name="Anne van Kesteren">annevk</who>
    <bug_when>2023-09-29 01:27:24 -0700</bug_when>
    <thetext>To elaborate, it&apos;s probably acceptable, but if we can scope it with an ifdef or feature flag that would be preferable. And if it&apos;s not too much to ask maybe add a test for the specific scheme so it&apos;s clear what we don&apos;t want to regress?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981261</commentid>
    <comment_count>8</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 06:54:28 -0700</bug_when>
    <thetext>(In reply to Anne van Kesteren from comment #6)
&gt; I&apos;d prefer not shipping that in Safari.

I&apos;ll guard it behind ENABLE(PDFJS). If you use PDF.js, you&apos;re going to need it. I think you *do* build with that enabled, though? It was originally added for Apple platforms.

(In reply to Anne van Kesteren from comment #7)
&gt; And if it&apos;s not too much to ask
&gt; maybe add a test for the specific scheme so it&apos;s clear what we don&apos;t want to
&gt; regress?

I&apos;ll think about this....</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981267</commentid>
    <comment_count>9</comment_count>
    <who name="Anne van Kesteren">annevk</who>
    <bug_when>2023-09-29 07:31:09 -0700</bug_when>
    <thetext>Bug 242263 suggests we don&apos;t.

The test could be as simple as asserting that

  new URL(&quot;webkit-pdfjs-viewer:blah&quot;).origin

is

  &quot;null&quot;

and PDFJS builds would mark that as failure.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981276</commentid>
    <comment_count>10</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 07:57:04 -0700</bug_when>
    <thetext>I&apos;d rather do it the other way around and mark it as a failure for ports that don&apos;t use PDF.js, since the goal is to ensure we don&apos;t break PDF.js. :P</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981282</commentid>
    <comment_count>11</comment_count>
    <who name="Anne van Kesteren">annevk</who>
    <bug_when>2023-09-29 08:28:55 -0700</bug_when>
    <thetext>Well, my goal is we don&apos;t break the URL Standard, which the current architecture of PDFJS apparently requires (perhaps worth a follow-up bug). I guess you can have the same test twice.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981288</commentid>
    <comment_count>12</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 08:53:13 -0700</bug_when>
    <thetext>I&apos;m making PDF.js function the same as all URL schemes registered by applications, so I guess we already violate this standard for all such schemes?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981306</commentid>
    <comment_count>13</comment_count>
    <who name="Anne van Kesteren">annevk</who>
    <bug_when>2023-09-29 09:47:17 -0700</bug_when>
    <thetext>That&apos;s right, but the standards really only cover browsers, not a browser engine embedded in an arbitrary app.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981323</commentid>
    <comment_count>14</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 10:50:47 -0700</bug_when>
    <thetext>Browsers register lots of URL schemes; at least Epiphany has URL schemes for view source mode, reader mode, and about handlers. (There&apos;s also the built-in webkit:// scheme, although that one probably doesn&apos;t use LegacySchemeRegistry and so wouldn&apos;t hit this codepath.)

Anyway, since Apple is clearly planning to turn on PDF.js support, I don&apos;t think we need the same test twice with different expectations.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981329</commentid>
    <comment_count>15</comment_count>
    <who name="Tim Nguyen (:ntim)">ntim</who>
    <bug_when>2023-09-29 11:16:57 -0700</bug_when>
    <thetext>(In reply to Michael Catanzaro from comment #14)
&gt; Browsers register lots of URL schemes; at least Epiphany has URL schemes for
&gt; view source mode, reader mode, and about handlers. (There&apos;s also the
&gt; built-in webkit:// scheme, although that one probably doesn&apos;t use
&gt; LegacySchemeRegistry and so wouldn&apos;t hit this codepath.)
&gt; 
&gt; Anyway, since Apple is clearly planning to turn on PDF.js support, I don&apos;t
&gt; think we need the same test twice with different expectations.

There are no plans at this time. I agree we shouldn&apos;t regress the standard.

If we can somehow use checks against PDFDocument to workaround the origin issues, that might work.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981360</commentid>
    <comment_count>16</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 12:26:09 -0700</bug_when>
    <thetext>(In reply to Anne van Kesteren from comment #6)
&gt; The problem with that fix is that then
&gt; 
&gt; &gt; new URL(&quot;webkit-pdfjs-viewer:blah&quot;).origin
&gt; 
&gt; would not return &quot;null&quot; as I understand it. So an implementation detail of a
&gt; WebKit port would become web-exposed (and non-compliant with the URL
&gt; standard).
&gt; 
&gt; I&apos;d prefer not shipping that in Safari.

Actually wait, when testing this I noticed that the origin is already not null without making changes. Testing unmodified WebKitGTK in the inspector console:

&gt; new URL(&quot;webkit-pdfjs-viewer:blah&quot;).origin
&lt; &quot;webkit-pdfjs-viewer://&quot;
&gt; new URL(&quot;webkit-pdfjs-viewerr:blah&quot;).origin
&lt; &quot;null&quot;

That is, it&apos;s already treated specially without any changes. I think the change here is only concerned with the security origin of blob URLs specifically.

A few more tests, just to see:

&gt; new URL(&quot;foo:bar&quot;).origin
&lt; &quot;null&quot;
&gt; new URL(&quot;ephy-about://blank&quot;).origin
&lt; &quot;ephy-about://blank&quot;
&gt; new URL(&quot;webkit://gpu&quot;).origin
&lt; &quot;webkit://gpu&quot;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981365</commentid>
    <comment_count>17</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 12:43:33 -0700</bug_when>
    <thetext>So I think following the spec is impossible and WebKit already does not attempt to do so. I&apos;m looking at https://url.spec.whatwg.org/#origin (currently section 4.7 &quot;Origin&quot;) and it&apos;s clear that the security origin of URLs for most protocols is defined to be always opaque. The only exceptions are blob, ftp, http, https, ws, wss, and file.

But if we were to do that, then there&apos;s just no way that custom protocols can ever work. WebKit&apos;s SecurityOriginData::shouldTreatAsOpaqueOrigin already has a big list of WebKit protocols that should receive non-opaque security origins, including a bunch of Apple-specific ones (applewebdata, x-apple-ql-id, x-apple-ql-id2, x-apple-ql-magic) and the Linux-specific resource:// protocol. The list is there because the alternative to make the protocols work would be to disable same origin policy entirely, which would be much worse. For example, webkit-pdfjs-viewer pages should be able to load other webkit-pdfjs-viewer pages with the same security origin. resource pages should be able to load other resource pages. If the origins are opaque, it cannot work.

So I think this is OK. The spec handles standard web protocols. We can say we&apos;re extending it for the benefit of WebKit-specific protocols and custom protocols.

I&apos;m now I&apos;m wondering if isSafelistedBlobProtocol can be replaced by SecurityOriginData::shouldTreatAsOpaqueOrigin:

diff --git a/Source/WebCore/page/SecurityOrigin.cpp b/Source/WebCore/page/SecurityOrigin.cpp
index 605e0523d462..488d13c5befa 100644
--- a/Source/WebCore/page/SecurityOrigin.cpp
+++ b/Source/WebCore/page/SecurityOrigin.cpp
@@ -169,18 +169,6 @@ Ref&lt;SecurityOrigin&gt; SecurityOrigin::create(const URL&amp; url)
     return adoptRef(*new SecurityOrigin(url));
 }
 
-inline bool isSafelistedBlobProtocol(const URL&amp; url)
-{
-    if (!url.isValid())
-        return false;
-
-    // FIXME: we ought to assert we&apos;re in WebKitLegacy or a web content process as per 263652@main,
-    // except that assert gets hit on certain tests.
-    return url.protocolIsInHTTPFamily()
-        || url.protocolIsFile()
-        || LegacySchemeRegistry::schemeIsHandledBySchemeHandler(url.protocol());
-}
-
 Ref&lt;SecurityOrigin&gt; SecurityOrigin::createForBlobURL(const URL&amp; url)
 {
     ASSERT(url.protocolIsBlob());
@@ -189,7 +177,7 @@ Ref&lt;SecurityOrigin&gt; SecurityOrigin::createForBlobURL(const URL&amp; url)
         return origin.releaseNonNull();
 
     URL pathURL { url.path().toString() };
-    if (isSafelistedBlobProtocol(pathURL))
+    if (!SecurityOriginData::shouldTreatAsOpaqueOrigin(pathURL))
         return adoptRef(*new SecurityOrigin(pathURL));
 
     return createOpaque();

I think I&apos;ll try this....</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981369</commentid>
    <comment_count>18</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 12:48:32 -0700</bug_when>
    <thetext>I think we&apos;re also missing the first step in the algorithm for computing the origin of a blob URL: &quot;If url’s blob URL entry is non-null, then return url’s blob URL entry’s environment’s origin.&quot; But I also don&apos;t understand what that means. :)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981409</commentid>
    <comment_count>19</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 14:16:59 -0700</bug_when>
    <thetext>(In reply to Michael Catanzaro from comment #17)
&gt; So I think following the spec is impossible and WebKit already does not
&gt; attempt to do so. I&apos;m looking at https://url.spec.whatwg.org/#origin
&gt; (currently section 4.7 &quot;Origin&quot;) and it&apos;s clear that the security origin of
&gt; URLs for most protocols is defined to be always opaque. The only exceptions
&gt; are blob, ftp, http, https, ws, wss, and file.

...
 
&gt; I&apos;m now I&apos;m wondering if isSafelistedBlobProtocol can be replaced by
&gt; SecurityOriginData::shouldTreatAsOpaqueOrigin:

OK, SecurityOriginData::shouldTreatAsOpaqueOrigin says it implements &quot;https://url.spec.whatwg.org/#origin with some additions&quot; added by Alex in 233783@main &quot;Non-special URLs should have an opaque origin&quot; so he already read this section of the spec and decided to handle special protocols specially. So I&apos;m increasingly confident this makes sense.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1981414</commentid>
    <comment_count>20</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-09-29 14:26:43 -0700</bug_when>
    <thetext>Pull request: https://github.com/WebKit/WebKit/pull/18438</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1990206</commentid>
    <comment_count>21</comment_count>
    <who name="EWS">ews-feeder</who>
    <bug_when>2023-11-06 10:07:44 -0800</bug_when>
    <thetext>Committed 270274@main (58df23fe5ac5): &lt;https://commits.webkit.org/270274@main&gt;

Reviewed commits have been landed. Closing PR #18438 and removing active labels.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1990207</commentid>
    <comment_count>22</comment_count>
    <who name="Radar WebKit Bug Importer">webkit-bug-importer</who>
    <bug_when>2023-11-06 10:08:17 -0800</bug_when>
    <thetext>&lt;rdar://problem/118005690&gt;</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>1994150</commentid>
    <comment_count>23</comment_count>
    <who name="Michael Catanzaro">mcatanzaro</who>
    <bug_when>2023-11-21 05:29:39 -0800</bug_when>
    <thetext>*** Bug 265186 has been marked as a duplicate of this bug. ***</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>