RESOLVED FIXED 236062
Setup a custom URL scheme for PDF.js resources
https://bugs.webkit.org/show_bug.cgi?id=236062
Summary Setup a custom URL scheme for PDF.js resources
Tim Nguyen (:ntim)
Reported 2022-02-03 00:22:56 PST
A custom URL scheme
Attachments
Patch (18.53 KB, patch)
2022-02-10 02:31 PST, Tim Nguyen (:ntim)
no flags
Patch (13.51 KB, patch)
2022-02-10 07:38 PST, Tim Nguyen (:ntim)
ews-feeder: commit-queue-
Patch (19.23 KB, patch)
2022-02-10 07:40 PST, Tim Nguyen (:ntim)
no flags
Patch (19.31 KB, patch)
2022-02-10 07:43 PST, Tim Nguyen (:ntim)
youennf: review+
[fast-cq] Patch for landing (19.12 KB, patch)
2022-02-10 09:43 PST, Tim Nguyen (:ntim)
no flags
Radar WebKit Bug Importer
Comment 1 2022-02-03 00:23:31 PST
Tim Nguyen (:ntim)
Comment 2 2022-02-10 02:31:47 PST
youenn fablet
Comment 3 2022-02-10 04:29:10 PST
Comment on attachment 451509 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=451509&action=review > Source/WebCore/ChangeLog:11 > + https://bugs.webkit.org/show_bug.cgi?id=235981 takes care of populating this folder at build-time. Why not doing https://bugs.webkit.org/show_bug.cgi?id=235981 first and then https://bugs.webkit.org/show_bug.cgi?id=236062 This would allow to enable tests when landing this patch. > Source/WebCore/loader/MixedContentChecker.cpp:68 > + if (frame.ownerElement() && frame.ownerElement()->document().isPDFDocument() && url.protocolIs("webkit-pdfjs-viewer")) There are a lot of protocolIs("webkit-pdfjs-viewer"), might be worth adding a isPDFJsURL(const URL&) for instance. > Source/WebCore/loader/MixedContentChecker.cpp:95 > + if (frame.document()->isPDFDocument() && url.protocolIs("webkit-pdfjs-viewer")) Maybe we can teach WebKit to consider webkit-pdfjs-viewer is secure through our legacy scheme handlers. > Source/WebCore/loader/cocoa/PDFJSResourceLoader.mm:55 > + loadQueue().dispatch([protectedLoader = Ref { loader }, url]() mutable { Need to isolatedCopy the url to be thread safe. Something like url = url.isolatedCopy(). > Source/WebCore/loader/cocoa/PDFJSResourceLoader.mm:56 > + NSString *relativePath = [@"pdfjs/" stringByAppendingString: (static_cast<NSURL*>(url).path)]; It seems this is mostly PDFJS agnostic. Maybe we could generalise this by adding a few parameters to loadResource, and rename it to something like loadResourceFromBundle. > Source/WebCore/loader/cocoa/PDFJSResourceLoader.mm:60 > + NSData *data = [NSData dataWithContentsOfFile:path]; auto * for all 3 variables? > Source/WebCore/loader/cocoa/PDFJSResourceLoader.mm:63 > + RunLoop::main().dispatch([protectedLoader = WTFMove(protectedLoader), url] { Ditto for isolating copy the URL. Something like url = WTFMove(url).isolatedCopy() to remove a memory copy. Or we could try merging both dispatch in one by dispatch capturing a RefPtr<FragmentedSharedBuffer> > Source/WebCore/loader/cocoa/PDFJSResourceLoader.mm:69 > + RunLoop::main().dispatch([protectedLoader = WTFMove(protectedLoader), url, buffer = FragmentedSharedBuffer::create(data)] { Can probably use SharedBuffer instead of FragmentedSharedBuffer. Ditto for isolating copy the URL. > Source/WebCore/platform/LegacySchemeRegistry.cpp:83 > + "webkit-pdfjs-viewer", Why is this needed? Maybe we can try making webkit-pdfjs-viewer a little bit more than that. Something like part of builtinSecureSchemes and builtinSchemesWithUniqueOrigins. Worth some additional thoughts there. > Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:279 > + return false; We could try to fail webkit-pdfjs-viewer loads here if load is a=subresource load and its document is not PDF. Doing those checks here might allow removing some of the other checks in WebCore. It might be worth adding some tests like: fetch("webkit-pdfjs-viewer://xxx") and see whether it fails for regular http/https loads.
Tim Nguyen (:ntim)
Comment 4 2022-02-10 07:33:04 PST
Comment on attachment 451509 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=451509&action=review >> Source/WebCore/ChangeLog:11 >> + https://bugs.webkit.org/show_bug.cgi?id=235981 takes care of populating this folder at build-time. > > Why not doing https://bugs.webkit.org/show_bug.cgi?id=235981 first and then https://bugs.webkit.org/show_bug.cgi?id=236062 > This would allow to enable tests when landing this patch. https://bugs.webkit.org/show_bug.cgi?id=235981 is blocked on bundle size concerns. We'd like to land this first for prototyping, and to unblock other subtasks too.
Tim Nguyen (:ntim)
Comment 5 2022-02-10 07:38:18 PST
Tim Nguyen (:ntim)
Comment 6 2022-02-10 07:40:52 PST
Tim Nguyen (:ntim)
Comment 7 2022-02-10 07:43:39 PST
youenn fablet
Comment 8 2022-02-10 08:03:57 PST
Comment on attachment 451533 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=451533&action=review > Source/WebCore/loader/PolicyChecker.cpp:173 > + bool isInPDFDocumentFrame = m_frame.ownerElement() && m_frame.ownerElement()->document().isPDFDocument(); What about main frame loads? > Source/WebCore/loader/PolicyChecker.cpp:174 > + if (isInPDFDocumentFrame && !request.isNull() && request.url().protocolIs("webkit-pdfjs-viewer")) { I would guess there cannot be a case where request is null but request url protocol is matching viewer, so we can probably remove !request.isNull() check. > Source/WebCore/loader/ResourceLoader.cpp:881 > + } Maybe we can write it this way: if (options().mode == FetchOptions::Mode::Navigate) return true; auto* frame = resourceLoader.frame(); auto* document = frame ? frame->document() : nullptr; return document ? document->isPDFDocument() : false; > Source/WebCore/loader/cocoa/BundleResourceLoader.h:30 > +void loadResourceFromBundle(WebCore::ResourceLoader&, const String&); s/WebCore:://. and forward declare ResourceLoader. Might want to add String forward declaration as well. > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:55 > + loadQueue().dispatch([protectedLoader = Ref { loader }, url = url.isolatedCopy(), subdirectory = subdirectory.isolatedCopy()]() mutable { url = loader.request().url().isolatedCopy() to reduce count churning. > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:69 > + auto mimeType = MIMETypeRegistry::mimeTypeForPath(url.path().toString()); Given mimeTypeForPath is only looking at the suffix, can we do: MIMETypeRegistry::mimeTypeForPath(url.string()) as a small optimization? > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:74 > + response.addHTTPHeaderField(HTTPHeaderName::AccessControlAllowOrigin, "*"); Is it needed? AccessControlAllowOrigin is only used for XHR/fetch and maybe the loads will be considered same origin. > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:75 > + protectedLoader->deliverResponseAndData(response, buffer.ptr()); WTFMove(buffer) and make the lambda mutable. > Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:275 > +bool WebLoaderStrategy::tryLoadingUsingPDFJSHandler(ResourceLoader& resourceLoader, const WebResourceLoader::TrackingParameters& trackingParameters) trackingParameters not needed. > Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:277 > + if (!resourceLoader.isPDFJSResourceLoad()) Given we do this check in ResourceLoader, I would only do the scheme check here.
youenn fablet
Comment 9 2022-02-10 08:08:26 PST
Comment on attachment 451533 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=451533&action=review > Source/WebCore/ChangeLog:11 > + https://bugs.webkit.org/show_bug.cgi?id=235981 takes care of populating this folder at build-time. Please update change log according latest patch. > Source/WebKit/ChangeLog:8 > + The code handling the scheme is under WebCore::PDFJSResourceLoader. Ditto. > Source/WebCore/loader/ResourceLoader.h:162 > + WEBCORE_EXPORT bool isPDFJSResourceLoad() const; Could be #if PLATFORM(COCOA)?
Tim Nguyen (:ntim)
Comment 10 2022-02-10 09:36:51 PST
(In reply to youenn fablet from comment #8) > Comment on attachment 451533 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=451533&action=review > > > Source/WebCore/loader/PolicyChecker.cpp:173 > > + bool isInPDFDocumentFrame = m_frame.ownerElement() && m_frame.ownerElement()->document().isPDFDocument(); > > What about main frame loads? We want to disallow main frame loads. > > Source/WebCore/loader/PolicyChecker.cpp:174 > > + if (isInPDFDocumentFrame && !request.isNull() && request.url().protocolIs("webkit-pdfjs-viewer")) { > > I would guess there cannot be a case where request is null but request url > protocol is matching viewer, so we can probably remove !request.isNull() > check. Yep, removed the check. > > Source/WebCore/loader/ResourceLoader.cpp:881 > > + } > > Maybe we can write it this way: > if (options().mode == FetchOptions::Mode::Navigate) > return true; > > auto* frame = resourceLoader.frame(); > auto* document = frame ? frame->document() : nullptr; > return document ? document->isPDFDocument() : false; Simplified the existing method. > > Source/WebCore/loader/cocoa/BundleResourceLoader.h:30 > > +void loadResourceFromBundle(WebCore::ResourceLoader&, const String&); > > s/WebCore:://. > and forward declare ResourceLoader. Might want to add String forward > declaration as well. Done. > > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:55 > > + loadQueue().dispatch([protectedLoader = Ref { loader }, url = url.isolatedCopy(), subdirectory = subdirectory.isolatedCopy()]() mutable { > > url = loader.request().url().isolatedCopy() to reduce count churning. Done. > > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:69 > > + auto mimeType = MIMETypeRegistry::mimeTypeForPath(url.path().toString()); > > Given mimeTypeForPath is only looking at the suffix, can we do: > MIMETypeRegistry::mimeTypeForPath(url.string()) as a small optimization? This doesn't seem to work. I'll investigate. > > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:74 > > + response.addHTTPHeaderField(HTTPHeaderName::AccessControlAllowOrigin, "*"); > > Is it needed? > AccessControlAllowOrigin is only used for XHR/fetch and maybe the loads will > be considered same origin. Yes, otherwise images can't load. > > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:75 > > + protectedLoader->deliverResponseAndData(response, buffer.ptr()); > > WTFMove(buffer) and make the lambda mutable. Done. > > Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:275 > > +bool WebLoaderStrategy::tryLoadingUsingPDFJSHandler(ResourceLoader& resourceLoader, const WebResourceLoader::TrackingParameters& trackingParameters) > > trackingParameters not needed. They are actually needed for the `WEBLOADERSTRATEGY_RELEASE_LOG("tryLoadingUsingPDFJSHandler: URL will be scheduled with the PDFJS url scheme handler");` macro. > > Source/WebKit/WebProcess/Network/WebLoaderStrategy.cpp:277 > > + if (!resourceLoader.isPDFJSResourceLoad()) > > Given we do this check in ResourceLoader, I would only do the scheme check > here. Done.
Tim Nguyen (:ntim)
Comment 11 2022-02-10 09:38:45 PST
(In reply to youenn fablet from comment #9) > Comment on attachment 451533 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=451533&action=review > > > Source/WebCore/ChangeLog:11 > > + https://bugs.webkit.org/show_bug.cgi?id=235981 takes care of populating this folder at build-time. > > Please update change log according latest patch. > > > Source/WebKit/ChangeLog:8 > > + The code handling the scheme is under WebCore::PDFJSResourceLoader. > > Ditto. Done. > > Source/WebCore/loader/ResourceLoader.h:162 > > + WEBCORE_EXPORT bool isPDFJSResourceLoad() const; > > Could be #if PLATFORM(COCOA)? I don't feel strongly, though this could be cross platform if other platforms implement a BundleResourceLoader too. Right now, this function returns false for non Cocoa platforms which should be good enough.
Tim Nguyen (:ntim)
Comment 12 2022-02-10 09:43:11 PST
Created attachment 451553 [details] [fast-cq] Patch for landing
EWS
Comment 13 2022-02-10 09:50:42 PST
Committed r289549 (247080@main): <https://commits.webkit.org/247080@main> All reviewed patches have been landed. Closing bug and clearing flags on attachment 451553 [details].
Alex Christensen
Comment 14 2022-02-10 09:52:25 PST
View in context: https://bugs.webkit.org/attachment.cgi?id=451553&action=review > Source/WebCore/loader/SubresourceLoader.h:55 > + SecurityOrigin* origin() const { return m_origin.get(); } const SecurityOrigin* origin() const > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:45 > + static auto& queue = WorkQueue::create("org.WebCore.BundleResourceLoader", WorkQueue::QOS::Utility).leakRef(); org.WebKit.BundleResourceLoader
Tim Nguyen (:ntim)
Comment 15 2022-02-10 10:48:01 PST
(In reply to Alex Christensen from comment #14) > View in context: > https://bugs.webkit.org/attachment.cgi?id=451553&action=review > > > Source/WebCore/loader/SubresourceLoader.h:55 > > + SecurityOrigin* origin() const { return m_origin.get(); } > > const SecurityOrigin* origin() const This caused other build problems. > > Source/WebCore/loader/cocoa/BundleResourceLoader.mm:45 > > + static auto& queue = WorkQueue::create("org.WebCore.BundleResourceLoader", WorkQueue::QOS::Utility).leakRef(); > > org.WebKit.BundleResourceLoader Fixed in https://commits.webkit.org/r289553
Aakash Jain
Comment 16 2022-02-12 07:19:03 PST
(In reply to EWS from comment #13) > Committed r289549 (247080@main): <https://commits.webkit.org/247080@main> Seems like this broke a layout-test on ios-wk2 https://results.webkit.org/?suite=layout-tests&test=imported%2Fw3c%2Fweb-platform-tests%2Furl%2Ftoascii.window.html e.g.: https://build.webkit.org/results/Apple-iPadOS-15-Simulator-Release-WK2-Tests/r289589%20(1019)/results.html Diff: https://build.webkit.org/results/Apple-iPadOS-15-Simulator-Release-WK2-Tests/r289589%20(1019)/imported/w3c/web-platform-tests/url/toascii.window-diff.txt -FAIL xn--tešla (using URL) assert_throws_js: function "() => makeURL("url", hostTest.input)" did not throw -FAIL xn--tešla (using URL.host) assert_equals: expected "x" but got "xn--teala" -FAIL xn--tešla (using URL.hostname) assert_equals: expected "x" but got "xn--teala" -FAIL xn--tešla (using <a>) assert_equals: expected "" but got "xn--teala" -FAIL xn--tešla (using <a>.host) assert_equals: expected "x" but got "xn--teala" -FAIL xn--tešla (using <a>.hostname) assert_equals: expected "x" but got "xn--teala" -FAIL xn--tešla (using <area>) assert_equals: expected "" but got "xn--teala" -FAIL xn--tešla (using <area>.host) assert_equals: expected "x" but got "xn--teala" -FAIL xn--tešla (using <area>.hostname) assert_equals: expected "x" but got "xn--teala" +PASS xn--tešla (using URL) +PASS xn--tešla (using URL.host) +PASS xn--tešla (using URL.hostname) +PASS xn--tešla (using <a>) +PASS xn--tešla (using <a>.host) +PASS xn--tešla (using <a>.hostname) +PASS xn--tešla (using <area>) +PASS xn--tešla (using <area>.host) +PASS xn--tešla (using <area>.hostname) PASS xn--zca.xn--zca (using URL)
Alex Christensen
Comment 17 2022-02-15 09:42:08 PST
(In reply to Tim Nguyen (:ntim) from comment #15) > (In reply to Alex Christensen from comment #14) > > View in context: > > https://bugs.webkit.org/attachment.cgi?id=451553&action=review > > > > > Source/WebCore/loader/SubresourceLoader.h:55 > > > + SecurityOrigin* origin() const { return m_origin.get(); } > > > > const SecurityOrigin* origin() const > > This caused other build problems. We need to fix those other build problems. Does this help? const SecurityOrigin* origin() const { return m_origin.get(); } SecurityOrigin* origin() { return m_origin.get(); }
Alex Christensen
Comment 18 2022-02-21 10:40:13 PST
Note You need to log in before you can comment on or make changes to this bug.