WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
236525
PDF.js viewer should work for all kinds of URLs
https://bugs.webkit.org/show_bug.cgi?id=236525
Summary
PDF.js viewer should work for all kinds of URLs
Tim Nguyen (:ntim)
Reported
2022-02-11 13:34:16 PST
rdar://88459622
Attachments
Patch
(2.83 KB, patch)
2022-02-11 17:32 PST
,
pascoe@apple.com
no flags
Details
Formatted Diff
Diff
Patch
(7.65 KB, patch)
2022-02-14 16:10 PST
,
pascoe@apple.com
no flags
Details
Formatted Diff
Diff
Patch
(7.26 KB, patch)
2022-02-15 08:49 PST
,
pascoe@apple.com
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch
(6.76 KB, patch)
2022-02-15 09:42 PST
,
pascoe@apple.com
no flags
Details
Formatted Diff
Diff
Patch
(7.33 KB, patch)
2022-02-15 14:08 PST
,
pascoe@apple.com
no flags
Details
Formatted Diff
Diff
Patch for landing
(7.77 KB, patch)
2022-02-17 14:16 PST
,
pascoe@apple.com
ews-feeder
: commit-queue-
Details
Formatted Diff
Diff
Patch for landing
(7.77 KB, patch)
2022-02-17 14:24 PST
,
pascoe@apple.com
no flags
Details
Formatted Diff
Diff
Show Obsolete
(6)
View All
Add attachment
proposed patch, testcase, etc.
Radar WebKit Bug Importer
Comment 1
2022-02-11 13:34:25 PST
Comment hidden (obsolete)
<
rdar://problem/88832961
>
pascoe@apple.com
Comment 2
2022-02-11 17:32:55 PST
Created
attachment 451762
[details]
Patch
Tim Nguyen (:ntim)
Comment 3
2022-02-11 23:54:11 PST
<
rdar://problem/88459622
>
Tim Nguyen (:ntim)
Comment 4
2022-02-12 00:30:48 PST
Comment on
attachment 451762
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=451762&action=review
This is a great start, thanks! I know this is just a WIP but here are a few initial comments.
> Source/WebCore/ChangeLog:12 > + by calling the PDFJS viewer's open function. More work is needed to > + disable loading of the default option and potentially present the > + data as a PDFDataRangeTransport.
I think we might want to remove the `?file=` parameter logic from `PDFDocument::createDocumentStructure()` with this change as well. Not sure how to disable the example PDF from loading, but will look into that.
> Source/WebCore/html/PDFDocument.cpp:114 > if (event.type() == eventNames().loadEvent) {
If the PDF finishes loading after the iframe finishes loading, this will be problematic. `PDFDocumentParser::finish( )` or `PDFDocument::finishedParsing()` is a place where we are guaranteed to finish parsing the PDF bytes. I think we need a flag for when the PDF loads, and one for when the iframe loads, then we execute this code when both of these are satisfied (whichever is first) to avoid a race. Also, PDFJS has its own `webviewerloaded` event too, so not sure if we need to care about that.
> Source/WebCore/html/PDFDocument.cpp:117 > + auto* iframe = dynamicDowncast<HTMLIFrameElement>(event.target());
nit: this is already defined above.
> Source/WebCore/html/PDFDocument.cpp:132 > + JSLockHolder lock(globalObject->vm()); > + auto callData = JSC::getCallData(globalObject->vm(), openFunction); > + ASSERT(callData.type != JSC::CallData::Type::None); > + JSC::MarkedArgumentBuffer arguments; > + auto nativeBuffer = m_document->loader()->mainResourceData()->tryCreateArrayBuffer(); > + ArrayBufferSharingMode sharingMode = nativeBuffer->sharingMode(); > + arguments.append(JSArrayBuffer::create(globalObject->vm(), globalObject->arrayBufferStructure(sharingMode), WTFMove(nativeBuffer))); > + ASSERT(!arguments.hasOverflowed()); > + > + JSC::call(globalObject, openFunction, callData, pdfJSApplication, arguments);
I wonder if we could run window.postMessage() with a message containing the array buffer instead, and have `content-script.js` listen for the message and do the rest. It might be slightly more convoluted, but I want to do this for a few reasons: - `PDFViewerApplication.open` can break with PDF.js API updates, and ideally I'd like this type of code to stay contained in the `pdfjs-extras` folder (with `PDFDocument` should staying clear from that) - Might be slightly easier if we have to wait for `webviewerloaded` as well, or do anything else with the buffer later on.
pascoe@apple.com
Comment 5
2022-02-14 12:21:45 PST
(In reply to Tim Nguyen (:ntim) from
comment #4
)
> Comment on
attachment 451762
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=451762&action=review
> > This is a great start, thanks! I know this is just a WIP but here are a few > initial comments. > > > Source/WebCore/ChangeLog:12 > > + by calling the PDFJS viewer's open function. More work is needed to > > + disable loading of the default option and potentially present the > > + data as a PDFDataRangeTransport. > > I think we might want to remove the `?file=` parameter logic from > `PDFDocument::createDocumentStructure()` with this change as well. Not sure > how to disable the example PDF from loading, but will look into that.
I've found leaving just ?file=, prevents the default example pdf from loading.
> > Source/WebCore/html/PDFDocument.cpp:114 > > if (event.type() == eventNames().loadEvent) { > > If the PDF finishes loading after the iframe finishes loading, this will be > problematic. > > `PDFDocumentParser::finish( )` or `PDFDocument::finishedParsing()` is a > place where we are guaranteed to finish parsing the PDF bytes. I think we > need a flag for when the PDF loads, and one for when the iframe loads, then > we execute this code when both of these are satisfied (whichever is first) > to avoid a race. > > Also, PDFJS has its own `webviewerloaded` event too, so not sure if we need > to care about that.
I've tried registering a listener for that event both from the PDFDocument.cpp and within content-script.js but have been unable to get it to work.
> > Source/WebCore/html/PDFDocument.cpp:117 > > + auto* iframe = dynamicDowncast<HTMLIFrameElement>(event.target()); > > nit: this is already defined above. > > > Source/WebCore/html/PDFDocument.cpp:132 > > + JSLockHolder lock(globalObject->vm()); > > + auto callData = JSC::getCallData(globalObject->vm(), openFunction); > > + ASSERT(callData.type != JSC::CallData::Type::None); > > + JSC::MarkedArgumentBuffer arguments; > > + auto nativeBuffer = m_document->loader()->mainResourceData()->tryCreateArrayBuffer(); > > + ArrayBufferSharingMode sharingMode = nativeBuffer->sharingMode(); > > + arguments.append(JSArrayBuffer::create(globalObject->vm(), globalObject->arrayBufferStructure(sharingMode), WTFMove(nativeBuffer))); > > + ASSERT(!arguments.hasOverflowed()); > > + > > + JSC::call(globalObject, openFunction, callData, pdfJSApplication, arguments); > > I wonder if we could run window.postMessage() with a message containing the > array buffer instead, and have `content-script.js` listen for the message > and do the rest. It might be slightly more convoluted, but I want to do this > for a few reasons: > > - `PDFViewerApplication.open` can break with PDF.js API updates, and ideally > I'd like this type of code to stay contained in the `pdfjs-extras` folder > (with `PDFDocument` should staying clear from that)
I think PDFViewerApplication.open will stay stable based on this comment:
https://github.com/mozilla/pdf.js/issues/9487#issuecomment-372036644
> - Might be slightly easier if we have to wait for `webviewerloaded` as well, > or do anything else with the buffer later on.
See above comment regarding listening to webviewerloaded.
Tim Nguyen (:ntim)
Comment 6
2022-02-14 12:39:13 PST
(In reply to
j_pascoe@apple.com
from
comment #5
)
> (In reply to Tim Nguyen (:ntim) from
comment #4
) > > Comment on
attachment 451762
[details]
> > Patch > > > > View in context: > >
https://bugs.webkit.org/attachment.cgi?id=451762&action=review
> > > > This is a great start, thanks! I know this is just a WIP but here are a few > > initial comments. > > > > > Source/WebCore/ChangeLog:12 > > > + by calling the PDFJS viewer's open function. More work is needed to > > > + disable loading of the default option and potentially present the > > > + data as a PDFDataRangeTransport. > > > > I think we might want to remove the `?file=` parameter logic from > > `PDFDocument::createDocumentStructure()` with this change as well. Not sure > > how to disable the example PDF from loading, but will look into that. > > I've found leaving just ?file=, prevents the default example pdf from > loading.
That sounds good to me. We should probably add a comment that explains it, and also stop appending the PDF URL after this parameter too.
> > > Source/WebCore/html/PDFDocument.cpp:114 > > > if (event.type() == eventNames().loadEvent) { > > > > If the PDF finishes loading after the iframe finishes loading, this will be > > problematic. > > > > `PDFDocumentParser::finish( )` or `PDFDocument::finishedParsing()` is a > > place where we are guaranteed to finish parsing the PDF bytes. I think we > > need a flag for when the PDF loads, and one for when the iframe loads, then > > we execute this code when both of these are satisfied (whichever is first) > > to avoid a race. > > > > Also, PDFJS has its own `webviewerloaded` event too, so not sure if we need > > to care about that. > I've tried registering a listener for that event both from the > PDFDocument.cpp and within content-script.js but have been unable to get it > to work.
I think we could care about this specific event later. But I'd still like to avoid a race between PDFDocument::finishedParsing() and iframe.onload though.
> > > Source/WebCore/html/PDFDocument.cpp:117 > > > + auto* iframe = dynamicDowncast<HTMLIFrameElement>(event.target()); > > > > nit: this is already defined above. > > > > > Source/WebCore/html/PDFDocument.cpp:132 > > > + JSLockHolder lock(globalObject->vm()); > > > + auto callData = JSC::getCallData(globalObject->vm(), openFunction); > > > + ASSERT(callData.type != JSC::CallData::Type::None); > > > + JSC::MarkedArgumentBuffer arguments; > > > + auto nativeBuffer = m_document->loader()->mainResourceData()->tryCreateArrayBuffer(); > > > + ArrayBufferSharingMode sharingMode = nativeBuffer->sharingMode(); > > > + arguments.append(JSArrayBuffer::create(globalObject->vm(), globalObject->arrayBufferStructure(sharingMode), WTFMove(nativeBuffer))); > > > + ASSERT(!arguments.hasOverflowed()); > > > + > > > + JSC::call(globalObject, openFunction, callData, pdfJSApplication, arguments); > > > > I wonder if we could run window.postMessage() with a message containing the > > array buffer instead, and have `content-script.js` listen for the message > > and do the rest. It might be slightly more convoluted, but I want to do this > > for a few reasons: > > > > - `PDFViewerApplication.open` can break with PDF.js API updates, and ideally > > I'd like this type of code to stay contained in the `pdfjs-extras` folder > > (with `PDFDocument` should staying clear from that) > I think PDFViewerApplication.open will stay stable based on this comment: >
https://github.com/mozilla/pdf.js/issues/9487#issuecomment-372036644
I still find it preferable to have the JS logic contained in the content-script.js with a postMessage based system for consistency. There are more things we'd want to use postMessage() for, e.g. sending find in page messages for instance, where we may want to do more than just one function call. In general, PDFDocument should be as viewer-agnostic as possible, e.g. it should be possible to swap another viewer that watches for the same messages.
pascoe@apple.com
Comment 7
2022-02-14 16:10:20 PST
Created
attachment 451956
[details]
Patch
Tim Nguyen (:ntim)
Comment 8
2022-02-14 23:52:57 PST
Comment on
attachment 451956
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=451956&action=review
> Source/WebCore/html/PDFDocument.cpp:151 > +class ScriptInjectedEventListener final : public EventListener { > +public: > + static Ref<ScriptInjectedEventListener> create(PDFDocument& pdfDocument) { return adoptRef(*new ScriptInjectedEventListener(pdfDocument)); } > + > +private: > + ScriptInjectedEventListener(PDFDocument& pdfDocument) > + : EventListener(PDFJSScriptInjectionEventListenerType) > + , m_pdfDocument(pdfDocument) > + { > + } > + > + bool operator==(const EventListener&) const override; > + void handleEvent(ScriptExecutionContext&, Event&) override; > + > + WeakPtr<PDFDocument> m_pdfDocument; > +}; > + > +void ScriptInjectedEventListener::handleEvent(ScriptExecutionContext&, Event&) > +{ > + m_pdfDocument->setContentsScriptLoaded(true); > + if (m_pdfDocument->isFullyParsed()) > + m_pdfDocument->sendBlob(); > +} > + > +bool ScriptInjectedEventListener::operator==(const EventListener& other) const > +{ > + // All PDFJSScriptInjectionEventListenerType objects compare as equal; OK since there is only one per document. > + return other.type() == PDFJSScriptInjectionEventListenerType; > +}
You can just re-use the other event listener, and check what the event target is in handleEvent. I don't think we want to add more event listener types. ImageDocument uses one event listener for 2 different events for instance.
> Source/WebCore/html/PDFDocument.cpp:187 > + m_iframe = iframe.ptr();
`m_iframe = WTFMove(iframe)`
> Source/WebCore/html/PDFDocument.cpp:201 > + if (this->m_contentScriptsLoaded) > + this->sendBlob();
nit: omit `this->`
> Source/WebCore/html/PDFDocument.cpp:209 > + auto openFunction = frame->script().executeScriptIgnoringException("(data) => PDFJSContentScript.open(data)").getObject();
nit: auto openFunction = frame->script().executeScriptIgnoringException("PDFJSContentScript.open").getObject();
> Source/WebCore/html/PDFDocument.cpp:210 > + auto globalObject = this->globalObject();
nit: you can omit this-> here and below.
> Source/WebCore/html/PDFDocument.cpp:215 > + JSC::MarkedArgumentBuffer arguments;
`JSC::` can be omitted since you have `using namespace JSC`
> Source/WebCore/html/PDFDocument.cpp:225 > void PDFDocument::injectContentScript(Document& contentDocument)
You can stop passing the contentDocument around if you store a member with m_iframe.
> Source/WebCore/html/PDFDocument.h:45 > + void sendBlob();
nit: sendPDFArrayBuffer(); ?
pascoe@apple.com
Comment 9
2022-02-15 08:49:54 PST
Created
attachment 452029
[details]
Patch
Tim Nguyen (:ntim)
Comment 10
2022-02-15 09:14:42 PST
Comment on
attachment 452029
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452029&action=review
> Source/WebCore/dom/EventListener.h:50 > + PDFJSScriptInjectionEventListenerType,
nit: remove this
> Source/WebCore/html/PDFDocument.cpp:112 > + ASSERT(iframe, "Should have event target");
You can probably remove this assert
> Source/WebCore/html/PDFDocument.cpp:118 > }
and add: ``` else ASSERT_NOT_REACHED(); ```
> Source/WebCore/html/PDFDocument.cpp:179 > +void PDFDocument::sendPDFArrayBuffer()
I still would like to use postMessage() here if possible.
> Source/WebCore/html/PDFDocument.cpp:185 > + auto globalObject = this->globalObject();
redundant this->
> Source/WebCore/html/PDFDocument.cpp:191 > + auto nativeBuffer = this->loader()->mainResourceData()->tryCreateArrayBuffer();
here too
Tim Nguyen (:ntim)
Comment 11
2022-02-15 09:29:31 PST
Comment on
attachment 452029
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452029&action=review
>> Source/WebCore/html/PDFDocument.cpp:185 >> + auto globalObject = this->globalObject(); > > redundant this->
oops this is not actually redundant, the variable name matches.
pascoe@apple.com
Comment 12
2022-02-15 09:42:41 PST
Created
attachment 452040
[details]
Patch
Chris Dumez
Comment 13
2022-02-15 11:16:52 PST
Comment on
attachment 452040
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452040&action=review
> Source/WebCore/html/PDFDocument.cpp:152 > auto iframe = HTMLIFrameElement::create(HTMLNames::iframeTag, *this);
Any reason we are not setting m_iframe right away here.
> Source/WebCore/html/PDFDocument.cpp:155 > body->appendChild(iframe);
Note that this may fire events and thus run JS AFAIK.
> Source/WebCore/html/PDFDocument.cpp:157 > auto listener = PDFDocumentEventListener::create(*this);
ditto.
> Source/WebCore/html/PDFDocument.cpp:160 > + m_iframe = WTFMove(iframe);
instead of waiting all the way until here?
> Source/WebCore/html/PDFDocument.cpp:182 > + auto* frame = downcast<Frame>(m_iframe->contentWindow()->frame());
Why isn't this just calling m_iframe->contentFrame() ? Seems odd to go via the WindowProxy.
> Source/WebCore/html/PDFDocument.cpp:187 > + auto callData = JSC::getCallData(globalObject->vm(), openFunction);
Why all the JSC:: in this function given that you use 'using namespace JSC;' above?
> Source/WebCore/html/PDFDocument.cpp:204 > + auto script = HTMLScriptElement::create(scriptTag, *contentDocument, false, false);
Last parameter is optional and can be omitted.
> Source/WebCore/html/PDFDocument.h:47 > + bool isFullyParsed() { return m_finishedParsing; }
Should be marked as const. Also would be nice if the getter and the data member names would match.
> Source/WebCore/html/PDFDocument.h:56 > + bool m_finishedParsing { false };
Boolean variables need to start with a prefix (such as "is") as per WebKit coding style.
> Source/WebCore/html/PDFDocument.h:57 > + bool m_contentScriptsLoaded { false };
ditto.
Alexey Shvayka
Comment 14
2022-02-15 11:30:52 PST
Comment on
attachment 452040
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452040&action=review
> Source/WebCore/html/PDFDocument.cpp:185 > +
nit: auto& vm = globalObject->vm();
> Source/WebCore/html/PDFDocument.cpp:190 > + auto nativeBuffer = loader()->mainResourceData()->tryCreateArrayBuffer();
Should we handle tryCreateArrayBuffer() returning nullptr?
Tim Nguyen (:ntim)
Comment 15
2022-02-15 12:05:38 PST
Comment on
attachment 452040
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452040&action=review
> Source/WebCore/ChangeLog:3 > + Collect bytes during parsing and pass data to PDF.js as a blob
Can you correct the changelog title to be descriptive of what this patch actually does? (perhaps the bug/radar titles should be changed too)
Yusuke Suzuki
Comment 16
2022-02-15 12:13:03 PST
Comment on
attachment 452040
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452040&action=review
> Source/WebCore/html/PDFDocument.cpp:184 > + auto openFunction = frame->script().executeScriptIgnoringException("PDFJSContentScript.open").getObject(); > + auto globalObject = this->globalObject();
If we have globalObject, then how about just getting a property instead of evaluating a script here? auto pdfjs = globalObject->get(globalObject, Identifier::fromString(vm, "PDFJSContentScript")); .... exception handling .... auto open = asObject(pdfjs)->get(globalObject, Identifier::fromString(vm, "open")); .... exception handling .... The above is much faster and efficient than evaluating script snippet. If we do that, do it after taking a JSLock.
pascoe@apple.com
Comment 17
2022-02-15 14:08:44 PST
Created
attachment 452089
[details]
Patch
Tim Horton
Comment 18
2022-02-15 14:31:57 PST
(In reply to Tim Nguyen (:ntim) from
comment #15
)
> Comment on
attachment 452040
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=452040&action=review
> > > Source/WebCore/ChangeLog:3 > > + Collect bytes during parsing and pass data to PDF.js as a blob > > Can you correct the changelog title to be descriptive of what this patch > actually does? (perhaps the bug/radar titles should be changed too)
Ideally the bug title would describe the problem, not the solution.
Tim Nguyen (:ntim)
Comment 19
2022-02-15 15:47:15 PST
Comment on
attachment 452089
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452089&action=review
> Source/WebCore/html/PDFDocument.cpp:192 > + auto nativeBuffer = loader()->mainResourceData()->tryCreateArrayBuffer(); > + if (nativeBuffer == nullptr) {
auto arrayBuffer = ... if (!arrayBuffer) {
> Source/WebCore/html/PDFDocument.cpp:197 > + ArrayBufferSharingMode sharingMode = nativeBuffer->sharingMode();
`auto sharingMode =` should work here.
Tim Nguyen (:ntim)
Comment 20
2022-02-15 17:09:15 PST
Comment on
attachment 452089
[details]
Patch r=me for this initial iteration. We should switch to postMessage in
bug 236668
, since the iframe and the PDFDocument origins are not the same.
Tim Nguyen (:ntim)
Comment 21
2022-02-15 17:10:11 PST
Comment on
attachment 452089
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452089&action=review
> Source/WebCore/html/PDFDocument.cpp:181 > + // FIXME:
https://bugs.webkit.org/show_bug.cgi?id=236668
- Use globalObject->get
// FIXME:
webkit.org/b/236668
- Use postMessage instead.
Tim Nguyen (:ntim)
Comment 22
2022-02-15 17:13:36 PST
Comment on
attachment 452089
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452089&action=review
> Source/WebCore/ChangeLog:15 > + * html/PDFDocument.cpp: > + (WebCore::PDFDocumentEventListener::handleEvent):
The changelog needs to be regenerated, there are more functions that have been changed since.
Tim Nguyen (:ntim)
Comment 23
2022-02-15 17:13:39 PST
Comment hidden (duplicate, obsolete)
Comment on
attachment 452089
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=452089&action=review
> Source/WebCore/ChangeLog:15 > + * html/PDFDocument.cpp: > + (WebCore::PDFDocumentEventListener::handleEvent):
The changelog needs to be regenerated, there are more functions that have been changed since.
pascoe@apple.com
Comment 24
2022-02-17 14:16:52 PST
Created
attachment 452432
[details]
Patch for landing
pascoe@apple.com
Comment 25
2022-02-17 14:24:41 PST
Created
attachment 452433
[details]
Patch for landing
EWS
Comment 26
2022-02-17 16:29:16 PST
Committed
r290089
(
247444@main
): <
https://commits.webkit.org/247444@main
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 452433
[details]
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug