Bug 209063 - Gather PDF scripts to run on a background thread
Summary: Gather PDF scripts to run on a background thread
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit Misc. (show other bugs)
Version: WebKit Nightly Build
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Brady Eidson
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2020-03-13 10:30 PDT by Brady Eidson
Modified: 2020-03-14 15:17 PDT (History)
3 users (show)

See Also:


Attachments
Patch (5.41 KB, patch)
2020-03-13 10:32 PDT, Brady Eidson
ggaren: review-
Details | Formatted Diff | Diff
Patch (5.38 KB, patch)
2020-03-13 11:31 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Patch (5.84 KB, patch)
2020-03-13 15:39 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Patch (5.85 KB, patch)
2020-03-13 15:45 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff
Patch (6.79 KB, patch)
2020-03-14 13:47 PDT, Brady Eidson
ggaren: review+
Details | Formatted Diff | Diff
PFL (6.74 KB, patch)
2020-03-14 14:04 PDT, Brady Eidson
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Brady Eidson 2020-03-13 10:30:44 PDT
When doing incremental PDF loading, gather PDF scripts to run on a background thread

<rdar://problem/60422027>
Comment 1 Brady Eidson 2020-03-13 10:32:09 PDT
Created attachment 393500 [details]
Patch
Comment 2 Geoffrey Garen 2020-03-13 11:11:17 PDT
Comment on attachment 393500 [details]
Patch

Looks like this needs a rebase to run in EWS.
Comment 3 Brady Eidson 2020-03-13 11:24:55 PDT
(In reply to Geoffrey Garen from comment #2)
> Comment on attachment 393500 [details]
> Patch
> 
> Looks like this needs a rebase to run in EWS.

Yup, confirming I build locally before I upload
Comment 4 Brady Eidson 2020-03-13 11:31:06 PDT
Created attachment 393510 [details]
Patch
Comment 5 Geoffrey Garen 2020-03-13 11:50:50 PDT
Comment on attachment 393510 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=393510&action=review

> Source/WebKit/ChangeLog:3
> +        Gather PDF scripts to run on a background thread.

Why do we need to do this? Is it because we might block if the PDF is still loading?

> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1563
> +void PDFPlugin::maybeRunScriptsInPDFDocument()

=> runScriptsInPDFDocumentIfNeeded

> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1572
> +    auto completionHandler = [this, protectedThis = makeRef(*this)] (Vector<RetainPtr<CFStringRef>>& scripts) mutable {

Let's make this Vector& a Vector&& -- that's a better proof of thread safety, I think, since it means you're the single owner.

> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1576
> +        JSGlobalContextRef ctx = JSGlobalContextCreate(0);

nullptr

> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1579
> +            JSEvaluateScript(ctx, OpaqueJSString::tryCreate(script.get()).get(), jsPDFDoc, nullptr, 0, nullptr);

0 => 1 (it's a line number)

> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1587
> +    auto& rawQueue = scriptUtilityQueue.get();
> +    auto documentRef = [m_pdfDocument documentRef];
> +    rawQueue.dispatch([scriptUtilityQueue = WTFMove(scriptUtilityQueue), completionHandler = WTFMove(completionHandler), documentRef] () mutable {

Can we do scriptUtilityQueue-> instead of rawQueue.? Less verbose if so.

Can we capture a RetainPtr instead of (what appears to be) a raw pointer?
Comment 6 Brady Eidson 2020-03-13 15:25:50 PDT
(In reply to Geoffrey Garen from comment #5)
> Comment on attachment 393510 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=393510&action=review
> 
> > Source/WebKit/ChangeLog:3
> > +        Gather PDF scripts to run on a background thread.
> 
> Why do we need to do this? Is it because we might block if the PDF is still
> loading?

Will clarify in new ChangeLog

> 
> > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1563
> > +void PDFPlugin::maybeRunScriptsInPDFDocument()
> 
> => runScriptsInPDFDocumentIfNeeded
> 
> > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1572
> > +    auto completionHandler = [this, protectedThis = makeRef(*this)] (Vector<RetainPtr<CFStringRef>>& scripts) mutable {
> 
> Let's make this Vector& a Vector&& -- that's a better proof of thread
> safety, I think, since it means you're the single owner.
> 
> > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1576
> > +        JSGlobalContextRef ctx = JSGlobalContextCreate(0);
> 
> nullptr
> 
> > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1579
> > +            JSEvaluateScript(ctx, OpaqueJSString::tryCreate(script.get()).get(), jsPDFDoc, nullptr, 0, nullptr);
> 
> 0 => 1 (it's a line number)
> 
> > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1587
> > +    auto& rawQueue = scriptUtilityQueue.get();
> > +    auto documentRef = [m_pdfDocument documentRef];
> > +    rawQueue.dispatch([scriptUtilityQueue = WTFMove(scriptUtilityQueue), completionHandler = WTFMove(completionHandler), documentRef] () mutable {
> 
> Can we do scriptUtilityQueue-> instead of rawQueue.? Less verbose if so.

No, because the WTFMove for the capture happens before the call.

> Can we capture a RetainPtr instead of (what appears to be) a raw pointer?

In practice not necessary because the m_pdfDocument will always be alive because the PDFPlugin will be too.

But adding it makes that more explicit, so I will.
Comment 7 Brady Eidson 2020-03-13 15:39:30 PDT
Created attachment 393542 [details]
Patch
Comment 8 Brady Eidson 2020-03-13 15:45:30 PDT
Created attachment 393545 [details]
Patch
Comment 9 Geoffrey Garen 2020-03-13 17:00:20 PDT
Comment on attachment 393545 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=393545&action=review

> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1453
> +    if (m_documentFinishedLoading)
> +        runScriptsInPDFDocumentIfNeeded();

Can we just run scripts at the end of pdfDocumentDidLoad() in all cases?

That seems like the most natural time. And it removes the complexity of tracking whether we have run scripts already or not. If it's sometimes not possible to run scripts at the end of pdfDocumentDidLoad(), then I wonder why it's sometimes OK.
Comment 10 Brady Eidson 2020-03-13 20:38:58 PDT
(In reply to Geoffrey Garen from comment #9)
> Comment on attachment 393545 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=393545&action=review
> 
> > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1453
> > +    if (m_documentFinishedLoading)
> > +        runScriptsInPDFDocumentIfNeeded();
> 
> Can we just run scripts at the end of pdfDocumentDidLoad() in all cases?
> 
> That seems like the most natural time. And it removes the complexity of
> tracking whether we have run scripts already or not. If it's sometimes not
> possible to run scripts at the end of pdfDocumentDidLoad(), then I wonder
> why it's sometimes OK.

pdfDocumentDidLoad() is an old method name from the "synchronously create the document at the end" days.

It could be renamed to "documentDataDidFinishLoading()"

You need two things to be true to run scripts:
1 - You need all document data to have finished loading
2 - The PDFDocument needs to have been created.

In the new model, the main thread *can* finish loading all of the data for the document, while the original background PDF thread might not have finished creating the PDFDocument yet.

That's why there are two places that "runScriptsInPDFDocumentIfNeeded":
1 - When the document data finishes loading (assuming the PDFDocument has also been created)
2 - When the PDFDocument has been created (assuming we also have all the document data)
Comment 11 Geoffrey Garen 2020-03-14 13:13:35 PDT
> You need two things to be true to run scripts:
> 1 - You need all document data to have finished loading
> 2 - The PDFDocument needs to have been created.

Given these requirements, how about this design:

1. Change runScriptsInPDFDocument() to tryRunScriptsInPDFDocument()

1a. tryRunScriptsInPDFDocument returns early if m_pdfDocument is null or m_documentFinishedLoading is false

2. PDFPlugin::pdfDocumentDidLoad() calls tryRunScriptsInPDFDocument() unconditionally at the end

3. adoptBackgroundThreadDocument() calls tryRunScriptsInPDFDocument() unconditionally at the end

I think this design might be better because I think it directly encodes the requirements you've described: tryRunScriptsInPDFDocument() only runs if both conditions are true, and each function that changes one of the conditions calls tryRunScriptsInPDFDocument() directly. It also avoids an extra piece of state.

(The patch you've written is good at demonstrating that it never runs scripts twice, but not as good at demonstrating that it runs scripts more than zero times.)
Comment 12 Brady Eidson 2020-03-14 13:47:28 PDT
Created attachment 393590 [details]
Patch
Comment 13 Geoffrey Garen 2020-03-14 13:55:49 PDT
Comment on attachment 393590 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=393590&action=review

r=me

> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1564
> +    ASSERT(m_documentFinishedLoading);

I think this assert is no longer valid?
Comment 14 Brady Eidson 2020-03-14 14:01:02 PDT
(In reply to Geoffrey Garen from comment #13)
> Comment on attachment 393590 [details]
> Patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=393590&action=review
> 
> r=me
> 
> > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1564
> > +    ASSERT(m_documentFinishedLoading);
> 
> I think this assert is no longer valid?

Amazing you called that out right after I ran into it in testing.
Comment 15 Brady Eidson 2020-03-14 14:04:01 PDT
Created attachment 393592 [details]
PFL
Comment 16 WebKit Commit Bot 2020-03-14 15:17:09 PDT
Comment on attachment 393592 [details]
PFL

Clearing flags on attachment: 393592

Committed r258474: <https://trac.webkit.org/changeset/258474>
Comment 17 WebKit Commit Bot 2020-03-14 15:17:11 PDT
All reviewed patches have been landed.  Closing bug.