209063 – Gather PDF scripts to run on a background thread

RESOLVED FIXED209063

Gather PDF scripts to run on a background thread

https://bugs.webkit.org/show_bug.cgi?id=209063

Summary Gather PDF scripts to run on a background thread

Brady Eidson

Reported 2020-03-13 10:30:44 PDT

When doing incremental PDF loading, gather PDF scripts to run on a background thread <rdar://problem/60422027>

Attachments
Patch (5.41 KB, patch) 2020-03-13 10:32 PDT, Brady Eidson	ggaren: review-	Details Formatted Diff Diff
Patch (5.38 KB, patch) 2020-03-13 11:31 PDT, Brady Eidson	no flags	Details Formatted Diff Diff
Patch (5.84 KB, patch) 2020-03-13 15:39 PDT, Brady Eidson	no flags	Details Formatted Diff Diff
Patch (5.85 KB, patch) 2020-03-13 15:45 PDT, Brady Eidson	no flags	Details Formatted Diff Diff
Patch (6.79 KB, patch) 2020-03-14 13:47 PDT, Brady Eidson	ggaren: review+	Details Formatted Diff Diff
PFL (6.74 KB, patch) 2020-03-14 14:04 PDT, Brady Eidson	no flags	Details Formatted Diff Diff
Show Obsolete (5) View All Add attachment proposed patch, testcase, etc.

Brady Eidson

Comment 1 2020-03-13 10:32:09 PDT

Created attachment 393500 [details] Patch

Geoffrey Garen

Comment 2 2020-03-13 11:11:17 PDT

Comment on attachment 393500 [details] Patch Looks like this needs a rebase to run in EWS.

Brady Eidson

Comment 3 2020-03-13 11:24:55 PDT

(In reply to Geoffrey Garen from comment #2) > Comment on attachment 393500 [details] > Patch > > Looks like this needs a rebase to run in EWS. Yup, confirming I build locally before I upload

Brady Eidson

Comment 4 2020-03-13 11:31:06 PDT

Created attachment 393510 [details] Patch

Geoffrey Garen

Comment 5 2020-03-13 11:50:50 PDT

Comment on attachment 393510 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=393510&action=review > Source/WebKit/ChangeLog:3 > + Gather PDF scripts to run on a background thread. Why do we need to do this? Is it because we might block if the PDF is still loading? > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1563 > +void PDFPlugin::maybeRunScriptsInPDFDocument() => runScriptsInPDFDocumentIfNeeded > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1572 > + auto completionHandler = [this, protectedThis = makeRef(*this)] (Vector<RetainPtr<CFStringRef>>& scripts) mutable { Let's make this Vector& a Vector&& -- that's a better proof of thread safety, I think, since it means you're the single owner. > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1576 > + JSGlobalContextRef ctx = JSGlobalContextCreate(0); nullptr > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1579 > + JSEvaluateScript(ctx, OpaqueJSString::tryCreate(script.get()).get(), jsPDFDoc, nullptr, 0, nullptr); 0 => 1 (it's a line number) > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1587 > + auto& rawQueue = scriptUtilityQueue.get(); > + auto documentRef = [m_pdfDocument documentRef]; > + rawQueue.dispatch([scriptUtilityQueue = WTFMove(scriptUtilityQueue), completionHandler = WTFMove(completionHandler), documentRef] () mutable { Can we do scriptUtilityQueue-> instead of rawQueue.? Less verbose if so. Can we capture a RetainPtr instead of (what appears to be) a raw pointer?

Brady Eidson

Comment 6 2020-03-13 15:25:50 PDT

(In reply to Geoffrey Garen from comment #5) > Comment on attachment 393510 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=393510&action=review > > > Source/WebKit/ChangeLog:3 > > + Gather PDF scripts to run on a background thread. > > Why do we need to do this? Is it because we might block if the PDF is still > loading? Will clarify in new ChangeLog > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1563 > > +void PDFPlugin::maybeRunScriptsInPDFDocument() > > => runScriptsInPDFDocumentIfNeeded > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1572 > > + auto completionHandler = [this, protectedThis = makeRef(*this)] (Vector<RetainPtr<CFStringRef>>& scripts) mutable { > > Let's make this Vector& a Vector&& -- that's a better proof of thread > safety, I think, since it means you're the single owner. > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1576 > > + JSGlobalContextRef ctx = JSGlobalContextCreate(0); > > nullptr > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1579 > > + JSEvaluateScript(ctx, OpaqueJSString::tryCreate(script.get()).get(), jsPDFDoc, nullptr, 0, nullptr); > > 0 => 1 (it's a line number) > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1587 > > + auto& rawQueue = scriptUtilityQueue.get(); > > + auto documentRef = [m_pdfDocument documentRef]; > > + rawQueue.dispatch([scriptUtilityQueue = WTFMove(scriptUtilityQueue), completionHandler = WTFMove(completionHandler), documentRef] () mutable { > > Can we do scriptUtilityQueue-> instead of rawQueue.? Less verbose if so. No, because the WTFMove for the capture happens before the call. > Can we capture a RetainPtr instead of (what appears to be) a raw pointer? In practice not necessary because the m_pdfDocument will always be alive because the PDFPlugin will be too. But adding it makes that more explicit, so I will.

Brady Eidson

Comment 7 2020-03-13 15:39:30 PDT

Created attachment 393542 [details] Patch

Brady Eidson

Comment 8 2020-03-13 15:45:30 PDT

Created attachment 393545 [details] Patch

Geoffrey Garen

Comment 9 2020-03-13 17:00:20 PDT

Comment on attachment 393545 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=393545&action=review > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1453 > + if (m_documentFinishedLoading) > + runScriptsInPDFDocumentIfNeeded(); Can we just run scripts at the end of pdfDocumentDidLoad() in all cases? That seems like the most natural time. And it removes the complexity of tracking whether we have run scripts already or not. If it's sometimes not possible to run scripts at the end of pdfDocumentDidLoad(), then I wonder why it's sometimes OK.

Brady Eidson

Comment 10 2020-03-13 20:38:58 PDT

(In reply to Geoffrey Garen from comment #9) > Comment on attachment 393545 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=393545&action=review > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1453 > > + if (m_documentFinishedLoading) > > + runScriptsInPDFDocumentIfNeeded(); > > Can we just run scripts at the end of pdfDocumentDidLoad() in all cases? > > That seems like the most natural time. And it removes the complexity of > tracking whether we have run scripts already or not. If it's sometimes not > possible to run scripts at the end of pdfDocumentDidLoad(), then I wonder > why it's sometimes OK. pdfDocumentDidLoad() is an old method name from the "synchronously create the document at the end" days. It could be renamed to "documentDataDidFinishLoading()" You need two things to be true to run scripts: 1 - You need all document data to have finished loading 2 - The PDFDocument needs to have been created. In the new model, the main thread *can* finish loading all of the data for the document, while the original background PDF thread might not have finished creating the PDFDocument yet. That's why there are two places that "runScriptsInPDFDocumentIfNeeded": 1 - When the document data finishes loading (assuming the PDFDocument has also been created) 2 - When the PDFDocument has been created (assuming we also have all the document data)

Geoffrey Garen

Comment 11 2020-03-14 13:13:35 PDT

> You need two things to be true to run scripts: > 1 - You need all document data to have finished loading > 2 - The PDFDocument needs to have been created. Given these requirements, how about this design: 1. Change runScriptsInPDFDocument() to tryRunScriptsInPDFDocument() 1a. tryRunScriptsInPDFDocument returns early if m_pdfDocument is null or m_documentFinishedLoading is false 2. PDFPlugin::pdfDocumentDidLoad() calls tryRunScriptsInPDFDocument() unconditionally at the end 3. adoptBackgroundThreadDocument() calls tryRunScriptsInPDFDocument() unconditionally at the end I think this design might be better because I think it directly encodes the requirements you've described: tryRunScriptsInPDFDocument() only runs if both conditions are true, and each function that changes one of the conditions calls tryRunScriptsInPDFDocument() directly. It also avoids an extra piece of state. (The patch you've written is good at demonstrating that it never runs scripts twice, but not as good at demonstrating that it runs scripts more than zero times.)

Brady Eidson

Comment 12 2020-03-14 13:47:28 PDT

Created attachment 393590 [details] Patch

Geoffrey Garen

Comment 13 2020-03-14 13:55:49 PDT

Comment on attachment 393590 [details] Patch View in context: https://bugs.webkit.org/attachment.cgi?id=393590&action=review r=me > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1564 > + ASSERT(m_documentFinishedLoading); I think this assert is no longer valid?

Brady Eidson

Comment 14 2020-03-14 14:01:02 PDT

(In reply to Geoffrey Garen from comment #13) > Comment on attachment 393590 [details] > Patch > > View in context: > https://bugs.webkit.org/attachment.cgi?id=393590&action=review > > r=me > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1564 > > + ASSERT(m_documentFinishedLoading); > > I think this assert is no longer valid? Amazing you called that out right after I ran into it in testing.

Brady Eidson

Comment 15 2020-03-14 14:04:01 PDT

Created attachment 393592 [details] PFL

WebKit Commit Bot

Comment 16 2020-03-14 15:17:09 PDT

Comment on attachment 393592 [details] PFL Clearing flags on attachment: 393592 Committed r258474: <https://trac.webkit.org/changeset/258474>

WebKit Commit Bot

Comment 17 2020-03-14 15:17:11 PDT

All reviewed patches have been landed. Closing bug.

Note You need to log in before you can comment on or make changes to this bug.

Status RESOLVED

Resolution FIXED

Priority P2

Severity Normal

Classification Unclassified

Version WebKit Nightly Build

Hardware Unspecified

OS Unspecified

Product WebKit

Component WebKit Misc.

Assignee

Brady Eidson

Reported

2020-03-13 10:30 PDT

Modified

2020-03-14 15:17 PDT History

CC List

3 users Show

URL

Keywords InRadar

Depends on

Blocks