WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
210208
Fix handling non-linearized PDFs when incremental PDF loading is enabled
https://bugs.webkit.org/show_bug.cgi?id=210208
Summary
Fix handling non-linearized PDFs when incremental PDF loading is enabled
Brady Eidson
Reported
2020-04-08 11:55:41 PDT
Fix handling non-linearized PDFs when incremental PDF loading is enabled <
rdar://problem/60619506
>
Attachments
Patch
(7.84 KB, patch)
2020-04-08 12:16 PDT
,
Brady Eidson
no flags
Details
Formatted Diff
Diff
Patch
(8.38 KB, patch)
2020-04-08 13:40 PDT
,
Brady Eidson
no flags
Details
Formatted Diff
Diff
Patch
(8.37 KB, patch)
2020-04-08 13:49 PDT
,
Brady Eidson
no flags
Details
Formatted Diff
Diff
Show Obsolete
(2)
View All
Add attachment
proposed patch, testcase, etc.
Brady Eidson
Comment 1
2020-04-08 12:16:39 PDT
Created
attachment 395852
[details]
Patch
Geoffrey Garen
Comment 2
2020-04-08 12:24:07 PDT
/Volumes/Data/worker/macOS-Mojave-Debug-Build-EWS/build/Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:1536:5: error: use of undeclared identifier 'pdfLog'
Tim Horton
Comment 3
2020-04-08 12:27:45 PDT
Comment on
attachment 395852
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=395852&action=review
> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:142 > +static const int32_t maximumRangeRequestPosition = std::numeric_limits<int32_t>::max();
INT32_MAX is a reasonable maximum? Seems ... big.
Geoffrey Garen
Comment 4
2020-04-08 12:46:42 PDT
Comment on
attachment 395852
[details]
Patch View in context:
https://bugs.webkit.org/attachment.cgi?id=395852&action=review
> Source/WebKit/ChangeLog:9 > + When we try to load a non-linearized PDF with PDFKit, it makes an outlandishly large range request > + to try to verify the PDF file size.
Are we working around a bug in PDFKit here? If so, let's cite the Radar for that bug.
> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:142 > +static const int32_t maximumRangeRequestPosition = std::numeric_limits<int32_t>::max();
Should this be off_t, since the position argument passed to dataProviderGetBytesAtPositionCallback is off_t? What is the actual request we get from PDFKit in practice? Maybe we can bump this number up, and do a direct comparison instead of a > comparison. This limit would prohibit streaming of any > 4GB PDF resource. While 4GB is very big, it doesn't strike me as impossibly big, and refusing to stream something because it was big would be a sad paradox.
> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:701 > +void PDFPlugin::receivedInvalidRangeRequest()
I think this code would be clearer if we named and/or commented it according to the behavior we are working around / modeling. Maybe something like maximumRangeRequestPosition => nonLinearizedPDFSentinel receivedInvalidRangeRequest => receivedNonLinearizedPDFSentinel
> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:741 > + Ref<PDFPlugin> plugin = *((PDFPlugin*)info);
Let's put this at the top of the function so we can remove the casting above.
> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:744 > + // It's possible we previously encountered an invalid range and therefore disabled incremental loading, > + // but PDFDocument is still trying to read data using a different strategy.
Does this happen?
> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:750 > + auto debugPluginRef = plugin.copyRef();
I think you can just use plugin here.
> Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:859 > + // The main thread dispatch below removes the last reference to the PDF thread. > + // It must be the last code executed in this function. > + callOnMainThread([protectedPlugin = WTFMove(protectedPlugin)] { });
callOnMainThread can execute its function concurrently at any time; so, this is a suspicious lifetime idiom, and probably not fully correct. What are we trying to accomplish here?
Geoffrey Garen
Comment 5
2020-04-08 12:47:20 PDT
Comment on
attachment 395852
[details]
Patch I accidentally overwrote thorton's r+, but now it's on purpose cuz EWS is angry.
Brady Eidson
Comment 6
2020-04-08 13:04:09 PDT
(In reply to Geoffrey Garen from
comment #4
)
> Comment on
attachment 395852
[details]
> Patch > > View in context: >
https://bugs.webkit.org/attachment.cgi?id=395852&action=review
> > > Source/WebKit/ChangeLog:9 > > + When we try to load a non-linearized PDF with PDFKit, it makes an outlandishly large range request > > + to try to verify the PDF file size. > > Are we working around a bug in PDFKit here? If so, let's cite the Radar for > that bug. > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:142 > > +static const int32_t maximumRangeRequestPosition = std::numeric_limits<int32_t>::max(); > > Should this be off_t, since the position argument passed to > dataProviderGetBytesAtPositionCallback is off_t? > > What is the actual request we get from PDFKit in practice? Maybe we can bump > this number up, and do a direct comparison instead of a > comparison.
It's not constant, and it's closer to int64_t max.
> This limit would prohibit streaming of any > 4GB PDF resource. While 4GB is very > big, it doesn't strike me as impossibly big, and refusing to stream > something because it was big would be a sad paradox.
I can make it "bigger than 4gb"
> > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:701 > > +void PDFPlugin::receivedInvalidRangeRequest() > > I think this code would be clearer if we named and/or commented it according > to the behavior we are working around / modeling. Maybe something like > > maximumRangeRequestPosition => nonLinearizedPDFSentinel > receivedInvalidRangeRequest => receivedNonLinearizedPDFSentinel
I'm fine with these
> > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:741 > > + Ref<PDFPlugin> plugin = *((PDFPlugin*)info); > > Let's put this at the top of the function so we can remove the casting above. > > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:744 > > + // It's possible we previously encountered an invalid range and therefore disabled incremental loading, > > + // but PDFDocument is still trying to read data using a different strategy. > > Does this happen?
Yes.
> > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:750 > > + auto debugPluginRef = plugin.copyRef(); > > I think you can just use plugin here.
Need a non-moved copy of plugin down at the bottom of the function for logging.
> > > Source/WebKit/WebProcess/Plugins/PDF/PDFPlugin.mm:859 > > + // The main thread dispatch below removes the last reference to the PDF thread. > > + // It must be the last code executed in this function. > > + callOnMainThread([protectedPlugin = WTFMove(protectedPlugin)] { }); > > callOnMainThread can execute its function concurrently at any time; so, this > is a suspicious lifetime idiom, and probably not fully correct. What are we > trying to accomplish here?
It's a scope exit - It only runs the functor as the enclosing scope (e.g. ::threadEntry()) finishes up. This is just moving the final "callOnMainThread" to a scope exit because there's now multiple return paths from this function.
Brady Eidson
Comment 7
2020-04-08 13:40:49 PDT
Created
attachment 395861
[details]
Patch
Geoffrey Garen
Comment 8
2020-04-08 13:44:09 PDT
> > callOnMainThread can execute its function concurrently at any time; so, this > > is a suspicious lifetime idiom, and probably not fully correct. What are we > > trying to accomplish here? > > It's a scope exit - It only runs the functor as the enclosing scope (e.g. > ::threadEntry()) finishes up. > > This is just moving the final "callOnMainThread" to a scope exit because > there's now multiple return paths from this function.
How about changing the comment to "Keep PDFPlugin alive until the end of this function and the end of the last main thread task submitted by this function".
Brady Eidson
Comment 9
2020-04-08 13:49:36 PDT
Created
attachment 395863
[details]
Patch
EWS
Comment 10
2020-04-08 15:15:22 PDT
Committed
r259760
: <
https://trac.webkit.org/changeset/259760
> All reviewed patches have been landed. Closing bug and clearing flags on
attachment 395863
[details]
.
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug