Bug 120651

Summary: [mac] PDFDocumentImage should use PDFKit to draw
Product: WebKit Reporter: Tim Horton <thorton>
Component: PDFAssignee: Tim Horton <thorton>
Status: RESOLVED FIXED    
Severity: Normal CC: ap, buildbot, commit-queue, ddkilzer, mitz, rniwa, sam, simon.fraser
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Unspecified   
OS: Unspecified   
Bug Depends on: 121008    
Bug Blocks:    
Attachments:
Description Flags
patch
none
binary patch
ap: review+, buildbot: commit-queue-
patch
ap: review+
a new patch
none
better patch with all the parts
buildbot: commit-queue-
extern ap: review+

Description Tim Horton 2013-09-03 14:39:26 PDT
PDFDocumentImage currently uses CG to draw, which means that annotations in the PDF are not drawn when used inside an <img> (they are drawn when PDFPlugin is used, when the PDF is in <object>, <embed>, or <iframe>).

<rdar://problem/12810731>
Comment 1 Tim Horton 2013-09-03 17:44:58 PDT
Created attachment 210418 [details]
patch

I still need to test the !PLATFORM(MAC) case.
Comment 2 Tim Horton 2013-09-03 17:48:04 PDT
Created attachment 210419 [details]
binary patch
Comment 3 Build Bot 2013-09-03 19:25:55 PDT
Comment on attachment 210419 [details]
binary patch

Attachment 210419 [details] did not pass win-ews (win):
Output: http://webkit-queues.appspot.com/results/1694405
Comment 4 Alexey Proskuryakov 2013-09-03 22:01:48 PDT
Comment on attachment 210419 [details]
binary patch

View in context: https://bugs.webkit.org/attachment.cgi?id=210419&action=review

> Source/WebCore/ChangeLog:40
> +        (WebCore::PDFDocumentImage::setCurrentPage):

Isn't this mostly dead code, as we only ever display the first page?

> Source/WebCore/ChangeLog:41
> +        Make all page-number-related things unsigned, since they are in CG and PDFKit,

Would be nice to have comments about what is 0-based and what is 1-based.

> Source/WebCore/platform/graphics/cg/PDFDocumentImage.cpp:163
> +    // We use the GetBytesAtPosition callback rather than the GetBytePointer one because SharedBuffer
> +    // does not provide a way to lock down the byte pointer and guarantee that it won't move, which

I'm not sure if this is relevant, as we don't mutate the SharedBuffer. Although maybe its data can be replaced when picking up a resource from CFNetwork cache?

Brady would know for sure.

> Source/WebCore/platform/graphics/cg/PDFDocumentImage.cpp:189
> +    return m_document ? CGPDFDocumentGetNumberOfPages(m_document) : 0;

No need for null check.

> Source/WebCore/platform/graphics/cg/PDFDocumentImage.h:52
> +    virtual String filenameExtension() const;

OVERRIDE

> Source/WebCore/platform/graphics/cg/PDFDocumentImage.h:54
> +    virtual bool hasSingleSecurityOrigin() const { return true; }

OVERRIDE

And so on...

> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:29
> +#if PLATFORM(MAC)

Is the purpose of this check to compile this out on iOS?

> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:46
> +    m_document = [[getPDFDocumentClass() alloc] initWithData:nsData.get()];

This leaks the whole document.

> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:51
> +    RetainPtr<PDFPage> pdfPage = [m_document pageAtIndex:m_currentPage];

I don't think that this needs a RetainPtr, a raw pointer should be just fine.

> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:61
> +    return m_document ? [m_document pageCount] : 0;

No need for null check.

> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:69
> +    [[m_document pageAtIndex:m_currentPage] drawWithBox:kPDFDisplayBoxMediaBox];

It looks suspicious that we use media box unconditionally here.
Comment 5 Darin Adler 2013-09-03 22:05:01 PDT
Comment on attachment 210419 [details]
binary patch

View in context: https://bugs.webkit.org/attachment.cgi?id=210419&action=review

>> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:29
>> +#if PLATFORM(MAC)
> 
> Is the purpose of this check to compile this out on iOS?

If it is, then it won’t work since PLATFORM(MAC) is true on iOS at the moment; worth fixing that some day.

> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:35
> +#import <objc_class.h>

Isn’t it supposed to be <objc/objc_class.h>?

>> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:46
>> +    RetainPtr<NSData> nsData = data()->createNSData();
>> +    m_document = [[getPDFDocumentClass() alloc] initWithData:nsData.get()];
> 
> This leaks the whole document.

Also no need for a local variable for nsData:

    m_document = adoptNS([[getPDFDocumentClass() alloc] initWithData:data()->createNSData().get()]);
Comment 6 Tim Horton 2013-09-04 10:31:11 PDT
(In reply to comment #4)
> (From update of attachment 210419 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=210419&action=review
> 
> > Source/WebCore/ChangeLog:40
> > +        (WebCore::PDFDocumentImage::setCurrentPage):
> 
> Isn't this mostly dead code, as we only ever display the first page?

Yes. I thought someone was planning to make use of it, but now that I look this code is absolutely ancient and effectively unused, so let me take a moment to get rid of it.

> > Source/WebCore/ChangeLog:41
> > +        Make all page-number-related things unsigned, since they are in CG and PDFKit,
> 
> Would be nice to have comments about what is 0-based and what is 1-based.

Yeah :| That tripped me up a bit yesterday.

> > Source/WebCore/platform/graphics/cg/PDFDocumentImage.cpp:163
> > +    // We use the GetBytesAtPosition callback rather than the GetBytePointer one because SharedBuffer
> > +    // does not provide a way to lock down the byte pointer and guarantee that it won't move, which
> 
> I'm not sure if this is relevant, as we don't mutate the SharedBuffer. Although maybe its data can be replaced when picking up a resource from CFNetwork cache?
> 
> Brady would know for sure.

I’ll check with Brady.

> > Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:29
> > +#if PLATFORM(MAC)
> 
> Is the purpose of this check to compile this out on iOS?
> If it is, then it won’t work since PLATFORM(MAC) is true on iOS at the moment; worth fixing that some day.

Oh! I’ll fix that. It is supposed to only be built on OS X.

> > Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:46
> > +    m_document = [[getPDFDocumentClass() alloc] initWithData:nsData.get()];
> 
> This leaks the whole document.

Indeed it does.

> > Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:51
> > +    RetainPtr<PDFPage> pdfPage = [m_document pageAtIndex:m_currentPage];
> 
> I don't think that this needs a RetainPtr, a raw pointer should be just fine.

Sure.

> > Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:69
> > +    [[m_document pageAtIndex:m_currentPage] drawWithBox:kPDFDisplayBoxMediaBox];
> 
> It looks suspicious that we use media box unconditionally here.

I was following… other code. I’ll check.
Comment 7 Tim Horton 2013-09-04 10:58:19 PDT
(In reply to comment #5)
> > Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:35
> > +#import <objc_class.h>
> 
> Isn’t it supposed to be <objc/objc_class.h>?

I don’t think so.
Comment 8 Tim Horton 2013-09-04 11:20:28 PDT
> > > Source/WebCore/platform/graphics/cg/PDFDocumentImage.cpp:163
> > > +    // We use the GetBytesAtPosition callback rather than the GetBytePointer one because SharedBuffer
> > > +    // does not provide a way to lock down the byte pointer and guarantee that it won't move, which
> > 
> > I'm not sure if this is relevant, as we don't mutate the SharedBuffer. Although maybe its data can be replaced when picking up a resource from CFNetwork cache?
> > 
> > Brady would know for sure.
> 
> I’ll check with Brady.

Brady says it’s OK to remove the NSData madness (which I did, in PDFDocumentImageMac) but that this one should stay.
Comment 9 Alexey Proskuryakov 2013-09-04 11:24:19 PDT
> > Isn’t it supposed to be <objc/objc_class.h>?
> I don’t think so.

Isn't it supposed to be "objc_class.h" then? But I don't see why it's needed at all.
Comment 10 Tim Horton 2013-09-04 13:29:33 PDT
(In reply to comment #6)
> > > Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:69
> > > +    [[m_document pageAtIndex:m_currentPage] drawWithBox:kPDFDisplayBoxMediaBox];
> > 
> > It looks suspicious that we use media box unconditionally here.
> 
> I was following… other code. I’ll check.

Ok. The current PDFDocumentImage implementation paints PDF content outside of the crop box (i.e. doesn’t crop to the crop box). This means we happily display all the overdraw nonsense which doesn’t show up in PDFPlugin nor in Preview nor elsewhere. So, you’re right, we should probably use CropBox here despite that being a change in behavior.

The current implementation also doesn’t size the <img> as close to the page size as I’d expect, but I’m not going to change that in this patch.
Comment 11 Tim Horton 2013-09-04 13:34:08 PDT
(In reply to comment #9)
> > > Isn’t it supposed to be <objc/objc_class.h>?
> > I don’t think so.
> 
> Isn't it supposed to be "objc_class.h" then? But I don't see why it's needed at all.

Ahhhh, apparently it’s #import <objc/objc-class.h>; note the dash instead of the underscore.

And, it’s needed for SOFT_LINK_CLASS’s call to objc_getClass.
Comment 12 Tim Horton 2013-09-04 13:54:52 PDT
Created attachment 210491 [details]
patch
Comment 13 Tim Horton 2013-09-04 14:23:12 PDT
(In reply to comment #1)
> Created an attachment (id=210418) [details]
> patch
> 
> I still need to test the !PLATFORM(MAC) case.

LGTM on Windows.
Comment 14 Alexey Proskuryakov 2013-09-04 14:48:22 PDT
Comment on attachment 210491 [details]
patch

View in context: https://bugs.webkit.org/attachment.cgi?id=210491&action=review

> Source/WebCore/platform/graphics/cg/PDFDocumentImage.h:33
> +#define PDFDOCUMENTIMAGE_USES_PDFKIT (PLATFORM(MAC) && !PLATFORM(IOS))

I'm not sure if this is needed at all, can files be just excluded form build on platforms that don't need them?
Comment 15 Tim Horton 2013-09-04 14:53:22 PDT
(In reply to comment #14)
> (From update of attachment 210491 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=210491&action=review
> 
> > Source/WebCore/platform/graphics/cg/PDFDocumentImage.h:33
> > +#define PDFDOCUMENTIMAGE_USES_PDFKIT (PLATFORM(MAC) && !PLATFORM(IOS))
> 
> I'm not sure if this is needed at all, can files be just excluded form build on platforms that don't need them?

I need to also not build in the CG-but-not-Mac implementations of the four platform specific functions in PDFDocumentImage.cpp (createPDFDocument, computeBoundsForCurrentPage, pageCount, and drawPDFPage).
Comment 16 Tim Horton 2013-09-04 15:07:55 PDT
http://trac.webkit.org/changeset/155069
Comment 17 David Kilzer (:ddkilzer) 2013-09-05 06:40:29 PDT
(In reply to comment #16)
> http://trac.webkit.org/changeset/155069

This caused a build failure on iOS because macros used to guard sharedBufferGetBytesAtPosition() are now inconsistent.

In these two files, !PLATFORM(MAC) is used:

Source/WebCore/platform/graphics/cg/ImageSourceCG.cpp	Source/WebCore/platform/graphics/cg/ImageSourceCG.h

In this file, !USE(PDFKIT_FOR_PDFDOCUMENTIMAGE) is used:

Source/WebCore/platform/graphics/cg/PDFDocumentImage.cpp
Comment 18 David Kilzer (:ddkilzer) 2013-09-05 06:50:43 PDT
(In reply to comment #17)
> (In reply to comment #16)
> > http://trac.webkit.org/changeset/155069
> 
> This caused a build failure on iOS because macros used to guard sharedBufferGetBytesAtPosition() are now inconsistent.

Filed Bug 120771.
Comment 19 WebKit Commit Bot 2013-09-08 11:06:01 PDT
Re-opened since this is blocked by bug 121008
Comment 20 Alexey Proskuryakov 2013-09-08 12:25:04 PDT
Somehow, the change made a lot of tests flaky (presumably failing if run after a PDF image test). See <http://build.webkit.org/results/Apple%20MountainLion%20Release%20WK1%20(Tests)/r155249%20(12626)/results.html>.

Rolled out in <http://trac.webkit.org/changeset/155306>.
Comment 21 Tim Horton 2013-09-08 16:20:22 PDT
(In reply to comment #20)
> Somehow, the change made a lot of tests flaky (presumably failing if run after a PDF image test). See <http://build.webkit.org/results/Apple%20MountainLion%20Release%20WK1%20(Tests)/r155249%20(12626)/results.html>.
> 
> Rolled out in <http://trac.webkit.org/changeset/155306>.

The only thing I can imagine is that there's some CG/something state that's not getting reset. I save the GraphicsContext state in draw(), which calls drawPDFPage() which temporarily makes our context the "current" NSGraphicsContext, and then lets PDFKit take over. Is there any way that this isn't enough? I'd expect that to be sufficient. Does NSGraphicsContext have state other than the 'current' context that PDFKit could be futzing with, that wouldn't be restored by restoring the old context/restoring the current context's state when we leave draw()? I have no idea.
Comment 22 Alexey Proskuryakov 2013-09-09 10:22:37 PDT
Now that we know that PDF image is the culprit, I was able to find an easier to reproduce reliable case:

run-webkit-tests --child-processes=1 fast/images/pdf-as-image.html fast/regions/counters/extract-list-items-001.html
Comment 23 Alexey Proskuryakov 2013-09-09 10:42:50 PDT
Also of interest, running one PDF as image test changes text rendering in a lot in subsequent tests:

run-webkit-tests --child-processes=1 fast/images/pdf-as-image.html fast/multicol --pixel --tolerance=0

Reftests don't quite show that because expectations change too.
Comment 24 Simon Fraser (smfr) 2013-09-09 11:34:42 PDT
PDFKit must be clobbering CG state.
Comment 25 Tim Horton 2013-09-09 23:06:38 PDT
(In reply to comment #24)
> PDFKit must be clobbering CG state.

Yep. I'm going to manually save/restore the relevant state.
Comment 26 Tim Horton 2013-09-09 23:25:16 PDT
Created attachment 211171 [details]
a new patch

I'm going to run the tests overnight, but the ones I know about are better.
Comment 27 Tim Horton 2013-09-09 23:25:52 PDT
Comment on attachment 211171 [details]
a new patch

This patch is not the right size.
Comment 28 Tim Horton 2013-09-09 23:29:09 PDT
Created attachment 211172 [details]
better patch with all the parts
Comment 29 Build Bot 2013-09-09 23:47:20 PDT
Comment on attachment 211172 [details]
better patch with all the parts

Attachment 211172 [details] did not pass mac-wk2-ews (mac-wk2):
Output: http://webkit-queues.appspot.com/results/1638437
Comment 30 Tim Horton 2013-09-09 23:57:14 PDT
Created attachment 211178 [details]
extern
Comment 31 Alexey Proskuryakov 2013-09-10 11:39:25 PDT
Comment on attachment 211178 [details]
extern

View in context: https://bugs.webkit.org/attachment.cgi?id=211178&action=review

r=me assuming that there are no flaky tests introduced locally (also when running some subsets of pixel tests like in comment 23). EWS won't tell us.

> Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:81
> +    // These states can be mutated by PDFKit but are not saved
> +    // on the context's state stack. (<rdar://problem/14951759>)

Do we need the same in PDFPlugin?
Comment 32 Tim Horton 2013-09-10 11:40:24 PDT
(In reply to comment #31)
> (From update of attachment 211178 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=211178&action=review
> 
> r=me assuming that there are no flaky tests introduced locally (also when running some subsets of pixel tests like in comment 23). EWS won't tell us.
> 
> > Source/WebCore/platform/graphics/mac/PDFDocumentImageMac.mm:81
> > +    // These states can be mutated by PDFKit but are not saved
> > +    // on the context's state stack. (<rdar://problem/14951759>)
> 
> Do we need the same in PDFPlugin?

In PDFPlugin, PDFKit owns the contexts, and we never call drawWithBox (PDFKit does). So, no. Thanks!
Comment 33 Tim Horton 2013-09-10 11:58:58 PDT
(In reply to comment #31)
> (From update of attachment 211178 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=211178&action=review
> 
> r=me assuming that there are no flaky tests introduced locally (also when running some subsets of pixel tests like in comment 23). EWS won't tell us.

Many of the pixel tests fail before and after my change, but no new failures are introduced (in the sizable subset I ran), and no flakes are introduced.
Comment 34 Tim Horton 2013-09-10 12:01:55 PDT
http://trac.webkit.org/changeset/155461