Bug 65076

Summary: WebKit2: Printing to PDF loses URL links
Product: WebKit Reporter: kk@kkrueger.de
Component: PDF Assignee: Mark Rowe (bdash) <mrowe@apple.com>
Status: RESOLVED FIXED    
Severity: Normal CC: ap@webkit.org, b@raicheff.com, collegeitdept@yahoo.com, M8R-lmdomp@mailinator.com, mjt@c-command.com, nw2uzh3766@snkmail.com, tom.andersen@gmail.com, webkit@derailer.org
Priority: P2 Keywords: InRadar
Version: 528+ (Nightly build)   
Hardware: Macintosh   
OS: Unspecified   
URL: http://www.wikipedia.de/
Attachments:
Description                                    Flags
Printed PDF from r83010                        none
Printed PDF from r83080                        none
Patch v1                                       ap: review+
PDF in Preview.app on left; WebKit on right    none

Description From 2011-07-24 07:37:16 PST
Prior to Safari 5.1 printing into a PDF kept all links intact - great for archiving scientific articles.

After upgrading to Safari 5.1 (both on Lion and Snow Leopard) those links are gone, breaking my workflow because I need to keep the URLs.

From a user perspective it seems that it is an old bug reappearing:

     https://bugs.webkit.org/show_bug.cgi?id=10216

Can be reproduced on Mac OS X 10.6.8 and 10.7:
- open any web page containing links
- print to PDF using PDF Services
- open the PDF in Preview.app
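
The last step can also be checked without hovering over links in Preview.app. Below is a minimal Python sketch, not part of the original report, that scans a printed PDF for link annotations. It assumes the annotation dictionaries are stored uncompressed in the file and that URIs are simple literal strings; the function names are hypothetical. If the PDF producer compresses its objects, this check will miss them and a real PDF parser is needed instead.

```python
import re

# "/Subtype /Link" marks a link annotation dictionary in a PDF.
# Assumption: the dictionary is stored uncompressed in the file.
LINK_ANNOTATION = re.compile(rb"/Subtype\s*/Link\b")

# "/URI (http://...)" is the target of a URI action.
# Assumption: a simple literal string without nested parentheses or escapes.
URI_ACTION = re.compile(rb"/URI\s*\(([^()\\]*)\)")


def count_link_annotations(pdf_bytes):
    """Count link annotation dictionaries visible in the raw bytes."""
    return len(LINK_ANNOTATION.findall(pdf_bytes))


def find_link_targets(pdf_bytes):
    """Return the URI strings of link actions visible in the raw bytes."""
    return [m.group(1).decode("latin-1") for m in URI_ACTION.finditer(pdf_bytes)]


def report(path):
    """Print the number of link annotations and their targets for one file."""
    with open(path, "rb") as f:
        data = f.read()
    print(path, count_link_annotations(data), find_link_targets(data))
```

Calling `report()` on a PDF printed from a working build and on one printed from an affected build should show a non-zero count for the first and zero for the second, provided the assumptions above hold.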
------- Comment #1 From 2011-07-24 11:12:44 PST -------
<rdar://problem/9831050>
------- Comment #2 From 2011-08-11 23:48:36 PST -------
Created an attachment (id=103742) [details]
Printed PDF from r83010
------- Comment #3 From 2011-08-11 23:49:21 PST -------
Created an attachment (id=103743) [details]
Printed PDF from r83080
------- Comment #4 From 2011-08-11 23:55:16 PST -------
Traced the issue a bit further:

Nightly build r83010: prints as expected
Nightly build r83080 fails

Open the attached PDF files in Preview.app and hover over the embedded links.

Both files were printed from nightly builds r83010 / r83080 using the URL specified above:

http://www.wikipedia.de
------- Comment #5 From 2011-10-02 15:31:52 PST -------
There is lots of grief about this bug on the web. 

It looks to me like the culprit may have something to do with https://bugs.webkit.org/show_bug.cgi?id=57916 and the m_alwaysCreateLineBoxes variable.

Perhaps the optimization should be off when printing. 

RenderInline::RenderInline(Node* node)
     : RenderBoxModelObject(node)
     , m_lineHeight(-1)
+    , m_alwaysCreateLineBoxes

Might give someone a clue for an easy fix.

from not a webkit developer.
------- Comment #6 From 2011-10-15 14:38:12 PST -------
Will this very annoying and disruptive bug ever get fixed?

It still hasn't been assigned to anyone.

Please fix this bug.
------- Comment #7 From 2011-10-15 14:39:32 PST -------
This bug is still present and has NOT been fixed in the recent Safari update 5.1.1.

Does someone have an update for an ETA??

Thanks.
------- Comment #8 From 2011-11-04 13:09:32 PST -------
*** Bug 71573 has been marked as a duplicate of this bug. ***
------- Comment #9 From 2011-11-21 01:44:14 PST -------
This affects not only Safari but lots of Mac software used to archive web sites.

(In reply to comment #1)
> <rdar://problem/9831050>

If this were on http://openradar.appspot.com we could all watch it ;)
------- Comment #10 From 2011-12-18 09:13:53 PST -------
Will someone kindly look at this issue already?  It was reported more than 5 MONTHS ago and is still unassigned!

The functionality is critical for online research activities and this defect is extremely annoying!
------- Comment #11 From 2012-01-06 20:33:03 PST -------
Created an attachment (id=121534) [details]
Patch v1
------- Comment #12 From 2012-01-06 20:39:22 PST -------
(In reply to comment #9)
> This affects not only Safari but lots of Mac software used to archive web sites.

This bug is actually specific to Safari. If you're seeing a problem with another application then this bug is not it.
------- Comment #13 From 2012-01-06 21:46:34 PST -------
(From update of attachment 121534 [details])
View in context: https://bugs.webkit.org/attachment.cgi?id=121534&action=review

r=me assuming you tested multi-page PDFs.

> Source/WebKit2/UIProcess/API/mac/WKPrintingView.mm:509
> +    RetainPtr<NSData> pdfData(AdoptNS, [[NSData alloc] initWithBytes:pdfDataBytes.data() length:pdfDataBytes.size()]);

It's a little sad that we now copy the data.

> Source/WebKit2/UIProcess/API/mac/WKPrintingView.mm:533
> +        RetainPtr<NSData> pdfData(AdoptNS, [[NSData alloc] initWithBytes:_printedPagesData.data() length:_printedPagesData.size()]);

Ditto.
------- Comment #14 From 2012-01-06 22:03:11 PST -------
(In reply to comment #13)
> (From update of attachment 121534 [details] [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=121534&action=review
> 
> r=me assuming you tested multi-page PDFs.

Printing multipage documents works as well as printing single-page documents. 

> > Source/WebKit2/UIProcess/API/mac/WKPrintingView.mm:509
> > +    RetainPtr<NSData> pdfData(AdoptNS, [[NSData alloc] initWithBytes:pdfDataBytes.data() length:pdfDataBytes.size()]);
> 
> It's a little sad that we now copy the data.
> 
> > Source/WebKit2/UIProcess/API/mac/WKPrintingView.mm:533
> > +        RetainPtr<NSData> pdfData(AdoptNS, [[NSData alloc] initWithBytes:_printedPagesData.data() length:_printedPagesData.size()]);
> 
> Ditto.

It should be possible to avoid copying the data, but it would require a bit of reworking our assumptions about the state of _printedPagesData at various places throughout the class.  I'll file a follow-up about it.
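
To illustrate the trade-off for readers following along: this is a Python analogy, not the WebKit code, and the names are hypothetical stand-ins. A copy stays valid whatever happens to the source buffer afterwards, while a no-copy view is only correct as long as nothing modifies the buffer underneath it, which is the kind of assumption about _printedPagesData that would have to be audited.

```python
def snapshot_by_copy(buffer):
    """Return an independent copy; costs memory and time, but is always safe."""
    return bytes(buffer)


def snapshot_without_copy(buffer):
    """Return a view that aliases the buffer; free, but tied to its state."""
    return memoryview(buffer)


# Hypothetical stand-in for a mutable buffer such as _printedPagesData.
printed_pages = bytearray(b"%PDF page data")

copied = snapshot_by_copy(printed_pages)
shared = snapshot_without_copy(printed_pages)

# Later code overwrites part of the buffer in place.
printed_pages[0:4] = b"XXXX"

# The copy still holds the original bytes; the view sees the overwrite.
copy_is_intact = copied == b"%PDF page data"
view_changed = shared.tobytes() == b"XXXX page data"
```

In Cocoa the corresponding choice would be between copying into an NSData and wrapping the existing bytes with one of the no-copy NSData initializers, which carries the same lifetime requirement on the wrapped buffer.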
------- Comment #15 From 2012-01-06 22:18:46 PST -------
(In reply to comment #12)
> (In reply to comment #9)
> > This affects not only Safari but lots of Mac software used to archive web sites.
> 
> This bug is actually specific to Safari. If you're seeing a problem with another application then this bug is not it.

What you're describing may be bug 75768.
------- Comment #16 From 2012-01-06 22:58:33 PST -------
Fixed in r104377.
------- Comment #17 From 2012-01-06 23:05:13 PST -------
(In reply to comment #14)
> It should be possible to avoid copying the data, but it would require a bit of reworking our assumptions about the state of _printedPagesData at various places throughout the class.  I'll file a follow-up about it.

Bug 75770.
------- Comment #18 From 2012-01-07 05:10:14 PST -------
(In reply to comment #12)

> This bug is actually specific to Safari.

It's in all applications that use /System/Library/Frameworks/WebKit.framework for instance http://c-command.com/eaglefiler/

Google Chrome also doesn't "print" clickable links, but I don't know if it ever did.

On Snow Leopard the r104378 nightly only ever prints blank pages, so I can't confirm that this is fixed.
------- Comment #19 From 2012-01-07 05:15:31 PST -------
> On Snow Leopard the r104378 nightly only ever prints blank pages, so I can't confirm that this is fixed.

I just tested r104378 (SL 10.6.8) and can confirm that printing to PDF now retains the links, so it seems to work perfectly.

Thanks a million!
------- Comment #20 From 2012-01-08 08:23:08 PST -------
I just tested r104398 and it works!

Thanks to Mark for taking care of it!!
------- Comment #21 From 2012-01-16 13:07:51 PST -------
r105048 on Lion 10.7.2 restores functionality of hyperlinks in "Save to PDF" -- however, the color of the hyperlink on the HTML page was not carried over to the PDF, so there's no visual indication that there is a hyperlink (at least in OS X Preview).

See "ScreenShot_2012-01-16" (PDF views in Preview on the left; WebKit on the right)
------- Comment #22 From 2012-01-16 13:08:45 PST -------
Created an attachment (id=122677) [details]
PDF in Preview.app on left; WebKit on right
------- Comment #23 From 2012-01-16 14:37:01 PST -------
That's a completely separate issue.  Please file a new bug report about it.
------- Comment #24 From 2012-01-16 15:56:07 PST -------
Sorry -- opened bug #76406

(In reply to comment #23)
> That's a completely separate issue.  Please file a new bug report about it.
------- Comment #25 From 2012-01-22 16:22:08 PST -------
Actually, one URL link is not preserved (it used to work before the patch and is now broken): the footer URL of the originating web page no longer works. It used to be generated as a link; now it is not.
------- Comment #26 From 2012-01-22 16:23:29 PST -------
Other than this one last issue... Thank you for your help fixing this issue!
------- Comment #27 From 2012-01-22 21:40:02 PST -------
Please file a new bug with specific steps to reproduce that problem.
------- Comment #28 From 2012-02-01 17:08:44 PST -------
I see this is resolved as fixed, however I just observed the problem (printing to PDF does not preserve clickable links) in Safari 5.1.3 on Mac OS X 10.7.3, both with Safari itself and with my WebKit-using app (EagleFiler).
------- Comment #29 From 2012-02-01 17:09:31 PST -------
That's expected. Safari 5.1.3 does not contain the fix.
------- Comment #30 From 2012-02-01 17:50:23 PST -------
Do we know when this fix will appear in an official Safari release from Apple?  I was really looking forward to the update for the potential fix to this disruptive bug.

Until then I have to continue to use Firefox with its Adobe Create PDF extension.
------- Comment #31 From 2012-02-01 17:54:48 PST -------
Topics such as Apple's release schedule are outside of the scope of this bug.
------- Comment #32 From 2012-02-02 12:26:01 PST -------
It looks like Safari 5.1.4 has been released for testing (as seen on the web http://www.google.ca/search?q=Safari+5.1.4 ) with suggestions to test print to PDF, etc - so it looks like that is the Mac version with the fix.
------- Comment #33 From 2012-03-16 16:05:34 PST -------
Mark,

I tried Safari 5.1.4 and indeed it works (just like with the WebKit builds), so thanks for your effort.

But something is still strange: I use NetNewsWire as my RSS reader. After updating Safari it still has the bug. I tried Vienna, same. I tried OmniBrowser and RapidWeaver - both work. Strange...

There might be a second place where your code changes need to be applied. Could you check again?

Thanks,
Karsten
------- Comment #34 From 2012-03-16 16:16:32 PST -------
Safari 5.1.4 did not update the system version of WebKit when installed on Lion. The fact that this still reproduces in third-party applications is currently expected, and will be resolved when the system version of WebKit is updated.
------- Comment #35 From 2012-05-12 02:39:28 PST -------
Just a final update:

With the recent Mac OS X 10.7.4 / Safari 5.1.7 update WebKit got fixed, too.
So apps like NetNewsWire, Vienna, and others that use WebKit to display
HTML pages are fixed now.

Thanks again,
Karsten