Bug 10216

Summary: PDF created by printing should have live hyperlinks
Product: WebKit Reporter: Darin Adler <darin>
Component: PrintingAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Enhancement CC: bdakin, darin, hyatt, kraisler, mitz
Priority: P4 Keywords: InRadar
Version: 420+   
Hardware: Mac   
OS: OS X 10.4   
Attachments:
Description Flags
review:?
none
patch fix for this bug.
hyatt: review-
PDF created by printing should have live hyperlinks
hyatt: review-
This patch supports creating href links in pdf documents when printing to pdf.
bdakin: review-
Will add HTML links to PDF output documents.
hyatt: review-
sample html documents create RenderInline objects.
none
patch in response to review
none
This patch will create URL links in PDF documents.
none
This patch will create URL links in PDF documents.
bdakin: review-
This patch will create URL links in PDF documents.
hyatt: review-
Adding links to pdf documents.
none
Adding links to pdf documents.
bdakin: review+
Cleaned up and slightly different patch. hyatt: review+

Darin Adler
Reported 2006-08-02 10:48:17 PDT
From <rdar://problem/3391162>: The PDF that results from saving a web page to PDF does not have live links. I should be able to click on the link in Preview and have the URL resolved exactly as if it had been clicked in Mail or Safari.
Attachments
review:? (4.26 KB, patch)
2006-08-02 17:01 PDT, Ken Kraisler
no flags
patch fix for this bug. (4.26 KB, patch)
2006-08-02 17:07 PDT, Ken Kraisler
hyatt: review-
PDF created by printing should have live hyperlinks (25.79 KB, patch)
2006-09-22 17:12 PDT, Ken Kraisler
hyatt: review-
This patch supports creating href links in pdf documents when printing to pdf. (368.63 KB, patch)
2006-09-29 16:38 PDT, Ken Kraisler
bdakin: review-
Will add HTML links to PDF output documents. (26.42 KB, patch)
2006-10-02 14:27 PDT, Ken Kraisler
hyatt: review-
sample html documents create RenderInline objects. (20.05 KB, application/zip)
2006-10-03 18:32 PDT, Ken Kraisler
no flags
patch in response to review (27.47 KB, patch)
2006-10-03 18:33 PDT, Ken Kraisler
no flags
This patch will create URL links in PDF documents. (27.47 KB, patch)
2006-10-04 09:56 PDT, Ken Kraisler
no flags
This patch will create URL links in PDF documents. (28.38 KB, patch)
2006-10-04 13:21 PDT, Ken Kraisler
bdakin: review-
This patch will create URL links in PDF documents. (28.38 KB, patch)
2006-10-04 14:48 PDT, Ken Kraisler
hyatt: review-
Adding links to pdf documents. (32.05 KB, patch)
2006-10-06 16:03 PDT, Ken Kraisler
no flags
Adding links to pdf documents. (29.29 KB, patch)
2006-10-06 17:09 PDT, Ken Kraisler
bdakin: review+
Cleaned up and slightly different patch. (12.78 KB, patch)
2006-10-11 01:58 PDT, Beth Dakin
hyatt: review+
Ken Kraisler
Comment 1 2006-08-02 17:01:19 PDT
Created attachment 9840 [details] review:? I am adding a fix such that when we create PDF documents, any URL links are live and active.
Ken Kraisler
Comment 2 2006-08-02 17:07:41 PDT
Created attachment 9841 [details] patch fix for this bug. I am adding a fix such that when we create PDF documents, any URL links are live and active.
Dave Hyatt
Comment 3 2006-08-02 17:27:07 PDT
Comment on attachment 9841 [details] patch fix for this bug. Ok, here are my comments. (1) This feature as coded will cause a slowdown. We will need a way to only do work when we detect that we're drawing into a PDF context. (2) You need to deal with partial clipping (unless the CG API does this for you?). (3) Patching only InlineFlowBox is insufficient and won't give you the right rects. This won't work for image links. It won't work when the link is a block. It won't work if the content inside the link is a larger font size. I would probably limit this feature to images and text and patch those methods instead.
Dave Hyatt
Comment 4 2006-08-02 17:30:08 PDT
The problem with patching images and text though is that they don't know that they are links... so you'd be crawling up all the time looking to see if something was a link. I honestly am not sure how to do this feature efficiently. :(
Dave Hyatt
Comment 5 2006-08-02 17:33:34 PDT
This problem is conceptually similar to outline. An outline is the Aqua focus ring that shows when a link has been tabbed to. For outlines (which are irregular paths that bound an object and its descendants) we actually collect rectangles. That collection of rectangles is what you want, so this problem is very very similar to that. We even deal with clipping for focus rings.
Darin Adler
Comment 6 2006-08-03 18:39:30 PDT
It'd be very helpful if we could come up with test cases for the various cases Dave mentions above. Clearly we have to look further up the DOM tree, since the <a> element might not be the immediate parent of the node that's rendered. Given Dave' suggestion, I started looking at how outline painting works, and the relevant functions seem to be: RenderBlock::paintObject RenderFlow::paintLines RenderImage::paint RenderHTMLCanvas::paint RenderSVGContainer::paint I wouldn't worry too much about those last two, although I guess they would need the same code as RenderImage.
Darin Adler
Comment 7 2006-08-03 19:32:26 PDT
I don't see yet how to efficiently check if we're inside a link or not (I think Dave had some ideas about that), but I do see that we could add a set of linkObjects to PaintInfo alongside outlineObjects. InlineFlowBox::paint could add the objects to the set. Then RenderFlow::paintLines could call setURLForRect for each one. RenderBlock::paintObject and RenderImage::paint would also call setURLForRect. We'd probably want to add a function to RenderObject so we could share the logic for deciding if something is a link. Dave, could you say more about what the technique would be for efficiently checking if we're inside a link? You said something to me yesterday about guessing that it wouldn't be too slow if we only checked in some place using this new outline approach, but I can't remember what you said exactly.
Darin Adler
Comment 8 2006-08-03 19:34:21 PDT
(In reply to comment #3) > (2) You need to deal with partial clipping (unless the CG API does this for > you?). I think this is easy to do for the simple cases. The GraphicsContext::setURLForRect function can intersect the passed-in rectangle with the clip bounding box (CGContextGetClipBoundingBox) before converting the coordinates from user to device space. That won't work when the clip is not rectangular, but I think it's good enough for our purposes.
Ken Kraisler
Comment 9 2006-09-11 16:38:25 PDT
Based on Dave and Darin's suggestions above, I've started debugging RenderFlow::paintLines. paintLines() iterates through its InlineFlowBox's. I found that if an InlineFlowBox has an element which responds to isLink() as true, I can build up a PDF PDF HREF rect. This rect seems to work properly in my simple HTML document and seems to account for instances when the font size has been increased. I realize RenderFlow::paintLines is one over several spots we'll need some patch code and I'm using it as my initial exploration case. Does my basic approach sound correct?
Ken Kraisler
Comment 10 2006-09-22 17:12:31 PDT
Created attachment 10718 [details] PDF created by printing should have live hyperlinks Darin, Beth and Dave, Thank you very much for your help regarding this bug. After all of your guidance, I finally gained some sense into how the rendering works. The patch solves all of my use case documents including the ancestors one. I'd like to submit this patch for your review. As Darin and I discussed, it hopefully will not impact performance during normal draw time, because it only occurs during printing. I tried to be very careful not to check anything unless in this mode. I'll be posting this same patch on Bugzilla w/ Beth as my requested reviewer.
Dave Hyatt
Comment 11 2006-09-22 22:55:58 PDT
Step one... please get the tabs out so that the indentation doesn't look so crazy... then I can read it more easily. Thanks.
Dave Hyatt
Comment 12 2006-09-22 23:26:45 PDT
This will be cool functionality to have, and I'm glad you're working on this. This new patch is conceptually closer to outline and that's a good thing. However, I still think this patch is way too intrusive in terms of how far it has to sink its hooks into the RenderTree code. I would like to see more code sharing with outline focus rings. I think the patch could become a lot smaller in terms of painting intrusion if we use CSS to your advantage. CSS already has a way of asking if something is a link and only applying a style if that is true. So when printing only, there could be a style rule in html4.css like: @media print { :-webkit-any-link { ... add something cool here ... } } Given that outline already collects rects and then paints the outline, if outline itself were enhanced with an extension that indicated that all links should have a special link-tagging annotation, then I think the resulting intrusion into the RenderTree code becomes very minimal. So for example, we could say something like: -webkit-outline-annotation: link; ...and make that part of the outline property. If that is present it would (a) cause the element to think it needs to "paint" its outline and (b) At the time the outline gets "painted", the URL of the link would also get tagged with the outline rect. You'd basically then be able to add the link tagging code to only two places: (1) void RenderObject::paintOutline and (2) void RenderFlow::paintLines (at the end where the outlines are drawn) ...and avoid having to intrude into so many painting methods (and also avoid having to have a hasLink method on RenderObject). Beth can help you with adding a CSS property (this is not hard at all), since she has done it many times.
Ken Kraisler
Comment 13 2006-09-29 16:38:11 PDT
Created attachment 10842 [details] This patch supports creating href links in pdf documents when printing to pdf. I've add the CSS property to html4.css and added the boolean to CSS3NonInheritedData, as suggested by Dave. The html4.css uses the @meda print approach which right now has a small glitch. Darin says this will be fixed quickly.
Beth Dakin
Comment 14 2006-10-02 13:49:39 PDT
Comment on attachment 10842 [details] This patch supports creating href links in pdf documents when printing to pdf. 1. Remove this: + if( id == CSS_PROP__WEBKIT_OUTLINE_ANNOTATION ) + printf( "test99" ); 2. Here: + if(!primitiveValue) break; + if(primitiveValue->getIdent() == CSS_VAL__WEBKIT_LINK) { + style->setOutlineAnnotation(true); + } Separate the break onto the next line, and take the brackets away from the second if-statement. 3. Resolve conflicts in GraphicsContextCairo.cpp: +<<<<<<< .mine +void GraphicsContext::setURLForRect(const KURL& link, const IntRect& destRect) +{ +======= 4. It doesn't look like the changes you have in InlineFlowBox.cpp are necessary. You should take then out unless the are needed. I would like Dave to look at the painting code because I know that he has very specific ideas about how it can be done, but here are some clean-up comments in the meantime.
Beth Dakin
Comment 15 2006-10-02 14:11:21 PDT
Comment on attachment 10842 [details] This patch supports creating href links in pdf documents when printing to pdf. Hyatt will check out the painting stuff once the clean-up comments I made above are addressed.
Ken Kraisler
Comment 16 2006-10-02 14:27:00 PDT
Created attachment 10874 [details] Will add HTML links to PDF output documents. This patch has code clean up as suggested by Beth.
Dave Hyatt
Comment 17 2006-10-03 15:46:40 PDT
Comment on attachment 10874 [details] Will add HTML links to PDF output documents. Conflicts in the ChangeLog. Style is wrong here: + if(!primitiveValue) + break; + if(primitiveValue->getIdent() == CSS_VAL__WEBKIT_LINK) { + style->setOutlineAnnotation(true); + } Need spaces after the "ifs" and need to drop the braces from the second if. A general observation here is that you've added way too many new tiny functions to RenderObject that are no longer called in many places, so they can just be folded right into the code that is adding the links. You should not need to touch RenderBlock.cpp at all. I do not understand any of the changes in this file. Outlines are painted in the two places I described in my earlier comment. The idea is that you only have to hook into those places and you can leverage the rects that are collected by the outline code. The new CSS property needs computed style implemented. The boolean in RenderStyle that you added to CSS3NonInheritedData needs to be initialized and also added to the copy constructor and assignment operator.
Ken Kraisler
Comment 18 2006-10-03 18:32:16 PDT
Created attachment 10886 [details] sample html documents create RenderInline objects. These both require changes to RenderBlock to accomplish the task of producing URL links in PDF.
Ken Kraisler
Comment 19 2006-10-03 18:33:34 PDT
Created attachment 10887 [details] patch in response to review 1. I resolved conficts in the ChangeLog. 2. I added spaces after the ifs. 3. I have 3 functions in RenderObject. They are needed because of the complexities of RenderBlock. Without my patch to RenderBlock, my sample ancestors and fruit documents (see attachments) will not work. These 3 functions support the various cases needed between RenderBlock and RenderFlow. 4. As for not needing to touch RenderBlock, please debug through it using my sample documents. 5. Using the CSS property we created there is not necessarily an outline style on the RenderObject and so paintOutlines and paintOutline will not necessarily be called. Please debug through it yourself to validate. We are not forcing a RenderStyle on the link RenderObject. If we force something like DOTTED, then yes we'd hit that code, however putting this style on that object was deemed bogus. We're using outline as a sample of how to do it, not necessarily actually generating an outline. 6. I'm not sure of where or what this is: The new CSS property needs computed style implemented. 7. I did the following: The boolean in RenderStyle that you added to CSS3NonInheritedData needs to be initialized and also added to the copy constructor and assignment operator.
Ken Kraisler
Comment 20 2006-10-04 09:56:02 PDT
Created attachment 10903 [details] This patch will create URL links in PDF documents. 1. I resolved conficts in the ChangeLog. 2. I added spaces after the ifs. 3. I have 3 functions in RenderObject. They are needed because of the complexities of RenderBlock. Without my patch to RenderBlock, my sample ancestors and fruit documents (see attachments) will not work. These 3 functions support the various cases needed between RenderBlock and RenderFlow. 4. As for not needing to touch RenderBlock, please debug through it using my sample documents. 5. Using the CSS property we created there is not necessarily an outline style on the RenderObject and so paintOutlines and paintOutline will not necessarily be called. Please debug through it yourself to validate. We are not forcing a RenderStyle on the link RenderObject. If we force something like DOTTED, then yes we'd hit that code, however putting this style on that object was deemed bogus. We're using outline as a sample of how to do it, not necessarily actually generating an outline. 6. I'm not sure of where or what this is: The new CSS property needs computed style implemented. 7. I did the following: The boolean in RenderStyle that you added to CSS3NonInheritedData needs to be initialized and also added to the copy constructor and assignment operator.
Beth Dakin
Comment 21 2006-10-04 11:02:36 PDT
Ken, to implement the computed style of the property, you just need to add a clause to getPropertyCSSValue() in CSSComputedStyleDeclaration.cpp
Ken Kraisler
Comment 22 2006-10-04 13:21:06 PDT
Created attachment 10913 [details] This patch will create URL links in PDF documents. I added a clause getPropertyCSSValue() / CSSComputedStyleDeclaration.cpp for the outline annotation property.
Beth Dakin
Comment 23 2006-10-04 13:31:33 PDT
Comment on attachment 10913 [details] This patch will create URL links in PDF documents. The CSSPrimitveValue class does not have a constructor tat accepts a boolean value, so I am not sure exactly what the constructor you added to CSSComputedStyleDeclaration will do, but it will probably do something funny. In the parser, you only let -webkit-outline-annotation have one of two values: CSS_VAL__WEBKIT_LINK, or CSS_VAL_NONE. So you probably want something like: case CSS_PROP__WEBKIT_OUTLINE_ANNOTATION: if (style->outlineAnnotation()) + return new CSSPrimitiveValue(CSS_VAL__WEBKIT_LINK); return new CSSPrimitiveValue(CSS_VAL_NONE); + break;
Ken Kraisler
Comment 24 2006-10-04 14:48:28 PDT
Created attachment 10917 [details] This patch will create URL links in PDF documents. I adjusted the return values for the computed style declaration as suggsted by Beth's comments.
Dave Hyatt
Comment 25 2006-10-04 18:07:14 PDT
Comment on attachment 10917 [details] This patch will create URL links in PDF documents. (1) Add a new method to RenderStyle called hasOutline() that is true if (a) you have a link annotation or (b) you have an outline width > 0 and you have an outline style > BHIDDEN. (2) Patch the paint method of InlineFlowBox to check hasOutline instead of outlineWidth > 0 (3) Now all of the rects (including continuations) for inline links will be collected and ready for you in paintLines. (4) Merge paintFocusRing and paintOutlines from RenderFlow into a single new function called paintOutline. (5) In this new function you can collect the focus rings rects if the outline style is auto or if there is a link annotation. (6) Once the focus rings have been collected, call drawFocusRing if outline style is auto. Add the link rects if the annotation is set. Then do the paintOutlines code. (7) Patch RenderObject's paintOutline method in a similar fashion (8) Patch RenderBlock's call to paintOutline to be conditional on hasOutline() rather than outlineWidth() > 0. Note this test case: <a href="http://www.apple.com" style="display:block">Apple</a> (9) Eliminate all the other painting code.
Dave Hyatt
Comment 26 2006-10-04 18:12:08 PDT
I notice you had to apply a transform to the rect by hand in GraphicsContext.cpp. What about clipping? Do you know if CG intersects the rect you add with the current clip?
Ken Kraisler
Comment 27 2006-10-06 16:03:45 PDT
Created attachment 10954 [details] Adding links to pdf documents. In regards to RenderObject, RenderFlow, InlineFlowBox and most of RenderBlock, I've implemented Dave''s suggestions above hopefully 100% accurately. However, continuations() continue to be a problem. As a result, I added code to RenderBlock to handle continuations() using similar methods to my previous patches. One difference being that my specialized code resides only in RenderBlock and RenderObject. Note this snippet of printf's from InlineFlowBox when run against my two sample documents: void InlineFlowBox::paint(RenderObject::PaintInfo& i, int tx, int ty) {     int xPos = tx + m_x - object()->maximalOutlineSize(i.phase);     int w = width() + 2 * object()->maximalOutlineSize(i.phase);     bool intersectsDamageRect = xPos < i.r.right() && xPos + w > i.r.x();          if (intersectsDamageRect && i.phase != PaintPhaseChildOutlines) {         if (i.phase == PaintPhaseOutline || i.phase == PaintPhaseSelfOutline) {             // Add ourselves to the paint info struct's list of inlines that need to paint their             // outlines.             if (object()->style()->visibility() == VISIBLE && object()->style()->hasOutline() &&                 !object()->isInlineContinuation() && !isRootInlineBox()) {                 i.outlineObjects->add(flowObject());             }             if (object()->style()->hasOutline())                 printf( "This prints out for most types of links.  THIS NEVER PRINTS OUT for my two sample documents containing continuation()s. Note that these continuations are RenderInline's" );             if (object()->continuation() && object()->continuation()->style()->hasOutline())                 printf( "This never prints out." );         }         else {             // 1. Paint our background and border.             paintBackgroundAndBorder(i, tx, ty);                          // 2. Paint our underline and overline.             paintDecorations(i, tx, ty, false);         }     }     PaintPhase paintPhase = i.phase == PaintPhaseChildOutlines ? PaintPhaseOutline : i.phase;     RenderObject::PaintInfo childInfo(i);     childInfo.phase = paintPhase;     // 3. Paint our children.     if (paintPhase != PaintPhaseSelfOutline) {         for (InlineBox* curr = firstChild(); curr; curr = curr->nextOnLine()) {             if (!curr->object()->layer())                 curr->paint(childInfo, tx, ty);         }     }     // 4. Paint our strike-through     if (intersectsDamageRect && (i.phase == PaintPhaseForeground || i.phase == PaintPhaseSelection))         paintDecorations(i, tx, ty, true); }
Ken Kraisler
Comment 28 2006-10-06 16:05:26 PDT
(In reply to comment #26) > I notice you had to apply a transform to the rect by hand in > GraphicsContext.cpp. What about clipping? Do you know if CG intersects the > rect you add with the current clip? > CG does not apply any clippping probably because this is not strictly a drawing function. This has not seemed necessary on a wide array of samples.
Dave Hyatt
Comment 29 2006-10-06 16:14:48 PDT
You've discovered a bug with outlines and continuations. We'll need to fix that so that the outlines paint properly. e.g., <a href="http://www.apple.com" style="outline:10px auto purple"><h3>Hello</h3></a> does not properly paint its outline. We can handle fixing that though in a followup bug, so don't worry that your continuation examples don't work properly.
Dave Hyatt
Comment 30 2006-10-06 16:18:37 PDT
It's pretty easy to work with clipping, since you can just get the bounding box from CG and make sure to intersect your rect with it. This is important, since otherwise links inside overflow sections will spill out over the rest of the page. Here's a test case: <pre style="width:100px;height:100px;background-color:lime;overflow:hidden"> <a href="http://www.apple.com">This is a really long link that gets clipped and that should not have its entire rect put into the PDF as is... see it's really really long!</a></pre>
Ken Kraisler
Comment 31 2006-10-06 17:09:45 PDT
Created attachment 10956 [details] Adding links to pdf documents. I incorporated Dave's clipping suggestion. Based on Dave's comment that the continuation code currently has a bug and that I proceed w/ his original instructions, I've removed my specialized continuation() handling code from RenderBlock. The code, in this patch, should match with Dave's recommendations in comment #25. Thanks, Ken
Beth Dakin
Comment 32 2006-10-10 15:04:53 PDT
Comment on attachment 10956 [details] Adding links to pdf documents. r=me This patch still has some minor style issues like unnecessary braces and white space. But I will land the patch myself and clean it up before I land. So no one else land this! Not unless you want to comb it for style and white space, that is ;-)
Beth Dakin
Comment 33 2006-10-11 01:58:42 PDT
Created attachment 11030 [details] Cleaned up and slightly different patch. Here is a cleaned up patch. I have posted it for review again, though, because it is also slightly different. Hyatt and I were talking on IRC, and we realized that a CSS property is not necessary after all; all of the information we need if available right in RenderObject. So I added hasOutlineAnnotation() and hasOutline() to RenderObject.h instead of having th annotation be a style. Otherwise, the patch is pretty much the same as Ken's last patch. Ken, it would be really helpful if you could either tell me how you have been testing this or if you could roll my patch into your tree and test it yourself. I am fairly positive that this patch will have identical behavior to your last one, but I would like to check before landing it.
Ken Kraisler
Comment 34 2006-10-11 10:55:34 PDT
Hi Beth, I've been testing this by going to the Yahoo.com page and selecting Print and then Print Preview. The links in the resulting PDF should be active and valid. Also, this bug has a couple sample documents that need to ultimately work. If you read Dave's comments above, he says there's an outline painting bug for RenderInline objects. I request that you or someone on your team write up this bug so that this does not fall through the cracks. I'll download your patch too and provide feedback later today. Thanks again, Ken (In reply to comment #33) > Created an attachment (id=11030) [edit] > Cleaned up and slightly different patch. > > Here is a cleaned up patch. I have posted it for review again, though, because > it is also slightly different. Hyatt and I were talking on IRC, and we realized > that a CSS property is not necessary after all; all of the information we need > if available right in RenderObject. So I added hasOutlineAnnotation() and > hasOutline() to RenderObject.h instead of having th annotation be a style. > Otherwise, the patch is pretty much the same as Ken's last patch. > > Ken, it would be really helpful if you could either tell me how you have been > testing this or if you could roll my patch into your tree and test it yourself. > I am fairly positive that this patch will have identical behavior to your last > one, but I would like to check before landing it. >
Beth Dakin
Comment 35 2006-10-11 11:34:33 PDT
Beth Dakin
Comment 36 2006-10-11 11:51:13 PDT
Ken, did your patch work with all of the links at yahoo.com? My patch works for most of them, but it misses the links on the left side (autos, finance, etc). I am not sure if that is an edge case covered by the bug that I filed or if you normally have those working even with the existing outline bug.
Ken Kraisler
Comment 37 2006-10-11 12:02:08 PDT
Beth, Those links on the left-hand-side were handle by my modifications to RenderBlock. Since Dave rejected this code as says it needs to be addressed as a fix through the outline painting code, these links simply won't work as of yet. However, we need to track them as another test case along with the attached sample documents. I forgot to mention this above and I thank you for reminding me. Can you add this comment to the bug you created: <rdar://problem/4778099>. Thanks again. (In reply to comment #36) > Ken, did your patch work with all of the links at yahoo.com? My patch works for > most of them, but it misses the links on the left side (autos, finance, etc). I > am not sure if that is an edge case covered by the bug that I filed or if you > normally have those working even with the existing outline bug. >
Beth Dakin
Comment 38 2006-10-11 13:44:52 PDT
Okay, cool. I think this patch maintains the same functionality as Ken's most recent patch, then.
Dave Hyatt
Comment 39 2006-10-13 15:28:17 PDT
Comment on attachment 11030 [details] Cleaned up and slightly different patch. r=me, make sure to apply the same logic fix to RenderObject;:paintOutline that you did to RenderFlow::paintOutline.
Beth Dakin
Comment 40 2006-10-13 18:17:58 PDT
Committed with r17052.
Note You need to log in before you can comment on or make changes to this bug.