Bug 150740 - Link preview doesn't work on XHTML pages with Content-Type header as `application/xhtml+xml`
Summary: Link preview doesn't work on XHTML pages with Content-Type header as `applica...
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebKit2 (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-30 16:45 PDT by Beth Dakin
Modified: 2015-11-04 14:36 PST (History)
6 users (show)

See Also:


Attachments
Patch (3.53 KB, patch)
2015-10-30 16:48 PDT, Beth Dakin
no flags Details | Formatted Diff | Diff
Patch (3.54 KB, patch)
2015-10-30 16:52 PDT, Beth Dakin
thorton: review+
Details | Formatted Diff | Diff
Follow-up fix for non-ascii URLs (3.11 KB, patch)
2015-11-02 08:55 PST, Beth Dakin
no flags Details | Formatted Diff | Diff
Patch to fix remaining bug, clean things (7.54 KB, patch)
2015-11-03 12:42 PST, Beth Dakin
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Beth Dakin 2015-10-30 16:45:40 PDT
Link preview doesn't work on XHTML pages with Content-Type header as `application/xhtml+xml`

rdar://problem/23063585
Comment 1 Beth Dakin 2015-10-30 16:48:01 PDT
Created attachment 264441 [details]
Patch
Comment 2 WebKit Commit Bot 2015-10-30 16:50:32 PDT
Attachment 264441 [details] did not pass style-queue:


ERROR: Source/WebKit2/UIProcess/ios/WKContentViewInteraction.mm:1051:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
ERROR: Source/WebKit2/UIProcess/ios/WKContentViewInteraction.mm:1053:  Missing space after ,  [whitespace/comma] [3]
ERROR: Source/WebKit2/UIProcess/ios/WKContentViewInteraction.mm:3434:  Missing space after ,  [whitespace/comma] [3]
Total errors found: 3 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 3 Beth Dakin 2015-10-30 16:52:03 PDT
Created attachment 264442 [details]
Patch
Comment 4 WebKit Commit Bot 2015-10-30 16:53:27 PDT
Attachment 264442 [details] did not pass style-queue:


ERROR: Source/WebKit2/UIProcess/ios/WKContentViewInteraction.mm:1051:  An else if statement should be written as an if statement when the prior "if" concludes with a return, break, continue or goto statement.  [readability/control_flow] [4]
Total errors found: 1 in 2 files


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 5 Beth Dakin 2015-10-30 17:00:40 PDT
Thanks Tim! http://trac.webkit.org/changeset/191830
Comment 6 mitz 2015-10-30 17:44:54 PDT
Is it really correct to just check the tag name, or is this one of those cases where XML namespaces should be taken into account?
Comment 7 Darin Adler 2015-10-31 15:40:27 PDT
(In reply to comment #6)
> Is it really correct to just check the tag name, or is this one of those
> cases where XML namespaces should be taken into account?

I was planning to comment on that but had lost track of this bug. Thanks for drawing my attention back here! As Dan implies, this new code is incorrect in the presence of non-HTML elements that happen to have the same local name as HTML elements.

This clickableElementName field is apparently a copy of the DOM nodeName attribute, shipped across processes. We can’t use nodeName alone to check if something is a particular HTML element.

If our intent is to use this attribute to check if something is a particular HTML element, then I suggest we replace this code:

    info.clickableElementName = hitNode->nodeName();

With something more like this:

    if (is<HTMLElement>(*hitNode))
        info.clickableElementName = downcast<HTMLElement>(*hitNode).localName();

This will always be lowercase, and so there will be no need to fold case on the UI process side.

If it happens that we want to use this for SVG elements as well, and there is also no practical concern about collision between HTML and SVG local names (I think there are very few such collisions), then it could instead be this:

    if (is<HTMLElement>(*hitNode) || is<SVGElement>(*hitNode))
        info.clickableElementName = downcast<Element>(*hitNode).localName();
Comment 8 Beth Dakin 2015-11-01 11:44:34 PST
(In reply to comment #7)
> (In reply to comment #6)
> > Is it really correct to just check the tag name, or is this one of those
> > cases where XML namespaces should be taken into account?
> 
> I was planning to comment on that but had lost track of this bug. Thanks for
> drawing my attention back here! As Dan implies, this new code is incorrect
> in the presence of non-HTML elements that happen to have the same local name
> as HTML elements.
> 
> This clickableElementName field is apparently a copy of the DOM nodeName
> attribute, shipped across processes. We can’t use nodeName alone to check if
> something is a particular HTML element.
> 
> If our intent is to use this attribute to check if something is a particular
> HTML element, then I suggest we replace this code:
> 
>     info.clickableElementName = hitNode->nodeName();
> 
> With something more like this:
> 
>     if (is<HTMLElement>(*hitNode))
>         info.clickableElementName =
> downcast<HTMLElement>(*hitNode).localName();
> 
> This will always be lowercase, and so there will be no need to fold case on
> the UI process side.
> 
> If it happens that we want to use this for SVG elements as well, and there
> is also no practical concern about collision between HTML and SVG local
> names (I think there are very few such collisions), then it could instead be
> this:
> 
>     if (is<HTMLElement>(*hitNode) || is<SVGElement>(*hitNode))
>         info.clickableElementName = downcast<Element>(*hitNode).localName();

Thanks Darin and Dan! I will fix this.
Comment 9 Beth Dakin 2015-11-01 11:45:13 PST
Marking as "reopened" based on the information Darin and Dan provided.
Comment 10 Darin Adler 2015-11-01 12:24:40 PST
One other thought. Since the check in the UI process is for two specific types of elements, we might want to come up with a more precise way of doing that rather than using element local names at all. Instead of a string, we could pass over the exact computed values that the UI process needs.

My first cut might be an "is link" boolean and an "is image" boolean. Then you have to decide exactly what definition of "link" and what definition of "image" you want, but that might actually make things work better, rather than worse, helping distinguish <a> elements that are not actually links, and make it easier to make SVG links and images work as well.
Comment 11 Beth Dakin 2015-11-02 08:55:35 PST
Created attachment 264594 [details]
Follow-up fix for non-ascii URLs
Comment 12 mitz 2015-11-02 09:19:17 PST
Wrong bug?
Comment 13 Beth Dakin 2015-11-02 10:03:46 PST
(In reply to comment #12)
> Wrong bug?

😁 yes!
Comment 14 Beth Dakin 2015-11-03 12:42:25 PST
Created attachment 264713 [details]
Patch to fix remaining bug, clean things
Comment 15 Darin Adler 2015-11-03 14:54:24 PST
Comment on attachment 264713 [details]
Patch to fix remaining bug, clean things

View in context: https://bugs.webkit.org/attachment.cgi?id=264713&action=review

> Source/WebKit2/WebProcess/WebPage/ios/WebPageIOS.mm:2182
> +            info.clickableElementName = element->localName();

Can we remove clickableElementName entirely? If not, what code is using it? Is the localName usable without knowing that it’s in the HTML namespace, SVG namespace, or another? I’d expect it would not be sufficient on its own.
Comment 16 Beth Dakin 2015-11-03 16:55:20 PST
(In reply to comment #15)
> Comment on attachment 264713 [details]
> Patch to fix remaining bug, clean things
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=264713&action=review
> 
> > Source/WebKit2/WebProcess/WebPage/ios/WebPageIOS.mm:2182
> > +            info.clickableElementName = element->localName();
> 
> Can we remove clickableElementName entirely? If not, what code is using it?

Good question. I did consider this, and it is used in two places still. It is not necessarily perfect for either of those places, but isLink() and isImage() are not sufficient substitutes either.

First, it is used very when the _highlightLongPressGestureRecognizer begins to recognize. So long as the clickableElementName is not null, the gesture is allowed. 

Second it is used when the _highlightLongPressGestureRecognizer ends before a normal long press recognizes. At that time, if clickableElementName is not empty, then we will attempt a click at the location. So it is a minor optimization to prevent attempting a click 

> Is the localName usable without knowing that it’s in the HTML namespace, SVG
> namespace, or another? I’d expect it would not be sufficient on its own.

Since it is only used at this time as a way to determine if an element was hit, I believe that the lack of a namespace is not a problem. However, it does bug me that this might not be right if it was used for another purpose, and it bugs me that we store a string to decide whether or not something is an element. Is there ever a time that you could HitTest an Element but it's localName would be null or empty? I don't think there is. Maybe clickableElement should be a bool too.
Comment 17 Darin Adler 2015-11-04 08:41:09 PST
(In reply to comment #16)
> Maybe clickableElement should be a bool too.

Yes, I think that’s the answer. We should change it from a string to a bool.

Everything you are saying makes sense. As far as I can tell from what you say, there’s no incorrect behavior in the code as it stands. But it’s not valuable to send over a name to simply convey the boolean “is a clickable element” state. I don’t think there’s anything correct we could do with the name alone other than check it for null/empty. Just temptation to use it incorrectly for future programmers!
Comment 18 Darin Adler 2015-11-04 08:44:57 PST
Another thought:

I don’t really know in what practical case I would not be over any element. Maybe it’s true past the bottom of the document? I think we’d almost always be over the body element if nothing else.

Given that, I’m not sure it’s valuable for us to prevent the gesture when not over an element. And I’m not sure it’s valuable for us to prevent clicks when not over an element.

To word it another way: How did we test the “not over an element” case?

If I’m right, we can probably just remove this entirely.
Comment 19 Beth Dakin 2015-11-04 12:03:22 PST
(In reply to comment #18)
> Another thought:
> 
> I don’t really know in what practical case I would not be over any element.
> Maybe it’s true past the bottom of the document? I think we’d almost always
> be over the body element if nothing else.
> 
> Given that, I’m not sure it’s valuable for us to prevent the gesture when
> not over an element. And I’m not sure it’s valuable for us to prevent clicks
> when not over an element.
> 
> To word it another way: How did we test the “not over an element” case?
> 
> If I’m right, we can probably just remove this entirely.

Very interesting thoughts. So, it turns out that it is very possible to not be over an element. I think that in normal hit testing, you will always be over an element (I think). But! The hit-testing code path that this code uses is unique to iOS and does all kinds of filtering to look for something clickable/tappable -- see Frame::nodeRespondingToClickEvents(). Now, I don't understand the historical reasons for that instead of normal hit testing, and I don't know if those reasons are still valid. But changing that seems like a big enough change that it should be done separately and with more thought.

So for now, I think I will change clickableElementName to isClickableElement just for clarity.

It seems like a small enough change that I can just make it and check it, but I would be happy to post a new patch if you prefer.
Comment 20 Beth Dakin 2015-11-04 14:36:43 PST
I committed the change with the clickableElementName->isClickableElement change. Thanks  for the help! And let me know if you see some follow-up issues.

http://trac.webkit.org/changeset/192037