Bug 62531
Summary: | Find in page -- spaces fail to match tab characters (collapsed to spaces) | ||
---|---|---|---|
Product: | WebKit | Reporter: | Yoyo Zhou <yoz> |
Component: | HTML Editing | Assignee: | Nobody <webkit-unassigned> |
Status: | NEW | ||
Severity: | Normal | CC: | ap, darin |
Priority: | P2 | ||
Version: | 528+ (Nightly build) | ||
Hardware: | All | ||
OS: | All | ||
URL: | data:text/html,a%09b |
Yoyo Zhou
When a tab character is in the HTML source, rendered as a space, it should match if the user enters a space into the find input.
For example, finding "a b" in
data:text/html,a%09b
fails in Webkit nightly and Chrome 12 beta, but succeeds in Firefox 4, IE 8, and Opera 10.
Another example real URL:
http://www.dipsea.org/2009/2009results_0630.html
in which "INV / 1" is unfindable because the HTML uses tab characters.
I couldn't find a way to insert a tab character into the find input, so I can't tell if that would match.
This works fine with some other kinds of whitespace, e.g. %0A (newline).
Attachments | ||
---|---|---|
Add attachment proposed patch, testcase, etc. |
Darin Adler
I did some brief investigation. This is not a bug in the search machinery itself, but rather in the TextIterator class. Its code to handle whitespace collapsing does not kick in, and the tab character shows up as a tab character. Another symptom of the problem is that when you copy the content you get a tab character rather than a space character in the plain text version. The intent is that you get a space.
Darin Adler
The problematic code is in TextIterator::handleTextBox and in the WebCore::isCollapsibleWhitespace function in TextIterator.h. Here are the issues I spotted:
1) The part of TextIterator::handleTextBox that handles newline specially inside the "if (runStart < runEnd)" needs to handle both newlines and tab characters in the same way.
2) The tab character is not included in the switch statement in WebCore::isCollapsibleWhitespace.
3) It's almost never safe to use WebCore::isCollapsibleWhitespace because it does not consider the style. Any time we are using WebCore::isCollapsibleWhitespace we have to have some reason to know the style was already considered. Not sure about the call to WebCore::isCollapsibleWhitespace in TextIterator::handleTextBox. The way it’s used in SimplifiedBackwardsTextIterator::advance calling WebCore::maxOffsetIncludingCollapsedSpaces calling WebCore::collapsedSpaceLength is definitely wrong.
4) The code in TextIterator::handleTextNode and TextIterator::handleTextBox, and possibly in other TextIterator functions too, seems like it will work better with a whitespace mode of NORMAL or PRE than it would with a mode of NOWRAP or PRE_WRAP. It would be good and not terribly difficult to make test cases covering all of these.
I suspect that fixing (1) and perhaps (2) will resolve this bug.
Darin Adler
The code that handles this is technically part of the HTML editing machinery.