Bug 21076

Summary: incorrect tabindex parsing
Product: WebKit Reporter: Alexander Macdonald <alexmac>
Component: WebCore Misc.Assignee: Adam Barth <abarth>
Status: RESOLVED FIXED    
Severity: Normal CC: abarth, ap, commit-queue, eric, stanaland, webkit.review.bot
Priority: P2    
Version: 528+ (Nightly build)   
Hardware: All   
OS: All   
Attachments:
Description Flags
testcase
none
Patch
none
Patch
none
Patch
none
Patch eric: review+

Description Alexander Macdonald 2008-09-24 14:48:37 PDT
When an unquoted attribute appears atthe end of an element with no space between it and the closing "/>" the attribute will have a trailing "/" appended to its value, this is a bug in the tokenizer, although it seems to be kept to maintain compatibility with old versions of IE

however when a tabindex attribute is given this extra trailing slash it fails the new tabindex HTML5 validation in HTMLElement::parseMappedAttribute

This behaviour does not occur on FF3.0.2 or IE7

Checking for, and removing, a single trailing slash fixes the problem. Here is the code I'm using to fix the problem, its inserted after line 147 in HTMLElement.cpp:

if(indexstring[indexstring.length()-1] == '/') {
  indexstring.truncate(indexstring.length()-1);
}

Does this seem reasonable, or is there a better way to fix it?
Comment 1 Alexander Macdonald 2008-09-24 14:51:50 PDT
Created attachment 23767 [details]
testcase

try tabbing between the fields, focus should switch from the first input to the second input
Comment 2 Adam Barth 2010-07-21 17:03:23 PDT
Won't that get the wrong answer if we write tabindex="3/" ?
Comment 3 Adam Barth 2010-07-21 17:05:59 PDT
Created attachment 62249 [details]
Patch
Comment 4 Adam Barth 2010-07-21 17:06:42 PDT
That patch doesn't actually fix the problem.  It just documents our parsing behavior.
Comment 5 Adam Barth 2010-07-21 18:10:41 PDT
Comment on attachment 62249 [details]
Patch

Clearing flags on attachment: 62249

Committed r63872: <http://trac.webkit.org/changeset/63872>
Comment 6 Adam Barth 2010-08-05 10:19:05 PDT
We discovered on another bug that the correct fix is to ignore trailing junk when parsing numbers out of the tabindex property.
Comment 7 Adam Barth 2010-09-13 18:18:25 PDT
*** Bug 43128 has been marked as a duplicate of this bug. ***
Comment 8 Adam Barth 2010-09-13 21:36:24 PDT
Here's the algorithm we're supposed to use:
http://www.whatwg.org/specs/web-apps/current-work/#rules-for-parsing-integers
Comment 9 Adam Barth 2010-09-13 23:44:50 PDT
Created attachment 67521 [details]
Patch
Comment 10 Adam Barth 2010-09-14 00:10:01 PDT
Created attachment 67523 [details]
Patch
Comment 11 Eric Seidel (no email) 2010-09-14 03:48:35 PDT
Comment on attachment 67523 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=67523&action=prettypatch

> WebCore/html/parser/HTMLParserIdioms.cpp:95
> +bool parseHTMLInteger(const String& input, int& value)
I swear this must exist elsewhere.
Comment 12 Eric Seidel (no email) 2010-09-14 13:59:19 PDT
Comment on attachment 67523 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=67523&action=prettypatch

> WebCore/html/parser/HTMLParserIdioms.cpp:106
> +    } sign = Positive;
Why not use an int and set it to 1 or -1?

> WebCore/html/parser/HTMLParserIdioms.cpp:139
> +    // Step 6
> +    if (*position == '-') {
>  
> +        // Substep 1
> +        sign = Negative;
> +
> +        // Substep 2
> +        ++position;
> +
> +        // Substep 3
> +        if (position == end)
> +            return false;
> +    } else if (*position == '+') {
> +        // Substep 1
> +        ++position;
> +
> +        // Substep 2
> +        if (position == end)
> +            return false;
> +    }
> +    ASSERT(position < end);
This is all super-redundant.  1.  you can just ignore +.  2.  You could share the position advancement and check code.

I think it's good that we're following the spec closely, but I don't think we need to write it out in such verbosity.  Just document the individual steps with comments.

> WebCore/html/parser/HTMLParserIdioms.cpp:152
> +    int number = WTF::charactersToIntStrict(digits.data(), digits.size());
Slightly sad this doesn't just take a vector or a WTF::String, but I guess taking pointer/len pair makes the implementation simpler?

> WebCore/html/parser/HTMLParserIdioms.cpp:162
> +    switch (sign) {
> +    case Positive:
> +        value = number;
> +        break;
> +    case Negative:
> +        value = -number;
> +        break;
> +    }
Then this becomes number *= sign;  Actually just becomes int number = WTF:::parseFunction() * sign. :)

Bleh. I much prefer results to include their pass fail information.  I agree with your theory that it makes them easier to change.  But I dislike your argument that since at least one result in teh repository requires you to compare against another port's expected, we should make them all require you to compare against expected.  But it's not a deal breaker.

r- for the unnecessary code duplication above.
Comment 13 Darin Adler 2010-09-14 14:02:21 PDT
Comment on attachment 67523 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=67523&action=prettypatch

In answer to one of Eric’s comments:

> WebCore/html/parser/HTMLParserIdioms.cpp:152
> +    int number = WTF::charactersToIntStrict(digits.data(), digits.size());
The function toIntStrict is the version of this that is a member of WTF::String. But the code here is using a vector for better performance. I think what Adam has done is the way to go. We could also do an overload that can take a vector, but I don’t think it’s good.
Comment 14 Adam Barth 2010-09-14 14:24:19 PDT
(In reply to comment #12)
> (From update of attachment 67523 [details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=67523&action=prettypatch
> 
> > WebCore/html/parser/HTMLParserIdioms.cpp:106
> > +    } sign = Positive;
> Why not use an int and set it to 1 or -1?

Ok.

> > WebCore/html/parser/HTMLParserIdioms.cpp:139
> > +    // Step 6
> > +    if (*position == '-') {
> >  
> > +        // Substep 1
> > +        sign = Negative;
> > +
> > +        // Substep 2
> > +        ++position;
> > +
> > +        // Substep 3
> > +        if (position == end)
> > +            return false;
> > +    } else if (*position == '+') {
> > +        // Substep 1
> > +        ++position;
> > +
> > +        // Substep 2
> > +        if (position == end)
> > +            return false;
> > +    }
> > +    ASSERT(position < end);
> This is all super-redundant.  1.  you can just ignore +.  2.  You could share the position advancement and check code.

We can't share the advancement because we don't advance if *position is not + or -.

> > WebCore/html/parser/HTMLParserIdioms.cpp:152
> > +    int number = WTF::charactersToIntStrict(digits.data(), digits.size());
> Slightly sad this doesn't just take a vector or a WTF::String, but I guess taking pointer/len pair makes the implementation simpler?

There's another method to call if you have a String, but we don't.  It's designed to let you parse parts of larger buffers.

> > WebCore/html/parser/HTMLParserIdioms.cpp:162
> > +    switch (sign) {
> > +    case Positive:
> > +        value = number;
> > +        break;
> > +    case Negative:
> > +        value = -number;
> > +        break;
> > +    }
> Then this becomes number *= sign;  Actually just becomes int number = WTF:::parseFunction() * sign. :)

Done.
Comment 15 Adam Barth 2010-09-14 14:25:21 PDT
Created attachment 67605 [details]
Patch
Comment 16 Eric Seidel (no email) 2010-09-14 14:27:55 PDT
Comment on attachment 67605 [details]
Patch

OK.
Comment 17 Darin Adler 2010-09-14 14:31:43 PDT
Comment on attachment 67605 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=67605&action=prettypatch

A clear improvement, but seems to leave some loose ends.

> WebCore/html/HTMLElement.cpp:146
>          if (indexstring.length()) {
I’m surprised you left this alone. This can’t be correct behavior!

> WebCore/html/HTMLElement.cpp:150
>                  // Clamp tabindex to the range of 'short' to match Firefox's behavior.
>                  setTabIndexExplicitly(max(static_cast<int>(std::numeric_limits<short>::min()), min(tabindex, static_cast<int>(std::numeric_limits<short>::max()))));
I’m surprised you left this alone. Is this in the HTML specification?

> WebCore/html/parser/HTMLParserIdioms.cpp:140
> +    value = sign * WTF::charactersToIntStrict(digits.data(), digits.size());
You should not need to qualify with WTF:: here. If you do, that’s a bug in the WTF header.

> LayoutTests/fast/parser/tabindex-parsing-2.html:57
> +    document.getElementById('result').innerHTML += divs[i].tabIndex + '\n'
I agree with Eric’s earlier comment that patches that distinguish passes from failures are better that ones where you just have to know what to expect. Further, in this test’s output you can’t easily see the test cases, so you have to line the test up side by side with its results to see what is being tested. Also there are no test cases here for space characters other than U+0020.
Comment 18 Adam Barth 2010-09-14 15:07:32 PDT
Comment on attachment 67605 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=67605&action=prettypatch

> WebCore/html/HTMLElement.cpp:146
>          if (indexstring.length()) {
The code is wrong, but it's not observable because the empty string fails to parse.

> WebCore/html/HTMLElement.cpp:150
>                  setTabIndexExplicitly(max(static_cast<int>(std::numeric_limits<short>::min()), min(tabindex, static_cast<int>(std::numeric_limits<short>::max()))));
Not as far as I can tell.  The original bug I wanted to solve was that <input tabindex=3/> was broken.  It's easy for changes like this to spider to fixing the world, but I think I'd like to draw the boundary of this patch at the parser.  We can do more work on tabindex in future patches.

> WebCore/html/parser/HTMLParserIdioms.cpp:140
> +    value = sign * WTF::charactersToIntStrict(digits.data(), digits.size());
I think compile failed without that.  I'll try to fix the header.

> LayoutTests/fast/parser/tabindex-parsing-2.html:57
> +    document.getElementById('result').innerHTML += divs[i].tabIndex + '\n'
Ok.  I'll add the test cases, but I'd like to skip the "PASS/FAIL" notations.  I find those to be much more of a pain than a benefit.
Comment 19 Alexey Proskuryakov 2010-09-14 15:28:58 PDT
I agree with Eric and Darin. Tests without PASS/FAIL are only good as long as nothing changes. When it does, or when it needs to change, or when you want to check other browsers' behavior, raw dumps are difficult.

Naturally, there are some cases when there is no golden standard to adhere to, and we just want to dump as much as possible to prevent inadvertent behavior changes in the future. But this clearly isn't such a case.
Comment 20 Adam Barth 2010-09-14 15:39:53 PDT
(In reply to comment #19)
> I agree with Eric and Darin. Tests without PASS/FAIL are only good as long as nothing changes. When it does, or when it needs to change, or when you want to check other browsers' behavior, raw dumps are difficult.
> 
> Naturally, there are some cases when there is no golden standard to adhere to, and we just want to dump as much as possible to prevent inadvertent behavior changes in the future. But this clearly isn't such a case.

Honestly, the PASS/FAIL double-expectation tests are a huge pain when trying to change behavior.  Basically, you need to make the same change redundantly to two different files.  It's not really adding any value.  It's just flies in the face of Don't Repeat Yourself.
Comment 21 Adam Barth 2010-09-14 15:51:21 PDT
I'm going to land this without PASS/FAIL because I'm petchulant.
Comment 22 Adam Barth 2010-09-14 15:53:16 PDT
Committed r67506: <http://trac.webkit.org/changeset/67506>
Comment 23 Alexey Proskuryakov 2010-09-14 15:53:49 PDT
I think that your experience with making a lot of changes at once is not typical. On quieter days, it's a few tests that you need to thoroughly check in all browsers to collect information about which ones pass, and which ones fail.
Comment 24 WebKit Review Bot 2010-09-14 19:44:10 PDT
http://trac.webkit.org/changeset/67506 might have broken Leopard Intel Debug (Tests)
The following changes are on the blame list:
http://trac.webkit.org/changeset/67506
http://trac.webkit.org/changeset/67507