Bug 7761

Summary:

Tabs in class attribute not treated as whitespace

Product:

WebKit

Reporter:

mitz

Component:

DOM

Assignee:

Nobody <webkit-unassigned>

Status:

VERIFIED FIXED

Severity:

Normal

CC:

cdumez, darin

Priority:

Version:

420+

Hardware:

Mac

OS:

OS X 10.4

Attachments:

Description	Flags
test case	none
Treat \t and \r as whitespace	darin: review-
Revised patch	darin: review+

mitz

Reported 2006-03-13 22:29:21 PST

Tab characters are considered part of the class name they precede/follow. See the test case, which renders correctly in Firefox.

Attachments
test case (165 bytes, text/html) 2006-03-13 22:29 PST, mitz	no flags	Details
Treat \t and \r as whitespace (7.99 KB, patch) 2006-03-14 15:29 PST, mitz	darin: review-	Details Formatted Diff Diff
Revised patch (8.44 KB, patch) 2006-03-15 10:01 PST, mitz	darin: review+	Details Formatted Diff Diff
Show Obsolete (1) View All Add attachment proposed patch, testcase, etc.

mitz

Comment 1 2006-03-13 22:29:57 PST

Created attachment 7057 [details] test case

mitz

Comment 2 2006-03-14 15:29:17 PST

Created attachment 7064 [details] Treat \t and \r as whitespace

Darin Adler

Comment 3 2006-03-14 15:46:59 PST

Comment on attachment 7064 [details] Treat \t and \r as whitespace Looks great. We wanted to get this class off of QStringList anyway, so hooray for that! OK, let me now be crazy and make premature optimization comments: + while (classAttr[sPos] && (classAttr[sPos] == ' ' || classAttr[sPos] == '\r' || classAttr[sPos] == '\n' || classAttr[sPos] == '\t')) + sPos++; The code above is a little inefficient. First of all, it checks that the character is non-null before comparing it with three specific characters. The check for 0 is not needed because if it's ' ' or '\r' or '\n' or '\t', then it's definitely not 0. Each call to the [] operator does a separate comparison with the length of the string so it would be nicer if we didn't have to index over and over again. If you made a little helper function then you could write: while (isClassWhitespace(classAttr[sPos])) sPos++; which might be clearer anyway. Checking for the null character to detect the end of the string is intrinsically inefficient for our Platform::String class, because it has a length stored, not a trailing null. So if you were to take classAttr.unicode() and classAttr.length() and work directly, you'd end up with something more efficient. Something like this: str = classAttr.unicode(); while (sPos < length && isClassWhitespace(str[sPos])) ++sPos; if (sPos >= length) break; int ePos = sPos; while (ePos < length && !isClassWhitespace(str[ePos])) ++sPos; But I can't tell whether this kind of tweaking makes the code better or worse. ePos should not be declared outside the loop. Instead it should be declared right where it's used. Do we want to optimize the common case where sPos is 0 and ePos is 1 so that we don't create a new atomic string for that case? Not sure what to do about these nitpicks. The code is good as is. I'm going to mark this review-, but it's practically a review+.

Darin Adler

Comment 4 2006-03-14 15:47:31 PST

(In reply to comment #3) > (From update of attachment 7064 [details] [edit]) > Do we want to optimize the common case where sPos is 0 and ePos is 1 so that we > don't create a new atomic string for that case? I meant where sPos is 0 and ePos is len.

mitz

Comment 5 2006-03-15 10:01:27 PST

Created attachment 7089 [details] Revised patch (In reply to comment #3) > OK, let me now be crazy and make premature optimization comments: Yay! I followed your suggestions in this version. > Do we want to optimize the common case [...] so that we > don't create a new atomic string for that case? In that case, this patch uses AtomicString(classAttr) instead of AtomicString(str, length), but I almost convinced myself that there is no difference. > Not sure what to do about these nitpicks. The code is good as is. Not really... it was also checking each boundary character twice!

Darin Adler

Comment 6 2006-03-15 10:38:43 PST

(In reply to comment #5) > In that case, this patch uses AtomicString(classAttr) instead of > AtomicString(str, length), but I almost convinced myself that there is no > difference. You're right, there is no difference. To make a difference we'd have to change the code path so that the attribute stays an atomic string. Currently the attribute starts out an atomic string in the attribute, but then: 1) AttrImpl::value() changes it into a plain string (should be changed to return AtomicString) 2) calling parseClassAttribute would also change it into a plain string (should be changed to pass an AtomicString) Also, lowercasing inside parseClassAttribute has to be done carefully; the code does seem to be doing that right already. But if we took care of all of those, then we could have a fast path. I'm not certain it would matter, but it seems like it would be a win. But lets not require that be in this patch.

Darin Adler

Comment 7 2006-03-15 10:40:25 PST

Comment on attachment 7089 [details] Revised patch r=me

mitz

Comment 8 2006-03-19 11:22:37 PST

Verified in r13387 nightly.

Lucas Forschler

Comment 9 2019-02-06 09:03:35 PST

Mass moving XML DOM bugs to the "DOM" Component.

Note You need to log in before you can comment on or make changes to this bug.