You need to
before you can comment on or make changes to this bug.
Almost all of the formating is gone from http://www.nzherald.co.nz when viewed with local debug build of r23865.
The formatting is present when using nightly build of r23841
Created an attachment (id=15304) [details]
Created an attachment (id=15305) [details]
Formatted layout (nightly r23841)
Created an attachment (id=15312) [details]
Created an attachment (id=15333) [details]
Fix for regression caused by my previous patch
I hope this is enough to get all the semi-legal hex color values. I say we go for that for now, we can always revert the original patch later if worst comes to worst.
(From update of attachment 15333 [details])
I don't understand the language so I can't review the code, but a regression test is required.
Created an attachment (id=15334) [details]
Now with testcase
Mitz pointed out a testcase was needed, there you go.
If I am reading it correctly, the patch seems to now allow CSS colors of 2 to 6 hex values, rather than just 3 or 6. The test case is for 2 hex values, which will fix the defect reduction. But what will happen with 4 or 5 hex values? Is that really valid? Shouldn't there be a testcase for those too?
Created an attachment (id=15335) [details]
Reduction: Bogus span color definition
This new reduction shows that the defect is due to a bogus color attribute for the span element. In Safari 3.0.2 and Firefox/IE the bogus value is ignored, and subsequent CSS elements are processed. However, with r23841, the bogus value stops further CSS elements from being applied. So I think the fix isn't to allow invalid shorthand color values like #f0, but instead gracefully ignore the bogus value and continue as per 3.0.2.
(From update of attachment 15334 [details])
Committed revision 23919.
Oops, only after landing this fix do I see Darren Collins's comment.
Perhaps we should roll the change out and do a better fix?
Layout test was failing. Rolling out the change.
(From update of attachment 15334 [details])
When I checked this in, the layout test didn't pass.
Created an attachment (id=15351) [details]
Wow, this one is hard to get right. I think this one is better, it fixes the new testcase. Unfortunately I am not able to make a new patch against ToT (this one is against the feature branch) for at least one more day.
*** Bug 14509 has been marked as a duplicate of this bug. ***
Rob, it looks like you omitted the html layout test files from your most recent patch?
Created an attachment (id=15385) [details]
This time with testcase and against ToT
Should be clear from the description :) This one fixes the regression and doesnt regress otherwise.
(From update of attachment 15385 [details])
This looks good, but not quite right. With this patch, every case in the parser that allows HEX_OR_IDENT also allows IDSEL, so it doesn't seem good to have two different tokens for these.
For the "specifier" production we allow characters that start with a numeric character, as long as they happen to be 3 or 6 hex digits. That doesn't seem right to me. Is our handling of such selectors consistent with other CSS parsers?
I think that certain types of illegal colors would still malfunction with our engine. For example, "#2bogus".
HEX_OR_IDENT is a strange name for a token that is # followed by either 3 or 6 hexadecimal digits. So if we keep it we should probably rename it.
Would you be willing to clean this up a bit more and straighten it out?
Created an attachment (id=15416) [details]
Improved patch and testcase
This one has some better names, a check for numbers as first character of an id selector and an extended test.
(From update of attachment 15416 [details])
It's not portable to call isdigit on a 16-bit character and count on it returning false for non-ASCII values. In particular I think it either doesn't compile at all or warns on Windows. Instead the code should just compare the UChar to check if it's >= '0' and <= '9'. We should probably add some helpful ASCII versions of ctype.h functions too, because the ctype.h functions change their behavior based on the current locale and we really don't want that. I think Sam Weinig is working on that.
Sorry to keep nitpicking this, but I still don't think that HASH is a clear name for one of two different tokens that both start with "#". My original comment still stands -- every code path for IDSEL and HASH is identical and it's not helpful that the lexer gives two different tokens for these two different cases. They should be combined into a single token. I think HASH *would* be a reasonable name for that single token.
(In reply to comment #21)
> (From update of attachment 15416 [details] )
> It's not portable to call isdigit on a 16-bit character and count on it
> returning false for non-ASCII values. In particular I think it either doesn't
> compile at all or warns on Windows. Instead the code should just compare the
> UChar to check if it's >= '0' and <= '9'. We should probably add some helpful
> ASCII versions of ctype.h functions too, because the ctype.h functions change
> their behavior based on the current locale and we really don't want that. I
> think Sam Weinig is working on that.
Right, not using isdigit anymore in my upcoming patch.
> Sorry to keep nitpicking this, but I still don't think that HASH is a clear
> name for one of two different tokens that both start with "#". My original
I am going for the name HEX for now.
> comment still stands -- every code path for IDSEL and HASH is identical and
> it's not helpful that the lexer gives two different tokens for these two
> different cases. They should be combined into a single token. I think HASH
> *would* be a reasonable name for that single token.
I would like a single token too, the problem I have is that things like \0031 in an id selector get translated to simply '1', at which point I can't distinguish it from an id sel of #1, which also becomes '1'.
Unfortunately I only see the two labels as a solution for this. I am open for suggestions though :)
Created an attachment (id=15444) [details]
Here I avoid using isdigit and rename HASH to HEX. Still not sure about the two label issue...
I think you need to have two labels due to the escapes issue you describe. They are allowed in id selectors but not hex colors, but can result in the same effective token.
(From update of attachment 15444 [details])
Fixed by rwlbuis in r24185.
Since these color changes I've had lots of problems with color parsing.
<td style="background-color: green"> works while <td style="background-color: #00ff00"> doesn't. It fails on Linux/Qt, but not Mac/Qt. I think it has to be bison/flex related and the versions are:
It is definitely a bison / flex problem. Using Bison 2.1 and Flex 2.5.31 causes the bug to occur on Mac OS X 10.4 as well. I am going to try and see exactly which versions of Bison and Flex break it.
Switching to the latest version of flex (2.5.33) fixes the bug. With the prior version (2.5.27), the CSS parser doesn't even compile. The flex SourceForge page <http://flex.sourceforge.net/> notes that 2.5.31 is rather buggy and discourages its use. Therefore, it is almost surely a bug in that older version of flex that causes the problem in Linux.