RESOLVED FIXED 14453
REGRESSION: www.nzherald.co.nz almost all the formatting is gone
https://bugs.webkit.org/show_bug.cgi?id=14453
Summary REGRESSION: www.nzherald.co.nz almost all the formatting is gone
Darren Collins
Reported 2007-06-28 20:23:25 PDT
Almost all of the formating is gone from http://www.nzherald.co.nz when viewed with local debug build of r23865. The formatting is present when using nightly build of r23841
Attachments
Broken formatting (132.09 KB, image/png)
2007-06-28 20:28 PDT, Darren Collins
no flags
Formatted layout (nightly r23841) (406.43 KB, image/png)
2007-06-28 20:29 PDT, Darren Collins
no flags
Reduction (120 bytes, text/html)
2007-06-29 03:25 PDT, mitz
no flags
Fix for regression caused by my previous patch (1.11 KB, patch)
2007-06-30 15:21 PDT, Rob Buis
no flags
Now with testcase (7.80 KB, patch)
2007-06-30 15:47 PDT, Rob Buis
darin: review-
Reduction: Bogus span color definition (126 bytes, text/html)
2007-06-30 17:25 PDT, Darren Collins
no flags
Another attempt (119.25 KB, patch)
2007-07-02 06:23 PDT, Rob Buis
no flags
This time with testcase and against ToT (7.49 KB, patch)
2007-07-04 06:50 PDT, Rob Buis
darin: review-
Improved patch and testcase (10.37 KB, patch)
2007-07-06 04:25 PDT, Rob Buis
darin: review-
Small fixes (10.36 KB, patch)
2007-07-08 09:29 PDT, Rob Buis
darin: review+
Darren Collins
Comment 1 2007-06-28 20:28:12 PDT
Created attachment 15304 [details] Broken formatting
Darren Collins
Comment 2 2007-06-28 20:29:34 PDT
Created attachment 15305 [details] Formatted layout (nightly r23841)
mitz
Comment 3 2007-06-29 01:33:50 PDT
mitz
Comment 4 2007-06-29 03:25:56 PDT
Created attachment 15312 [details] Reduction
Rob Buis
Comment 5 2007-06-30 15:21:12 PDT
Created attachment 15333 [details] Fix for regression caused by my previous patch I hope this is enough to get all the semi-legal hex color values. I say we go for that for now, we can always revert the original patch later if worst comes to worst. Cheers, Rob.
mitz
Comment 6 2007-06-30 15:34:46 PDT
Comment on attachment 15333 [details] Fix for regression caused by my previous patch I don't understand the language so I can't review the code, but a regression test is required.
Rob Buis
Comment 7 2007-06-30 15:47:06 PDT
Created attachment 15334 [details] Now with testcase Mitz pointed out a testcase was needed, there you go. Cheers, Rob.
Darren Collins
Comment 8 2007-06-30 16:27:03 PDT
If I am reading it correctly, the patch seems to now allow CSS colors of 2 to 6 hex values, rather than just 3 or 6. The test case is for 2 hex values, which will fix the defect reduction. But what will happen with 4 or 5 hex values? Is that really valid? Shouldn't there be a testcase for those too?
Darren Collins
Comment 9 2007-06-30 17:25:07 PDT
Created attachment 15335 [details] Reduction: Bogus span color definition This new reduction shows that the defect is due to a bogus color attribute for the span element. In Safari 3.0.2 and Firefox/IE the bogus value is ignored, and subsequent CSS elements are processed. However, with r23841, the bogus value stops further CSS elements from being applied. So I think the fix isn't to allow invalid shorthand color values like #f0, but instead gracefully ignore the bogus value and continue as per 3.0.2.
Darin Adler
Comment 10 2007-06-30 18:00:20 PDT
Comment on attachment 15334 [details] Now with testcase r=me
Darin Adler
Comment 11 2007-07-01 16:49:00 PDT
Committed revision 23919. Oops, only after landing this fix do I see Darren Collins's comment. Perhaps we should roll the change out and do a better fix?
Darin Adler
Comment 12 2007-07-01 19:24:47 PDT
Layout test was failing. Rolling out the change.
Darin Adler
Comment 13 2007-07-01 19:25:11 PDT
Comment on attachment 15334 [details] Now with testcase When I checked this in, the layout test didn't pass.
Rob Buis
Comment 14 2007-07-02 06:23:29 PDT
Created attachment 15351 [details] Another attempt Wow, this one is hard to get right. I think this one is better, it fixes the new testcase. Unfortunately I am not able to make a new patch against ToT (this one is against the feature branch) for at least one more day. Cheers, Rob.
Darren Collins
Comment 15 2007-07-03 05:06:12 PDT
*** Bug 14509 has been marked as a duplicate of this bug. ***
Mark Rowe (bdash)
Comment 16 2007-07-04 04:00:20 PDT
Rob, it looks like you omitted the html layout test files from your most recent patch?
Rob Buis
Comment 17 2007-07-04 06:50:03 PDT
Created attachment 15385 [details] This time with testcase and against ToT Should be clear from the description :) This one fixes the regression and doesnt regress otherwise. Cheers, Rob.
Darin Adler
Comment 18 2007-07-05 00:25:40 PDT
Comment on attachment 15385 [details] This time with testcase and against ToT This looks good, but not quite right. With this patch, every case in the parser that allows HEX_OR_IDENT also allows IDSEL, so it doesn't seem good to have two different tokens for these. For the "specifier" production we allow characters that start with a numeric character, as long as they happen to be 3 or 6 hex digits. That doesn't seem right to me. Is our handling of such selectors consistent with other CSS parsers? I think that certain types of illegal colors would still malfunction with our engine. For example, "#2bogus". HEX_OR_IDENT is a strange name for a token that is # followed by either 3 or 6 hexadecimal digits. So if we keep it we should probably rename it. Would you be willing to clean this up a bit more and straighten it out?
David Kilzer (:ddkilzer)
Comment 19 2007-07-05 22:31:10 PDT
Rob Buis
Comment 20 2007-07-06 04:25:16 PDT
Created attachment 15416 [details] Improved patch and testcase This one has some better names, a check for numbers as first character of an id selector and an extended test. Cheers, Rob.
Darin Adler
Comment 21 2007-07-08 05:10:04 PDT
Comment on attachment 15416 [details] Improved patch and testcase It's not portable to call isdigit on a 16-bit character and count on it returning false for non-ASCII values. In particular I think it either doesn't compile at all or warns on Windows. Instead the code should just compare the UChar to check if it's >= '0' and <= '9'. We should probably add some helpful ASCII versions of ctype.h functions too, because the ctype.h functions change their behavior based on the current locale and we really don't want that. I think Sam Weinig is working on that. Sorry to keep nitpicking this, but I still don't think that HASH is a clear name for one of two different tokens that both start with "#". My original comment still stands -- every code path for IDSEL and HASH is identical and it's not helpful that the lexer gives two different tokens for these two different cases. They should be combined into a single token. I think HASH *would* be a reasonable name for that single token.
Rob Buis
Comment 22 2007-07-08 09:27:45 PDT
Hi Darin, (In reply to comment #21) > (From update of attachment 15416 [details] [edit]) > It's not portable to call isdigit on a 16-bit character and count on it > returning false for non-ASCII values. In particular I think it either doesn't > compile at all or warns on Windows. Instead the code should just compare the > UChar to check if it's >= '0' and <= '9'. We should probably add some helpful > ASCII versions of ctype.h functions too, because the ctype.h functions change > their behavior based on the current locale and we really don't want that. I > think Sam Weinig is working on that. Right, not using isdigit anymore in my upcoming patch. > Sorry to keep nitpicking this, but I still don't think that HASH is a clear > name for one of two different tokens that both start with "#". My original I am going for the name HEX for now. > comment still stands -- every code path for IDSEL and HASH is identical and > it's not helpful that the lexer gives two different tokens for these two > different cases. They should be combined into a single token. I think HASH > *would* be a reasonable name for that single token. I would like a single token too, the problem I have is that things like \0031 in an id selector get translated to simply '1', at which point I can't distinguish it from an id sel of #1, which also becomes '1'. Unfortunately I only see the two labels as a solution for this. I am open for suggestions though :) Cheers, Rob.
Rob Buis
Comment 23 2007-07-08 09:29:18 PDT
Created attachment 15444 [details] Small fixes Here I avoid using isdigit and rename HASH to HEX. Still not sure about the two label issue... Cheers, Rob.
Maciej Stachowiak
Comment 24 2007-07-09 01:21:09 PDT
I think you need to have two labels due to the escapes issue you describe. They are allowed in id selectors but not hex colors, but can result in the same effective token.
Darin Adler
Comment 25 2007-07-11 00:16:36 PDT
Comment on attachment 15444 [details] Small fixes r=me
David Kilzer (:ddkilzer)
Comment 26 2007-07-11 21:09:24 PDT
Fixed by rwlbuis in r24185.
George Staikos
Comment 27 2007-07-11 21:58:48 PDT
Since these color changes I've had lots of problems with color parsing. <td style="background-color: green"> works while <td style="background-color: #00ff00"> doesn't. It fails on Linux/Qt, but not Mac/Qt. I think it has to be bison/flex related and the versions are: Linux Bison: 2.1 Flex: 2.5.31 OS X Bison: 1.28 Flex: 2.5.4
Cameron Zwarich (cpst)
Comment 28 2007-07-12 01:46:47 PDT
It is definitely a bison / flex problem. Using Bison 2.1 and Flex 2.5.31 causes the bug to occur on Mac OS X 10.4 as well. I am going to try and see exactly which versions of Bison and Flex break it.
Cameron Zwarich (cpst)
Comment 29 2007-07-12 02:03:51 PDT
Switching to the latest version of flex (2.5.33) fixes the bug. With the prior version (2.5.27), the CSS parser doesn't even compile. The flex SourceForge page <http://flex.sourceforge.net/> notes that 2.5.31 is rather buggy and discourages its use. Therefore, it is almost surely a bug in that older version of flex that causes the problem in Linux.
Note You need to log in before you can comment on or make changes to this bug.