WebKit Bugzilla
New
Browse
Log In
×
Sign in with GitHub
or
Remember my login
Create Account
·
Forgot Password
Forgotten password account recovery
RESOLVED FIXED
21280
AXTables on page are probably not tables
https://bugs.webkit.org/show_bug.cgi?id=21280
Summary
AXTables on page are probably not tables
chris fleizach
Reported
2008-10-01 12:34:03 PDT
Request is to improve table detection. There are two non-table tables on
http://www.msnbc.msn.com/id/26760673/#storyContinued
Attachments
Patch to make AXTable detection more robust
(18.82 KB, patch)
2008-10-01 12:55 PDT
,
chris fleizach
darin
: review+
Details
Formatted Diff
Diff
View All
Add attachment
proposed patch, testcase, etc.
chris fleizach
Comment 1
2008-10-01 12:55:41 PDT
Created
attachment 23986
[details]
Patch to make AXTable detection more robust
Darin Adler
Comment 2
2008-10-01 13:38:39 PDT
Comment on
attachment 23986
[details]
Patch to make AXTable detection more robust I think I would use the word "heuristic" somewhere here to describe what we're doing. To me it seems overkill to look at all the cells in a table. Once you've found borders or colors on 10 different cells, I'm not sure what the value is of examining 100 others to determine that it's "more than half of the cells" that have borders or colors. I think "two rows" was probably too limited, but "every single cell" is probably too open ended. In particular we want good performance on absurdly huge tables, and querying every cell of such tables is obviously not helpful. + // considered a bordered cell. Try the cell borders and the style's borders I don't see code corresponding to the style's borders. The check for background differences isn't quite right -- it doesn't account for transparent colors. + unsigned neededCellCount = (validCellCount >> 1); It's bad style to do ">> 1" when you really mean "/ 2" and doesn't make the code any faster. We also don't usually put parentheses around this operation. I'm going to say r=me, but I think it's worth pondering further whether it's sensible to iterate the entire table even for pathologically huge tables.
chris fleizach
Comment 3
2008-10-01 14:32:30 PDT
that's reasonable. i can put in a limit to how many cells are checked at max. Let's say 10 cells
chris fleizach
Comment 4
2008-10-02 17:01:05 PDT
http://trac.webkit.org/changeset/37216
Note
You need to
log in
before you can comment on or make changes to this bug.
Top of Page
Format For Printing
XML
Clone This Bug