Bug 21280 - AXTables on page are probably not tables
Summary: AXTables on page are probably not tables
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: Accessibility (show other bugs)
Version: 528+ (Nightly build)
Hardware: Mac OS X 10.5
: P2 Normal
Assignee: Nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-10-01 12:34 PDT by chris fleizach
Modified: 2008-10-02 17:01 PDT (History)
0 users

See Also:


Attachments
Patch to make AXTable detection more robust (18.82 KB, patch)
2008-10-01 12:55 PDT, chris fleizach
darin: review+
Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description chris fleizach 2008-10-01 12:34:03 PDT
Request is to improve table detection. There are two non-table tables on

http://www.msnbc.msn.com/id/26760673/#storyContinued
Comment 1 chris fleizach 2008-10-01 12:55:41 PDT
Created attachment 23986 [details]
Patch to make AXTable detection more robust
Comment 2 Darin Adler 2008-10-01 13:38:39 PDT
Comment on attachment 23986 [details]
Patch to make AXTable detection more robust

I think I would use the word "heuristic" somewhere here to describe what we're doing.

To me it seems overkill to look at all the cells in a table. Once you've found borders or colors on 10 different cells, I'm not sure what the value is of examining 100 others to determine that it's "more than half of the cells" that have borders or colors. I think "two rows" was probably too limited, but "every single cell" is probably too open ended. In particular we want good performance on absurdly huge tables, and querying every cell of such tables is obviously not helpful.

+            // considered a bordered cell. Try the cell borders and the style's borders

I don't see code corresponding to the style's borders.

The check for background differences isn't quite right -- it doesn't account for transparent colors.

+    unsigned neededCellCount = (validCellCount >> 1);

It's bad style to do ">> 1" when you really mean "/ 2" and doesn't make the code any faster. We also don't usually put parentheses around this operation.

I'm going to say r=me, but I think it's worth pondering further whether it's sensible to iterate the entire table even for pathologically huge tables.
Comment 3 chris fleizach 2008-10-01 14:32:30 PDT
that's reasonable. i can put in a limit to how many cells are checked at max. Let's say 10 cells
Comment 4 chris fleizach 2008-10-02 17:01:05 PDT
http://trac.webkit.org/changeset/37216