Apple.com seems to have adopted a technique where they hide accessibility text all over the page in spans with this style:
clip: rect(1px, 1px, 1px, 1px);
-webkit-clip-path: inset(0px 0px 99.9% 99.9%);
clip-path: inset(0px 0px 99.9% 99.9%);
If you use find-in-page to search for text inside of one of these elements, we happily return it as a potential result, though the text is not visible, and the find hole and highlight end up being 1x1 px.
I'm not sure the best way or place to detect this, but making TextIterator's `fullyClipsContents` return true if the contentSize's area is <= 1px certainly fixes it (and makes find on these pages feel much more sensible).
Steps to Reproduce:
1. Search for 'apple' or 'iphone' on apple.com
The heuristic about very small sizes makes sense to me and seems worth doing rather than a lot of soul searching. I think we could consider a threshold even larger than 1px, based on some assumptions about the smallest size for readable text.
What is not so clear to me is which part of this is the key. The small size, the clipping. It would be nice to have the heuristic consider the minimum clue necessary to help ensure it works on the widest number of pages.
Hm.. maybe 2-3px box would be too small? I think anything bigger than that, we run the risk of things being visible when zoomed, or the author intentionally showing really tiny text.
Created attachment 372136 [details]
Hilariously, the overlay test failures are legit, because that test puts its log content in a "position: absolute; height: 1px; width: 1px; overflow: hidden;" div (for some reason), and now TextIterator decides not to dump that text when dumpAsText-ing the layout test.
Not really sure we can change global TextIterator behavior that much (Darin? Ryosuke? smfr is not on board anymore)
The other option is to add a TextIteratorOption that find uses, but it seems weiiiird for plainText() and find to include different text.
(In reply to Tim Horton from comment #7)
> Hilariously, the overlay test failures are legit, because that test puts its
> log content in a "position: absolute; height: 1px; width: 1px; overflow:
> hidden;" div (for some reason), and now TextIterator decides not to dump
> that text when dumpAsText-ing the layout test.
> Not really sure we can change global TextIterator behavior that much (Darin?
> Ryosuke? smfr is not on board anymore)
I think this kind of change is OK. I can’t think of any concrete use of TextIterator where iterating this invisible text is a plus. We should change those tests to not depend on this.
(In reply to Tim Horton from comment #8)
> The other option is to add a TextIteratorOption that find uses, but it seems
> weiiiird for plainText() and find to include different text.
We need to go through the different TextIterator use types. There are probably still fewer than 10. I doubt we can find any where it’s a benefit to include this kind of invisible text.
The use of TextIterator in dumpAsText is probably the least "legit" use of it. It's basically designed for use in Copy and Find, and then we found many uses for that in text editing as well.
If there’s someone arguing that such invisible text *should* continue to be made visible when you copy and paste, I suggest that’s what we should discuss, not the abstract "what should a text iterator do".
TextIterator is also used to implement innerText for now, and that API's behavior shouldn't change.
(In reply to Ryosuke Niwa from comment #11)
> TextIterator is also used to implement innerText for now, and that API's
> behavior shouldn't change.
Hmm, I assumed that set TextIteratorIgnoresStyleVisibility but it does not.
Time for another flag!
Seems risky that innerText is possibly important for website compatibility but it’s not precisely specified.
OK, so if innerText does not set TextIteratorIgnoresStyleVisibility:
1) I think it should.
2) I think then that changing to treat 1x1 text as invisible is a good idea anyway.
3) But why would any of these need to be web-visible.
I do not want to hold off on fixing this bug forever just because of innerText!