Bug 30883

Summary: [Gtk] Implement AtkText for HTML elements which contain text
Product: WebKit Reporter: Joanmarie Diggs <jdiggs>
Component: AccessibilityAssignee: Nobody <webkit-unassigned>
Status: RESOLVED FIXED    
Severity: Normal CC: apinheiro, commit-queue, walker.willie, webkit.review.bot, xan.lopez
Priority: P2 Keywords: Gtk
Version: 528+ (Nightly build)   
Hardware: PC   
OS: Linux   
Bug Depends on:    
Bug Blocks: 25531, 25677, 26991    
Attachments:
Description Flags
proposed patch
none
caret offset and event adjustments
xan.lopez: review-
proposed patch redux (comments added)
none
caret offset and event adjustments - take 2
none
get rid of a needless variable
none
One more tweak....
none
Handle the issue from comment #28, add sanity checks to objectAndOffsetUnignored, new tests, style compliance changes....
none
Handle the issue from comment #28, add sanity checks to objectAndOffsetUnignored, new tests, style compliance changes....
none
Handle the issue from comment #28, add sanity checks to objectAndOffsetUnignored, new tests none

Description Joanmarie Diggs 2009-10-28 14:55:02 PDT
Currently, given any HTML element which contains text, we see a hierarchy like:

--> HTML Element
    --> ROLE_TEXT

Or in the case of text which is partially marked up with styles or has line breaks.

--> HTML Element
    --> ROLE_TEXT
    --> ROLE_TEXT
    --> ROLE_TEXT

This is not what we see elsewhere (OOo, Gecko, most Gtk+ apps). As a result, custom handling is needed on the AT-side. In particular:

1. to identify the functional object behind text events (e.g. caret-moved)
2. to identify the text associated with focusable elements which emit focus:/state-changed:focused events

Therefore, if it wouldn't be too much work on the WebKit side for HTML elements to implement AtkText instead, it would be awesome.
Comment 1 Joanmarie Diggs 2009-10-28 14:58:24 PDT
Marking as depends on bug 25675 as I assume we'll want to be able to count on having a single text object before this bug can be addressed. (Feel free to correct me if I'm wrong. :-) )
Comment 2 Joanmarie Diggs 2009-12-28 13:04:43 PST
(In reply to comment #1)
> Marking as depends on bug 25675 as I assume we'll want to be able to count on
> having a single text object before this bug can be addressed. (Feel free to
> correct me if I'm wrong. :-) )

I'm going to correct myself as it seems I was wrong. Due to some already-implemented WebKit goodness somewhere, the mere act of implementing AtkText for objects whose children are static text seems to be sufficient to cause us to have a single text object as expected. Thus we don't first want to collapse (create) a single static text object. We just want to implement the fix for this bug and the issue raised in bug 25675 will disappear.

I'll take this one. And I'm marking 25675 as a dup.
Comment 3 Joanmarie Diggs 2009-12-28 13:05:49 PST
*** Bug 25675 has been marked as a duplicate of this bug. ***
Comment 4 Joanmarie Diggs 2009-12-30 19:47:46 PST
Created attachment 45696 [details]
proposed patch

This is the bulk of it (and it might be all of it):

* Implements AtkText for HTML elements which contain text
* Eliminates the now-extraneous objects of ROLE_TEXT from the hierarchy
* get_text_{after,at,before}_offset and get_text work as expected
* Layout tests for accessibility are verified to continue to work as expected
* Unit tests for accessibility updated and verified to work as expected

Still on the to-do list:

1. Verify that events work as expected -- and if not, fix that
2. Add a new unit test or two

However as this patch is complete (and MUCH needed and has two bugs now depending upon it), a review would be most appreciated. Thanks!
Comment 5 WebKit Review Bot 2009-12-30 19:51:00 PST
style-queue ran check-webkit-style on attachment 45696 [details] without any errors.
Comment 6 Joanmarie Diggs 2010-01-02 09:00:45 PST
Created attachment 45740 [details]
caret offset and event adjustments

The first patch flattens the hierarchy, but the reported caret offset and the events emitted were still with respect to the static text objects (which are no longer in the hierarchy). This patch fixes those things. As a side effect, it also fixes bug 25669. Yea! :-)
Comment 7 WebKit Review Bot 2010-01-02 09:01:10 PST
style-queue ran check-webkit-style on attachment 45740 [details] without any errors.
Comment 8 Joanmarie Diggs 2010-01-02 09:07:21 PST
*** Bug 25669 has been marked as a duplicate of this bug. ***
Comment 9 Joanmarie Diggs 2010-01-05 09:27:37 PST
Created attachment 45896 [details]
proposed patch redux (comments added)

Xan and I went over my proposed patch via IRC (thanks!) and decided that it was in need of comments. :-) Done.
Comment 10 WebKit Review Bot 2010-01-05 09:29:35 PST
style-queue ran check-webkit-style on attachment 45896 [details] without any errors.
Comment 11 Xan Lopez 2010-01-06 12:45:17 PST
Comment on attachment 45896 [details]
proposed patch redux (comments added)

Looks good!
Comment 12 WebKit Commit Bot 2010-01-06 18:05:09 PST
Comment on attachment 45896 [details]
proposed patch redux (comments added)

Clearing flags on attachment: 45896

Committed r52890: <http://trac.webkit.org/changeset/52890>
Comment 13 Xan Lopez 2010-01-07 03:17:37 PST
Comment on attachment 45740 [details]
caret offset and event adjustments

>From 7de96d0a93b5b44e0dc32bb141a4b52b890e15b1 Mon Sep 17 00:00:00 2001
>From: Joanmarie Diggs <jd@vblockhead.(none)>
>Date: Sat, 2 Jan 2010 11:14:37 -0500
>Subject: [PATCH 2/2] 2010-01-02  Joanmarie Diggs  <joanmarie.diggs@gmail.com>
>
>        Reviewed by NOBODY (OOPS!).
>
>        https://bugs.webkit.org/show_bug.cgi?id=30883
>        [Gtk] Implement AtkText for HTML elements which contain text

We need some kind of explanation here of what the patch does...

>
>        * accessibility/gtk/AccessibilityObjectWrapperAtk.h
>        * accessibility/gtk/AccessibilityObjectWrapperAtk.cpp:
>        (objectAndOffsetUnignored):
>        (webkit_accessible_text_get_caret_offset):
>        * editing/gtk/SelectionControllerGtk.cpp:
>        (SelectionController::notifyAccessibilityForSelectionChange)
>---


> static gint webkit_accessible_text_get_caret_offset(AtkText* text)
> {
>+    AccessibilityObject* coreObject = core(text);
>+    RenderObject* focusedNode = coreObject->selection().end().node()->renderer();
>+    AccessibilityObject* focusedObject = coreObject->document()->axObjectCache()->getOrCreate(focusedNode);
>+
>+    AccessibilityObject* object;
>+    int offset;
>+    objectAndOffsetUnignored(focusedObject, object, offset, !coreObject->isLink());

Not sure I see a reason to not use the return value of the function for, say, the 'realObject'?

Also, can you explain a bit the logic behind the last boolean parameter? In the function below I see that it's used to decide whether to go one level up or not when the object containing the text is a link, but I'm not sure I get why here you pass !isLink(). If it's not a link you want to ignore them?


>+
>     // TODO: Verify this for RTL text.
>-    return core(text)->selection().end().offsetInContainerNode();
>+    return offset;
> }
> 
> static AtkAttributeSet* webkit_accessible_text_get_run_attributes(AtkText* text, gint offset, gint* start_offset, gint* end_offset)
>@@ -1729,4 +1737,23 @@ AtkObject* webkit_accessible_get_focused_element(WebKitAccessible* accessible)
>     return focusedObj->wrapper();
> }
> 
>+void objectAndOffsetUnignored(AccessibilityObject* coreObject, AccessibilityObject*& realObject, int& offset, bool ignoreLinks)
>+{
>+
>+    Node* endNode = static_cast<AccessibilityRenderObject*>(coreObject)->renderer()->node();
>+    int endOffset = coreObject->selection().end().offsetInContainerNode();
>+
>+    realObject = coreObject;
>+    if (realObject->accessibilityIsIgnored())
>+        realObject = realObject->parentObjectUnignored();

This can only happen once? Any specific cases in mind?

>+
>+    if (ignoreLinks && realObject->isLink())
>+        realObject = realObject->parentObjectUnignored();
>+
>+    RefPtr<Range> range = rangeOfContents(static_cast<AccessibilityRenderObject*>(realObject)->renderer()->node());
>+    ExceptionCode ec = 0;
>+    range->setEndBefore(endNode, ec);

We all know this will fail one day and we will say 'oh damn, should have checked that value' :]

>+    offset = range->text().length() + endOffset;
>+}
>+
> #endif // HAVE(ACCESSIBILITY)

Marking r- to clarify my questions/doubts.
Comment 14 Joanmarie Diggs 2010-01-09 12:57:17 PST
Thanks for the review! New patch to follow shortly.

>        Reviewed by NOBODY (OOPS!).
> >
> >        https://bugs.webkit.org/show_bug.cgi?id=30883
> >        [Gtk] Implement AtkText for HTML elements which contain text
> 
> We need some kind of explanation here of what the patch does...

Will do.

> static gint webkit_accessible_text_get_caret_offset(AtkText* text)
> > {
> >+    AccessibilityObject* coreObject = core(text);
> >+    RenderObject* focusedNode = coreObject->selection().end().node()->renderer();
> >+    AccessibilityObject* focusedObject = coreObject->document()->axObjectCache()->getOrCreate(focusedNode);
> >+
> >+    AccessibilityObject* object;
> >+    int offset;
> >+    objectAndOffsetUnignored(focusedObject, object, offset, !coreObject->isLink());
> 
> Not sure I see a reason to not use the return value of the function for, say,
> the 'realObject'?

Will change.

> Also, can you explain a bit the logic behind the last boolean parameter?
 
Will add comments to the code which will hopefully clarify/remind us of what's taking place.

> In the
> function below I see that it's used to decide whether to go one level up or not
> when the object containing the text is a link, but I'm not sure I get why here
> you pass !isLink(). If it's not a link you want to ignore them?

Exactly. :-)

Here's the deal: In flattening the hierarchy w.r.t. text objects, we're no longer exposing all of those RenderText objects to ATs in the form of ATK_ROLE_TEXT. We are still exposing links however. (Ultimately, that too may be something that we can "collapse" fully into the parent object, but I'm not yet sure if that's a good idea.)

Links now implement the AtkText just like paragraphs, etc. So given:

<body>
<p>This is a <a href="foo">test</a>.</p>
</body>

We have this hierarchy:

-> Document Frame
   -> Paragraph (accessible text: "This is a test.")
      -> Link (accessible text: "test")

Thus an AT interested in the offset for an AtkText object might be asking us for the offset in the paragraph, or it may be asking for the offset in the link. As far as WebKit is concerned, however, the caret is in neither the paragraph nor the link; it's in a RenderText object which is now ignored due to the flattening of the hierarchy.

As a general rule, we can (and do) call parentObjectUnignored to find out the real object as far as the AT is concerned. Problem (normally) solved. BUT: At the moment, links are not ignored objects; they're exposed objects. So the question is, do we report the caret offset w.r.t. the link or w.r.t. the full paragraph? The answer depends on what AtkText object the caller specified. If that object is a link, then we don't want to ignore links. If that object is not a link, but instead the parent of a link, we do want to ignore links and work our way up to the unignored parent of the link, in this case the paragraph.

Regarding the caret-moved events to report, I had to make a judgment call. Do we:

1. Emit events for both objects?
2. Emit events for only the paragraph?
3. Emit events for only the link?

The more events which get emitted, the more events an AT has to process. Plus, when caret navigation moves into a link, that link gets a focus rectangle and should emit a focus: and object:state-changed:focused event and also claim ATK_STATE_FOCUSED. In other words, ATs already will know that the user just arrowed into a link. And if they want the caret offset w.r.t. that link, they can always ask for it as described above. Therefore, I decided that the thing to do in terms of emitting caret moved events (and also selection changed events), is to just emit events for the paragraph and not for the link.

>+void objectAndOffsetUnignored(AccessibilityObject* coreObject, AccessibilityObject*& realObject, int& offset, bool ignoreLinks)
> >+{
> >+
> >+    Node* endNode = static_cast<AccessibilityRenderObject*>(coreObject)->renderer()->node();
> >+    int endOffset = coreObject->selection().end().offsetInContainerNode();
> >+
> >+    realObject = coreObject;
> >+    if (realObject->accessibilityIsIgnored())
> >+        realObject = realObject->parentObjectUnignored();
> 
> This can only happen once? Any specific cases in mind?

Yes. All cases. :-) parentObjectUnignored() by definition returns the first parent object for which accessibilityIsIgnored() is false. Therefore, if realObject->accessibilityIsIgnored() is true, setting realObject to realObject->parentObjectUnignored() will give us the object we should report to the AT. Unless, of course, realObject is now a link (because we don't ignore links normally) and the caller wants to ignore links for one of the reasons described above. Therefore, we do this bit:

> >+    if (ignoreLinks && realObject->isLink())
> >+        realObject = realObject->parentObjectUnignored();

And we should be good, I believe.

> >+    RefPtr<Range> range = rangeOfContents(static_cast<AccessibilityRenderObject*>(realObject)->renderer()->node());
> >+    ExceptionCode ec = 0;
> >+    range->setEndBefore(endNode, ec);
> 
> We all know this will fail one day and we will say 'oh damn, should have
> checked that value' :]

Hehehe. Yeah, I know.... But I wasn't sure how it would fail or what I should do in response. And I grepped through the code and didn't see anything especially enlightening done in similar calls. So... If you could elaborate on what you think I should do here, I'd be happy to do it.

Thanks!
Comment 15 Joanmarie Diggs 2010-01-09 13:49:36 PST
Created attachment 46218 [details]
caret offset and event adjustments - take 2

This patch (hopefully) addresses all of the issues Xan identified, modulo checking the value after doing

   range->setEndBefore(endNode, ec)

because it's not clear to me what condition I should be checking for/concerned about. Sorry!

Because I've not addressed this last point, I'm not asking for review; merely attaching this patch "for safe keeping" until Xan has had the opportunity to point me in the right direction.
Comment 16 Xan Lopez 2010-01-11 03:41:36 PST
Comment on attachment 46218 [details]
caret offset and event adjustments - take 2

Thank you, your explanations were really helpful :)

About the error code, just checking at the function it seems it can fail in a number of ways (the internal state of the range is wrong, the node you pass is wrong, the document of the range and the node you pass are not the same, ...), so you could just check none of those happen and otherwise maybe set the offset to some error value (-1?). But you can do that in a follow-up as far as I'm concerned, so r+
Comment 17 WebKit Commit Bot 2010-01-11 04:03:47 PST
Comment on attachment 46218 [details]
caret offset and event adjustments - take 2

Clearing flags on attachment: 46218

Committed r53072: <http://trac.webkit.org/changeset/53072>
Comment 18 WebKit Commit Bot 2010-01-11 04:04:01 PST
All reviewed patches have been landed.  Closing bug.
Comment 19 Joanmarie Diggs 2010-01-11 06:06:48 PST
I'm going to re-open this one. I still need to add the error checking Xan mentioned, and all the relevant context is in this bug. Sorry!
Comment 20 Joanmarie Diggs 2010-01-11 17:53:37 PST
Created attachment 46325 [details]
get rid of a needless variable

Kov pointed out this error. Thanks!
Comment 21 Gustavo Noronha (kov) 2010-01-11 18:05:26 PST
Comment on attachment 46325 [details]
get rid of a needless variable

thanks!
Comment 22 Joanmarie Diggs 2010-01-11 18:24:16 PST
Created attachment 46328 [details]
One more tweak....

/me sighs.

Thanks Kov!
Comment 23 Joanmarie Diggs 2010-01-11 18:34:48 PST
Comment on attachment 46328 [details]
One more tweak....

So.... In further chatting with Kov, it seems that there are two ways to skin the cat in question. It was decided that what I'd already done was fine (and works), so I'm withdrawing this patch.
Comment 24 WebKit Commit Bot 2010-01-12 02:10:05 PST
Comment on attachment 46325 [details]
get rid of a needless variable

Clearing flags on attachment: 46325

Committed r53124: <http://trac.webkit.org/changeset/53124>
Comment 25 WebKit Commit Bot 2010-01-12 02:10:12 PST
All reviewed patches have been landed.  Closing bug.
Comment 26 Joanmarie Diggs 2010-01-12 05:41:52 PST
(I still have to address Xan's issue.)
Comment 27 Joanmarie Diggs 2010-01-17 15:36:00 PST
Note to self: While testing the text attributes patch, it seems that some objects are not exposing their text correctly as part of the changes made here. See this document as an example (objects exposed as ROLE_PANEL): https://bugs.webkit.org/attachment.cgi?id=29967
Comment 28 Joanmarie Diggs 2010-01-17 16:03:17 PST
Simple test case where things are broken:

<html>
<body>
<p>This is a test.</p>
Hello world.
</body>
</html>

('Hello world.' has a corresponding object of ROLE_PANEL, but no text)
Comment 29 Joanmarie Diggs 2010-01-17 23:18:41 PST
Created attachment 46791 [details]
Handle the issue from comment #28, add sanity checks to objectAndOffsetUnignored, new tests, style compliance changes....

This patch makes the following changes:

1. Moves the text collapsing/assembling functionality that was in getPangoLayoutForAtk to the newly-created textForObject.

2. webkit_accessible_text_get_text now falls back on textForObject when it comes up empty.

3. Adds a check in textForObject so that we don't double-add newlines when we have a BR.

4. Adds new tests for items 2 and 3 above to testatk.c

5. (Hopefully) brings the rest of testatk.c into compliance with the style guidelines. Also there were several instances of our calling atk_object_ref_accessible_child(), but never unref'ing the object at the end of the test. I assume we should be unref'ing them.

6. Adds some sanity checks to objectAndOffsetUnignored which (hopefully) address Xan's concerns.
Comment 30 Joanmarie Diggs 2010-01-17 23:37:05 PST
Created attachment 46792 [details]
Handle the issue from comment #28, add sanity checks to objectAndOffsetUnignored, new tests, style compliance changes....

Caught a dumb mistake.
Comment 31 Joanmarie Diggs 2010-01-19 10:43:27 PST
Created attachment 46926 [details]
Handle the issue from comment #28, add sanity checks to objectAndOffsetUnignored, new tests

Xan and I went over this in IRC (thanks!)

Addresses:
* Style changes should be done in a separate patch. (New tests are style-compliant w.r.t. their names.)
* Removed textForObject from the header
Comment 32 WebKit Review Bot 2010-01-19 10:48:29 PST
Attachment 46926 [details] did not pass style-queue:

Failed to run "WebKitTools/Scripts/check-webkit-style" exit_code: 1
WebCore/accessibility/gtk/AccessibilityObjectWrapperAtk.cpp:912:  webkit_accessible_text_get_text is incorrectly named. Don't use underscores in your identifier names.  [readability/naming] [4]
WebKit/gtk/tests/testatk.c:367:  Declaration has space between * and variable name in WebKitWebView* webView  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:368:  Declaration has space between * and variable name in AtkObject* obj  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:369:  Declaration has space between * and variable name in AtkObject* obj1  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:370:  Declaration has space between * and variable name in AtkObject* obj2  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:371:  Declaration has space between * and variable name in GMainLoop* loop  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:372:  Declaration has space between * and variable name in AtkText* textObj1  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:373:  Declaration has space between * and variable name in AtkText* textObj2  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:411:  Declaration has space between * and variable name in WebKitWebView* webView  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:412:  Declaration has space between * and variable name in AtkObject* obj  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:413:  Declaration has space between * and variable name in AtkObject* obj1  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:414:  Declaration has space between * and variable name in AtkObject* obj2  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:415:  Declaration has space between * and variable name in GMainLoop* loop  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:416:  Declaration has space between * and variable name in AtkText* textObj1  [whitespace/declaration] [3]
WebKit/gtk/tests/testatk.c:417:  Declaration has space between * and variable name in AtkText* textObj2  [whitespace/declaration] [3]
Total errors found: 15


If any of these errors are false positives, please file a bug against check-webkit-style.
Comment 33 Xan Lopez 2010-01-19 12:41:17 PST
Comment on attachment 46926 [details]
Handle the issue from comment #28, add sanity checks to objectAndOffsetUnignored, new tests

r=me
Comment 34 WebKit Commit Bot 2010-01-19 13:31:08 PST
Comment on attachment 46926 [details]
Handle the issue from comment #28, add sanity checks to objectAndOffsetUnignored, new tests

Clearing flags on attachment: 46926

Committed r53487: <http://trac.webkit.org/changeset/53487>
Comment 35 WebKit Commit Bot 2010-01-19 13:31:21 PST
All reviewed patches have been landed.  Closing bug.