Alex's blog: atk


Tuesday, April 23, 2013

UIA text vs ATK / IAccessible2

After comparing IAccessible2 text to ATK text, it'd be good to take a quick look at how the ATK / IAccessible2 APIs compare to UIA text. Firefox doesn't have a UIA implementation yet and there are no near-term plans for one, but it's worth comparing these APIs because I think UIA text might one day become a good alternative to IAccessible2.

UIA text is closer to user actions since UIA has the concept of a range which can be moved like a cursor through the web page. You can move it (and extend it) by characters, words and lines the same way the user would. And then you can get the text the range spans. This means you won't ever run into the restrictions of the accessible tree and embedded characters as you probably do with the IAccessible2 or ATK APIs. A couple of examples might help explain what I mean.

Example #1

If the accessible tree is DOM-based, i.e. it's close to the DOM hierarchy, then the ATK / IA2 text interface implementation might be tricky. For example, the following HTML paragraph

hello
<aside style="position:absolute;top:0px;left:0px;">meine freund</aside>
my friend

can have the accessible tree

paragraph (html:p)
  text leaf ('hello' text)
  section (html:aside)
    text leaf ('meine freund' text)
  text leaf ('my friend' text)

If the browser is not smart enough then it doesn't remove the embedded character designating the html:aside element from the text of the parent html:p element. In other words, if the paragraph text is "hello*my friend" (where * is an embedded character for html:aside) then a screen reader has to deal with it and should somehow ignore the out-of-flow content. If the screen reader is not smart enough to ignore it, then it will move the user through the "meine freund" text when the user moves through the paragraph text.

Example #2

In the case of the Firefox implementation, which tends to use embedded characters for everything, you can observe another kind of weird behavior. A screen reader must be smart to move by words, etc., because embedded characters are used for inline objects like anchors. For instance, if you have

hel<a>lo</a>

then the screen reader must juggle offsets to detect that this paragraph technically consists of one word. You can get into similar trouble when an anchor spans multiple lines. If you move by lines, then an end offset pointing after the embedded character never tells you whether the line ends in the middle of the embedded object or after it. You need to look into the embedded object to find out. This makes the screen reader logic neither performant nor trivial.
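To make the offset juggling concrete, here is a minimal sketch of what a screen reader effectively has to do. It is a toy model, not Gecko or Orca code: the Node type, the flatten function and the use of '*' as a stand-in for the embedded object character are all assumptions made for illustration. It expands every embedded character into the text of the child it stands for, which is the only way to see that hel<a>lo</a> is really one word.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the embedded object character. */
#define EMBED_CHAR '*'

/* Toy accessible node: its own text holds one EMBED_CHAR per child. */
typedef struct Node {
    const char *text;
    const struct Node **children; /* children in document order */
    size_t child_count;
} Node;

/*
 * Appends the text of `node` to `out` starting at index `len`, replacing
 * each embedded character with the flattened text of the matching child.
 * Output that does not fit into `cap` bytes is dropped.
 * Returns the new length; `out` is always NUL-terminated.
 */
size_t flatten(const Node *node, char *out, size_t len, size_t cap)
{
    assert(node != NULL && out != NULL && len < cap);
    size_t child = 0;
    for (const char *p = node->text; *p != '\0'; ++p) {
        if (*p == EMBED_CHAR && child < node->child_count) {
            /* Look into the embedded object instead of stopping at it. */
            len = flatten(node->children[child++], out, len, cap);
        } else if (len + 1 < cap) {
            out[len++] = *p;
        }
    }
    out[len] = '\0';
    return len;
}
```

For a paragraph node with the text "hel*" and a single child with the text "lo", flatten writes "hello" into the buffer. The extra walk into each child is the cost I'm complaining about.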

A summary

In short, UIA lets you move through the page according to the web page layout, while ATK and IAccessible2 let you move according to the accessible tree. Sometimes it makes a difference.

So the accessible-tree-dependent approach makes the text implementation nontrivial on certain platforms (granted, on Gecko). I'm sure that everything (or almost everything) can be implemented correctly on the browser side or can be worked around by screen readers, but in either case the implementation can't be seamless. Note, somebody told me that WebKit has a nice ATK text implementation (or nicer than Gecko's? I don't recall). So that's evidence it's doable, it just may not be easy.

I should note that we haven't prototyped anything yet. Of course, before making any judgements (did I?) we need to implement it and screen readers need to adopt it. Only after that will I have the right to say whether it is as good as it looks. On the other hand, ATK appeared years ago and IAccessible2 just adopted and simplified ATK ideas, so Microsoft had enough time after MSAA to invent something nice. So I'm ready to believe they did.

Friday, April 12, 2013

ATK text pitfalls

As soon as I had assured myself that I had a good understanding of ATK text, I was brought back to reality. Once more I must admit to myself that ATK text is unknowable like the universe. Seriously, shortly after I started working on fixing ATK text bugs in Firefox, Orca, a Linux screen reader, suddenly felt bad. It was suggested that I compare Firefox and GEdit to see if there's a difference between the implementations. So I did, and then I realized that the results depend on where you start reading the ATK spec (btw, the GEdit implementation doesn't always follow the spec). If you read the spec from the beginning (the first sentence) then you get one result. If you read it from the end (the second sentence) then you might conclude that a different result is expected. I filed a bug against ATK. But let's read it again together, I might be missing something.

Let's consider an example: "a funny word".

* atk_text_get_text_at_offset for BOUNDARY_WORD_END

The returned string is from the word end before the offset to the word end at or after the offset.

I think you will agree that there's no word end *before* offset 0, so it can be treated as an author error. ATK doesn't say how error values should be handled, so I guess any reasonable return value is allowed. Firefox returned a ('', 0, 0) triple and that confused Orca.

Read the spec further:

The returned string will contain the word at the offset if the offset is inside a word.

This means we should return a ('a', 0, 1) triple because offset 0 is inside the word 'a' (btw, that's what GEdit did).
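The two readings can be put side by side in a small model. This is only a sketch under my own assumptions, not Firefox or GEdit code: the function names are made up, a word character is taken to be an ASCII letter or digit, and the "strict" reading reports an empty range at the offset when no word end precedes it (one of the "reasonable" error values).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: a word character is an ASCII letter or digit. */
static bool is_word_char(const char *text, int len, int i)
{
    return i >= 0 && i < len && isalnum((unsigned char)text[i]);
}

/* A word end is the offset right after the last character of a word. */
static bool is_word_end(const char *text, int len, int i)
{
    return is_word_char(text, len, i - 1) && !is_word_char(text, len, i);
}

/*
 * Toy model of get_text_at_offset for the word end boundary.
 * strict = true follows the first sentence only: with no word end before
 * the offset it gives up and reports an empty range at the offset.
 * strict = false falls back to the text start, so an offset inside the
 * first word still yields that word.
 */
void text_at_offset_word_end(const char *text, int offset, bool strict,
                             int *start, int *end)
{
    assert(text != NULL && start != NULL && end != NULL);
    int len = (int)strlen(text);
    assert(offset >= 0 && offset <= len);

    int s = offset - 1; /* look for a word end strictly before the offset */
    while (s >= 0 && !is_word_end(text, len, s))
        --s;
    if (s < 0) {
        if (strict) {
            *start = *end = offset;
            return;
        }
        s = 0;
    }

    int e = offset; /* look for a word end at or after the offset */
    while (e < len && !is_word_end(text, len, e))
        ++e;

    *start = s;
    *end = e;
}
```

For "a funny word" at offset 0 the strict reading gives the (0, 0) range and the other one gives (0, 1), i.e. 'a'. Away from the edges, for example at offset 3, both readings agree on (1, 7).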

* atk_text_get_text_at_offset for BOUNDARY_WORD_START

This is the dual of the issue above, for an offset equal to the text length. The spec says:

The returned string is from the word start at or before the offset to the word start after the offset.

and

The returned string will contain the word at the offset if the offset is inside a word.

There's no word start after the offset, but at the same time the offset is inside the word 'word'. Reading on.

* atk_text_get_text_after_offset for BOUNDARY_WORD_END

The returned string is from the word end at or after the offset to the next work end.

It might not be evident, but offset 0 is a word end offset. A proof by contradiction: if offset 0 is not a word end offset then

get_text_at_offset(0, BOUNDARY_WORD_END)

in the case of a single word (like 'word') should return an empty text. But this contradicts the semantics of the get_text_at_offset method name and Orca's expectations (see the case above). Therefore offset 0 is a word end.

This means that the method at offset 0 should return the first word ('a' in our example). But the second sentence says that it must be the second word ('funny' in our case).

The returned string will contain the word after the offset if the offset is inside a word.

* atk_text_get_text_before_offset for BOUNDARY_WORD_START

This is the dual of the get_text_after_offset (word end boundary) case. Let's take an offset equal to the text length.

The returned string is from the word start before the word start before or at the offset to the word start before or at the offset.

The text length offset is a word start offset; the proof is analogous (see above). That means that the 3rd word is expected ('word' in our case). However, the second sentence says that it should be the 2nd word ('funny' in our case):

The returned string will contain the word before the offset if the offset is inside a word.

Wednesday, April 10, 2013

IAccessible2 text vs ATK text

We started our accessible text rework in Firefox. It's time to revisit our IAccessible2 text implementation and compare it to ATK text.

Both IAccessible2 and ATK allow you to navigate the text by characters, words and lines. To do so they both provide these three methods:

get text *at* offset
get text *after* offset
get text *before* offset

The names make it clear that these methods are used to get the text at, after or before an offset. That's true in the IAccessible2 world but it's not so unambiguous in the case of ATK. ATK provides a bunch of tricky boundary types that may change things. Below I will illustrate the difference.

Difference #1. IAccessible2 getTextAtOffset may return nothing when you ask for a word. This happens when the given offset is between words (see the spec).

If the index is valid, but no suitable word (or other text type) is found, a NULL pointer is returned.

ATK always returns a word no matter where you are, because it requires returning the text between two word start offsets or two word end offsets (check out the spec).

Difference #2. IAccessible2 getTextAfter/BeforeOffset always returns the requested lexeme after/before the given offset.

Returns the substring of the specified text type that is located after/before the given character and does not include it. The result of this method should be same as a result for IAccessibleText::textAtOffset with a suitably increased/decreased index value.

ATK get_text_after/before_offset methods may return the lexeme at the offset under certain circumstances. For example, here's the case of the word end boundary in the get_text_after_offset method:

If the boundary_type is ATK_TEXT_BOUNDARY_WORD_END the returned string is from the word end at or after the offset to the next work end.

In short, ATK and IAccessible2 supply two similar but different concepts of text traversal. ATK allows you to move through the whole text with any single method. For example, atk_text_get_text_at_offset works nicely if you pass the end offset you obtained from the previous call. In the IAccessible2 world you need to use a pair consisting of getTextAtOffset and getTextBefore/AfterOffset to move backward/forward. I'd say IAccessible2 text is a simplified (human-friendly) version of ATK text. I won't object if somebody says that ATK looks more powerful. But at the same time I'm not sure screen readers need this power.

Anyway, IAccessible2 text looked so close to ATK that originally we mapped IAccessible2 boundary constants onto ATK constants and ended up with logic shared between these APIs. That was done several years ago and we never revisited this code.

Now we are going to change that. IAccessible2 consumers, please keep track of our progress to catch regressions early.
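Here is a rough sketch of that ATK-style traversal, where a single "at offset" method walks the whole text. It doesn't call ATK at all: text_at_offset_word_start is a hypothetical stand-in that models the word start boundary on a plain C string, assuming that a word character is an ASCII letter or digit and that the text length closes the last chunk.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: a word character is an ASCII letter or digit. */
static bool is_word_char(const char *text, int len, int i)
{
    return i >= 0 && i < len && isalnum((unsigned char)text[i]);
}

static bool is_word_start(const char *text, int len, int i)
{
    return is_word_char(text, len, i) && !is_word_char(text, len, i - 1);
}

/*
 * Toy model of get_text_at_offset for the word start boundary: the range
 * runs from the word start at or before the offset to the word start
 * after the offset. Offset 0 and the text length act as fallbacks.
 */
void text_at_offset_word_start(const char *text, int offset,
                               int *start, int *end)
{
    assert(text != NULL && start != NULL && end != NULL);
    int len = (int)strlen(text);
    assert(offset >= 0 && offset <= len);

    int s = offset;
    while (s > 0 && !is_word_start(text, len, s))
        --s;

    int e = offset + 1;
    while (e < len && !is_word_start(text, len, e))
        ++e;
    if (e > len)
        e = len;

    *start = s;
    *end = e;
}

/* Walks the text by feeding each end offset into the next call. */
int count_word_chunks(const char *text)
{
    assert(text != NULL);
    int len = (int)strlen(text);
    int count = 0;
    int offset = 0;
    while (offset < len) {
        int start, end;
        text_at_offset_word_start(text, offset, &start, &end);
        ++count;
        offset = end; /* always greater than the previous offset */
    }
    return count;
}
```

For "a funny word" the loop visits the ranges (0, 2), (2, 8) and (8, 12), so no second method is needed to move forward.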

Thursday, March 28, 2013

An easy way to understand ATK text

As far as I can remember, I've always been in touch with the ATK text interface implementation in Mozilla. I started by writing and reviewing some patches back in 2006. But I didn't really understand that piece of code, so I was never sure that a change here wouldn't break things there. At some point we decided that we should get automated test coverage for the text interface before doing any serious work in this area. At least that allowed us to be sure, to a certain extent, that we don't regress badly from a single bug fix. Then I helped my colleagues create the test suite. As part of this work we caught a bug in the ATK spec (thanks to Evan Yan, a Mozilla community member and at that time a Sun engineer). So it wasn't easy. It wouldn't be a lie to say that I've never seen a more complicated API for what it's designed to do.

I should note that, roughly speaking, the IAccessible2 text interface implementation in Firefox is done via the ATK text interface. So with a bad ATK implementation we deliver all the bugs straight to IAccessible2 screen readers. It's a hot problem, in other words. Recently I felt brave enough (again) to say: we should stop this shame. I started to look at the code and the spec, trying to untangle things. And then I realized I still don't have a good grasp of ATK text. First of all, I thought it'd be good to add some drawings to the terse ATK spec to let me and everybody else easily check whether the expected results are actually correct.

Preliminaries

ATK provides a bunch of methods to get text:

atk_text_get_text_before_offset
atk_text_get_text_at_offset
atk_text_get_text_after_offset

Of course ATK provides a bunch of other methods, but they are trivial and it doesn't make sense to even mention them. Each of the methods above takes an AtkTextBoundary as an argument, and its values are:

char (trivial)
word start and word end
line start and line end
sentence start and sentence end (not implemented in Firefox)

So we have the get_text_before/at/after methods and the word/line start/end offsets. These are the subject of this post.

About terms: a chapter for the advanced

If you don't plan to read the ATK spec or dig into details then you can skip this chapter and move on to the pictures part. Otherwise this chapter might be useful since it has some clarifications.

First of all, here are some terms which are used in the spec but aren't defined there:

word start offset - an offset where the word starts, for example, "hello, all" has two word start offsets: 0 for "hello" and 7 for "all";
word end offset - 5 for "hello", i.e. an offset after 'o' character, and 10 for "all";
inside word offset - any offset between the word start and word end offsets (including boundaries), in our case these are 0-5 and 7-10 offsets;
outside word offset - everything that is not inside a word, in our case this is only 6 offset.

It's pretty much the same for line start and line end offsets.
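These terms can be written down as small predicates. This is just a sketch of the definitions above, assuming that a word character is an ASCII letter or digit (the spec doesn't say what a word is) and ignoring the special edge offsets; the function names are mine.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: a word character is an ASCII letter or digit. */
static bool is_word_char(const char *text, int offset)
{
    int len = (int)strlen(text);
    return offset >= 0 && offset < len &&
           isalnum((unsigned char)text[offset]);
}

/* Word start offset: a word character not preceded by a word character. */
bool is_word_start(const char *text, int offset)
{
    assert(text != NULL);
    return is_word_char(text, offset) && !is_word_char(text, offset - 1);
}

/* Word end offset: the offset right after the last character of a word. */
bool is_word_end(const char *text, int offset)
{
    assert(text != NULL);
    return is_word_char(text, offset - 1) && !is_word_char(text, offset);
}

/* Inside word offset: touches a word on either side, boundaries included. */
bool is_inside_word(const char *text, int offset)
{
    assert(text != NULL);
    return is_word_char(text, offset) || is_word_char(text, offset - 1);
}
```

For "hello, all" is_word_start holds at 0 and 7, is_word_end holds at 5 and 10, and offset 6 is the only one for which is_inside_word is false.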

Also we need to mention edge cases: imaginary offsets. Say we have a paragraph:

hello

and then we do

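gint startOffset = 0, endOffset = 0;
/* The word containing offset 1, delimited by word start boundaries.
   The returned string is owned by the caller and must be freed. */
gchar* text = atk_text_get_text_at_offset(accessible, 1,
 ATK_TEXT_BOUNDARY_WORD_START,
 &startOffset, &endOffset);
g_free(text);
/* The same request, delimited by word end boundaries. */
text = atk_text_get_text_at_offset(accessible, 1,
 ATK_TEXT_BOUNDARY_WORD_END,
 &startOffset, &endOffset);
g_free(text);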

In both cases we expect the "hello" string with (0, 5) start and end offsets, otherwise there's no way to traverse this paragraph by words. And actually this agrees with the spec: "The returned string will contain the word at the offset if the offset is inside a word". But it means that offsets 0 and 5 are both start and end offsets at the same time, because "the returned string is from the word start at or before the offset to the word start after the offset" and "the returned string is from the word end before the offset to the word end at or after the offset".

To summarize, the zero offset (0) and the last offset (the character count) are special offsets and can be treated as word start, word end, line start and line end offsets.

A Quick-n-Easy Guide

Update. The proposed algorithm must be corrected to handle edge offsets properly, see ATK text pitfalls. I won't spend time updating it since Joanie proposed an ATK text simplification and hopefully it will be accepted in the foreseeable future.

So we are ready to turn the spec's verbosity into nice pictures.

atk_text_get_text_at_offset

WORD_START and LINE_START boundaries are illustrated this way (X symbol designates the initial offset):

Move forward to the boundary and then (if that was successful) move backward. The start offset is at or before the initial offset, the end offset is after the initial offset.

WORD_END and LINE_END boundaries:

Move backward to the boundary and then (if that was successful) move forward. The start offset is before the initial offset, the end offset is at or after the initial offset.

atk_text_get_text_before_offset

WORD_START and LINE_START boundaries:

If the initial offset is the boundary then move backward to find the start offset. Otherwise move backward twice to pick up the start and end offsets.

WORD_END and LINE_END boundaries:

Move backward twice for start and end offsets.

atk_text_get_text_after_offset

WORD_START and LINE_START boundaries:

Move forward twice for start and end offsets.

WORD_END and LINE_END boundaries:

If the initial offset is the boundary then move forward to find the end offset. Otherwise move forward twice to pick up the start and end offsets.
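As a rough illustration, the "before" and "after" recipes for the word start boundary can be sketched on a plain C string. This is not ATK code: all names are hypothetical, a word character is assumed to be an ASCII letter or digit, and the edge offsets are simply clamped to 0 and the text length, which is exactly the part the update above says needs correcting.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Assumption: a word character is an ASCII letter or digit. */
static bool is_word_char(const char *text, int len, int i)
{
    return i >= 0 && i < len && isalnum((unsigned char)text[i]);
}

static bool is_word_start(const char *text, int len, int i)
{
    return is_word_char(text, len, i) && !is_word_char(text, len, i - 1);
}

/* Move forward: first word start after `from`, or the text length. */
static int next_word_start(const char *text, int len, int from)
{
    int i = from + 1;
    while (i < len && !is_word_start(text, len, i))
        ++i;
    return i < len ? i : len;
}

/* Move backward: last word start before `from`, or 0. */
static int prev_word_start(const char *text, int len, int from)
{
    int i = from - 1;
    while (i > 0 && !is_word_start(text, len, i))
        --i;
    return i > 0 ? i : 0;
}

/* "Before offset": one move back from a boundary, otherwise two. */
void text_before_offset_word_start(const char *text, int offset,
                                   int *start, int *end)
{
    assert(text != NULL && start != NULL && end != NULL);
    int len = (int)strlen(text);
    assert(offset >= 0 && offset <= len);
    int e = is_word_start(text, len, offset)
                ? offset
                : prev_word_start(text, len, offset);
    *start = prev_word_start(text, len, e);
    *end = e;
}

/* "After offset": move forward twice for the start and end offsets. */
void text_after_offset_word_start(const char *text, int offset,
                                  int *start, int *end)
{
    assert(text != NULL && start != NULL && end != NULL);
    int len = (int)strlen(text);
    assert(offset >= 0 && offset <= len);
    int s = next_word_start(text, len, offset);
    *start = s;
    *end = next_word_start(text, len, s);
}
```

For "a funny word" both text_before_offset_word_start at offsets 8 and 9 and text_after_offset_word_start at offset 0 give the (2, 8) range, i.e. "funny " together with the trailing space.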

That's all. Hallelujah to Love!

P.S. Well, if I'm wrong in what I say above then you'd better say so, otherwise this will be implemented in Firefox soon ;)