Alex's blog: aria


Tuesday, November 18, 2014

Accessibility goes into DOM

The PFWG suggested two new methods for the DOM Element interface. The methods reflect the accessibility concepts of role and name and are called computedRole and computedLabel.

I have a bunch of issues with this approach, and I want to outline them here just to keep things in one place.

The purpose

I've been told that the primary reason is testing, but role and name alone are not enough to run UAIG tests or any accessibility automation tool, since those require other accessibility properties as well.

They also say it might be used for non-accessibility purposes. I realize that the semantics ARIA adds can be used by non-assistive technologies. In Firefox we have a large number of non-AT consumers, but in most cases we don't have a good idea what they are for. So I don't really have a use case, and thus it's hard to say whether accessible role and name alone work well for non-a11y purposes.

As for assistive technologies, I think they need a much larger API as well.

Blowing the DOM

Anything useful requires extra accessible properties, as I said above: accessible description, states, relations, the ability to navigate the hierarchy, etc. That means that sooner or later the Element interface has to be changed to a great extent. Check out AtkObject to get an idea of the possible changes.

In the beginning accessibility interfaces were built on top of the DOM, and later they turned into full APIs. Now we face the reverse process: accessibility APIs are going back into the DOM. I'm not sure that's a good idea, because accessibility tasks are very specific, and an accessibility API might not suit the common needs of web apps.

Restrictions

Not every semantically meaningful piece on the screen has a DOM node; for example, list bullets don't necessarily have DOM elements associated with them. So an Element-based accessibility API is too restrictive to fit the requirements of assistive technologies.

Performance

Last but not least is the performance issue. In most browsers the accessibility engine is kept separate and starts running on demand. If accessibility is merged into the DOM, then nothing tells the developer that a method may trigger heavy accessibility computations and make the app slower. Surely browsers will learn to be smarter, but the approach will have a perf hit either way.

What's it going to be then, eh?

The idea is to provide a separate accessibility interface. If you like, it can be done in parts: for example, introduce role and name only in the first round, just as the original proposal says, and think about adding all the other properties later.
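
One way to picture such a separate interface is a small wrapper that is created only when a script asks for it, so plain DOM users never pay for accessibility work. This is a sketch under loud assumptions: `getAccessible`, `AccessibleNode` and the plain-object element shape are hypothetical names invented for illustration, not part of any browser API, and the tag-to-role fallback is a deliberate oversimplification.

```javascript
// Hypothetical sketch: an accessibility view kept apart from the element itself.
// Plain objects stand in for DOM elements so the example runs anywhere.
const accessibles = new WeakMap();
let computations = 0; // counts how often the (potentially heavy) work runs

class AccessibleNode {
  constructor(element) {
    computations += 1;
    // First round: role and name only, as in the original proposal.
    // Falling back to the tag name is a simplification, not a real role mapping.
    this.role = element.attributes.role || element.tagName.toLowerCase();
    this.name = element.attributes["aria-label"] || element.textContent.trim();
  }
}

// Created on demand and cached; elements that are never asked about cost nothing.
function getAccessible(element) {
  if (!accessibles.has(element)) {
    accessibles.set(element, new AccessibleNode(element));
  }
  return accessibles.get(element);
}

const button = {
  tagName: "BUTTON",
  attributes: { "aria-label": "Close" },
  textContent: "x",
};
const acc = getAccessible(button);
console.log(acc.role, acc.name); // button Close
```

Further properties (description, states, relations) could later be added to the wrapper without touching the element object at all.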

This idea was welcomed initially, but later it was rejected as too complex and accessibility-centric. But, and that's the most important thing, it doesn't have the disadvantages the Element approach has.

Thursday, June 19, 2014

Peculiarities of standardization

Sometimes standardization might have amusing consequences.

Some preliminaries. Say a web author needs to place an image purely for design purposes (a background image, for example), and of course the author doesn't want it to be exposed to screen reader users. So the author can do:

<img src="blabla.jpg" alt=""/>

This technique is well known and was standardized by Techniques for UAAG, published in 2002:

In some authoring scenarios, empty content (e.g., alt="" in HTML) may make an appropriate text equivalent, such as when non-text content has no other function than pure decoration, or when an image is part of a "mosaic" of several images and does not make sense out of the mosaic.

Neither the browser nor the assistive technology is supposed to repair the text equivalent of an image with empty alt; in other words, there should be no image from the user's perspective. This technique has been supported by Firefox and by a number of screen readers over the years. At the implementation level the trick is that the accessible name of the image element is an empty string, which the screen reader interprets as a signal to ignore the image.

Years later, as the accessibility standardization process went on, we got quite a good initiative: HTML accessibility mapping. Among other things, it has an HTML to ARIA mapping. This is nice, but it raises ARIA to the level of a universal accessibility language, while ARIA barely fits all the nuances of HTML markup. When it comes to the HTML img alt="" case, the closest thing that pops up in ARIA is role="presentation". Semantically it looks good, but it doesn't match the accessibility API mapping we have had for years. The change can be made on both the browser and the screen reader sides, but it doesn't have any practical benefit.

By the way, the topic seems to bother accessibility minds constantly through the years. Not counting the fresh bug, we had the same bug 4 years ago.

Monday, January 20, 2014

Personalized web for the assistive technology

Sometimes web authors provide different content for screen reader users than for sighted users. That could be additional content like "skip to content" navigational links or a set of landmarks that creates a more reliable document structure. In some cases the web author might want to remove content, for instance duplicate links, or use extra tricks to keep the content accessible when, for example, the author steps outside standard HTML. Ideally, of course, the content is supposed to be much the same, but because of design issues and HTML imperfections that doesn't really happen. The web is replete with examples of special content for assistive technology.

The need of alternative content

Usually authors don't need to make significant changes to a web page to make it accessible. Keeping usability in mind and following best practices is often enough for good results. In other words, these are the kind of changes (except for some ARIA) that are supposed to be useful for everybody. Sure, this doesn't cover web pages with large amounts of ARIA, but that's rather the area of large web apps and custom widgets.

That's how it has been, but nowadays the tendency is changing, and certain web apps want to provide whole portions of alternative content for assistive technology.

A few examples are good for demo purposes.

A couple of examples

Shared video example. If blind and sighted users want to share a device to watch a movie, then it might be a good idea to have audio descriptions shown (announced) for the blind user only.

Camera apps. This may be another use case for separate content for blind and sighted users. A camera app shows what's in front of the camera and may have a graphics-only interface, like a green rectangle showing what is currently in focus. A screen reader user may benefit if the face detection software beeps when a face comes into focus, since it gives a good chance of a nice picture.

Another example is QR code reader software that helps to find the QR code label on a product. In general, all camera apps may benefit from giving special instructions to assistive technology users.

Integration vs personalization

So alternative content for assistive technology can be part of the web app design. The next question is how the alternative content can be added to the web page. Will it be one integral app that explicitly or implicitly contains special content, or will it be a personalized version of the app designed for assistive technology?

Both approaches have their own benefits and disadvantages. The personalization approach wins on performance, since the app doesn't need to combine two different versions into one (and from what I hear this is a big concern for web app vendors). The benefit of integration is that people get an all-in-one solution, which means users share the app and usually have the same experience.

Privacy concerns

A big issue with personalization that people talk about is privacy. If you want a personalized version of a web site, then you have to tell the web site that you use assistive technology.

The idea of sharing personal information is uncomfortable for many people, and that's quite understandable. But keep in mind that those who want to know whether the user uses assistive technology quite likely have a way to detect it already. There are solutions like a screen reader sniffing Flash plugin or a JS script based on differences in content navigation. For example, it didn't take me much time to write a simple script (attached to the Mozilla bug) that detects the NVDA screen reader. These solutions are not perfect and may give false positives, but I'm pretty sure they can be improved if somebody wants to.

On the other hand, if the user says he wants a specialized version for assistive technology, it doesn't necessarily mean the user has assistive technology running, and of course it doesn't necessarily mean the user has to share what kind of assistive technology he uses.

So of course there's a privacy concern, but it's no bigger than, say, the privacy concern of the Geolocation API. The difference between sharing and not sharing is only apparent. After all, the user decides.

Tech party

I'm not sure which approach will become dominant; perhaps it will be some reasonable mix of them. Anyway, it would be a good idea to provide web authors with different techniques to implement either approach.

JS API

Just a method to detect the presence of assistive technology, such as a screen reader, allows the web app to load specialized scripts and build a personalized version of the app. The implementation idea can be quite similar to the Geolocation API: if the web app wants to know whether an assistive technology is running, the user gets asked whether he's OK with sharing this info. If the user agrees, the personalized app is loaded.
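
To make the flow concrete, here is a rough sketch of what such a permission-gated call might look like. Everything in it is an assumption made for illustration: `createAssistiveTechnologyApi`, `requestStatus` and the `askUser` prompt stand-in are hypothetical names, no browser ships such an API, and a real one would be asynchronous the way `getCurrentPosition()` is.

```javascript
// Hypothetical API shaped like navigator.geolocation.getCurrentPosition():
// the page asks, the user decides, and only then does the page learn anything.
function createAssistiveTechnologyApi(askUser, atIsRunning) {
  return {
    requestStatus(onSuccess, onDenied) {
      // askUser stands in for the browser's permission prompt.
      if (askUser("Share whether assistive technology is in use?")) {
        // Only a yes/no answer is shared, not which product is running.
        onSuccess({ assistiveTechnology: atIsRunning });
      } else {
        onDenied(new Error("User declined to share"));
      }
    },
  };
}

function chooseAppVersion(api) {
  let version = "standard";
  api.requestStatus(
    (status) => {
      if (status.assistiveTechnology) version = "personalized";
    },
    () => {
      version = "standard"; // declined: fall back to the ordinary app
    }
  );
  return version;
}

const agreeing = createAssistiveTechnologyApi(() => true, true);
const declining = createAssistiveTechnologyApi(() => false, true);
console.log(chooseAppVersion(agreeing)); // personalized
console.log(chooseAppVersion(declining)); // standard
```

The page learns nothing when the user declines, which is the same bargain the Geolocation prompt offers.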

aria-hidden

It can be used to hide certain parts of the web page from assistive technology, and (quite a recent change) it can be used to create portions of the web page for assistive technology only (not visible or operable for sighted users).

Actually, at this point the ARIA spec doesn't allow aria-hidden to create web content for AT. However, the W3C made this option official by resolving an HTML5 bug. I suppose the next ARIA spec should get in sync with the change. It's worth noting, however, that the WHATWG spec hasn't been changed yet and probably won't be.

Also ARIA recommends very limited usage of aria-hidden:

Authors MAY, with caution, use aria-hidden to hide visibly rendered content from assistive technologies only if the act of hiding this content is intended to improve the experience for users of assistive technologies by removing redundant or extraneous content. Authors using aria-hidden to hide visible content from screen readers MUST ensure that identical or equivalent meaning and functionality is exposed to assistive technologies.

I suppose it may be considered temporary advice that will be superseded as soon as the web gets more apps with special versions for assistive technology.

So here's a demo of the approach:

<body>
<div aria-hidden="true">An ordinary version</div>
<div hidden aria-hidden="false">An assistive technology version</div>
</body>

The downside of aria-hidden="true" is that user-navigable content is hidden from assistive technology. You may want to read the post on why I didn't fall in love with aria-hidden="true" as it's specified. My opinion hasn't really changed over the years.

The downside of aria-hidden="false" is that AT-only content is not keyboard navigable unless the AT has special support for it; it's not focusable and has neither dimensions nor position, which is a real restriction for AT. It's also worth noting that screen readers like Orca would ignore aria-hidden="false" content in general, because it requires a virtual buffer feature for content navigation, which Orca doesn't have.

So aria-hidden has a bunch of disadvantages, but I admit it has one real benefit: its simplicity for the web author.

CSS media

CSS media features are designed to provide styling depending on the features the device has. So if a screen reader is detected, the web app can show or hide portions of the page when it makes sense for screen reader users, for example:

@media (screenreader) {
  .sr {
    display: none;
  }
}

<div class="sr">hidden from AT</div>

As you can see, it's very easy to change the web page and personalize it for the assistive technology user. This technique doesn't have the disadvantages of aria-hidden, because all content is rendered on the screen, so the content has a position and size, and it's focusable and navigable in the standard ways.

Nevertheless, CSS media is also a disputable approach (check out the Mozilla bug for details). Note that, as far as I know, it's not supported by web browsers yet.

Two screens option

The same-screen approach looks quite appealing, but it doesn't answer all the design needs web authors have. For example, in the case of a shared screen for watching a movie, it might be a nice option if audio descriptions were shown only to those who need them.

This idea leads to the option of rendering CSS media styles virtually. Technically speaking, that means the presence of two screens: one version is rendered on the display (that's the movie), and the second version is rendered in memory, for assistive technology only (subtitles). Nowadays a similar thing is implemented for HTML5 canvas shadow content: it's not visible on the display, but the assistive technology literally sees it, i.e. it has access to layout, position and dimensions, and it can navigate it.

It's not clear, however, how to share input devices like mouse and keyboard, but that could probably be resolved one way or another. This approach is quite close to the aria-hidden technique, since it allows the same design techniques, but it doesn't have its disadvantages because the content is still rendered.

The web is going to change the way it handles the assistive technology. Quite soon I think.

Wednesday, November 14, 2012

Accessible Firefox: Text equivalent computation

Each accessible object may have a name and a description, the primary characteristics used for perceivability of the control element by the user (also referred to as the text equivalent). As far as I know, only the ARIA spec provides an algorithm for text equivalent computation. It might look strange that ARIA specifies a universal algorithm equally applicable to any markup (be it HTML or anything else), but that's how it is. Other specs either don't address this or, like the HTML spec, refer to the ARIA one. Each browser follows that algorithm one way or another.

The algorithm implemented in Firefox isn't one-to-one with ARIA's, but it is quite close to the version from the first ARIA draft. Basically, that version was written from the Firefox implementation. ARIA evolved, and Firefox evolved too, but not always in sync with ARIA.

I realized that the Firefox algorithm isn't documented anywhere, so I decided to put it here so that people don't have to read the code if they are curious about Firefox behavior. Note that Firefox might do slightly different computations on a case-by-case basis. In general these should be considered bugs. However, this algorithm is not free from bugs either. Let me know if you see anything suspicious.

Terms

Here is a list of terms used in the algorithm description:

initial node: the DOM node the text equivalent is computed for;
current node: the DOM node currently traversed in order to compute the text equivalent for the initial node;
text equivalent string: the text equivalent we have computed up until we have arrived at the current node;
string attribute: an attribute whose value provides a text equivalent, for example, aria-label in the case of name computation;
relation attribute: an IDRefs attribute referring to other element(s) used in text equivalent computation, for example, aria-labelledby in the case of name computation;
empty on purpose text equivalent: a text equivalent left empty on purpose by the author; AT shouldn't try to repair it.

Algorithm

To compute the text equivalent for the current node:

1. Prepend a space if necessary: if the current node is not an inline element (refer to the CSS display style), append a space character to the text equivalent string if it is not empty.
2. Compute the text equivalent for the current node and append it to the text equivalent string:
   a. If the node is hidden and it's not part of a computation initiated by a relation attribute (in other words, it's neither referred to by a relation attribute nor a child of a hidden element referred to by one), then skip the node. ⤴
   b. If the node is a text node, then append the rendered text content if the node is not hidden, otherwise append its text content. Proceed to the next node. ⤴
   c. Append the text equivalent from ARIA markup if any, otherwise append it from native markup.
   d. If the node is not the initial node, or if it's a recursively reentered initial node but it's not the first or last part of the text equivalent computation, then append the current user-managed value of this node.
   e. If the text equivalent for this node is empty, and either the node's role allows "text equivalent from subtree" or the node is not a control and not the initial node, then recursively apply this algorithm to each child, starting with step 1. ⟲
   f. If the text equivalent for this node is still empty, get it from the tooltip for the current node, if any.
3. Append a space if a space was added at step 1.
4. Normalize whitespace: trim leading and trailing space and condense other whitespace characters into a single space.
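
The steps above can be sketched over a tree of plain objects. This is a simplified illustration and not the Firefox code: the node shape (`inline`, `hidden`, `control`, `nameFromSubtree`, `attrs`) and the helpers `txt` and `el` are assumptions made up for the example, relation attributes and user-managed values (item d) are left out, and empty-on-purpose values such as alt="" are not modelled.

```javascript
// Simplified sketch of the steps above over a plain-object tree.
// Omitted on purpose: relation attributes (aria-labelledby) and item d (values).
function appendText(node, initial, out) {
  if (node.hidden) return out; // item a: skip hidden nodes
  if (node.type === "text") return out + node.text; // item b: text nodes

  const padded = !node.inline && out !== "";
  if (padded) out += " "; // step 1: space before non-inline content

  // item c: ARIA markup first, then native markup (only alt is modelled here)
  let text = node.attrs["aria-label"] || node.attrs.alt || "";

  // item e: fall back to the subtree when that is allowed
  const fromSubtree =
    node.nameFromSubtree || (!node.control && node !== initial);
  if (text === "" && fromSubtree) {
    for (const child of node.children) {
      text = appendText(child, initial, text);
    }
  }

  // item f: tooltip as the last resort
  if (text.trim() === "") text = node.attrs.title || "";

  out += text;
  if (padded) out += " "; // step 3: matching trailing space
  return out;
}

function textEquivalent(initial) {
  // step 4: trim and collapse whitespace
  return appendText(initial, initial, "").replace(/\s+/g, " ").trim();
}

// Tiny helpers for building test trees (hypothetical, for this example only).
const txt = (text) => ({ type: "text", text });
const el = (props, children = []) => ({
  type: "element",
  inline: true,
  attrs: {},
  children,
  ...props,
});

const saveButton = el({ control: true, nameFromSubtree: true }, [
  txt("Save "),
  el({ attrs: { alt: "draft" } }),
]);
const saveName = textEquivalent(saveButton); // "Save draft"
```

The button has no ARIA or native name of its own, so item e walks its children and picks up the text node and the image's alt.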

Remarks and examples

Item c.

Text equivalent computation from ARIA and native markup is nicely covered by the HTML to a11y spec.

Nevertheless, examples of a native markup text equivalent are the alt attribute for HTML <img> or a label from <label for> for a control element. However, tooltip markup is not used as a native markup text equivalent; it is used as a last resort under item f.

In general, name and description don't duplicate each other: if some markup was used for the name, it won't be reused for the description. For example,

<img title="Me and Eiffel Tower">

Name is "Me and Eiffel Tower", description is empty.

But:

<img alt="I'm in France" title="Me and Eiffel Tower">

Name is "I'm in France", description is "Me and Eiffel Tower".
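
The two cases can be condensed into a few lines. This is only a sketch of the rule as described here, with a made-up function name, and it assumes alt is either absent or non-empty; what happens to title next to an empty-on-purpose alt="" is not covered.

```javascript
// Sketch of the "no duplication" rule for <img>: title serves as the name
// only when nothing better exists; otherwise it becomes the description.
// Assumes alt is absent or non-empty (alt="" is not modelled here).
function imageNameAndDescription(attrs) {
  const title = attrs.title || "";
  if (attrs.alt) {
    return { name: attrs.alt, description: title };
  }
  return { name: title, description: "" };
}

const titleOnly = imageNameAndDescription({ title: "Me and Eiffel Tower" });
// { name: "Me and Eiffel Tower", description: "" }

const altAndTitle = imageNameAndDescription({
  alt: "I'm in France",
  title: "Me and Eiffel Tower",
});
// { name: "I'm in France", description: "Me and Eiffel Tower" }
```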

Item c.i.

Example of empty on purpose name is <img alt="">.

Item c.ii.

Example of relation attribute processing:

<button aria-labelledby="span" id="btn"></button>
<button aria-labelledby="btn" id="btn2"></button>
<span id="span">text</span>

The name of @id="btn" is "text"; aria-labelledby is processed since there is no reentrance.
@id="btn2" doesn't have a name: aria-labelledby on @id="btn" is ignored because otherwise it would mean reentrance (the aria-labelledby of @id="btn2" brought us here).

Note that if the recursion produces only whitespace, then we proceed to the next item of the algorithm. For example:

<span id="span"></span> <button aria-labelledby="span">press me</button>

Name of button element is "press me". The rule is also applicable to item e.
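
Both rules, the reentrance guard and the whitespace fallthrough, fit in a short sketch. The `elements` map and the `nameOf` function are hypothetical stand-ins for the DOM and for the real computation; only aria-labelledby and own text content are modelled.

```javascript
// Sketch: a node reached through aria-labelledby does not follow its own
// aria-labelledby, because that would be a reentrance.
function nameOf(id, elements, viaRelation = false) {
  const node = elements[id];
  if (node.labelledby && !viaRelation) {
    const text = node.labelledby
      .map((ref) => nameOf(ref, elements, true))
      .join(" ")
      .trim();
    if (text !== "") return text;
    // Only whitespace came back: fall through to the next source.
  }
  return (node.text || "").trim();
}

const elements = {
  span: { text: "text" },
  btn: { labelledby: ["span"] },
  btn2: { labelledby: ["btn"] },
  empty: { text: "" },
  press: { labelledby: ["empty"], text: "press me" },
};

const btnName = nameOf("btn", elements); // "text"
const btn2Name = nameOf("btn2", elements); // ""
const pressName = nameOf("press", elements); // "press me"
```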

Item d.

1. By user-managed value we mean the value of the accessible object. In the HTML <input> case it's built from the value attribute. In the case of ARIA that will be aria-valuetext, for example.

2. If the current node is initial node then value is not included.

<div role="slider" aria-valuetext="right in the middle"></div>

Name is empty for ARIA slider.

3. But if the current node is not initial node then value is included.

<label for="input">
    Position
    <div role="slider" aria-valuetext="right in the middle">
    </div>
</label>
<input id="input" type="checkbox">

Name of the checkbox is "Position right in the middle".

4. If the current node is the initial node and it was reentered then:

a. If the node is in the middle of the text equivalent computation then its value is included.

<label>
    Subscribe to
    <select>
        <option>ATOM</option>
        <option>RSS</option>
    </select>
    feed.
</label>

Name for select is "Subscribe to ATOM feed".

b. If the node is not in the middle of the text equivalent computation then its value is omitted.

<label>Home page: <input type="text"></label>

Name for input is "Home page:".
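
The value rules from this item can be tried out on a flattened label. This is a toy model, not the real traversal: `nameFromLabel` is a hypothetical helper, a label is reduced to an array of strings and control objects, and "first or last part" is approximated by the position in that array.

```javascript
// Sketch of item d on a flattened <label>: strings are text, objects are
// embedded controls carrying a user-managed value.
function nameFromLabel(labelChildren, control) {
  const parts = labelChildren.map((child, index) => {
    if (typeof child === "string") return child;
    const atEdge = index === 0 || index === labelChildren.length - 1;
    // The labelled control itself, reentered at the edge: value is omitted.
    if (child === control && atEdge) return "";
    // Any other control, or the labelled control in the middle: value included.
    return child.value;
  });
  return parts.join(" ").replace(/\s+/g, " ").trim();
}

const select = { value: "ATOM" };
const input = { value: "" };
const slider = { value: "right in the middle" };
const checkbox = { value: "" };

const selectName = nameFromLabel(["Subscribe to", select, "feed."], select);
// "Subscribe to ATOM feed."
const inputName = nameFromLabel(["Home page:", input], input);
// "Home page:"
const checkboxName = nameFromLabel(["Position", slider], checkbox);
// "Position right in the middle"
```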

Monday, October 29, 2012

ARIA payback

I'm not really part of ARIA spec development, but I work with assistive technology vendors and ARIA widget authors on ARIA support in Firefox; I'm the one who implements ARIA in Firefox. I'm focused on the practical aspects of ARIA usage, and I often deal with problems not addressed by the ARIA spec.

Here's how it usually works. We and AT developers discuss a problem, and after reaching agreement we implement a reasonable solution that works for AT, users and us. We don't always go to the ARIA group for feedback, and I think there are a number of reasons why. Personally, I probably don't go because:

I think somebody else could do that since it was a group decision after all.
I don't always get feedback from the ARIA group.
The ARIA group is perceived as a closed group (I always run into restricted areas the other participants are referred to).
The ARIA group structure feels complicated (there are two ARIA specs managed differently, and when you have a single ARIA issue you sometimes have to go through different authorities to get feedback).

I think it's because I haven't had a really good history of collaboration with the ARIA group.

On the other hand, I don't follow the ARIA spec's progress. I didn't see changelogs between spec versions. I wasn't really asked for feedback as the ARIA implementer in Firefox. Because of all this, many changes to the spec were introduced silently as far as I was concerned. I don't want to blame anyone (including myself); I just want to say that ARIA spec development was in a parallel universe for me.

Of course it couldn't go on forever: one day the Firefox implementation had to meet the spec at a crossing, and we had to get a bump. That day has come. The ARIA spec reached candidate recommendation, and we were told that Firefox doesn't pass the tests. As I went through the failing tests one by one, I realized I had concerns about half of them. It wasn't a really big surprise, but I disagreed with what the ARIA spec states in a number of cases. Here are a few examples where I had concerns.

1) ARIA abstract roles must not be exposed via the standard role mechanism (see the ARIA spec):

User agents MUST NOT map roles defined in the WAI-ARIA specification as "abstract" via the standard role mechanism of the accessibility API.

This seems very reasonable, since there is *no* reason at all why an AT would need them. But the statement might be colored differently if you read it as an implementer. If the browser exposes only known and "good" ARIA roles, then the browser is in good shape and complies with the spec. But the browser can take a different approach: expose all ARIA roles (whether known or unknown) and let the AT decide what to do with them. You can argue whether this is a good idea or not, but it can be used in the wild, for example by scripted JAWS and certain web apps (in this case the browser is just a mediator between the web app and the screen reader). The spec doesn't forbid it either.

However, if the browser follows this approach, we run into a problem, because the browser must know about abstract roles in order to ignore them. Abstract roles are a purely theoretical matter used to organize things in the spec; they are *not supposed* to be used on the web and they *won't* be used on the web, but the browser *must* know about them if it relies on the "expose any role" approach. In reality this slows down the browser for nobody's benefit. OK, that's a browser problem. The ARIA problem, I think, is that the ARIA spec tries to standardize things that *aren't* supposed to be used on the web, which is meaningless in general.
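
In code, the extra burden on an "expose any role" browser looks roughly like this. It is a sketch, not Firefox's implementation: `exposedRole` is a hypothetical function, and the set lists the abstract roles as I understand them from WAI-ARIA 1.0.

```javascript
// An "expose any role" browser still has to carry this list around,
// purely so it can refuse to expose these roles.
const ABSTRACT_ROLES = new Set([
  "command", "composite", "input", "landmark", "range", "roletype",
  "section", "sectionhead", "select", "structure", "widget", "window",
]);

// Returns the first role token that may be exposed, or null if there is none.
// Unknown tokens pass through untouched; the AT decides what to do with them.
function exposedRole(roleAttribute) {
  for (const token of roleAttribute.trim().split(/\s+/)) {
    if (token !== "" && !ABSTRACT_ROLES.has(token)) return token;
  }
  return null;
}

const custom = exposedRole("myapp-gauge"); // "myapp-gauge" (unknown, exposed as is)
const fallback = exposedRole("widget button"); // "button"
const abstractOnly = exposedRole("landmark"); // null
```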

2) aria-checked="mixed" on radios should be mapped to the "false" value (see the ARIA spec):

The mixed value is not supported on radio or menuitemradio or any element that inherits from these in the taxonomy, and user agents MUST treat a mixed value as equivalent to false for those roles.

I agree that a mixed value on radios doesn't make any sense, since radios don't support tristate. But "false" is not the fallback value on radios. That means the browser *must* introduce a special check for a case that has no *practical* usage on the web.
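
The special check in question is small but has to sit on the path of every checked-state lookup. A minimal sketch, with a hypothetical `mapAriaChecked` function and without modelling roles that inherit from radio:

```javascript
// Roles for which the spec says "mixed" must be treated as "false".
// Roles inheriting from these are not modelled in this sketch.
const RADIO_ROLES = new Set(["radio", "menuitemradio"]);

// Returns the checked state to expose, or undefined when the attribute
// value is missing or not one of the known tokens.
function mapAriaChecked(role, value) {
  if (value === "mixed" && RADIO_ROLES.has(role)) return "false"; // the special case
  return ["true", "false", "mixed"].includes(value) ? value : undefined;
}

const onCheckbox = mapAriaChecked("checkbox", "mixed"); // "mixed"
const onRadio = mapAriaChecked("radio", "mixed"); // "false"
```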

The problem is that candidate recommendation means nobody wants to change the spec at this point, especially if it's already implemented by some browsers. It doesn't really make sense to address any of the issues listed above in the next spec, since they don't make a difference on the web. It wouldn't be so bad to just follow the spec if we didn't have other discrepancies, especially those that *do* make a difference for the user.

The reality is that either Firefox silently picks up that burden, or it gets the yoke of being a browser incompatible with the spec. No good options, huh?

Monday, February 20, 2012

aria-hidden and role="presentation"

John Foliot pinged me about his blog post devoted to ARIA techniques used to hide content from assistive technologies. Since I don't have a straight answer, I decided to put my thoughts here.

presentation role

The ARIA role="presentation" technique is intended to hide an element from AT users. A classic example is presentational images. If you place role="presentation" on an image, it gets removed from the accessible tree. This technique can also be used to remove HTML table semantics, i.e. if role="presentation" is specified on an HTML table, the table structure is not exposed.

ARIA role="presentation" is completely ignored if it's used on a focusable element. That means the element isn't removed from the accessible tree and its native semantics are exposed. For example, if you put this role on an HTML button, it's exposed to AT as a normal HTML button. This is reasonable, because if the user focuses the button then focus doesn't get lost and the screen reader announces something meaningful.
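
The rule reads naturally as a tiny decision function. This is a sketch with made-up names (`exposeElement`, `nativeRole`, `focusable`) over plain objects, not the actual tree construction code.

```javascript
// Sketch: role="presentation" prunes the element unless it is focusable,
// in which case the role is ignored and native semantics win.
function exposeElement(element) {
  const presentational = element.role === "presentation";
  if (presentational && !element.focusable) return null; // not in the tree
  const role = presentational
    ? element.nativeRole
    : element.role || element.nativeRole;
  return { role };
}

const image = exposeElement({
  nativeRole: "img",
  role: "presentation",
  focusable: false,
}); // null

const button = exposeElement({
  nativeRole: "button",
  role: "presentation",
  focusable: true,
}); // { role: "button" }
```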

aria-hidden

The most noticeable difference from role="presentation" is that aria-hidden affects the whole subtree. John gave a good use case: hiding excess links from screen reader users. A common pattern is a clickable image inside an HTML:a element. The ARIA spec confirms this, saying "Authors MAY, with caution, use aria-hidden to hide visibly rendered content from assistive technologies only if the act of hiding this content is intended to improve the experience for users of assistive technologies by removing redundant or extraneous content."

I should note that ARIA is not designed to change visual presentation or affect behavior. Thus, if this is an ordinary link, no ARIA technique can be used to hide it from the screen reader user, since that contradicts the ARIA design. But if the author puts @tabindex="-1" on the link to make it unfocusable, then aria-hidden looks like a proper way to achieve the desired result.

Technical side

The ARIA implementation guide allows removing elements with aria-hidden from the accessible tree, but it doesn't require it. It also states: "If the object is in the accessibility tree, map all attributes as normal. In addition, expose object attribute hidden:true". Additionally, it points out that if the aria-hidden attribute changes, the browser should emit an attribute_changed event (in the case of IAccessible2). That's exactly what Firefox does.

So Firefox does a minimal implementation of aria-hidden, allowing a screen reader to do what it thinks it should. That means each screen reader has to reinvent the wheel, and that's actually what happens (or doesn't): check this table to see how Firefox differs depending on the screen reader running.

Implementing aria-hidden on both the browser side and the screen reader side is not straightforward. For example, if an element with aria-hidden contains a focusable element, then its *whole* subtree can't be ignored. At first glance, a reasonable solution would be for the browser not to create an accessible for any element in the subtree unless the element is focusable; in other words, it treats aria-hidden as if role="presentation" were specified on each node in the subtree. Otherwise the AT needs to crawl the accessible tree to check whether there's an element with aria-hidden in the ancestor chain. The current Firefox implementation forces AT to do that.

Things to think about.

If the browser doesn't create accessibles for an aria-hidden subtree, then there is a black box with certain dimensions on the screen. Thus, if the user investigates the page layout with the mouse, the screen reader says nothing when the mouse pointer is over that black box, but if the user clicks in this area he gets unexpected behavior.

On the other hand, some screen magnifiers use the dimensions of accessible objects for page zooming. With no accessible, zoom is likely broken.

So at second glance I think aria-hidden shouldn't change the tree at all. Instead, the browser should expose the hidden:true object attribute on every accessible in the subtree of the aria-hidden element. That lets ATs decide whether they want to ignore the accessible or not, and saves them from crawling the accessible hierarchy.
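
A sketch of that proposal: the tree keeps its shape, and the flag is pushed down so that each accessible answers for itself. The node shape and the `buildAccessibleTree` name are assumptions for illustration only.

```javascript
// Sketch: aria-hidden does not prune anything; instead every accessible in
// the subtree carries hidden:true, so an AT never has to walk up the ancestors.
function buildAccessibleTree(node, inheritedHidden = false) {
  const hidden = inheritedHidden || node.ariaHidden === "true";
  return {
    name: node.name,
    objectAttributes: hidden ? { hidden: "true" } : {},
    children: (node.children || []).map((child) =>
      buildAccessibleTree(child, hidden)
    ),
  };
}

const tree = buildAccessibleTree({
  name: "page",
  children: [
    {
      name: "sidebar",
      ariaHidden: "true",
      children: [{ name: "duplicate link" }],
    },
    { name: "main" },
  ],
});

const sidebar = tree.children[0];
const link = sidebar.children[0];
console.log(link.objectAttributes.hidden); // true
```

The tree still has three levels, so hit testing and magnifiers keep working, while a screen reader can skip anything flagged hidden.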

I'd say this rule should apply to role="presentation" as well. So if the user explores the page with the mouse, the screen reader can say this is a presentational image but exclude it from keyboard navigation.

Thursday, November 17, 2011

ARIA autocomplete implementation insight

An autocomplete widget is a text field with an associated list of options, so the user can type a value into the text field or choose one from the available options. Basically, autocomplete is a variety of the combobox control, and the user expects it to behave similarly to a combobox; for example, keyboard shortcuts should work mostly the same way.

ARIA allows the author to create autocomplete widgets by putting the aria-autocomplete attribute on an element with role="textbox" or role="combobox". There are examples on the web, for instance, here or here.

Usually authors prefer to use the HTML input control as the base of an autocomplete widget because they get the typing implementation for free. All they need is to implement the autocomplete list, navigation through list options, and support for standard combobox shortcuts.

When the user navigates the autocomplete list, he's able to start typing to adjust the list of available options. From the implementation point of view, the author tends to keep DOM focus on the text field. That makes sense, because if he kept it on an option element or the autocomplete list, he'd need to implement typing on his own. But what is the perceived focus in this case? I'd say the focus is on the currently traversed option, but when the user starts typing, focus goes into the text field, and vice versa. It's quite similar to comboboxes: when the user navigates options, focus is on the option; if the user dismisses the popup, focus goes to the combobox itself.

How does it look from the AT perspective? When the user navigates through options, the AT should announce where the user is. For that the author can manage DOM focus with the tabindex technique, use ARIA live regions, or try something else. The tabindex technique is really great because it maps onto the accessibility focus concept. The focus concept exists in all AT APIs, which makes it universal, and all ATs support it very well. Whenever focus changes, it gets announced to the user. The ARIA live regions technique is good too, because it's well supported by modern ATs.

The reality is that the author doesn't want to manage DOM focus because it means special support for typing, and he doesn't want to use ARIA live regions because that's hacky, complicated and sort of weird. So we end up with something else. ARIA provides one more technique, called active descendant, which is supposed to be mapped to accessible focus but is not restricted to it. So this one looks like something authors could rely on.

The author wants to keep DOM focus on the text field, so he sets the aria-activedescendant attribute on the autocomplete list element, pointing to the currently traversed option. The ARIA implementation guide states that an aria-activedescendant attribute change results in an accessible focus event on the referenced descendant if and only if the container has DOM focus. Since DOM focus is on the text field, there's no focus event. The author can't manage aria-activedescendant on the text field, since the autocomplete popup can't be a child of the text field due to markup restrictions, and strictly speaking it shouldn't be required to be a logical child (aria-owns). Actually, the relation between the autocomplete and its popup should be described by the aria-controls attribute, but the ARIA spec doesn't allow mapping aria-activedescendant changes into focus events for elements in this relation.

So what do we have? Some AT APIs (like IAccessible2) have a concept of active descendant, and the author could hope that ATs are smart enough to pick up active descendant changes and announce them to the user. The reality is that the active descendant concept is well supported neither by browsers nor by ATs. Some AT APIs (like MSAA) don't have it. I'm not aware of other use cases for this concept, which makes me think that active descendant is going to require special support from ATs. But ATs like focus; they don't want to support new techniques when there's a working old one. Autocomplete widgets are not special, and user interaction can be described in focus terms. That's how the Firefox awesome bar works.

What can we do? I think the ARIA spec should be changed to extend the rules for mapping active descendant into accessible focus. The following proposal sounds reasonable to me: allow ARIA menu, listbox and tree widgets controlled (aria-controls) by the widget that has DOM focus to manage accessible focus via the aria-activedescendant technique. For example:

<input aria-autocomplete="list" aria-controls="autocompletelist">
<ul role="listbox" id="autocompletelist" aria-activedescendant="option1">
<li role="option" id="option1">first option</li>
<li role="option" id="option2">second option</li>
</ul>

In this case AT should report accessible focus on the first option when the text field has DOM focus.
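
The proposed mapping can be sketched as a lookup that starts from the element with DOM focus. This illustrates the proposal only and is not behavior of any spec or browser; `accessibleFocusTarget` and the `page` map of plain objects are hypothetical.

```javascript
// Widgets that, under the proposal, may drive accessible focus through
// aria-activedescendant while being merely controlled by the focused element.
const WIDGET_ROLES = new Set(["menu", "listbox", "tree"]);

function accessibleFocusTarget(elements, domFocusId) {
  const focused = elements[domFocusId];
  // Existing rule: the container with DOM focus names its active descendant.
  if (focused.activedescendant) return focused.activedescendant;
  // Proposed rule: follow aria-controls to a menu, listbox or tree.
  const controlled = focused.controls ? elements[focused.controls] : undefined;
  if (
    controlled &&
    WIDGET_ROLES.has(controlled.role) &&
    controlled.activedescendant
  ) {
    return controlled.activedescendant;
  }
  return domFocusId; // nothing applies: accessible focus follows DOM focus
}

const page = {
  field: { role: "textbox", controls: "autocompletelist" },
  autocompletelist: { role: "listbox", activedescendant: "option1" },
  option1: { role: "option" },
  option2: { role: "option" },
};

const target = accessibleFocusTarget(page, "field"); // "option1"
```

When the list has no active descendant, for example while the user is typing, the lookup falls back to the text field, which matches the perceived focus described earlier.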