# Audio Focus Handling

A MediaSession collects all audio-producing objects in one tab. It is usually
unpleasant when multiple MediaSessions play sound at the same time. Audio focus
handling manages the MediaSessions and determines how their audio is mixed.
This is part of the default media session on desktop project.

[TOC]

## Processing model

### Audio focus types

There are two audio focus types: "persistent" and "transient".

* Persistent audio focus is used for long media playback. Persistent playbacks
  should not mix with each other: when one starts to play, it should pause all
  other playbacks.
* Transient audio focus is used for short media playback, such as a ping for an
  incoming message. When a transient playback starts, it should play on top of
  other playbacks, and the other playbacks should duck (have their volume
  reduced).

### `MediaSession`

Audio-producing objects should join a `MediaSession` when they want to produce
sound. A `MediaSession` has the following states:

* ACTIVE: the `MediaSession` has audio focus and its audio-producing objects can
  play.
* SUSPENDED: the `MediaSession` does not have audio focus. All audio-producing
  objects are paused and can be resumed when the session gains audio focus.
* INACTIVE: the `MediaSession` does not have audio focus, and there are no
  audio-producing objects in this `MediaSession`.

In addition, `MediaSession` has a `DUCKING` flag, which means its managed
audio-producing objects have lowered volume. The flag is orthogonal to the
`MediaSession` state.

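The focus types, the three states, and the separate `DUCKING` flag could be modeled roughly as follows. This is only an illustrative Python sketch: the names `AudioFocusType`, `State`, and `SessionStatus` are hypothetical and are not taken from the actual implementation.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AudioFocusType(Enum):
    PERSISTENT = auto()  # long playback; other playbacks are paused
    TRANSIENT = auto()   # short playback; other playbacks duck


class State(Enum):
    ACTIVE = auto()
    SUSPENDED = auto()
    INACTIVE = auto()


@dataclass
class SessionStatus:
    """Hypothetical holder for a session's state and ducking flag."""
    state: State = State.INACTIVE
    ducking: bool = False  # kept separate because it is orthogonal to `state`


status = SessionStatus()
status.state = State.ACTIVE
status.ducking = True           # an ACTIVE session can be ducked
status.state = State.SUSPENDED  # changing the state leaves the flag untouched
```

Keeping the flag outside the state enum avoids having to enumerate combinations such as "active and ducking" as separate states.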
### `AudioFocusManager`

`AudioFocusManager` is a global instance which manages the state of
`MediaSession`s. It is used on platforms that, unlike Android, do not have a
system audio focus.

When an audio-producing object wants to play audio, it should join a
`MediaSession` and specify which audio focus type it requires. The
`MediaSession` will then request audio focus from `AudioFocusManager`, and will
allow the object to play sound if the request succeeds. `AudioFocusManager`
will notify other `MediaSession`s if their states change.

When an audio-producing object stops playing audio, it should be removed from
its `MediaSession`, and the `MediaSession` should abandon its audio focus if it
has no audio-producing objects left. `AudioFocusManager` will notify other
`MediaSession`s of state changes if necessary.

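The join and leave flow described above might look like the following Python sketch. `FakeFocusManager`, `Session`, `add_player`, and `remove_player` are hypothetical names chosen for illustration, and the fake manager simply grants every request and records the calls it receives.

```python
class FakeFocusManager:
    """Hypothetical stand-in that grants every request and logs the calls."""

    def __init__(self):
        self.log = []

    def request_focus(self, session, focus_type):
        self.log.append(("request", session.name, focus_type))
        return True

    def abandon_focus(self, session):
        self.log.append(("abandon", session.name))


class Session:
    def __init__(self, name, manager):
        self.name = name
        self.manager = manager
        self.players = set()

    def add_player(self, player, focus_type):
        """Join the session; return True if the player may produce sound."""
        if not self.manager.request_focus(self, focus_type):
            return False
        self.players.add(player)
        return True

    def remove_player(self, player):
        """Leave the session; abandon focus once no players remain."""
        self.players.discard(player)
        if not self.players:
            self.manager.abandon_focus(self)


manager = FakeFocusManager()
tab = Session("tab", manager)
tab.add_player("video", "persistent")
tab.add_player("ping", "transient")
tab.remove_player("video")  # "ping" is still present, so focus is kept
tab.remove_player("ping")   # last player is gone, so focus is abandoned
```

Note that the session, not the individual player, talks to the manager, which matches the description of `MediaSession` requesting and abandoning focus on behalf of its objects.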
## The algorithm for handling audio focus

`AudioFocusManager` uses a stack implementation. It keeps track of all
ACTIVE/SUSPENDED `MediaSession`s. When a `MediaSession` requests audio focus, it
is put at the top of the stack, and it is removed from the stack when it
abandons audio focus.

The algorithm is as follows:

* When a `MediaSession` requests audio focus:

  * Remove it from the audio focus stack if it's already there, place it at
    the top of the audio focus stack, grant focus to the session and let it
    play.
  * If the session is persistent, suspend all the other sessions on the stack.
  * If the session is transient, we should duck any active persistent audio
    focus entry if present:

    * If the next top entry is transient, do nothing, since if there is any
      persistent session that is active, it is already ducking.
    * If the next top entry is persistent, let the next top entry start ducking,
      since it is the only active persistent session.

* When a `MediaSession` abandons audio focus:

  * If the session is not on the top, just remove it from the stack.
  * If the session is on the top, remove it from the stack.

    * If the stack becomes empty, do nothing.
    * If the next top session is transient, do nothing.
    * If the next top session is persistent, stop ducking it.

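The request and abandon steps listed above can be sketched in Python as below. This is a minimal illustration that follows only the listed steps, not the actual implementation: the class names, the string constants, and marking an abandoning session INACTIVE are assumptions. The text also does not say whether a re-requesting session's ducking flag is cleared, so the sketch leaves it unchanged.

```python
class Session:
    def __init__(self, name):
        self.name = name
        self.focus_type = None
        self.state = "INACTIVE"
        self.ducking = False


class FocusStack:
    """Hypothetical stack-based manager; the last list element is the top."""

    def __init__(self):
        self.stack = []

    def request(self, session, focus_type):
        # Move the session to the top, removing any earlier entry first.
        if session in self.stack:
            self.stack.remove(session)
        below = self.stack[-1] if self.stack else None
        self.stack.append(session)
        session.focus_type = focus_type
        session.state = "ACTIVE"
        if focus_type == "persistent":
            # A persistent session suspends everything else on the stack.
            for other in self.stack[:-1]:
                other.state = "SUSPENDED"
        elif below is not None and below.focus_type == "persistent":
            # Transient on top of persistent: the persistent entry ducks.
            below.ducking = True

    def abandon(self, session):
        if session not in self.stack:
            return
        was_top = self.stack[-1] is session
        self.stack.remove(session)
        session.state = "INACTIVE"  # assumption: it has no players left
        # Only a persistent session that becomes the new top stops ducking.
        if was_top and self.stack and self.stack[-1].focus_type == "persistent":
            self.stack[-1].ducking = False


music, ping = Session("music"), Session("ping")
focus = FocusStack()
focus.request(music, "persistent")
focus.request(ping, "transient")  # music keeps playing but ducks
ducked_during_ping = music.ducking
focus.abandon(ping)               # music is the top again and stops ducking
```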
### Handling Pepper

Pepper differs from media elements in its playback model: it cannot be paused,
but its volume can be changed. When considering Pepper, the above algorithm
must be modified.

When Pepper joins a `MediaSession`, it should request the persistent focus
type. When `AudioFocusManager` wants to suspend a `MediaSession`, it must check
whether the session has a Pepper instance and, if so, duck the session instead.

Also, whenever a session abandons focus and the next top session is INACTIVE,
`AudioFocusManager` should find the next session having Pepper and unduck it.
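
One possible reading of the two Pepper rules is sketched below in Python. The `has_pepper` attribute and both helper functions are hypothetical, and searching from the top of the stack downward for "the next session having Pepper" is an assumption about the intended order.

```python
class Session:
    def __init__(self, name, has_pepper=False, state="ACTIVE"):
        self.name = name
        self.has_pepper = has_pepper
        self.state = state
        self.ducking = False


def suspend_or_duck(session):
    """Suspend a session, or duck it if it hosts a Pepper instance."""
    if session.has_pepper:
        session.ducking = True  # Pepper cannot be paused, only made quieter
    else:
        session.state = "SUSPENDED"


def unduck_after_abandon(stack):
    """Call after the abandoning session was removed; stack[-1] is the top."""
    if not stack or stack[-1].state != "INACTIVE":
        return
    # Assumption: walk from the top downward to the first Pepper session.
    for session in reversed(stack):
        if session.has_pepper:
            session.ducking = False
            return


flash = Session("flash", has_pepper=True)
video = Session("video")
for s in (flash, video):
    suspend_or_duck(s)
ducked = flash.ducking      # flash was ducked rather than suspended
idle = Session("idle", state="INACTIVE")
unduck_after_abandon([flash, video, idle])  # idle is the new top
```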