Functional Role of Temporal Patterning of Articulation in Speech Production
Journal of Speech, Language, and Hearing Research • Vol. 65 • 4577–4607 • December 2022 • Copyright © 2022 American Speech-Language-Hearing Association 4577
of speech (Liberman & Prince, 1977; Ramus et al., 2000). Producing intelligible speech requires careful planning and sequencing of articulatory motor programs to generate temporally structured sound sequences that are perceptually parsable into linguistic units. The functional significance of temporal patterning of speech is well supported by empirical findings in the psychoacoustic literature. In the acoustic domain, temporal patterning of speech is reflected by the slow modulation envelope of the waveform (Drullman, 1995). Converging evidence has been drawn from numerous perceptual experiments revealing a detrimental effect of reduced or distorted slow temporal modulation of the acoustic envelope on the functional speech outcomes (e.g., speech intelligibility), whereas spectral distortions of the fine structure can be tolerated to a much greater degree (Drullman et al., 1994, 1996; Ghitza, 2012). These findings corroborate the critical role of temporal patterning of speech in functional communication from the perception perspective.

Despite the perceptual salience of temporal patterning of speech, the current literature on the temporal aspects of speech production, especially of speech production disorders, leans heavily toward segmental features such as vowel and consonant durations (Liss et al., 2009; Tjaden, 2007; Weismer et al., 2001) and relative timing between segments (Romö et al., 2022), which provide limited insights into the global temporal organization of speech production. Although theoretical models such as articulatory phonology/task dynamics (AP/TD; Saltzman, 1986; Saltzman & Munhall, 1989) and directions into velocities of articulators (DIVA; Guenther, 1995; Guenther et al., 2006) offer explanatory accounts for how neuromotor processes related to speech production evolve in time, little empirical evidence exists about how these processes (a) independently and collectively contribute to temporal patterning of the speech outcomes and (b) are impacted by neurological, physiological, and biomechanical disturbances in the speech production system (e.g., due to neuromotor speech disorders). The lack of insights into how speech production behaviors are temporally organized in relation to the speech outcomes poses a major challenge to understanding whether the temporal patterning principles established in the perception literature also hold for the production mechanism. Addressing this challenge is critical to link the global temporal aspects of the production and perception mechanisms, which would, in turn, inform an integrative understanding of the temporal dynamics of speech sensorimotor processing and interaction. It also has clinical implications in guiding time-based rehabilitation to facilitate temporal regulation of speech production and, in turn, improve functional speech outcomes in individuals with speech production disorders.

To address the challenge above, in the following sections, we first draw neurobehavioral evidence from the literature to lay the theoretical foundation for linking temporal patterning of speech to functional communication from both production and perception perspectives. We then focus on the production side and propose an entrainment-based perspective drawing insights from the AP/TD model to conceptualize how neuromotor speech disorders can impair the neurophysiological processes controlling speech timing and, in turn, impact temporal patterning of speech production. Last, we identify several gaps in empirical testing of the theorized connections between neuromotor speech disorders, temporal patterning of speech production, and functional communication to motivate the primary aims and methodology of this study. Furthermore, we pose an exploratory question, that is, how to manipulate and improve temporal patterning of speech production in neurologically impaired individuals, with the aim of providing insights into the management of neuromotor speech disorders.

Relationship Between Temporal Patterning of Speech and Functional Communication: Theoretical Foundation From Both Production and Perception Perspectives

The temporal structure of natural speech, as a manifestation of the rhythmic profile of a language, is of high regularity (Turk & Shattuck-Hufnagel, 2013). The rhythmic profile of a language is hierarchical in nature, consisting of multiple nested subcomponents (also referred to as linguistic rhythms), each reflecting the recurrence of a linguistic unit at a specific timescale (Liberman & Prince, 1977; Mai et al., 2016; Poeppel, 2003). The functionally most important linguistic rhythms occur in the low-frequency range, including several crucial frequency bands centered around 1–3, 4–7, 13–25, and 25–35 Hz, representing the periodicity of prosodic units, syllables, onset–rime (i.e., consonant–vowel transition in a syllable), and phonemes, respectively (Leong & Goswami, 2015). From a neurodynamic perspective, a growing body of theoretical and empirical evidence has converged on the view that the temporal regularity of speech stems from the entrainment of the speech production and perception systems to the periodicity of linguistic units (Goldstein, 2019; Poeppel, 2003; Poeppel & Assaneo, 2020). Here, entrainment refers to the phase locking between two systems, allowing the frequency of one system to entrain the frequency of another (Roenneberg et al., 2003).

Within the auditory perception system, speech processing and comprehension have been shown to occur at multiple discrete timescales, consistent with the hierarchy of linguistic rhythms (Mai et al., 2016; Poeppel, 2003). This multiscale temporal processing is achieved by entrainment of neural oscillations in the auditory cortex to the modulation envelope of the acoustic signal at linguistically relevant timescales. Importantly, entrainments of delta-, theta-, beta-, and gamma-band auditory oscillations
allow the auditory cortex to sample and parse a speech time series into prosodic, syllabic, onset–rime, and phonemic units, respectively (Giraud & Poeppel, 2012; Riecke et al., 2018). This time-based auditory processing mechanism, known as auditory entrainment (Ding & Simon, 2014; Thaut, 2003), underlies the temporal organization of speech from the perception perspective. Disrupted temporal patterning of speech, for example, by altering the periodicity of acoustic envelope modulation, can impair the ability of the perception system to parse and interpret linguistic information at appropriate temporal granularities, thereby degrading speech intelligibility (Drullman, 1995; Drullman et al., 1994, 1996; Ghitza, 2012).

In the dynamical systems account of speech production, speech articulators are viewed as spring-mass oscillators (Saltzman, 1986; Saltzman & Byrd, 2000; Saltzman & Munhall, 1989; van Lieshout, 2004). The oscillatory movement of an articulator is characterized by a preferred temporal rhythm, which is determined by the interplay of the biomechanical and neural oscillatory properties of the articulatory system. Remarkably, the preferred temporal rhythms of the articulatory system are largely in line with the auditory rhythms. For example, the frequency of jaw oscillation during natural connected speech is around 4 Hz (Ohala, 1975), which is aligned with the theta rhythm of auditory oscillations for parsing and decoding syllables. Such a rhythmic synchronization between articulatory and auditory activities enables the speech production system to package and convey linguistic information at appropriate temporal granularities by entraining articulatory motor activities to auditory-perceptually encoded linguistic rhythms (e.g., syllable), thereby optimizing the efficiency of linguistic information exchange during communication. This establishes the theoretical foundation of temporal patterning of speech, linking the production and perception mechanisms, both entrained to common underlying linguistic rhythms, to the goal of functional communication. Confirmatory empirical evidence is still needed from the production side to substantiate how specific speech production behaviors (e.g., articulatory movements) are entrained to the hierarchy of linguistic rhythms as evidenced in auditory-perceptual activities.

Impact of Neuromotor Disorders on Temporal Patterning of Speech: Theoretical Groundwork Based on Articulatory Entrainment

To evaluate the theorized functional significance of temporal patterning of speech, most empirical studies so far have used a bottom-up approach, for example, by manipulating the modulation envelope of acoustic signals and evaluating its perceptual consequences (Drullman et al., 1994, 1996; Ghitza, 2012). Little empirical research has been done from the production side. Because of this gap, the source of distortions in acoustic envelope modulation identified as contributing to the degradation of the functional speech outcomes in perceptual studies remains poorly understood. As the integrative output of the speech production system, the acoustics of speech is shaped by the coordinated activities of multiple underlying articulators including the tongue, jaw, lips, and pharynx. It is thus conceivable that impaired timing of articulation, characteristic of many neuromotor speech disorders (Duffy, 2013), may disrupt temporal patterning of the acoustic signal and, in turn, degrade the functional speech outcomes.

To conceptualize how neuromotor speech disorders can impact the timing of articulation and, in turn, disrupt temporal patterning of speech, we draw insights from theoretical models of speech production (Guenther, 1994, 1995; Saltzman & Byrd, 2000; Saltzman et al., 2008; Šimko & Cummins, 2010, 2011; Windmann et al., 2015). Although methodological disparities exist between different models, with the major disparity lying in whether temporal information is specified for the entire articulatory movement trajectory (e.g., in the DIVA model; Guenther, 1995; Guenther et al., 1998) or at discrete time points signifying the onset of articulatory events (e.g., in the AP/TD model; Saltzman, 1986; Saltzman & Munhall, 1989), there is a general consensus that the temporal organization of speech production emerges from the interaction of the central clock, which reflects the planning process in generating trigger pulses to initiate motor responses (Wing & Kristofferson, 1973), and the state of the articulatory system. Because AP/TD currently provides the most comprehensive account of the temporal aspects of speech production, below we use the AP/TD framework to lay the theoretical groundwork for the impact of neuromotor speech disorders on timing of articulation.

In AP/TD, the central clock is modeled as an ensemble of planning oscillators, which embed the temporal information in linguistic events into motor plans (Saltzman et al., 2008; Šimko & Cummins, 2011; Windmann et al., 2015). These oscillators can be categorized into (a) gestural planning oscillators, which gate the activation of individual articulatory gestures, and (b) suprasegmental planning oscillators, which specify the higher level prosodic structure composed of a hierarchy of nested constituents such as syllables, feet, words, and phrases. Within this framework, gestures are articulatory representations of phonological segments, which are defined in an abstract manner by the location and degree of local constrictions in the vocal tract (Saltzman, 1986; Saltzman & Munhall, 1989). Each gesture is governed by a set of physical articulators, which are functionally coupled to one another and act synergistically to form and release vocal tract constrictions (Saltzman, 1986; Saltzman & Munhall, 1989). Gestural activation patterns (as specified by the gestural planning oscillators) are
guiding the timing of speech interventions (Ball et al., 2001, 2002; Green et al., 2013). Importantly, speech intelligibility and speaking rate exhibit different declining trajectories, providing complementary information about the course of ALS progression, with speaking rate showing earlier and faster declines during the early stage and intelligibility manifesting accelerated declines as the disease progresses to the late stage (Rong, Yunusova, & Green, 2015; Yorkston et al., 1993). To capture the global pattern of functional speech decline over the course of ALS, we combined speech intelligibility and speaking rate into an integrative functional speech index, intelligible speaking rate, defined as the number of intelligible words per minute (WPM). This index has proven to reflect the communication efficiency of individuals with dysarthria, providing a more comprehensive measure of functional speech performance across a wider range of severity than intelligibility alone (Yorkston & Beukelman, 1981). Furthermore, as conceptualized above, one theoretical hypothesis of temporal patterning of speech pertains to optimizing the efficiency of conveying linguistic information during communication. Intelligible speaking rate thus provides a functional speech outcome to target the hypothesized relation of temporal patterning of speech with communication efficiency.

An Integrative Approach to Characterizing Temporal Patterning of Articulation: Intra- and Interarticulator Modulation at Three Hierarchically Nested, Linguistically Relevant Timescales

According to the theoretical foundation for temporal patterning of speech production and perception, articulatory motor activities are presumed to occur at the same hierarchically organized timescales as previously reported in auditory-perceptual activities, encoding the underlying linguistic rhythms. To experimentally investigate such temporal patterning of articulation, we adopted the temporal modulation analysis from the psychoacoustic literature (Leong & Goswami, 2015; Leong et al., 2017) to characterize the modulation patterns of articulatory activities at three hierarchically nested, linguistically relevant timescales (delta, 0.9–2.5 Hz; theta, 2.5–12 Hz; and beta/gamma, 12–40 Hz), reflecting the rhythms of prosodic stress, syllable, and onset–rime/phoneme, respectively. It is well established that the periodicity contents in a time series are represented by the modulus (i.e., magnitude) and phase of the relevant temporal modulation patterns (Todd et al., 2002). As such, articulatory entrainment to the periodicity contents in the hierarchy of linguistic rhythms can be investigated by the magnitude of articulatory modulation at each linguistically relevant timescale and the relative phase between articulation modulations at different timescales. In accordance with this notion, a variety of intra-articulator modulation features were derived to characterize the magnitude and relative phase of temporal modulation of each articulator within and across the target timescales. Moreover, because the AP/TD model treats functionally related articulators as coordinative structures that are activated synergistically during speech (Saltzman & Byrd, 2000; Saltzman & Munhall, 1989), these articulators are expected to be entrained to common periodicity contents in linguistic rhythms. To characterize the functional coupling between these articulators, a set of interarticulator modulation features were extracted to measure the coherence of temporal modulation patterns between functionally related articulators at each target timescale. The combination of these intra- and interarticulator modulation features provides a hierarchical, quantitative characterization of articulatory entrainment to different linguistic rhythms, enabling an integrative understanding of temporal patterning of articulation.

Behavioral Means of Modifying Temporal Patterning of Articulation: Speaking Rate Manipulation

Based upon the theorized relationships between neuromotor speech disorder, temporal patterning of articulation, and functional speech outcomes, a logical follow-up question to ask is how the disorder-related changes in temporal patterning of articulation can be modulated, for example, by behavioral modifications as widely used in the management of neuromotor speech disorders, to improve the global temporal organization of speech production. To address this exploratory question, we selected speaking rate manipulation, a common behavioral modification for individuals with neuromotor speech disorders (Yorkston, Hakel, et al., 2007), as a triggering method to modulate temporal patterning of articulation. The functional significance of speaking rate lies in the fact that it is not just a descriptive feature of speech but also a control variable that modulates global functioning of the speech production system (Guenther et al., 2006; Saltzman & Byrd, 2000; Tourville & Guenther, 2011; Turk & Shattuck-Hufnagel, 2014). The modulatory effect of rate manipulation is corroborated by a wide range of empirical findings showing rate-elicited changes in acoustic performance, including vowel formant frequencies, diphthong formant transition trajectories, and consonant spectral moments (Tjaden & Wilding, 2004; Turner et al., 1995; Weismer et al., 2000), as well as in articulatory performance, including duration, displacement, velocity, and variability of articulatory movement (Kleinow et al., 2001; Kuruvilla-Dugdale & Mefferd, 2017; McClean, 2000; McClean & Tasko, 2003; McHenry, 2003; Mefferd et al., 2014; Rong, 2020). These findings provide the evidence base for speaking rate manipulation as a global behavioral modification for managing neuromotor speech disorders.

It is important to note that, despite being a global behavioral modification, the effect of rate manipulation on articulatory performance has been found to be largely idiosyncratic, leading to variable changes in one or more
Table 1. Demographic and clinical characteristics of participants.
Subject ID | Gender | Age (years) | Onset | DaysSinceDiag | MoCA | SIT_Intell | SIT_SR | SIT_IntellRate | Group
Note. Descriptive statistics for groups are provided as mean (standard deviation) at the bottom of the table. Em dashes indicate “not applicable.” Subject ID: ALS = amyotrophic lateral sclerosis; HC = healthy control. Gender: M = male; F = female. Onset: B = bulbar; C = cervical; N = neck; L = lumbar. DaysSinceDiag = disease duration in days since diagnosis. MoCA = Montréal Cognitive Assessment score (range: 0–30). SIT_Intell = speech intelligibility in percentage of intelligible words, as assessed by the Sentence Intelligibility Test (SIT). SIT_SR = speaking rate in words per minute (WPM), as assessed by the SIT. SIT_IntellRate = intelligible speaking rate in WPM, derived as the product of SIT_Intell and SIT_SR. Group: ALS + S = individuals with ALS who presented with overt clinical speech symptoms; ALS − S = individuals with ALS without overt clinical speech symptoms; Control = healthy controls; n/a = not applicable.
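
As a worked illustration of how the SIT_IntellRate column relates to the two columns before it, the following minimal Python sketch multiplies intelligibility by speaking rate. It assumes the percentage is converted to a fraction before multiplying, so that the result stays in WPM; the function name and the example values are hypothetical and are not taken from Table 1.

```python
def intelligible_speaking_rate(intelligibility_pct, speaking_rate_wpm):
    """Return intelligible words per minute.

    Assumption: SIT_Intell is a percentage (0-100), so it is divided by 100
    before being multiplied by SIT_SR (words per minute).
    """
    if not 0.0 <= intelligibility_pct <= 100.0:
        raise ValueError("intelligibility must be a percentage between 0 and 100")
    if speaking_rate_wpm < 0.0:
        raise ValueError("speaking rate must be non-negative")
    return (intelligibility_pct / 100.0) * speaking_rate_wpm


# Hypothetical speaker: 90% of words transcribed correctly at 150 WPM.
example_rate = intelligible_speaking_rate(90.0, 150.0)
print(f"{example_rate:.1f} intelligible WPM")
```

Under this reading, a speaker who slows down while staying fully intelligible and a speaker who keeps a normal rate but loses intelligibility can end up with the same index, which is why the index is described as a measure of communication efficiency.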
provides an overview of the demographic and clinical characteristics of the participants.

The participants with ALS were recruited from the multidisciplinary ALS clinic at The University of Kansas Medical Center and had been seen by a speech-language pathologist (the second author) who evaluated their speech function following standard clinical guidelines. Based on the clinical speech evaluation, six participants with ALS manifested overt clinical speech symptoms consistent with flaccid–spastic dysarthria, and seven were asymptomatic. Regardless of the presence of clinical speech symptoms, all participants with ALS (along with their caregivers when applicable) reported experiencing speech/voice changes since the disease onset, according to a questionnaire survey of self-perceived problems with bulbar (i.e., speech and swallowing) motor function. Given that ALS is a neurodegenerative disease known to have a long prodromal phase, detecting subclinical changes preceding the clinical symptom onset is of critical importance in early intervention to slow disease progression and improve patients’ quality of life (Green et al., 2013; Yorkston et al., 2002). Inclusion of both patients with and without overt clinical symptoms can therefore provide a heterogeneous sample of the prodromal and symptomatic phases of bulbar disease (i.e., a hallmark feature of ALS characterized by the involvement of the bulbar musculature) to assess the utility of the proposed approach in detecting temporal speech deficits throughout the disease course.

Functional Speech Outcomes

A standard functional speech assessment, the Sentence Intelligibility Test (SIT; Yorkston, Beukelman, et al., 2007), was administered to all participants. In this test, participants read 11 randomly generated sentences of five to 15 words. Speech was digitally recorded at 22.05 kHz and later orthographically transcribed and timed by two naïve listeners, using the built-in software for the SIT. The listeners were undergraduate students in the speech-language pathology major, who (a) were native speakers of American English; (b) had normal speech, language, hearing, and cognitive functions; and (c) were unfamiliar with either the speech stimuli or the speaker characteristics. Based on the perceptual response of each listener, the percentage of words correctly transcribed and number of WPM were calculated and then averaged across listeners as measures of speech intelligibility and speaking rate, respectively. Interlistener reliability was shown to be high in our prior work (Rong & Heidrick, 2021), with a significant strong correlation and minimal discrepancy between the two listeners for both intelligibility (r = .86, interlistener discrepancy = 1.02% ± 1.44%) and rate (r = .99, interlistener discrepancy = 3.15 ± 2.03 WPM). Intelligible speaking rate was then calculated by multiplying speech intelligibility by speaking rate. The SIT-derived speech intelligibility and intelligible speaking rate served as the functional speech outcomes to correlate with the modulation features of articulation in the first aim.

Speech Production Experiment

Experimental Task

The protocol for the speech production experiment was part of a larger project, in which participants performed a variety of oral reading tasks under habitual and modified conditions. For this study, the sentence “take today’s tasty tea on the terrace” was selected as the speech stimulus, which was read out loud 3 times by the participants, first at their habitual rate and then at a slower rate. For the slow rate task, speaking rate was voluntarily manipulated by the participants in accordance with the following instruction: “Read the sentence out loud at a slower rate, about half your habitual rate, focusing on stretching the duration of the words.” The data derived from the habitual rate task were used for all study aims, and the data derived from the slow rate task were used for the third aim only.

The selected sentence consists of seven words of one to two syllables with an alternating stress pattern. Phonetically, this sentence is loaded with (voiceless) alveolar stops and (middle) vowels, along with several other phonemes (e.g., velar stop, fricative, nasal, and approximant). Previous research on segmental timing has shown that the releases of voiceless stops and middle vowels are among the most affected segments in individuals with ALS, exhibiting significant and greater increases in duration compared to other segments (Tjaden & Turner, 2000). From a physiological perspective, because muscles of the tongue, especially of the tongue tip, are known to be the most involved articulatory musculature in ALS (DePaul et al., 1988), the alveolar stop–vowel contexts are expected to elicit abnormal tongue kinematics reflective of the underlying neuromotor pathology. In addition to the tongue as the primary articulator, the jaw, which anatomically lies under the tongue, and the lips are involved as secondary articulators.

Instrumentation

A three-dimensional electromagnetic kinematic tracking system (Wave, Northern Digital, Inc.) was used to record articulatory movements while the participants read the speech stimulus. Specifically, four small, wired sensors were attached to (a) tongue tip (0.5–1 cm posterior to the apex of the tongue), (b) tongue body (2–3 cm posterior to tongue tip), (c) jaw (center of lower chin), and (d) lower lip (central vermilion border of lower lip) using dental adhesive or medical tape. An additional reference sensor embedded in a headband was attached to the center of forehead,
with its axes aligning as closely as possible with the anatomical axes of the oral cavity. Sensor motions were recorded relative to the head reference at 100 Hz by the WaveFront software (Northern Digital, Inc.). In addition, a midsagittal palatal trace was acquired posteriorly from the edge between the hard and soft palate and anteriorly to the central upper incisor using a probe supplied by Northern Digital, Inc. Moreover, participants wore a head-mounted microphone (DPA d:fine 4188) to record speech acoustic signals, which served as the reference for interpreting the articulatory data. The acoustic signals were processed by the Behringer XENYX 802 sound conditioner, digitized at 22.05 kHz, and acquired by the WaveFront software.

Data Processing

The movement trajectories of all articulators were visually inspected to exclude data with recording errors. Note that, because the sensors were independent of each other, only the data for the malfunctioning sensors in specific trials were excluded, whereas the data for other normally functioning sensors in these trials were preserved. Given the primary focus of this study on linguistically relevant timescales in the lower frequency domain (i.e., ≤ 40 Hz), only the superior–inferior and anterior–posterior dimensions of articulatory data, which characterized the primary pattern of linguistically related articulatory motions, were analyzed to minimize the confounding effect of the linguistically less relevant lateral–medial motions; the resulting two-dimensional articulatory trajectories were low-pass filtered at 40 Hz by a second-order, zero-lag Butterworth filter to further remove high-frequency movement artifacts.

Decoupling

Because the tongue and lower lip resided on the jaw, the sensors placed on tongue tip, tongue body, and lower lip were expected to capture a mixture of active (i.e., tongue and lower lip movements independent of the jaw) and passive (i.e., loading effect of the jaw) movement components. To decouple these active and passive movement components, the sensor data for tongue tip, tongue body, and lower lip were transformed from the head coordinate system to the jaw coordinate system, using the transformation method described in Westbury et al. (2002; see Equation 1). The transformed sensor data reflected the active tongue tip, tongue body, and lower lip movements, with removal of the passive loading effect of the jaw.

Characterization of Time-Varying Articulatory Changes

To characterize articulatory changes over time for the subsequent temporal modulation analysis, we derived four articulatory time series. The first articulatory time series was derived from the original, untransformed jaw sensor data, as the Euclidean distance between the jaw and an oral anatomical landmark, the central upper incisor (location determined by the anterior edge of the palatal trace), at each sampled time point. This articulatory time series characterized the jaw movement pattern that passively contributed to the tongue tip, tongue body, and lip gestures. The other three articulatory time series were derived from the transformed tongue tip, tongue body, and lower lip sensor data, as the Euclidean distance between each articulator and the jaw at each sampled time point. These articulatory time series characterized the patterns of tongue tip, tongue body, and lower lip movements relative to the jaw, reflecting the active movement components that contributed to the tongue tip, tongue body, and lower lip gestures. A graphic example of the articulatory time series for the habitual and slow rate tasks for a participant with ALS is provided in Figure 1. Temporal modulation analysis was applied to these articulatory time series, with or without rate normalization (see details below).

Rate Normalization

As a global timing feature, speaking rate itself contributes to the statistical pattern of temporal modulation (e.g., reduced speaking rate during the slow rate task tends to increase the magnitude of modulation in the low-frequency range). Although it was our intention to use voluntary rate reduction as a triggering method to elicit mechanistic changes in temporal patterning of articulation in Aim 3, the change in speaking rate itself could introduce a confounder that may interfere with the interpretation of the rate-elicited articulatory changes. Therefore, we adopted the rate normalization approach in Leong et al. (2017) to normalize the duration of all articulatory time series for the slow rate task relative to that of the habitual rate task. Specifically, each slow rate trial was matched with a trial of the habitual rate task produced by the same speaker (i.e., the ith slow rate trial was matched with the ith habitual rate trial to account for potential habituation effects). Duration was calculated for both the slow and habitual rate trials. The slow rate trial was then rescaled in time by resampling the time series based on the ratio of duration between the two rate conditions. The resulting rescaled articulatory data for the slow rate task, along with the original articulatory data for the habitual rate task, were submitted to temporal modulation analysis in the following.

Temporal Modulation Analysis

For each trial of production, 24 features were extracted algorithmically by a custom program in MATLAB (MathWorks) to characterize the temporal modulation patterns at three linguistically relevant timescales (delta,
0.9–2.5 Hz; theta, 2.5–12 Hz; and beta/gamma, 12–40 Hz) within (intra) and between (inter) articulators. These features are summarized in Table 2 and elaborated below.

Intra-Articulator Modulation

In view of the oscillatory models, generating a structured rhythmic hierarchy requires the component oscillators to (a) entrain to all periodicity contents in the hierarchy and (b) remain harmonically entrained (or phase-locked) to one another (Giraud & Poeppel, 2012). In keeping with this theoretical view, 20 features were extracted to characterize the magnitude and relative phase of temporal modulation patterns across four articulators: tongue tip, tongue body, lower lip, and jaw. This feature set included (a) three features measuring the magnitude of delta-, theta-, and beta/gamma-band modulations, respectively, and (b) two features characterizing the relative phase between the theta-band modulation, which has been identified as the master oscillatory rhythm during speech processing (Ghitza, 2011, 2012), and the delta- and beta/gamma-band modulations, for each articulator.

Modulation depths within the target frequency bands. Modulation spectrum was generated for each articulatory time series by power spectrum analysis using a 2,048-point fast Fourier transform (FFT) with a Hamming window. Similar to the acoustic envelope modulation analysis by Liss et al. (2010), the mean depth of modulation within each target frequency band, representing the extent of articulatory entrainment to the relevant linguistic rhythm, was calculated by summing
Table 2. Summary of modulation features.

| Feature | Notation | Interpretation |
|---|---|---|
| Intra-articulator | | |
| Delta-band modulation depth | TT_mod_depth_delta, TB_mod_depth_delta, L_mod_depth_delta, J_mod_depth_delta | Articulatory entrainment to the rhythm of prosodic stress |
| Theta-band modulation depth | TT_mod_depth_theta, TB_mod_depth_theta, L_mod_depth_theta, J_mod_depth_theta | Articulatory entrainment to the rhythm of syllable |
| Beta/gamma-band modulation depth | TT_mod_depth_beta.gamma, TB_mod_depth_beta.gamma, L_mod_depth_beta.gamma, J_mod_depth_beta.gamma | Articulatory entrainment to the rhythm of onset–rime/phoneme |
| Delta–theta phase synchronization | TT_PSI_delta_theta, TB_PSI_delta_theta, L_PSI_delta_theta, J_PSI_delta_theta | Harmonic articulatory entrainment to the rhythms of stress and syllable |
| Theta–beta/gamma phase synchronization | TT_PSI_theta_beta.gamma, TB_PSI_theta_beta.gamma, L_PSI_theta_beta.gamma, J_PSI_theta_beta.gamma | Harmonic articulatory entrainment to the rhythms of syllable and onset–rime/phoneme |
| Interarticulator | | |
| Theta-band coherence | COH_TTJ_theta, COH_TBJ_theta | Interarticulator entrainment to the rhythm of syllable |
| Beta/gamma-band coherence | COH_TTJ_beta.gamma, COH_TBJ_beta.gamma | Interarticulator entrainment to the rhythm of onset–rime/phoneme |

Note. TT = tongue tip; TB = tongue body; L = lower lip; J = jaw; PSI = phase synchronization index; COH = coherence.
the power at all frequency bins within the target band and normalizing it by the total spectral power. As a result, 12 measures of modulation depth (4 articulators × 3 target bands) were derived per trial of production, which are denoted as TT_mod_depth_delta, TT_mod_depth_theta, TT_mod_depth_beta.gamma, TB_mod_depth_delta, TB_mod_depth_theta, TB_mod_depth_beta.gamma, L_mod_depth_delta, L_mod_depth_theta, L_mod_depth_beta.gamma, J_mod_depth_delta, J_mod_depth_theta, and J_mod_depth_beta.gamma in Table 2.
Cross-frequency phase synchronization. The relative phase between articulatory modulations within different frequency bands was assessed by a synchronization index based on the harmonic phase alignment of sub-band articulatory signals. Synchronization is an important attribute of neuron activities contributing to the coherence and regularity of neural outputs (Roelfsema et al., 1997; Tononi et al., 1998). In the motor speech system, the functional significance of synchronization is corroborated by empirical evidence showing that reduced synchronization of motor unit firing, as caused by motor neuron degeneration in ALS, can reduce the coherence and regularity of articulatory muscle activities and their movement kinematics (Rong, 2021; Rong & Pattee, 2021).
In physical terms, the synchronization of two oscillators is represented by their generalized phase relationship (Schack & Weiss, 2005; Tass et al., 1998). When two oscillators are in synchrony, they are referred to as being phase locked or harmonically entrained, which satisfies the following condition:

\[ |\Delta\phi(t)| < \sigma, \quad \text{where } \Delta\phi(t) = n\phi_1(t) - m\phi_2(t). \tag{1} \]

In this equation, $\phi_1(t)$ and $\phi_2(t)$ are the instantaneous phase of the oscillators at time $t$, $\Delta\phi(t)$ is the generalized phase difference between the two oscillators, $n$ and $m$ are integers reflecting the frequency relation between the two oscillators, and $\sigma$ is a small positive constant. Based on the phase locking condition, the degree of synchronization between two oscillators can be measured by a phase synchronization index (PSI; Schack & Weiss, 2005) defined as follows:

\[ \mathrm{PSI} = \left| \left\langle e^{i\left(n\phi_1(t) - m\phi_2(t)\right)} \right\rangle \right|, \tag{2} \]

where $i$ is the unit imaginary number equal to $\sqrt{-1}$ and $\langle \cdot \rangle$ denotes averaging across time. PSI ranges between 0 and 1, with 0 denoting no synchrony and 1 indicating perfect synchrony.
In this study, PSI was used to assess the harmonic entrainment of articulatory activities to different linguistic rhythms by evaluating the phase alignment between articulatory modulations within different frequency bands. Specifically, each articulatory time series was bandpass
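The phase synchronization index defined above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB program: the function name `phase_sync_index`, the synthetic 2 Hz and 4 Hz phase series, and the choice of n and m are assumptions made only for illustration, and the instantaneous phases are generated directly rather than estimated from articulatory signals.

```python
import cmath
import math


def phase_sync_index(phi1, phi2, n=1, m=1):
    """PSI = |<exp(i * (n*phi1(t) - m*phi2(t)))>|, averaged across time."""
    if len(phi1) != len(phi2) or not phi1:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(cmath.exp(1j * (n * a - m * b)) for a, b in zip(phi1, phi2))
    return abs(total / len(phi1))


# Hypothetical phases: 1 s sampled at 100 Hz (values chosen for illustration).
fs = 100
t = [k / fs for k in range(fs)]
slow = [2 * math.pi * 2.0 * tk for tk in t]        # phase of a 2 Hz oscillator
fast = [2 * math.pi * 4.0 * tk + 0.3 for tk in t]  # 4 Hz oscillator, fixed offset

# With n = 2, m = 1 the generalized phase difference is constant (-0.3 rad),
# so the oscillators are harmonically entrained and PSI is close to 1.
locked = phase_sync_index(slow, fast, n=2, m=1)

# With n = m = 1 the phase difference sweeps two full cycles over the window,
# so the unit vectors cancel and PSI is close to 0.
drifting = phase_sync_index(slow, fast, n=1, m=1)
```

The two calls show why n and m matter: the same pair of oscillators looks perfectly synchronized or entirely unsynchronized depending on whether the assumed frequency ratio matches the true one.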
modulation pattern of articulation. The fit of the factorization model was evaluated by three standard fit indices: Tucker–Lewis index (TLI), root-mean-square of residuals (RMSR), and root-mean-square error of approximation (RMSEA) index. A model with a good fit should have a large TLI (range: 0–1) close to 1 and small RMSR and RMSEA values close to 0. Finally, the factor scores, which measured the performance of the participants on each dimension in the factor space, were estimated by the correlation-preserving method. These factor scores were used as the independent variables to correlate with the functional speech outcomes.
Regression. To predict the functional speech outcomes (i.e., speech intelligibility and intelligible speaking rate), all factor scores as derived above were fed first into linear regression and then into support vector regression (SVR) with the radial basis function (RBF) kernel. SVR is a regression method based on the support vector machine, one of the most robust and widely used machine learning algorithms (Drucker et al., 1997). SVR has various advantages over linear regression, including (a) a more flexible error threshold and (b) the ability to map the original data into a higher dimensional space using nonlinear kernels such as RBF to solve regression problems that do not have a good fit in a linear space. Given the exploratory nature of this study and the known nonlinearity of the speech production and perception mechanisms, we employed both linear and nonlinear regression methods to predict the contribution of temporal modulation of articulation to the functional speech outcomes. Moreover, because of the highly skewed distribution of speech intelligibility (skewness = −3.14) with two participants showing substantially lower scores (ALS1, ALS8; see Table 1) than others, intelligibility was logarithmically transformed into 10 log10[max(intelligibility + 1) − intelligibility] (skewness after transformation = 1.48), to prevent the sparse data on the lower end of the distribution from generating biased results (i.e., blind spots) during the training of the machine learning algorithm. The performance of the models was evaluated by R² and root-mean-square error (RMSE) through fivefold cross-validation.
Aim 2: Effect of Motor Speech Impairment on Temporal Patterning of Articulation
To examine the effect of motor speech impairment on temporal patterning of articulation, linear mixed-effects models were applied to the data for the habitual rate condition. Each model was constructed with one of the modulation features as the dependent variable, group (ALS, control) as the fixed effect, and participant as the random effect. To further evaluate the direction and extent of disease-related change in temporal patterning of articulation, the effect sizes for the between-groups differences in all modulation features were estimated by Cohen’s d.
Aim 3: Effect of Rate Manipulation on Temporal Patterning of Articulation
To examine the effect of rate manipulation on temporal patterning of articulation, the data for the habitual and slow rate conditions were merged. Linear mixed-effects models were applied to the merged data. Within each model, one of the modulation features served as the dependent variable; the fixed effects consisted of task, group, and Task × Group interaction; and participant was treated as the random effect. Post hoc pairwise comparisons were conducted based on the contrast of estimated marginal means between the two rate conditions for each group, with Bonferroni-adjusted significance level.
Results
Task Performance Verification
Individual-level between-tasks comparisons revealed increased sentence duration in all but three trials of the slow rate task (including two trials produced by ALS1 and one trial produced by HC3) relative to the baseline sentence duration for the habitual rate task. The three exceptions were considered to reflect a failure to follow instructions or an inability to voluntarily reduce speaking rate and were excluded from the subsequent analyses. Figure 2 provides a graphic display of sentence duration by task and participant. Group-level statistical analysis revealed a significant main effect of task, F(1, 104.40) = 365.24, p < .0001, and Task × Group interaction, F(1, 104.40) = 7.13, p = .0088, on sentence duration. There was no significant difference in sentence duration between the two groups, F(1, 21.17) = 3.73, p = .067. Post hoc tests suggested that sentence duration was significantly longer in the slow versus habitual rate task for both healthy controls (t = 10.95, p < .0001) and individuals with ALS (t = 16.49, p < .0001). These descriptive and statistical results confirmed the general adherence to task instructions and the validity of task performance at both group and individual levels.
Modulation and Coherence Spectra
To provide a visualization of the intra- and interarticulator modulation patterns during habitual speech, Figure 3 shows the mean (and standard error) of the modulation and coherence spectra by group. Because the spectral energy of biological signals naturally follows a 1/f distribution, we further fitted a 1/f function to each modulation spectrum to help visualize the modulation patterns. Deviations from the 1/f fit imply rhythmic modulation within the relevant frequency ranges. Based on this notion, it can be seen that the articulators all exhibited rhythmic modulations within the target frequency bands (delta, theta, and beta/gamma), with tongue tip, tongue body, and jaw showing overall greater modulations compared with lower
lip in both ALS and healthy control groups. All coherence spectra exhibited one or more peaks within the theta and beta/gamma bands (note that only the peaks above the baseline generated by the surrogate signals were included for calculating interarticulator coherence) and a lack of peaks within the delta band in both ALS and healthy control groups. These observations are consistent with the post hoc peak verification as noted in the Method section.
Aim 1: Relationship Between Temporal Patterning of Articulation and Functional Speech Outcomes
Factor Analysis
The 24 modulation features were clustered into six composite factors. The resulting six-factor model exhibited an acceptable fit, as indicated by the fit indices: TLI = 0.86, RMSR = 0.06, RMSEA = 0.055. The rotated factor loadings are shown in Figure 4, with larger loadings denoting greater contributions of the feature to the factor. Features that loaded greater than 0.3 (i.e., absolute value > 0.3) on each factor were conventionally regarded as the primary component features for the factor. Based on this notion, (a) Factor 1 was primarily composed of beta/gamma-band modulation depths for tongue and jaw (e.g., TT_mod_depth_beta.gamma, TB_mod_depth_beta.gamma, and J_mod_depth_beta.gamma), (b) Factor 2 mainly consisted of delta- and theta-band modulation depths and delta–theta phase synchronization for tongue and jaw (e.g., TT_mod_depth_delta, TB_mod_depth_delta, J_mod_depth_theta, TT_PSI_delta_theta, TB_PSI_delta_theta, and J_PSI_delta_theta), (c) Factor 3 was mainly composed of lip modulation features (e.g., L_mod_depth_delta, L_mod_depth_beta.gamma, and L_PSI_delta_theta), (d) Factor 4 consisted of interarticulator coherence features (e.g., COH_TTJ_beta.gamma, COH_TBJ_theta, and COH_TBJ_beta.gamma), (e) Factor 5 had one primary component feature reflecting theta-band modulation depth for tongue body (i.e., TB_mod_depth_theta), and (f) Factor 6 also had one primary component feature reflecting theta–beta/gamma phase synchronization for tongue body (i.e., TB_PSI_theta_beta.gamma).
Regression
The results for the regression analyses, including RMSE and R² for both linear regression and SVR and partial R² and p values based on the linear regression models, are listed in Table 3. Between the two regression methods, SVR performed notably better in predicting intelligible speaking rate than did linear regression, whereas the two methods showed similar performance in predicting speech intelligibility. These results confirmed that temporal patterning of articulation was moderately to strongly correlated with the
Figure 3. Modulation and coherence spectra for ALS (a, b) and healthy control (c, d) groups. Blue line and shaded area around the blue line represent the mean and standard error for each group. Red line denotes the 1/f fit to the modulation spectrum, with the associated equation being displayed above the line. The 1/f spectrum is in the form of $S = a/f^b$, where $f$ is frequency and $a$ and $b$ are parameters to be determined during the fitting process. Vertical dashed lines mark the boundaries between the target frequency bands (delta: 0.9–2.5 Hz, theta: 2.5–12 Hz, beta/gamma: 12–40 Hz). ALS = amyotrophic lateral sclerosis; Control = healthy controls; PSD = power spectral density; COH = coherence; TT = tongue tip; TB = tongue body; L = lower lip; J = jaw.
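The 1/f fit described in the caption of Figure 3 can be sketched in Python as follows. The article does not state how the parameters a and b were estimated, so the log-log least-squares approach below is an assumption, as are the function name `fit_one_over_f` and the synthetic spectrum.

```python
import math


def fit_one_over_f(freqs, power):
    """Fit S = a / f**b by ordinary least squares on log S = log a - b * log f.

    Returns (a, b). Assumes all frequencies and power values are positive.
    """
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Hypothetical noise-free spectrum from 0.5 to 40 Hz with a = 3.0 and b = 1.2.
freqs = [0.5 * k for k in range(1, 81)]
power = [3.0 / f ** 1.2 for f in freqs]
a, b = fit_one_over_f(freqs, power)

# Ratio of observed to fitted power; values well above 1 within a band would
# point to rhythmic modulation beyond the 1/f background.
excess = [p / (a / f ** b) for f, p in zip(freqs, power)]
```

On real modulation spectra the ratio in `excess` would not be flat; peaks within the delta, theta, or beta/gamma ranges are the deviations that the text interprets as rhythmic modulation.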
functional speech outcomes across individuals with ALS and healthy controls (R² = .51–.73). Among different predictors, Factors 1, 2, and 5 were the most contributive to both speech intelligibility and intelligible speaking rate, as reflected by their partial R² and significant p values.
Aim 2: Effect of Motor Speech Impairment on Temporal Patterning of Articulation
Of all modulation features, five showed a significant main effect of group at the significance level of p < .05. These features included (a) delta- and theta-band modulation depths and theta–beta/gamma phase synchronization for tongue tip, TT_mod_depth_delta, F(1, 20.86) = 7.44, p = .013; TT_mod_depth_theta, F(1, 21.06) = 17.12, p = .00046; TT_PSI_theta_beta.gamma, F(1, 25.32) = 4.27, p = .049, and (b) theta- and beta/gamma-band modulation depths for tongue body, TB_mod_depth_theta, F(1, 20.76) = 5.62, p = .028; TB_mod_depth_beta.gamma, F(1, 20.21) = 6.16, p = .022. Among these five features, the group effect on TT_mod_depth_theta survived Bonferroni correction for multiple tests. The group effect did not reach statistical significance for the rest of the features (p > .05).
The direction and extent of disease-related change in all modulation features, as indicated by the Cohen’s d effect sizes, are visualized in Figure 5. Of all features, those identified as exhibiting a significant main group effect, corrected or uncorrected, all showed moderate-to-large decreases in the ALS versus healthy control group (Cohen’s d = −0.52 to −1.69). In addition, five other features, including TT_mod_depth_beta.gamma, TB_PSI_theta_beta.gamma, J_mod_depth_theta, J_mod_depth_beta.gamma, and J_PSI_delta_theta, revealed moderate decreases in the ALS versus healthy control group (Cohen’s d = −0.52 to −0.78), although the main effect did not reach statistical
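Table 3. Results for the regression models between the factor scores derived from the articulatory modulation features and the functional speech outcomes.

| Predictor | Speech intelligibility: p | Speech intelligibility: partial R² | Intelligible speaking rate: p | Intelligible speaking rate: partial R² |
|---|---|---|---|---|
| Factor 1 | .00030 | .19 | < .0001 | .27 |
| Factor 2 | < .0001 | .38 | < .0001 | .52 |
| Factor 3 | .58 | .0050 | .28 | .019 |
| Factor 4 | .18 | .029 | .87 | .00043 |
| Factor 5 | .00013 | .21 | < .0001 | .24 |

| Outcome | LR RMSE | LR R² | SVR RMSE | SVR R² |
|---|---|---|---|---|
| Speech intelligibility | .36 | .53 | .37 | .51 |
| Intelligible speaking rate | 33.31 | .63 | 29.47 | .73 |

Note. RMSE = root-mean-square error; LR = linear regression; SVR = support vector regression.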
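The intelligibility values entering the regression models were transformed as 10 log10[max(intelligibility + 1) − intelligibility], as described in the Method section. The Python sketch below illustrates that transformation under one reading of the formula, namely that the maximum is taken across participants; the function name `reflect_log_transform` and the scores are hypothetical and are not the study data.

```python
import math


def reflect_log_transform(scores):
    """Apply 10 * log10(max(score + 1) - score) to each score.

    Assumption: max(...) is taken over all scores in the sample, so the
    highest score maps to 10 * log10(1) = 0.
    """
    ceiling = max(s + 1 for s in scores)
    return [10 * math.log10(ceiling - s) for s in scores]


# Hypothetical intelligibility scores (%), sorted from highest to lowest.
scores = [99.0, 98.0, 95.0, 60.0, 20.0]
transformed = reflect_log_transform(scores)
```

Because the scores are reflected before the logarithm is taken, higher intelligibility maps to a lower transformed value, and the long lower tail of the raw distribution is compressed.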
Figure 5. Cohen’s d effect sizes for the differences between the articulatory modulation features for individuals with amyotrophic lateral sclerosis and healthy controls. The square marks the estimated mean effect size, and the horizontal line around the square is the 95% confidence interval (CI). The numeric values of the mean effect sizes and 95% CIs are shown by the right side of the plot. Feature notation:
TT_mod_depth_delta = delta-band modulation depth for tongue tip; TT_mod_depth_theta = theta-band modulation depth for tongue tip;
TT_mod_depth_beta.gamma = beta/gamma-band modulation depth for tongue tip; TT_PSI_delta_theta = delta–theta phase synchronization
index (PSI) for tongue tip; TT_PSI_theta_beta.gamma = theta–beta/gamma PSI for tongue tip; TB_mod_depth_delta = delta-band modulation
depth for tongue body; TB_mod_depth_theta = theta-band modulation depth for tongue body; TB_mod_depth_beta.gamma = beta/gamma-band
modulation depth for tongue body; TB_PSI_delta_theta = delta–theta PSI for tongue body; TB_PSI_theta_beta.gamma = theta–beta/gamma
PSI for tongue body; L_mod_depth_delta = delta-band modulation depth for lower lip; L_mod_depth_theta = theta-band modulation depth for
lower lip; L_mod_depth_beta.gamma = beta/gamma-band modulation depth for lower lip; L_PSI_delta_theta = delta–theta PSI for lower lip;
L_PSI_theta_beta.gamma = theta–beta/gamma PSI for lower lip; J_mod_depth_delta = delta-band modulation depth for jaw; J_mod_depth_theta =
theta-band modulation depth for jaw; J_mod_depth_beta.gamma = beta/gamma-band modulation depth for jaw; J_PSI_delta_theta = delta–theta
PSI for jaw; J_PSI_theta_beta.gamma = theta–beta/gamma PSI for jaw; COH_TTJ_theta = theta-band coherence between tongue tip and jaw;
COH_TTJ_beta.gamma = beta/gamma-band coherence between tongue tip and jaw; COH_TBJ_theta = theta-band coherence between tongue
body and jaw; COH_TBJ_beta.gamma = beta/gamma-band coherence between tongue body and jaw.
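For readers unfamiliar with the effect size plotted in Figure 5, the following Python sketch computes Cohen's d with a pooled standard deviation. The article does not specify which variant of the formula was used or how repeated trials were handled, so the pooled-variance form, the function name `cohens_d`, and the example values are all assumptions.

```python
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d = (mean_a - mean_b) / pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical feature values (not the study data); the first group stands in
# for ALS and the second for controls, so a negative d means a decrease in ALS.
als_values = [1.0, 2.0, 3.0]
control_values = [2.0, 3.0, 4.0]
d = cohens_d(als_values, control_values)
```

With the first argument standing in for the ALS group, the sign convention matches the text, where negative values indicate decreases in the ALS versus healthy control group.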
significance. The rest of the features all showed small and nonsignificant differences between the two groups.
The results for Aim 2 together suggested that, compared with healthy controls, individuals with ALS exhibited notable trends of impairment of intra-articulator modulation, as characterized by an overall decrease in modulation depths within the target frequency bands (especially the theta and beta/gamma bands) and cross-frequency phase synchronization for tongue tip, tongue body, and jaw, whereas the modulation features for lower lip were less affected. Interarticulator modulation remained relatively intact in individuals with ALS.
Aim 3: Effect of Rate Manipulation on Temporal Patterning of Articulation
Figure 6 displays the box plots for all modulation features by task and group. Statistical results for the main effects of task, group, and Task × Group interaction as well as the post hoc tests are provided in Table 4. Because
Figure 6. Box plots for articulatory modulation features by group (ALS = amyotrophic lateral sclerosis; Control = healthy controls) and task (Habitual = habitual rate task; Slow = slow rate task). Feature notation: TT_mod_depth_delta = delta-band modulation depth for tongue tip; TT_mod_depth_theta = theta-band modulation depth for tongue tip; TT_mod_depth_beta.gamma = beta/gamma-band modulation depth for tongue tip; TT_PSI_delta_theta = delta–theta phase synchronization index (PSI) for tongue tip; TT_PSI_theta_beta.gamma = theta–beta/gamma PSI for tongue tip; TB_mod_depth_delta = delta-band modulation depth for tongue body; TB_mod_depth_theta = theta-band modulation depth for tongue body; TB_mod_depth_beta.gamma = beta/gamma-band modulation depth for tongue body; TB_PSI_delta_theta = delta–theta PSI for tongue body; TB_PSI_theta_beta.gamma = theta–beta/gamma PSI for tongue body; L_mod_depth_delta = delta-band modulation depth for lower lip; L_mod_depth_theta = theta-band modulation depth for lower lip; L_mod_depth_beta.gamma = beta/gamma-band modulation depth for lower lip; L_PSI_delta_theta = delta–theta PSI for lower lip; L_PSI_theta_beta.gamma = theta–beta/gamma PSI for lower lip; J_mod_depth_delta = delta-band modulation depth for jaw; J_mod_depth_theta = theta-band modulation depth for jaw; J_mod_depth_beta.gamma = beta/gamma-band modulation depth for jaw; J_PSI_delta_theta = delta–theta PSI for jaw; J_PSI_theta_beta.gamma = theta–beta/gamma PSI for jaw; COH_TTJ_theta = theta-band coherence between tongue tip and jaw; COH_TTJ_beta.gamma = beta/gamma-band coherence between tongue tip and jaw; COH_TBJ_theta = theta-band coherence between tongue body and jaw; COH_TBJ_beta.gamma = beta/gamma-band coherence between tongue body and jaw.
| Feature | Task F | Task p | Group F | Group p | Task × Group F | Task × Group p | Control (slow vs. habitual) t | Control p | ALS (slow vs. habitual) t | ALS p |
|---|---|---|---|---|---|---|---|---|---|---|
| TT_mod_depth_delta | 4.90 | .029 | 4.64 | .043 | 14.86 | .00020 | −4.04 | .0001 | 1.24 | .22 |
| TT_mod_depth_theta | 17.21 | < .0001 | 11.69 | .0026 | 4.47 | .037 | 1.35 | .18 | 4.74 | < .0001 |
| TT_mod_depth_beta.gamma | 8.61 | .0041 | 2.80 | .11 | 3.94 | .050 | −3.22 | .0017 | −0.74 | .46 |
| TT_PSI_delta_theta | 6.14 | .015 | 0.58 | .45 | 0.27 | .61 | −1.99 | .049 | −1.49 | .14 |
| TT_PSI_theta_beta.gamma | 1.66 | .20 | 0.85 | .36 | 2.88 | .092 | −0.27 | .79 | 2.26 | .026 |
| TB_mod_depth_delta | 16.47 | < .0001 | 3.54 | .075 | 0.050 | .82 | −2.55 | .012 | −3.24 | .0016 |
| TB_mod_depth_theta | 13.62 | .00036 | 5.94 | .024 | 0.31 | .58 | 2.88 | .0049 | 2.32 | .022 |
| TB_mod_depth_beta.gamma | 15.45 | .00016 | 6.53 | .019 | 14.38 | .00025 | −5.04 | < .0001 | 0.11 | .91 |
| TB_PSI_delta_theta | 2.58 | .11 | 0.64 | .43 | 0.17 | .68 | −0.79 | .43 | −1.53 | .13 |
| TB_PSI_theta_beta.gamma | 5.15 | .025 | 1.55 | .23 | 0.93 | .34 | 0.87 | .39 | 2.45 | .016 |
| L_mod_depth_delta | 2.71 | .10 | 2.31 | .14 | 14.63 | .00022 | −3.64 | .0004 | 1.65 | .10 |
| L_mod_depth_theta | 1.01 | .32 | 1.97 | .18 | 0.68 | .41 | 1.22 | .23 | 0.14 | .89 |
| L_mod_depth_beta.gamma | 0.17 | .68 | 0.30 | .59 | 1.38 | .24 | 0.51 | .61 | −1.20 | .23 |
| L_PSI_delta_theta | 16.61 | < .0001 | 0.021 | .88 | 11.04 | .0012 | −4.92 | < .0001 | 0.57 | .57 |
| L_PSI_theta_beta.gamma | 3.82 | .053 | 1.73 | .20 | 0.043 | .84 | 1.44 | .15 | 1.32 | .19 |
| J_mod_depth_delta | 15.24 | .00017 | 0.12 | .73 | 0.16 | .69 | −2.86 | .0051 | −2.65 | .0092 |
| J_mod_depth_theta | 36.69 | < .0001 | 2.95 | .10 | 0.18 | .67 | 3.75 | .0003 | 4.91 | < .0001 |
| J_mod_depth_beta.gamma | 3.09 | .082 | 3.99 | .059 | 3.64 | .059 | 2.44 | .016 | −0.11 | .91 |
| J_PSI_delta_theta | 9.40 | .0028 | 2.86 | .11 | 3.22 | .076 | −3.23 | .0016 | −0.96 | .34 |
| J_PSI_theta_beta.gamma | 19.74 | < .0001 | 2.11 | .16 | 4.68 | .033 | 4.39 | < .0001 | 1.73 | .087 |
| COH_TTJ_theta | 0.024 | .88 | 0.68 | .42 | 0.0020 | .96 | 0.13 | .89 | 0.084 | .93 |
| COH_TTJ_beta.gamma | 2.72 | .10 | 0.19 | .66 | 0.85 | .36 | 0.49 | .63 | 1.95 | .054 |
| COH_TBJ_theta | 0.22 | .64 | 0.30 | .59 | 0.65 | .42 | 0.85 | .40 | −0.26 | .80 |
| COH_TBJ_beta.gamma | 2.63 | .11 | 0.71 | .41 | 0.24 | .62 | 0.75 | .45 | 1.60 | .11 |
Note. Main effects are denoted by F and p values (Columns 2–7) estimated by linear mixed-effects models. Post hoc between-tasks comparisons are denoted by t and p values (Columns 8–11) based on the contrast of estimated marginal means. Features exhibiting a significant main effect of task and/or Task × Group at the Bonferroni-corrected significance level of p < .002 are bolded. Task: Habitual = habitual rate task; Slow = slow rate task. Group: ALS = amyotrophic lateral sclerosis; Control = healthy controls. Feature notation: TT_mod_depth_delta = delta-band modulation depth for tongue tip; TT_mod_depth_theta = theta-band modulation depth for tongue tip; TT_mod_depth_beta.gamma = beta/gamma-band modulation depth for tongue tip; TT_PSI_delta_theta = delta–theta phase synchronization index (PSI) for tongue tip; TT_PSI_theta_beta.gamma = theta–beta/gamma PSI for tongue tip; TB_mod_depth_delta = delta-band modulation depth for tongue body; TB_mod_depth_theta = theta-band modulation depth for tongue body; TB_mod_depth_beta.gamma = beta/gamma-band modulation depth for tongue body; TB_PSI_delta_theta = delta–theta PSI for tongue body; TB_PSI_theta_beta.gamma = theta–beta/gamma PSI for tongue body; L_mod_depth_delta = delta-band modulation depth for lower lip; L_mod_depth_theta = theta-band modulation depth for lower lip; L_mod_depth_beta.gamma = beta/gamma-band modulation depth for lower lip; L_PSI_delta_theta = delta–theta PSI for lower lip; L_PSI_theta_beta.gamma = theta–beta/gamma PSI for lower lip; J_mod_depth_delta = delta-band modulation depth for jaw; J_mod_depth_theta = theta-band modulation depth for jaw; J_mod_depth_beta.gamma = beta/gamma-band modulation depth for jaw; J_PSI_delta_theta = delta–theta PSI for jaw; J_PSI_theta_beta.gamma = theta–beta/gamma PSI for jaw; COH_TTJ_theta = theta-band coherence between tongue tip and jaw; COH_TTJ_beta.gamma = beta/gamma-band coherence between tongue tip and jaw; COH_TBJ_theta = theta-band coherence between tongue body and jaw; COH_TBJ_beta.gamma = beta/gamma-band coherence between tongue body and jaw.
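The Bonferroni-corrected level of p < .002 cited in the note follows from dividing the nominal level of .05 by the 24 modulation features tested. The Python sketch below works through that arithmetic; the function names are hypothetical, and the three p values are Task × Group values read from Table 4 purely as an illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def surviving_features(p_values, alpha, n_tests):
    """Names of features whose p value falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [name for name, p in p_values.items() if p < threshold]


# 24 modulation features were tested, so the corrected level is .05 / 24.
threshold = bonferroni_threshold(0.05, 24)

# Task x Group p values for three features, taken from Table 4.
task_by_group_p = {
    "TT_mod_depth_delta": 0.00020,
    "TT_mod_depth_theta": 0.037,
    "L_PSI_delta_theta": 0.0012,
}
survivors = surviving_features(task_by_group_p, alpha=0.05, n_tests=24)
```

In this subset, TT_mod_depth_theta is significant at the uncorrected level of .05 but its interaction effect does not pass the corrected threshold, which is the distinction the text draws between the uncorrected and Bonferroni-corrected results.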
the purpose of this part of analysis was to assess how voluntary rate reduction impacted temporal patterning of articulation, we focused on the effects of task and Task × Group interaction. Out of the 24 features, 14 showed a significant task and/or Task × Group effect, including TT_mod_depth_delta (task, Task × Group), TT_mod_depth_theta (task, Task × Group), TT_mod_depth_beta.gamma (task), TT_PSI_delta_theta (task), TB_mod_depth_delta (task), TB_mod_depth_theta (task), TB_mod_depth_beta.gamma (task, Task × Group), TB_PSI_theta_beta.gamma (task), L_mod_depth_delta (Task × Group), L_PSI_delta_theta (task, Task × Group), J_mod_depth_delta (task), J_mod_depth_theta (task), J_PSI_delta_theta (task), and J_PSI_theta_beta.gamma (task, Task × Group), at the significance level of p < .05. The task and/or Task × Group effect survived Bonferroni correction for multiple tests on the following 11 features: TT_mod_depth_delta, TT_mod_depth_theta, TT_mod_depth_beta.gamma, TB_mod_depth_delta, TB_mod_depth_theta, TB_mod_depth_beta.gamma, L_mod_depth_delta, L_PSI_delta_theta, J_mod_depth_delta, J_mod_depth_theta, and J_PSI_theta_beta.gamma. These results combined indicated that rate manipulation significantly modified the intra-articulator modulation patterns of all articulators, whereas interarticulator modulation was relatively unaffected.
Post hoc comparisons between tasks revealed differential effects of voluntary rate reduction on intra-articulator modulation between individuals with ALS and healthy controls. Specifically, individuals with ALS exhibited (a) a significant increase in TT_mod_depth_theta, TB_mod_depth_theta, J_mod_depth_theta, TT_PSI_theta_beta.gamma, and TB_PSI_theta_beta.gamma and (b) a
significant decrease in TB_mod_depth_delta and J_mod_depth_delta between the slow and habitual rate tasks. For healthy controls, a larger number of features showed a significant decrease, including TT_mod_depth_delta, TB_mod_depth_delta, L_mod_depth_delta, J_mod_depth_delta, TT_mod_depth_beta.gamma, TB_mod_depth_beta.gamma, TT_PSI_delta_theta, L_PSI_delta_theta, and J_PSI_delta_theta, whereas a smaller number of features, including TB_mod_depth_theta, J_mod_depth_theta, J_mod_depth_beta.gamma, and J_PSI_theta_beta.gamma, showed a significant increase between the slow and habitual rate tasks. Taken together, the results for Aim 3 pointed toward an overall positive effect of voluntary rate reduction on intra-articulator modulation in individuals with ALS, through (a) enhancing theta-band modulation depths for tongue tip, tongue body, and jaw and theta–beta/gamma phase synchronization for tongue tip and tongue body and (b) preserving delta-band modulation depths for tongue tip and lower lip; beta/gamma-band modulation depths for tongue tip and tongue body; and delta–theta phase synchronization for tongue tip, lower lip, and jaw, to prevent them from being further degraded as in healthy controls.
Discussion
This study investigated an important but underexplored aspect of speech production, that is, temporal patterning of articulatory activities, reflecting articulatory entrainment to the underlying linguistic rhythms in both neurologically healthy and impaired speakers. A variety of features were extracted from the temporal modulation patterns of tongue tip, tongue body, lower lip, and jaw at three linguistically relevant timescales (delta, theta, and beta/gamma) to assess the harmonic entrainment of these articulators to the rhythms of stress, syllable, and onset–rime/phoneme. Moderate-to-strong correlations were found between these modulation features and the functional speech outcomes across neurologically healthy and impaired speakers. Compared with healthy speakers, neurologically impaired individuals exhibited reduced and less synchronized articulatory entrainment to the linguistic rhythms, with tongue tip, tongue body, and jaw being more affected than lower lip. Notable trends of improvement in these impairments were observed following a behavioral modification, namely voluntary speaking rate reduction, supporting the utility of rate control in improving articulatory entrainment in individuals with ALS. The findings of this study provided preliminary empirical evidence for the functional role of articulatory entrainment in speech production, shedding light on a novel global timing–based approach for profiling articulatory deficits in neuromotor speech disorders, which has potential clinical implications in motor speech assessment and rehabilitation.
Contribution of Temporal Patterning of Articulation to Functional Speech Outcomes
The 24 intra- and interarticulator modulation features were clustered into six composite factors uncovering the latent traits of temporal patterning of articulation. Based on the results of factorization as shown in Figure 4, Factors 1 through 4 can be interpreted as reflecting (a) entrainment of the tongue and the jaw to the faster linguistic rhythm related to onset–rime/phoneme, (b) harmonic entrainment of the tongue and the jaw to the slower linguistic rhythms related to stress and syllable, (c) lip entrainment to the linguistic rhythm hierarchy, and (d) interarticulator coherence at the target linguistic rhythms, respectively. Factors 5 and 6 had relatively coarse feature representations, with the primary component feature reflecting tongue body entrainment to the syllable rhythm (for Factor 5) and harmonic entrainment of the tongue body to the syllable and onset–rime/phoneme rhythms (for Factor 6), respectively.
The six factors combined were moderately correlated with speech intelligibility (R² = .51–.53) and strongly correlated with intelligible speaking rate (R² = .63–.73). The stronger correlation with intelligible speaking rate was consistent with the finding of Yorkston and Beukelman (1981), suggesting that the integration of intelligibility and speaking rate may provide a more comprehensive characterization of the multifaceted functional speech performance than intelligibility alone. As such, some aspects of temporal patterning of articulation might be reflected in intelligible speaking rate, but not in speech intelligibility. Among different predictors, Factor 2 exhibited the greatest and most significant contribution to both speech intelligibility and intelligible speaking rate, followed by Factors 1 and 5, each showing a significant contribution about half that of Factor 2, whereas all other factors exhibited minimal contributions. These findings accentuated the functional significance of tongue and jaw entrainment to linguistic rhythms, especially the rhythms of syllable stress, which is consistent with the phonetic properties of the speech stimulus eliciting greater activities of the tongue and the jaw relative to the lower lip.
Taken together, the moderate-to-strong correlations between temporal patterning of articulation and functional speech outcomes paralleled the vast body of neurolinguistic and psychoacoustic evidence, corroborating the contributions of delta-, theta-, beta-, and gamma-band modulations to functional speech communication (Ding & Simon, 2014; Doelling et al., 2014; Drullman, 1995; Ghitza, 2012; Kösem & van Wassenhove, 2017; Luo & Poeppel, 2012; Mai et al., 2016). Using artificially manipulated speech, Ghitza (2012) has associated flattened temporal modulation envelope of spectrally filtered acoustic signals (spectral passband: 230–3800 Hz, which is the spectral frequency
studies (Rong & Green, 2019; Rong & Heidrick, 2021; Shellikeri et al., 2016). The differential involvement of the articulators may therefore contribute to the varying extent of decrease in their modulation depths. Second, the speech task in this study exerts the highest motoric demand on tongue tip, which is the primary articulator for the alveolar stops in all words, followed by tongue body and jaw, which are involved in vowel articulation, whereas the role of the lower lip is relatively trivial. The disparate levels of motoric demands tend to result in differential sensitivity of the task to motor deficits of tongue tip, tongue body, jaw, and lower lip, providing another possible contributing factor to the varying extent of decrease in modulation depth across different articulators.

Cross-Frequency Phase Synchronization

Cross-frequency phase synchronization was, in general, less affected than modulation depth in individuals with ALS. Nonetheless, a moderate decrease was observed in the delta–theta PSI for the jaw and the theta–beta/gamma PSI for tongue tip and tongue body between the ALS and healthy control groups, although these group differences did not reach statistical significance. Given that the relatively small sample size in this study could yield statistically nonsignificant results, it is worth looking into these moderate but nonsignificant changes in PSI to inform future large-scale studies on cross-frequency articulatory modulation.

As demonstrated by the findings of psychoacoustic research, the temporal structure of speech can be viewed as a hierarchy wherein different linguistic events are constrained in their relative timing to act together as a coherent scene (Cummins & Port, 1998). Pertaining to the timing constraint between stress and syllable, it has been shown that the perceptual center or stress beat is located near the onset of the vowel nucleus of the syllable (Marcus, 1981). The perception of syllable stress is hence dependent on the alignment of the syllable subcycles (reflected by theta-band modulation) within the stress cycles (reflected by delta-band modulation). The speech of individuals with ALS is often perceived as having equal and excessive stress (Darley et al., 1969a, 1969b), which has been associated with disproportionate prolongation of unstressed versus stressed segments resulting in reduced duration contrasts between these segments (Tjaden & Turner, 2000). Such a segmental timing deficit can presumably lead to misalignment of stressed and unstressed syllable segments within the temporal template of the higher level prosodic structure, thereby reducing the phase synchronization between delta- and theta-band modulations. The finding that the jaw revealed the greatest decrease in delta–theta PSI among all articulators is likely attributable to the interplay of two articulatory/linguistic factors: (a) The stress pattern of a syllable is mainly reflected by its vowel nucleus, and (b) jaw plays a key role in vowel articulation, especially in individuals with ALS who exhibit a greater reliance on the jaw than do healthy speakers, which has proven to be a common adaptation strategy to accommodate tongue impairment (Rong, 2019). Accordingly, the decrease in the delta–theta PSI for the jaw may be interpreted as an articulatory manifestation of abnormal syllable stress in individuals with ALS.

Pertaining to the timing constraint between syllable and onset–rime/phoneme, prior work has reported nonuniform changes in segment duration across different phoneme classes in individuals with ALS, with the duration of vowels being prolonged to a greater extent than that of consonants (Tjaden & Turner, 2000). The differential changes in vowel and consonant duration could alter the alignment of these phonemes within the temporal structure of syllables. From the articulatory perspective, such changes in phoneme–syllable temporal alignment would correspond with reduced phase synchronization between the theta and beta/gamma modulations of the primary articulators, which, in the context of alveolar stop–vowel syllables, are tongue tip and tongue body. This interpretation provides a tentative explanation of the observed decrease in theta–beta/gamma PSI for tongue tip and tongue body.

Voluntary Rate Reduction Tends to Improve Intra-Articulator Modulation in Individuals With ALS

Effect of Voluntary Rate Reduction on Intra-Articulator Modulation in Healthy Speakers

Voluntary rate reduction exerted differential impacts on intra-articulator modulation between individuals with ALS and healthy controls. For healthy controls, voluntarily slowed speech was characterized by overall decreased modulation depths within the delta and beta/gamma bands as well as decreased delta–theta PSI, compared with their habitual speech (see Figure 6). These changes resulted in descriptively similar intra-articulator modulation profiles for voluntarily slowed speech produced by healthy speakers and habitual speech produced by individuals with ALS. This finding resonates with a prior study of segmental timing, which also revealed a broad similarity between voluntarily slowed speech for healthy speakers and habitual speech for individuals with ALS (Tjaden & Turner, 2000).

Similar to the interpretation of the disease effect, the rate-elicited decreases in delta- and beta/gamma-band modulation depths can also be interpreted in the context of the AP/TD model (Saltzman & Byrd, 2000; Saltzman & Munhall, 1989; Saltzman et al., 2008). Specifically, to accommodate the rate-elicited prolongation of activation period, articulatory movements need to be stretched in
presumably reflective of mechanistic changes in temporal patterning of articulation in response to rate manipulation. As pointed out earlier, a potential barrier for articulatory entrainment in individuals with ALS is the mismatch between the period of articulatory movement and its designated activation period, with the former presumably exceeding the latter owing to slowed articulatory muscle contractions (DePaul & Brooks, 1993; Langmore & Lehman, 1994). As a global timing variable, voluntary reduction of speaking rate can increase the activation periods related to the underlying linguistic events. This would have three potential impacts: (a) improving temporal alignment between the prolonged activation period and the increased articulatory movement period; (b) enabling the central nervous system to recruit more motor units within the activation period to increase the overall force generation capacity and, in turn, mitigate the undershooting problem; and (c) affording more processing time to mitigate the deficits in initiating and stopping movements. These changes can arguably facilitate the implementation of the planned articulatory movements to better achieve the targets associated with the underlying linguistic events, thereby improving the harmonic entrainment of the articulators to the linguistic rhythm hierarchy.

Considerations Concerning Interarticulator Coherence

Compared with intra-articulator modulation, interarticulator coherence was relatively unaffected by either the disease effect or rate manipulation, at least in the context of this study. A possible contributing factor for the lack of significant disease effect on interarticulator coherence was the speech task not exerting a sufficiently high demand on interarticulator coordination. Our prior work has shown that, in a more demanding task such as oral diadochokinesis (i.e., rapid repetitions of “puh tuh kuh”), which requires the articulatory muscles to be highly coordinated to enable rapid recruitment and activation of these muscles as coordinative structures (as opposed to recruiting and activating them separately, which takes more time), individuals with ALS tend to exhibit a greater deficit in interarticulator coherence compared with the regular sentence reading task (Rong & Heidrick, 2021). The general lack of rate effect on interarticulator coherence is in keeping with early physiological studies showing stable relative timing between articulators despite speaking rate variations (Nittrouer et al., 1988; Tuller et al., 1982). However, it contradicts the finding of Rong (2020), which reported a significant decrease in interarticulator coherence during voluntarily slowed speech in healthy speakers. This discrepancy might be attributed to the methodological differences between the two studies. Unlike this study, the articulatory data in Rong (2020) are not rate-normalized. Thus, the rate-elicited decrease in interarticulator coherence observed in Rong (2020) may be a by-product of rate reduction rather than a mechanistic change in interarticulator coordination.

Clinical Implications: A Novel Perspective Toward Global Timing–Based Motor Speech Assessment and Rehabilitation

Using a set of algorithmically derived temporal modulation features, this study provided preliminary evidence for (a) the impact of motor speech impairment on temporal patterning of articulation and (b) the contribution of temporal patterning of articulation to the functional speech outcomes. These findings converge with the extant psychoacoustic evidence on the functional significance of temporal patterning of speech, pointing toward a time-based sensorimotor mechanism stemming from harmonic entrainment of articulatory-motor and auditory-perceptual activities to common linguistic rhythms. Such a coupling between the production and perception systems may facilitate language processing and optimize communication efficiency (Pickering & Garrod, 2004). This putative mechanism has potential assessment and management implications for neuromotor speech disorders.

From the assessment perspective, integrating the evaluation of temporal patterning of articulation into the existing paradigm of motor speech assessment may add incremental value to the diagnosis and monitoring of neuromotor speech disorders. The modulation features in this study provide good candidates for measuring and evaluating temporal patterning of articulation. Owing to the automation of analysis, these features (a) can be extracted without laborious data parsing and labeling as in traditional segmental analyses and (b) are objective and reproducible. Further analytical and clinical validations are needed to substantiate the utility of these features in (differential) assessment of neuromotor speech disorders.

From the management perspective, entrainment-based rehabilitation has a history of managing timing deficits in communication disorders such as nonfluent aphasia (Feenaughty et al., 2021; Fridriksson et al., 2012). Its application to neuromotor speech disorders is, however, less explored. The findings of this study provide novel insights into the functional role of articulatory entrainment in speech production and elucidate how articulatory entrainment can be modulated through global behavioral modifications such as rate manipulation to facilitate the implementation of motor plans and improve the quality of articulatory movement outcomes in individuals with motor speech impairments. Such an entrainment paradigm gives rise to a novel perspective toward global timing–based motor speech rehabilitation, which, with further research warranted, may have potential applications in the management of neuromotor speech disorders.
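
To make the cross-frequency phase synchronization measure discussed in this article more concrete, the following is a minimal Python sketch of an n:m phase synchronization index (PSI). It assumes the commonly used phase-locking form, PSI = |mean(exp(i(n·φ_slow − m·φ_fast)))|, which may differ from the exact computation used in this study. The function names `phase_sync_index` and `synthetic_phase` are hypothetical, and the synthetic 2 Hz, 4 Hz, and 5 Hz phase series are illustrative stand-ins for instantaneous phases that would, in practice, be extracted from band-limited articulatory movement signals.

```python
import cmath
import math


def phase_sync_index(slow_phase, fast_phase, n=1, m=1):
    """n:m phase synchronization index between two phase series (radians).

    Assumed form: |mean(exp(1j * (n * slow - m * fast)))|.
    The result lies in [0, 1]; 1 means a constant n:m phase relation.
    """
    if len(slow_phase) != len(fast_phase) or len(slow_phase) == 0:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(
        cmath.exp(1j * (n * s - m * f))
        for s, f in zip(slow_phase, fast_phase)
    )
    return abs(total) / len(slow_phase)


def synthetic_phase(freq_hz, duration_s, fs_hz, offset_rad=0.0):
    """Instantaneous phase of an ideal sinusoid (illustrative data only)."""
    n_samples = int(round(duration_s * fs_hz))
    return [
        2 * math.pi * freq_hz * k / fs_hz + offset_rad
        for k in range(n_samples)
    ]


# Hypothetical example: a 2 Hz "stress-rate" phase against a 4 Hz
# "syllable-rate" phase. With n=2 and m=1, two cycles of the fast rhythm
# nest inside each cycle of the slow rhythm.
slow = synthetic_phase(2.0, 5.0, 100.0)
locked = synthetic_phase(4.0, 5.0, 100.0, offset_rad=0.5)
drifting = synthetic_phase(5.0, 5.0, 100.0)  # not in a 2:1 relation

print(round(phase_sync_index(slow, locked, n=2, m=1), 3))
print(round(phase_sync_index(slow, drifting, n=2, m=1), 3))
```

In this toy example, the index is close to 1 for the phase-locked pair and close to 0 for the drifting pair. Under the assumed definition, a lower PSI therefore corresponds to a less consistent alignment between the faster and slower rhythms.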
Conclusions

This is the first known empirical study to systematically evaluate temporal patterning of articulation at multiple linguistically relevant, hierarchically nested timescales (corresponding to stress, syllable, and onset–rime/phoneme) in both neurologically healthy and impaired speakers. While the extant literature on neuromotor speech disorders leans heavily toward spatial and temporal features at the segmental level, the findings of this study suggest that, apart from the segmental features, global temporal organization of articulation around linguistic rhythms also plays an important role in shaping the functional speech outcomes. In individuals with ALS, temporal patterning of articulation tends to be disrupted, presumably due to impaired articulatory entrainment to the rhythms of the underlying linguistic events, contributing to degraded functional speech outcomes. Through voluntary speaking rate reduction, several deficits in temporal patterning of articulation in individuals with ALS reveal a trend of improvement, possibly by reshaping the temporal template of articulatory motor plans to better accommodate the disease-related neuromechanical constraints in the articulatory system. The findings of this study shed light on a novel global timing–based approach for profiling articulatory deficits in neuromotor speech disorders, which may have potential clinical applications in the long term. From the assessment perspective, integrating the evaluation of temporal patterning of articulation into the existing motor speech assessment may add incremental value to the (differential) diagnosis and monitoring of neuromotor speech disorders. From the management perspective, manipulating temporal patterning of articulation through behavioral modifications such as speaking rate reduction may recalibrate the neuromotor processes underlying the articulatory activities to improve the planning and implementation of articulatory movements.

Data Availability Statement

The data set generated during this study is not publicly available because it contains protected health information but will be made available in a de-identified format from the corresponding author on reasonable request.

Acknowledgments

This work was supported by the New Investigators Research Grant awarded by the American Speech-Language-Hearing Foundation (PI: Panying Rong) and the New Faculty General Research Fund awarded by The University of Kansas (PI: Panying Rong). The authors are grateful to all participants and their families for contributing their time to this study. The authors also thank Olivia Hansen for her assistance with data processing and analysis.

References

Amano-Kusumoto, A., & Hosom, J.-P. (2011). A review of research on speech intelligibility and correlations with acoustic features (Technical Report CSLU-011-001). Center for Spoken Language Understanding, Oregon Health & Science University.
American Speech-Language-Hearing Association. (2002). National Outcomes Measurement System (NOMS): Adults speech-language pathology user’s guide.
American Speech-Language-Hearing Association. (2013). National Outcomes Measurement System (NOMS): Adults speech-language pathology user’s guide.
Atsumi, T., & Miyatake, T. (1987). Morphometry of the degenerative process in the hypoglossal nerves in amyotrophic lateral sclerosis. Acta Neuropathologica, 73(1), 25–31. https://doi.org/10.1007/BF00695498
Ball, L. J., Beukelman, D. R., & Pattee, G. L. (2002). Timing of speech deterioration in people with amyotrophic lateral sclerosis. Journal of Medical Speech-Language Pathology, 10(4), 231–235.
Ball, L. J., Willis, A., Beukelman, D. R., & Pattee, G. L. (2001). A protocol for identification of early bulbar signs in amyotrophic lateral sclerosis. Journal of the Neurological Sciences, 191(1–2), 43–53. https://doi.org/10.1016/S0022-510X(01)00623-2
Berry, J. (2011). Speaking rate effects on normal aspects of articulation: Outcomes and issues. SIG 5 Perspectives on Speech Science and Orofacial Disorders, 21(1), 15–26. https://doi.org/10.1044/ssod21.1.15
Borrie, S. A., Barrett, T. S., Liss, J. M., & Berisha, V. (2020). Sync pending: Characterizing conversational entrainment in dysarthria using a multidimensional, clinically informed approach. Journal of Speech, Language, and Hearing Research, 63(1), 83–94. https://doi.org/10.1044/2019_JSLHR-19-00194
Brooks, B. R., Miller, R. G., Swash, M., & Munsat, T. L. (2000). El Escorial revisited: Revised criteria for the diagnosis of amyotrophic lateral sclerosis. Amyotrophic Lateral Sclerosis and Other Motor Neuron Disorders, 1(5), 293–299. https://doi.org/10.1080/146608200300079536
Buhusi, C. V., & Meck, W. H. (2005). What makes us tick? Functional and neural mechanisms of interval timing. Nature Reviews Neuroscience, 6(10), 755–765. https://doi.org/10.1038/nrn1764
Cha, C. H., & Patten, B. M. (1989). Amyotrophic lateral sclerosis: Abnormalities of the tongue on magnetic resonance imaging. Annals of Neurology, 25(5), 468–472. https://doi.org/10.1002/ana.410250508
Chandrasekaran, C., Trubanova, A., Stillittano, S., Caplier, A., & Ghazanfar, A. A. (2009). The natural statistics of audiovisual speech. PLOS Computational Biology, 5(7), e1000436. https://doi.org/10.1371/journal.pcbi.1000436
Cummins, F., & Port, R. (1998). Rhythmic constraints on stress timing in English. Journal of Phonetics, 26(2), 145–171. https://doi.org/10.1006/jpho.1998.0070
Darley, F. L., Aronson, A. E., & Brown, J. R. (1969a). Clusters of deviant speech dimensions in the dysarthrias. Journal of Speech and Hearing Research, 12(3), 462–496. https://doi.org/10.1044/jshr.1203.462
Darley, F. L., Aronson, A. E., & Brown, J. R. (1969b). Differential diagnostic patterns of dysarthria. Journal of Speech and
lateral sclerosis. Journal of Speech and Hearing Research, 37(1), 28–37. https://doi.org/10.1044/jshr.3701.28
Lee, J., & Bell, M. (2018). Articulatory range of movement in individuals with dysarthria secondary to amyotrophic lateral sclerosis. American Journal of Speech-Language Pathology, 27(3), 996–1009. https://doi.org/10.1044/2018_ajslp-17-0064
Lenth, R. V. (2020). emmeans: Estimated marginal means, aka least-squares means. https://CRAN.R-project.org/package=emmeans
Leong, V., & Goswami, U. (2015). Acoustic-emergent phonology in the amplitude envelope of child-directed speech. PLOS ONE, 10(12), e0144411. https://doi.org/10.1371/journal.pone.0144411
Leong, V., Kalashnikova, M., Burnham, D., & Goswami, U. (2017). The temporal modulation structure of infant-directed speech. Open Mind, 1(2), 78–90. https://doi.org/10.1162/OPMI_a_00008
Liberman, M., & Prince, A. (1977). On stress and linguistic rhythm. Linguistic Inquiry, 8(2), 249–336.
Liss, J. M., LeGendre, S., & Lotto, A. J. (2010). Discriminating dysarthria type from envelope modulation spectra. Journal of Speech, Language, and Hearing Research, 53(5), 1246–1255. https://doi.org/10.1044/1092-4388(2010/09-0121)
Liss, J. M., Utianski, R., & Lansford, K. (2013). Crosslinguistic application of English-centric rhythm descriptors in motor speech disorders. Folia Phoniatrica et Logopaedica, 65(1), 3–19. https://doi.org/10.1159/000350030
Liss, J. M., White, L., Mattys, S. L., Lansford, K., Lotto, A. J., Spitzer, S. M., & Caviness, J. N. (2009). Quantifying speech rhythm abnormalities in the dysarthrias. Journal of Speech, Language, and Hearing Research, 52(5), 1334–1352. https://doi.org/10.1044/1092-4388(2009/08-0208)
Luo, H., & Poeppel, D. (2012). Cortical oscillations in auditory perception and speech: Evidence for two temporal windows in human auditory cortex. Frontiers in Psychology, 3, 170. https://doi.org/10.3389/fpsyg.2012.00170
Mai, G., Minett, J. W., & Wang, W. S.-Y. (2016). Delta, theta, beta, and gamma brain oscillations index levels of auditory sentence processing. NeuroImage, 133, 516–528. https://doi.org/10.1016/j.neuroimage.2016.02.064
Marcus, S. M. (1981). Acoustic determinants of perceptual center (P-center) location. Perception & Psychophysics, 30(3), 247–256. https://doi.org/10.3758/BF03214280
McClean, M. D. (2000). Patterns of orofacial movement velocity across variations in speech rate. Journal of Speech, Language, and Hearing Research, 43(1), 205–216. https://doi.org/10.1044/jslhr.4301.205
McClean, M. D., & Tasko, S. M. (2003). Association of orofacial muscle activity and movement during changes in speech rate and intensity. Journal of Speech, Language, and Hearing Research, 46(6), 1387–1400. https://doi.org/10.1044/1092-4388(2003/108)
McHenry, M. A. (2003). The effect of pacing strategies on the variability of speech movement sequences in dysarthria. Journal of Speech, Language, and Hearing Research, 46(3), 702–710. https://doi.org/10.1044/1092-4388(2003/055)
Mefferd, A. S., Pattee, G. L., & Green, J. R. (2014). Speaking rate effects on articulatory pattern consistency in talkers with mild ALS. Clinical Linguistics & Phonetics, 28(11), 799–811. https://doi.org/10.3109/02699206.2014.908239
Miller, J. L. (1981). Effects of speaking rate on segmental distinctions. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the study of speech (pp. 39–73). Erlbaum.
Namasivayam, A. K., & van Lieshout, P. (2008). Investigating speech motor practice and learning in people who stutter. Journal of Fluency Disorders, 33(1), 32–51. https://doi.org/10.1016/j.jfludis.2007.11.005
Nasreddine, Z. S., Phillips, N. A., Bédirian, V., Charbonneau, S., Whitehead, V., Collin, I., Cummings, J. L., & Chertkow, H. (2005). The Montréal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society, 53(4), 695–699. https://doi.org/10.1111/j.1532-5415.2005.53221.x
Negro, F., & Farina, D. (2011). Linear transmission of cortical oscillations to the neural drive to muscles is mediated by common projections to populations of motoneurons in humans. Journal of Physiology, 589(3), 629–637. https://doi.org/10.1113/jphysiol.2010.202473
Nittrouer, S., Munhall, K., Kelso, J. A. S., Tuller, B., & Harris, K. S. (1988). Patterns of interarticulator phasing and their relation to linguistic structure. The Journal of the Acoustical Society of America, 84(5), 1653–1661. https://doi.org/10.1121/1.397180
Ohala, J. J. (1975). The temporal regulation of speech. In G. Fant & M. A. A. Tatham (Eds.), Auditory analysis and perception of speech (pp. 431–453). Academic Press.
Peterson, G. E., & Lehiste, I. (1960). Duration of syllable nuclei in English. The Journal of the Acoustical Society of America, 32(6), 693–703. https://doi.org/10.1121/1.1908183
Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27(2), 169–190. https://doi.org/10.1017/s0140525x04000056
Poeppel, D. (2003). The analysis of speech in different temporal integration windows: Cerebral lateralization as “asymmetric sampling in time.” Speech Communication, 41(1), 245–255. https://doi.org/10.1016/S0167-6393(02)00107-3
Poeppel, D., & Assaneo, M. F. (2020). Speech rhythms and their neural foundations. Nature Reviews Neuroscience, 21(6), 322–334. https://doi.org/10.1038/s41583-020-0304-4
R Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.R-project.org/
Ramus, F., Nespor, M., & Mehler, J. (2000). Correlates of linguistic rhythm in the speech signal. Cognition, 75(1), AD3–AD30. https://doi.org/10.1016/S0010-0277(00)00101-3
Revelle, W. (2020). psych: Procedures for psychological, psychometric, and personality research. Northwestern University. https://CRAN.R-project.org/package=psych
Riecke, L., Formisano, E., Sorger, B., Başkent, D., & Gaudrain, E. (2018). Neural entrainment to speech modulates speech intelligibility. Current Biology, 28(2), 161–169.e165. https://doi.org/10.1016/j.cub.2017.11.033
Roelfsema, P. R., Engel, A. K., König, P., & Singer, W. (1997). Visuomotor integration is associated with zero time-lag synchronization among cortical areas. Nature, 385(6612), 157–161. https://doi.org/10.1038/385157a0
Roenneberg, T., Daan, S., & Merrow, M. (2003). The art of entrainment. Journal of Biological Rhythms, 18(3), 183–194. https://doi.org/10.1177/0748730403018003001
Romö, N., Lee, J., & Robb, M. P. (2022). Properties of relative timing and phonetic complexity in adults with dysarthria secondary to amyotrophic lateral sclerosis. Folia Phoniatrica et Logopaedica, 74(4), 284–295. https://doi.org/10.1159/000521144
Rong, P. (2019). The effect of tongue–jaw coupling on phonetic distinctiveness of vowels in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 62(9), 3248–3264. https://doi.org/10.1044/2019_JSLHR-S-19-0058
Rong, P. (2020). Neuromotor control of speech and speechlike tasks: Implications from articulatory gestures. Perspectives of the ASHA Special Interest Groups, 5(5), 1324–1338. https://doi.org/10.1044/2020_PERSP-20-00070
Lieshout, & W. Hulstijn (Eds.), Speech motor control in normal and disordered speech (pp. 51–82). Oxford University Press.
Weismer, G. (2008). Speech intelligibility. In M. J. Ball, M. R. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (pp. 568–582). Blackwell.
Weismer, G., Jeng, J. Y., Laures, J. S., Kent, R. D., & Kent, J. F. (2001). Acoustic and intelligibility characteristics of sentence production in neurogenic speech disorders. Folia Phoniatrica et Logopaedica, 53(1), 1–18. https://doi.org/10.1159/000052649
Weismer, G., Laures, J. S., Jeng, J. Y., Kent, R. D., & Kent, J. F. (2000). Effect of speaking rate manipulations on acoustic and perceptual aspects of the dysarthria in amyotrophic lateral sclerosis. Folia Phoniatrica et Logopaedica, 52(5), 201–219. https://doi.org/10.1159/000021536
Westbury, J. R., Lindstrom, M. J., & McClean, M. D. (2002). Tongues and lips without jaws: A comparison of methods for decoupling speech movements. Journal of Speech, Language, and Hearing Research, 45(4), 651–662. https://doi.org/10.1044/1092-4388(2002/052)
Windmann, A., Šimko, J., & Wagner, P. (2015). Optimization-based modeling of speech timing. Speech Communication, 74, 76–92. https://doi.org/10.1016/j.specom.2015.09.007
Wing, A. M., & Kristofferson, A. B. (1973). Response delays and the timing of discrete motor responses. Perception & Psychophysics, 14(1), 5–12. https://doi.org/10.3758/BF03198607
Yorkston, K. M., & Beukelman, D. R. (1981). Communication efficiency of dysarthric speakers as measured by sentence intelligibility and speaking rate. Journal of Speech and Hearing Disorders, 46(3), 296–301. https://doi.org/10.1044/jshd.4603.296
Yorkston, K. M., Beukelman, D. R., & Ball, L. J. (2002). Management of dysarthria in amyotrophic lateral sclerosis. Geriatrics & Aging, 5, 38–41.
Yorkston, K. M., Beukelman, D. R., Hakel, M., & Dorsey, M. (2007). Speech Intelligibility Test for Windows. Institute for Rehabilitation Science and Engineering at Madonna Rehabilitation Hospitals.
Yorkston, K. M., Dowden, P. A., & Beukelman, D. R. (1992). Intelligibility measurement as a tool in the clinical management of dysarthric speakers. In R. D. Kent (Ed.), Intelligibility in speech disorders: Theory, measurement and management (pp. 265–286). John Benjamins. https://doi.org/10.1075/sspcl.1.08yor
Yorkston, K. M., Hakel, M., Beukelman, D. R., & Fager, S. (2007). Evidence for effectiveness of treatment of loudness, rate, or prosody in dysarthria: A systematic review. Journal of Medical Speech-Language Pathology, 15(2), xi–xxxvi.
Yorkston, K. M., Hammen, V. L., Beukelman, D. R., & Traynor, C. D. (1990). The effect of rate control on the intelligibility and naturalness of dysarthric speech. Journal of Speech and Hearing Disorders, 55(3), 550–560. https://doi.org/10.1044/jshd.5503.550
Yorkston, K. M., Strand, E. A., Miller, R., Hillel, A., & Smith, K. (1993). Speech deterioration in amyotrophic lateral sclerosis: Implications for the timing of intervention. Journal of Medical Speech-Language Pathology, 1(1), 35–46.
Yunusova, Y., Green, J. R., Lindstrom, M. J., Ball, L. J., Pattee, G. L., & Zinman, L. (2010). Kinematics of disease progression in bulbar ALS. Journal of Communication Disorders, 43(1), 6–20. https://doi.org/10.1016/j.jcomdis.2009.07.003