Research Article

Functional Role of Temporal Patterning of Articulation in Speech Production: A Novel Perspective Toward Global Timing–Based Motor Speech Assessment and Rehabilitation

Panying Rong (a) and Lindsey Heidrick (b)
(a) Department of Speech-Language-Hearing: Sciences & Disorders, The University of Kansas, Lawrence
(b) Department of Hearing and Speech, The University of Kansas Medical Center, Kansas City

ARTICLE INFO

Article History:
Received February 8, 2022
Revision received May 31, 2022
Accepted August 11, 2022

Editor-in-Chief: Cara E. Stepp
Editor: Raymond D. Kent

https://doi.org/10.1044/2022_JSLHR-22-00089

ABSTRACT

Purpose: This study aimed to (a) relate temporal patterning of articulation to functional speech outcomes in neurologically healthy and impaired speakers, (b) identify changes in temporal patterning of articulation in neurologically impaired speakers, and (c) evaluate how these changes can be modulated by speaking rate manipulation.

Method: Thirteen individuals with amyotrophic lateral sclerosis (ALS) and 10 neurologically healthy controls read a sentence 3 times, first at their habitual rate and then at a voluntarily slowed rate. Temporal patterning of articulation was assessed by 24 features characterizing the modulation patterns within (intra) and between (inter) four articulators (tongue tip, tongue body, lower lip, and jaw) at three linguistically relevant, hierarchically nested timescales corresponding to stress, syllable, and onset–rime/phoneme. For Aim 1, the features for the habitual rate condition were factorized and correlated with two functional speech outcomes: speech intelligibility and intelligible speaking rate. For Aims 2 and 3, the features were compared between groups and rate conditions, respectively.

Results: For Aim 1, the modulation features combined were moderately to strongly correlated with intelligibility (R² = .51–.53) and intelligible speaking rate (R² = .63–.73). For Aim 2, intra-articulator modulation was impaired in ALS, manifested by moderate-to-large decreases in modulation depth at all timescales and cross-timescale phase synchronization. Interarticulator modulation was relatively unaffected. For Aim 3, voluntary rate reduction improved several intra-articulator modulation features identified as being susceptible to the disease effect in individuals with ALS.

Conclusions: Disrupted temporal patterning of articulation, presumably reflecting impaired articulatory entrainment to linguistic rhythms, may contribute to functional speech declines in ALS. These impairments tend to be improved through voluntary rate reduction, possibly by reshaping the temporal template of motor plans to better accommodate the disease-related neuromechanical constraints in the articulatory system. These findings shed light on a novel perspective toward global timing–based motor speech assessment and rehabilitation.
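
As a rough illustration of what measuring modulation at three nested timescales can mean in practice, the sketch below sums spectral amplitude within the delta, theta, and beta/gamma bands that the article's introduction lists (0.9–2.5, 2.5–12, and 12–40 Hz). It is not the authors' pipeline: a plain discrete Fourier transform stands in for their temporal modulation analysis, and the function names, the sampling rate, and the synthetic "articulator trajectory" are all assumptions made for the example.

```python
import cmath
import math

# Band edges (Hz) as stated in the article's introduction; the key names are ours.
BANDS = {"delta": (0.9, 2.5), "theta": (2.5, 12.0), "beta_gamma": (12.0, 40.0)}


def band_amplitudes(signal, fs):
    """Sum single-sided DFT amplitudes of `signal` within each band.

    Illustrative only: a plain DFT stands in for the temporal modulation
    analysis used in the study, whose exact filters are not given here.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    totals = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * fs / n  # frequency of DFT bin k in Hz
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        amplitude = 2 * abs(coeff) / n
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                totals[name] += amplitude
    return totals


def synthetic_trajectory(fs=100, seconds=4):
    """Hypothetical displacement signal: a 1.5 Hz and a 5 Hz rhythm combined."""
    return [math.sin(2 * math.pi * 1.5 * i / fs)
            + 0.5 * math.sin(2 * math.pi * 5.0 * i / fs)
            for i in range(fs * seconds)]


if __name__ == "__main__":
    for band, amp in band_amplitudes(synthetic_trajectory(), fs=100).items():
        print(f"{band}: {amp:.3f}")
```

In this toy signal the 1.5 Hz component falls in the delta band and the 5 Hz component in the theta band, so nearly all amplitude lands in those two bands and almost none in beta/gamma. A fuller analysis would more likely band-pass filter each trajectory and compare instantaneous phases across bands, which is beyond this sketch.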

Correspondence to Panying Rong: [email protected]. Disclosure: The authors have declared that no competing financial or nonfinancial interests existed at the time of publication.

Journal of Speech, Language, and Hearing Research • Vol. 65 • 4577–4607 • December 2022 • Copyright © 2022 American Speech-Language-Hearing Association

Speech is a hierarchical sound structure organized at multiple timescales representing different linguistic units including phonemes, syllables, words, and phrases (Poeppel, 2003; Poeppel & Assaneo, 2020). These linguistic units recur at regular intervals reflecting the rhythmic structure of natural languages, which, in turn, establishes temporal patterning
of speech (Liberman & Prince, 1977; Ramus et al., 2000). To produce intelligible speech requires careful planning and sequencing of articulatory motor programs to generate temporally structured sound sequences that are perceptually parsable into linguistic units. The functional significance of temporal patterning of speech is well supported by empirical findings in the psychoacoustic literature. In the acoustic domain, temporal patterning of speech is reflected by the slow modulation envelope of the waveform (Drullman, 1995). Converging evidence has been drawn from numerous perceptual experiments revealing a detrimental effect of reduced or distorted slow temporal modulation of acoustic envelope on the functional speech outcomes (e.g., speech intelligibility), whereas spectral distortions of the fine structure can be tolerated to a much greater degree (Drullman et al., 1994, 1996; Ghitza, 2012). These findings corroborate the critical role of temporal patterning of speech in functional communication from the perception perspective.

Despite the perceptual salience of temporal patterning of speech, the current literature on the temporal aspects of speech production, especially of speech production disorders, leans heavily toward segmental features such as vowel and consonant durations (Liss et al., 2009; Tjaden, 2007; Weismer et al., 2001) and relative timing between segments (Romö et al., 2022), which provide limited insights into the global temporal organization of speech production. Although theoretical models such as articulatory phonology/task dynamics (AP/TD; Saltzman, 1986; Saltzman & Munhall, 1989) and directions into velocities of articulators (DIVA; Guenther, 1995; Guenther et al., 2006) offer explanatory accounts for how neuromotor processes related to speech production evolve in time, little empirical evidence exists about how these processes (a) independently and collectively contribute to temporal patterning of the speech outcomes and (b) are impacted by neurological, physiological, and biomechanical disturbances in the speech production system (e.g., due to neuromotor speech disorders). The lack of insights into how speech production behaviors are temporally organized in relation to the speech outcomes poses a major challenge to understand whether the temporal patterning principles as established in the perception literature also hold for the production mechanism. Addressing this challenge is critical to link the global temporal aspects of the production and perception mechanisms, which would, in turn, inform an integrative understanding of the temporal dynamics of speech sensorimotor processing and interaction. It also has clinical implications in guiding time-based rehabilitation to facilitate temporal regulation of speech production and, in turn, improve functional speech outcomes in individuals with speech production disorders.

To address the challenge above, in the following sections, we first draw neurobehavioral evidence from the literature to lay the theoretical foundation for linking temporal patterning of speech to functional communication from both production and perception perspectives. We then focus on the production side and propose an entrainment-based perspective drawing insights from the AP/TD model to conceptualize how neuromotor speech disorders can impair the neurophysiological processes controlling speech timing and, in turn, impact temporal patterning of speech production. Last, we identify several gaps in empirical testing of the theorized connections between neuromotor speech disorders, temporal patterning of speech production, and functional communication to motivate the primary aims and methodology of this study. Furthermore, we pose an exploratory question, that is, how to manipulate and improve temporal patterning of speech production in neurologically impaired individuals in aiming to provide insights into the management of neuromotor speech disorders.

Relationship Between Temporal Patterning of Speech and Functional Communication: Theoretical Foundation From Both Production and Perception Perspectives

The temporal structure of natural speech, as a manifestation of the rhythmic profile of a language, is of high regularity (Turk & Shattuck-Hufnagel, 2013). The rhythmic profile of a language is hierarchical in nature, consisting of multiple nested subcomponents (also referred to as linguistic rhythms), each reflecting the recurrence of a linguistic unit at a specific timescale (Liberman & Prince, 1977; Mai et al., 2016; Poeppel, 2003). The functionally most important linguistic rhythms occur in the low-frequency range, including several crucial frequency bands centered around 1–3, 4–7, 13–25, and 25–35 Hz, representing the periodicity of prosodic units, syllables, onset–rime (i.e., consonant–vowel transition in a syllable), and phonemes, respectively (Leong & Goswami, 2015). From a neurodynamic perspective, a growing body of theoretical and empirical evidence has converged on the temporal regularity of speech to stem from the entrainment of the speech production and perception systems to the periodicity of linguistic units (Goldstein, 2019; Poeppel, 2003; Poeppel & Assaneo, 2020). Here, entrainment refers to the phase locking between two systems, allowing the frequency of one system to entrain the frequency of another (Roenneberg et al., 2003).

Within the auditory perception system, speech processing and comprehension have been shown to occur at multiple discrete timescales, consistent with the hierarchy of linguistic rhythms (Mai et al., 2016; Poeppel, 2003). This multiscale temporal processing is achieved by entrainment of neural oscillations in the auditory cortex to the modulation envelope of the acoustic signal at linguistically relevant timescales. Importantly, entrainments of delta-, theta-, beta-, and gamma-band auditory oscillations
allow the auditory cortex to sample and parse a speech time series into prosodic, syllabic, onset–rime, and phonemic units, respectively (Giraud & Poeppel, 2012; Riecke et al., 2018). This time-based auditory processing mechanism, known as auditory entrainment (Ding & Simon, 2014; Thaut, 2003), underlies the temporal organization of speech from the perception perspective. Disrupted temporal patterning of speech, for example, by altering the periodicity of acoustic envelope modulation, can impair the ability of the perception system to parse and interpret linguistic information at appropriate temporal granularities, thereby degrading speech intelligibility (Drullman, 1995; Drullman et al., 1994, 1996; Ghitza, 2012).

In the dynamical systems account of speech production, speech articulators are viewed as spring mass oscillators (Saltzman, 1986; Saltzman & Byrd, 2000; Saltzman & Munhall, 1989; van Lieshout, 2004). The oscillatory movement of an articulator is characterized by a preferred temporal rhythm, which is determined by the interplay of the biomechanical and neural oscillatory properties of the articulatory system. Remarkably, the preferred temporal rhythms of the articulatory system are largely in line with the auditory rhythms. For example, the frequency of jaw oscillation during natural connected speech is around 4 Hz (Ohala, 1975), which is aligned with the theta rhythm of auditory oscillations for parsing and decoding syllables. Such a rhythmic synchronization between articulatory and auditory activities enables the speech production system to package and convey linguistic information at appropriate temporal granularities by entraining articulatory motor activities to auditory-perceptually encoded linguistic rhythms (e.g., syllable), thereby optimizing the efficiency of linguistic information exchange during communication. This establishes the theoretical foundation of temporal patterning of speech, linking the production and perception mechanisms, both entrained to common underlying linguistic rhythms, to the goal of functional communication. Confirmative empirical evidence is yet needed from the production side to substantiate how specific speech production behaviors (e.g., articulatory movements) are entrained to the hierarchy of linguistic rhythms as evidenced in auditory-perceptual activities.

Impact of Neuromotor Disorders on Temporal Patterning of Speech: Theoretical Groundwork Based on Articulatory Entrainment

To evaluate the theorized functional significance of temporal patterning of speech, most empirical studies so far have used a bottom-up approach, for example, by manipulating the modulation envelope of acoustic signals and evaluating its perceptual consequences (Drullman et al., 1994, 1996; Ghitza, 2012). Little empirical research has been done from the production side. Because of this gap, the source of distortions in acoustic envelope modulation identified as contributing to the degradation of the functional speech outcomes in perceptual studies remains poorly understood. As the integrative output of the speech production system, the acoustics of speech is shaped by the coordinated activities of multiple underlying articulators including the tongue, jaw, lips, and pharynx. It is thus conceivable that impaired timing of articulation, characteristic of many neuromotor speech disorders (Duffy, 2013), may disrupt temporal patterning of the acoustic signal and, in turn, degrade the functional speech outcomes.

To conceptualize how neuromotor speech disorders can impact the timing of articulation and, in turn, disrupt temporal patterning of speech, we draw insights from theoretical models of speech production (Guenther, 1994, 1995; Saltzman & Byrd, 2000; Saltzman et al., 2008; Šimko & Cummins, 2010, 2011; Windmann et al., 2015). Although methodological disparities exist between different models, the major disparity lying in whether temporal information is specified for the entire articulatory movement trajectory (e.g., in the DIVA model; Guenther, 1995; Guenther et al., 1998) or at discrete time points signifying the onset of articulatory events (e.g., in the AP/TD model; Saltzman, 1986; Saltzman & Munhall, 1989), there is a general consensus that the temporal organization of speech production emerges from the interaction of the central clock, which reflects the planning process in generating trigger pulses to initiate motor responses (Wing & Kristofferson, 1973) and the state of the articulatory system. Because AP/TD currently provides the most comprehensive account of the temporal aspects of speech production, below we use the AP/TD framework to lay the theoretical groundwork for the impact of neuromotor speech disorders on timing of articulation.

In AP/TD, the central clock is modeled as an ensemble of planning oscillators, which embed the temporal information in linguistic events into motor plans (Saltzman et al., 2008; Šimko & Cummins, 2011; Windmann et al., 2015). These oscillators can be categorized into (a) gestural planning oscillators, which gate the activation of individual articulatory gestures, and (b) suprasegmental planning oscillators, which specify the higher level prosodic structure composed of a hierarchy of nested constituents such as syllables, feet, words, and phrases. Within this framework, gestures are articulatory representations of phonological segments, which are defined in an abstract manner by the location and degree of local constrictions in the vocal tract (Saltzman, 1986; Saltzman & Munhall, 1989). Each gesture is governed by a set of physical articulators, which are functionally coupled to one another and act synergistically to form and release vocal tract constrictions (Saltzman, 1986; Saltzman & Munhall, 1989). Gestural activation patterns (as specified by the gestural planning oscillators) are
modulated by the prosodic structure (as specified by the suprasegmental planning oscillators) to superimpose suprasegmental cues such as stress onto the gestures. The combinatorial output of the gestural and suprasegmental planning oscillator ensemble serves as the drive to the articulators, which interferes with the physical state of the articulators as determined by their intrinsic dynamic properties (e.g., stiffness and damping) to shape the surface articulatory movement patterns.

An advantage of the AP/TD model is that it directly links articulatory motor activities with the underlying linguistic events at multiple timescales (i.e., at both segmental and suprasegmental levels), providing a theoretical framework for articulatory entrainment to different linguistic rhythms. Based on this framework, it is conceivable that impaired central clock and dynamic properties of the articulatory system can both affect articulatory entrainment to linguistic rhythms and, in turn, disrupt temporal patterning of articulatory activities. Although the exact mechanism of the central clock remains under debate, emerging evidence has revealed contributions of various cortical and subcortical structures including premotor cortex, inferior parietal lobe, cerebellum, and basal ganglia to the intrinsic timekeeping function (Buhusi & Meck, 2005; Grahn, 2009; Ivry & Spencer, 2004; Konoike et al., 2012; Todd et al., 2002). Hence, impairment to these structures and their interconnections, for example, due to basal ganglia disorders such as Parkinson’s disease or cerebellar disorders such as cerebellar ataxia, may disrupt temporal patterning of articulation, resulting in rate and rhythmic abnormalities as documented in the classic dysarthria literature (Darley et al., 1969a; Duffy, 2013). In addition, impairment to the neuromuscular system controlling articulation, for example, due to motor neuron diseases such as amyotrophic lateral sclerosis (ALS), which alters the firing pattern of motor units and the physiological properties of the articulatory musculature (de Carvalho et al., 2014; Farina & Negro, 2015; Negro & Farina, 2011), can change the intrinsic dynamic parameters of the articulatory system (e.g., stiffness) and thus disrupt temporal patterning of articulation, as well. These insights drawn from the AP/TD model yield an entrainment-based perspective on temporal patterning of articulation, providing a putative theoretical account of how neuromotor speech disorders can impair articulatory entrainment to linguistic rhythms and, in turn, disrupt the temporal structure of the acoustic output of the speech production system and degrade the functional speech outcomes. Moreover, the AP/TD model also provides a multitimescale speech production framework to guide the methodological design and interpretation of empirical studies to identify deficits in temporal patterning of speech production behaviors related to different linguistic events in individuals with neuromotor speech disorders.

Gaps to Address in This Study

To consolidate the theoretical predictions above, a major gap is the lack of experimental evidence to link the breakdown of the articulatory system, temporal patterning of articulation, and functional speech outcomes. This study aimed to address this gap by investigating a cohort of individuals with varying speech severities secondary to ALS. The rationale for the study population selection is contingent on (a) the clinical manifestation of the motor speech disorder secondary to ALS (i.e., flaccid–spastic dysarthria), which is characterized by various temporal deficits related to rate and stress (Darley et al., 1969a, 1969b), and (b) empirical evidence from our prior work showing temporal articulatory deficits at the segmental level (Rong & Heidrick, 2021; Rong et al., 2018, 2020). This study further broadened the focus beyond the segmental features to global features reflecting temporal patterning of articulation.

Functional Speech Outcomes

To evaluate the relation of temporal patterning of articulation to functional communication, two functional speech outcomes (speech intelligibility and intelligible speaking rate) were selected. As a functional communication measure of the American Speech-Language-Hearing Association’s (ASHA’s) National Outcomes Measurement System (ASHA, 2002, 2013), speech intelligibility is multifaceted in nature, underpinned by a complex interplay of cognitive, linguistic, motor, sensory, and environmental factors (Weismer, 2008). From the motor speech perspective, a variety of segmental (e.g., vowel distinctiveness, consonant precision, formant transition trajectories, and segment duration) and suprasegmental (e.g., intonation, energy contour, utterance duration, and long-term spectra) spatiotemporal factors have been found to contribute to speech intelligibility in both neurologically intact and impaired speakers (Amano-Kusumoto & Hosom, 2011; Kent et al., 1990; van Brenk et al., 2021; Weismer et al., 2001). These underpinnings of speech intelligibility have proven to provide useful information for characterizing, assessing, and developing targeted management strategies for neuromotor speech disorders (Yorkston et al., 1992). Given its multifaceted nature and established clinical significance, speech intelligibility is expected to associate with temporal patterning of articulation, a motor facet that is theorized to underlie the functional outcome of speech communication, in both neurologically impaired and intact speakers.

In addition to speech intelligibility, another widely used clinical speech index, speaking rate, also contributes to the profiling of functional speech impairments in individuals with neuromotor disorders, especially ALS, for whom speaking rate is currently used as a proxy for
guiding the timing of speech interventions (Ball et al., 2001, 2002; Green et al., 2013). Importantly, speech intelligibility and speaking rate exhibit different declining trajectories providing complementary information to the course of ALS progression, with speaking rate showing earlier and faster declines during the early stage and intelligibility manifesting accelerated declines as the disease progresses to the late stage (Rong, Yunusova, & Green, 2015; Yorkston et al., 1993). To capture the global pattern of functional speech decline over the course of ALS, we combined speech intelligibility and speaking rate into an integrative functional speech index, intelligible speaking rate, measured as the number of intelligible words per minute (WPM). This index has proven to reflect the communication efficiency of individuals with dysarthria, providing a more comprehensive measure of functional speech performance across a wider range of severity than intelligibility alone (Yorkston & Beukelman, 1981). Furthermore, as conceptualized above, one theoretical hypothesis of temporal patterning of speech pertains to optimizing the efficiency of conveying linguistic information during communication. Intelligible speaking rate thus provides a functional speech outcome to target the hypothesized relation of temporal patterning of speech with communication efficiency.

An Integrative Approach to Characterizing Temporal Patterning of Articulation: Intra- and Interarticulator Modulation at Three Hierarchically Nested, Linguistically Relevant Timescales

According to the theoretical foundation for temporal patterning of speech production and perception, articulatory motor activities are presumed to occur at the same hierarchically organized timescales as previously reported in auditory-perceptual activities, encoding the underlying linguistic rhythms. To experimentally investigate such temporal patterning of articulation, we adopted the temporal modulation analysis from the psychoacoustic literature (Leong & Goswami, 2015; Leong et al., 2017) to characterize the modulation patterns of articulatory activities at three hierarchically nested, linguistically relevant timescales (delta, 0.9–2.5 Hz; theta, 2.5–12 Hz; and beta/gamma, 12–40 Hz), reflecting the rhythms of prosodic stress, syllable, and onset–rime/phoneme, respectively. It is well established that the periodicity contents in a time series are represented by the modulus (i.e., magnitude) and phase of the relevant temporal modulation patterns (Todd et al., 2002). As such, articulatory entrainment to the periodicity contents in the hierarchy of linguistic rhythms can be investigated by the magnitude of articulatory modulation at each linguistically relevant timescale and the relative phase between articulation modulations at different timescales. In accordance with this notion, a variety of intra-articulator modulation features were derived to characterize the magnitude and relative phase of temporal modulation of each articulator within and across the target timescales. Moreover, because the AP/TD model treats functionally related articulators as coordinative structures that are activated synergistically during speech (Saltzman & Byrd, 2000; Saltzman & Munhall, 1989), these articulators are expected to be entrained to common periodicity contents in linguistic rhythms. To characterize the functional coupling between these articulators, a set of interarticulator modulation features were extracted to measure the coherence of temporal modulation patterns between functionally related articulators at each target timescale. The combination of these intra- and interarticulator modulation features provides a hierarchical, quantitative characterization of articulatory entrainment to different linguistic rhythms, enabling an integrative understanding of temporal patterning of articulation.

Behavioral Means of Modifying Temporal Patterning of Articulation: Speaking Rate Manipulation

Based upon the theorized relationships between neuromotor speech disorder, temporal patterning of articulation, and functional speech outcomes, a logical follow-up question to ask is how the disorder-related changes in temporal patterning of articulation can be modulated, for example, by behavioral modifications as widely used in the management of neuromotor speech disorders, to improve the global temporal organization of speech production. To address this exploratory question, we selected speaking rate manipulation, a common behavioral modification for individuals with neuromotor speech disorders (Yorkston, Hakel, et al., 2007), as a triggering method to modulate temporal patterning of articulation. The functional significance of speaking rate lies in that it is not just a descriptive feature of speech but also a control variable that modulates global functioning of the speech production system (Guenther et al., 2006; Saltzman & Byrd, 2000; Tourville & Guenther, 2011; Turk & Shattuck-Hufnagel, 2014). The modulatory effect of rate manipulation is corroborated by a wide range of empirical findings showing rate-elicited changes in acoustic performance, including vowel formant frequencies, diphthong formant transition trajectories, and consonant spectral moments (Tjaden & Wilding, 2004; Turner et al., 1995; Weismer et al., 2000), as well as in articulatory performance, including duration, displacement, velocity, and variability of articulatory movement (Kleinow et al., 2001; Kuruvilla-Dugdale & Mefferd, 2017; McClean, 2000; McClean & Tasko, 2003; McHenry, 2003; Mefferd et al., 2014; Rong, 2020). These findings provide the evidence base for speaking rate manipulation as a global behavioral modification for managing neuromotor speech disorders.

It is important to note that, despite being a global behavioral modification, the effect of rate manipulation on articulatory performance has been found to be largely idiosyncratic, leading to variable changes in one or more
articulatory kinematic parameters (e.g., displacement, velocity, and relative timing) in either direction (increased or decreased) across different individuals (Berry, 2011). This idiosyncrasy uncovers the intricate nature of rate manipulation, which interacts in a complex, nonlinear way with the activities of the articulatory motor system to elicit changes in multiple facets of the articulatory mechanism that are much more complicated than a simple, linear scaling along the spatial and/or temporal dimensions. Because of this intricate nature, rate manipulation appears to be a natural trigger to elicit changes in temporal patterning of articulation, which is itself underpinned by a complex, nonlinear interplay between the articulatory motor and linguistic systems across multiple timescales.

We selected the voluntary rate reduction paradigm to evaluate how slowing of speaking rate (and thus increasing of duration) would impact temporal patterning of articulation in individuals with ALS. As a neuromotor disorder affecting the pyramidal pathways for motor execution, ALS is characterized by reduced recruitment, synchronization, and firing rates of motor units, resulting in a global decrease in force generation rates (DePaul & Brooks, 1993; Langmore & Lehman, 1994). In the articulatory system, this neuromotor deficit has been associated with increased settling time of articulatory movement (i.e., time from movement onset to peak velocity; Rong & Heidrick, 2021). As the settling time of a movement exceeds its designated activation interval as specified by the underlying linguistic event, the speaker would be forced to truncate or reverse the direction of movement prematurely, resulting in undershooting, a common articulatory phenomenon in ALS (Lee & Bell, 2018; Rong & Heidrick, 2021; Rong, Yunusova, Wang, & Green, 2015; Shellikeri et al., 2016). These changes can presumably reduce the entrainment of the articulators to the relevant linguistic rhythms and, in turn, disrupt temporal patterning of articulation. In addition, damage to premotor and prefrontal cortices has also been reported in ALS, resulting in deficits in initiating and inhibiting motor responses (Thorns et al., 2010). These deficits could further impair the timing of articulation to reduce articulatory entrainment to the underlying linguistic events. Our intention was to use voluntary rate reduction as a triggering method to reshape the temporal structure of articulatory motor plans and afford more processing time to accommodate the increased settling time and difficulty in initiating and inhibiting articulatory movements, thereby improving the articulatory entrainment to linguistic rhythms.

Purpose of the Study

The purpose of this study was threefold. The first, scientifically motivated aim was to provide empirical insights into the gap between the production and perception literature by associating temporal patterning of articulation with the functional speech outcomes in both neurologically healthy and impaired speakers. The second, clinically oriented aim was to identify changes in temporal patterning of articulation in neurologically impaired speakers. The outcome of this aim can inform the value of temporal patterning of articulation in characterizing and assessing neuromotor speech disorders. The third, exploratory aim was to determine how the changes in temporal patterning of articulation in neurologically impaired speakers can be modulated through speaking rate manipulation. The outcome of this aim would provide management implications for temporal patterning of articulation.

Due to the lack of empirical evidence on the research topics of this study, hypotheses were tentatively generated based on both theoretical considerations and inferences from the empirical psychoacoustic findings as outlined above. Specifically, it was hypothesized that (a) temporal patterning of articulation would correlate with the functional speech outcomes across neurologically impaired and healthy speakers, (b) temporal patterning of articulation would be significantly impaired in neurologically impaired speakers, and (c) such impairments can be improved through voluntary rate reduction.

Method

The study protocol was approved by the institutional review board at The University of Kansas Medical Center. Written informed consent was obtained from all participants. All study procedures were noninvasive and involved minimal risks.

Participants

Twenty-three neurologically healthy or impaired participants took part in this study. Neurologically healthy participants included 10 adults (three men and seven women; age range: 38–81 years, M = 66.80 years, SD = 13.02 years) with no known neurological disease or disorder. Neurologically impaired participants included 13 individuals diagnosed with probable or definite ALS (eight men and five women; age range: 38–74 years, M = 59.54 years, SD = 12.78 years), as per the revised El Escorial criteria (Brooks et al., 2000). All participants were from the Midwest regions of the United States; spoke American English as the first and primary language; passed hearing screening at 1000, 2000, and 4000 Hz at 30 dB in at least one ear; and possessed adequate cognitive function to understand and perform the experimental tasks, as per the Montréal Cognitive Assessment (Nasreddine et al., 2005). The age range was comparable between the ALS and healthy control groups, F(1, 21) = 1.80, p = .19. Table 1

Table 1. Demographic and clinical characteristics of participants.

Subject ID Gender  Age (years)    Onset  DaysSinceDiag    MoCA          SIT_Intell     SIT_SR          SIT_IntellRate  Group
ALS1       M       73             B      134              20            54.55          102.99          56.18           ALS + S
ALS2       M       47             C      1,668            22            99.55          150.35          149.67          ALS − S
ALS3       M       74             C/N    87               21            92.73          128.27          118.94          ALS − S
ALS4       M       72             L      407              25            95.91          122.79          117.77          ALS − S
ALS5       F       65             B      266              27            95.00          85.14           80.88           ALS + S
ALS6       M       58             L      56               28            96.82          190.78          184.71          ALS − S
ALS7       F       66             C      294              21            97.27          146.62          142.62          ALS + S
ALS8       F       62             B      150              27            14.09          47.64           6.71            ALS + S
ALS9       M       38             L      123              30            97.27          152.78          148.61          ALS + S
ALS10      M       38             C/L    491              24            99.55          154.24          153.55          ALS − S
ALS11      M       52             C      192              26            99.09          149.45          148.09          ALS − S
ALS12      F       73             B      571              29            94.55          121.77          115.13          ALS + S
ALS13      F       56             C      269              29            99.09          202.45          200.61          ALS − S
HC1        F       55                                     n/a           100            188.19          188.19          Control
HC2        F       38                                     n/a           100            231.78          231.78          Control
HC3        M       65                                     29            99.55          167.10          166.35          Control
HC4        M       76                                     26            99.55          196.46          195.58          Control
HC5        M       81                                     25            100            166.04          166.04          Control
HC6        F       71                                     27            100            186.45          186.45          Control
HC7        F       80                                     26            98.64          192.15          189.54          Control
HC8        F       62                                     27            99.09          171.36          169.80          Control
HC9        F       74                                     26            98.64          146.21          144.22          Control
HC10       F       66                                     28            99.09          190.75          189.01          Control

Group means (standard deviations)
Control            66.80 (13.02)         —                26.75 (1.28)  99.45 (0.56)   183.65 (23.02)  182.69 (23.33)
ALS − S            56.71 (12.92)         452.86 (558.77)  25.00 (2.94)  97.53 (2.56)   156.90 (29.75)  153.33 (30.84)
ALS + S            62.83 (12.95)         256.33 (170.05)  25.67 (4.18)  75.45 (34.35)  109.49 (39.66)  91.69 (54.74)

Note. Descriptive statistics for groups are provided as mean (standard deviation) at the bottom of the table. Em dashes indicate “not applicable.” Subject ID: ALS = amyotrophic lateral sclerosis; HC = healthy control. Gender: M = male; F = female. Onset: B = bulbar; C = cervical; N = neck; L = lumbar. DaysSinceDiag = disease duration in days since diagnosis. MoCA = Montréal Cognitive Assessment score (range: 0–30). SIT_Intell = speech intelligibility in percentage of intelligible words, as assessed by the Sentence Intelligibility Test (SIT). SIT_SR = speaking rate in words per minute (WPM), as assessed by the SIT. SIT_IntellRate = intelligible speaking rate in WPM, derived as the product of SIT_Intell and SIT_SR. Group: ALS + S = individuals with ALS, presented with overt clinical speech symptoms; ALS − S = individuals with ALS, absent of overt clinical speech symptoms; Control = healthy controls; n/a = not applicable.
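As a quick check of the SIT_IntellRate definition in the note above, the following sketch recomputes the value for three rows of Table 1. It assumes that SIT_Intell is converted from a percentage to a proportion before it is multiplied by the speaking rate; the function name is illustrative and is not part of the SIT software.

```python
def intelligible_speaking_rate(intelligibility_pct, rate_wpm):
    """Intelligible speaking rate in WPM (assumed: percent / 100 * rate)."""
    return intelligibility_pct / 100.0 * rate_wpm

# (SIT_Intell in %, SIT_SR in WPM) copied from Table 1.
table_rows = {
    "ALS1": (54.55, 102.99),
    "ALS8": (14.09, 47.64),
    "HC3": (99.55, 167.10),
}

recomputed = {
    subject: round(intelligible_speaking_rate(intell, rate), 2)
    for subject, (intell, rate) in table_rows.items()
}
```

Under this assumption, the recomputed values agree with the tabulated SIT_IntellRate entries for these rows to within rounding.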
provides an overview of the demographic and clinical characteristics of the participants.

The participants with ALS were recruited from the multidisciplinary ALS clinic at The University of Kansas Medical Center and had been seen by a speech-language pathologist (the second author) who evaluated their speech function following standard clinical guidelines. Based on the clinical speech evaluation, six participants with ALS manifested overt clinical speech symptoms consistent with flaccid–spastic dysarthria, and seven were asymptomatic. Regardless of the presence of clinical speech symptoms, all participants with ALS (along with their caregivers when applicable) reported experiencing speech/voice changes since the disease onset, according to a questionnaire survey of self-perceived problems with bulbar (i.e., speech and swallowing) motor function. Given that ALS is a neurodegenerative disease known to have a long prodromal phase, detecting subclinical changes preceding the clinical symptom onset is of critical importance in early intervention to slow disease progression and improve patients’ quality of life (Green et al., 2013; Yorkston et al., 2002). Inclusion of both patients with and without overt clinical symptoms can therefore provide a heterogeneous sample of the prodromal and symptomatic phases of bulbar disease (i.e., a hallmark feature of ALS characterized by the involvement of the bulbar musculature) to assess the utility of the proposed approach in detecting temporal speech deficits throughout the disease course.

Functional Speech Outcomes

A standard functional speech assessment, the Sentence Intelligibility Test (SIT; Yorkston, Beukelman, et al., 2007), was administered to all participants. In this test, participants read 11 randomly generated sentences of five to 15 words. Speech was digitally recorded at 22.05 kHz and later orthographically transcribed and timed by two naïve listeners, using the built-in software for the SIT. The listeners were undergraduate students in the speech-language pathology major, who (a) were native speakers of American English; (b) had normal speech, language, hearing, and cognitive functions; and (c) were unfamiliar with either the speech stimuli or the speaker characteristics. Based on the perceptual response of each listener, the percentage of words correctly transcribed and the number of words per minute (WPM) were calculated and then averaged across listeners as measures of speech intelligibility and speaking rate, respectively. Interlistener reliability was shown to be high in our prior work (Rong & Heidrick, 2021), with a significant strong correlation and minimal discrepancy between the two listeners for both intelligibility (r = .86, interlistener discrepancy = 1.02% ± 1.44%) and rate (r = .99, interlistener discrepancy = 3.15 ± 2.03 WPM). Intelligible speaking rate was then calculated by multiplying speech intelligibility by speaking rate. The SIT-derived speech intelligibility and intelligible speaking rate served as the functional speech outcomes to correlate with the modulation features of articulation in the first aim.

Speech Production Experiment

Experimental Task
The protocol for the speech production experiment was part of a larger project, in which participants performed a variety of oral reading tasks under habitual and modified conditions. For this study, the sentence “take today’s tasty tea on the terrace” was selected as the speech stimulus, which was read out loud 3 times by the participants, first at their habitual rate and then at a slower rate. For the slow rate task, speaking rate was voluntarily manipulated by the participants in accordance with the following instruction: “Read the sentence out loud at a slower rate, about half your habitual rate, focusing on stretching the duration of the words.” The data derived from the habitual rate task were used for all study aims, and the data derived from the slow rate task were used for the third aim only.

The selected sentence consists of seven words of one to two syllables with an alternating stress pattern. Phonetically, this sentence is loaded with (voiceless) alveolar stops and (middle) vowels, along with several other phonemes (e.g., velar stop, fricative, nasal, and approximant). Previous research on segmental timing has shown that the release of voiceless stops and middle vowels are among the most affected segments in individuals with ALS, exhibiting significant and greater increases in duration compared to other segments (Tjaden & Turner, 2000). From a physiological perspective, because muscles of the tongue, especially of tongue tip, are known to be the most involved articulatory musculature in ALS (DePaul et al., 1988), the alveolar stop–vowel contexts are expected to elicit abnormal tongue kinematics reflective of the underlying neuromotor pathology. In addition to the tongue as the primary articulator, the jaw, which anatomically lies under the tongue, and the lips are involved as secondary articulators.

Instrumentation
A three-dimensional electromagnetic kinematic tracking system (Wave, Northern Digital, Inc.) was used to record articulatory movements while the participants read the speech stimulus. Specifically, four small, wired sensors were attached to (a) tongue tip (0.5–1 cm posterior to the apex of the tongue), (b) tongue body (2–3 cm posterior to tongue tip), (c) jaw (center of lower chin), and (d) lower lip (central vermilion border of lower lip) using dental adhesive or medical tape. An additional reference sensor embedded in a headband was attached to the center of forehead,

with its axes aligning as closely as possible with the anatomical axes of the oral cavity. Sensor motions were recorded relative to the head reference at 100 Hz by the WaveFront software (Northern Digital, Inc.). In addition, a midsagittal palatal trace was acquired posteriorly from the edge between the hard and soft palate and anteriorly to the central upper incisor using a probe supplied by Northern Digital, Inc. Moreover, participants wore a head-mounted microphone (DPA d:fine 4188) to record speech acoustic signals, which served as the reference for interpreting the articulatory data. The acoustic signals were processed by the Behringer XENYX 802 sound conditioner, digitized at 22.05 kHz, and acquired by the WaveFront software.

Data Processing

The movement trajectories of all articulators were visually inspected to exclude data with recording errors. Note that, because the sensors were independent of each other, only the data for the malfunctioning sensors in specific trials were excluded, whereas the data for other normally functioning sensors in these trials were preserved. Given the primary focus of this study on linguistically relevant timescales in the lower frequency domain (i.e., ≤ 40 Hz), only the superior–inferior and anterior–posterior dimensions of articulatory data, which characterized the primary pattern of linguistically related articulatory motions, were analyzed to minimize the confounding effect of the linguistically less relevant lateral–medial motions; the resulting two-dimensional articulatory trajectories were low-pass filtered at 40 Hz by a second-order, zero-lag Butterworth filter to further remove high-frequency movement artifacts.

Decoupling
Because tongue and lower lip resided on the jaw, the sensors placed on tongue tip, tongue body, and lower lip were expected to capture a mixture of active (i.e., tongue and lower lip movements independent of the jaw) and passive (i.e., loading effect of the jaw) movement components. To decouple these active and passive movement components, the sensor data for tongue tip, tongue body, and lower lip were transformed from the head coordinate system to the jaw coordinate system, using the transformation method described in Westbury et al. (2002; see Equation 1). The transformed sensor data reflected the active tongue tip, tongue body, and lower lip movements, with removal of the passive loading effect of the jaw.

Characterization of Time-Varying Articulatory Changes
To characterize articulatory changes over time for the subsequent temporal modulation analysis, we derived four articulatory time series. The first articulatory time series was derived from the original, untransformed jaw sensor data, as the Euclidean distance between the jaw and an oral anatomical landmark, the central upper incisor (location determined by the anterior edge of the palatal trace), at each sampled time point. This articulatory time series characterized the jaw movement pattern that passively contributed to the tongue tip, tongue body, and lip gestures. The other three articulatory time series were derived from the transformed tongue tip, tongue body, and lower lip sensor data, as the Euclidean distance between each articulator and the jaw at each sampled time point. These articulatory time series characterized the patterns of tongue tip, tongue body, and lower lip movements relative to the jaw, reflecting the active movement components that contributed to the tongue tip, tongue body, and lower lip gestures. A graphic example of the articulatory time series for the habitual and slow rate tasks for a participant with ALS is provided in Figure 1. Temporal modulation analysis was applied to these articulatory time series, with or without rate normalization (see details below).

Rate Normalization
As a global timing feature, speaking rate itself contributes to the statistical pattern of temporal modulation (e.g., reduced speaking rate during the slow rate task tends to increase the magnitude of modulation in the low-frequency range). Although it was our intention to use voluntary rate reduction as a triggering method to elicit mechanistic changes in temporal patterning of articulation in Aim 3, the change in speaking rate itself could introduce a confounder that may interfere with the interpretation of the rate-elicited articulatory changes. Therefore, we adopted the rate normalization approach in Leong et al. (2017) to normalize the duration of all articulatory time series for the slow rate task relative to that of the habitual rate task. Specifically, each slow rate trial was matched with a trial of the habitual rate task produced by the same speaker (i.e., the ith slow rate trial was matched with the ith habitual rate trial to account for potential habituation effects). Duration was calculated for both the slow and habitual rate trials. The slow rate trial was then rescaled in time by resampling the time series based on the ratio of duration between the two rate conditions. The resulting rescaled articulatory data for the slow rate task, along with the original articulatory data for the habitual rate task, were submitted to the temporal modulation analysis described in the next section.

Temporal Modulation Analysis

For each trial of production, 24 features were extracted algorithmically by a custom program in MATLAB (MathWorks) to characterize the temporal modulation patterns at three linguistically relevant timescales (delta,

Rong & Heidrick: Temporal Patterning of Articulation 4585


Figure 1. Example of acoustic and articulatory time series for the habitual and slow rate tasks for a participant with amyotrophic lateral scle-
rosis (ALS12). TT distance = Euclidean distance between tongue tip and jaw at each sampled time point; TB distance = Euclidean distance
between tongue body and jaw at each sampled time point; L distance = Euclidean distance between lower lip and jaw at each sampled time
point; J distance = Euclidean distance between jaw and upper incisor at each sampled time point.
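The distance measures named in the caption can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes two-dimensional positions (anterior–posterior, superior–inferior) expressed in a common coordinate frame, and the function name and sample coordinates are hypothetical.

```python
import math

def distance_series(positions, reference):
    """Euclidean distance from each sampled position to a reference.

    `reference` is either one fixed point (e.g., the upper incisor landmark)
    or a sequence of points with one entry per sample (e.g., the jaw sensor).
    """
    fixed = isinstance(reference[0], (int, float))
    refs = [reference] * len(positions) if fixed else reference
    if len(refs) != len(positions):
        raise ValueError("position and reference series differ in length")
    return [math.dist(p, r) for p, r in zip(positions, refs)]

# Hypothetical samples in mm (not taken from the study data).
jaw = [(0.0, 0.0), (0.0, -3.0), (1.0, -6.0)]
incisor = (0.0, 10.0)
tongue_tip = [(3.0, 4.0), (3.0, 1.0), (4.0, -2.0)]

j_distance = distance_series(jaw, incisor)      # jaw vs. fixed landmark
tt_distance = distance_series(tongue_tip, jaw)  # tongue tip vs. moving jaw
```

In this toy example the tongue tip moves in parallel with the jaw, so tt_distance stays constant while j_distance grows, which mirrors the distinction between passive (jaw-driven) and active movement components.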

0.9–2.5 Hz; theta, 2.5–12 Hz; and beta/gamma, 12–40 Hz) within (intra) and between (inter) articulators. These features are summarized in Table 2 and elaborated below.

Intra-Articulator Modulation
In view of the oscillatory models, to generate a structured rhythmic hierarchy requires the component oscillators to (a) entrain to all periodicity contents in the hierarchy and (b) remain harmonically entrained (or phase-locked) to one another (Giraud & Poeppel, 2012). In keeping with this theoretical view, 20 features were extracted to characterize the magnitude and relative phase of temporal modulation patterns across four articulators: tongue tip, tongue body, lower lip, and jaw. This feature set included (a) three features measuring the magnitude of delta-, theta-, and beta/gamma-band modulations, respectively, and (b) two features characterizing the relative phase between the theta-band modulation, which has been identified as the master oscillatory rhythm during speech processing (Ghitza, 2011, 2012), and the delta- and beta/gamma-band modulations, for each articulator.

Modulation depths within the target frequency bands. Modulation spectrum was generated for each articulatory time series by power spectrum analysis using a 2,048-point fast Fourier transform (FFT) with a Hamming window. Similar to the acoustic envelope modulation analysis by Liss et al. (2010), the mean depth of modulation within each target frequency band, representing the extent of articulatory entrainment to the relevant linguistic rhythm, was calculated by summing

Table 2. Summary of modulation features.

Category                                  Feature                   Targeted aspect of temporal patterning of articulation

Intra-articulator
Delta-band modulation depth               TT_mod_depth_delta        Articulatory entrainment to the rhythm of prosodic stress
                                          TB_mod_depth_delta
                                          L_mod_depth_delta
                                          J_mod_depth_delta
Theta-band modulation depth               TT_mod_depth_theta        Articulatory entrainment to the rhythm of syllable
                                          TB_mod_depth_theta
                                          L_mod_depth_theta
                                          J_mod_depth_theta
Beta/gamma-band modulation depth          TT_mod_depth_beta.gamma   Articulatory entrainment to the rhythm of onset–rime/phoneme
                                          TB_mod_depth_beta.gamma
                                          L_mod_depth_beta.gamma
                                          J_mod_depth_beta.gamma
Delta–theta phase synchronization         TT_PSI_delta_theta        Harmonic articulatory entrainment to the rhythms of stress and syllable
                                          TB_PSI_delta_theta
                                          L_PSI_delta_theta
                                          J_PSI_delta_theta
Theta–beta/gamma phase synchronization    TT_PSI_theta_beta.gamma   Harmonic articulatory entrainment to the rhythms of syllable and onset–rime/phoneme
                                          TB_PSI_theta_beta.gamma
                                          L_PSI_theta_beta.gamma
                                          J_PSI_theta_beta.gamma

Interarticulator
Theta-band coherence                      COH_TTJ_theta             Interarticulator entrainment to the rhythm of syllable
                                          COH_TBJ_theta
Beta/gamma-band coherence                 COH_TTJ_beta.gamma        Interarticulator entrainment to the rhythm of onset–rime/phoneme
                                          COH_TBJ_beta.gamma

Note. TT = tongue tip; TB = tongue body; L = lower lip; J = jaw; PSI = phase synchronization index; COH = coherence.
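To make the intra-articulator feature families in Table 2 concrete, the sketch below computes band-limited modulation depths and a phase synchronization index with NumPy. It is not the authors' MATLAB program. Removing the mean before the FFT, applying one Hamming window to the whole trial, using half-open band edges, and multiplying the slower band's phase by n are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

FS = 100.0  # sensor sampling rate in Hz, as reported for the kinematic recordings
BANDS = {"delta": (0.9, 2.5), "theta": (2.5, 12.0), "beta.gamma": (12.0, 40.0)}

def modulation_depths(x, fs=FS, nfft=2048):
    """Share of total spectral power in each target band (assumed definition)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                    # assumption: remove the DC offset
    windowed = x * np.hamming(len(x))   # assumption: one window over the trial
    # rfft zero-pads trials shorter than nfft and truncates longer ones.
    power = np.abs(np.fft.rfft(windowed, n=nfft)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    total = power.sum()
    return {name: float(power[(freqs >= lo) & (freqs < hi)].sum() / total)
            for name, (lo, hi) in BANDS.items()}

def phase_sync_index(phi_slow, phi_fast, n, m):
    """PSI = |<exp(i(n*phi_slow - m*phi_fast))>|, averaged over time."""
    dphi = n * np.asarray(phi_slow) - m * np.asarray(phi_fast)
    return float(np.abs(np.mean(np.exp(1j * dphi))))

# Synthetic check: a pure 5 Hz oscillation should load on the theta band.
t = np.arange(0, 4, 1 / FS)
depths = modulation_depths(np.sin(2 * np.pi * 5 * t))

# Phases locked 2:1 (2 Hz vs. 4 Hz) versus not locked 2:1 (2 Hz vs. 5 Hz).
locked = phase_sync_index(2 * np.pi * 2 * t, 2 * np.pi * 4 * t + 0.3, n=2, m=1)
unlocked = phase_sync_index(2 * np.pi * 2 * t, 2 * np.pi * 5 * t, n=2, m=1)
```

With real data, the instantaneous phases passed to phase_sync_index would come from band-limited articulatory signals (for example, via a Hilbert transform), a step this sketch leaves out.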

the power at all frequency bins within the target band and normalizing it by the total spectral power. As a result, 12 measures of modulation depth (4 articulators × 3 target bands) were derived per trial of production, which are denoted as TT_mod_depth_delta, TT_mod_depth_theta, TT_mod_depth_beta.gamma, TB_mod_depth_delta, TB_mod_depth_theta, TB_mod_depth_beta.gamma, L_mod_depth_delta, L_mod_depth_theta, L_mod_depth_beta.gamma, J_mod_depth_delta, J_mod_depth_theta, and J_mod_depth_beta.gamma in Table 2.

Cross-frequency phase synchronization. The relative phase between articulatory modulations within different frequency bands was assessed by a synchronization index based on the harmonic phase alignment of sub-band articulatory signals. Synchronization is an important attribute of neuron activities contributing to the coherence and regularity of neural outputs (Roelfsema et al., 1997; Tononi et al., 1998). In the motor speech system, the functional significance of synchronization is corroborated by empirical evidence showing that reduced synchronization of motor unit firing, as caused by motor neuron degeneration in ALS, can reduce the coherence and regularity of articulatory muscle activities and their movement kinematics (Rong, 2021; Rong & Pattee, 2021).

In physical terms, the synchronization of two oscillators is represented by their generalized phase relationship (Schack & Weiss, 2005; Tass et al., 1998). When two oscillators are in synchrony, they are referred to as being phase locked or harmonically entrained, which satisfies the following condition:

$$|\Delta\phi(t)| < \sigma, \quad \text{where } \Delta\phi(t) = n\phi_1(t) - m\phi_2(t). \tag{1}$$

In this equation, $\phi_1(t)$ and $\phi_2(t)$ are the instantaneous phase of the oscillators at time $t$, $\Delta\phi(t)$ is the generalized phase difference between the two oscillators, $n$ and $m$ are integers reflecting the frequency relation between the two oscillators, and $\sigma$ is a small positive constant. Based on the phase locking condition, the degree of synchronization between two oscillators can be measured by a phase synchronization index (PSI; Schack & Weiss, 2005) defined as follows:

$$\mathrm{PSI} = \left| \left\langle e^{i\left(n\phi_1(t) - m\phi_2(t)\right)} \right\rangle \right|, \tag{2}$$

where $i$ is the unit imaginary number equal to $\sqrt{-1}$ and $\langle \cdot \rangle$ denotes averaging across time. PSI ranges between 0 and 1, with 0 denoting no synchrony and 1 indicating perfect synchrony.

In this study, PSI was used to assess the harmonic entrainment of articulatory activities to different linguistic rhythms by evaluating the phase alignment between articulatory modulations within different frequency bands. Specifically, each articulatory time series was bandpass



filtered into three sub-band signals (cutoffs: 0.9–2.5, 2.5–12, and 12–40 Hz) corresponding to delta-, theta-, and beta/gamma-band modulations, using a fourth-order, zero-lag Butterworth filter. The PSI between delta- and theta-band modulations was then calculated using Equation 2, wherein the ratio of n:m was empirically set to 2:1 based on the delta–theta frequency relationship (Leong et al., 2017). These indices measured how regularly the theta subcycles resided within the delta cycles, which, in linguistic terms, reflected the temporal regularity of syllable stress. Similarly, the PSI between theta- and beta/gamma-band modulations was calculated using Equation 2, with the ratio of n:m set to 3:1 according to Leong et al. (2017). These indices measured how regularly the beta/gamma subcycles resided within the theta cycles, reflecting the temporal organization of phonemes within the syllable structure. Taken together, eight PSIs were extracted per trial of production, characterizing the harmonic entrainment of the articulators to different linguistic rhythms. These measures are denoted as TT_PSI_delta_theta, TT_PSI_theta_beta.gamma, TB_PSI_delta_theta, TB_PSI_theta_beta.gamma, L_PSI_delta_theta, L_PSI_theta_beta.gamma, J_PSI_delta_theta, and J_PSI_theta_beta.gamma in Table 2.

Interarticulator Modulation
The coherence of temporal modulations was calculated for two pairs of functionally related articulators, namely, tongue tip–jaw and tongue body–jaw, within the target frequency bands (see Footnote 1). First, the magnitude coherence spectrum was generated for each pair of articulators, using a 2,048-point FFT with a 64-point, 50% overlap moving window. Next, the peaks in the coherence spectrum were identified within each target band. In the dynamical systems viewpoint, the peaks in the coherence spectrum reflect the entrainment patterns between two dynamical systems (Namasivayam & van Lieshout, 2008; van Lieshout, 2004).

To distinguish the peaks reflecting the entrainment between articulators from signal artifacts (e.g., small local fluctuations), a baseline was established by (a) randomly shuffling both articulatory time series to create a pair of surrogate signals, (b) generating the magnitude coherence spectrum for the surrogate signals using the same method as above, and (c) repeating the previous two steps 50 times and averaging the coherence spectra across the 50 pairs of surrogate signals. The resulting mean coherence spectrum for the surrogate signals served as the baseline to compare with the interarticulator coherence spectra. Among all initially identified peaks in the interarticulator coherence spectra, only the ones above the baseline were retained. The mean magnitude of the retained peaks within each target band was calculated to index the coherence between the modulations of different articulators at the specified timescales.

Based on a post hoc examination, there was a lack of peaks within the delta band in most interarticulator coherence spectra, suggesting the absence of interarticulator entrainment to the delta rhythm. Thus, all delta-band coherence measures were discarded. The remaining four coherence measures, including the theta- and beta/gamma-band coherence for the tongue tip–jaw pair (COH_TTJ_theta, COH_TTJ_beta.gamma) and the tongue body–jaw pair (COH_TBJ_theta, COH_TBJ_beta.gamma), were submitted to further analysis.

Footnote 1. To reduce the number of features and enhance the power of statistical analyses, the coherence between the temporal modulations of lower lip and jaw (the articulators underlying the lip gesture) was not assessed due to the lip gesture being less activated in the speech stimulus of this study.

Statistical Analysis

Statistical analyses were conducted within the R statistical computing program (R Core Team, 2019). The significance level was set to p < .05 for the main effects and adjusted for the post hoc tests using the Bonferroni method.

Task Performance Verification
To verify that the slow rate task was performed as instructed, sentence duration was calculated for all trials of production and compared between the two rate conditions at both individual and group levels. For individual-level comparisons, the sentence duration for each trial of production for the slow rate task was compared with a participant-specific baseline, which was derived as the mean sentence duration for the habitual rate task for the participant. For the group-level comparison, a linear mixed-effects model was applied, with task (habitual, slow), group (ALS, control), and Task × Group interaction serving as the fixed effects and participant being treated as the random effect. Post hoc between-tasks comparisons of sentence duration were carried out for each group based on estimated marginal means (emmeans; Lenth, 2020) with Bonferroni-adjusted significance level.

Aim 1: Relationship Between Temporal Patterning of Articulation and Functional Speech Outcomes
Factor analysis. To prevent the potential intercorrelations of the modulation features from inflating their contributions to the functional speech outcomes, exploratory factor analysis was applied to the set of features extracted from the habitual rate task, using the variance maximization rotation and maximum likelihood factoring method (fa; Revelle, 2020). The 24 features were subsequently factorized into a smaller number of orthogonal factors, where the number of factors was determined by parallel analysis (fa.parallel; Revelle, 2020). These factors constituted a lower dimensional representation of the temporal

modulation pattern of articulation. The fit of the factorization model was evaluated by three standard fit indices: Tucker–Lewis index (TLI), root-mean-square of residuals (RMSR), and root-mean-square error of approximation (RMSEA) index. A model with a good fit should have a large TLI (range: 0–1) close to 1 and small RMSR and RMSEA values close to 0. Finally, the factor scores, which measured the performance of the participants on each dimension in the factor space, were estimated by the correlation-preserving method. These factor scores were used as the independent variables to correlate with the functional speech outcomes.

Regression. To predict the functional speech outcomes (i.e., speech intelligibility and intelligible speaking rate), all factor scores as derived above were fed first into linear regression and then into support vector regression (SVR) with the radial basis function (RBF) kernel. SVR is a regression method based on one of the most robust and widely used machine learning algorithms, the support vector machine (Drucker et al., 1997). SVR has various advantages over linear regression, including (a) more flexible error threshold and (b) allowing for mapping of the original data into a higher dimensional space using nonlinear kernels such as RBF to solve regression problems that do not have a good fit in a linear space. Given the exploratory nature of this study and the known nonlinearity of the speech production and perception mechanisms, we employed both linear and nonlinear regression methods to predict the contribution of temporal modulation of articulation to the functional speech outcomes. Moreover, because of the highly skewed distribution of speech intelligibility (skewness = −3.14) with two participants showing substantially lower scores (ALS1, ALS8; see Table 1) than others, intelligibility was logarithmically transformed into 10 log10[max(intelligibility + 1) − intelligibility] (skewness after transformation = 1.48), to prevent the sparse data on the lower end of the distribution from generating biased results (i.e., blind spots) during the training of the machine learning algorithm. The performance of the models was evaluated by R² and root-mean-square error (RMSE) through fivefold cross-validation.

Aim 2: Effect of Motor Speech Impairment on Temporal Patterning of Articulation
To examine the effect of motor speech impairment on temporal patterning of articulation, linear mixed-effects models were applied to the data for the habitual rate condition. Each model was constructed with one of the modulation features as the dependent variable, group (ALS, control) as the fixed effect, and participant as the random effect. To further evaluate the direction and extent of disease-related change in temporal patterning of articulation, the effect sizes for the between-groups differences in all modulation features were estimated by Cohen’s d.

Aim 3: Effect of Rate Manipulation on Temporal Patterning of Articulation
To examine the effect of rate manipulation on temporal patterning of articulation, the data for the habitual and slow rate conditions were merged. Linear mixed-effects models were applied to the merged data. Within each model, one of the modulation features served as the dependent variable; the fixed effects consisted of task, group, and Task × Group interaction; and participant was treated as the random effect. Post hoc pairwise comparisons were conducted based on the contrast of estimated marginal means between the two rate conditions for each group, with Bonferroni-adjusted significance level.

Results

Task Performance Verification

Individual-level between-tasks comparisons revealed increased sentence duration in all but three trials of the slow rate task (including two trials produced by ALS1 and one trial produced by HC3) relative to the baseline sentence duration for the habitual rate task. The three exceptions were considered to reflect a failure to follow instructions or an inability to voluntarily reduce speaking rate and were excluded from the subsequent analyses. Figure 2 provides a graphic display of sentence duration by task and participant. Group-level statistical analysis revealed a significant main effect of task, F(1, 104.40) = 365.24, p < .0001, and Task × Group interaction, F(1, 104.40) = 7.13, p = .0088, on sentence duration. There was no significant difference in sentence duration between the two groups, F(1, 21.17) = 3.73, p = .067. Post hoc tests suggested that sentence duration was significantly longer in the slow versus habitual rate task for both healthy controls (t = 10.95, p < .0001) and individuals with ALS (t = 16.49, p < .0001). These descriptive and statistical results confirmed the general adherence to task instructions and the validity of task performance at both group and individual levels.

Modulation and Coherence Spectra

To provide a visualization of the intra- and interarticulator modulation patterns during habitual speech, Figure 3 shows the mean (and standard error) of the modulation and coherence spectra by group. Because the spectral energy of biological signals naturally follows a 1/f distribution, we further fitted a 1/f function to each modulation spectrum to help visualize the modulation patterns. Deviations from the 1/f fit imply rhythmic modulation within the relevant frequency ranges. Based on this notion, it can be seen that the articulators all exhibited rhythmic modulations within the target frequency bands (delta, theta, and beta/gamma), with tongue tip, tongue body, and jaw showing overall greater modulations compared with lower

Rong & Heidrick: Temporal Patterning of Articulation 4589


Figure 2. Sentence duration by task (Habitual = habitual rate task; Slow = slow rate task) and participant. ALS1–ALS13 = participants with
amyotrophic lateral sclerosis; HC1–HC10 = healthy controls.

lip in both ALS and healthy control groups. All coherence spectra exhibited one or more peaks within the theta and
beta/gamma bands (note that only the peaks above the baseline generated by the surrogate signals were included for
calculating interarticulator coherence) and a lack of peaks within the delta band in both ALS and healthy control
groups. These observations are consistent with the post hoc peak verification as noted in the Method section.

Aim 1: Relationship Between Temporal Patterning of Articulation and Functional Speech Outcomes

Factor Analysis
The 24 modulation features were clustered into six composite factors. The resulting six-factor model exhibited an
acceptable fit, as indicated by the fit indices: TLI = 0.86, RMSR = 0.06, RMSEA = 0.055. The rotated factor
loadings are shown in Figure 4, with larger loadings denoting greater contributions of the feature to the factor.
Features that loaded greater than 0.3 (i.e., absolute value > 0.3) on each factor were conventionally regarded as
the primary component features for the factor. Based on this notion, (a) Factor 1 was primarily composed of
beta/gamma-band modulation depths for tongue and jaw (e.g., TT_mod_depth_beta.gamma, TB_mod_depth_beta.gamma, and
J_mod_depth_beta.gamma), (b) Factor 2 mainly consisted of delta- and theta-band modulation depths and delta–theta
phase synchronization for tongue and jaw (e.g., TT_mod_depth_delta, TB_mod_depth_delta, J_mod_depth_theta,
TT_PSI_delta_theta, TB_PSI_delta_theta, and J_PSI_delta_theta), (c) Factor 3 was mainly composed of lip modulation
features (e.g., L_mod_depth_delta, L_mod_depth_beta.gamma, and L_PSI_delta_theta), (d) Factor 4 consisted of
interarticulator coherence features (e.g., COH_TTJ_beta.gamma, COH_TBJ_theta, and COH_TBJ_beta.gamma), (e) Factor 5
had one primary component feature reflecting theta-band modulation depth for tongue body (e.g.,
TB_mod_depth_theta), and (f) Factor 6 also had one primary component feature reflecting theta–beta/gamma phase
synchronization for tongue body (e.g., TB_PSI_theta_beta.gamma).

Regression
The results for the regression analyses, including RMSE and R² for both linear regression and SVR and partial R²
and p values based on the linear regression models, are listed in Table 3. Between the two regression methods, SVR
performed notably better in predicting intelligible speaking rate than did linear regression, whereas the two
methods showed similar performance in predicting speech intelligibility. These results confirmed that temporal
patterning of articulation was moderately to strongly correlated with the

4590 Journal of Speech, Language, and Hearing Research • Vol. 65 • 4577–4607 • December 2022
Figure 3. Modulation and coherence spectra for ALS (a, b) and healthy control (c, d) groups. Blue line and shaded area around the blue line
represent the mean and standard error for each group. Red line denotes the 1/f fit to the modulation spectrum, with the associated equation
being displayed above the line. The 1/f spectrum is in the form of S = a/f^b, where f is frequency and a and b are parameters to be
determined during the fitting process. Vertical dashed lines mark the boundaries between the target frequency bands (delta: 0.9–2.5 Hz, theta:
2.5–12 Hz, beta/gamma: 12–40 Hz). ALS = amyotrophic lateral sclerosis; Control = healthy controls; PSD = power spectral density; COH =
coherence; TT = tongue tip; TB = tongue body; L = lower lip; J = jaw.
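
The 1/f fit described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes the fit is done by ordinary least squares on log-log axes, which the article does not state; the function names and the synthetic spectrum are hypothetical and are not the authors' code or data.

```python
import math

def fit_one_over_f(freqs, power):
    """Fit S = a / f**b by least squares on log-log axes.

    Assumption: log-log linear regression. The article does not say
    which fitting procedure was used.
    """
    xs = [math.log(f) for f in freqs]
    ys = [math.log(s) for s in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                     # slope of log S vs. log f equals -b
    intercept = mean_y - slope * mean_x   # intercept equals log a
    return math.exp(intercept), -slope    # (a, b)

def residual(f, s, a, b):
    """Positive value: more power at f than the 1/f trend predicts."""
    return s - a / f ** b

# Hypothetical spectrum that follows S = 2.0 / f**1.5 exactly, 0.9 to 40 Hz
freqs = [0.9 + 0.1 * k for k in range(392)]
power = [2.0 / f ** 1.5 for f in freqs]
a, b = fit_one_over_f(freqs, power)
```

With a measured spectrum in place of the synthetic one, positive residuals inside a band (for example, theta) would correspond to the deviations from the 1/f trend that the text interprets as rhythmic modulation.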

functional speech outcomes across individuals with ALS and healthy controls (R² = .51–.73). Among different
predictors, Factors 1, 2, and 5 contributed the most to both speech intelligibility and intelligible speaking rate,
as reflected by their partial R² and significant p values.

Aim 2: Effect of Motor Speech Impairment on Temporal Patterning of Articulation

Of all modulation features, five showed a significant main effect of group at the significance level of p < .05.
These features included (a) delta- and theta-band modulation depths and theta–beta/gamma phase synchronization for
tongue tip, TT_mod_depth_delta, F(1, 20.86) = 7.44, p = .013; TT_mod_depth_theta, F(1, 21.06) = 17.12, p = .00046;
TT_PSI_theta_beta.gamma, F(1, 25.32) = 4.27, p = .049, and (b) theta- and beta/gamma-band modulation depths for
tongue body, TB_mod_depth_theta, F(1, 20.76) = 5.62, p = .028; TB_mod_depth_beta.gamma, F(1, 20.21) = 6.16,
p = .022. Among these five features, the group effect on TT_mod_depth_theta survived Bonferroni correction for
multiple tests. The group



Figure 4. Rotated factor loadings. The 24 modulation features are clustered into six composite factors by exploratory factor analysis. Fea-
tures that load greater than 0.3 (i.e., absolute value of loading > 0.3) on each factor are conventionally regarded as the primary component
features for the factor and are marked in red (for features that load greater than 0.3 on more than one factor, the factor corresponding to
the largest absolute value of the loading is identified and marked in red); the remaining features are marked in blue. Feature notation:
TT_mod_depth_delta = delta-band modulation depth for tongue tip; TT_mod_depth_theta = theta-band modulation depth for tongue tip;
TT_mod_depth_beta.gamma = beta/gamma-band modulation depth for tongue tip; TB_mod_depth_delta = delta-band modulation depth for
tongue body; TB_mod_depth_theta = theta-band modulation depth for tongue body; TB_mod_depth_beta.gamma = beta/gamma-band
modulation depth for tongue body; L_mod_depth_delta = delta-band modulation depth for lower lip; L_mod_depth_theta = theta-band mod-
ulation depth for lower lip; L_mod_depth_beta.gamma = beta/gamma-band modulation depth for lower lip; J_mod_depth_delta = delta-band
modulation depth for jaw; J_mod_depth_theta = theta-band modulation depth for jaw; J_mod_depth_beta.gamma = beta/gamma-band mod-
ulation depth for jaw; TT_PSI_delta_theta = delta–theta phase synchronization index (PSI) for tongue tip; TT_PSI_theta_beta.gamma =
theta–beta/gamma PSI for tongue tip; TB_PSI_delta_theta = delta–theta PSI for tongue body; TB_PSI_theta_beta.gamma = theta–beta/
gamma PSI for tongue body; L_PSI_delta_theta = delta–theta PSI for lower lip; L_PSI_theta_beta.gamma = theta–beta/gamma PSI for lower
lip; J_PSI_delta_theta = delta–theta PSI for jaw; J_PSI_theta_beta.gamma = theta–beta/gamma PSI for jaw; COH_TTJ_theta = theta-band
coherence between tongue tip and jaw; COH_TTJ_beta.gamma = beta/gamma-band coherence between tongue tip and jaw;
COH_TBJ_theta = theta-band coherence between tongue body and jaw; COH_TBJ_beta.gamma = beta/gamma-band coherence between
tongue body and jaw.

effect did not reach statistical significance for the rest of the features (p > .05).
The direction and extent of disease-related change in all modulation features, as indicated by the Cohen’s d
effect sizes, are visualized in Figure 5. Of all features, those identified as exhibiting a significant main group
effect, corrected or uncorrected, all showed moderate-to-large decreases in the ALS versus healthy control group
(Cohen’s d = −0.52 to −1.69). In addition, five other features, including TT_mod_depth_beta.gamma,
TB_PSI_theta_beta.gamma, J_mod_depth_theta, J_mod_depth_beta.gamma, and J_PSI_delta_theta, revealed moderate
decreases in the ALS versus healthy control group (Cohen’s d = −0.52 to −0.78), although the main effect did not
reach statistical

Table 3. Results for the regression models between the factor scores derived from the articulatory modulation features and the functional speech outcomes.

Per-factor results (linear regression)

            Intelligibility           Intelligible speaking rate
Factor      p          Partial R²     p          Partial R²
Factor 1    .00030     .19            < .0001    .27
Factor 2    < .0001    .38            < .0001    .52
Factor 3    .58        .0050          .28        .019
Factor 4    .18        .029           .87        .00043
Factor 5    .00013     .21            < .0001    .24
Factor 6    .73        .0020          .28        .019

Model-level fit

Outcome                       RMSE(LR)    R²(LR)    RMSE(SVR)    R²(SVR)
Intelligibility               .36         .53       .37          .51
Intelligible speaking rate    33.31       .63       29.47        .73

Note. RMSE = root-mean-square error; LR = linear regression; SVR = support vector regression.
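
The intelligibility outcome entering these regression models was log-transformed as 10 log₁₀[max(intelligibility + 1) − intelligibility]. The following minimal sketch shows what that transform does to a left-skewed set of scores. The scores are hypothetical, the helper names are illustrative, and the skewness estimator (population third standardized moment) is an assumption, since the article does not name the one it used.

```python
import math

def transform_intelligibility(scores):
    """Apply 10 * log10(max(scores) + 1 - score) to every score."""
    ceiling = max(scores) + 1
    return [10 * math.log10(ceiling - s) for s in scores]

def skewness(values):
    """Population skewness (assumed estimator): m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical intelligibility scores (%): most near ceiling, two much lower
scores = [99, 98, 97, 100, 96, 60, 45]
transformed = transform_intelligibility(scores)
```

Because the transform subtracts each score from the maximum plus one, the direction is reversed: larger transformed values correspond to lower intelligibility. This is why the skewness changes sign after the transformation.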
Figure 5. Cohen’s d effect sizes for the differences between the articulatory modulation features for individuals with amyotrophic lateral scle-
rosis and healthy controls. The square marks the estimated mean effect size, and the horizontal line around the square is the 95% confi-
dence interval (CI). The numeric values of the mean effect sizes and 95% CIs are shown by the right side of the plot. Feature notation:
TT_mod_depth_delta = delta-band modulation depth for tongue tip; TT_mod_depth_theta = theta-band modulation depth for tongue tip;
TT_mod_depth_beta.gamma = beta/gamma-band modulation depth for tongue tip; TT_PSI_delta_theta = delta–theta phase synchronization
index (PSI) for tongue tip; TT_PSI_theta_beta.gamma = theta–beta/gamma PSI for tongue tip; TB_mod_depth_delta = delta-band modulation
depth for tongue body; TB_mod_depth_theta = theta-band modulation depth for tongue body; TB_mod_depth_beta.gamma = beta/gamma-band
modulation depth for tongue body; TB_PSI_delta_theta = delta–theta PSI for tongue body; TB_PSI_theta_beta.gamma = theta–beta/gamma
PSI for tongue body; L_mod_depth_delta = delta-band modulation depth for lower lip; L_mod_depth_theta = theta-band modulation depth for
lower lip; L_mod_depth_beta.gamma = beta/gamma-band modulation depth for lower lip; L_PSI_delta_theta = delta–theta PSI for lower lip;
L_PSI_theta_beta.gamma = theta–beta/gamma PSI for lower lip; J_mod_depth_delta = delta-band modulation depth for jaw; J_mod_depth_theta =
theta-band modulation depth for jaw; J_mod_depth_beta.gamma = beta/gamma-band modulation depth for jaw; J_PSI_delta_theta = delta–theta
PSI for jaw; J_PSI_theta_beta.gamma = theta–beta/gamma PSI for jaw; COH_TTJ_theta = theta-band coherence between tongue tip and jaw;
COH_TTJ_beta.gamma = beta/gamma-band coherence between tongue tip and jaw; COH_TBJ_theta = theta-band coherence between tongue
body and jaw; COH_TBJ_beta.gamma = beta/gamma-band coherence between tongue body and jaw.
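
As a reading aid for the effect sizes in Figure 5, here is a minimal sketch of Cohen's d for two independent groups. It assumes the pooled-standard-deviation variant applied to one value per participant. The article does not state which variant was used or how repeated sentence productions were handled, and the feature values below are hypothetical.

```python
import statistics

def cohens_d(group_a, group_b):
    """Cohen's d (group_a minus group_b) with a pooled SD (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(group_b)
    pooled_sd = (((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)) ** 0.5
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd

# Hypothetical per-participant values of one modulation feature
als = [1.0, 2.0, 3.0]
control = [2.0, 3.0, 4.0]
d = cohens_d(als, control)
```

With the ALS group entered first, a negative d indicates a lower feature value in the ALS group than in the control group, matching the sign convention of the decreases reported in the text.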

significance. The rest of the features all showed small and nonsignificant differences between the two groups.
The results for Aim 2 together suggested that, compared with healthy controls, individuals with ALS exhibited
notable trends of impairment of intra-articulator modulation, as characterized by an overall decrease in modulation
depths within the target frequency bands (especially the theta and beta/gamma bands) and cross-frequency phase
synchronization for tongue tip, tongue body, and jaw, whereas the modulation features for lower lip were less
affected. Interarticulator modulation remained relatively intact in individuals with ALS.

Aim 3: Effect of Rate Manipulation on Temporal Patterning of Articulation

Figure 6 displays the box plots for all modulation features by task and group. Statistical results for the main
effects of task, group, and Task × Group interaction as well as the post hoc tests are provided in Table 4. Because

Figure 6. Box plots for articulatory modulation features by group (ALS = amyotrophic lateral sclerosis; Control = healthy controls) and task
(Habitual = habitual rate task; Slow = slow rate task). Feature notation: TT_mod_depth_delta = delta-band modulation depth for tongue tip;
TT_mod_depth_theta = theta-band modulation depth for tongue tip; TT_mod_depth_beta.gamma = beta/gamma-band modulation depth for
tongue tip; TT_PSI_delta_theta = delta–theta phase synchronization index (PSI) for tongue tip; TT_PSI_theta_beta.gamma = theta–beta/
gamma PSI for tongue tip; TB_mod_depth_delta = delta-band modulation depth for tongue body; TB_mod_depth_theta = theta-band modu-
lation depth for tongue body; TB_mod_depth_beta.gamma = beta/gamma-band modulation depth for tongue body; TB_PSI_delta_theta =
delta–theta PSI for tongue body; TB_PSI_theta_beta.gamma = theta–beta/gamma PSI for tongue body; L_mod_depth_delta = delta-band
modulation depth for lower lip; L_mod_depth_theta = theta-band modulation depth for lower lip; L_mod_depth_beta.gamma = beta/gamma-
band modulation depth for lower lip; L_PSI_delta_theta = delta–theta PSI for lower lip; L_PSI_theta_beta.gamma = theta–beta/gamma PSI
for lower lip; J_mod_depth_delta = delta-band modulation depth for jaw; J_mod_depth_theta = theta-band modulation depth for jaw;
J_mod_depth_beta.gamma = beta/gamma-band modulation depth for jaw; J_PSI_delta_theta = delta–theta PSI for jaw; J_PSI_theta_beta.
gamma = theta–beta/gamma PSI for jaw; COH_TTJ_theta = theta-band coherence between tongue tip and jaw; COH_TTJ_beta.gamma =
beta/gamma-band coherence between tongue tip and jaw; COH_TBJ_theta = theta-band coherence between tongue body and jaw;
COH_TBJ_beta.gamma = beta/gamma-band coherence between tongue body and jaw.



Table 4. Statistical results for task, group, and Task × Group effects.

Feature                  Task             Group          Task × Group     Control: Slow vs. Habitual  ALS: Slow vs. Habitual
                         F      p         F      p       F       p        t        p                  t      p
TT_mod_depth_delta       4.90   .029      4.64   .043    14.86   .00020   −4.04    .0001              1.24   .22
TT_mod_depth_theta       17.21  < .0001   11.69  .0026   4.47    .037     1.35     .18                4.74   < .0001
TT_mod_depth_beta.gamma  8.61   .0041     2.80   .11     3.94    .050     −3.22    .0017              −0.74  .46
TT_PSI_delta_theta       6.14   .015      0.58   .45     0.27    .61      −1.99    .049               −1.49  .14
TT_PSI_theta_beta.gamma  1.66   .20       0.85   .36     2.88    .092     −0.27    .79                2.26   .026
TB_mod_depth_delta       16.47  < .0001   3.54   .075    0.050   .82      −2.55    .012               −3.24  .0016
TB_mod_depth_theta       13.62  .00036    5.94   .024    0.31    .58      2.88     .0049              2.32   .022
TB_mod_depth_beta.gamma  15.45  .00016    6.53   .019    14.38   .00025   −5.04    < .0001            0.11   .91
TB_PSI_delta_theta       2.58   .11       0.64   .43     0.17    .68      −0.79    .43                −1.53  .13
TB_PSI_theta_beta.gamma  5.15   .025      1.55   .23     0.93    .34      0.87     .39                2.45   .016
L_mod_depth_delta        2.71   .10       2.31   .14     14.63   .00022   −3.64    .0004              1.65   .10
L_mod_depth_theta        1.01   .32       1.97   .18     0.68    .41      1.22     .23                0.14   .89
L_mod_depth_beta.gamma   0.17   .68       0.30   .59     1.38    .24      0.51     .61                −1.20  .23
L_PSI_delta_theta        16.61  < .0001   0.021  .88     11.04   .0012    −4.92    < .0001            0.57   .57
L_PSI_theta_beta.gamma   3.82   .053      1.73   .20     0.043   .84      1.44     .15                1.32   .19
J_mod_depth_delta        15.24  .00017    0.12   .73     0.16    .69      −2.86    .0051              −2.65  .0092
J_mod_depth_theta        36.69  < .0001   2.95   .10     0.18    .67      3.75     .0003              4.91   < .0001
J_mod_depth_beta.gamma   3.09   .082      3.99   .059    3.64    .059     2.44     .016               −0.11  .91
J_PSI_delta_theta        9.40   .0028     2.86   .11     3.22    .076     −3.23    .0016              −0.96  .34
J_PSI_theta_beta.gamma   19.74  < .0001   2.11   .16     4.68    .033     4.39     < .0001            1.73   .087
COH_TTJ_theta            0.024  .88       0.68   .42     0.0020  .96      0.13     .89                0.084  .93
COH_TTJ_beta.gamma       2.72   .10       0.19   .66     0.85    .36      0.49     .63                1.95   .054
COH_TBJ_theta            0.22   .64       0.30   .59     0.65    .42      0.85     .40                −0.26  .80
COH_TBJ_beta.gamma       2.63   .11       0.71   .41     0.24    .62      0.75     .45                1.60   .11

Note. Main effects are denoted by F and p values (Columns 2–7) estimated by linear mixed-effects models. Post hoc between-tasks com-
parisons are denoted by t and p values (Columns 8–11) based on the contrast of estimated marginal means. Features exhibiting a significant
main effect of task and/or Task × Group at the Bonferroni-corrected significance level of p < .002 are bolded. Task: Habitual = habitual rate
task; Slow = slow rate task. Group: ALS = amyotrophic lateral sclerosis; Control = healthy controls. Feature notation: TT_mod_depth_delta =
delta-band modulation depth for tongue tip; TT_mod_depth_theta = theta-band modulation depth for tongue tip; TT_mod_depth_beta.
gamma = beta/gamma-band modulation depth for tongue tip; TT_PSI_delta_theta = delta–theta phase synchronization index (PSI) for tongue
tip; TT_PSI_theta_beta.gamma = theta–beta/gamma PSI for tongue tip; TB_mod_depth_delta = delta-band modulation depth for tongue
body; TB_mod_depth_theta = theta-band modulation depth for tongue body; TB_mod_depth_beta.gamma = beta/gamma-band modulation
depth for tongue body; TB_PSI_delta_theta = delta–theta PSI for tongue body; TB_PSI_theta_beta.gamma = theta–beta/gamma PSI for
tongue body; L_mod_depth_delta = delta-band modulation depth for lower lip; L_mod_depth_theta = theta-band modulation depth for lower
lip; L_mod_depth_beta.gamma = beta/gamma-band modulation depth for lower lip; L_PSI_delta_theta = delta–theta PSI for lower lip; L_PSI_theta_
beta.gamma = theta–beta/gamma PSI for lower lip; J_mod_depth_delta = delta-band modulation depth for jaw; J_mod_depth_theta = theta-band
modulation depth for jaw; J_mod_depth_beta.gamma = beta/gamma-band modulation depth for jaw; J_PSI_delta_theta = delta–theta PSI for
jaw; J_PSI_theta_beta.gamma = theta–beta/gamma PSI for jaw; COH_TTJ_theta = theta-band coherence between tongue tip and jaw;
COH_TTJ_beta.gamma = beta/gamma-band coherence between tongue tip and jaw; COH_TBJ_theta = theta-band coherence between
tongue body and jaw; COH_TBJ_beta.gamma = beta/gamma-band coherence between tongue body and jaw.
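
The Bonferroni-corrected level of p < .002 in the note above is consistent with dividing .05 by the 24 modulation features (.05/24 ≈ .0021). That divisor is an assumption here, since the article does not spell out the calculation. The sketch below applies both thresholds to three rows of Table 4; the `classify` helper is illustrative and not part of the authors' analysis.

```python
ALPHA = 0.05
N_FEATURES = 24                      # assumed number of tests in the correction
BONFERRONI = ALPHA / N_FEATURES      # about .0021, reported as p < .002

# (task p, Task x Group p) copied from Table 4 for three features
p_values = {
    "TT_mod_depth_delta": (0.029, 0.00020),
    "TT_PSI_delta_theta": (0.015, 0.61),
    "COH_TTJ_theta": (0.88, 0.96),
}

def classify(task_p, interaction_p):
    """Label a feature by the smaller of its task and interaction p values."""
    best = min(task_p, interaction_p)
    if best < BONFERRONI:
        return "survives correction"
    if best < ALPHA:
        return "uncorrected only"
    return "not significant"

results = {name: classify(*ps) for name, ps in p_values.items()}
```

In this sketch, TT_mod_depth_delta survives correction through its Task × Group effect, TT_PSI_delta_theta is significant only at the uncorrected level, and COH_TTJ_theta is not significant, which agrees with how these three features are described in the text.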

the purpose of this part of the analysis was to assess how voluntary rate reduction impacted temporal patterning of
articulation, we focused on the effects of task and Task × Group interaction. Out of the 24 features, 14 showed a
significant task and/or Task × Group effect, including TT_mod_depth_delta (task, Task × Group), TT_mod_depth_theta
(task, Task × Group), TT_mod_depth_beta.gamma (task), TT_PSI_delta_theta (task), TB_mod_depth_delta (task),
TB_mod_depth_theta (task), TB_mod_depth_beta.gamma (task, Task × Group), TB_PSI_theta_beta.gamma (task),
L_mod_depth_delta (Task × Group), L_PSI_delta_theta (task, Task × Group), J_mod_depth_delta (task),
J_mod_depth_theta (task), J_PSI_delta_theta (task), and J_PSI_theta_beta.gamma (task, Task × Group), at the
significance level of p < .05. The task and/or Task × Group effect survived Bonferroni correction for multiple
tests on the following 11 features: TT_mod_depth_delta, TT_mod_depth_theta, TT_mod_depth_beta.gamma,
TB_mod_depth_delta, TB_mod_depth_theta, TB_mod_depth_beta.gamma, L_mod_depth_delta, L_PSI_delta_theta,
J_mod_depth_delta, J_mod_depth_theta, and J_PSI_theta_beta.gamma. These results combined indicated that rate
manipulation significantly modified the intra-articulator modulation patterns of all articulators, whereas
interarticulator modulation was relatively unaffected.
Post hoc comparisons between tasks revealed differential effects of voluntary rate reduction on intra-articulator
modulation between individuals with ALS and healthy controls. Specifically, individuals with ALS exhibited (a) a
significant increase in TT_mod_depth_theta, TB_mod_depth_theta, J_mod_depth_theta, TT_PSI_theta_beta.gamma, and
TB_PSI_theta_beta.gamma and (b) a

significant decrease in TB_mod_depth_delta and J_mod_depth_delta between the slow and habitual rate tasks. For
healthy controls, a larger number of features showed a significant decrease, including TT_mod_depth_delta,
TB_mod_depth_delta, L_mod_depth_delta, J_mod_depth_delta, TT_mod_depth_beta.gamma, TB_mod_depth_beta.gamma,
TT_PSI_delta_theta, L_PSI_delta_theta, and J_PSI_delta_theta, whereas a smaller number of features, including
TB_mod_depth_theta, J_mod_depth_theta, J_mod_depth_beta.gamma, and J_PSI_theta_beta.gamma, showed a significant
increase between the slow and habitual rate tasks. Taken together, the results for Aim 3 pointed toward an overall
positive effect of voluntary rate reduction on intra-articulator modulation in individuals with ALS, through (a)
enhancing theta-band modulation depths for tongue tip, tongue body, and jaw and theta–beta/gamma phase
synchronization for tongue tip and tongue body and (b) preserving delta-band modulation depths for tongue tip and
lower lip; beta/gamma-band modulation depths for tongue tip and tongue body; and delta–theta phase synchronization
for tongue tip, lower lip, and jaw, to prevent them from being further degraded as in healthy controls.

Discussion

This study investigated an important but underexplored aspect of speech production, that is, temporal patterning
of articulatory activities, reflecting articulatory entrainment to the underlying linguistic rhythms in both
neurologically healthy and impaired speakers. A variety of features were extracted from the temporal modulation
patterns of tongue tip, tongue body, lower lip, and jaw at three linguistically relevant timescales (delta, theta,
and beta/gamma) to assess the harmonic entrainment of these articulators to the rhythms of stress, syllable, and
onset–rime/phoneme. Moderate-to-strong correlations were found between these modulation features and the
functional speech outcomes across neurologically healthy and impaired speakers. Compared with healthy speakers,
neurologically impaired individuals exhibited reduced and less synchronized articulatory entrainment to the
linguistic rhythms, with tongue tip, tongue body, and jaw being more affected than lower lip. Notable trends of
improvement on these impairments were observed following a behavioral modification (voluntary speaking rate
reduction), supporting the utility of rate control in improving articulatory entrainment in individuals with ALS.
The findings of this study provided preliminary empirical evidence for the functional role of articulatory
entrainment in speech production, shedding light on a novel global timing–based approach for profiling
articulatory deficits in neuromotor speech disorders, which has potential clinical implications in motor speech
assessment and rehabilitation.

Contribution of Temporal Patterning of Articulation to Functional Speech Outcomes

The 24 intra- and interarticulator modulation features were clustered into six composite factors uncovering the
latent traits of temporal patterning of articulation. Based on the results of factorization as shown in Figure 4,
Factors 1 through 4 can be interpreted as reflecting (a) entrainment of the tongue and the jaw to the faster
linguistic rhythm related to onset–rime/phoneme, (b) harmonic entrainment of the tongue and the jaw to the slower
linguistic rhythms related to stress and syllable, (c) lip entrainment to the linguistic rhythm hierarchy, and (d)
interarticulator coherence at the target linguistic rhythms, respectively. Factors 5 and 6 had relatively coarse
feature representations, with the primary component feature reflecting tongue body entrainment to the syllable
rhythm (for Factor 5) and harmonic entrainment of the tongue body to the syllable and onset–rime/phoneme rhythms
(for Factor 6), respectively.
The six factors combined were moderately correlated with speech intelligibility (R² = .51–.53) and strongly
correlated with intelligible speaking rate (R² = .63–.73). The stronger correlation with intelligible speaking rate
was consistent with the finding of Yorkston and Beukelman (1981), suggesting that the integration of
intelligibility and speaking rate may provide a more comprehensive characterization of the multifaceted functional
speech performance than intelligibility alone. As such, some aspects of temporal patterning of articulation might
be reflected in intelligible speaking rate, but not in speech intelligibility. Among different predictors, Factor 2
exhibited the greatest and most significant contribution to both speech intelligibility and intelligible speaking
rate, followed by Factors 1 and 5, both showing significant contributions about half the size of that of Factor 2,
whereas all other factors exhibited minimal contributions. These findings accentuated the functional significance
of tongue and jaw entrainment to linguistic rhythms, especially the rhythms of syllable stress, which is consistent
with the phonetic properties of the speech stimulus eliciting greater activities of the tongue and the jaw relative
to the lower lip.
Taken together, the moderate-to-strong correlations between temporal patterning of articulation and functional
speech outcomes paralleled the vast body of neurolinguistic and psychoacoustic evidence, corroborating the
contributions of delta-, theta-, beta-, and gamma-band modulations to functional speech communication (Ding &
Simon, 2014; Doelling et al., 2014; Drullman, 1995; Ghitza, 2012; Kösem & van Wassenhove, 2017; Luo & Poeppel,
2012; Mai et al., 2016). Using artificially manipulated speech, Ghitza (2012) has associated flattened temporal
modulation envelope of spectrally filtered acoustic signals (spectral passband: 230–3800 Hz, which is the spectral
frequency



range most relevant to articulatory motor activities; see Chandrasekaran et al., 2009) with degraded speech
intelligibility. This study further provided evidence that temporal modulation of the physiological source (i.e.,
articulatory activities) underlying speech acoustics had a similar contribution to the functional speech outcomes.
The fact that we controlled the variation of temporal modulation using real physiological data collected from
individuals with varying speech production capabilities, as opposed to artificially manipulating the product of the
speech production system (e.g., acoustics) as in the study of Ghitza (2012), linked the previously underexplored
production mechanism with the functional speech outcomes. These findings informed a better understanding of
time-based sensorimotor interaction during spoken communication.

Effect of Motor Speech Impairment on Intra-Articulator Modulation

Consistent with our hypothesis, temporal patterning of articulation showed trends of impairment in individuals
with ALS, and such trends were primarily reflected by the differences in the intra-articulator modulation features
between the ALS and healthy control groups. Due to the lack of difference in sentence duration between the two
groups (see Figure 2), the observed disease-related changes in intra-articulator modulation were unlikely to be
by-products of the surface duration difference between individuals with ALS and healthy controls. Instead, these
changes were more likely reflective of mechanistic deficits in articulatory entrainment in ALS.

Modulation Depth
Of all intra-articulator modulation features, modulation depth was the most affected, which was reduced to a
varying extent in individuals with ALS relative to healthy controls (see Figure 5), revealing a general trend
toward impaired articulatory entrainment to the underlying linguistic rhythms. Among different articulators, tongue
tip revealed moderate-to-large decreases in modulation depth within all frequency bands; tongue body and jaw showed
moderate-to-large decreases in the depth of theta- and beta/gamma-band modulations. Lower lip, on the other hand,
revealed relatively small changes in modulation depth within all frequency bands.
As a general principle, the key to entrainment between two bodies is the common period of their oscillatory
movements, which ensures a stable phase relationship between these bodies throughout the entire duration of the
movements (Kugler & Turvey, 1987). Applying this principle to articulatory entrainment, the deciding factor for
entraining an articulator to a linguistic rhythm is the match between the period of the oscillatory movement of
the articulator (i.e., modeled as damped spring mass oscillator in AP/TD) and its designated activation period
(i.e., specified by the planning oscillators in AP/TD). The former is determined by the intrinsic dynamic
properties of the articulator (e.g., stiffness and damping), and the latter is determined by the central clock as
part of the motor plan in accordance with the periodicity of the underlying linguistic event (i.e., delta for
stress, theta for syllable, and beta/gamma for onset–rime/phoneme; Saltzman & Byrd, 2000; Turk &
Shattuck-Hufnagel, 2014).
If the period of the oscillatory movement of the articulator matches its activation period, the movement can be
effectively implemented to achieve the desired articulatory target (Turk & Shattuck-Hufnagel, 2014). By contrast,
if there is a mismatch between the two periods, one can predict that the planned movement would have to be
truncated (when movement period > activation period) or time-stretched (when movement period < activation period)
during the implementation process (Turk & Shattuck-Hufnagel, 2014). Due to reduced firing rates of motor units and
muscle weakness, the duration of articulatory movement has been shown to be significantly prolonged in individuals
with ALS (Rong & Heidrick, 2021; Yunusova et al., 2010), which is accompanied by truncation of movement (Lee &
Bell, 2018; Rong & Heidrick, 2021; Rong et al., 2018; Rong, Yunusova, Wang, & Green, 2015; Shellikeri et al.,
2016). These previously reported changes in articulatory movement are consistent with our prediction, pointing
toward a mismatch between the period of articulatory movement and the underlying linguistic event. Such a mismatch
in periodicity may reduce articulatory entrainment to linguistic rhythms, conceivably explaining the observed
decreases in modulation depth within the target frequency bands. These putative impairments of articulatory
entrainment are in line with the finding of Borrie et al. (2020), which has provided indirect evidence for impaired
articulatory entrainment between neurologically impaired and healthy dyads in a conversational task, using an
acoustic paradigm.
The finding that tongue tip showed the greatest decrease in modulation depth among all articulators may be
attributed to two neuromotor factors. First, lingual motor neurons are known to be more involved than facial and
trigeminal motor neurons in individuals with ALS (DePaul et al., 1988). Between tongue tip and tongue body, which
represent two semi-independent regions of the tongue with different muscle and tissue compositions, tongue tip has
been found to be more susceptible to muscle atrophy and fat replacement related to ALS progression compared with
tongue body (Atsumi & Miyatake, 1987; Cha & Patten, 1989; DePaul et al., 1998). As a result, the movement of the
tongue, especially of tongue tip, tends to be more impaired, resulting in a greater prolongation of movement
duration than that of jaw and lip movements, as previously reported by various kinematic

studies (Rong & Green, 2019; Rong & Heidrick, 2021; Shellikeri et al., 2016). The differential involvement of the articulators may therefore contribute to the varying extent of decrease in their modulation depths. Second, the speech task in this study exerts the highest motoric demand on tongue tip, which is the primary articulator for the alveolar stops in all words, followed by tongue body and jaw, which are involved in vowel articulation, whereas the role of the lower lip is relatively trivial. The disparate levels of motoric demands tend to result in differential sensitivity of the task to motor deficits of tongue tip, tongue body, jaw, and lower lip, providing another possible contributing factor to the varying extent of decrease in modulation depth across different articulators.

Cross-Frequency Phase Synchronization
Cross-frequency phase synchronization was, in general, less affected than modulation depth in individuals with ALS. Nonetheless, a moderate decrease was observed in the delta–theta PSI for the jaw and the theta–beta/gamma PSI for tongue tip and tongue body between the ALS and healthy control groups, although these group differences did not reach statistical significance. Given that the relatively small sample size in this study could render the statistical results nonsignificant, it is worth looking into these moderate but nonsignificant changes in PSI to inform future large-scale studies on cross-frequency articulatory modulation.
As demonstrated by the findings of psychoacoustic research, the temporal structure of speech can be viewed as a hierarchy wherein different linguistic events are constrained in their relative timing to act together as a coherent scene (Cummins & Port, 1998). Pertaining to the timing constraint between stress and syllable, it has been shown that the perceptual center or stress beat is located near the onset of the vowel nucleus of the syllable (Marcus, 1981). The perception of syllable stress is hence dependent on the alignment of the syllable subcycles (reflected by theta-band modulation) within the stress cycles (reflected by delta-band modulation). The speech of individuals with ALS is often perceived as having equal and excessive stress (Darley et al., 1969a, 1969b), which has been associated with disproportionate prolongation of unstressed versus stressed segments resulting in reduced duration contrasts between these segments (Tjaden & Turner, 2000). Such a segmental timing deficit can presumably lead to misalignment of stressed and unstressed syllable segments within the temporal template of the higher level prosodic structure, thereby reducing the phase synchronization between delta- and theta-band modulations. The finding that the jaw revealed the greatest decrease in delta–theta PSI among all articulators is likely attributable to the interplay of two articulatory/linguistic factors: (a) The stress pattern of a syllable is mainly reflected by its vowel nucleus, and (b) jaw plays a key role in vowel articulation, especially in individuals with ALS who rely more on the jaw than do healthy speakers, which has proven to be a common adaptation strategy to accommodate tongue impairment (Rong, 2019). Accordingly, the decrease in the delta–theta PSI for the jaw may be interpreted as an articulatory manifestation of abnormal syllable stress in individuals with ALS.
Pertaining to the timing constraint between syllable and onset–rime/phoneme, prior work has reported nonuniform changes in segment duration across different phoneme classes in individuals with ALS, with the duration of vowels being prolonged to a greater extent than that of consonants (Tjaden & Turner, 2000). The differential changes in vowel and consonant duration could alter the alignment of these phonemes within the temporal structure of syllables. From the articulatory perspective, such changes in phoneme–syllable temporal alignment would correspond with reduced phase synchronization between the theta and beta/gamma modulations of the primary articulators, which, in the context of alveolar stop–vowel syllables, are tongue tip and tongue body. This interpretation provides a tentative explanation of the observed decrease in theta–beta/gamma PSI for tongue tip and tongue body.

Voluntary Rate Reduction Tends to Improve Intra-Articulator Modulation in Individuals With ALS

Effect of Voluntary Rate Reduction on Intra-Articulator Modulation in Healthy Speakers
Voluntary rate reduction exerted differential impacts on intra-articulator modulation between individuals with ALS and healthy controls. For healthy controls, voluntarily slowed speech was characterized by overall decreased modulation depths within the delta and beta/gamma bands as well as decreased delta–theta PSI, compared with their habitual speech (see Figure 6). These changes resulted in descriptively similar intra-articulator modulation profiles for voluntarily slowed speech produced by healthy speakers and habitual speech produced by individuals with ALS. This finding resonated with a prior study of segmental timing, which also revealed a broad similarity between voluntarily slowed speech for healthy speakers and habitual speech for individuals with ALS (Tjaden & Turner, 2000).
Similar to the interpretation of the disease effect, the rate-elicited decreases in delta- and beta/gamma-band modulation depths can also be interpreted in the context of the AP/TD model (Saltzman & Byrd, 2000; Saltzman & Munhall, 1989; Saltzman et al., 2008). Specifically, to accommodate the rate-elicited prolongation of activation period, articulatory movements need to be stretched in



time. This possibility is supported by the findings of Rong (2020) and Rong and Heidrick (2021) reporting increased velocity rising time and reduced peak velocity of both tongue and jaw movements during voluntarily slowed speech. These changes, according to Turk and Shattuck-Hufnagel (2014), might result in a quasi steady state during the activation period of the articulator. Consequently, the modulation profile of the articulator may become less “sharp” compared with the habitual rate condition (i.e., analogous to the artificially flattened acoustic modulation envelope in the study of Ghitza, 2012), which would be reflected in the modulation power spectrum as reduced depth of modulation within the relevant frequency range. The sharpness of modulation profile has been shown to play a critical role in facilitating the entrainment between auditory oscillations and acoustic envelope at the linguistically relevant timescales in perceptual studies (Doelling et al., 2014; Ghitza, 2012). Accordingly, a parallel may be drawn in the production system to link the sharpness of articulatory modulation with the entrainment of articulatory activities to linguistic rhythms. The observed decrease in the sharpness of articulatory modulation within the delta and beta/gamma bands might thus be interpreted as reflecting reduced articulatory entrainment to the rhythms of stress and onset–rime/phoneme, in response to rate manipulation.
Interestingly, unlike the delta- and beta/gamma-band modulation depths, the theta-band modulation depths for tongue body and jaw showed a significant increase during the slow rate task. While the exact cause of this theta-band–specific finding remained inconclusive, it was suspected to be associated with a rate-elicited articulatory entrainment paradigm shift resulting in a tendency toward enhanced rhythmic constraints on syllable timing. It is well established in the linguistics literature that a stress-timed language such as English is temporally organized around the element of stress (Ramus et al., 2000). Such a stress-based temporal organization principle is implemented by exerting rhythmic constraints on stress timing to generate relatively isochronous interstress intervals (Liss et al., 2013; Ramus et al., 2000). Syllable timing, on the other hand, is less constrained in English, giving rise to a great variety of syllable types with variable durations (Ramus et al., 2000). The differential rhythmic constraints on stress and syllable timing are corroborated by the observation of greater delta-band modulation depths than theta-band modulation depths in the articulatory pattern of habitual speech, as shown in Figure 6. During voluntarily slowed speech, however, prior research has shown that speakers tend to neutralize syllable durations through disproportionate prolongation of stressed versus unstressed syllables, especially of their nuclei (Miller, 1981; Peterson & Lehiste, 1960; Tjaden & Turner, 2000), resulting in a modified temporal organization of linguistic units resembling that of a syllable-timed language, which exerts accentuated rhythmic constraints on the timing of syllables rather than stress. The articulatory underpinning of such a rate-elicited enhancement of rhythmic constraints on syllable timing may manifest as increased theta-band modulation depths, especially of the articulators involved in producing the syllable nuclei (e.g., tongue and jaw).
Furthermore, the disproportionate prolongation of stressed and unstressed syllables during voluntarily slowed speech can alter the temporal organization of these syllables within the higher level prosodic structure and thereby impair the harmonic entrainment between the periodicities of syllabic and prosodic structures. This interpretation might explain the decrease in delta–theta PSI for tongue tip, jaw, and lower lip during healthy speakers’ slow speech, as shown in Table 4 and Figure 6.

Effect of Voluntary Rate Reduction on Intra-Articulator Modulation in Individuals With ALS
Similar to healthy speakers, individuals with ALS also showed increased theta-band modulation depths for most articulators (i.e., tongue tip, tongue body, and jaw) accompanied by decreased delta-band modulation depths for some but not all articulators (i.e., tongue body and jaw) during the slow rate task, reflecting a putative shift of articulatory entrainment paradigm from being centered around stress rhythm to leaning toward syllable rhythm. In addition, two distinct rate-elicited changes, that is, increased theta–beta/gamma PSI for tongue tip and tongue body, were found in individuals with ALS, revealing a trend toward more regular alignment of phoneme segments within the temporal structure of syllables. It is important to point out that all modulation features identified here as exhibiting a significant rate-elicited increase in individuals with ALS were also among the features identified as being susceptible to the disease effect, as reflected by moderate (for theta–beta/gamma PSI) to large (for theta-band modulation depths) decreases in the ALS group versus the healthy control group (refer to Figure 5). Several other features identified as being susceptible to ALS, such as delta- and beta/gamma-band modulation depths for tongue tip, beta/gamma-band modulation depth for tongue body, and delta–theta PSI for jaw, although they did not show an improvement following rate manipulation, were either not degraded in individuals with ALS as in healthy controls. Taken together, these findings revealed an overall positive trend of improvement in temporal patterning of articulation in individuals with ALS following voluntary rate reduction, as manifested by enhanced or preserved harmonic entrainment of the articulators to the linguistic rhythm hierarchy.
It should be noted that, given the rate normalization of the articulatory data, the observed positive trend of improvement is not likely a simple by-product of the change in rate or surface duration of articulatory movement, but

presumably reflective of mechanistic changes in temporal patterning of articulation in response to rate manipulation. As pointed out earlier, a potential barrier for articulatory entrainment in individuals with ALS is the mismatch between the period of articulatory movement and its designated activation period, with the former presumably exceeding the latter owing to slowed articulatory muscle contractions (DePaul & Brooks, 1993; Langmore & Lehman, 1994). As a global timing variable, voluntary reduction of speaking rate can increase the activation periods related to the underlying linguistic events. This would have three potential impacts: (a) improving temporal alignment between the prolonged activation period and the increased articulatory movement period; (b) enabling the central nervous system to recruit more motor units within the activation period to increase the overall force generation capacity and, in turn, mitigate the undershooting problem; and (c) affording more processing time to mitigate the deficits in initiating and stopping movements. These changes can arguably facilitate the implementation of the planned articulatory movements to better achieve the targets associated with the underlying linguistic events, thereby improving the harmonic entrainment of the articulators to the linguistic rhythm hierarchy.

Considerations Concerning Interarticulator Coherence

Compared with intra-articulator modulation, interarticulator coherence was relatively unaffected by either the disease effect or rate manipulation, at least in the context of this study. A possible contributing factor for the lack of significant disease effect on interarticulator coherence was that the speech task did not exert a sufficiently high demand on interarticulator coordination. Our prior work has shown that, in a more demanding task such as oral diadochokinesis (i.e., rapid repetitions of “puh tuh kuh”), which requires the articulatory muscles to be highly coordinated to enable rapid recruitment and activation of these muscles as coordinative structures (as opposed to recruiting and activating them separately, which takes longer), individuals with ALS tend to exhibit a greater deficit in interarticulator coherence compared with the regular sentence reading task (Rong & Heidrick, 2021). The general lack of rate effect on interarticulator coherence is in keeping with early physiological studies, showing stable relative timing between articulators despite speaking rate variations (Nittrouer et al., 1988; Tuller et al., 1982). However, it contradicts the finding of Rong (2020), which has reported a significant decrease in interarticulator coherence during voluntarily slowed speech in healthy speakers. This discrepancy might be attributed to the methodological differences between the two studies. Unlike this study, the articulatory data in Rong (2020) are not rate-normalized. Thus, the rate-elicited decrease in interarticulator coherence observed in Rong (2020) may be a by-product of rate reduction rather than a mechanistic change in interarticulator coordination.

Clinical Implications: A Novel Perspective Toward Global Timing–Based Motor Speech Assessment and Rehabilitation

Using a set of algorithmically derived temporal modulation features, this study provided preliminary evidence for (a) the impact of motor speech impairment on temporal patterning of articulation and (b) the contribution of temporal patterning of articulation to the functional speech outcomes. These findings converge with the extant psychoacoustic evidence on the functional significance of temporal patterning of speech, pointing toward a time-based sensorimotor mechanism stemming from harmonic entrainment of articulatory-motor and auditory-perceptual activities to common linguistic rhythms. Such a coupling between the production and perception systems may facilitate language processing and optimize communication efficiency (Pickering & Garrod, 2004). This putative mechanism has potential assessment and management implications for neuromotor speech disorders.
From the assessment perspective, integrating the evaluation of temporal patterning of articulation into the existing paradigm of motor speech assessment may add incremental value to the diagnosis and monitoring of neuromotor speech disorders. The modulation features in this study provide good candidates for measuring and evaluating temporal patterning of articulation. Owing to the automation of analysis, these features (a) can be extracted without laborious data parsing and labeling as in traditional segmental analyses and (b) are objective and reproducible. Further analytical and clinical validations are needed to substantiate the utility of these features in (differential) assessment of neuromotor speech disorders.
From the management perspective, entrainment-based rehabilitation has a history of managing timing deficits in communication disorders such as nonfluent aphasia (Feenaughty et al., 2021; Fridriksson et al., 2012). Its application to neuromotor speech disorders is, however, less explored. The findings of this study provide novel insights into the functional role of articulatory entrainment in speech production and elucidate how articulatory entrainment can be modulated through global behavioral modifications such as rate manipulation to facilitate the implementation of motor plans and improve the quality of articulatory movement outcomes in individuals with motor speech impairments. Such an entrainment paradigm gives rise to a novel perspective toward global timing–based motor speech rehabilitation, which, with further research warranted, may have potential applications in the management of neuromotor speech disorders.
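
To make the notion of an automatically extracted modulation feature more concrete, the following is a minimal sketch of one common n:m formulation of a cross-frequency phase synchronization index. It is an illustrative assumption and not necessarily the computation used in this study; the function name `phase_sync_index`, the ratio arguments `n` and `m`, and the synthetic 2 Hz and 4 Hz phase series are all hypothetical.

```python
import cmath
import math


def phase_sync_index(phase_slow, phase_fast, n=2, m=1):
    """Return |mean(exp(i * (n * phase_slow - m * phase_fast)))|, a value in [0, 1].

    A value near 1 means the n:m phase difference stays nearly constant over
    time; a value near 0 means it drifts. (Hypothetical helper for illustration.)
    """
    if len(phase_slow) != len(phase_fast) or len(phase_slow) == 0:
        raise ValueError("phase series must be non-empty and of equal length")
    total = sum(
        cmath.exp(1j * (n * ps - m * pf))
        for ps, pf in zip(phase_slow, phase_fast)
    )
    return abs(total) / len(phase_slow)


# Synthetic phase series over 1 s (assumed values, not data from the study).
times = [k / 200 for k in range(200)]
delta_phase = [2 * math.pi * 2.0 * t for t in times]          # slow, stress-like rhythm
theta_locked = [2 * math.pi * 4.0 * t + 0.5 for t in times]   # exactly 2:1 with the slow rhythm
theta_drifting = [2 * math.pi * 5.3 * t for t in times]       # no fixed 2:1 relationship

locked_psi = phase_sync_index(delta_phase, theta_locked)
drifting_psi = phase_sync_index(delta_phase, theta_drifting)
```

In this sketch the locked pair yields an index close to 1 and the drifting pair a much lower value. With real articulatory data, the phase series would first have to be estimated from band-limited movement signals, a step that is omitted here.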



Limitations and Future Directions

Several limitations of this study must be acknowledged. Due to limited empirical research on temporal patterning of articulation in the speech production literature, the features in this study were constructed based on both a theoretical model of speech production (AP/TD) and empirical evidence from the psychoacoustic literature to measure the entrainment factors that were theoretically driven and experimentally demonstrated to contribute to speech processing and comprehension. There was a lack of a gold standard in the production literature to compare with the novel features in this study. Nonetheless, our findings demonstrated (a) moderate-to-strong correlations between these novel features and the functional speech outcomes and (b) detectable disease effects on these features in ways consistent with the pathological mechanism of speech impairment related to ALS. These findings provided preliminary empirical evidence for the utility of these novel features in profiling the patterns of articulatory entrainment and the functional significance of temporal patterning of speech from the production standpoint. Future work should investigate other clinical populations with different neuromotor pathologies to further delineate the role of entrainment in shaping different neuromotor processes underlying speech production.
As an initial exploratory effort, this study employed a relatively small sample and a single stimulus targeting the temporal deficits of specific articulators known to be involved in ALS. Therefore, caution is warranted in generalizing the findings. Given prior research reporting differential changes in various segmental-level articulatory features during the early and late stages of ALS (Rong et al., 2018), it is likely that the disease effect on temporal patterning of articulation may also vary with severity. The current sample consisted of two patients on the lower end of the severity spectrum and 11 on the medium-to-high end, resulting in a skewed sample distribution, especially pertaining to the measure of speech intelligibility. Although intelligibility scores were transformed in the statistical model to handle the skewness of the data, a larger and more evenly distributed sample spanning a wide range of speech severities is needed in follow-up work to further validate the findings of this study. Moreover, future studies should test other speech stimuli such as passage reading and storytelling, which are more representative of the natural conversational speaking style, to determine the extent to which the patterns observed in this study can be generalized to a wider linguistic context.
To mitigate the potential by-products of speaking rate variation and prevent them from confounding the interpretation of the rate-elicited changes in temporal patterning of articulation, we employed a rate normalization method through linear rescaling of duration. It should, however, be noted that the effect of rate manipulation on duration is nonlinear in nature, as manifested by differential changes in duration across segments as reported in previous studies (Berry, 2011; Tjaden & Turner, 2000). Due to the exploratory nature of this study, there was a lack of empirical evidence for how rate manipulation would affect the temporal structure of the target stimulus. The linear rescaling method thus served as a simplified but reasonable first step of temporal normalization to handle the rate-related confounder. With further research warranted, more accurate temporal normalization methods may be developed in the future to rescale the duration of each segment in a nonlinear and more realistic manner.
As a starting point, we used voluntary speaking rate reduction (a relatively simple and flexible rate control strategy) to modify temporal patterning of articulation without providing specific instructions on how the modification should be implemented (by prolonging vowels, consonants, etc.) except encouraging speakers to stretch the duration of the words. While this rate control strategy maximally preserved the naturalness of speech compared with other more rigid forms of rate control relying on external timing cues, it nevertheless gave rise to variability in the ways in which rate control was implemented. This may result in variable changes in the duration of different subcomponents and, in turn, differentially impact the global temporal structure of speech across individuals. Such putative individual variabilities should be investigated in future work, which may provide useful phenotype data for further tailoring and optimizing the effect of rate manipulation in improving temporal patterning of articulation. Moreover, to direct the attentional focus of the participants to the manipulation of speaking rate, we did not control intensity during the experimental tasks. It was thus possible that the participants might change the intensity of their speech as they reduced their speaking rate. How intensity is controlled during voluntary rate reduction and whether/how intensity modifications contribute to temporal patterning of articulation remain unclear and require future research.
While the present findings tentatively support the utility of voluntary rate reduction in improving temporal patterning of articulation in individuals with ALS, future studies should expand the scope of investigation to other rate control strategies (Yorkston et al., 1990) and populations with different neuromotor pathologies, in order to (a) determine how these rate control strategies (differentially) influence temporal patterning of articulation in different clinical populations and (b) identify the optimal rate control strategy for each population. For instance, for neuromotor disorders with internal timing deficits (e.g., cerebellar or basal ganglia disorder), externally paced rate manipulation using rhythmic auditory–visual cues may be preferable to voluntary rate manipulation. In these cases, the auditory–visual rhythms may provide an external time reference to compensate for the internal timing deficits and thereby improve temporal patterning of articulation.
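
The rate normalization through linear rescaling of duration mentioned in this section can be illustrated with a minimal sketch. It assumes that a movement trajectory is stored as a plain list of equally spaced samples and that normalization means resampling every trajectory to a common number of points by linear interpolation; the function name `rescale_linear` and the toy trajectories are hypothetical and are not taken from the study.

```python
def rescale_linear(samples, n_out):
    """Resample a 1-D trajectory to n_out equally spaced points.

    Linear interpolation stretches or compresses the whole trajectory
    uniformly, which is the simplification acknowledged in the text:
    every segment is rescaled by the same factor.
    """
    if n_out < 2 or len(samples) < 2:
        raise ValueError("need at least two input and two output samples")
    last = len(samples) - 1
    out = []
    for k in range(n_out):
        pos = k * last / (n_out - 1)      # fractional index into the input
        i = min(int(pos), last - 1)       # left neighbour, clamped at the end
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
    return out


# Toy trajectories (assumed values): the slow one is the habitual one stretched uniformly.
habitual = [0.0, 1.0, 0.0]
slow = [0.0, 0.5, 1.0, 0.5, 0.0]

habitual_norm = rescale_linear(habitual, 5)
slow_norm = rescale_linear(slow, 5)
```

After rescaling, the two toy trajectories coincide, which is the sense in which a uniform rate change is removed. A slow trajectory in which some segments are prolonged more than others would not coincide with its habitual counterpart after this step, which is the nonlinearity caveat raised above.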

Conclusions

This is the first known empirical study to systematically evaluate temporal patterning of articulation at multiple linguistically relevant, hierarchically nested timescales (corresponding to stress, syllable, and onset–rime/phoneme) in both neurologically healthy and impaired speakers. While the extant literature on neuromotor speech disorders leans heavily toward spatial and temporal features at the segmental level, the findings of this study suggest that, apart from the segmental features, global temporal organization of articulation around linguistic rhythms also plays an important role in shaping the functional speech outcomes. In individuals with ALS, temporal patterning of articulation tends to be disrupted, presumably due to impaired articulatory entrainment to the rhythms of the underlying linguistic events, contributing to degraded functional speech outcomes. Through voluntary speaking rate reduction, several deficits in temporal patterning of articulation in individuals with ALS reveal a trend of improvement, possibly by reshaping the temporal template of articulatory motor plans to better accommodate the disease-related neuromechanical constraints in the articulatory system. The findings of this study shed light on a novel global timing–based approach for profiling articulatory deficits in neuromotor speech disorders, which may have potential clinical applications in the long term. From the assessment perspective, integrating the evaluation of temporal patterning of articulation into the existing motor speech assessment may add incremental value to the (differential) diagnosis and monitoring of neuromotor speech disorders. From the management perspective, manipulating temporal patterning of articulation through behavioral modifications such as speaking rate reduction may recalibrate the neuromotor processes underlying the articulatory activities to improve the planning and implementation of articulatory movements.

Data Availability Statement

The data set generated during this study is not publicly available due to containing protected health information but will be made available in a de-identified format from the corresponding author on reasonable request.

Acknowledgments

This work was supported by the New Investigators Research Grant awarded by the American Speech-Language-Hearing Foundation (PI: Panying Rong) and the New Faculty General Research Fund awarded by The University of Kansas (PI: Panying Rong). The authors are grateful to all participants and their families for contributing their time to this study. The authors also thank Olivia Hansen for her assistance with data processing and analysis.

References

Amano-Kusumoto, A., & Hosom, J.-P. (2011). A review of research on speech intelligibility and correlations with acoustic features (Technical Report CSLU-011-001). Center for Spoken Language Understanding, Oregon Health & Science University.
American Speech-Language-Hearing Association. (2002). National Outcomes Measurement System (NOMS): Adults speech-language pathology user’s guide.
American Speech-Language-Hearing Association. (2013). National Outcomes Measurement System (NOMS): Adults speech-language pathology user’s guide.
Atsumi, T., & Miyatake, T. (1987). Morphometry of the degenerative process in the hypoglossal nerves in amyotrophic lateral sclerosis. Acta Neuropathologica, 73(1), 25–31. https://doi.org/10.1007/BF00695498
Ball, L. J., Beukelman, D. R., & Pattee, G. L. (2002). Timing of speech deterioration in people with amyotrophic lateral sclerosis. Journal of Medical Speech-Language Pathology, 10(4), 231–235.
Ball, L. J., Willis, A., Beukelman, D. R., & Pattee, G. L. (2001). A protocol for identification of early bulbar signs in amyotrophic lateral sclerosis. Journal of the Neurological Sciences, 191(1–2), 43–53. https://doi.org/10.1016/S0022-510X(01)00623-2
Berry, J. (2011). Speaking rate effects on normal aspects of articulation: Outcomes and issues. SIG 5 Perspectives on Speech Science and Orofacial Disorders, 21(1), 15–26. https://doi.org/10.1044/ssod21.1.15
Borrie, S. A., Barrett, T. S., Liss, J. M., & Berisha, V. (2020). Sync pending: Characterizing conversational entrainment in dysarthria using a multidimensional, clinically informed approach. Journal of Speech, Language, and Hearing Research, 63(1), 83–94. https://doi.org/10.1044/2019_JSLHR-19-00194
Brooks, B. R., Miller, R. G., Swash, M., & Munsat, T. L. (2000). El Escorial revisited: Revised criteria for the diagnosis of amyotrophic lateral sclerosis. Amyotrophic Lateral Sclerosis and Other Motor Neuron Disorders, 1(5), 293–299. https://doi.org/10.1080/146608200300079536
Buhusi, C. V., & Meck, W. H. (2005). What makes us tick? Functional and neural mechanisms of interval timing. Nature Reviews Neuroscience, 6(10), 755–765. https://doi.org/10.1038/nrn1764
Cha, C. H., & Patten, B. M. (1989). Amyotrophic lateral sclerosis: Abnormalities of the tongue on magnetic resonance imaging. Annals of Neurology, 25(5), 468–472. https://doi.org/10.1002/ana.410250508
Chandrasekaran, C., Trubanova, A., Stillittano, S., Caplier, A., & Ghazanfar, A. A. (2009). The natural statistics of audiovisual speech. PLOS Computational Biology, 5(7), e1000436. https://doi.org/10.1371/journal.pcbi.1000436
Cummins, F., & Port, R. (1998). Rhythmic constraints on stress timing in English. Journal of Phonetics, 26(2), 145–171. https://doi.org/10.1006/jpho.1998.0070
Darley, F. L., Aronson, A. E., & Brown, J. R. (1969a). Clusters of deviant speech dimensions in the dysarthrias. Journal of Speech and Hearing Research, 12(3), 462–496. https://doi.org/10.1044/jshr.1203.462
Darley, F. L., Aronson, A. E., & Brown, J. R. (1969b). Differential diagnostic patterns of dysarthria. Journal of Speech and



Hearing Research, 12(2), 246–269. https://doi.org/10.1044/jshr.1202.246
de Carvalho, M., Eisen, A., Krieger, C., & Swash, M. (2014). Motoneuron firing in amyotrophic lateral sclerosis (ALS). Frontiers in Human Neuroscience, 8, 719–719. https://doi.org/10.3389/fnhum.2014.00719
DePaul, R., Abbs, J. H., Caligiuri, M., Gracco, V. L., & Brooks, B. R. (1988). Hypoglossal, trigeminal, and facial motoneuron involvement in amyotrophic lateral sclerosis. Neurology, 38(2), 281–283. https://doi.org/10.1212/WNL.38.2.281
DePaul, R., & Brooks, B. R. (1993). Multiple orofacial indices in amyotrophic lateral sclerosis. Journal of Speech and Hearing Research, 36(6), 1158–1167. https://doi.org/10.1044/jshr.3606.1158
DePaul, R., Waclawik, A., Abbs, J. H., & Brooks, B. R. (1998). Histopathological characteristics in lingual muscle tissue in ALS: Perspectives on the natural history of the disease. In M. P. Cannito, D. R. Beukelman, & K. M. Yorkston (Eds.), Neuromotor speech disorders: Nature, assessment, and management (p. 69). Brookes.
Ding, N., & Simon, J. Z. (2014). Cortical entrainment to continuous speech: Functional roles and interpretations. Frontiers in Human Neuroscience, 8, 311. https://doi.org/10.3389/fnhum.2014.00311
Doelling, K. B., Arnal, L. H., Ghitza, O., & Poeppel, D. (2014). Acoustic landmarks drive delta–theta oscillations to enable speech comprehension by facilitating perceptual parsing. NeuroImage, 85, 761–768. https://doi.org/10.1016/j.neuroimage.2013.06.035
Drucker, H., Burges, C., Kaufman, L., Smola, A., & Vapnik, V. (1997). Support vector regression machines. Advances in Neural Information Processing Systems, 28(7), 779–784.
Drullman, R. (1995). Temporal envelope and fine structure cues for speech intelligibility. The Journal of the Acoustical Society of America, 97(1), 585–592. https://doi.org/10.1121/1.413112
Drullman, R., Festen, J. M., & Houtgast, T. (1996). Effect of temporal modulation reduction on spectral contrasts in speech. The Journal of the Acoustical Society of America, 99(4), 2358–2364. https://doi.org/10.1121/1.415423
Drullman, R., Festen, J. M., & Plomp, R. (1994). Effect of temporal envelope smearing on speech reception. The Journal of the Acoustical Society of America, 95(2), 1053–1064. https://doi.org/10.1121/1.408467
Duffy, J. R. (2013). Motor speech disorders: Substrates, differential diagnosis, and management (3rd ed.). Mosby.
Farina, D., & Negro, F. (2015). Common synaptic input to motor neurons, motor unit synchronization, and force control. Exercise and Sport Sciences Reviews, 43(1), 23–33. https://doi.org/10.1249/jes.0000000000000032
Feenaughty, L., Basilakos, A., Bonilha, L., & Fridriksson, J. (2021). Speech timing changes accompany speech entrainment in aphasia. Journal of Communication Disorders, 90, 106090. https://doi.org/10.1016/j.jcomdis.2021.106090
Fridriksson, J., Hubbard, H. I., Hudspeth, S. G., Holland, A. L., Bonilha, L., Fromm, D., & Rorden, C. (2012). Speech entrainment enables patients with Broca’s aphasia to produce fluent speech. Brain, 135(12), 3815–3829. https://doi.org/10.1093/brain/aws301
Ghitza, O. (2011). Linking speech perception and neurophysiology: Speech decoding guided by cascaded oscillators locked to the input rhythm. Frontiers in Psychology, 2, 130. https://doi.org/10.3389/fpsyg.2011.00130
Ghitza, O. (2012). On the role of theta-driven syllabic parsing in decoding speech: Intelligibility of speech with a manipulated modulation spectrum. Frontiers in Psychology, 3, 238. https://doi.org/10.3389/fpsyg.2012.00238
Giraud, A.-L., & Poeppel, D. (2012). Cortical oscillations and speech processing: Emerging computational principles and operations. Nature Neuroscience, 15(4), 511–517. https://doi.org/10.1038/nn.3063
Goldstein, L. (2019). The role of temporal modulation in sensorimotor interaction. Frontiers in Psychology, 10, 2608. https://doi.org/10.3389/fpsyg.2019.02608
Grahn, J. A. (2009). The role of the basal ganglia in beat perception: Neuroimaging and neuropsychological investigations. Annals of the New York Academy of Sciences, 1169(1), 35–45. https://doi.org/10.1111/j.1749-6632.2009.04553.x
Green, J. R., Yunusova, Y., Kuruvilla, M. S., Wang, J., Pattee, G. L., Synhorst, L., Zinman, L., & Berry, J. D. (2013). Bulbar and speech motor assessment in ALS: Challenges and future directions. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration, 14(7–8), 494–500. https://doi.org/10.3109/21678421.2013.817585
Guenther, F. H. (1994). A neural network model of speech acquisition and motor equivalent speech production. Biological Cybernetics, 72(1), 43–53. https://doi.org/10.1007/BF00206237
Guenther, F. H. (1995). Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production. Psychological Review, 102(3), 594–621. https://doi.org/10.1037/0033-295x.102.3.594
Guenther, F. H., Ghosh, S. S., & Tourville, J. A. (2006). Neural modeling and imaging of the cortical interactions underlying syllable production. Brain and Language, 96(3), 280–301. https://doi.org/10.1016/j.bandl.2005.06.001
Guenther, F. H., Hampson, M., & Johnson, D. (1998). A theoretical investigation of reference frames for the planning of speech movements. Psychological Review, 105(4), 611–633. https://doi.org/10.1037/0033-295x.105.4.611-633
Ivry, R. B., & Spencer, R. M. (2004). The neural representation of time. Current Opinion in Neurobiology, 14(2), 225–232. https://doi.org/10.1016/j.conb.2004.03.013
Kent, R. D., Kent, J. F., Weismer, G., Sufit, R. L., Rosenbek, J. C., Martin, R. E., & Brooks, B. R. (1990). Impairment of speech intelligibility in men with amyotrophic lateral sclerosis. Journal of Speech and Hearing Disorders, 55(4), 721–728. https://doi.org/10.1044/jshd.5504.721
Kleinow, J., Smith, A., & Ramig, L. O. (2001). Speech motor stability in IPD: Effects of rate and loudness manipulations. Journal of Speech, Language, and Hearing Research, 44(5), 1041–1051. https://doi.org/10.1044/1092-4388(2001/082)
Konoike, N., Kotozaki, Y., Miyachi, S., Miyauchi, C. M., Yomogida, Y., Akimoto, Y., Kuraoka, K., Sugiura, M., Kawashima, R., & Nakamura, K. (2012). Rhythm information represented in the fronto-parieto-cerebellar motor system. NeuroImage, 63(1), 328–338. https://doi.org/10.1016/j.neuroimage.2012.07.002
Kösem, A., & van Wassenhove, V. (2017). Distinct contributions of low- and high-frequency neural oscillations to speech comprehension. Language, Cognition and Neuroscience, 32(5), 536–544. https://doi.org/10.1080/23273798.2016.1238495
Kugler, P. N., & Turvey, M. T. (1987). Information, natural law, and the self-assembly of rhythmic movement. Erlbaum.
Kuruvilla-Dugdale, M., & Mefferd, A. (2017). Spatiotemporal movement variability in ALS: Speaking rate effects on tongue, lower lip, and jaw motor control. Journal of Communication Disorders, 67, 22–34. https://doi.org/10.1016/j.jcomdis.2017.05.002
Langmore, S. E., & Lehman, M. E. (1994). Physiologic deficits in the orofacial system underlying dysarthria in amyotrophic

lateral sclerosis. Journal of Speech and Hearing Research, 37(1), 28–37. https://doi.org/10.1044/jshr.3701.28
Lee, J., & Bell, M. (2018). Articulatory range of movement in individuals with dysarthria secondary to amyotrophic lateral sclerosis. American Journal of Speech-Language Pathology, 27(3), 996–1009. https://doi.org/10.1044/2018_ajslp-17-0064
Lenth, R. V. (2020). emmeans: Estimated marginal means, aka least-squares means. https://CRAN.R-project.org/package=emmeans
Leong, V., & Goswami, U. (2015). Acoustic-emergent phonology in the amplitude envelope of child-directed speech. PLOS ONE, 10(12), e0144411. https://doi.org/10.1371/journal.pone.0144411
Leong, V., Kalashnikova, M., Burnham, D., & Goswami, U. (2017). The temporal modulation structure of infant-directed speech. Open Mind, 1(2), 78–90. https://doi.org/10.1162/OPMI_a_00008
Liberman, M., & Prince, A. (1977). On stress and linguistic rhythm. Linguistic Inquiry, 8(2), 249–336.
Liss, J. M., LeGendre, S., & Lotto, A. J. (2010). Discriminating dysarthria type from envelope modulation spectra. Journal of Speech, Language, and Hearing Research, 53(5), 1246–1255. https://doi.org/10.1044/1092-4388(2010/09-0121)
Liss, J. M., Utianski, R., & Lansford, K. (2013). Crosslinguistic application of English-centric rhythm descriptors in motor speech disorders. Folia Phoniatrica et Logopaedica, 65(1), 3–19. https://doi.org/10.1159/000350030
Liss, J. M., White, L., Mattys, S. L., Lansford, K., Lotto, A. J., Spitzer, S. M., & Caviness, J. N. (2009). Quantifying speech rhythm abnormalities in the dysarthrias. Journal of Speech, Language, and Hearing Research, 52(5), 1334–1352. https://doi.org/10.1044/1092-4388(2009/08-0208)
Luo, H., & Poeppel, D. (2012). Cortical oscillations in auditory perception and speech: Evidence for two temporal windows in human auditory cortex. Frontiers in Psychology, 3, 170. https://doi.org/10.3389/fpsyg.2012.00170
Mai, G., Minett, J. W., & Wang, W. S.-Y. (2016). Delta, theta, beta, and gamma brain oscillations index levels of auditory sentence processing. NeuroImage, 133, 516–528. https://doi.org/10.1016/j.neuroimage.2016.02.064
Marcus, S. M. (1981). Acoustic determinants of perceptual center (P-center) location. Perception & Psychophysics, 30(3), 247–256. https://doi.org/10.3758/BF03214280
McClean, M. D. (2000). Patterns of orofacial movement velocity across variations in speech rate. Journal of Speech, Language, and Hearing Research, 43(1), 205–216. https://doi.org/10.1044/jslhr.4301.205
McClean, M. D., & Tasko, S. M. (2003). Association of orofacial muscle activity and movement during changes in speech rate and intensity. Journal of Speech, Language, and Hearing Research, 46(6), 1387–1400. https://doi.org/10.1044/1092-4388(2003/108)
McHenry, M. A. (2003). The effect of pacing strategies on the variability of speech movement sequences in dysarthria. Journal of Speech, Language, and Hearing Research, 46(3), 702–710. https://doi.org/10.1044/1092-4388(2003/055)
Mefferd, A. S., Pattee, G. L., & Green, J. R. (2014). Speaking rate effects on articulatory pattern consistency in talkers with mild ALS. Clinical Linguistics & Phonetics, 28(11), 799–811. https://doi.org/10.3109/02699206.2014.908239
Miller, J. L. (1981). Effects of speaking rate on segmental distinctions. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the study of speech (pp. 39–73). Erlbaum.
Namasivayam, A. K., & van Lieshout, P. (2008). Investigating speech motor practice and learning in people who stutter. Journal of Fluency Disorders, 33(1), 32–51. https://doi.org/10.1016/j.jfludis.2007.11.005
Nasreddine, Z. S., Phillips, N. A., Bédirian, V., Charbonneau, S., Whitehead, V., Collin, I., Cummings, J. L., & Chertkow, H. (2005). The Montréal Cognitive Assessment, MoCA: A brief screening tool for mild cognitive impairment. Journal of the American Geriatrics Society, 53(4), 695–699. https://doi.org/10.1111/j.1532-5415.2005.53221.x
Negro, F., & Farina, D. (2011). Linear transmission of cortical oscillations to the neural drive to muscles is mediated by common projections to populations of motoneurons in humans. Journal of Physiology, 589(3), 629–637. https://doi.org/10.1113/jphysiol.2010.202473
Nittrouer, S., Munhall, K., Kelso, J. A. S., Tuller, B., & Harris, K. S. (1988). Patterns of interarticulator phasing and their relation to linguistic structure. The Journal of the Acoustical Society of America, 84(5), 1653–1661. https://doi.org/10.1121/1.397180
Ohala, J. J. (1975). The temporal regulation of speech. In G. Fant & M. A. A. Tatham (Eds.), Auditory analysis and perception of speech (pp. 431–453). Academic Press.
Peterson, G. E., & Lehiste, I. (1960). Duration of syllable nuclei in English. The Journal of the Acoustical Society of America, 32(6), 693–703. https://doi.org/10.1121/1.1908183
Pickering, M. J., & Garrod, S. (2004). Toward a mechanistic psychology of dialogue. Behavioral and Brain Sciences, 27(2), 169–190. https://doi.org/10.1017/s0140525x04000056
Poeppel, D. (2003). The analysis of speech in different temporal integration windows: Cerebral lateralization as “asymmetric sampling in time.” Speech Communication, 41(1), 245–255. https://doi.org/10.1016/S0167-6393(02)00107-3
Poeppel, D., & Assaneo, M. F. (2020). Speech rhythms and their neural foundations. Nature Reviews Neuroscience, 21(6), 322–334. https://doi.org/10.1038/s41583-020-0304-4
R Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.R-project.org/
Ramus, F., Nespor, M., & Mehler, J. (2000). Correlates of linguistic rhythm in the speech signal. Cognition, 75(1), AD3–AD30. https://doi.org/10.1016/S0010-0277(00)00101-3
Revelle, W. (2020). psych: Procedures for psychological, psychometric, and personality research. Northwestern University. https://CRAN.R-project.org/package=psych
Riecke, L., Formisano, E., Sorger, B., Başkent, D., & Gaudrain, E. (2018). Neural entrainment to speech modulates speech intelligibility. Current Biology, 28(2), 161–169.e165. https://doi.org/10.1016/j.cub.2017.11.033
Roelfsema, P. R., Engel, A. K., König, P., & Singer, W. (1997). Visuomotor integration is associated with zero time-lag synchronization among cortical areas. Nature, 385(6612), 157–161. https://doi.org/10.1038/385157a0
Roenneberg, T., Daan, S., & Merrow, M. (2003). The art of entrainment. Journal of Biological Rhythms, 18(3), 183–194. https://doi.org/10.1177/0748730403018003001
Romö, N., Lee, J., & Robb, M. P. (2022). Properties of relative timing and phonetic complexity in adults with dysarthria secondary to amyotrophic lateral sclerosis. Folia Phoniatrica et Logopaedica, 74(4), 284–295. https://doi.org/10.1159/000521144
Rong, P. (2019). The effect of tongue–jaw coupling on phonetic distinctiveness of vowels in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 62(9), 3248–3264. https://doi.org/10.1044/2019_JSLHR-S-19-0058
Rong, P. (2020). Neuromotor control of speech and speechlike tasks: Implications from articulatory gestures. Perspectives of the ASHA Special Interest Groups, 5(5), 1324–1338. https://doi.org/10.1044/2020_PERSP-20-00070

Rong, P. (2021). A novel hierarchical framework for measuring the complexity and irregularity of multimodal speech signals and its application in the assessment of speech impairment in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 64(8), 2996–3014. https://doi.org/10.1044/2021_JSLHR-20-00743
Rong, P., & Green, J. R. (2019). Predicting speech intelligibility based on spatial tongue–jaw coupling in persons with amyotrophic lateral sclerosis: The impact of tongue weakness and jaw adaptation. Journal of Speech, Language, and Hearing Research, 62(8S), 3085–3103. https://doi.org/10.1044/2018_JSLHR-S-CSMC7-18-0116
Rong, P., & Heidrick, L. (2021). Spatiotemporal control of articulation during speech and speechlike tasks in amyotrophic lateral sclerosis. American Journal of Speech-Language Pathology, 30(3S), 1382–1399. https://doi.org/10.1044/2020_AJSLP-20-00136
Rong, P., & Pattee, G. L. (2021). A potential upper motor neuron measure of bulbar involvement in amyotrophic lateral sclerosis using jaw muscle coherence. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration, 22(5–6), 368–379. https://doi.org/10.1080/21678421.2021.1874993
Rong, P., Yunusova, Y., Eshghi, M., Rowe, H. P., & Green, J. R. (2020). A speech measure for early stratification of fast and slow progressors of bulbar amyotrophic lateral sclerosis: Lip movement jitter. Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration, 21(1–2), 34–41. https://doi.org/10.1080/21678421.2019.1681454
Rong, P., Yunusova, Y., & Green, J. R. (2015). Speech intelligibility decline in individuals with fast and slow rates of ALS progression. In S. Möller, H. Ney, B. Möbius, E. Nöth, & S. Steidl (Eds.), INTERSPEECH 2015 (pp. 2967–2971). ISCA.
Rong, P., Yunusova, Y., Richburg, B., & Green, J. R. (2018). Automatic extraction of abnormal lip movement features from the alternating motion rate task in amyotrophic lateral sclerosis. International Journal of Speech-Language Pathology, 20(6), 610–623. https://doi.org/10.1080/17549507.2018.1485739
Rong, P., Yunusova, Y., Wang, J., & Green, J. R. (2015). Predicting early bulbar decline in amyotrophic lateral sclerosis: A speech subsystem approach. Behavioural Neurology, 2015, Article 183027. https://doi.org/10.1155/2015/183027
Saltzman, E. (1986). Task dynamic coordination of the speech articulators: A preliminary model. Experimental Brain Research Series, 15, 129–144. https://doi.org/10.1007/978-3-642-71476-4_10
Saltzman, E., & Byrd, D. (2000). Task-dynamics of gestural timing: Phase windows and multifrequency rhythms. Human Movement Science, 19(4), 499–526. https://doi.org/10.1016/S0167-9457(00)00030-0
Saltzman, E., & Munhall, K. G. (1989). A dynamical approach to gestural patterning in speech production. Ecological Psychology, 1(4), 333–382. https://doi.org/10.1207/s15326969eco0104_2
Saltzman, E., Nam, H., Krivokapic, J., & Goldstein, L. (2008). A task-dynamic toolkit for modeling the effects of prosodic structure on articulation. In P. A. Barbosa, S. Madureira, & C. Reis (Eds.), Proceedings of the Speech Prosody 2008 Conference (pp. 175–184).
Schack, B., & Weiss, S. (2005). Quantification of phase synchronization phenomena and their importance for verbal memory processes. Biological Cybernetics, 92(4), 275–287. https://doi.org/10.1007/s00422-005-0555-1
Shellikeri, S., Green, J. R., Kulkarni, M., Rong, P., Martino, R., Zinman, L., & Yunusova, Y. (2016). Speech movement measures as markers of bulbar disease in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 59(5), 887–899. https://doi.org/10.1044/2016_JSLHR-S-15-0238
Šimko, J., & Cummins, F. (2010). Embodied task dynamics. Psychological Review, 117(4), 1229–1246. https://doi.org/10.1037/a0020490
Šimko, J., & Cummins, F. (2011). Sequencing and optimization within an embodied task dynamic model. Cognitive Science, 35(3), 527–562. https://doi.org/10.1111/j.1551-6709.2010.01159.x
Tass, P., Rosenblum, M. G., Weule, J., Kurths, J., Pikovsky, A., Volkmann, J., Schnitzler, A., & Freund, H. J. (1998). Detection of n:m phase locking from noisy data: Application to magnetoencephalography. Physical Review Letters, 81(15), 3291–3294. https://doi.org/10.1103/PhysRevLett.81.3291
Thaut, M. H. (2003). Neural basis of rhythmic timing networks in the human brain. Annals of the New York Academy of Sciences, 999(1), 364–373. https://doi.org/10.1196/annals.1284.044
Thorns, J., Wieringa, B. M., Mohammadi, B., Hammer, A., Dengler, R., & Münte, T. F. (2010). Movement initiation and inhibition are impaired in amyotrophic lateral sclerosis. Experimental Neurology, 224(2), 389–394. https://doi.org/10.1016/j.expneurol.2010.04.014
Tjaden, K. (2007). Segmental articulation in motor speech disorders. In G. Weismer (Ed.), Motor speech disorders: Essays for Ray Kent (pp. 151–186). Plural.
Tjaden, K., & Turner, G. (2000). Segmental timing in amyotrophic lateral sclerosis. Journal of Speech, Language, and Hearing Research, 43(3), 683–696. https://doi.org/10.1044/jslhr.4303.683
Tjaden, K., & Wilding, G. E. (2004). Rate and loudness manipulations in dysarthria: Acoustic and perceptual findings. Journal of Speech, Language, and Hearing Research, 47(4), 766–783. https://doi.org/10.1044/1092-4388(2004/058)
Todd, N. P., Lee, C. S., & O’Boyle, D. J. (2002). A sensorimotor theory of temporal tracking and beat induction. Psychological Research, 66(1), 26–39. https://doi.org/10.1007/s004260100071
Tononi, G., Edelman, G. M., & Sporns, O. (1998). Complexity and coherency: Integrating information in the brain. Trends in Cognitive Sciences, 2(12), 474–484. https://doi.org/10.1016/S1364-6613(98)01259-5
Tourville, J. A., & Guenther, F. H. (2011). The DIVA model: A neural theory of speech acquisition and production. Language and Cognitive Processes, 26(7), 952–981. https://doi.org/10.1080/01690960903498424
Tuller, B., Kelso, J. S., & Harris, K. S. (1982). Interarticulator phasing as an index of temporal regularity in speech. Journal of Experimental Psychology: Human Perception and Performance, 8(3), 460–472. https://doi.org/10.1037/0096-1523.8.3.460
Turk, A., & Shattuck-Hufnagel, S. (2013). What is speech rhythm? A commentary on Arvaniti and Rodriquez, Krivokapić, and Goswami and Leong. Laboratory Phonology, 4(1), 93–118. https://doi.org/10.1515/lp-2013-0005
Turk, A., & Shattuck-Hufnagel, S. (2014). Timing in talking: What is it used for, and how is it controlled? Philosophical Transactions of the Royal Society B: Biological Sciences, 369(1658), 20130395. https://doi.org/10.1098/rstb.2013.0395
Turner, G. S., Tjaden, K., & Weismer, G. (1995). The influence of speaking rate on vowel space and speech intelligibility for individuals with amyotrophic lateral sclerosis. Journal of Speech and Hearing Research, 38(5), 1001–1013. https://doi.org/10.1044/jshr.3805.1001
van Brenk, F., Kain, A., & Tjaden, K. (2021). Investigating acoustic correlates of intelligibility gains and losses during slowed speech: A hybridization approach. American Journal of Speech-Language Pathology, 30(3S), 1343–1360. https://doi.org/10.1044/2021_AJSLP-20-00172
van Lieshout, P. (2004). Dynamical systems theory and its application in speech. In B. Maassen, R. Kent, H. Peters, P. van

Lieshout, & W. Hulstijn (Eds.), Speech motor control in normal and disordered speech (pp. 51–82). Oxford University Press.
Weismer, G. (2008). Speech intelligibility. In M. J. Ball, M. R. Perkins, N. Müller, & S. Howard (Eds.), The handbook of clinical linguistics (pp. 568–582). Blackwell.
Weismer, G., Jeng, J. Y., Laures, J. S., Kent, R. D., & Kent, J. F. (2001). Acoustic and intelligibility characteristics of sentence production in neurogenic speech disorders. Folia Phoniatrica et Logopaedica, 53(1), 1–18. https://doi.org/10.1159/000052649
Weismer, G., Laures, J. S., Jeng, J. Y., Kent, R. D., & Kent, J. F. (2000). Effect of speaking rate manipulations on acoustic and perceptual aspects of the dysarthria in amyotrophic lateral sclerosis. Folia Phoniatrica et Logopaedica, 52(5), 201–219. https://doi.org/10.1159/000021536
Westbury, J. R., Lindstrom, M. J., & McClean, M. D. (2002). Tongues and lips without jaws: A comparison of methods for decoupling speech movements. Journal of Speech, Language, and Hearing Research, 45(4), 651–662. https://doi.org/10.1044/1092-4388(2002/052)
Windmann, A., Šimko, J., & Wagner, P. (2015). Optimization-based modeling of speech timing. Speech Communication, 74, 76–92. https://doi.org/10.1016/j.specom.2015.09.007
Wing, A. M., & Kristofferson, A. B. (1973). Response delays and the timing of discrete motor responses. Perception & Psychophysics, 14(1), 5–12. https://doi.org/10.3758/BF03198607
Yorkston, K. M., & Beukelman, D. R. (1981). Communication efficiency of dysarthric speakers as measured by sentence intelligibility and speaking rate. Journal of Speech and Hearing Disorders, 46(3), 296–301. https://doi.org/10.1044/jshd.4603.296
Yorkston, K. M., Beukelman, D. R., & Ball, L. J. (2002). Management of dysarthria in amyotrophic lateral sclerosis. Geriatrics & Aging, 5, 38–41.
Yorkston, K. M., Beukelman, D. R., Hakel, M., & Dorsey, M. (2007). Speech Intelligibility Test for Windows. Institute for Rehabilitation Science and Engineering at Madonna Rehabilitation Hospitals.
Yorkston, K. M., Dowden, P. A., & Beukelman, D. R. (1992). Intelligibility measurement as a tool in the clinical management of dysarthric speakers. In R. D. Kent (Ed.), Intelligibility in speech disorders: Theory, measurement and management (pp. 265–286). John Benjamins. https://doi.org/10.1075/sspcl.1.08yor
Yorkston, K. M., Hakel, M., Beukelman, D. R., & Fager, S. (2007). Evidence for effectiveness of treatment of loudness, rate, or prosody in dysarthria: A systematic review. Journal of Medical Speech-Language Pathology, 15(2), xi–xxxvi.
Yorkston, K. M., Hammen, V. L., Beukelman, D. R., & Traynor, C. D. (1990). The effect of rate control on the intelligibility and naturalness of dysarthric speech. Journal of Speech and Hearing Disorders, 55(3), 550–560. https://doi.org/10.1044/jshd.5503.550
Yorkston, K. M., Strand, E. A., Miller, R., Hillel, A., & Smith, K. (1993). Speech deterioration in amyotrophic lateral sclerosis: Implications for the timing of intervention. Journal of Medical Speech-Language Pathology, 1(1), 35–46.
Yunusova, Y., Green, J. R., Lindstrom, M. J., Ball, L. J., Pattee, G. L., & Zinman, L. (2010). Kinematics of disease progression in bulbar ALS. Journal of Communication Disorders, 43(1), 6–20. https://doi.org/10.1016/j.jcomdis.2009.07.003

Copyright of Journal of Speech, Language & Hearing Research is the property of American
Speech-Language-Hearing Association and its content may not be copied or emailed to
multiple sites or posted to a listserv without the copyright holder's express written permission.
However, users may print, download, or email articles for individual use.
