
Crenshaw Et Al (2017) - Revised Scoring For The Communication Patterns Questionnaire

The study re-examines the factor structure of the widely used Communication Patterns Questionnaire (CPQ) using data from four samples to improve its reliability. Exploratory and confirmatory factor analyses identified a three-factor solution including constructive communication and two demand/withdraw scales as providing the best fit. This revised structure was confirmed in two additional samples. The updated scales include more items and demonstrate improved reliability, stronger associations with relationship satisfaction, and increased sensitivity to changes from therapy compared to the original scales.


Revised Scoring and Improved Reliability for the Communication Patterns Questionnaire

Alexander O. Crenshaw1, Andrew Christensen2, Donald H. Baucom3, Norman B. Epstein4, Brian R.W. Baucom1
1 University of Utah
2 University of California-Los Angeles
3 University of North Carolina, Chapel Hill
4 University of Maryland, College Park

The Communication Patterns Questionnaire (CPQ; Christensen, 1987) is a widely used self-report
measure of couple communication behavior and is well-validated for assessing the
demand/withdraw interaction pattern, which is a robust predictor of poor relationship and
individual outcomes (Schrodt, Witt, & Shimkowski, 2013). However, no studies have examined
the CPQ’s factor structure using analytic techniques sufficient by modern standards, nor have any
studies replicated the factor structure using additional samples. Further, the current scoring system
uses fewer than half of the total items for its four subscales, despite the existence of unused items
that have content conceptually consistent with those subscales. These characteristics of the CPQ
have likely contributed to findings that subscale scores are often troubled by sub-optimal
psychometric properties such as low internal reliability (e.g., Christensen, Eldridge, Catta-Preta,
Lim, & Santagata, 2006). The present study uses exploratory and confirmatory factor analyses on
four samples to re-examine the factor structure of the CPQ to improve scale score reliability and
to determine if including more items in the subscales is warranted. Results indicate that a three-
factor solution (constructive communication and two demand/withdraw scales) provides the best
fit for the data. That factor structure was confirmed in the replication samples. Compared with
the original scales, the revised scales include additional items that expand the conceptual range of
the constructs, substantially improve reliability of scale scores, and demonstrate stronger
associations with relationship satisfaction and sensitivity to change in therapy. Implications for
research and treatment are discussed.

Keywords: assessment, marriage, couples, communication, Communication Patterns Questionnaire (CPQ)

Citation: Crenshaw, A.O., Christensen, A., Baucom, D.H., Epstein, N.B., & Baucom, B.R.W. (2017).
Revised scoring and improved reliability for the Communication Patterns Questionnaire. Psychological
Assessment, 29(7), 913-925.

© American Psychological Association, 2016. This paper is not the copy of record and may not exactly replicate the authoritative
document published in the APA journal. The final article is available at: https://doi.org/10.1037/pas0000385
Supplemental material available at: https://doi.org/10.1037/pas0000385.supp

Alexander O. Crenshaw, Department of Psychology, University of Utah; Andrew Christensen, Department of Psychology, University of California-Los Angeles; Norman B. Epstein, Department of Family Science, University of Maryland, College Park; Brian R.W. Baucom, Department of Psychology, University of Utah. This manuscript was supported in part by start-up funding from the University of Utah awarded to Brian Baucom. The randomized clinical trial data on which this manuscript is based was supported by grants from the National Institute of Mental Health awarded to Andrew Christensen at UCLA (MH56223) and Neil S. Jacobson at the University of Washington (MH56165). We thank Dr. Lisa Harris and Dr. James Shenk, former students of Dr. Christensen, for use of their data that comprised the Divorcing sample.

Correspondence concerning this article should be addressed to Alexander Crenshaw, Department of Psychology, University of Utah, Salt Lake City, UT 84112. E-mail: [email protected].

Effective communication between partners is widely considered to be an essential part of successful romantic
REVISED SCORING FOR THE CPQ 2

relationship functioning, and dissatisfaction with communication is the most common reason couples seek therapy (e.g., Doss, Simpson, & Christensen, 2004). Communication within romantic relationships encompasses a wide range of behaviors and behavioral patterns, and a large accumulation of evidence suggests that both negative and positive behaviors contribute to relationship satisfaction and outcomes (e.g., Karney & Bradbury, 1995). Assessment of these behavioral patterns is an integral part of couple communication research as well as of the practice of couple therapy. The ability to measure communication patterns using a self-report measure is particularly important for couple therapists, given the time and resource requirements of other methods. Unfortunately, the most widely used self-report measure of communication, the Communication Patterns Questionnaire (CPQ; Christensen, 1987), has significant psychometric limitations (e.g., Christensen et al., 2006) that prevent researchers and therapists from optimally assessing and tracking communication patterns in couples. However, these are not limitations of the CPQ itself, but rather limitations of the current scoring used to compute its subscales. They can be addressed by re-analyzing the CPQ’s factor structure and including additional, unused items, in order to improve psychometric properties and the utility of the scale in both research and clinical practice settings.

Couple Communication
Communication behavior in couples can be categorized into two main types: positive behaviors and negative behaviors (Woodin, 2011). Within these clusters, constructive communication (positive) and demand/withdraw behavior (negative) are particularly strongly associated with a wide range of relationship functioning variables (e.g., K. Baucom, B. Baucom, & Christensen, 2015; Schrodt, Witt, & Shimkowski, 2013). Constructive communication is an inclusive term for a host of positive behaviors that serve to promote a collaborative approach to problem solving and engender trust and understanding. Examples include making suggestions (in contrast to demands), compromising, perspective-taking, and expressing feelings. Constructive communication is strongly and positively associated with marital satisfaction (Heavey, Larson, Zumtobel, & Christensen, 1996; Litzinger & Gordon, 2005), is associated with more forgiveness in heterosexual couples (Fincham & Beach, 2002), and is believed to buffer the detrimental effect of poor sexual satisfaction on overall marital satisfaction (Litzinger & Gordon, 2005).

In contrast to the relationship-enhancing nature of constructive communication, demand/withdraw behavior and mutual avoidance are patterns of behavior that sustain and intensify conflict and are associated with negative affect during and following interaction between partners (e.g., McGinn, McFarland, & Christensen, 2009). Mutual avoidance describes a process in which both partners avoid the conflict altogether, for example, by becoming silent, changing the subject, or walking away from each other (Christensen & Shenk, 1991). In mutual avoidance, withdrawal by one partner is not contested by the other, as he or she is also seeking to withdraw. In contrast, demand/withdraw behavior is a dyadic pattern in which one partner nags, criticizes, complains, or otherwise attempts to initiate change, while the other partner avoids, terminates, or withdraws from the interaction (Christensen, 1987).

A large body of evidence links demand/withdraw behavior to numerous individual and relationship sequelae. Higher levels of demand/withdraw behavior are associated with greater relationship distress among both satisfied and unsatisfied couples (see Eldridge & B. Baucom, 2012), a finding that has been replicated in opposite-sex couples from numerous countries (e.g., United States, Taiwan, Brazil, Switzerland, Pakistan; see B. Baucom, McFarland, & Christensen, 2010; Christensen et al., 2006), and in same-sex couples in the United States (e.g., Kurdek, 2004). Higher levels of demand/withdraw behavior are also associated with greater likelihood of divorce (Gottman & Levenson, 2000), infidelity (Balderrama-Durbin, Allen, & Rhoades, 2012), and intimate partner violence (Holtzworth-Munroe, Smutzler, & Stuart, 1998). Demand/withdraw is also associated with a host of negative individual outcomes, including depression (Rehman, Ginting, Karimiha, & Goodnight, 2010), alcoholism (Kelly, Halford, & Young, 2002), and decreased subjective well-being (Schrodt et al., 2013).

These behavioral patterns are most commonly assessed in two ways: observational coding and self-report. Observational coding involves having trained (e.g., Heavey, Gill, & Christensen, 1998) or untrained (e.g., K. Baucom, B. Baucom, & Christensen, 2012) raters view video recordings of couples engaging in a discussion and rate the strength and/or frequency of certain behaviors. Observational coding is commonly used in laboratory-based research because it is objective. However, due to its time- and resource-consuming nature, observational coding tends to be restricted to research contexts and with small to moderate sample sizes. Large-scale survey research, internet-based research, and clinical settings are much more reliant on self-report measures to assess communication patterns.

Communication Patterns Questionnaire
A freely available and one of the most commonly used self-report measures for assessing communication patterns in romantic couples is the Communication Patterns Questionnaire (CPQ; Christensen, 1987; Schrodt
et al., 2013). Based in part on an original measure developed by Sullaway and Christensen (1983), the CPQ consists of 35 Likert-scale items that assess dyadic patterns in ways that couples typically deal with relationship problems at three time periods: when a problem arises, during discussion of the problem, and after the discussion of the problem. The items of the CPQ are most commonly used to generate four subscales: constructive communication (7 items), mutual avoidance (3 items), and two demand/withdraw scales (self-demand/partner-withdraw and partner-demand/self-withdraw; 3 items each).

The CPQ scoring has undergone several revisions since its creation. It was originally conceptualized as having three scales: mutual constructive communication, demand/withdraw behavior, and demand/withdraw roles (Christensen, 1987). Using this scoring, stronger demand/withdraw behavior has been linked with lower relationship satisfaction and greater asymmetry in level of intimacy and independence desired by partners (Christensen, 1987). The same study also found the constructive communication scale to be inversely related to demand/withdraw behavior. However, this early scoring system grouped items into scales on conceptual grounds and examined psychometric properties using only a within-couple intra-class correlation (ICC), finding moderate agreement between males and females.

Christensen and Shenk (1991) revised the CPQ on conceptual grounds to include a fourth scale: mutual avoidance. Accounting for the fact that an individual can occupy both demanding and withdrawing roles in a relationship, even if those two behaviors are mutually exclusive at any given time point, Christensen and Shenk modified the demand/withdraw scales by removing the demand/withdraw roles scale and separating demand/withdraw behavior into male-demand/female-withdraw and female-demand/male-withdraw subscales. In a sample of 62 couples, they found that all four scales of their revised CPQ distinguished distressed from non-distressed couples. Another study examining psychometric properties of the CPQ in a sample of 96 married community couples found that 29 of its 35 items were individually able to distinguish couples with respect to marital adjustment (Noller & White, 1990). In addition, Noller and White used exploratory factor analysis to examine the CPQ’s factor structure, finding four factors somewhat different from previous scoring systems: coercion, mutuality, post-conflict distress, and destructive processes. Using this scoring system, they found well-adjusted couples reported higher levels of mutuality, and poorly-adjusted couples reported higher levels of destructive process, coercion, and post-conflict distress.

Taken together, there is strong evidence for the utility of the CPQ in assessing couple communication behavior, despite the fact that a number of different scoring systems have been used. Currently, the Christensen and Shenk (1991) scoring system for the CPQ is the most commonly used, although the constructive communication scale has since been revised to include seven items, and includes items assessing both positive communication and negative communication (Heavey et al., 1996). However, this scoring was constructed on theoretical grounds without the use of factor analytic techniques, which raises concerns about psychometric properties of the subscales. Indeed, psychometric properties of the CPQ are highly inconsistent across studies. For example, Christensen et al. (2006) reported inter-item ICCs (Cronbach’s alpha) between .73 and .78 for constructive communication and female-demand/male-withdraw, but also reported an ICC of .58 for male-demand/female-withdraw among Americans, and ICCs ranging from .21 to .81 in samples from Taiwan, Brazil, and Italy. Another cross-cultural study found ICCs ranging from .44 to .80 (Bodenmann, Kaiser, Hahlweg, & Fehm-Wolfsdorf, 1998).

The lack of consistency in internal reliabilities of subscale scores for the CPQ across various samples raises concerns about the extent of its ability to validly describe communication across a range of populations. With such inconsistent reliabilities reported, one may question whether the CPQ is measuring the same constructs in different populations and among couples at different levels of functioning. In addition, although items measuring the same construct tend to produce a strong ICC, a strong ICC by itself is not sufficient for determining if items measure the same construct. Further, the one study that utilized factor analytic methods for determining the factor structure of the CPQ (Noller & White, 1990) examined only 96 married couples, did not sample across a range of couple functioning, and did not replicate their exploratory results with a priori confirmatory techniques. In addition, they only utilized the Kaiser-Guttman “Eigenvalue > 1” criterion for deciding the number of factors, a technique that is inadequate by modern standards and that tends to overestimate the number of factors (Tabachnick & Fidell, 2013).

Another problem with the current scoring of the CPQ is that it makes use of only 16 of 35 total items, even though several unused items are conceptually consistent with some of the subscales. This fact is especially problematic for the demand/withdraw scales, which contain only three items each despite the fact that the CPQ includes several additional items that assess conceptually similar behavior. Put in broader terms, demand/withdraw could be described as a behavior pattern in which one person actively approaches a problem while the other actively avoids the problem,
discounts it as a problem, or responds with passivity. Thus any item that describes an asymmetrical behavior pattern in which one partner has a negatively valenced approach orientation to the partner or problem while the other partner has an avoidant orientation toward the partner or problem may capture demand/withdraw behavior and is likely to be a good candidate for the demand/withdraw scale. For example, Item 17 (“I threaten negative consequences and my partner gives in or backs down.”) appears to be an especially destructive type of demand/withdraw behavior, but it is currently unused in any scale.

Thus there is strong reason to believe that the CPQ could be improved considerably through using items that are conceptually consistent with its subscales but not currently included in the scoring system. However, no study to date has examined the factor structure of the CPQ using methods that meet modern analytic standards, nor has any study confirmed the hypothesized scales on a replication sample using an a priori approach such as confirmatory factor analysis (CFA). The present study uses modern factor analytic techniques to reexamine the factor structure of the CPQ, determine if additional items should be included in its subscales, and examine the replicability of the factor structure on three separate samples representing a wide range of couple functioning. Specifically, we hypothesized that Exploratory Factor Analysis (EFA) conducted on a sample of treatment-seeking couples would replicate the four-factor solution used by Christensen and Shenk (1991) and modified by Heavey et al. (1996). Second, we hypothesized that the EFA would result in several currently-unused but conceptually consistent items loading strongly onto the subscales. Third, we hypothesized that using CFA, the factor structure would replicate across three additional samples representing a wide range of relationship functioning, and factor loadings for revised subscales would not significantly differ across men and women. Fourth, inclusion of additional items was hypothesized to result in improved internal reliability of subscale scores, improving power to detect associations with other important variables. Finally, we hypothesized that revised CPQ subscales would show improved construct validity by having significantly stronger associations with relationship satisfaction and by demonstrating greater sensitivity to change produced by couple therapy.

Method
Participants
The current investigation utilized four separate samples of heterosexual married couples. The first sample (Clinical Trial) consists of 134 couples that took part in a multi-site, randomized clinical trial of two behaviorally-based couple therapies (Christensen, Atkins, Berns, Wheeler, D. Baucom, & Simpson, 2004). All couples had to be legally married, living together, and meet criteria for serious and stable marital distress prior to treatment. Both partners had to be between the ages of 18 and 65, fluent in English, and have at least a high school or equivalent education (see Christensen et al., 2004 for a complete description of sample characteristics). Mean marital satisfaction in this sample, as measured by the Dyadic Adjustment Scale (DAS; Spanier, 1976), was 84.5 (SD = 15.0) for men and 84.7 (SD = 14.0) for women. DAS scores in this sample fell below the well-accepted and widely used cutoff of 97.5 for clinically significant distress, which is one standard deviation below the population mean (e.g., Christensen et al., 2004).

Sample two, the Community sample, was a subset (n = 359) of couples with complete CPQ data from a sample of 386 married couples from communities in North Carolina and the Maryland/Washington, DC area as part of a larger study. Couples were recruited to match the U.S. population on key demographic variables, including age, income, and ethnic status (see D. Baucom, Epstein, Rankin, & Burnett, 1996, for a complete description of the Community sample). Mean DAS scores in this sample were 111 (SD = 15.4) for men and 112 (SD = 14.9) for women, well above the distress cutoff of 97.5.

Sample three, the Clinic sample, was a subset (n = 60) of couples with complete CPQ data from a sample of 85 couples presenting for marital therapy to either a private practice or university psychology clinic in southern California. Couples completed a series of questionnaires including the CPQ and DAS during a pre-treatment evaluation. Average DAS scores were 96.10 (SD = 12.85) for men and 90.39 (SD = 18.17) for women, slightly below the distress cutoff.

Sample four, the Divorcing sample, was a subset (n = 52) of couples with complete CPQ data from a sample of 60 couples recruited from a conciliation court (for couples unable to reach a custody agreement) in southern California as part of a larger study (Harris, 1992). Table 1 presents descriptive statistics for demographic characteristics for all four samples.

Measures
Communication Patterns Questionnaire (CPQ). The CPQ (Christensen, 1987) is a self-report measure of communication behavior in romantic couples. It contains 35 Likert scale items assessing how couples typically deal with problems in their relationship: four items assessing behavior when a problem arises, 18 items assessing behavior during a discussion of a problem, and 13 items assessing behavior that occurs after discussion of a problem. Each item assesses partners’ perception of how
likely a certain type of behavior (e.g., both members avoid discussing the problem) occurs when faced with a relationship problem, from 1 (very unlikely) to 9 (very likely). Of the 35 items on the CPQ, 16 are currently used to form four subscales: constructive communication (7 items), self-demand/partner-withdraw (3 items), partner-demand/self-withdraw (3 items), and mutual avoidance (3 items).

Dyadic Adjustment Scale (DAS). The DAS (Spanier, 1976) is a 32-item measure of relationship satisfaction, in which higher scores indicate higher satisfaction. Scores below 97.5 indicate clinically significant relationship distress (e.g., Christensen et al., 2004).

Table 1
Sample characteristics for each of the four samples

                           Clinical Trial            Community                 Clinic                    Divorcing
                           Male         Female       Male         Female       Male         Female       Male         Female
Sample size (# couples)    134                       359                       60                        52
Race/ethnicity (%)
  Caucasian                79.1         76.1         89           89           98.8         95.3         37.3         40.0
  African American         6.7          8.2          11           11           1.2          1.2          30.5         30.0
  Latino/Latina            5.2          5.2          -            -            -            1.2          30.5         28.3
  Asian/Pacific Islander   6.0          4.5          -            -            -            2.4          1.7          1.7
  Native Amer./Alaskan     0.7          -            -            -            -            -            -            -
M (SD)
  DAS                      84.5 (15.0)  84.7 (14.0)  111 (15.4)   112 (15.0)   96.1 (12.9)  90.4 (18.2)  n/a          n/a
  Age                      43.5 (8.7)   41.6 (8.6)   44.2 (13.1)  42.2 (12.6)  38.7 (8.8)   35.3 (7.4)   37.6 (7.7)   34.5 (6.5)
  Years education          17.0 (3.2)   17.0 (3.2)   15.7 (3.3)   15.1 (2.7)   n/a          n/a          14.2 (2.4)   14.9 (2.6)
  Annual income (Med.)     $48k         $36k         $50-70k                   n/a                       $10-50k
  Marriage length (years)  10.0 (7.6)                17.5 (13.2)               7.69 (7.3)                n/a

Note. DAS = Dyadic Adjustment Scale (Spanier, 1976). Median annual income was reported at the individual level in the Clinical Trial sample and at the couple level in the Community and Divorcing samples. Education and income were not available in the Clinic sample, and DAS and marriage length were not available in the Divorcing sample.

Analyses
EFAs were conducted in SPSS 21 on the Clinical Trial sample using the common factor model with maximum likelihood estimation (Schmitt, 2011; Tabachnick & Fidell, 2013). An oblique (Promax) rotation was used to allow for correlation between factors. We determined the Clinical Trial sample to be the most appropriate for the EFA because the CPQ is most commonly used to measure communication in distressed (versus non-distressed) couples and in treatment-seeking couples. We wanted to ensure that the revised scales were most appropriate for the population for which the CPQ is used, and the Clinical Trial sample was the only sample in which all couples were clinically distressed, treatment-seeking couples. There were two central aims of the EFA. The first was to determine whether an analysis of all 35 items would produce four factors consistent with the current scoring of the CPQ, or if a different solution was more appropriate. The second aim was to maximize conceptual clarity, interpretability, and theoretical meaningfulness of the subscales by examining whether inclusion of other theoretically similar but previously unused items broadened the content domain of each scale.

To accomplish both aims, EFAs were first conducted using a data-driven, empirical approach in order to narrow the field of possible factor solutions. This approach began with examination of scree plots, separately for men and women, based on eigenvalues from an initial, unrestricted (i.e., number of factors extracted was set equal to number of items) extraction. The scree test identifies the optimal number of factors in EFA as being the number of eigenvalues above the “elbow” in the plot, which is the point at which the slope of the line decreases most sharply (Tabachnick & Fidell, 2013). Rather than relying solely on the scree test, we used it to form an initial hypothesis about the number of factors present and to determine a range of other plausible factor solutions. All plausible factor solutions were then examined, and results were compared in terms of variance explained, conceptual interpretability of factors (i.e., did the items within each factor appear to measure a single identifiable construct), and consistency of item loadings across both men and women. All items with standardized loadings above .3 on a given subscale were considered possible candidates for inclusion in that subscale.

Once a factor solution had been determined, CFAs were conducted separately for men and women on each of the three replication samples (Community, Clinic, and Divorcing). CFAs were also conducted separately for each subscale in order to examine fit for each scale, sex, and sample combination individually. Items within subscales were not expected to correlate after accounting for shared factor variance, so residual correlations were fixed to zero.
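
As an illustration only, two of the decision rules in the EFA procedure above (the scree "elbow" and the .3 loading cutoff) can be sketched in a few lines of Python. This is not the SPSS procedure used in the study. The eigenvalues and loadings below are hypothetical, the function names are invented for this sketch, and two simplifying assumptions are made: the elbow is taken as the point where the drop between successive eigenvalues shrinks the most, and an item counts as a candidate only if its absolute loading exceeds the cutoff for both men and women.

```python
def scree_factor_count(eigenvalues):
    """Return the number of eigenvalues lying above the scree 'elbow'.

    Assumption: the elbow is the point where the slope flattens most,
    i.e. where the drop between successive eigenvalues shrinks the most.
    """
    ev = sorted(eigenvalues, reverse=True)
    if len(ev) < 3:
        raise ValueError("need at least three eigenvalues")
    # Drop from each eigenvalue to the next one.
    drops = [ev[i] - ev[i + 1] for i in range(len(ev) - 1)]
    # How much each drop exceeds the following drop (the flattening).
    flattening = [drops[i] - drops[i + 1] for i in range(len(drops) - 1)]
    elbow_index = flattening.index(max(flattening)) + 1
    # Eigenvalues at positions 0 .. elbow_index - 1 lie above the elbow.
    return elbow_index


def candidate_items(loadings_men, loadings_women, cutoff=0.3):
    """Items whose absolute standardized loading exceeds the cutoff
    for both men and women (an assumption made for this sketch)."""
    return sorted(
        item
        for item in loadings_men
        if item in loadings_women
        and abs(loadings_men[item]) > cutoff
        and abs(loadings_women[item]) > cutoff
    )


# Hypothetical eigenvalues and loadings, for illustration only.
eigenvalues = [8.0, 5.5, 3.2, 1.2, 1.0, 0.9, 0.8]
men = {"item_a": 0.55, "item_b": -0.39, "item_c": 0.12}
women = {"item_a": 0.67, "item_b": -0.32, "item_c": 0.41}

n_factors = scree_factor_count(eigenvalues)
kept = candidate_items(men, women)
print(n_factors, kept)
```

With these made-up values the sketch suggests three factors and retains the two items that clear the cutoff for both sexes. In the study itself the scree result served only as a starting hypothesis, and candidate solutions were also judged on variance explained and conceptual interpretability.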

Although the sample sizes of the Clinic and Divorcing samples were smaller than is typically recommended for standard CFA (i.e., Maximum Likelihood estimation; Kline, 2015; Muthén & Asparouhov, 2012), we decided to perform CFAs separately on each sample and each subscale, rather than combining them, for several reasons. First, an important question from both a theoretical and measurement perspective is whether the CPQ can validly capture communication behavior across a wide range of couple functioning. A related but separate question is whether communication behavior assessed via the CPQ can be described by the same set of dimensions (i.e., factor structure) across levels of relationship functioning. Both of these questions should be answered in order to determine whether the CPQ can be used validly across the spectrum of relationship quality. We also chose to conduct CFAs separately for each subscale in order to be able to identify specific sources of misfit in the model if poor fit were to arise.

While standard structural equation modeling (SEM) typically calls for large sample sizes of approximately 200 or higher, Bayesian SEM (Muthén, 2010; Muthén & Asparouhov, 2012) can be used with sample sizes as small as two or three times the number of unknown parameters, especially when good priors are provided (Lee & Song, 2004). In the smallest sample used in this study (Divorcing), the ratio of sample size to number of unknown parameters was 2.9 (52 individuals divided by 18 parameters: nine loadings and nine error variances)¹ for the constructive communication scale and 3.7 (52 divided by 14) for the demand/withdraw scales. As a result, Bayesian SEM was appropriate for estimating a separate model for each of the three replication samples, despite the small Clinic and Divorcing samples.

¹ This calculation reflects the number of parameters for the final version of each scale, after poorly-fitting items were dropped in the CFA step.

Analyses were performed using the Bayes estimator in Mplus 7.31 (Muthén & Muthén, 2012), with the number of iterations set at 30,000. Rather than viewing parameters as constants, Bayesian analysis makes use of predetermined values, or priors, to estimate the parameter distribution (Muthén & Asparouhov, 2012). Priors can be diffuse (noninformative) or based on previous theory or empirical results (informative). As we used the CFA to validate the measurement model specified in the EFA, unstandardized factor loadings from the EFA were used as informative priors in the CFA model. Bayesian SEM also requires a value be set for the variance of each prior, and Muthén (2010) recommends testing several prior variance values and selecting the value with the lowest Deviance Information Criterion (DIC). A prior variance of .1 resulted in the lowest DIC for all three samples, so all prior variances were set at .1. All models were then rerun using noninformative priors to examine the model’s sensitivity to priors.

Evaluation of model fit was done via the posterior predictive p value, which is the standard fit index used for Bayesian SEM (Muthén & Asparouhov, 2012). The posterior predictive p is similar to a Chi-square test of model fit in that a “nonsignificant” p value indicates good model fit, but it does not behave in the same way as a Chi-square test in that the expected Type I error is not .05 for a fitting model (see Muthén & Asparouhov, 2012). However, the posterior predictive p does appear sensitive to sample size, though the extent of its sensitivity to sample size does not yet appear fully resolved (see Muthén & Asparouhov, 2012). Consistent with Muthén and Asparouhov (2012), we selected a posterior predictive p cutoff of .05 for the current study. As the purpose of the current study was to revise and improve an existing measure rather than test a new measurement model, subscales were not rejected based only on a posterior predictive p below the cutoff. In addition to posterior predictive p, we considered the statistical significance (p < .05) of individual item loadings, consistency of factor loadings with EFA results and across samples, and sensitivity of results to choice of priors.

Once subscales were finalized, a test of “weak” factorial invariance (equivalency of item loadings; Kline, 2015) was conducted through multiple group analysis on the Clinical Trial sample to test whether item loadings could be treated as equivalent across men and women. These analyses were performed using the Clinical Trial sample because it was not part of the CFA and it also allowed examination of equivalency of item loadings on a sample for which the CPQ is most often used. Using the maximum likelihood (ML) estimator in Mplus in order to allow for statistical comparison of nested models, models for each subscale were run in which item loadings were constrained to be equal for men and women and again without such restriction. A chi-square difference test was used to determine whether the constrained and unconstrained models were significantly different.

We also examined changes in internal reliability for each subscale score in each sample when moving from the original to revised scoring. In addition to being necessary for ensuring that items on a scale are in fact measuring a single construct, having high internal reliability is also important for maximizing statistical power in empirical studies. Given a true correlation, ρ, between a CPQ subscale and another measure of interest,
the observed correlation, r, will be reduced by the degree to which α of each measure is below 1 (Kline, 2015). Thus, by improving the internal reliability of a measure’s scores, power in any analysis using that measure is necessarily improved. The R package cocron (Diedenhofen, 2016) was used to test significant differences in the internal reliability of subscales using the revised and original scoring.

Lastly, we examined convergent validity of the revised subscales and compared them with the original subscales. First, we examined correlations between relationship satisfaction and original and revised CPQ subscales in the Clinical Trial, Community, and Clinic samples. Satisfaction data were not available in the Divorcing sample. The Fisher r-to-z transformation in which two correlations share the same variable (i.e., DAS) was used to determine whether the differences in pairs of correlations with the DAS were significant (Lee & Preacher, 2013). Next, the ability of the CPQ subscales to detect change over a course of couple therapy was examined in the Clinical Trial sample in a series of Multilevel Models (MLMs). MLMs were estimated at sample sizes ranging from the full sample (N = 134) to the size of the average published outcome study of couple therapy (n = 30) using a bootstrap resampling procedure.

Results

Exploratory Factor Analysis

All item loadings reported are standardized unless stated otherwise. Scree plots (see Figure A1 in online supplemental material) for both men and women suggested three clear factors, as indicated by a clear “elbow” in both plots at the fourth eigenvalue (Tabachnick & Fidell, 2013). As a result, we identified a three-factor solution as the most likely solution, but chose to also examine four- and five-factor solutions in order to rule out these alternative possibilities. Consequently, subsequent extractions were performed using three-, four-, and five-factor solutions, followed by a comparison of the possible solutions in terms of variance explained, conceptual clarity, and interpretability.

Variance Explained. A three-factor solution resulted in rotated eigenvalues of 3.71, 3.60, and 3.21 for men; eigenvalues for women were 4.23, 3.20, and 3.48. The three-factor solution explained 29.35% of the variance in CPQ responses for both men and women. By comparison, the rotated eigenvalues for a four-factor solution were 3.70, 3.62, 3.25, and 1.69 (men), and 4.23, 3.20, 3.35, and 2.17 (women). The four-factor solution explained 34.04% of the variance in CPQ responses for men and 33.56% for women. Finally, a five-factor solution resulted in rotated eigenvalues of 3.54, 3.47, 3.17, 2.67, and 2.12 (men), and 4.26, 3.23, 3.31, 2.04, and 1.39 (women). The five-factor solution explained 37.82% (men) and 37.21% (women) of the variance in CPQ responses. Extractions with more factors will necessarily explain more variance than extractions with fewer factors, but the addition of a fourth and fifth factor in this sample did not explain substantially more variance compared with the three-factor solution.

Conceptual Interpretability and Consistency of Loadings Across Sex. Across sex, the three-factor solution had the same conceptual interpretation and yielded very similar solutions in terms of item loadings (see Table 2 for all factor loadings). We interpreted the three factors to be: constructive communication, self-demand/partner-withdraw, and partner-demand/self-withdraw. These factors are conceptually the same as three of the previous CPQ factors, except that the three items from the original mutual avoidance scale (both avoid discussing problem, both withdraw after discussion, and neither gives in after discussion) loaded negatively on the constructive communication scale in these solutions.

Table 2
EFA standardized item loadings for three-factor solution (Clinical Trial sample)
Columns: Item; CC (M, F); SD/PW (M, F); PD/SW (M, F)
1. Both avoid discussing^a  -.389  -.318
2. Both try to discuss^a  .551  .667
6. Both express feelings^b  .359  .550
8. Both suggest compromises  .597  .714
23. Both feel understood^c  .636  .667
24. Both withdraw^c  -.466  -.500
25. Both feel resolved^c  .675  .712
26. Neither gives in^c  -.625  -.419
27. Both are especially nice^c  .620  .633
3. I start discussion/P avoids^a  -.373  .510  .560
9. I nag & demand/P withdraws^b  .717  .782
11. I criticize/P defends^b  .499  .587
13. I pressure to change/P resists^b  .606  .609
17. I threaten/P gives in^b  .697  .353
19. I call names, swear, etc.^b  -.305  .613  .519
32. I pressure to apologize/P resists^c  .464  .568
4. P starts discussion/I avoid^a  -.372  .263  .352
10. P nags & demands/I withdraw^b  -.381  .616  .570
12. P criticizes/I defend^b  .507  .594
14. P pressures to change/I resist^b  .517  .555
18. P threatens/I give in^b  .312  .600  .646
20. P calls names, swears, etc.^b  .689  .553
33. P pressures to apologize/I resist^c  .595  .537
15. I express/P offers solutions  .303  .395
16. P expresses/I offer solutions  .166  .366
28. I feel guilty/partner feels hurt^c  .335  .331  .369
29. P feels guilty/I feel hurt^c  .358
5. Both blame, accuse, criticize^b  .357  .335  .337
7. Both threaten each other^b  .400  .410
21. I push, shove, slap (etc) partner^b  .359  .033  .099  .340
22. P pushes, shoves, slaps (etc) me^b  .203  -.059  .388  .440
30. I’m nice/P distant^c
31. P nice/I’m distant^c
34. I seek support from others^c
35. P seeks support from others^c
Note. CC = constructive communication, SD/PW = self demand / partner withdraw, PD/SW = partner demand / self withdraw. Item content shortened for readability. Loadings under .3 omitted for readability unless included for conceptual reasons. ^a = “When some problem in the relationship arises”; ^b = “During discussion of a relationship problem”; ^c = “After a discussion of a relationship problem”. Bolded items were included in final scales; italicized items loaded over .3 but were ultimately excluded. Items 15, 16, 21, and 22 were removed after CFAs. M = males; F = females; P = Partner.

The four-factor solution yielded the same three conceptual factors as the three-factor solution, with an additional factor, inconsistent across sex. For men, the set of constructive communication items was split such that the fourth factor included three items previously on the constructive communication factor (both try to discuss the problem; both express feelings; both suggest solutions) in addition to two unrelated items (partner hits me; I’m nice after discussion while partner is distant). For women, the fourth factor consisted of the CPQ’s two violence items (I push, shove, slap, hit, or kick my partner; and the partner version of the same item).

The five-factor solution yielded the same three conceptual factors as the three-factor solution, with the addition of a two-item violence factor that was consistent across sex (I hit partner, partner hits me) and a fifth factor that was inconsistent across sex. For men, the fifth factor was uninterpretable, including the following items: both try to discuss the problem; both express feelings; both suggest solutions; I call partner names; and I’m nice after discussion while partner is distant. For women, the fifth factor contained only two post-discussion items: I feel guilty while my partner feels hurt, and I try to be nice while my partner is distant.

Taken together, the EFA results suggest that a three-factor solution provides an optimal description of the dimensional structure of CPQ items. The three-factor solution is highly similar for men and women, it is conceptually clear and interpretable, and the presence of a constructive communication and two demand-withdraw subscales is consistent with how the CPQ has been used in previous research. A four-factor solution yielded a fourth factor that was inconsistent between men and women and included only three and two items, respectively. A five-factor solution yielded a fourth factor (violence) that was consistent between men and women, but it contained only two items, which is below the required three items for retaining a factor (e.g., Kline, 2015). Furthermore, the fifth factor was uninterpretable for men and contained only two items for women. It is worth noting that none of the solutions that were explored yielded anything close to the original mutual avoidance subscale used in the previous scoring of the CPQ. Instead, those items loaded negatively on the constructive communication factor.

Confirmatory Factor Analysis

Tables A1-A3 (online supplemental material) present means, standard deviations, and correlations for all CPQ items in the CFA samples, and Figure A2 shows the CFA model specification. Initial analyses identified four items that loaded poorly across the replication samples and resulted in poor model fit. Therefore, items 15 and 16 (I express feelings while my partner offers reasons and solutions, and the partner version of the same item) were removed from the Constructive Communication (CC) scale, and items 21 and 22 (I push, shove, slap, hit, or kick partner, and the partner version of the same item) were removed from the self-demand/partner-withdraw scale and partner-demand/self-withdraw scale, respectively. The modified solution provided a substantially better fit for the data overall, and the removal of two items that describe the behavior of only one member of the couple from the demand/withdraw scales resulted in a conceptually clearer dyadic representation of demand/withdraw behavior.

Table 3 displays results from the CFAs, including standardized factor loadings and posterior predictive p values for each subscale/sample combination. Overall, factor loadings in all three replication samples for both males and females were significant, nearly all above or substantially above .3, and highly similar to those in the Clinical Trial sample. Of the 138 factor loadings reported, all but one loaded significantly on their respective scales (Male Item 24 in Divorcing sample was nonsignificant, p > .05). Posterior predictive p values suggest that the model in 12 of the 18 scale-sample-sex combinations has less than perfect model fit (p < .05).

CFAs were then repeated using noninformative priors in order to examine sensitivity of the measurement model to priors. For the Community and Clinic samples, the direction, magnitude, and significance of factor loadings were largely unchanged for all factor loadings across all scales. For the Divorcing sample, factor loadings were more sensitive to specification of priors. Specifically, using noninformative priors there were several instances on each scale for men, and on CC for women, in which the magnitudes of factor loadings were substantially smaller relative to when using informative priors (analyses available from the first author).

A test of “weak” factorial invariance (equivalence of unstandardized item loadings; Kline, 2015) was then conducted on the Clinical Trial sample to test equivalency of item loadings across men and women. As shown in Table A4, all factor loadings for the constrained model were significant (ps < .01), and none of the chi-square difference tests comparing the constrained with the unconstrained model were significant (Constructive Communication χ²(9) = 6.01, p = .739; SD/PW χ²(7) = 5.01, p = .659; PD/SW χ²(7) = 10.47, p = .163). Results
indicate that factor loadings for CPQ subscales are not significantly different for men and women.

Table 3
Standardized factor loadings from Bayesian CFA using empirical priors for replication samples (Community, Clinic, Divorcing)
Item                                        Community       Clinic          Divorcing
                                            M       F       M       F       M       F
Constructive Communication
1. Both avoid discussing^a                  .419    .563    .379    .319    .298    .258
2. Both try to discuss^a                    .588    .602    .617    .588    .651    .516
6. Both express feelings^b                  .505    .631    .385    .438    .464    .535
8. Both suggest solutions & compromises^b   .694    .643    .652    .652    .668    .755
23. Both feel understood^c                  .768    .796    .579    .657    .677    .673
24. Both withdraw^c                         .644    .663    .516    .728    .185#   .364
25. Both feel resolved^c                    .735    .744    .574    .701    .527    .519
26. Neither gives in^c                      .588    .611    .694    .618    .314    .324
27. Both are especially nice^c              .543    .504    .649    .392    .522    .543
Posterior predictive p                      <.001   <.001   <.001   .002    .027    .113
Self-demand / Partner-withdraw
3. I start discussion/P avoids^a            .402    .359    .543    .576    .381    .439
9. I nag & demand/P withdraw^b              .631    .682    .700    .761    .627    .629
11. I criticize/P defends^b                 .778    .831    .571    .608    .654    .630
13. I pressure to change/P resist^b         .692    .742    .619    .741    .500    .607
17. I threaten/P gives in^b                 .553    .602    .466    .479    .447    .482
19. I call names, swear, etc.^b             .649    .648    .548    .337    .476    .561
32. I pressure to apologize/P resist^c      .528    .524    .617    .643    .333    .496
Posterior predictive p                      <.001   <.001   .067    <.001   .073    .665
Partner-demand / Self-withdraw
4. P starts discussion/I avoid^a            .472    .559    .338    .531    .243    .564
10. P nags & demands/I withdraw^b           .746    .677    .589    .588    .518    .731
12. P criticizes/I defend^b                 .680    .724    .669    .593    .565    .638
14. P pressures to change/I resist^b        .671    .754    .708    .744    .350    .651
18. P threatens/I give in^b                 .589    .592    .700    .450    .679    .525
20. P calls names, swears, etc.^b           .642    .582    .658    .464    .714    .494
33. P pressure to apologize/I resist^c      .462    .571    .502    .554    .606    .491
Posterior predictive p                      <.001   <.001   .285    .008    .209    .018
Note. Item content shortened to improve readability. M = males; F = females; P = Partner. ^a = “When some problem in the relationship arises…”; ^b = “During discussion of a relationship problem…”; ^c = “After a discussion of a relationship problem…”. # = Factor loading was not statistically significant (p > .05). All other loadings were significant at p < .05.

Reliability and Power Improvement

Table 4 presents old and new reliabilities, reliability increase, proportional increase in reliability, and a chi-square test for difference in Cronbach alpha separately for men and women in all samples. Chi-square tests of differences in Cronbach alphas (Feldt, 1987) found that 18 of 24 reliabilities using the revised scoring were significantly larger than when using the original scoring. Further, six of 24 reliabilities using the original scoring are above .7, a generally-recognized cutoff for good internal reliability, whereas 22 of 24 reliabilities are above .7 using the revised scoring. In most cases, proportion increase in reliability, which translates most closely to expected power improvement, was substantial. Using the original Clinical Trial sample as an example, the largest proportion increase was for men’s Constructive Communication subscale, which increased from α = .566 to α = .801 (a 41.5% increase). The smallest proportion increase was for the women’s self-demand/partner-withdraw subscale, which increased from α = .645 to α = .770 (a 19.4% increase). Of the 24 reliabilities computed, one decreased slightly from the original to the revised scoring; the men’s self-demand/partner-withdraw subscale in the Divorcing sample changed from .634 (original) to .617 (revised), a decrease of 2.7%.

The observed reliability improvements translate into substantially improved power for detecting meaningful relationships with other variables, reducing the sample size needed to find statistical significance. To examine the extent to which the revised scoring system improves power in studies that use the CPQ, we used the original (Clinical Trial) sample to calculate the sample size needed to achieve .8 power in a two-tailed correlation analysis between the CPQ and a hypothetical other measure. Using G*Power 3.1, we examined a range of values for the reliability of the other measure scores and the true population correlation (ρ) between the CPQ scale and the other measure. We used two subscales from the Clinical Trial sample: male-reported Constructive Communication, which showed the greatest improvement in reliability, and female-reported self-demand/partner-withdraw, which showed the lowest improvement in reliability. Table 5 displays sample sizes needed to achieve .8 power across various ρ and other-scale α values. For the greatest power improvement (male-reported Constructive Communication), the sample size needed to achieve .8 power was 29.4% to 31.3% lower when using the revised scoring, compared with the original scoring. For the smallest power improvement (female-reported self-demand/partner-withdraw), the sample size needed to achieve .8 power was 16.1% to 17.9% lower when using the revised scoring. Thus, across all scales in the Clinical Trial sample, sample sizes needed to achieve .8 power when using the CPQ are reduced by 16.1% to 31.3% when using the revised, compared with original, scoring.

Associations with Relationship Satisfaction and Sensitivity to Change

We next tested convergent validity by examining correlations between relationship satisfaction and CPQ subscales using original and revised scoring in the Clinical Trial, Community, and Clinic samples (satisfaction data were not available in the Divorcing sample; see Table A5). Using the original scoring, 25 of the 36 total correlations (3 samples × 2 CPQs [male and female report] × 3 subscales × 2 DASs [male and female report]) were significant, while 28 were significant using the revised scoring. Twenty-seven of the 36 correlations were relatively stronger using the revised scoring compared with the original, while 9 were relatively weaker. Nine of the stronger correlations were significantly larger, and two
Table 4
Cronbach’s α in all four samples using original and revised CPQ scoring systems
                 Males                                             Females
Sample / scale   Old α   New α    Diff.   χ²         Prop. Incr.   Old α   New α    Diff.   χ²         Prop. Incr.
Clinical Trial
  CC             .566    .801^a   .235    22.9***    .415          .650    .808^a   .158    18.6***    .243
  SD / PW        .618    .782^a   .164    26.6***    .265          .645    .770^a   .125    17.3***    .194
  PD / SW        .541    .736^a   .195    17.4***    .360          .615    .751^a   .136    14.0***    .221
Community
  CC             .790    .845     .055    14.0***    .070          .781    .863     .082    29.3***    .105
  SD / PW        .679    .805^a   .126    65.0***    .186          .657    .813^a   .156    128.7***   .237
  PD / SW        .722    .820     .098    58.1***    .136          .699    .821^a   .122    88.6***    .175
Clinic
  CC             .757    .803     .046    1.0        .061          .720    .812     .092    3.5+       .128
  SD / PW        .647    .765^a   .118    9.2**      .182          .689    .795^a   .106    9.6**      .154
  PD / SW        .654    .804^a   .150    12.8***    .229          .532    .726^a   .194    11.4***    .365
Divorcing
  CC             .562    .666     .104    1.2        .185          .569    .720^a   .151    3.2+       .265
  SD / PW        .634    .617     -.017   0.1        -.027         .642    .802^a   .160    16.0***    .249
  PD / SW        .622    .794^a   .172    12.5***    .277          .758    .814     .056    2.7        .074
Note. CC = Constructive communication. SD / PW = Self-demand/partner-withdraw. PD / SW = Partner-demand/self-withdraw. Old α = intra-class correlation (ICC; Cronbach’s α) of original scoring. New α = ICC of revised scoring. Diff. = difference (subtracted) between α of revised scoring and original scoring. χ² = chi-square test of difference of old and new Cronbach alphas. Prop. Incr. = proportion increase in α when moving from original to revised scoring. Bolded numbers represent cases in which ICC is at or above .7; ^a = went from below .7 (original scoring) to above .7 (revised scoring).
+ p < .10, * p < .05, ** p < .01, *** p < .001.
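
As an aid to reading Table 4, the following minimal Python sketch shows the standard Cronbach’s α formula and how the “Prop. Incr.” column can be derived from the old and new α values. It is an illustration under stated assumptions, not the authors’ code: the function names and the toy data are hypothetical, and the significance tests in the table came from the R package cocron, which this sketch does not reproduce.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses the usual formula: k/(k-1) * (1 - sum(item variances) / variance(total)).
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def proportional_increase(old_alpha, new_alpha):
    """Proportion increase in alpha when moving from original to revised scoring."""
    return (new_alpha - old_alpha) / old_alpha


if __name__ == "__main__":
    # Hypothetical toy data: three respondents answering three perfectly consistent items.
    toy = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
    print(cronbach_alpha(toy))
    # Men's Constructive Communication in the Clinical Trial sample (values from Table 4).
    print(round(proportional_increase(0.566, 0.801), 3))
```

Applied to the table’s own values, the second helper gives the proportional increase reported for each row, for example (.801 − .566) / .566 for men’s Constructive Communication in the Clinical Trial sample.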

were larger at a trend level. No correlations were meaningfully larger when using the original subscales. The ability of the CPQ to detect change in communication behavior over a course of couple therapy was examined using a series of MLMs. Because different numbers of items are included in each scale using the revised and original scoring systems, scale scores were generated using the mean of all items on a given scale. Additionally, demand/withdraw subscales were recoded to female-demand/male-withdraw (FD/MW) and male-demand/female-withdraw (MD/FW) in order for the specific type of behavior reported (one person demanding and the other withdrawing) to be the same regardless of reporter. Participants in the Clinical Trial sample were assigned to one of two couple therapies, Integrative Behavioral Couple Therapy (IBCT) or Traditional Behavioral Couple Therapy (TBCT; see Christensen et al., 2004, for a description of both therapies). Previous research has found that these two treatments produce significantly different amounts of change in positive and negative behaviors measured using observational coding methods during the active treatment phase (K. Baucom et al., 2015), so interactions involving type of treatment were included to test for possible treatment differences in all models. In the first analysis, each behavior was regressed onto an effect coded variable for partner (-.5 = male, .5 = female), a dummy coded variable indicating scoring system (0 = revised, 1 = original), a dummy coded variable indicating pre-treatment (0) vs. post-therapy (1), an effect coded variable indicating type of treatment (-.5 = IBCT, .5 = TBCT), as well as the two-, three-, and four-way interactions among these variables. There were no significant interactions involving scoring system for any behavior, indicating that the change in mean levels of behaviors is not significantly different using the original and revised scoring systems.

To further compare the ability of the original and revised scoring systems to detect significant changes in each type of behavior over the course of treatment, a series of MLMs was run where each type of behavior was regressed onto the effect coded variable for partner, the dummy coded variable indicating pre-treatment vs. post-therapy, the effect coded variable indicating type of treatment, and a two-way interaction between pre/post-treatment and type of therapy. These models were run separately for each scoring system using sample sizes ranging from n = 120 to n = 30 using a bootstrapped resampling procedure that included 100 draws per model where samples were selected using a stratified (type of treatment), clustered (partner) design with replacement. A sample size of n = 30 was selected as the lower limit of analyses because it is approximately equal to the average sample size (n = 30.35) of the 40 published outcome trials of behaviorally based couple therapy reported in Shadish
Table 5
Sample sizes needed to achieve .8 power, using example of largest and smallest α improvement in Clinical Trial sample

Male-reported Constructive Communication
       Internal Reliability (Cronbach’s α) of Other Measure
       .60         .65         .70         .75         .80         .85
ρ      Old  New    Old  New    Old  New    Old  New    Old  New    Old  New
.25    367  259    339  238    314  221    293  206    275  193    258  182
.30    254  179    234  165    217  153    203  142    190  133    179  125
.35    186  131    171  120    159  111    148  104    139   97    130   91
.40    142   99    131   91    121   85    113   79    106   74     99   69
.45    111   78    103   72     95   66     89   62     83   58     78   54
.50     90   63     72   58     67   53     62   49     58   46     54   43

Female-reported self-demand/partner-withdraw
       Internal Reliability (Cronbach’s α) of Other Measure
       .60         .65         .70         .75         .80         .85
ρ      Old  New    Old  New    Old  New    Old  New    Old  New    Old  New
.25    322  269    297  248    275  230    257  215    241  201    226  189
.30    223  186    205  171    190  159    178  148    166  139    156  130
.35    163  136    150  125    139  116    130  108    121  101    114   95
.40    124  103    114   95    106   88     99   82     92   77     87   72
.45     97   81     90   75     83   69     77   64     72   60     68   56
.50     78   65     72   60     67   55     62   52     58   48     54   45
Note. ρ = true correlation between CPQ scale and other measure. Old = Sample size required for .8 power using original scoring; New = Sample size required for .8 power using revised scoring.
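
The logic behind Table 5 can be sketched in a few lines of Python. The article states that the observed correlation is reduced to the degree that each measure’s α falls below 1, and that sample sizes were computed in G*Power 3.1. The sketch below instead assumes the classical attenuation formula (observed r = ρ·√(α_CPQ·α_other)) together with the Fisher z approximation for the required sample size, so its results should be close to, but not necessarily identical to, the tabled values. The function names are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def attenuated_r(rho, alpha_cpq, alpha_other):
    """Expected observed correlation given the true correlation and both reliabilities."""
    return rho * sqrt(alpha_cpq * alpha_other)


def required_n(r, power=0.80, sig=0.05):
    """Approximate N for a two-tailed test of a correlation (Fisher z approximation).

    This is an approximation; exact routines such as G*Power can differ by a case or so.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - sig / 2)   # critical value for the two-tailed test
    z_power = z.inv_cdf(power)        # quantile corresponding to the desired power
    return ((z_crit + z_power) / atanh(r)) ** 2 + 3


if __name__ == "__main__":
    # Male-reported Constructive Communication, rho = .25, other-measure alpha = .60.
    n_old = required_n(attenuated_r(0.25, 0.566, 0.60))  # original scoring, alpha = .566
    n_new = required_n(attenuated_r(0.25, 0.801, 0.60))  # revised scoring, alpha = .801
    print(n_old, n_new, 1 - n_new / n_old)
```

For this cell the approximation lands within a case or two of the tabled 367 and 259, which suggests the attenuation formula is a reasonable reading of how the table was built.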

and Baldwin (2005), Christensen et al. (2004), and Johnson, Hunsley, Greenberg, and Schindler (1999) combined. As reported in Table A6, significant changes emerged from pre-treatment to post-therapy for constructive communication (CC), MD/FW and FD/MW using the original and revised scoring systems for all sample sizes. A significant treatment by time effect emerged for CC in sample sizes of n ≥ 40 for the revised scoring and n ≥ 50 for the original scoring. Similarly, the p-value of the pre-treatment to post-therapy effect for MD/FW was approaching the commonly accepted cut-off of p < .05 at n = 30 using the original scoring system but not using the revised scoring system (p = .012). Results for FD/MW were largely equivalent at each sample size across the two scoring systems. This collection of results suggests that both scoring systems are able to detect significant change in all three behaviors, and that the revised scoring system appears to be somewhat more sensitive to change over time at small sample sizes for CC and MD/FW.

Discussion

The present study investigated the factor structure of the CPQ in four samples of heterosexual married couples. The primary aims of this study were to re-examine the optimal number of factors in the CPQ and to examine whether there was empirical justification for including additional, conceptually similar CPQ items in the scoring of its subscales. EFAs on the Clinical Trial sample and CFAs on the replication samples found that a three-factor solution provided an optimal fit for the data. Four items initially selected in EFAs were subsequently identified as driving misfit in CFAs and were dropped from the final subscales. The final scales were: Constructive Communication (9 items: 2, 6, 8, 23, 25, 27, plus reverse-scored items 1, 24, and 26), Self-demand/Partner-withdraw (7 items: 3, 9, 11, 13, 17, 19, and 32), and Partner-demand/Self-withdraw (7 items: 4, 10, 12, 14, 18, 20, and 33).2 Additional analyses showed that the revised scoring generally had significantly higher internal reliabilities than the original, had equal or larger correlations with relationship satisfaction, and was more sensitive to change over time at small sample sizes. Taken together, this collection of results strongly suggests that the revised scoring system offers substantial improvements over the original scoring system and that the revised scoring system should be used in place of the original in future research and clinical practice. We consider the detailed results of each analysis in turn below.

Overall, results of CFAs showed that factor loadings for each subscale were significant, in the same direction, and largely consistent across samples, although some

2 Subscale values are computed by adding up all items within the subscale. Items 1, 24, and 26 on the Constructive Communication scale should be reverse scored by subtracting each raw value from 10. Those interested may contact the first author for a free copy of the CPQ.
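
The scoring rule in footnote 2 can be expressed as a short Python sketch. The item lists and the reverse-scoring rule (10 minus the raw value for items 1, 24, and 26) come from the text above; everything else is an assumption made for illustration, including the names `REVISED_SCALES`, `REVERSE_SCORED`, and `score_cpq`, the dictionary input format, and the presumption that raw ratings run from 1 to 9.

```python
# Item membership of the revised subscales, as listed in the Discussion.
REVISED_SCALES = {
    "CC": [2, 6, 8, 23, 25, 27, 1, 24, 26],
    "SD/PW": [3, 9, 11, 13, 17, 19, 32],
    "PD/SW": [4, 10, 12, 14, 18, 20, 33],
}

# Items reverse scored by subtracting the raw value from 10 (footnote 2).
REVERSE_SCORED = {1, 24, 26}


def score_cpq(responses):
    """Sum-score the three revised subscales.

    `responses` is assumed to map item number -> raw rating (presumed 1 to 9).
    """
    scores = {}
    for scale, items in REVISED_SCALES.items():
        total = 0
        for item in items:
            raw = responses[item]
            total += 10 - raw if item in REVERSE_SCORED else raw
        scores[scale] = total
    return scores


if __name__ == "__main__":
    # Hypothetical respondent who answers 5 on every item.
    example = {item: 5 for item in range(1, 36)}
    print(score_cpq(example))
```

The sketch uses sums, as the footnote describes. Dividing each sum by the number of items would give the mean-based scores that the article reports using when comparing scales of different lengths.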
subscales in the Divorcing sample were sensitive to the specification of priors. Posterior predictive p values found that the model for 12 of the 18 scale-sample-sex combinations provided a less than perfect reproduction of the data. This result is not completely surprising, as the three-factor model in the Clinical Trial sample accounted for just 29.35% of the overall item variance in the CPQ. This low proportion of variance accounted for suggests that, even though items generally loaded strongly, there is still a substantial amount of variance in the partners’ responses to the items that is unrelated to those subscales. This remaining variance may ultimately result in greater measurement error of demand/withdraw and constructive communication than is ideal, even though the revision results in a considerable improvement. Finally, tests of “weak” factorial invariance, or equivalency of loadings, for men and women on the Clinical Trial sample were nonsignificant, indicating that item loadings on each scale are equivalent across sex.

It is important to note that the purpose of this study was to improve the scoring for an existing, widely used measure, not to test a measurement model per se. Thus we were less conservative in our evaluation of the CFA model fit than one might be if developing a new scale. Conceptual considerations also weighed heavily in our evaluation of the revised subscales. The addition of four new items to the demand/withdraw scales helps to capture demand/withdraw behavior in a broader set of circumstances, assessing a fuller range of conceptually similar behaviors under the umbrella of demand/withdraw. The revised scoring may thus identify previously missed couples who simply engage in different types of demand/withdraw behavior. That is, couples may manifest the demand/withdraw pattern in different ways (e.g., pressuring instead of nagging), but the behaviors may serve the same function and may have developed from the same cycle of polarization hypothesized to contribute to the development, perpetuation, and worsening of this destructive behavior pattern (B. Baucom & Atkins, 2013).

The additional items may also better distinguish between couples at the higher end of the spectrum of demand/withdraw behaviors. The original demand/withdraw items (start/avoid discussion, nag/withdraw, criticize/defend) are relatively mild compared with some of the added items (pressure for action/resist, pressure to apologize/resist, threaten/give in, call names or attack character). Demand/withdraw behavior is thought to emerge over time, with behaviors becoming more extreme through a cycle of intermittent reinforcement (e.g., B. Baucom & Atkins, 2013). As such, there may be a rough sequence or hierarchy of behaviors in which couples early in the polarization process attempt to coerce their partner by nagging (one of the original items), for example, but move on to the more destructive behavior of threatening (one of the additional items) later in the polarization process. The revised demand/withdraw subscales may thus better distinguish levels of dysfunction among couples in the extreme range of demand/withdraw behavior, compared with the original scoring. However, the present study does not test whether individual items provide different information value at various points on the spectrum of couple functioning; such a question is better addressed by Item Response Theory (IRT; e.g., Embretson & Reise, 2013), which would be a valuable direction for future research.

Psychometric properties of the CPQ using the revised scales were substantially improved in all four samples, compared with the original scales. ICCs using the revised scoring were significantly larger than when using the original scoring for 18 out of 24 subscale-sample-sex combinations. Such improvements in reliability result in substantially improved power to detect relationships with other variables, improving the CPQ’s utility in empirical studies. In addition, one of the main concerns with the CPQ as it was previously used was the substantial variability in internal reliability across samples (e.g., Christensen et al., 2006). The revised scoring results in much greater consistency in internal reliability of the subscale scores across the four samples examined. This improved consistency suggests that the revised scoring is a good fit across the range of couple functioning, providing strong justification for its use in samples ranging from satisfied to severely distressed couples, and with couples who present for treatment in clinical trials, private practice, community clinics, or other non-University settings.

We also examined convergent validity of the subscales by comparing correlations of original and revised scales with relationship satisfaction, and by examining sensitivity to change from treatment. Twenty-seven of 36 correlations with relationship satisfaction were relatively larger when using the revised scoring compared with the original, though just nine of those differences were statistically significant. Both the revised and original scoring systems were able to detect significant changes in behavior over the course of treatment, and the magnitude of these changes in behavior was not significantly different across scoring systems. The revised scoring appears to be more sensitive to change at small sample sizes, but not at moderate to large sample sizes.

In sum, results of the current study demonstrate that, while the original scoring for the CPQ is adequate, the revised scoring represents a substantial improvement overall, and we recommend its use in place of the original scoring in future research and clinical applications. Existing data can also be reanalyzed using the revised scales, as the items themselves remain unchanged. EFAs
REVISED SCORING FOR THE CPQ 13

found three factors in the CPQ that were consistent for men and women and added items to each
subscale. The factor solution was largely confirmed using CFAs on three diverse samples.
Additionally, 18 of 24 internal reliabilities were significantly larger when using the revised
scoring compared with the original, which translates into substantially improved power to detect
relationships with other variables when using the revised scoring. Lastly, the revised subscales
demonstrate improved construct validity overall, showing stronger associations with relationship
satisfaction and being better able to predict change in therapy.

Limitations
There are several important limitations to the current study. First, we examined CPQ data only
from heterosexual married couples. At least one study using observational coding has found that
same-sex couples engage in demand/withdraw behavior in ways highly similar to heterosexual couples
(B. Baucom et al., 2010), and other studies have confirmed the utility of the CPQ in same-sex
couples (e.g., Kurdek, 2004). However, we did not test the revised CPQ scoring with same-sex
couples because of the unavailability of such data. Similarly, we examined only couples living
within the United States. Finally, factor loadings in the Divorcing sample were substantially more
sensitive to priors than were those in the Community or Clinic samples. It is difficult to know
whether this increased sensitivity was related to a restricted range of behavior present in
divorcing couples, the smaller sample size of the Divorcing sample, or some combination of the
two. Despite this uncertainty, parameter estimates obtained for the Divorcing sample using
empirical priors were similar to those obtained for the other samples and demonstrate the
acceptability of the revised scoring method for use in Divorcing samples.

Conclusions and Future Directions
The findings of the current study improve the utility of the CPQ for both research and practice
settings. The improved reliability of the revised scale scores directly translates into improved
power for detecting meaningful relationships with other variables in empirical research. In
applied settings, the revised scales allow for more accurate assessment of communication behavior
in order to better inform treatment plans and to better assess treatment progress in couple
therapy. Additionally, the current study found that the three-factor conceptualization of the CPQ
can validly describe communication behavior across a range of couple functioning, from
well-functioning couples in the community to couples in the process of getting divorced, and the
loadings of items on subscales were found to be equivalent for men and women. Future research
should examine the revised scales in same-sex couples and couples outside of the United States.
IRT analysis on a large sample may also be fruitful for improving measurement of couple
communication behavior by testing whether the items added to the demand/withdraw scales can better
measure more extreme demand/withdraw behavior. These future directions could contribute to
continued refinement of measurement of couple communication behavior and address some of the
remaining issues with the CPQ. However, the CPQ has proven to be a highly useful self-report
measure both in research and applied settings, and the current study both further confirms its
utility across the range of couple functioning and enhances its utility in all examined contexts
through improved reliability and conceptual clarity.
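
As a rough illustration of why improved reliability translates into improved power, the sketch
below applies Spearman's classical attenuation formula, under which the expected observed
correlation equals the true correlation multiplied by the square root of the product of the two
scales' reliabilities. It then approximates the sample size needed to detect that correlation
using the Fisher z transformation. The reliability values, the true correlation, and the function
names are hypothetical and chosen for illustration only; they are not estimates from the present
samples.

```python
import math
from statistics import NormalDist


def attenuated_r(true_r, rel_x, rel_y):
    """Expected observed correlation under Spearman's attenuation formula."""
    return true_r * math.sqrt(rel_x * rel_y)


def n_for_power(r, alpha=0.05, power=0.80):
    """Approximate N needed to detect correlation r (two-tailed test),
    using the Fisher z approximation: N = ((z_a + z_b) / atanh(r))^2 + 3."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(r)) ** 2 + 3)


# Hypothetical values for illustration; not taken from the study.
true_r = 0.40            # assumed true association with satisfaction
rel_satisfaction = 0.90  # assumed reliability of the satisfaction measure

for label, rel_cpq in [("original scoring", 0.60), ("revised scoring", 0.80)]:
    r_obs = attenuated_r(true_r, rel_cpq, rel_satisfaction)
    print(f"{label}: expected observed r = {r_obs:.3f}, "
          f"approximate N for 80% power = {n_for_power(r_obs)}")
```

Under these assumed values, the more reliable scale yields a larger expected observed correlation
and therefore requires fewer couples to reach the same power, which is the general mechanism
referred to above.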

References

Balderrama-Durbin, C. M., Allen, E. S., & Rhoades, G. K. (2012). Demand and withdraw behaviors in couples with a
history of infidelity. Journal of Family Psychology, 26, 11-17. doi:10.1037/a0026756.
Baucom, B. R., & Atkins, D. C. (2013). Understanding marital distress: Polarization processes. In M. A. Fine &
F. D. Fincham (Eds.), Handbook of family theories: A content-based approach (pp. 145-166). New York, NY: Routledge.
Baucom, B. R., McFarland, P. T., & Christensen, A. (2010). Gender, topic, and time in observed demand–withdraw
interaction in cross- and same-sex couples. Journal of Family Psychology, 24, 233-242. doi:10.1037/a0019717.
Baucom, D. H., Epstein, N., Rankin, L. A., & Burnett, C. K. (1996). Assessing relationship standards: The Inventory
of Specific Relationship Standards. Journal of Family Psychology, 10, 72-88. doi:10.1037/0893-3200.10.1.72.
Baucom, K. J., Baucom, B. R., & Christensen, A. (2012). Do the naïve know best? The predictive power of naïve
ratings of couple interactions. Psychological Assessment, 24, 983-994. doi:10.1037/a0028680.
Baucom, K. J., Baucom, B. R., & Christensen, A. (2015). Changes in dyadic communication during and after
integrative and traditional behavioral couple therapy. Behaviour Research and Therapy, 65, 18-28.
doi:10.1016/j.brat.2014.12.004.
Bodenmann, G., Kaiser, A., Hahlweg, K., & Fehm‐Wolfsdorf, G. (1998). Communication patterns during marital
conflict: A cross‐cultural replication. Personal Relationships, 5, 343-356. doi:10.1111/j.1475-6811.1998.tb00176.x.
Christensen, A. (1987). Detection of conflict patterns in couples. In K. Hahlweg & M.J. Goldstein (Eds.).
Understanding major mental disorder: The contribution of family interaction research (pp. 250-265). New York, NY, US:
Family Process Press.
Christensen, A., Atkins, D. C., Berns, S., Wheeler, J., Baucom, D. H., & Simpson, L. E. (2004). Traditional versus
integrative behavioral couple therapy for significantly and chronically distressed married couples. Journal of
Consulting and Clinical Psychology, 72, 176-191. doi:10.1037/0022-006X.72.2.176.
Christensen, A., Eldridge, K., Catta‐Preta, A. B., Lim, V. R., & Santagata, R. (2006). Cross‐cultural consistency of the
demand/withdraw interaction pattern in couples. Journal of Marriage and Family, 68, 1029-1044.
doi:10.1111/j.1741-3737.2006.00311.x.
Christensen, A., & Shenk, J. L. (1991). Communication, conflict, and psychological distance in nondistressed, clinic,
and divorcing couples. Journal of Consulting and Clinical Psychology, 59, 458-463. doi:10.1037/0022-006X.59.3.458.
Diedenhofen, B. (2016). cocron: Statistical comparisons of two or more alpha coefficients. R package version 1.0-1.
Doss, B. D., Simpson, L. E., & Christensen, A. (2004). Why do couples seek marital therapy? Professional Psychology:
Research and Practice, 35, 608-614. doi:10.1037/0735-7028.35.6.608.
Eldridge, K. A., & Baucom, B. (2011). Demand-withdraw communication in couples. In P. Noller & G.C. Karantzas
(Eds.). The Wiley-Blackwell handbook of couples and family relationships (pp. 144-158). West Sussex, UK: Wiley
Blackwell.
Eldridge, K. A., & Christensen, A. (2002). Demand-withdraw communication during couple conflict: A review and
analysis. In P. Noller & J.A. Feeney (Eds.). Understanding marriage: Developments in the study of couple interaction (pp.
289-322). Cambridge University Press.
Embretson, S. E., & Reise, S. P. (2013). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates,
Inc.
Feldt, L. S., Woodruff, D. J., & Salih, F. A. (1987). Statistical inference for coefficient alpha. Applied Psychological
Measurement, 11, 93-103. doi:10.1177/014662168701100107.
Fincham, F. D., & Beach, S. R. (2002). Forgiveness in marriage: Implications for psychological aggression and
constructive communication. Personal Relationships, 9, 239-251. doi:10.1111/1475-6811.00016.
Gottman, J. M. (1994). What predicts divorce: The relationship between marital processes and marital outcomes. Hillsdale, NJ:
Erlbaum.
Gottman, J. M., & Levenson, R. W. (2000). The timing of divorce: Predicting when a couple will divorce over a 14-year
period. Journal of Marriage and Family, 62, 737-745. doi:10.1111/j.1741-3737.2000.00737.x.
Harris, L. E. (1992). Marital conflict and divorce: A cross-cultural study of conciliation court participants (Unpublished doctoral
dissertation). University of California Los Angeles, Los Angeles, CA.
Heavey, C., Gill, D. S., & Christensen, A. (1998). The Couple Interaction Rating System (Unpublished document).
University of California, Los Angeles.
Heavey, C. L., Larson, B. M., Zumtobel, D. C., & Christensen, A. (1996). The Communication Patterns
Questionnaire: The reliability and validity of a constructive communication subscale. Journal of Marriage and the
Family, 796-800. doi:10.2307/353737.
Holtzworth-Munroe, A., Smutzler, N., & Stuart, G. L. (1998). Demand and withdraw communication among couples
experiencing husband violence. Journal of Consulting and Clinical Psychology, 66, 731-743. doi:10.1037/0022-006X.66.5.731.
Johnson, S. M., Hunsley, J., Greenberg, L., & Schindler, D. (1999). Emotionally focused couples therapy: Status and
challenges. Clinical Psychology: Science and Practice, 6, 67-79. doi:10.1093/clipsy.6.1.67.
Karney, B. R., & Bradbury, T. N. (1995). The longitudinal course of marital quality and stability: A review of theory,
method, and research. Psychological Bulletin, 118, 3–34. doi:10.1037/0033-2909.118.1.3.
Kelly, A. B., Halford, W. K., & Young, R. M. (2002). Couple communication and female problem drinking: A
behavioral observation study. Psychology of Addictive Behaviors, 16, 269-271. doi:10.1037/0893-164X.16.3.269.
Kline, R. B. (2015). Principles and practice of structural equation modeling (4th ed.). New York, NY: Guilford Press.
Kurdek, L. A. (2004). Are gay and lesbian cohabiting couples really different from heterosexual married couples?
Journal of Marriage and the Family, 66, 880-900. doi:10.1111/j.0022-2445.2004.00060.x.
Lee, I. A., & Preacher, K. J. (2013, September). Calculation for the test of the difference between two dependent
correlations with one variable in common [Computer software]. Available from http://quantpsy.org.
Lee, S. Y., & Song, X. Y. (2004). Evaluation of the Bayesian and maximum likelihood approaches in analyzing
structural equation models with small sample sizes. Multivariate Behavioral Research, 39, 653-686.
doi:10.1207/s15327906mbr3904_4.
Litzinger, S., & Gordon, K. C. (2005). Exploring relationships among communication, sexual satisfaction, and marital
satisfaction. Journal of Sex & Marital Therapy, 31, 409-424. doi:10.1080/00926230591006719.
Margolin, G., & Wampold, B. E. (1981). Sequential analysis of conflict and accord in distressed and nondistressed
marital partners. Journal of Consulting and Clinical Psychology, 49, 554-567. doi:10.1037/0022-006X.49.4.554.
McGinn, M. M., McFarland, P. T., & Christensen, A. (2009). Antecedents and consequences of demand/withdraw.
Journal of Family Psychology, 23, 749-757. doi:10.1037/a0016185.
Muthén, B. (2010). Bayesian analysis in Mplus: A brief introduction. Unpublished manuscript.
www.statmodel.com/download/IntroBayesVersion, 203.
Muthén, B., & Asparouhov, T. (2012). Bayesian structural equation modeling: A more flexible representation of
substantive theory. Psychological Methods, 17, 313-335. doi:10.1037/a0026802.
Muthén, L. K., & Muthén, B. O. (1998-2012). Mplus user’s guide (6th ed.). Los Angeles, CA: Muthén & Muthén.
Noller, P., & White, A. (1990). The validity of the Communication Patterns Questionnaire. Psychological Assessment: A
Journal of Consulting and Clinical Psychology, 2, 478-482. doi:10.1037/1040-3590.2.4.478.
Rehman, U. S., Ginting, J., Karimiha, G., & Goodnight, J. A. (2010). Revisiting the relationship between depressive
symptoms and marital communication using an experimental paradigm: The moderating effect of acute sad
mood. Behaviour Research and Therapy, 48, 97-105. doi:10.1016/j.brat.2009.09.013.
Schmitt, T. A. (2011). Current methodological considerations in exploratory and confirmatory factor analysis. Journal of
Psychoeducational Assessment, 29, 304-321. doi:10.1177/0734282911406653.
Schrodt, P., Witt, P. L., & Shimkowski, J. R. (2014). A meta-analytical review of the demand/withdraw pattern of
interaction and its associations with individual, relational, and communicative outcomes. Communication
Monographs, 81, 28-58. doi:10.1080/03637751.2013.813632.
Shadish, W. R., & Baldwin, S. A. (2005). Effects of behavioral marital therapy: A meta-analysis of randomized
controlled trials. Journal of Consulting and Clinical Psychology, 73, 6-14. doi:10.1037/0022-006X.73.1.6.
Spanier, G. B. (1976). Measuring dyadic adjustment: New scales for assessing the quality of marriage and similar
dyads. Journal of Marriage and the Family, 38, 15-28. doi:10.2307/350547.
Sullaway, M. & Christensen, A. (1983). Assessment of dysfunctional interaction patterns in couples. Journal of Marriage
and the Family, 45, 653-660. doi:10.2307/351670.
Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics (6th ed.). Boston, MA: Pearson.
Woodin, E. M. (2011). A two-dimensional approach to relationship conflict: Meta-analytic findings. Journal of Family
Psychology, 25, 325-335. doi:10.1037/a0023791.
