Applied Regression Analysis and Other Multivariable Methods, 5th Edition


Contents

4 Introduction to Regression Analysis

4.1 Preview
4.2 Association versus Causality
4.3 Statistical versus Deterministic Models
4.4 Concluding Remarks
References

5 Straight-line Regression Analysis

5.1 Preview
5.2 Regression with a Single Independent Variable
5.3 Mathematical Properties of a Straight Line
5.4 Statistical Assumptions for a Straight-line Model
5.5 Determining the Best-fitting Straight Line
5.6 Measure of the Quality of the Straight-line Fit and Estimate of $\sigma^2$
5.7 Inferences about the Slope and Intercept
5.8 Interpretations of Tests for Slope and Intercept
5.9 The Mean Value of Y at a Specified Value of X
5.10 Prediction of a New Value of Y at $X_0$
5.11 Assessing the Appropriateness of the Straight-line Model
5.12 Example: BRFSS Analysis
Problems
References

6 The Correlation Coefficient and Straight-line Regression Analysis

6.1 Definition of r
6.2 r as a Measure of Association
6.3 The Bivariate Normal Distribution
6.4 $r^2$ and the Strength of the Straight-line Relationship
6.5 What $r^2$ Does Not Measure
6.6 Tests of Hypotheses and Confidence Intervals for the Correlation Coefficient
6.7 Testing for the Equality of Two Correlations
6.8 Example: BRFSS Analysis
6.9 How Large Should $r^2$ Be in Practice?
Problems
References

Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

7 The Analysis-of-Variance Table

7.1 Preview
7.2 The ANOVA Table for Straight-line Regression
Problems

8 Multiple Regression Analysis: General Considerations

8.1 Preview
8.2 Multiple Regression Models
8.3 Graphical Look at the Problem
8.4 Assumptions of Multiple Regression
8.5 Determining the Best Estimate of the Multiple Regression Equation
8.6 The ANOVA Table for Multiple Regression
8.7 Example: BRFSS Analysis
8.8 Numerical Examples
Problems
References

9 Statistical Inference in Multiple Regression

9.1 Preview
9.2 Test for Significant Overall Regression
9.3 Partial F Test
9.4 Multiple Partial F Test
9.5 Strategies for Using Partial F Tests
9.6 Additional Inference Methods for Multiple Regression
9.7 Example: BRFSS Analysis
Problems
References

10 Correlations: Multiple, Partial, and Multiple Partial

10.1 Preview
10.2 Correlation Matrix
10.3 Multiple Correlation Coefficient
10.4 Relationship of $R_{Y \mid X_1, X_2, \ldots, X_K}$ to the Multivariate Normal Distribution
10.5 Partial Correlation Coefficient
10.6 Alternative Representation of the Regression Model
10.7 Multiple Partial Correlation
10.8 Concluding Remarks
Problems
References

11 Confounding and Interaction in Regression

11.1 Preview
11.2 Overview
11.3 Interaction in Regression
11.4 Confounding in Regression
11.5 Summary and Conclusions
Problems
References

12 Dummy Variables in Regression

12.1 Preview
12.2 Definitions
12.3 Rule for Defining Dummy Variables
12.4 Comparing Two Straight-line Regression Equations: An Example
12.5 Questions for Comparing Two Straight Lines
12.6 Methods of Comparing Two Straight Lines
12.7 Method I: Using Separate Regression Fits to Compare Two Straight Lines
12.8 Method II: Using a Single Regression Equation to Compare Two Straight Lines
12.9 Comparison of Methods I and II
12.10 Testing Strategies and Interpretation: Comparing Two Straight Lines
12.11 Other Dummy Variable Models
12.12 Comparing Four Regression Equations
12.13 Comparing Several Regression Equations Involving Two Nominal Variables
Problems
References

13 Analysis of Covariance and Other Methods for Adjusting Continuous Data

13.1 Preview
13.2 Adjustment Problem
13.3 Analysis of Covariance
13.4 Assumption of Parallelism: A Potential Drawback
13.5 Analysis of Covariance: Several Groups and Several Covariates
13.6 Analysis of Covariance: Several Nominal Independent Variables
13.7 Comments and Cautions
13.8 Summary
Problems
References

14 Regression Diagnostics

14.1 Preview
14.2 Simple Approaches to Diagnosing Problems in Data
14.3 Residual Analysis: Detecting Outliers and Violations of Model Assumptions
14.4 Strategies for Addressing Violations of Regression Assumptions
14.5 Collinearity
14.6 Diagnostics Example
Problems
References

15 Polynomial Regression

15.1 Preview
15.2 Polynomial Models
15.3 Least-squares Procedure for Fitting a Parabola
15.4 ANOVA Table for Second-order Polynomial Regression
15.5 Inferences Associated with Second-order Polynomial Regression
15.6 Example Requiring a Second-order Model
15.7 Fitting and Testing Higher-order Models
15.8 Lack-of-fit Tests
15.9 Orthogonal Polynomials
15.10 Strategies for Choosing a Polynomial Model
Problems

16 Selecting the Best Regression Equation

16.1 Preview
16.2 Steps in Selecting the Best Regression Equation: Prediction Goal
16.3 Step 1: Specifying the Maximum Model: Prediction Goal
16.4 Step 2: Specifying a Criterion for Selecting a Model: Prediction Goal
16.5 Step 3: Specifying a Strategy for Selecting Variables: Prediction Goal
16.6 Step 4: Conducting the Analysis: Prediction Goal
16.7 Step 5: Evaluating Reliability with Split Samples: Prediction Goal
16.8 Example Analysis of Actual Data
16.9 Selecting the Most Valid Model
Problems
References

17 One-way Analysis of Variance

17.1 Preview
17.2 One-way ANOVA: The Problem, Assumptions, and Data Configuration
17.3 Methodology for One-way Fixed-effects ANOVA
17.4 Regression Model for Fixed-effects One-way ANOVA
17.5 Fixed-effects Model for One-way ANOVA
17.6 Random-effects Model for One-way ANOVA
17.7 Multiple-comparison Procedures for Fixed-effects One-way ANOVA
17.8 Choosing a Multiple-comparison Technique
17.9 Orthogonal Contrasts and Partitioning an ANOVA Sum of Squares
Problems
References

18 Randomized Blocks: Special Case of Two-way ANOVA

18.1 Preview
18.2 Equivalent Analysis of a Matched-pairs Experiment
18.3 Principle of Blocking
18.4 Analysis of a Randomized-blocks Study
18.5 ANOVA Table for a Randomized-blocks Study
18.6 Regression Models for a Randomized-blocks Study
18.7 Fixed-effects ANOVA Model for a Randomized-blocks Study
Problems
References

19 Two-way ANOVA with Equal Cell Numbers

19.1 Preview
19.2 Using a Table of Cell Means
19.3 General Methodology
19.4 F Tests for Two-way ANOVA
19.5 Regression Model for Fixed-effects Two-way ANOVA
19.6 Interactions in Two-way ANOVA
19.7 Random- and Mixed-effects Two-way ANOVA Models
Problems
References

20 Two-way ANOVA with Unequal Cell Numbers

20.1 Preview
20.2 Presentation of Data for Two-way ANOVA: Unequal Cell Numbers
20.3 Problem with Unequal Cell Numbers: Nonorthogonality
20.4 Regression Approach for Unequal Cell Sample Sizes
20.5 Higher-way ANOVA
Problems
References

21 The Method of Maximum Likelihood

21.1 Preview
21.2 The Principle of Maximum Likelihood
21.3 Statistical Inference Using Maximum Likelihood
21.4 Summary
Problems
References

22 Logistic Regression Analysis

22.1 Preview
22.2 The Logistic Model
22.3 Estimating the Odds Ratio Using Logistic Regression
22.4 A Numerical Example of Logistic Regression
22.5 Theoretical Considerations
22.6 An Example of Conditional ML Estimation Involving Pair-matched Data with Unmatched Covariates
22.7 Summary
Problems
References

23 Polytomous and Ordinal Logistic Regression

23.1 Preview
23.2 Why Not Use Binary Regression?
23.3 An Example of Polytomous Logistic Regression: One Predictor, Three Outcome Categories
23.4 An Example: Extending the Polytomous Logistic Model to Several Predictors
23.5 Ordinal Logistic Regression: Overview
23.6 A “Simple” Example: Three Ordinal Categories and One Dichotomous Exposure Variable
23.7 Ordinal Logistic Regression Example Using Real Data with Four Ordinal Categories and Three Predictor Variables
23.8 Summary
Problems
References

24 Poisson Regression Analysis

24.1 Preview
24.2 The Poisson Distribution
24.3 An Example of Poisson Regression
24.4 Poisson Regression
24.5 Measures of Goodness of Fit
24.6 Continuation of Skin Cancer Data Example
24.7 A Second Illustration of Poisson Regression Analysis
24.8 Summary
Problems
References

25 Analysis of Correlated Data Part 1: The General Linear Mixed Model

25.1 Preview
25.2 Examples
25.3 General Linear Mixed Model Approach
25.4 Example: Study of Effects of an Air Pollution Episode on FEV1 Levels
25.5 Summary: Analysis of Correlated Data, Part 1
Problems
References

26 Analysis of Correlated Data Part 2: Random Effects and Other Issues

26.1 Preview
26.2 Random Effects Revisited
26.3 Results for Models with Random Effects Applied to Air Pollution Study Data
26.4 Second Example: Analysis of Posture Measurement Data
26.5 Recommendations about Choice of Correlation Structure
26.6 Analysis of Data for Discrete Outcomes
Problems
References

27 Sample Size Planning for Linear and Logistic Regression and Analysis of Variance

27.1 Preview
27.2 Review: Sample Size Calculations for Comparisons of Means and Proportions
27.3 Sample Size Planning for Linear Regression
27.4 Sample Size Planning for Logistic Regression
27.5 Power and Sample Size Determination for Linear Models: A General Approach
27.6 Sample Size Determination for Matched Case–control Studies with a Dichotomous Outcome
27.7 Practical Considerations and Cautions
Problems
References

Appendix A: Tables

A.1 Standard Normal Cumulative Probabilities
A.2 Percentiles of the t Distribution
A.3 Percentiles of the Chi-square Distribution
A.4 Percentiles of the F Distribution
A.5 Values of $\frac{1}{2}\ln\frac{1+r}{1-r}$
A.6 Upper $\alpha$ Point of Studentized Range
A.7 Orthogonal Polynomial Coefficients
A.8A Bonferroni Corrected Jackknife Residual Critical Values
A.8B Bonferroni Corrected Studentized Residual Critical Values
A.9 Critical Values for Leverages
A.10 Critical Values for the Maximum of n Values of Cook’s $(n - k - 1)\,d_i$

Appendix B: Matrices and Their Relationship to Regression Analysis

Appendix C: SAS Computer Appendix

Appendix D: Answers to Selected Problems

Index

Preface
This is the fourth revision of our second-level statistics text, originally published in 1978 and
revised in 1987, 1998, and 2008. As with previous versions, this text is intended primarily
for advanced undergraduates, graduate students, and working professionals in the health,
social, biological, and behavioral sciences who engage in applied research in their fields. The
text may also provide professional statisticians with some new insights into the application of
advanced statistical techniques to realistic research problems.
We have attempted in this revision to retain the basic structure and flavor of the earlier
editions, while at the same time making changes to keep pace with current analytic practices
and computer usage in applied research. Notable changes in this fifth edition, discussed in
more detail later, include
i. Clarification of content and/or terminology as suggested by reviewers and readers,
including revision of variable and subscript notation used for predictor variables
and regression coefficients to provide consistency over different chapters.
ii. Expanded and updated coverage of some content areas (e.g., confounding and
interaction in regression in Chapter 11, selecting the best regression equation in
Chapter 16, sample size determination in Chapter 27).
iii. A new linear regression example that is carried through and expanded upon in
Chapters 5, 6, 8, 9, 11, 12, 13, and 16.
iv. Some new exercises at the end of selected chapters, including exercises related to
the new example described in item (iii) above.
v. Updated SAS computer output using SAS 9.3 that reflects improvements in output
styling.
vi. Two computer appendices on programming procedures for multiple linear regression
models, logistic regression models, Poisson regression models, and mixed
linear models:
a. In-text: SAS
b. Online: SPSS, STATA, and R
In this fifth edition, as in our previous versions, we emphasize the intuitive logic and
assumptions that underlie the techniques covered, the purposes for which these techniques
are designed, the advantages and disadvantages of these techniques, and valid interpretations
based on these techniques. Although we describe the statistical calculations required for the
techniques we cover, we rely on computer output to provide the results of such calculations
so the reader can concentrate on how to apply a given technique rather than how to carry
out the calculations. The mathematical formulas that we do present require no more than
simple algebraic manipulations. Proofs are of secondary importance and are generally omitted.
Calculus is not explicitly used anywhere in the main text. We introduce matrix notation
to a limited extent in Chapters 25 and 26 because we believe that the use of matrices provides
a more convenient way to understand some of the complicated mathematical aspects of the
analysis of correlated data. We also have continued to include an appendix on matrices for
the interested reader.
This edition, as with the previous editions, is not intended to be a general reference
work dealing with all the statistical techniques available for analyzing data involving several
variables. Instead, we focus on the techniques we consider most essential for use in applied
research. We want the reader to understand the concepts and assumptions involved in these
techniques and how these techniques can be applied in practice, including how computer
packages can help make it easier to perform the analysis of one’s data.
The most notable features of this fifth edition, including the material that has not been
modified from the previous edition, are the following:

1. Regression analysis (Chapters 1–16) and analysis of variance (Chapters 17–20)
are discussed in considerable detail and with pedagogical care that reflects the
authors’ extensive experience and insight as teachers of such material.
2. A new linear regression example based on a complex survey design is carried
through and expanded upon in several chapters, including new exercises involving
the dataset for this example. To obtain the most valid estimates of regression
coefficients, weighting and stratification schemes involved in the survey design
should be taken into account. Although it is beyond the scope of this text to
describe regression methods for analyzing complex survey designs, we discuss and
illustrate the extent to which results from using such “weighted” methods may
differ from results from using the “unweighted” methods emphasized in this text.
3. The relationship between regression analysis and analysis of variance is highlighted.
4. The connection between multiple regression analysis and multiple and partial
correlation analysis is discussed in detail.
5. Several advanced topics are presented in a unique, nonmathematical manner,
including chapters on maximum likelihood (ML) methods (21), binary logistic
regression (22), polytomous and ordinal logistic regression (23), and Poisson
regression (24), and two chapters (25–26) on the analysis of correlated data
(described further below). The material on ML methods in Chapters 21–26 provides
a strong foundation for understanding why ML estimation is the most
widely used method for fitting mathematical models involving several variables.
6. An up-to-date discussion of the issues and procedures involved in fine-tuning
a regression analysis is presented in the chapters on confounding and interaction
in regression (Chapter 11), selecting the best regression model (Chapter 16), and
regression diagnostics (Chapter 14).
7. Chapter 23 on polytomous and ordinal logistic regression methods extends the
standard (binary) logistic model to outcome variables that have more than two
categories. Polytomous logistic regression is used when the outcome categories
do not have any natural order, whereas ordinal logistic regression is appropriate
when the outcome categories have a natural order.
8. Chapters 25 and 26 on the analysis of correlated data describe the ML/REML
linear mixed model approach incorporated into SAS’s MIXED procedure. Since
ML estimation is assumed, these chapters are logically ordered after the current
Chapter 21 on ML estimation. In Chapter 25, we describe the general form
of the linear mixed model, introduce the terms correlation structure and
robust/empirical standard errors, and illustrate how to model correlated data when only
fixed effects are considered. In Chapter 26, which serves as Part 2 of this topic,
we focus on linear mixed models that contain random effects. Chapter 26 also
provides a link to ANOVA Chapters 17–20, alternatively formulating the linear
mixed model approach in terms of an ANOVA that partitions sources of variation
from various predictors into the sums of squares and corresponding mean
square terms of a summary ANOVA table.
9. Chapter 27 on sample size determination for linear and logistic regression models
describes two approaches for sample size calculation. The first is an approximate
approach that yields fairly accurate sample sizes and requires only manual
computation. The second approach is based on more traditional theory for sample
size determination and is best implemented using computer software. This
chapter has been updated to reflect SAS 9.3 and PASS 11 output.
10. Representative computer results from SAS 9.3 are used to illustrate concepts in
the body of the text, as well as to provide a basis for exercises for the reader. In
this edition, we revised the computer output to reflect the most recent version of
SAS, and, in many instances, we added annotations to the output so that it is
easier to read.
11. Numerous examples and exercises illustrate applications to real studies in a wide
variety of disciplines. New exercises have been added to several chapters.
12. Solutions to selected exercises are provided in Appendix D. An Instructor’s Solutions
Manual containing solutions to all exercises is also available with the fifth
edition. In addition, a Student Solutions Manual containing complete solutions
to selected problems is available for students.
13. Computer Appendix C is a new addition to the text that describes how to use
Version 9.3 of SAS to carry out linear regression, logistic regression, Poisson
regression, and correlated data analysis of linear models.
14. Links to freely downloadable datasets; a computer appendix on the use of STATA,
SPSS, and R packages to carry out linear regression modeling; updates on errata;
and other information are available at CengageBrain.com.
15. The computer appendix mentioned in item (14) will be a freely downloadable
electronic document providing computer guidelines for multiple linear regression
models. (Other textbooks by Kleinbaum and Klein have computer appendices
for SAS, STATA, and SPSS use with logistic models and Cox proportional
hazards and extended Cox models for survival data.) The computer appendix
will provide a quick and easy reference guide to help the reader avoid having to
spend a lot of time finding information from sometimes confusing help guides in
packages like SAS.
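
Item 2 above notes that results from "weighted" methods may differ from results from the "unweighted" methods emphasized in the text. As a rough illustration only, the following Python sketch fits a straight line with and without observation weights. The data, the weights, and the function name `fit_line` are all hypothetical and are not taken from the text or its survey example. The sketch uses plain weighted least squares; it ignores stratification and does not produce design-based standard errors, so it is not a complex-survey analysis.

```python
def fit_line(x, y, w=None):
    """Return (intercept, slope) minimizing sum of w_i * (y_i - b0 - b1 * x_i)**2.

    With w=None every observation gets weight 1, which is ordinary
    (unweighted) least squares for a straight line.
    """
    if w is None:
        w = [1.0] * len(x)
    total_w = sum(w)
    # Weighted means of X and Y
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    # Weighted sums of cross-products and squares about the means
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: five observations of (X, Y)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
# Hypothetical weights: the last observation stands in for four population units
weights = [1.0, 1.0, 1.0, 1.0, 4.0]

unweighted = fit_line(x, y)
weighted = fit_line(x, y, weights)

print("unweighted intercept, slope:", unweighted)
print("weighted intercept, slope:  ", weighted)
```

In this made-up example the two fits give different slopes and intercepts, which is the kind of discrepancy item 2 refers to. How large the difference is in practice depends on the survey design and the data.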

Suggestions for Instructors or Individual Learners


For formal classroom instruction and/or individual/distance learning, the chapters fall
naturally into four clusters:
Course 1: Chapters 4–16, on linear regression analysis
Course 2: Chapters 17–20, on the analysis of variance
Course 3: Chapters 21–24, on maximum likelihood methods and important applications
involving logistic and Poisson regression modeling
Course 4: Chapters 25–26, on the analysis of correlated data involving linear mixed
models
Portions of Chapter 27 on sample size determination could be added, as appropriate, to
Courses 1–3 above. Courses 1 and 2 have often been combined into one course on regression
and ANOVA methods. For a first course in regression analysis, some of Chapters 11 through
16 may be considered too specialized. For example, Chapter 16 on selecting the best regression
model and Chapter 14 on regression diagnostics might be used in a continuation course
on regression modeling, which might also include some of the advanced topics covered in
Chapters 21–27.

Acknowledgments
We wish to acknowledge several people who contributed to the development of this text,
including early editions as well as this fifth edition. Drs. Kleinbaum and Kupper continue to
be indebted to John Cassel and Bernard Greenberg, two mentors who have provided us with
inspiration and the professional and administrative guidance that enabled us at the beginning
of our careers to gain the broad experience necessary to write this text.
Dr. Kleinbaum also wishes to thank John Boring, former Chair of the Department of
Epidemiology at Emory University, for his strong support and encouragement during the
writing of the third and fourth editions and for his deep commitment to teaching excellence.
Dr. Kleinbaum also wishes to thank Dr. Mitch Klein of Emory’s Department of Epidemiology
for his colleagueship, including thoughtful suggestions on and review of previous editions.
Dr. Kleinbaum also thanks Dr. Viola Vaccarino, Chair of the Department of Epidemiology
at Emory University, for continued support and encouragement of his academic life at the
Rollins School of Public Health at Emory University.
Dr. Kupper will forever be indebted to Dr. William Mendenhall, founder and longtime
Chair of the University of Florida Department of Statistics. Dr. Mendenhall gave Dr. Kupper
his start in the field of statistics, and he served as a perfect example of an inspiring teacher
and a caring mentor.
Mr. Nizam wishes to thank Dr. Lance Waller, Chair of the Department of Biostatistics
and Bioinformatics at Emory University, for his strong support and Dr. John Spurrier of the
Department of Statistics at the University of South Carolina for being a wonderful teacher,
advisor, and mentor.

Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

We thank Julia Labadie for her assistance in preparing SAS computer output for this
edition. We also thank Dr. Keith Muller for his contributions to earlier editions as one of
our coauthors.
We thank our spouses—Edna Kleinbaum, Sandy Martin, Janet Nizam, and Abby
Horowitz—for their encouragement and support during the writing of various revisions.
We thank our reviewers of the fifth edition for their helpful suggestions:
Joseph Glaz, University of Connecticut
Lynn Kuo, University of Connecticut
Robert Paige, Missouri University of Science and Technology
Debaraj Sen, Concordia University
Po Yang, DePaul University
We thank the Cengage Learning Statistics and Mathematics team, especially Molly
Taylor, Senior Product Manager, and Laura Wheel, Senior Content Developer, for guiding
us through the publication process for the fifth edition, as well as Jessica Rasile, Content
Project Manager, and Tania Andrabi, Production Manager.
David G. Kleinbaum
Lawrence L. Kupper
Azhar Nizam
Eli S. Rosenberg

1 Concepts and Examples of Research

1.1 Concepts
The purpose of most empirical research is to assess relationships among a set of vari-
ables, which are factors that are distinctly measured on observational units (or subjects).
Multivariable¹ techniques are concerned with the statistical analysis of such relationships,
particularly when at least three variables are involved. Regression analysis, our primary focus,
is one type of multivariable technique. Other techniques will also be described in this text.
Choosing an appropriate technique depends on the purpose of the research and on the types
of variables under investigation (a subject discussed in Chapter 2).
Research may be classified broadly into three types: experimental, quasi-experimental, or
observational. Multivariable techniques are applicable to all such types, yet the confidence
one may reasonably have in the results of a study can vary with the research type. In most
types, one variable is usually taken to be a response or dependent variable—that is, a variable
to be predicted from other variables. The other variables are called predictor or independent
variables.
If observational units (subjects) are randomly assigned to levels of important predictors,
the study is usually classified as an experiment. Experiments are the most controlled type of
study; they maximize the investigator’s ability to isolate the observed effect of the predictors
from the distorting effects of other (independent) variables that might also be related to the
response.
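As a rough illustration of what random assignment means in practice, the following sketch shuffles a list of subjects and deals them out to the levels of a single predictor. The function name `randomize`, the subject identifiers, the level labels, and the seed are all hypothetical choices made for this example; real experimental designs often use more elaborate schemes (e.g., blocking), which this sketch does not attempt.

```python
import random


def randomize(subject_ids, levels, seed=None):
    """Randomly assign subjects to predictor levels in (near-)equal group sizes."""
    rng = random.Random(seed)      # a seeded generator makes the allocation reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)          # random order removes any systematic selection
    # Deal the shuffled subjects out to the levels in rotation.
    return {subject: levels[i % len(levels)] for i, subject in enumerate(shuffled)}


# Twelve hypothetical subjects, two hypothetical treatment levels.
assignment = randomize(range(1, 13), ["control", "treatment"], seed=2013)
for subject in sorted(assignment):
    print(subject, assignment[subject])
```

Because the allocation depends only on the random shuffle, no characteristic of a subject can systematically influence which level that subject receives.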

¹ The term multivariable is preferable to multivariate. Statisticians generally use the term multivariate analysis to
describe a method in which several dependent variables can be considered simultaneously. Researchers in the bio-
medical and health sciences who are not statisticians, however, use this term to describe any statistical technique
involving several variables, even if only one dependent variable is considered at a time. In this text, we prefer to avoid
the confusion by using the term multivariable analysis to denote the latter, more general description.


If subjects are assigned to treatment conditions without randomization, the study is
called quasi-experimental (Campbell and Stanley 1963). Such studies are often more feasible
and less expensive than experimental studies, but they provide less control over the study
situation.
Finally, if all observations are obtained without either randomization or artificial manip-
ulation (i.e., allocation) of the predictor variables, the study is said to be observational. Exper-
iments offer the greatest potential for drawing definitive conclusions, and observational
studies the least; however, experiments are the most difficult studies to implement, and
observational studies the easiest. A researcher must consider this trade-off between interpre-
tive potential and complexity of design when choosing among types of studies (Kleinbaum,
Kupper, and Morgenstern 1982, Chapter 3).
To assess a relationship between two variables, one must measure both of them in some
manner. Measurement inherently and unavoidably involves error. The need for statistical
design and analysis emanates from the presence of such error. Traditionally, statistical
inference has been divided into two kinds: estimation and hypothesis testing. Estimation
refers to describing (i.e., quantifying) characteristics and strengths of relationships. Testing
refers to specifying hypotheses about relationships, making statements of probability about
the reasonableness of such hypotheses, and then providing practical conclusions based on
such statements.
This text focuses on regression and correlation methods involving one response variable
and one or more predictor variables. In these methods, a mathematical model is specified
that describes how the variables of interest are related to one another. The model must some-
how be developed from study data, after which inference-making procedures (e.g., testing
hypotheses and constructing confidence intervals) are conducted about important param-
eters of interest. Although other multivariable regression methods will be discussed, linear
regression techniques are emphasized for three reasons: they have wide applicability; they can
be the most straightforward to implement; and other, more complex statistical procedures
can be better appreciated once linear regression methods are understood.

1.2 Examples
The examples that follow concern real problems from a variety of disciplines and involve
variables to which the methods described in this book can be applied. We shall return to
these examples later when illustrating various methods of multivariable analysis.

■ Example 1.1 Study of the associations among the physician–patient relationship,
perception of pregnancy, and outcome of pregnancy, illustrating the use of regression
analysis and logistic regression analysis.
Thompson (1972) and Hulka and others (1971) looked at both the process and the
outcomes of medical care in a cohort of 107 pregnant married women in North Carolina.
The data were obtained through patient interviews, questionnaires completed by physicians,
and a review of medical records. Several variables were recorded for each patient.
One research goal of primary interest was to determine what association, if any, existed
between satisfaction with medical care and a number of variables meant to describe patient
perception of pregnancy and the physician–patient relationship. Three perception-of-pregnancy
variables measured the patient’s worry during pregnancy, her desire for the baby,
and her concern about childbirth. Two other variables measured the physician–patient rela-
tionship in terms of informational communication concerning prescriptions and affective
communication concerning perceptions. Other variables considered were age, social class,
education, and parity.
Regression analysis was used to describe the relationship between scores measuring patient
satisfaction with medical care and the preceding variables. From this analysis, variables found
not to be related to medical care satisfaction could be eliminated, while those found to be
associated with satisfaction could be ranked in order of importance. Also, the effects of con-
founding variables such as age and social class could be considered, to three ends: any asso-
ciations found could not be attributed solely to such variables; measures of the strength of
the relationship between satisfaction and other variables could be obtained; and a functional
equation predicting level of patient satisfaction in terms of the other variables found to be
important in describing satisfaction could be developed.
Another question of interest in this study was whether patient perception of pregnancy
and/or the physician–patient relationship was associated with complications of pregnancy.
A variable describing complications was defined so that the value 1 could be assigned if the
patient experienced one or more complications of pregnancy and 0 if she experienced no
complications. Logistic regression analysis was used to evaluate the relationship between the
occurrence of complications of pregnancy and other variables. This method, like regression
analysis, allows the researcher to determine and rank important variables that can distinguish
between patients who have complications and patients who do not. ■

■ Example 1.2 Study of race and social influence in cooperative problem-solving dyads,
illustrating the use of analysis of variance and analysis of covariance.
James (1973) conducted an experiment on 140 seventh- and eighth-grade males to
investigate the effects of two factors—race of the experimenter (E) and race of the compari-
son norm (N)—on social influence behaviors in three types of dyads: white–white; black–
black; and white–black. Subjects played a game of strategy called Kill the Bull, in which
14 separate decisions must be made for proceeding toward a defined goal on a game board.
In the game, each pair of players (dyad) must reach a consensus on a direction at each deci-
sion step, after which they signal the E, who then rolls a die to determine how far they can
advance along their chosen path of six squares. Photographs of the current champion players
(N) (either two black youths [black norm] or two white youths [white norm]) were placed
above the game board.
Four measures of social influence activity were used as the outcome variables of inter-
est. One of these, called performance output, was a measure of the number of times a given
subject attempted to influence his dyad to move in a particular direction.
The major research question focused on the outcomes for biracial dyads. Previous
research of this type had used only white investigators and implicit white comparison
norms, and the results indicated that the white partner tended to dominate the decision
making. James’s study sought to determine whether such an “interaction disability,” previ-
ously attributed to blacks, would be maintained, removed, or reversed when the comparison
norm, the experimenter, or both were black. One approach to analyzing this problem was to
perform a two-way analysis of variance on social-influence-activity difference scores between
black and white partners, to assess whether such differences were affected by either the race
of E or the race of N. No such significant effects were found, however, implying that nei-
ther E nor N influenced biracial dyad interaction. Nevertheless, through use of analysis of
covariance, it was shown that, controlling for factors such as age, height, grade, and verbal
and mathematical test scores, there was no statistical evidence of white dominance in any of
the experimental conditions.
Furthermore, when combined output scores for both subjects in same-race dyads
(white–white or black–black) were analyzed using a three-way analysis of variance (the three
factors being race of dyad, race of E, and race of N), subjects in all-black dyads were found
to be more verbally active (i.e., exhibited a greater tendency to influence decisions) under a
black E than under a white E; the same result was found for white dyads under a white E.
This property is generally referred to in statistical jargon as a “race of dyad” by “race of E”
interaction. The property continued to hold up after analysis of covariance was used to con-
trol for the effects of age, height, and verbal and mathematical test scores. ■

■ Example 1.3 Study of the relationship of cultural change to health, illustrating the use
of analysis of variance.
Patrick and others (1974) studied the effects of cultural change on health in the U.S.
Trust Territory island of Ponape. Medical and sociological data were obtained on a sample
of about 2,000 people by means of physical exams and a sociological questionnaire. This
Micronesian island has experienced rapid Westernization and modernization since American
occupation in 1945. The question of primary interest was whether rapid social and cultural
change caused increases in blood pressure and in the incidence of coronary heart disease. A
specific hypothesis guiding the research was that persons with high levels of cultural ambigu-
ity and incongruity and low levels of supportive affiliations with others have high levels of
blood pressure and are at high risk for coronary heart disease.
A preliminary step in the evaluation of this hypothesis involved measuring three vari-
ables: attitude toward modern life; preparation for modern life; and involvement in modern
life. Each of these variables was created by isolating specific questions from a sociological
questionnaire. Then a factor analysis² determined how best to combine the scores on specific
questions into a single overall score that defined the variable under consideration. Two
cultural incongruity variables were then defined. One involved the discrepancy between
attitude toward modern life and involvement in modern life; the other was defined as the
discrepancy between preparation for modern life and involvement in modern life.
These variables were then analyzed to determine their relationship, if any, to blood pres-
sure and coronary heart disease. Individuals with large positive or negative scores on either
of the two incongruity variables were hypothesized to have high blood pressure and to be at
high risk for coronary heart disease.
One approach to analysis involved categorizing both discrepancy scores into high and
low groups. Then a two-way analysis of variance could be performed using blood pressure
as the outcome variable. We will see later that this problem can also be described as a
regression problem. ■

² Factor analysis was described in Chapter 24 of the second edition of this text, but it is not included
in this (fifth) edition.

■ Example 1.4 Study of the association between alcohol consumption frequency and
body-mass index (BMI) in the Behavioral Risk Factor Surveillance System (BRFSS).
The BRFSS is a large and ongoing surveillance project managed by the U.S. Centers
for Disease Control and Prevention (CDC) and conducted by state health departments as
telephone-based interviews, based on random-digit dialing. Its purpose is to “generate infor-
mation about health risk behaviors, clinical preventive practices, and health care access and
use primarily related to chronic diseases and injury” (CDC 2012).
The unpublished example considered here examines the relationship between frequency
of alcohol use in the previous 30 days and the response variable of BMI, a common measure
of body fat defined as (weight in kg)/(height in m)². Dozens of studies have demonstrated
cardiovascular benefits of red wine consumption. Yet the relationship between alcohol con-
sumption and BMI, an important risk factor for numerous chronic diseases, is less clear.
An analysis of data from the National Health Interview Survey found a moderate reduction
in BMI associated with increasing drinking frequency, yet an increase in BMI with greater
drinking volume (Breslow and Smothers 2005). These relationships were different for males
and females (an example of interaction; see Chapter 11), who are known to metabolize alco-
hol differently.
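The BMI definition given above can be written out as a short computation. This is only a sketch: the function name `bmi` and the example weight and height are illustrative assumptions, not values taken from the BRFSS.

```python
def bmi(weight_kg, height_m):
    """Body-mass index: weight in kilograms divided by squared height in meters."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# A hypothetical person weighing 70 kg who is 1.75 m tall:
# 70 / 1.75**2 = 70 / 3.0625 = 22.857...
print(round(bmi(70.0, 1.75), 1))  # prints 22.9
```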
This analysis of drinking frequency and BMI considers females who live in the state of
Georgia and who consume nonheavy amounts of alcohol (for the 2010 BRFSS data collec-
tion year). Straight-line regression analysis is used to quantify the same negative association
between drinking frequency and BMI found by others. Multiple regression analysis and analy-
sis of covariance are used to additionally consider the effects of age and other health behaviors
(e.g., sleep quality, exercise, and tobacco use) that are known to be associated with BMI.
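To give a first sense of what straight-line regression does with data of this kind, the sketch below fits a least-squares line to a handful of (drinking days, BMI) pairs. The numbers are invented purely for illustration and are not BRFSS observations; the function name `fit_line` is likewise an assumption of this example. The least-squares formulas used here are the standard ones for fitting a line with a single predictor.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for the straight line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented illustrative data: drinking days in the past 30 days, and BMI.
drinking_days = [0, 2, 4, 8, 12, 20]
bmi_values = [28.0, 27.5, 27.4, 26.8, 26.1, 25.0]

b0, b1 = fit_line(drinking_days, bmi_values)
print(f"estimated BMI = {b0:.2f} + ({b1:.3f}) * drinking days")
```

A negative estimated slope corresponds to lower BMI at higher drinking frequency; whether such a slope differs meaningfully from zero is an inferential question that requires the methods described later in the text.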
This example is unique in that it provides key illustrations of the objectives of regres-
sion techniques for the analysis of public health surveillance data on a health outcome with
numerous determinants. These objectives can differ from those used for the analysis of
data emanating from more controlled health studies (such as randomized controlled clini-
cal trials). In particular, the large sample size associated with the BRFSS provides oppor-
tunities for the detection of statistically significant (and sometimes both unexpected and
meaningful) associations between certain determinants and BMI that might otherwise be
challenging to detect. Such hypothesis-generating regression findings can suggest avenues
for further research. It is important to mention that such surveillance studies limit causal
interpretations of the findings. These and related issues are discussed further in several
chapters that follow.

1.3 Concluding Remarks


The four examples described in Section 1.2 indicate the variety of research questions to
which multivariable statistical methods are applicable. In Chapter 2, we will provide a broad
overview of such techniques; in the remaining chapters, we will discuss each technique in
detail.


References
Breslow, R. A., and Smothers, B. A. 2005. “Drinking Patterns and Body Mass Index in Never Smokers:
National Health Interview Survey, 1997–2001.” American Journal of Epidemiology 161(4):
368–76.
Campbell, D. T., and Stanley, J. C. 1963. Experimental and Quasi-experimental Designs for Research.
Chicago: Rand McNally.
CDC Office of Surveillance, Epidemiology, and Laboratory Services. 2012. “Behavioral Risk Factor
Surveillance System: BRFSS Frequently Asked Questions (FAQs).” http://www.cdc.gov/brfss/faqs.htm.
Hulka, B. S.; Kupper, L. L.; Cassel, J. C.; and Thompson, S. J. 1971. “A Method for Measuring
Physicians’ Awareness of Patients’ Concerns.” HSMHA Health Reports 86: 741–51.
James, S. A. 1973. “The Effects of the Race of Experimenter and Race of Comparison Norm on Social
Influence in Same Race and Biracial Problem-Solving Dyads.” Ph.D. dissertation, Department
of Clinical Psychology, Washington University, St. Louis, Mo.
Kleinbaum, D. G.; Kupper, L. L.; and Morgenstern, H. 1982. Epidemiologic Research. Belmont, Calif.:
Lifetime Learning Publications.
Patrick, R.; Cassel, J. C.; Tyroler, H. A.; Stanley, L.; and Wild, J. 1974. “The Ponape Study of
Health Effects of Cultural Change.” Paper presented at the annual meeting of the Society for
Epidemiologic Research, Berkeley, Calif.
Thompson, S. J. 1972. “The Doctor–Patient Relationship and Outcomes of Pregnancy.” Ph.D.
dissertation, Department of Epidemiology, University of North Carolina, Chapel Hill.

2 Classification of Variables and the Choice of Analysis

2.1 Classification of Variables


Variables can be classified in a number of ways. Such classifications are useful for determining
which method of data analysis to use. In this section, we describe three methods of classifica-
tion: by gappiness, by level of measurement, and by descriptive orientation.

2.1.1 Gappiness
In the classification scheme we call gappiness, we determine whether gaps exist between
successively observed values of a variable (Figure 2.1). If gaps exist between observations, the
variable is said to be discrete; if no gaps exist, the variable is said to be continuous. To speak
more precisely, a variable is discrete if, between any two potentially observable values, a value
exists that is not possibly observable. A variable is continuous if, between any two potentially
observable values, another potentially observable value exists.
Examples of continuous variables are age, blood pressure, cholesterol level, height, and
weight. Discrete variables are often counts, such as the number of deaths or car accidents.
Additionally, nonnumeric information is often numerically coded in data sources using dis-
crete variables. Examples of this are sex (e.g., 0 if male and 1 if female), group identification
(e.g., 1 if group A and 2 if group B), and state of disease (e.g., 1 if a coronary heart disease
case and 0 if not a coronary heart disease case).
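The numeric coding of nonnumeric information described above might be sketched as follows. The codes mirror the examples in the text; the dictionary names, the function `encode`, and the sample record are hypothetical.

```python
# Code tables following the examples in the text.
SEX_CODES = {"male": 0, "female": 1}
GROUP_CODES = {"A": 1, "B": 2}
CHD_CODES = {"not a case": 0, "case": 1}


def encode(record):
    """Replace the nonnumeric fields of one subject's record with discrete codes."""
    return {
        "sex": SEX_CODES[record["sex"]],
        "group": GROUP_CODES[record["group"]],
        "chd": CHD_CODES[record["chd"]],
    }


# A hypothetical subject.
print(encode({"sex": "female", "group": "B", "chd": "case"}))
# prints {'sex': 1, 'group': 2, 'chd': 1}
```

The codes are labels only: the fact that 2 is greater than 1 says nothing about group B relative to group A.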
FIGURE 2.1 Discrete versus continuous variables: (a) values of a discrete variable (gaps); (b) values of a continuous variable (no gaps). (© Cengage Learning)


FIGURE 2.2 Sample frequency distributions of a continuous and a discrete variable: (a) histogram of a continuous variable; (b) line chart of a discrete variable. Vertical axis in both panels: relative frequency. (© Cengage Learning)

In analyses of actual data, the sampling frequency distributions for continuous variables
are represented differently from those for discrete variables. Data on a continuous variable
are usually grouped into class intervals, and a relative frequency distribution is determined
by counting the proportion of observations in each interval. Such a distribution is usually
represented by a histogram, as shown in Figure 2.2(a). Data on a discrete variable, on the
other hand, are usually not grouped but are represented instead by a line chart, as shown in
Figure 2.2(b).
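The two ways of tabulating a relative frequency distribution can be sketched in a few lines. The function names, the class-interval width, and both small data sets are invented for illustration; the grouped version produces the heights of a histogram's bars, and the ungrouped version the heights of a line chart's lines.

```python
from collections import Counter


def grouped_relative_frequencies(values, start, width):
    """Proportion of observations in each class interval [lower, lower + width)."""
    counts = Counter(start + width * int((v - start) // width) for v in values)
    n = len(values)
    return {(lower, lower + width): counts[lower] / n for lower in sorted(counts)}


def relative_frequencies(values):
    """Proportion of observations at each distinct value (no grouping)."""
    counts = Counter(values)
    n = len(values)
    return {value: counts[value] / n for value in sorted(counts)}


# Continuous variable (invented systolic blood pressures), grouped in intervals of 10.
pressures = [118.2, 121.5, 126.0, 131.4, 134.9, 139.0, 142.3, 127.7]
print(grouped_relative_frequencies(pressures, start=110, width=10))

# Discrete variable (invented counts), tabulated value by value.
counts_per_subject = [0, 1, 1, 2, 2, 2, 3, 4]
print(relative_frequencies(counts_per_subject))
```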
Discrete variables can sometimes be treated for analysis purposes as continuous variables.
This is possible when the values of such a variable, even though discrete, are not far apart
and cover a wide range of numbers. In such a case, the possible values, although technically
gappy, show such small gaps between values that a visual representation would approximate
an interval (Figure 2.3).
Furthermore, a line chart, like the one in Figure 2.2(b), representing the frequency dis-
tribution of data on such a variable would probably show few frequencies greater than 1 and
thus would be uninformative. As an example, the variable “social class” is usually measured as
discrete; one measure of social class¹ takes on integer values between 11 and 77. When data
on this variable are grouped into classes (e.g., 11–15, 16–20, etc.), the resulting frequency
histogram gives a clearer picture of the characteristics of the variable than a line chart does.
Thus, in this case, treating social class as a continuous variable is sometimes more useful than
treating it as discrete.
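The grouping of social class scores into classes such as 11–15 and 16–20 can be sketched as below. The scores are invented, the function name is hypothetical, and truncating the final class at 77 is an assumption of this example rather than something the text specifies.

```python
from collections import Counter


def social_class_interval(score, low=11, high=77, width=5):
    """Return the inclusive class (lower, upper) that contains an integer score."""
    if not low <= score <= high:
        raise ValueError(f"score must lie between {low} and {high}")
    lower = low + width * ((score - low) // width)
    # Assumption: the last class is cut off at the largest possible score.
    return lower, min(lower + width - 1, high)


# Invented scores; most distinct values occur only once, so an ungrouped
# line chart would show little, whereas grouped counts reveal the shape.
scores = [11, 14, 18, 22, 23, 37, 40, 58, 61, 77]
histogram = Counter(social_class_interval(s) for s in scores)
for (lower, upper), count in sorted(histogram.items()):
    print(f"{lower}-{upper}: {count}")
```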
Just as it is often useful to treat a discrete variable as continuous, some fundamentally
continuous variables may be grouped into categories and treated as discrete variables in a
given analysis. For example, the variable “age” can be made discrete by grouping its values
into two categories, “young” and “old.” Similarly, “blood pressure” becomes a discrete
variable if it is categorized into “low,” “medium,” and “high” groups or into deciles.

FIGURE 2.3 Discrete variable that may be treated as continuous (© Cengage Learning)

¹ Hollingshead’s “Two-Factor Index of Social Position,” a description of which can be found in Green (1970).


The decision to categorize a continuous variable into discrete levels is nuanced, requiring
consideration of both pros and cons. On the one hand, a discrete version of a variable might
make the data easier to collect and summarize. This often, in turn, aids in the presentation of
results to colleagues. Yet these advantages must be balanced against the loss of information
that comes with converting a continuous variable into a discrete one. The choice of variable
type often impacts the type of analysis that can ultimately be conducted, and the desire to use
a certain analysis technique may drive decisions about the treatment of variables.
A further consideration concerns when to categorize continuous data. One may catego-
rize a continuous variable either at the time of data collection or at the time of data analysis.
The former choice often allows cheaper, quicker, and/or less precise methodology for data
collection to be employed. Yet this may also introduce human error (e.g., when a clini-
cian is given the extra step of classifying a continuous reading into one of several groups).
Categorization at the time of analysis reduces the likelihood of human error and also allows
for multiple classification schemes to be later considered, since the original continuous data
have not been forfeited.
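The advantage of categorizing at the time of analysis can be seen in a small sketch: because the continuous readings are retained, more than one classification scheme can be applied to the same data. All names, cutoffs, and values below are hypothetical and carry no clinical meaning.

```python
def dichotomize(age, cutoff):
    """Two-category version of age: 'young' below the cutoff, 'old' otherwise."""
    return "young" if age < cutoff else "old"


def three_level(value, low_cut, high_cut):
    """Three-category version of a continuous reading."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "medium"
    return "high"


# Invented continuous data, stored as originally measured.
ages = [23.5, 31.0, 44.2, 58.9, 67.3]
pressures = [104.0, 118.5, 127.0, 139.5, 151.0]

# Two alternative age schemes derived from the same retained data.
scheme_a = [dichotomize(a, cutoff=40) for a in ages]
scheme_b = [dichotomize(a, cutoff=65) for a in ages]
print(scheme_a)
print(scheme_b)

# A three-level blood pressure variable with arbitrary illustrative cutoffs.
print([three_level(p, low_cut=110, high_cut=140) for p in pressures])
```

Had only the category been recorded at data collection, the second age scheme could not have been constructed afterward.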
A related issue is that both continuous and discrete variables can be error-prone. Contin-
uous variables can be measured with error, and discrete variables can be misclassified. When
such error-prone variables are used in regression analyses, incorrect statistical conclusions can
be made (i.e., statistical validity can be compromised). In this textbook, it will be assumed
that variables to be considered are not subject to either measurement error or misclassifica-
tion error. A discussion of rigorous statistical methods for dealing with error-prone variables
in regression analyses is beyond the scope of this textbook, but Gustafson (2004) provides
numerous relevant references to such methods.

2.1.2 Level of Measurement


A second classification scheme deals with the preciseness of measurement of the variable.
There are three such levels: nominal, ordinal, and interval.
The numerically weakest level of measurement is the nominal level. At this level, the
values assumed by a variable simply indicate different categories. The variable “sex,” for
example, is nominal: by assigning the numbers 1 and 0 to denote male and female, respectively,
we distinguish the two sex categories. A variable that describes treatment group is
also nominal, provided that the treatments involved cannot be ranked according to some
criterion (e.g., dosage level).
A somewhat higher level of measurement allows not only grouping into separate cat-
egories but also ordering of categories. This level is called ordinal. The treatment group may
be considered ordinal if, for example, different treatments differ by dosage. In this case, we
could tell not only which treatment group an individual falls into but also who received a
heavier dose of the treatment. Social class is another ordinal variable, since an ordering can
be made among its different categories. For example, all members of the upper middle class
are higher in some sense than all members of the lower middle class.
A limitation (perhaps debatable) in the preciseness of a measurement such as social
class is the amount of information supplied by the magnitude of the differences between
different categories. Thus, although upper middle class is higher than lower middle class, it
is debatable how much higher.

Copyright 2013 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).
Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

A variable that can give not only an ordering but also a meaningful measure of the
distance between categories is called an interval variable. To be interval, a variable must be
expressed in terms of some standard or well-accepted physical unit of measurement. Height,
weight, blood pressure, and number of deaths all satisfy this requirement, whereas subjective
measures such as perception of pregnancy, personality type, prestige, and social stress do not.
An interval variable that has a scale with a true zero is occasionally designated as a
ratio or ratio-scale variable. An example of a ratio-scale variable is the height of a person.
Temperature is commonly measured in degrees Celsius, an interval scale. Measurement of
temperature in kelvins is based on a scale that begins at absolute zero and thus is a
ratio variable. An example of a ratio variable common in health studies is the concentration
of a substance (e.g., cholesterol) in the blood.
Ratio-scale variables often involve measurement errors that follow a nonnormal
distribution and are proportional to the size of the measurement. We will see in Chapter 5
that such proportional errors violate an important assumption of linear regression—namely,
equality of error variance for all observations. Hence, the presence of a ratio variable is a
signal to be on guard for a possible violation of this assumption. In Chapter 14 (on regression
diagnostics), we will describe methods for detecting and dealing with this problem.
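
The link between proportional errors and unequal error variance can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the multiplicative error model, the function name error_spread, and the parameter rel_sd are hypothetical choices, not taken from the text. It compares the spread of the errors at a small and at a large true value.

```python
import math
import random
import statistics


def error_spread(true_value, n=2000, rel_sd=0.10, seed=1):
    """Simulate n measurements of true_value with proportional error.

    Assumption (not from the text): observed = true_value * exp(e), where e is
    normal with standard deviation rel_sd, so the error grows with the size of
    the measurement. Returns (SD of raw errors, SD of errors on the log scale).
    """
    rng = random.Random(seed)
    observed = [true_value * math.exp(rng.gauss(0.0, rel_sd)) for _ in range(n)]
    raw_errors = [y - true_value for y in observed]
    log_errors = [math.log(y) - math.log(true_value) for y in observed]
    return statistics.stdev(raw_errors), statistics.stdev(log_errors)


# Same relative error, two different magnitudes of the true value.
raw_low, log_low = error_spread(true_value=10.0)
raw_high, log_high = error_spread(true_value=100.0)

print(f"raw error SD at 10:   {raw_low:.3f}")
print(f"raw error SD at 100:  {raw_high:.3f}")
print(f"log error SD at 10:   {log_low:.3f}")
print(f"log error SD at 100:  {log_high:.3f}")
```

Under these assumptions the raw errors are far more variable at the larger true value, which is the unequal-variance pattern described above, while the errors on the log scale have about the same spread at both magnitudes.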
As with variables in other classification schemes, the same variable may be considered at
one level of measurement in one analysis and at a different level in another analysis. Thus,
“age” may be considered as interval in a regression analysis or, by being grouped into catego-
ries, as nominal in an analysis of variance.
The various levels of mathematical preciseness are cumulative. An ordinal scale possesses
all the properties of a nominal scale plus ordinality. An interval scale is also nominal and
ordinal. The cumulativeness of these levels allows the researcher to drop back one or more levels
of measurement in analyzing the data. Thus, an interval variable may be treated as nominal
or ordinal for a particular analysis, and an ordinal variable may be analyzed as nominal.
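
Dropping back a level of measurement can be sketched in a few lines. In this minimal example, the age cutpoints (40 and 60), the group labels, and the function names are hypothetical and chosen only for illustration. Age in years is first reduced to a rank, which keeps the order but discards distances, and then to a bare category label.

```python
AGE_GROUPS = ["under 40", "40-59", "60 and over"]  # hypothetical categories


def age_to_ordinal(age_years):
    """Interval -> ordinal: return a rank (0, 1, 2); order kept, distances lost."""
    if age_years < 40:
        return 0
    if age_years < 60:
        return 1
    return 2


def age_to_nominal(age_years):
    """Interval -> nominal: return only a label, to be analyzed as unordered."""
    return AGE_GROUPS[age_to_ordinal(age_years)]


ages = [23, 41, 58, 67]
ranks = [age_to_ordinal(a) for a in ages]
labels = [age_to_nominal(a) for a in ages]

print(ranks)
print(labels)
```

Ages 41 and 58 fall into the same group, so the 17-year difference between them is no longer available to the analysis. The reverse step is not possible: the original ages cannot be recovered from the ranks or the labels.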

2.1.3 Descriptive Orientation


A third scheme for classifying variables is based on whether a variable is intended to describe
or be described by other variables. Such a classification depends on the study objectives rather
than on the inherent mathematical structure of the variable itself. If the variable under
investigation is to be described in terms of other variables, we call it a response or dependent
variable, typically denoted by the letter Y. If we are using the variable in conjunction with
other variables to describe a given response variable, we call it a predictor, a regressor, or an
independent variable,2 typically denoted by the letter X. Some independent variables may

2
The term independent variable is a historical term meant to evoke the notion that these measured factors may freely
vary from subject to subject, whereas changes in the dependent variable are thought to depend on and be determined by
the values of a subject’s independent variables. This usage of the term independent differs from the statistical concept
of independence. Two variables are statistically independent when the statistical behavior of one variable is completely
unaffected by the statistical behavior of the other variable. When two variables are independent, they are uncorrelated,
although zero correlation does not imply independence. In most regression analysis situations, there are nonzero cor-
relations among the independent (or predictor) variables. Though not ideal terminology, the phrase independent variable
is still commonly used in practice to denote a predictor variable in regression analysis, and we use this standard termi-
nology in this textbook.


[Figure 2.4 diagram. Independent variables: Perception Worry, Perception Desire, Perception
Birth, Informational Communication, Affective Communication. Control variables: Social Class,
Age, Education, Parity. Dependent variable: Satisfaction.]

FIGURE 2.4 Descriptive orientation for Thompson’s (1972) study of satisfaction with
medical care

affect relationships among other independent variables and/or the dependent variables but
be of no intrinsic interest in a particular study. Such variables may be referred to as control or
nuisance variables or, in some contexts, as covariates or confounders.
For example, in Thompson’s (1972) study of the relationship between patient per-
ception of pregnancy and patient satisfaction with medical care, the perception variables
are independent variables (or regressors), and the satisfaction variable is the dependent
(or response) variable (Figure 2.4).
Usually, the distinction between independent and dependent variables is clear, as it is in
the examples we have given. Nevertheless, a variable considered as dependent for purposes
of evaluating one study objective may be considered as independent for purposes of evaluating
a different objective. For example, in Thompson’s study, in addition to determining the
relationship of perceptions as independent variables to patient satisfaction, the researcher
sought to determine the relationships of social class, age, and education to perceptions
treated as dependent variables.
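
The footnote on the term independent variable remarks that zero correlation does not imply statistical independence. A small numerical example, constructed here for illustration and not taken from the text, makes the point: Y is completely determined by X, yet the Pearson correlation between them is zero. The helper pearson_r is a hypothetical name for a direct implementation of the usual formula.

```python
def pearson_r(xs, ys):
    """Pearson correlation computed directly from sums of cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


xs = [-2, -1, 0, 1, 2]      # values symmetric about zero
ys = [x * x for x in xs]    # Y depends on X exactly, but not linearly

r = pearson_r(xs, ys)
print(f"correlation between X and Y: {r:.3f}")
```

The correlation measures only linear association, so a purely curved dependence such as this one leaves it at zero.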

2.2 Overlapping of Classification Schemes


The three classification schemes described in Section 2.1 overlap in the sense that any vari-
able can be labeled according to each scheme. “Social class,” for example, may be considered
as ordinal, discrete, and independent in a given study; “blood pressure” may be considered
interval, continuous, and dependent in the same or another study.
The overlap between the level-of-measurement classification and the gappiness classifi-
cation is shown in Figure 2.5. The diagram does not include classification into dependent or
independent variables because that dimension is entirely a function of the study objectives


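[Figure 2.5 diagram: a triangle divided into regions labeled Interval, Ordinal, and Nominal,
with a dashed line separating Continuous from Discrete. Points are marked for the variable
“sex” (in the nominal region) and for different representations of the variable “age.”]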
FIGURE 2.5 Overlap of variable classifications

and not of the variable itself. In reading the diagram, one should consider any variable as
being representable by some point within the triangle. If the point falls below the dashed
line within the triangle, it is classified as discrete; if it falls above that line, it is continuous.
Also, a point that falls into the area marked “interval” is classified as an interval variable, and
similarly for the other two levels of measurement.
As Figure 2.5 indicates, any nominal variable must be discrete, but a discrete variable
may be nominal, ordinal, or interval. Also, a continuous variable must be either ordinal
or interval, although ordinal or interval variables may exist that are not continuous. For
example, “sex” is nominal and discrete; “age” may be considered interval and continuous or,
if grouped into categories, nominal and discrete; and “social class,” depending on how it is
measured and on the viewpoint of the researcher, may be considered ordinal and continuous,
ordinal and discrete, or nominal and discrete.
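
The constraints stated for Figure 2.5 can be written as a simple consistency check. The sketch below is illustrative only; the function name is_consistent and the string labels are assumptions made for this example. It encodes the single excluded combination: a nominal variable cannot be continuous.

```python
LEVELS = ("nominal", "ordinal", "interval")
GAPPINESS = ("discrete", "continuous")


def is_consistent(level, gappiness):
    """Return True if the (level, gappiness) pair is allowed by Figure 2.5.

    Any nominal variable must be discrete; every other pairing is possible.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level of measurement: {level!r}")
    if gappiness not in GAPPINESS:
        raise ValueError(f"unknown gappiness: {gappiness!r}")
    return not (level == "nominal" and gappiness == "continuous")


# List every pairing and whether it can occur.
allowed = [(lv, gp) for lv in LEVELS for gp in GAPPINESS if is_consistent(lv, gp)]
for level, gappiness in allowed:
    print(level, gappiness)
```

Five of the six pairings are allowed, which matches the statement that a continuous variable must be either ordinal or interval, whereas a discrete variable may be at any of the three levels.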

2.3 Choice of Analysis


Any researcher faced with the need to analyze data requires a rationale for choosing a par-
ticular method of analysis. Four considerations should enter into such a choice: the purpose
of the investigation; the mathematical characteristics of the variables involved; the statistical
assumptions made about these variables; and how the data are collected (e.g., the sampling
procedure). The first two considerations are generally sufficient to determine an appropriate
analysis. However, the researcher must consider the latter two items before finalizing initial
recommendations.
Here we focus on the use of variable classification, as it relates to the first two
considerations noted at the beginning of this section, in choosing an appropriate method
of analysis. Table 2.1 provides a rough guide to help the researcher in this choice when several
variables are involved. The guide distinguishes among various multivariable methods.


Table 2.1 Rough guide to multivariable methods

Method: Multiple linear regression analysis
  Classification of dependent variable: Continuous
  Classification of independent variables: Classically all continuous, but in practice any
    type(s) can be used
  General purpose: To describe the extent, direction, and strength of the relationship between
    several independent variables and a continuous dependent variable

Method: Analysis of variance
  Classification of dependent variable: Continuous
  Classification of independent variables: All nominal
  General purpose: To describe the relationship between a continuous dependent variable and
    one or more nominal independent variables

Method: Analysis of covariance
  Classification of dependent variable: Continuous
  Classification of independent variables: Mixture of nominal variables and continuous
    variables (the latter used as control variables)*
  General purpose: To describe the relationship between a continuous dependent variable and
    one or more nominal independent variables, controlling for the effect of one or more
    continuous independent variables

Method: Logistic regression analysis
  Classification of dependent variable: Dichotomous
  Classification of independent variables: A mixture of various types can be used
  General purpose: To determine how one or more independent variables are related to the
    probability of the occurrence of one of two possible outcomes

Method: Poisson regression analysis
  Classification of dependent variable: Discrete
  Classification of independent variables: A mixture of various types can be used
  General purpose: To determine how one or more independent variables are related to the
    rate of occurrence of some outcome

*Generally, a control variable is a variable that must be considered before any relationships of interest can be quantified; this is because a
control variable may be related to the variables of primary interest and must be taken into account in studying the relationships among the
primary variables. For example, in describing the relationship between blood pressure and physical activity, we would probably consider “age”
and “sex” as control variables because they are related to blood pressure and physical activity and, unless taken into account, could confound
any conclusions regarding the primary relationship of interest.


It considers the types of variable sets usually associated with each method and gives a gen-
eral description of the purposes of each method. In addition to using the table, however,
one must carefully check the statistical assumptions being made. These assumptions will be
described fully later in the text. Table 2.2 shows how these guidelines can be applied to the
examples given in Chapter 1.
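
The rough guide in Table 2.1 can be expressed as a small lookup function. This is a minimal sketch, not a substitute for checking the statistical assumptions: the function name suggest_method and the string labels are hypothetical, and the rule for mixed independent variables follows the table only loosely, since multiple linear regression can in practice be used with any type(s) of independent variable.

```python
def suggest_method(dependent, independents):
    """Suggest a method from Table 2.1 given variable classifications.

    dependent: "continuous", "dichotomous", or "discrete" (e.g., a count).
    independents: list of classifications such as "nominal" or "continuous".
    """
    kinds = set(independents)
    if not kinds:
        raise ValueError("at least one independent variable is required")
    if dependent == "dichotomous":
        return "logistic regression analysis"
    if dependent == "discrete":
        return "Poisson regression analysis"
    if dependent == "continuous":
        if kinds == {"nominal"}:
            return "analysis of variance"
        if kinds == {"nominal", "continuous"}:
            # Continuous independents are assumed here to be control variables.
            return "analysis of covariance"
        return "multiple linear regression analysis"
    raise ValueError(f"unknown dependent-variable classification: {dependent!r}")


print(suggest_method("continuous", ["continuous", "continuous"]))
print(suggest_method("continuous", ["nominal", "nominal"]))
print(suggest_method("continuous", ["nominal", "continuous"]))
print(suggest_method("dichotomous", ["nominal", "continuous"]))
print(suggest_method("discrete", ["continuous"]))
```

The function looks only at the mathematical characteristics of the variables. The purpose of the investigation, the statistical assumptions, and the way the data were collected still have to be weighed separately before an analysis is settled on.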
Several methods for dealing with multivariable problems are not included in Table 2.1
or in this text—among them, nonparametric methods of analysis of variance, multivariate
multiple regression, and multivariate analysis of variance (which are extensions of the cor-
responding methods given here that allow for several dependent variables), as well as methods
of cluster analysis. In this book, we will cover only the multivariable techniques used most
often by health and social researchers.

The Project Gutenberg eBook of Anti-Semitism
in the United States

Title: Anti-Semitism in the United States


Its history and causes

Author: Lee J. Levinger

Release date: March 21, 2024 [eBook #73221]

Language: English

Original publication: New York: Bloch Publishing Co., Inc, 1925

Transcriber’s Note:
New original cover art included with this eBook
is granted to the public domain.
ANTI-SEMITISM IN THE
UNITED STATES
ITS HISTORY AND CAUSES

BY

RABBI LEE J. LEVINGER, Ph.D.


Author of “A Jewish Chaplain In France”

NEW YORK
BLOCH PUBLISHING CO., Inc.
“THE JEWISH BOOK CONCERN”
1925
Copyright, 1925, by
Lee J. Levinger

Printed in the United States


TO MY PARENTS
WHO FIRST TAUGHT ME THE MEANING OF TOLERANCE
And all must love the human form,
In heathen, Turk, or Jew;
Where Mercy, Love, and Pity dwell,
There God is dwelling too.
William Blake.
PREFACE

This study, which was submitted as one of the requirements for
the degree of Doctor of Philosophy at the University of Pennsylvania,
has meant the assembling of personal and theoretical interests of
various types. It has two chapters of pure theory on which the
practical application is based. To the student of social philosophy or
sociology, then, chapters 1 and 2 will contain the essentials of the
study. The general reader, not interested in the technical basis but in
the conclusions, may prefer to omit these chapters from the reading,
and to proceed from the introduction directly to the applications of
this theory in American history and specifically to the problem of the
Jew in America, as developed in chapters 3 to 9.
Grateful acknowledgments are due to Professor Edgar A. Singer,
Jr., of the University of Pennsylvania, Professor Julius Drachsler of
the College of the City of New York, and Mr. Leon L. Lewis, Secretary
of the Anti-Defamation League, for their very stimulating aid, both
prior to and during the writing of this study, and to my wife for her
assistance in the preparation of the manuscript.
Lee J. Levinger
Wilmington, Delaware, May, 1925.
CONTENTS

Introduction: A Statement of the Problem
I. The “Group Mind,” a Definition and a Description
II. Groups in Contact
III. Intolerance
IV. American History, a Development of Groups
V. The World War and Its Aftermath
VI. The Ku Klux Klan and Other Group Reactions
VII. Anti-Semitism
VIII. The Retort to Anti-Semitism
IX. The Future of the American Mind
INTRODUCTION
A STATEMENT OF THE PROBLEM
The existence of an anti-Semitic movement in the United States
of America since the World War is a paradox that attracts attention
at once. The most ancient and most pervasive form of intolerance is
now at home in a nation founded by revolution and dedicated to the
principles of freedom and tolerance. How can such a movement exist
in such a nation? The apparent contradiction leads us at once into
the many contradictions of the psychology of large groups of human
beings, which both parallels and contradicts the simpler psychology
of their constituent individuals. This is a leading question, to answer
which we must go as deeply as we can into the mind of the group,
into the relation of groups to the smaller groups of which they are
composed and of those smaller groups to each other, into the
genesis and implications of tolerance and intolerance.
This theoretical study completed, we shall then have to verify the
principles there worked out by application to the difficult and crucial
problem of the present study. If a theory of group and sub-group
can explain the existence and the development of anti-Semitism in
America, it will have solved a problem of exceptional complexity and
significance, one central to the whole field. This will involve a study
of the mind of the American people, in brief outline, with its various
movements of intolerance in their bearing on the present one. It will
also necessitate a slight study of the various anti-Semitic examples,
historic and contemporary, from which the American movement
derives in part. It will conclude with a consideration of the future of
the American people as a united group, taking into view the
tendencies of the sub-groups within the bounds of their common
nation, or over-group.
Anti-Semitism is the modern form of the ancient prejudice against
the Jew; it began in Germany in 1871, directly after the Franco-
Prussian War, and bases its opposition to the Jews on the race
theory. Anti-Judaism is, of course, much older, as old as the people
against whom it was directed. In most ancient times, as represented
by the Egyptian taskmasters and the Haman of the Book of Esther, it
was like any other national hatred or prejudice. Later it took on a
distinctly religious coloring, so that we find a Philo going to Rome to
appeal for the Jewish colony in Alexandria or a Josephus writing a
defense of his people against Apion. With the growth of Christianity
into a persecuting body, anti-Judaism became strictly a religious
matter, based on the New Testament story that the Jews were
responsible for the death of Jesus. Medieval laws on the Jews were,
then, often based on the principle of expiation, such as the yellow
badge which distinguished the wearer when he left the compulsory
shelter of the Ghetto. A different form of religious motivation was
shown in the frequent accusations of desecrating the Host or of
using the blood of a Christian child in preparing the unleavened
bread of Passover, which appears in the Canterbury Tales and was
revived as recently as 1911 in the notorious Beilis case at Kiev,
Russia. Along with this went occasional mob outbreaks such as occur
against the negroes in our Southern states, and still more rarely
decrees of expulsion, which drove the entire Jewish population from
England in 1290, from Spain in 1492, and from other countries at
other times, for a longer or shorter period.
The actual applications of this religious anti-Judaism were far too
many to enumerate here, ranging from the prohibition of tilling the
soil to compulsory attendance at a Christian sermon, as in
Browning’s “Holy Cross Day.” Counteracting it were the frequent
intercourse and occasional intermarriage through the Middle Ages,
the paid protection of the Holy Roman Emperor for his
Kammerknechte, the toleration of the Moors and later of Holland,
finally the emancipation of the French Revolution on abstract
grounds of the Rights of Man. Religious discrimination was forbidden
in the American Constitution, so that anti-Judaism of the religious
type had no footing in the new nation, strong as it had previously
been in several of the colonies. In addition, the number of Jews in
America was very small, so that discrimination against them might
exist in principle but could have little exercise in practise. And those
few were often wealthy and cultured descendants of the old Spanish
Jewry. During most of the nineteenth century the Jews entering
the country met the same difficulties as other immigrants, with very
little variation.
But then the problem changed; the number of Jews increased
from 3,000 in 1800 to 250,000 in 1880. Some of these achieved
wealth and began to associate with non-Jewish social circles. The
opposition to them now became largely social. They were excluded
from many hotels and summer resorts, from clubs, college
fraternities and the like. This phase of the problem was often acute
but never important, and is here mentioned merely in passing,
though it will have its bearing on the theory to be developed. In
addition, the religious prejudice continued, similar to that between
Christian denominations but stronger, owing to the frequent teaching
of Jewish responsibility for the crucifixion. These two aspects of anti-
Judaism persisted as the only ones in America until after the World
War, and these were sporadic, and often opposed by the tendency of
our political democracy and by various groups of religious liberals.
Meanwhile, modern racialism had been born and with it modern
anti-Semitism, the attack on the Jew as a member of a different
race, inferior or at least unassimilable by the Aryan. Writers against
the Jew no longer turned for their weapons to Eisenmenger’s
“Entdecktes Judentum” of 1701, with its religious criticism and
personal strictures. The new classics are Werner Sombart’s “Die
Juden und das Wirtschaftsleben” and Drumont’s “La France Juive.”
An elaborate scientific basis has been constructed, on which a
movement of opposition was erected, apparently much the same as
that of the Inquisition or of Apion. One of the conclusions of the
present study will be that it is in fact the same, and that the racial
theory can be almost overlooked in estimating the actual causes and
processes of anti-Semitism. It would be an interesting, though not
essential task, to examine this racial theory in detail and determine
how much scientific authenticity it may possess. In Russia the
conditions of autocracy threatened by liberalism and war led to
official anti-Semitism, with pogroms or massacres of the Jews
actually led by army officers. In Germany the officialism and social
stratification led to discrimination against Jews in the appointment of
judges, university professors and army officers. In France anti-
Semitism became a part of royalism and clericalism, and from the
military and royalist group came the Dreyfus case. In England anti-
Semitism was chiefly literary; Hilaire Belloc proves the Jews to be
aliens who should all be sent to Palestine, while Gilbert K.
Chesterton visits Palestine and reports that the Jews there are
terrible creatures and ought to be excluded from the Holy Land!
But all this time there was no anti-Semitism, as a literary, political
or economic movement in the United States. That was a product of
the period after the World War. There was merely religious prejudice
of the orthodox and social ostracism of the elite among gentile
society. The Jew had not even attracted the special attention of the
various anti-alien movements in American history, owing to his small
numbers and frequent rapid Americanization. It seemed as though
anti-Semitism was a movement foreign to American life and
institutions. Now, however, the movement exists and may be
considered briefly in four phases.
1. The first to be considered is the attempt to limit the
percentage of Jews in American universities. The “numerus clausus,”
typical of Russia under the Czars, has been one of the favorite
projects of the anti-Semitic parties in various European countries,
working either through their representatives in the parliaments or
through their sympathizers in the universities themselves. Whether
the motive was to brand the Jew as inferior mentally, or to make him
so through lack of education, is hard to say—probably it is merely
another manifestation of the process which this paper aims to trace.
In American institutions of higher learning there has been a
growing problem of the increase of entering classes, as well as a
growing perplexity at the number of Jewish immigrants who seek an
advanced education. These young people often lack American
manners and background, standing out from the great mass of the
student body, whether for good or bad is immaterial. What more
natural than that some would attempt to solve the two problems at
once by excluding a certain percentage of these objectionable
persons, at the same time cutting down enrollment? I do not speak
of rumors that this purpose has been achieved in certain institutions
by personal interviews, psychological tests, and the like, even
though statistics seem to bear out this interpretation. I consider only
the Harvard incident, which is public and official.
In June 1922 President Lowell of Harvard, in his address at the
graduation exercises, drew attention to the double phase of the
problem, the increase of registration and the danger to the social
and personal standard of the university, and recommended its full
investigation by committees of the faculty and board of trustees of
the university. The sensation caused by this bringing into the open of
a subject long covertly agitated, especially in view of the large
Jewish population of Boston, and fairly large registration at Harvard,
was extreme. The matter came to an end April 9, 1923, when the
committee recommendation was accepted by the Board of Overseers
for the University. The report recommended:
In the administration of rules for admission Harvard College
maintains its traditional policy of freedom from discrimination on
grounds of race or religion. Concerning proportional representation,
your committee is unanimous in recommending that no departure be
made from the policy that has so long approved itself—the policy of
equal opportunity for all, regardless of race or religion. Any action
liable to interpretation as an acceptance of the principle of racial
discrimination would to many seem like a dangerous surrender of
traditional ideals.
The report even avoids recommending any test of personal fitness
which might be interpreted as a cover for racial or religious
discrimination.
2. A further expression of anti-Semitism appeared in the form of
books and magazine articles. “The Cause of World Unrest,” an
English book, was reprinted in 1920 by G. P. Putnam’s Sons of New
York; “The Protocols of the Meetings of the Zionist Men of Wisdom”
by Small, Maynard and Co. of Boston in the same year; “The Jews in
America” by Burton J. Hendrick, appeared as a series of articles in
the World’s Work, and was issued later as a book by Doubleday Page
and Co. of New York in 1923. Periodicals such as “The Searchlight”
of Atlanta, the “Fellowship Forum” of Washington, D. C., and “The
American Standard” of New York City (to mention only a few of a
large number) conducted an active campaign against Jews and
Catholics, which still continues.
Most conspicuous of all was the long series of articles on the
Jewish problem carried by the Dearborn Independent of Dearborn,
Mich., the personal organ of Mr. Henry Ford. This series began in
May, 1920; the four booklets containing their reprinted form are
dated, the first on November, 1920; the fourth, May 1922. They take
ostensibly the position that international finance, under the
leadership of certain Jews, is endeavoring to rule the world. Actually,
however, they use any anti-Semitic theme that comes to hand, from
the race theory to articles on the “Jewish liquor trust” and “the
Jewish aids of Benedict Arnold.” Their chief arsenal of material is the
Protocols of the Learned Elders of Zion, referred to above, a
purported record of secret meetings held by leaders of world Jewry
with the object of overthrowing the gentile nations and ruling the
world themselves. This work first saw the light in Russia in 1901 and
was utilized in 1905 as part of the propaganda against the abortive
revolution of that year; it was the work of one Serge Nilus. Later
study has shown it to be a forgery, largely copied from a French
political pamphlet directed against Napoleon III and published in
Brussels in 1865 by Maurice Joly under the title, “Dialogues in Hell
between Machiavelli and Montesquieu”! The Russian editions of this
work, and those in German, as well, included virulent attacks on
Britain and America as representatives of liberalism, and therefore of
Judaism; naturally, these have been omitted from the English
versions.
3. This agitation could not remain theoretical—in fact, probably
the theory was itself a late product of a broader tendency. The
Johnson immigration act, setting the quota of immigrants to be
admitted to the United States on the basis of their proportion in this
country in 1890, was avowedly planned on a racial basis to
encourage immigrants from northern and western Europe, and
exclude those from eastern and southern Europe and from other
continents. Secretly there seems to have been both anti-Jewish and
anti-Catholic sentiment involved, as certain partisan publications
boast quite openly.
By far the most significant expression of anti-Semitism in the
United States is the Ku Klux Klan, which will later be considered in
some detail. At this point it is sufficient to point out that the Klan
was organized in 1915 by William J. Simmons of Atlanta, Ga., and
became a national movement in 1920. Its name and much of its
ritual are taken from the Ku Klux Klan of 1867–71, but its motives
are quite different, for the old Klan was a local movement intended
to protect the defeated Confederacy, to overawe the negroes and to
oppose the North; while the modern Klan is not sectional, but in
every section opposes the negro, the Jew, the Catholic and the
foreign-born. Its membership is exclusively “white, gentile,
Protestant American” and it therefore claims to be the only “one
hundred per cent. Americans”. The Klan defends its purpose and
attacks the proscribed groups by business boycott, political
opposition, sometimes even by threats or by physical violence. The
Klan is the most important symptom at hand of the nature of anti-
Semitism in the United States, besides being a most significant type
of social grouping and of social motive.
4. A final type of anti-Semitism in America was a direct
importation from Europe through a group of Russian emigrés, some
of them living in this country as private citizens, others as employees
of the section on Russia of the Department of State. These men
were bitterly anti-Soviet, anti-radical, and (whether for propaganda
purposes, or through the convictions of the Russian aristocracy as a
whole) bitterly anti-Semitic. Anti-Semitism is an article in the creed
of every reactionary movement in Europe, with the single exception
of the Italian Fascisti, and is strongest of all among the Russians. It
seems to have been these people who persuaded Mr. Henry Ford of
the authenticity of the Protocols, and introduced these to America as
a whole. They seem also to have been active in the anti-radical
agitation of the post-war period, which tried to identify foreigner,
radical and Jew in the mind of the American people, and to attribute
the Russian revolution, the Bolshevist government and the radical
groups in America, alike to insidious Jewish influence.
As this tendency was not as public as the others, I give some
proof of its existence. It was discussed in Hearst’s International
Magazine in 1923 in a series of articles by the editor, Norman
Hapgood; and in the B’nai B’rith Magazine of October and November,
1924, in two articles by Jacob Spolansky, a former agent of the
United States Department of Justice, who was employed to hunt
down radicals and if possible to find Jews among them. As the most
official statement, I quote Mr. Louis Marshall, president of the
American Jewish Committee, in his annual report to that body,
1
delivered November 13, 1921.
The committee conducted an investigation with a view to
discovering the identity of those who instigated the attacks against
the Jews of America. It was found that they consisted of a group of
Russian emigrés who had wormed themselves into the confidence of
some Americans who, in turn, had succeeded in securing the
assistance of others whose co-operation was given either because
they were gullible and believed the fantastic inventions of men
schooled in intrigue in the Russian police system, or because they
already cherished ill-will against Jews and were ready to assist in any
movement through which they could satisfy real or fancied grudges.
2
In the report of the same body, October 19, 1919, reference is
made to the hearing before the sub-committee of the Judiciary
Committee of the United States Senate in February 1919, when—
Dr. George Simons, who had been for a number of years in Russia,
testified regarding the alleged activities of Jews in the Bolshevist
movement in Russia and stated that the present conditions there are
due, in large part, to the activities of Yiddish agitators from the East
Side of New York City who went to Russia immediately following the
overthrow of the Czar. Dr. Simons stated further that the Bolshevist
movement in Russia was being supported financially and morally by
certain elements on the East Side of New York City.
There is, then, an anti-Semitic movement in America, and has
been since 1919 or 1920. Its philosophy of racialism, exclusiveness
and “hundred per cent.” Americanism, is derived largely from the
Voelkische parties of Germany and other nations of Europe, which
lay great stress on Aryan race and especially on its Nordic or
Teutonic branch. The extreme of this view is found in the
apparently well-reasoned argument of Burton J. Hendrick, who
attempts to prove that the Spanish and German Jews were desirable
because white, but that the Russian Jews are undesirable
immigrants because they are descended from the Chazars, a Tartar
tribe which embraced Judaism in the ninth century. The premises of
this writer seem untenable, and the conclusions do not necessarily
follow from them. Much of this anti-Semitic literature and public action
seems to be based on similar rationalizations of intolerance, of group
prejudice.
In studying this anti-Semitic movement in America as a crucial
example of the relations of group and sub-group, I stand in the
contrary danger, that of rationalizing the inferiority complex of a
persecuted group. My only justification for facing this danger is that
nobody can approach this type of problem without one danger or
the other, and the subject is too vital to be entirely neglected. I can
only hope that my analysis of the underlying problem of the nature
of human groups and of their interrelations may be made in such a
scientific spirit that the application of my theory to the special
problem of anti-Semitism in the United States may be of some value
in the clearing up of this great field of human action.
CHAPTER I.
THE “GROUP MIND”
A DEFINITION AND A DESCRIPTION
The causes of intolerance rest, not in what men say but in what
they do. The reasons alleged for dislike or suspicion of the Jew are
valuable merely for showing a state of mind in the anti-Semite
himself, not for revealing the actual reasons for his attitude. For that
reason I shall disregard these reasons very largely in searching for
the causes of anti-Semitism in America. Instead, I shall turn to the
field of social study to find out how groups of men act toward one
another, and why and under what circumstances intolerance is one
of their by-products. I shall apply to the phenomena of group life the
method of behaviorism, now being adopted by sociology from its
original field of psychology, in such definitions as that of E. C.
3
Lindeman: “Sociology is the science of collective behavior.”
1.
The prevailing view of students of society seems now to be that
society is a natural phenomenon on the mental plane. Human
society is not now regarded, as by Buckle, as a reflection of
environment, even though the importance of physical background
and racial constitution must be recognized. As Charles A. Ellwood
4
says, “Society is a group of psychically interacting individuals.” “The
5
essential element in the social process is the psychical element.”
That is to say, mental material (instincts, emotions, feeling, and
ideas) is the plane on which groups of individuals combine into
social structures, operate in social functions, develop to social
progress. Relations between individuals (except for the limited
biological function) are mental relations, carried on through physical
media such as postures, speech and writing.
These mental interactions of human beings are not an artificial
construct from primitive egoism by the social contract or any other
method. William MacDougall is almost alone in holding that the
social sentiments are derived from the self-regarding ones through
the operation of the tender emotion and the parental instinct.
6
Hobhouse says: “The conception of a primitive egoism on which
sociality is somehow overlaid is without foundation in either biology
or psychology.” John Dewey puts this view most forcibly:
7
The fact is that the life, the experience, of the individual man, is
already saturated, thoroughly interpenetrated, with social inheritance
and references.... Education, language, and other means of
communication are infinitely more important categories of knowledge
than any of those exploited by absolutists. And as soon as the
methodological battle of instrumentalism is won ... the two services
that will stand to the credit of instrumentalism will be calling attention
first to the connection of intelligence with a genuine future, and
second, to the social constitution of personal, even of private
experience, above all of any experience that has assumed the
knowledge form.
And Ellwood adds—expressing here the general opinion of both
sociologists and modern social philosophers—“All human
consciousness is socially conditioned.... This is as true of the racially
inherited aspects of consciousness—the feeling-instincts—as it is of
the acquired traits.” Man is a social animal and his sociality is one of
the few unescapable things about him. He is born in some kind of a
social group; he gets the most of his ideas from his association with
others; his whole development is a give-and-take in which the take
is from the first, and often remains, the greater element.
But the recognition of this fact does not bind us to any one
explanation of it. We do not need to accept the “consciousness of
kind” of Giddings, the “herd instinct” of Trotter, or the “imitation” of
Tarde,—in fact, we may very well consider that there is no one
principle to explain so universal and complex a phenomenon; that
these terms and others like them are in no sense explanations, but
merely different words for the same fact, that man is a naturally
gregarious or social being. We may rather turn to the more
generalized modes of expressing this conception, the group mind or
general will, as developed by Durkheim, Wundt and in our day by
Baldwin, MacDougall, and others.
2.
Before attacking this problem directly, I must clear away several
misconceptions of the “group mind,” which I cannot accept as a part
of this theory. First, this thesis need not exclude the operation of
physical and biological forces on social groups, any more than it
excludes their operation on any individual, who is also a
psychological unit. Society may well be a unit, just as the individual
is, in a world of varying forces—climate, birth rates, and the like.
Second, a theory of group mind may be empirical, and need not
necessarily rest on an idealistic conception of the Volksgeist. By
adopting the historical method, rather than the statistical, relying on
values to indicate our problem rather than trying to express it in
terms of natural science, we shall find ourselves treating the theory
of the group in a realistic and empirical way, eschewing the
dogmatism of applying a priori principles to human material, and the
equal fallacy of considering minds in the same terms as chemical
8
elements.
Third, a modern social psychology need not be a literal
transcription of Durkheim or Wundt, relying on an antiquated
psychology for its analogies and its basic conceptions. A theory of
group mind today must recognize that personality is not always a
unity, that it is never a complete unity; the vast field of the
unconscious in mental life has just been opened to view. Both of
these conceptions apply to the mental life of men in great masses as
truly as when alone. Neither the individual nor the group is
something hard, fixed and static; neither can be summed up as a
group of faculties or a system of ideas. Both individual and group
9
must be conceived in process, to take the words of Lindeman, as
“the total equipment with which man responds to his environment,
all that enters into behavior from the side of human nature.”
Some views of group mind are vitiated for our present purpose by
the narrow limits they impose, or by the one-sided way in which
they arrive at their definitions. This applies especially to those who
use the mob as the typical group and consider “crowd-mindedness”
(to use Everett Dean Martin’s term) as a synonym of sociality. The
crowd, the herd, the mob are various terms for an exceptional type
of group of human beings, bound together by physical presence,
transformed by a powerful emotion, launched finally into unified and
10
often violent action. But as Baldwin says: “The mind of the crowd
is essentially a temporary, unorganized, ineffective thing.... The mob
is a by-product of society, it is the exaggeration of the normal.”
Finally, the group mind need not be expressed entirely in terms of
instinctive adaptation, any more than the mind of the individual;
either may have many types, may be instinctive or impulsive or
rational, may have a growing sense of rationality and a growing
power of independent, deliberate action. In opposition to
MacDougall, with his elaborate system of instincts and sentiments,
we may place the vast majority of students of the problem, Cooley,
Platt, Ellwood, Baldwin, and so on. Even when the members of a
group all use reason to a very high degree, they still constitute a
group if they have organization and some method of reaching a
general decision, as in a congress, a national association of
scientists, or a business corporation.
Obviously, human beings form many kinds of groups, and there
would then, on an empirical basis, be many varieties of group minds.
Individuals fall into many classes, as we all know, primitive and
cultured, ignorant and educated, the infant, the child and the adult,
the moron and the genius. So with the group. There are large and
small groups, from families to nations; temporary and permanent
ones, from the theatre audience to the church; simple and complex,
from town meeting to a Federal union, comprising states, counties,
cities and townships; unorganized and organized; groups founded on
physical presence, like a baseball team, and international bodies of
scientists or philosophers who may form “a school of thought” but
may never hold a meeting. The study of these various types is not
only interesting in itself; it may help us in formulating the principle
of the mind of the group as a whole. To begin with the definition of
the primitive group by Franz Boas:
11
There are a number of primitive hordes to whom every stranger
not a member of the horde is an enemy, and where it is right to
damage the enemy to the best of one’s power and ability, and if
possible to kill him. This custom is based largely on the idea of the
solidarity of the horde, and on the feeling that it is the duty of every
member of the horde to destroy all possible enemies.... The feeling of
the fellowship in the horde corresponds to the feeling of unity in the
tribe, to a recognition of bonds established by a neighborhood of
habitat, and further on to the feeling of fellowship among members of
nations.
12
“He who is not with me is against me,” said Jesus for the religious
group. How far we have proceeded from the horde in our civilized
nations, and how near we are to it still in the essential character of
the mind of the group!
3.
Does the group mind exist? Not as a super-consciousness,
external to the individuals composing it—that view has been
discarded long ago. But as a category which is needed to explain
many phenomena, and which we can then proceed to study and
explain in greater detail, a term with pragmatic value, such as “life”
or “mind.” “Life” is no longer used as a principle of explanation, as a
vital principle which is infused into dead matter, but life exists, for all
that, and we can see its effects and study them. “Mind” is not
something separate and distinct from the body in which it dwells or
from the world in which it acts, but we know that mind is a useful
and necessary category in which to include a whole phase of living
being, especially of human life. “Group Mind” is the same sort of
category as these. Just as mind inheres in the neurones and is
coincident with the chemical changes in them, and yet cannot be
summed up by chemical changes; so group mind inheres in the
brains of individuals and is coincident with individual ideas and acts,
yet cannot be summed up as so many individual responses but as
the unified response of a group of persons at once.
Morris Ginsberg, in his Psychology of Society, opposes any type of
group theory, as he sees only individuals in a social environment; he
holds that the group may have unity of content but not of process,
of ideas and ideals but not of mind. Floyd H. Allport speaks of “The
13
group fallacy,” “the error of substituting the group as a whole as a
principle of explanation in place of the individuals in the group,” to
14
which Emory S. Bogardus replies in his discussion that “if there is
a group fallacy, there is also an individual fallacy.”
On the other hand, so radical a behaviorist as E. C. Lindeman
15
remarks, “The group is a plurality of individuals, but what the
16
group does is not plural but singular.” “From the purely descriptive
point of view, the group becomes a new quality.” Dr. M. M. Davis
17
puts it this way: “Millions of brain cells are co-ordinated to think as
one brain. Psychology tries to tell how. Millions of brains co-ordinate
themselves and function in many ways as one brain. The how of that
marvel is for sociology.” Giddings calls the group mind “the concert
of thought, emotion and will” of individual minds. Cooley says:
18
“The unity of the social mind consists not in agreement but in
organization.” Ellwood phrases it somewhat differently:
19
The only unity we have in society is a unity of process. The
individual consciousness is unified both structurally and functionally....
There is a collective mental life, but no social mind in the same sense
in which there is an individual mind.
Dr. Baldwin sums up his view in the last sentences of the Social
and Ethical Interpretations:
20
Society is the form of natural organization which ethical
personalities come into in their growth. Ethical personality is the form
of natural development which individuals grow into who live in social
relationships. The true analogy, then, is not that which likens it to a
physiological organism, but rather that which likens it to a
psychological organization.
And so, if this were primarily a historical study, I might go over
many similar and differing theories, which consider the group as a
unity on the mental plane, that is, in one sense or another, as a
group mind.
The material is still being collected for this study, the essential
points of view still being defined, and such important factors as
instinct and intelligence are still being redefined with the rapid
progress of science today. As several of the terms cited above
suggest, the difference between the individual as a mind and the
group of individuals as a mind is always given and must always be
given in terms of structure. In the words of Lindeman:
21
The individual may be viewed as an integration of functioning
organs, and the group merely an integration of functions.... There can
be nothing organic about society or a group; there can be only a
series of relations, the results of specific responses to specific
situations.
Not to cite more opinions on a point on which there seems general
agreement, we may take it for granted that the chief, perhaps the
only difference historically pointed out between the mind of one man
and of a group of men is that the man has a brain and a nervous
system, while the group has neither, but operates apparently
through the brains and nervous systems of its members. But in their
functioning, in their activities, the mind of the man and of the nation
or other group are so similar as to be almost indistinguishable.
Of course, this distinction depends, finally, on the definition of
mind which we are prepared to accept. Dennes gives an adequate
summary and criticism of Durkheim, for instance, who considered
collective mind to consist of the collective ideas or representations of
a society; and of Wundt, who considered mind an integration of
processes, not of ideas, and therefore sought for the group mind in
the collective results of group mental process, in speech, religion
and custom. But Dennes himself seems confused by the need of
defining mind without regard to bodily structure. He says:
22
“Individual minds or persons have or produce bodies as well as
objective mental products. But social groups are not minds and have
no bodies. They are associations of minds.” MacDougall defines mind
23
as “An organized system of mental or purposive forces,” and
continues, “In the sense so defined, every highly organized human
society may be properly said to possess a group mind.” While
MacDougall’s definition seems circular in nature, it still recognizes
that a functional definition of mind can make no distinction of
structure, whether any particular mind is associated with one or
24
many bodies. Lindeman calls mind “the total equipment with
which man responds to his environment”, which seems more than
one can accept, for “total equipment” would include hands and feet,
as well as mind. A more precise statement of the same general tenor
appears in Dr. Singer’s Mind as Behavior:
25
Consciousness is not something inferred from behavior; it is
behavior. Or, more accurately, our belief in consciousness is an
expectation of probable behavior based on an observation of actual
behavior, a belief to be confirmed or refuted by more observation, as
any other belief in a fact is to be tried out.
Thus, any functional definition of mind that has no reference to brain
or nervous system, must apply and does apply to the group of
persons in exactly the same sense as to the single individual. If
there is “unified behavior,” if there is “organized system of purposes,”
if there is “response to environment,” then we have mind, whether
the behavior, response, or purpose dwell in one or two or many
bodies.
One question remains, and a most perplexing one. How can one
distinguish between a group mind and a group purpose, or the
accidental coincidence of many minds and many purposes? A flock
of migrating birds has no group mind—each bird would travel south
at the same time and the same rate of speed, were there no flock at
all. Or still lower forms, such as unicellular organisms, may move
simultaneously to warmer waters. On the other extreme, the hordes
of Huns led by Attila had a group purpose in their migration; the
leader gave the word, and the followers leaped together to their
horses’ backs to ride from Asia into Europe. But when a half million
negroes migrate from the southern to the northern states in a few
years, coming family by family, as the opportunity affords, yet with a
steady tendency of drift, is that a group mind or the accidental
agreement of many individuals? Is it mind or minds? And the same
problem is present in a declaration of war, or the victory of a
football team, or the adoption of a new fashion of clothes. When does
the group act and when the individual members? When do we have
the mind of all, when the mind of each?
To this crucial problem I must present one qualification and one
answer. The qualification is: the group never acts except through its
constituent individuals, any more than the mind acts without its
brain cells and bodily organs. The difference between the act of all
and the act of each is not a complete disjunction but a difference of
emphasis, of interpretation, of purpose. When the army marches,
every soldier goes ahead; when the nation elects its president, the
millions of voters cast their ballots; when the church adopts a creed
or reforms its ritual, the many believers experience a change in their
faith and their hope. Not that group opinion need be unanimous; it
is rather a mode of general consent by which unified action can arise
out of conflicting opinions, by which many individuals are absorbed
into a group mind. Thus in many, perhaps in most cases we cannot
say definitely: this is group mind, not personal preference, or this is
individual action, and the group has nothing to do with it. The
problem is much like that which faced Kant in defining moral action,
when the demands of the universal law may often coincide with
personal preference, perhaps even with the greatest and most
appealing happiness.
And our answer may be similar to his. Kant turned to the test
case. We know we have morality, said he, when duty and pleasure
are opposed, and the man obeys the voice of duty. Similarly, we can
say: we know that we have group mind and purpose when the
pleasure of the individual is opposed to the will of the group, and the
individual gives up his purpose for that of the army, the nation, the
church. When the soldier or the martyr gives up his urge for self-
preservation and offers his body to the bullets of the enemy or the
stake of the persecutor, then we know that he has abdicated his
individuality and is acting only as a member of the greater whole.
Lindeman, whose study is based on observation of farmers’ co-
operative societies, presents a contrary view:
26
It was formerly asserted that the chief significance of a group
consisted in the fact that the individuals comprising it had sacrificed
certain individual prerogatives, rights, privileges, etc., in order to
achieve the larger collective end. But it could not be discovered that
the farmers who became members of the co-operative associations
had done anything of the sort. On the contrary, they were chiefly
interested in enhancing their own individual interests; they desired a
larger income from the sale of their products and the co-operative
movement promised exactly this.
If this were true, these associations would constitute merely a set of
books, not a group of persons. But we see further on in the same
book that the co-operative associations demanded loyalty even at
the cost of whim or momentary interest; they enforced their
contracts with the farmer by which he agreed to sell only through
the association. If he got tired of waiting for his money, or if a dealer
placed a financial premium on disloyalty, still he was expected to be
loyal to the group. Finally, the group had to take cognizance of other
aspects of the human life of its members besides the sale of their
cotton or tobacco; it built up personal and social groupings for the
entire family; it became a truly unified group mind, through the slow
process of integration of individuals and of local groups, resting on a
basis of personal friendship. Thus, even in an interest group, a true
group mind is developed through participation and sacrifice.