Lecture 7 Bootstrapping, Randomization, 2B-PLS
Null hypothesis: usually the hypothesis that the statistics estimated from two
or more samples result from random samplings of the same population. In other
words, the hypothesis that the samples do not come from different populations.
These tests randomize the data with respect to the statistic being measured.
The observed value from the real data is then compared with the randomized
values to see whether it falls within their range.
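The randomization idea can be sketched in Python with NumPy. This is a minimal illustration, not the lecture's own code: the function name, the choice of statistic (absolute difference in means), and the number of permutations are all assumptions made for the example.

```python
import numpy as np

def permutation_test(sample1, sample2, n_perm=999, seed=0):
    """Randomization test for a difference in means (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(sample1, dtype=float)
    b = np.asarray(sample2, dtype=float)
    observed = abs(a.mean() - b.mean())      # statistic from the real data
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        # Shuffle the pooled values, i.e. reassign group labels at random.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:a.size].mean() - shuffled[a.size:].mean())
        if diff >= observed:
            count += 1
    # The observed arrangement is counted as one of the possible permutations.
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value

# Usage with made-up measurements:
obs, p = permutation_test([4.1, 5.0, 4.7, 5.3], [6.2, 6.8, 5.9, 7.1])
print(obs, p)
```

A small p-value means the observed difference falls outside the range of most randomized values, which is evidence against the null hypothesis.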
Jackknife. Similar to the bootstrap, but instead of resampling at random, each
individual data point is left out in turn and the test statistic is
recalculated each time to determine the standard error.
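A leave-one-out calculation of this kind might look as follows in Python with NumPy. The function name is hypothetical, and the variance formula is the standard jackknife estimator, which the slide itself does not spell out.

```python
import numpy as np

def jackknife_se(data, statistic=np.mean):
    """Jackknife standard error of a statistic (hypothetical helper)."""
    x = np.asarray(data, dtype=float)
    n = x.size
    # Recompute the statistic n times, leaving out one data point each time.
    leave_one_out = np.array([statistic(np.delete(x, i)) for i in range(n)])
    # Standard jackknife variance: (n - 1) / n * sum of squared deviations
    # of the leave-one-out estimates from their mean.
    variance = (n - 1) / n * np.sum((leave_one_out - leave_one_out.mean()) ** 2)
    return np.sqrt(variance)

# Usage with made-up data:
print(jackknife_se([2.0, 4.0, 4.0, 5.0, 7.0, 9.0]))
```

For the mean, this estimate coincides with the familiar s / sqrt(n); the jackknife is more useful for statistics that have no simple standard-error formula.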
2. Pool the samples and randomly draw new samples 1 and 2 with replacement.
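The pooling-and-redrawing step can be sketched in Python with NumPy. Only the step itself comes from the slide; the surrounding procedure is an assumption made for illustration (the statistic is the absolute difference in means, and the p-value is the share of bootstrap differences at least as large as the observed one), and the function name is hypothetical.

```python
import numpy as np

def pooled_bootstrap_test(sample1, sample2, n_boot=999, seed=0):
    """Bootstrap test built around the pooling step (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(sample1, dtype=float)
    b = np.asarray(sample2, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])          # pool the two samples
    count = 0
    for _ in range(n_boot):
        # Draw new samples 1 and 2 from the pool WITH replacement,
        # keeping the original sample sizes.
        new1 = rng.choice(pooled, size=a.size, replace=True)
        new2 = rng.choice(pooled, size=b.size, replace=True)
        if abs(new1.mean() - new2.mean()) >= observed:
            count += 1
    p_value = (count + 1) / (n_boot + 1)
    return observed, p_value

# Usage with made-up measurements:
print(pooled_bootstrap_test([4.1, 5.0, 4.7, 5.3], [6.2, 6.8, 5.9, 7.1]))
```

Drawing with replacement is what separates this from a permutation test, which only reshuffles the pooled values without replacement.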
Review
Eigenvalues: variance on each PC axis
(In Mathematica: Eigenvalues[CM])
Eigenvectors: loading of each original variable on each PC axis
(In Mathematica: Eigenvectors[CM])
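For readers working outside Mathematica, the same quantities can be obtained in Python with NumPy. This is a sketch under assumptions: CM is taken to be the covariance matrix of the variables, the function name is hypothetical, and the data are random numbers generated only for the example.

```python
import numpy as np

def pca_from_covariance(data):
    """Eigen-decomposition of the covariance matrix (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    CM = np.cov(data, rowvar=False)              # variables are in columns
    eigenvalues, eigenvectors = np.linalg.eigh(CM)
    # eigh returns ascending order; sort so that PC 1 has the largest variance.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    # NumPy stores eigenvectors as COLUMNS; Mathematica's Eigenvectors[]
    # returns them as rows.
    eigenvectors = eigenvectors[:, order]
    scores = (data - data.mean(axis=0)) @ eigenvectors
    return eigenvalues, eigenvectors, scores

# Usage with made-up data: 50 objects, 3 variables of unequal variance.
rng = np.random.default_rng(1)
example = rng.normal(size=(50, 3)) * [3.0, 1.0, 0.5]
values, vectors, pc_scores = pca_from_covariance(example)
print(values)
```

The variance of the scores on each PC axis equals the corresponding eigenvalue, and each eigenvector column holds the loadings of the original variables on that axis.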
2. Regression parameters (slopes, intercepts): indicate the axis in shape space associated
with the factor, useful for modeling the aspect of shape associated with the factor
3. Correlation coefficient (R): indicates the strength of the association between the variance
and the factor
[Figure: regression plot with axes labeled "PC 1 Scores" and "Size"; a = 2.0, b = 0.5, R² = 0.85]
R² also ranges from 0.0 (0% of the variance explained) to 1.0 (100% explained).
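How the regression parameters and R² are obtained from data can be shown with an ordinary least-squares fit in Python with NumPy. This is a generic sketch with a hypothetical function name and made-up numbers; it is not the implementation of ShapeRegress[].

```python
import numpy as np

def simple_regression(x, y):
    """Least-squares line with R-squared (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 polynomial fit
    predicted = intercept + slope * x
    ss_residual = np.sum((y - predicted) ** 2)   # unexplained variation
    ss_total = np.sum((y - y.mean()) ** 2)       # total variation
    r_squared = 1.0 - ss_residual / ss_total
    return intercept, slope, r_squared

# Usage with made-up size and score values:
size = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
score = [2.4, 3.1, 3.4, 4.2, 4.4, 5.1]
print(simple_regression(size, score))
```

For a regression on a single factor, R² is the square of the correlation coefficient R, which is why it is bounded by 0.0 and 1.0.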
Example of ShapeRegress[]
ShapeRegress[proc, x, 2]
[Figure: output graph of ShapeRegress[], with an axis labeled "PC 2". The graph shows one dimension of the regression.]
[Figure: diagrams of Block 1 and Block 2 with their regression axes, showing the correlation between blocks on each axis]
This function performs a two-block partial least squares analysis following the methodology of
Rohlf and Corti (2000). Two blocks of data are given to the function, along with a list of two
strings indicating the type of data. Allowable types are "Shape" (Procrustes superimposed
coordinates), "Standardized" (independent variables with different units of measurement that
need to be standardized), and "Unstandardized" (independent variables with the same unit of
measurement that do not need to be standardized). An optional argument is the number of the
PLS axis to plot in the output graph. By default PLS 1 is plotted.
Arguments:
• data1 and data2 are two blocks of variables, one or both of which can be a matrix of
Procrustes superimposed landmark coordinates with objects in rows and coordinates in
columns.
• {type1, type2} is a list of two data types, in quotation marks. Allowable types are
"Shape" (Procrustes superimposed coordinates), "Standardized" (independent variables
with different units of measurement that need to be standardized), and
"Unstandardized" (independent variables with the same unit of measurement that do not
need to be standardized).
• PLS is an integer indicating which PLS axis to plot.
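The computation behind a two-block PLS can be sketched in Python with NumPy as a singular value decomposition of the cross-covariance matrix between the two blocks, which is the approach usually associated with Rohlf and Corti (2000). This is an illustration under assumptions, not the function described above: the name two_block_pls is hypothetical, plotting is omitted, and "Shape" data are assumed to be already Procrustes superimposed, so they are only mean-centered.

```python
import numpy as np

def two_block_pls(data1, data2, types=("Unstandardized", "Unstandardized")):
    """Two-block PLS via SVD of the cross-covariance matrix (sketch)."""
    def prepare(block, kind):
        block = np.asarray(block, dtype=float)
        centered = block - block.mean(axis=0)
        if kind == "Standardized":
            # Different units: scale each variable to unit variance.
            centered = centered / block.std(axis=0, ddof=1)
        elif kind not in ("Shape", "Unstandardized"):
            raise ValueError("unknown data type: " + str(kind))
        return centered

    b1 = prepare(data1, types[0])
    b2 = prepare(data2, types[1])
    n = b1.shape[0]                              # objects are in rows
    cross_cov = b1.T @ b2 / (n - 1)              # block 1 by block 2
    u, singular_values, vt = np.linalg.svd(cross_cov, full_matrices=False)
    axes1, axes2 = u, vt.T                       # columns are PLS axes
    scores1 = b1 @ axes1
    scores2 = b2 @ axes2
    # Correlation between the blocks on each PLS axis. Trailing axes with
    # near-zero variance (e.g. in shape data) can give unstable values.
    correlations = np.array([
        np.corrcoef(scores1[:, k], scores2[:, k])[0, 1]
        for k in range(singular_values.size)
    ])
    return singular_values, axes1, axes2, scores1, scores2, correlations

# Usage with made-up data: both blocks depend on one hidden variable t.
rng = np.random.default_rng(0)
t = rng.normal(size=40)
block1 = np.outer(t, [1.0, -1.0, 0.5, 2.0]) + rng.normal(scale=0.05, size=(40, 4))
block2 = np.outer(t, [2.0, 1.0, -1.0]) + rng.normal(scale=0.05, size=(40, 3))
result = two_block_pls(block1, block2, ("Standardized", "Unstandardized"))
print(result[5])   # correlation between blocks on each PLS axis
```

Each singular value is the covariance between the two blocks' scores on that axis, so PLS 1 is the pair of axes along which the blocks covary most strongly.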