1. APPLIED MULTIVARIATE METHODS
1.1 An Overview of Multivariate Methods
Variable- and Individual-Directed Techniques
Creating New Variables
Principal Components Analysis
Factor Analysis
Discriminant Analysis
Canonical Discriminant Analysis
Logistic Regression
Cluster Analysis
Multivariate Analysis of Variance
Canonical Variates Analysis
Canonical Correlation Analysis
Where to Find the Preceding Topics
1.2 Two Examples
Independence of Experimental Units
1.3 Types of Variables
1.4 Data Matrices and Vectors
Variable Notation
Data Matrix
Data Vectors
Data Subscripts
1.5 The Multivariate Normal Distribution
Some Definitions
Summarizing Multivariate Distributions
Mean Vectors and Variance-Covariance Matrices
Correlations and Correlation Matrices
The Multivariate Normal Probability Density Function
Bivariate Normal Distributions
1.6 Statistical Computing
Cautions About Computer Usage
Missing Values
Replacing Missing Values by Zeros
Replacing Missing Values by Averages
Removing Rows of the Data Matrix
Sampling Strategies
Data Entry Errors and Data Verification
1.7 Multivariate Outliers
Locating Outliers
Dealing with Outliers
Outliers May Be Influential
1.8 Multivariate Summary Statistics
1.9 Standardized Data and/or Z Scores
Exercises
2. SAMPLE CORRELATIONS
2.1 Statistical Tests and Confidence Intervals
Are the Correlations Large Enough to Be Useful?
Confidence Intervals by the Chart Method
Confidence Intervals by Fisher's Approximation
Confidence Intervals by Ruben's Approximation
Variable Groupings Based on Correlations
Relationship to Factor Analysis
2.2 Summary
Exercises
3. MULTIVARIATE DATA PLOTS
3.1 Three-Dimensional Data Plots
3.2 Plots of Higher Dimensional Data
Chernoff Faces
Star Plots and Sun-Ray Plots
Andrews' Plots
Side-by-Side Scatter Plots
3.3 Plotting to Check for Multivariate Normality
Summary
Exercises
4. EIGENVALUES AND EIGENVECTORS
4.1 Trace and Determinant
Examples
4.2 Eigenvalues
4.3 Eigenvectors
Positive Definite and Positive Semidefinite Matrices
4.4 Geometric Descriptions (p = 2)
Vectors
Bivariate Normal Distributions
4.5 Geometric Descriptions (p = 3)
Vectors
Trivariate Normal Distributions
4.6 Geometric Descriptions (p > 3)
Summary
Exercises
5. PRINCIPAL COMPONENTS ANALYSIS
5.1 Reasons for Using Principal Components Analysis
Data Screening
Clustering
Discriminant Analysis
Regression
5.2 Objectives of Principal Components Analysis
5.3 Principal Components Analysis on the Variance-Covariance Matrix Σ
Principal Component Scores
Component Loading Vectors
5.4 Estimation of Principal Components
Estimation of Principal Component Scores
5.5 Determining the Number of Principal Components
Method 1
Method 2
5.6 Caveats
5.7 PCA on the Correlation Matrix P
Principal Component Scores
Component Correlation Vectors
Sample Correlation Matrix
Determining the Number of Principal Components
5.8 Testing for Independence of the Original Variables
5.9 Structural Relationships
5.10 Statistical Computing Packages
SAS® PRINCOMP Procedure
Principal Components Analysis Using Factor Analysis Programs
PCA with SPSS's FACTOR Procedure
Summary
Exercises
6. FACTOR ANALYSIS
6.1 Objectives of Factor Analysis
6.2 Caveats
6.3 Some History of Factor Analysis
6.4 The Factor Analysis Model
Assumptions
Matrix Form of the Factor Analysis Model
Definitions of Factor Analysis Terminology
6.5 Factor Analysis Equations
Nonuniqueness of the Factors
6.6 Solving the Factor Analysis Equations
6.7 Choosing the Appropriate Number of Factors
Subjective Criteria
Objective Criteria
6.8 Computer Solutions of the Factor Analysis Equations
Principal Factor Method on R
Principal Factor Method with Iteration
6.9 Rotating Factors
Examples (m = 2)
Rotation Methods
The Varimax Rotation Method
6.10 Oblique Rotation Methods
6.11 Factor Scores
Bartlett's Method or the Weighted Least-Squares Method
Thompson's Method or the Regression Method
Ad Hoc Methods
Summary
Exercises
7. DISCRIMINANT ANALYSIS
7.1 Discrimination for Two Multivariate Normal Populations
A Likelihood Rule
The Linear Discriminant Function Rule
A Mahalanobis Distance Rule
A Posterior Probability Rule
Sample Discriminant Rules
Estimating Probabilities of Misclassification
Resubstitution Estimates
Estimates from Holdout Data
Cross-Validation Estimates
7.2 Cost Functions and Prior Probabilities (Two Populations)
7.3 A General Discriminant Rule (Two Populations)
A Cost Function
Prior Probabilities
Average Cost of Misclassification
A Bayes Rule
Classification Functions
Unequal Covariance Matrices
Tricking Computing Packages
7.4 Discriminant Rules (More than Two Populations)
Basic Discrimination
7.5 Variable Selection Procedures
Forward Selection Procedure
Backward Elimination Procedure
Stepwise Selection Procedure
Recommendations
Caveats
7.6 Canonical Discriminant Functions
The First Canonical Function
A Second Canonical Function
Determining the Dimensionality of the Canonical Space
Discriminant Analysis with Categorical Predictor Variables
7.7 Nearest Neighbor Discriminant Analysis
7.8 Classification Trees
Summary
Exercises
8. LOGISTIC REGRESSION METHODS
8.1 Logistic Regression Model
8.2 The Logit Transformation
Model Fitting
8.3 Variable Selection Methods
8.4 Logistic Discriminant Analysis (More Than Two Populations)
Logistic Regression Models
Model Fitting
Another SAS LOGISTIC Analysis
Exercises
9. CLUSTER ANALYSIS
9.1 Measures of Similarity and Dissimilarity
Ruler Distance
Standardized Ruler Distance
A Mahalanobis Distance
Dissimilarity Measures
9.2 Graphical Aids in Clustering
Scatter Plots
Using Principal Components
Andrews' Plots
Other Methods
9.3 Clustering Methods
Nonhierarchical Clustering Methods
Hierarchical Clustering
Nearest Neighbor Method
A Hierarchical Tree Diagram
Other Hierarchical Clustering Methods
Comparisons of Clustering Methods
Verification of Clustering Methods
How Many Clusters?
Beale's F-Type Statistic
A Pseudo Hotelling's T² Test
The Cubic Clustering Criterion
Clustering Order
Estimating the Number of Clusters
Principal Components Plots
Clustering with SPSS
SAS's FASTCLUS Procedure
9.4 Multidimensional Scaling
Exercises
10. MEAN VECTORS AND VARIANCE-COVARIANCE MATRICES
10.1 Inference Procedures for Variance-Covariance Matrices
A Test for a Specific Variance-Covariance Matrix
A Test for Sphericity
A Test for Compound Symmetry
A Test for the Huynh-Feldt Conditions
A Test for Independence
A Test for Independence of Subsets of Variables
A Test for the Equality of Several Variance-Covariance Matrices
10.2 Inference Procedures for a Mean Vector
Hotelling's T² Statistic
Hypothesis Test for μ
Confidence Region for μ
A More General Result
Special Case—A Test of Symmetry
A Test for Linear Trend
Fitting a Line to Repeated Measures
Multivariate Quality Control
10.3 Two Sample Procedures
Repeated Measures Experiments
10.4 Profile Analyses
10.5 Additional Two-Group Analyses
Paired Samples
Unequal Variance-Covariance Matrices
Large Sample Sizes
Small Sample Sizes
Summary
Exercises
11. MULTIVARIATE ANALYSIS OF VARIANCE
11.1 MANOVA
MANOVA Assumptions
Test Statistics
Test Comparisons
Why Do We Use MANOVAs?
A Conservative Approach to Multiple Comparisons
11.2 Dimensionality of the Alternative Hypothesis
11.3 Canonical Variates Analysis
The First Canonical Variate
The Second Canonical Variate
Other Canonical Variates
11.4 Confidence Regions for Canonical Variates
Summary
Exercises
12. PREDICTION MODELS AND MULTIVARIATE REGRESSION
12.1 Multiple Regression
12.2 Canonical Correlation Analysis
Two Sets of Variables
The First Canonical Correlation
The Second Canonical Correlation
Number of Canonical Correlations
Estimates
Hypothesis Tests on the Canonical Correlations
Interpreting Canonical Functions
Canonical Correlation Analysis with SPSS
12.3 Factor Analysis and Regression
Summary
Exercises
APPENDIX A: MATRIX RESULTS
A.1 Basic Definitions and Rules of Matrix Algebra
A.2 Quadratic Forms
A.3 Eigenvalues and Eigenvectors
A.4 Distances and Angles
A.5 Miscellaneous Results
APPENDIX B: WORK ATTITUDES SURVEY
B.1 Data File Structure
B.2 SPSS Data Entry Commands
B.3 SAS Data Entry Commands
APPENDIX C: FAMILY CONTROL STUDY
REFERENCES
Index