Principal Components Analysis (PCA)

Application: Principal Components Analysis (PCA) is a multivariate statistical technique for examining the structure or pattern in a dataset. It is useful for dimension (variable) reduction: PCA reduces a large number of variables to a small number of linear combinations of those variables that capture as much of the variation in the dataset as possible (Crawley, 2007). For a more detailed explanation of PCA, see McGarigal, Cushman and Stafford (2000).
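As a rough illustration of variable reduction, the sketch below computes the share of total variance captured by each component. It assumes R's built-in USArrests dataset and a scaled prcomp fit; the names pc, prop_var and cum_var are illustrative only.

```r
# Minimal sketch: how much variation do the first few components capture?
# Assumes the built-in USArrests data (4 variables, 50 observations).
pc <- prcomp(USArrests, scale. = TRUE)

# The variance of each component is its squared standard deviation
prop_var <- pc$sdev^2 / sum(pc$sdev^2)

# Cumulative proportion of variance retained by the first k components
cum_var <- cumsum(prop_var)
print(round(cum_var, 3))
```

If the first one or two entries of cum_var are already close to 1, a small number of components summarises most of the variation in the original variables.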

**Assumptions:**
1) Independence among samples.
2) Underlying data structure is //multivariate normal//.
3) Linearity.

Program: R (2.9.2)
Package: stats (2.9.2)
Calls: princomp, prcomp, cor, scale, summary, loadings, plot, biplot

Full Worked Example: (Taken from the R (2.9.2) help menu.) Note that the full worked example has been modified from its original format for clarity.

## The variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
(pc.cr <- princomp(USArrests))  # inappropriate

Call: princomp(x = USArrests)
Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4
82.890847 14.069560  6.424204  2.457837
4 variables and 50 observations.
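To see why the unscaled call is marked inappropriate, it can help to look at the raw variances directly. The sketch below uses only the built-in USArrests data; vars and pc_raw are illustrative names.

```r
# Variance of each variable in the built-in USArrests data
vars <- sapply(USArrests, var)
print(vars)

# Loadings of the first component from the unscaled (covariance-based) fit
pc_raw <- princomp(USArrests)
first_loading <- unclass(loadings(pc_raw))[, 1]
print(first_loading)
```

When one variable has a much larger variance than the others, the first unscaled component tends to be little more than that variable, which is the motivation for scaling.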

**## It is a very good idea to “scale” your variables, because their variances tend to differ significantly.**

princomp(USArrests, cor = TRUE)  # =^= prcomp(USArrests, scale = TRUE)

Call: princomp(x = USArrests, cor = TRUE)
Standard deviations:
   Comp.1    Comp.2    Comp.3    Comp.4
1.5748783 0.9948694 0.5971291 0.4164494
4 variables and 50 observations.

## Similar, but different:
## The standard deviations differ by a factor of sqrt(49/50)
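The sqrt(49/50) factor comes from the divisor used for the variance: princomp divides by n, whereas prcomp divides by n - 1. A minimal check, assuming the covariance-based (unscaled) fits where the factor appears directly in the reported standard deviations; sd_princomp and sd_prcomp are illustrative names.

```r
n <- nrow(USArrests)  # 50 observations

sd_princomp <- unname(princomp(USArrests)$sdev)  # variance divisor n
sd_prcomp   <- prcomp(USArrests)$sdev            # variance divisor n - 1

# The ratio should equal sqrt((n - 1) / n) = sqrt(49/50) for every component
print(sd_princomp / sd_prcomp)
print(all.equal(sd_princomp, sd_prcomp * sqrt((n - 1) / n)))
```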

**## The above shows the two functions for carrying out PCA in R. “prcomp” is often the preferred method (Crawley, 2007). It bases the calculation on a singular value decomposition of the centred and scaled data matrix (Crawley, 2007), while “princomp” uses the** //eigen// **function on the correlation or covariance matrix determined by** //cor// **(Crawley, 2007).**
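The two computational routes can be reproduced by hand as a rough sketch. This is not the internal code of either function, only the same linear algebra applied to the built-in USArrests data; X, sv, ev, sdev_svd and sdev_eig are illustrative names.

```r
X <- scale(USArrests)  # centre and scale each variable (sd uses n - 1)

# Route 1: singular value decomposition of the data matrix (the prcomp approach)
sv <- svd(X)
sdev_svd <- sv$d / sqrt(nrow(X) - 1)

# Route 2: eigen decomposition of the correlation matrix
# (the princomp approach with cor = TRUE)
ev <- eigen(cor(USArrests), symmetric = TRUE)
sdev_eig <- sqrt(ev$values)

# Compare each hand calculation with the corresponding function
print(all.equal(sdev_svd, prcomp(USArrests, scale. = TRUE)$sdev))
print(all.equal(sdev_eig, unname(princomp(USArrests, cor = TRUE)$sdev)))
```

The singular value route avoids forming the correlation or covariance matrix explicitly, which is the usual numerical argument for preferring prcomp.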
