An explanation and application of principal component analysis (PCA)
PCA is a statistical technique for reducing the dimensionality of a dataset. This is accomplished by linearly transforming the data into a new coordinate system where (most of) the variation in the data can be described with fewer dimensions than the initial data.
Definition and Derivation of Principal Components
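The standard derivation can be sketched as follows. The notation is an assumption, since the text does not fix one: X is the n x p data matrix with each column centered to zero mean, S is its sample covariance matrix, and w is a unit-length direction in feature space.

```latex
% Assumed notation: X is the centered n x p data matrix, S its sample covariance.
\begin{align}
S &= \frac{1}{n-1} X^{\top} X \\
w_1 &= \arg\max_{\|w\| = 1} \; w^{\top} S w \\
\mathcal{L}(w, \lambda) &= w^{\top} S w - \lambda \,(w^{\top} w - 1) \\
\frac{\partial \mathcal{L}}{\partial w} = 2 S w - 2 \lambda w = 0
  \;&\Longrightarrow\; S w = \lambda w \\
\operatorname{Var}(X w) &= w^{\top} S w = \lambda
\end{align}
```

The stationary points of the constrained problem are therefore the eigenvectors of S, and the variance captured along each one equals its eigenvalue. The first principal component is the eigenvector with the largest eigenvalue. Each later component maximizes the same objective subject to being orthogonal to the earlier ones, which gives the eigenvector with the next largest eigenvalue. The fraction of total variance explained by component k is \lambda_k / \sum_j \lambda_j.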
Application of PCA to a data set
First we import the libraries and load the CSV file
We take a first look at the data
Then we check whether the data contains any null values
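A minimal sketch of this step, assuming pandas is used for loading. The text does not name the file or its columns, so the inline CSV and the column names `x1`, `x2`, `x3` are hypothetical stand-ins; only the 'date' column is mentioned in the text.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file: the actual file name and
# columns are not given in the text (only a 'date' column is mentioned).
csv_text = """date,x1,x2,x3
2020-01-01,1.0,2.0,3.0
2020-01-02,2.0,,5.0
2020-01-03,3.0,6.0,7.0
"""

# With a real file on disk this would be pd.read_csv("<path to your file>").
df = pd.read_csv(io.StringIO(csv_text))
print(df.shape)
```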
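One way to do the quick inspection and the null check with pandas is sketched below. The small DataFrame is a hypothetical stand-in for the loaded data, and the variable names `null_counts` and `has_nulls` are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the loaded CSV.
df = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "x1": [1.0, 2.0, 3.0],
    "x2": [2.0, np.nan, 6.0],
})

# General glance: first rows and summary statistics of the numeric columns.
print(df.head())
print(df.describe())

# Null check: number of missing values per column, and a single yes/no flag.
null_counts = df.isnull().sum()
has_nulls = bool(df.isnull().values.any())
print(null_counts)
print(has_nulls)
```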
In this example we will drop the column 'date' for simplicity
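Dropping the column could look like the sketch below, again on a hypothetical toy frame. `DataFrame.drop` returns a new frame, so the original `df` is left unchanged; the name `features` is illustrative.

```python
import pandas as pd

# Hypothetical toy frame; only the 'date' column name comes from the text.
df = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-02", "2020-01-03"],
    "x1": [1.0, 2.0, 3.0],
    "x2": [2.0, 4.0, 6.0],
})

# Keep only the numeric columns that PCA will operate on.
features = df.drop(columns=["date"])
print(features.columns.tolist())
```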
From the previous plot, we can select, for example, PCA_1, PCA_2, PCA_6, PCA_7 and PCA_10, which together capture 78.79% of the variance.
Now we build the feature vector from the selected components
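The captured share of variance for a chosen set of components can be computed as sketched here. This uses random data as a hypothetical stand-in, so it will not reproduce the 78.79% figure. It also assumes that the labels PCA_1 to PCA_10 number the components in the order the eigendecomposition returns them, which the text does not state.

```python
import numpy as np

# Hypothetical data: 200 rows, 10 numeric columns (not the dataset from the text).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# Standardize each column, then form the covariance matrix of the features.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
cov = np.cov(X_std, rowvar=False)

# Eigenvalues of the symmetric covariance matrix are the component variances.
eigvals, eigvecs = np.linalg.eigh(cov)
explained = eigvals / eigvals.sum()

# Zero-based positions for PCA_1, PCA_2, PCA_6, PCA_7 and PCA_10 (assumed numbering).
selected = [0, 1, 5, 6, 9]
captured = explained[selected].sum()
print(f"Captured variance: {captured:.2%}")
```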
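A minimal sketch of this step, assuming "feature vector" means the matrix whose columns are the eigenvectors of the selected components. The data is a hypothetical random stand-in, and the selected positions follow the same assumed numbering as the component labels above.

```python
import numpy as np

# Hypothetical data: 200 rows, 10 numeric columns (not the dataset from the text).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# Center the data and decompose its covariance matrix.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# Zero-based positions for PCA_1, PCA_2, PCA_6, PCA_7 and PCA_10 (assumed numbering).
selected = [0, 1, 5, 6, 9]

# Feature vector: the selected eigenvectors stacked as columns, shape (10, 5).
feature_vector = eigvecs[:, selected]

# Project the centered data onto the selected components, shape (200, 5).
X_reduced = X_centered @ feature_vector
print(feature_vector.shape, X_reduced.shape)
```

Because the eigenvectors are orthonormal, the projection is a rotation followed by discarding the unselected directions, so the reduced data keeps exactly the variance of the chosen components.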