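PCA (Principal Component Analysis) is an unsupervised learning technique.
- It is a feature extraction (dimensionality reduction) method rather than feature selection: it builds new columns as linear combinations of the original ones
- Used in data science to explore and understand the structure of the data
- Deterministic algorithm
- Applicable only to continuous (numeric) data

Used to:
- identify relationships between columns
- reduce the number of columns
- visualize data in 2D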
Note: It is mainly used to reduce the number of features when a dataset has many of them.
When the number of columns is large, say 100, the data is difficult to analyze. PCA replaces the original columns with a smaller set of principal components; keeping only the first two, for example, projects data of any dimension onto 2 dimensions.
Suppose 100 principal components (PCs) are generated for 100 dimensions. If the first 10 PCs capture 90% of the information (variance), we can work with those 10 PCs instead of all 100 columns.
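The idea of keeping only enough PCs to reach a variance target can be sketched as follows. This is a minimal illustration using NumPy's eigendecomposition, not necessarily the code in this repository. The helper name `components_for_variance`, the 0.90 threshold, and the synthetic 100-column data are assumptions made for the example.

```python
import numpy as np

def components_for_variance(X, threshold=0.90):
    """Return (k, cumulative), where k is the smallest number of PCs whose
    cumulative explained-variance ratio reaches `threshold`."""
    Xc = X - X.mean(axis=0)                      # center each column
    cov = np.cov(Xc, rowvar=False)               # covariance between columns
    eigvals = np.linalg.eigvalsh(cov)[::-1]      # PC variances, largest first
    eigvals = np.clip(eigvals, 0.0, None)        # guard against tiny negative values
    cumulative = np.cumsum(eigvals / eigvals.sum())
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(eigvals)), cumulative

# Hypothetical data: 100 columns driven by 5 hidden factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 100))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 100))

k, cumulative = components_for_variance(X, threshold=0.90)
print(f"{k} PCs cover {cumulative[k - 1]:.1%} of the variance")
```

Because the synthetic columns are built from only 5 factors, a handful of PCs is enough here; on real data the number depends on how strongly the columns are correlated.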
Conditions:
1. Variance of PC1 ≥ PC2 ≥ PC3 ≥ ...
2. Correlation between any two PCs is 0
3. Sum of squared weights (loadings) of each PC = 1
4. All the PCs are orthogonal to each other
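The four conditions above can be checked numerically. The sketch below computes the PCs from the covariance matrix with NumPy and tests each condition in turn; the random data and variable names are assumptions for illustration only, not taken from the repository's datasets.

```python
import numpy as np

# Hypothetical data: 4 correlated numeric columns.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))

Xc = X - X.mean(axis=0)                              # center the columns
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]                    # largest variance first
weights = eigvecs[:, order]                          # each column holds one PC's weights
scores = Xc @ weights                                # data expressed in PC coordinates

# 1. Variances are in non-increasing order.
score_var = scores.var(axis=0, ddof=1)
variances_decreasing = bool(np.all(np.diff(score_var) <= 1e-9))

# 2. Correlation between different PCs is 0.
score_corr = np.corrcoef(scores, rowvar=False)
uncorrelated = bool(np.allclose(score_corr, np.eye(4), atol=1e-6))

# 3. Sum of squared weights of each PC is 1.
unit_length = bool(np.allclose((weights ** 2).sum(axis=0), 1.0))

# 4. Weight vectors are orthogonal to each other.
orthogonal = bool(np.allclose(weights.T @ weights, np.eye(4)))

print(variances_decreasing, uncorrelated, unit_length, orthogonal)
```

Conditions 3 and 4 together say that the weight matrix is orthonormal, which is why `weights.T @ weights` is compared against the identity matrix.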
Datasets: Universities, Wine
Language: Python
The code for PCA, along with the datasets it uses, is available in detail in this repository.