Thanks for the excellent, deep explanation! Really helpful!!
You are providing such amazing content. I have watched the EDA and PCA playlists from your channel and I loved the way you explained them. Thank you so much. Apart from that, I have a question: can we see those 200 features which cover more than 90% of the information?
You can identify the most informative features by looking at the top eigenvectors, since each top eigenvector corresponds to a direction of high variance in the data.
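A minimal sketch of this idea using scikit-learn (not the exact code from the video): here X is assumed to be the standardized 784-dimensional data matrix, and the feature indices printed at the end are just the pixels with the largest weights in the first eigenvector.

```python
import numpy as np
from sklearn.decomposition import PCA

# X is assumed to be the standardized (n_samples x 784) data matrix.
pca = PCA(n_components=200)          # keep the top 200 directions
X_reduced = pca.fit_transform(X)

# Fraction of the total variance retained by the 200 components (e.g. > 0.90).
print(pca.explained_variance_ratio_.sum())

# Each row of components_ is an eigenvector; its entries show how strongly
# each original feature contributes to that direction of high variance.
top_eigenvector = pca.components_[0]
most_influential_features = np.argsort(np.abs(top_eigenvector))[::-1][:10]
print(most_influential_features)
```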
Super video
In plt.plot(), we only passed cum_var_explained as a parameter. How did we get n_components on the x-axis?
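For what it's worth, when matplotlib's plt.plot() is given a single array it uses the array indices 0..N-1 as the x values, which here play the role of the component index. A small sketch with toy values (not the actual numbers from the video):

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy cumulative explained-variance values, just to show the default x-axis.
cum_var_explained = np.cumsum([0.10, 0.08, 0.07, 0.05])

plt.plot(cum_var_explained)          # x-axis defaults to the indices 0..N-1
plt.xlabel("number of components")
plt.ylabel("cumulative explained variance")
plt.show()
```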
Thank you so much for the PCA playlist video
Isn't it required for var_explained and cum_var_explained to be computed in a for loop, so that the loop goes from i = 0 to i = 783?
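A for loop isn't strictly needed; the same result can be obtained with vectorized numpy operations. A minimal sketch, assuming eigenvalues is the array of all 784 eigenvalues of the covariance matrix sorted in descending order (variable names chosen to match the discussion here, not necessarily the video):

```python
import numpy as np

# eigenvalues: assumed array of all 784 eigenvalues, sorted descending.
var_explained = eigenvalues / eigenvalues.sum()   # variance fraction per component
cum_var_explained = np.cumsum(var_explained)      # running total, no explicit loop

# cum_var_explained[i] is the variance retained by the top (i + 1) components,
# for i = 0 .. 783, which is exactly what an explicit for loop would compute.
```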
The playlist was amazing.
I have the following question right at the beginning of the introduction of the variance concept.
How is keeping the explained variance very high related to preserving the original data?
Can someone help me with this?
If possible, please also share relevant blog links.
Variance is also a measure of the information content in a dataset. As an extreme case, imagine a dataset where all the points have the same value. Such a dataset has very little information content in it and a variance of zero. The more spread there is in the data points, the more information content the dataset has. That's why in PCA, we try to preserve as much variance as possible while projecting to lower dimensions.
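A tiny illustration of the point above, with made-up numbers:

```python
import numpy as np

constant_feature = np.array([5.0, 5.0, 5.0, 5.0, 5.0])
spread_feature = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

# Zero variance: the feature cannot distinguish any two points, so it adds
# no information; the spread feature does help tell the points apart.
print(constant_feature.var())   # 0.0
print(spread_feature.var())     # 8.0
```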
@@AppliedAICourse Even if all the data points in my dataset are the same, why is it considered to have little information content? Isn't it still data?
Can we get back the original data from the reduced data?
You cannot recover the original data if you do not have all the eigenvalues and eigenvectors.
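With the kept components you can still get an approximate reconstruction. A minimal sketch using scikit-learn's PCA (X is assumed to be the original high-dimensional data matrix):

```python
from sklearn.decomposition import PCA

# X is assumed to be the original high-dimensional data matrix.
pca = PCA(n_components=200)
X_reduced = pca.fit_transform(X)

# inverse_transform maps the reduced data back using only the kept eigenvectors,
# so X_approx is an approximation, not the exact original data.
X_approx = pca.inverse_transform(X_reduced)
```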
@@AppliedAICourse So is this why we initially store the data in a new variable before reducing the dimensionality?