Hi! Thank you again. I remember seeing a comment about how you do these videos and write on the screen, but I can't find your answer to it. Do you have a resource that talks about how you do these screencasts?
Thank you again.
Hey Sherif! Really happy that you enjoy it.
The description is here: www.weaklysupervised.com/2021/03/12/making-lecture-videos/
I might move it to my home page (www.kamperh.com) at some point, but it lives there for now! :)
@kamperh thank you
Thanks for all your effort. I just started watching your lectures and they are extremely useful.
Btw, I have some questions:
1. How do we know that our data is high-dimensional? If the data is strictly high-dimensional, then we won't be able to do dimensionality reduction on it, right? So the dimensionality reduction techniques would fail, right? Are there examples of such datasets?
2. MNIST lives in 784 dimensions. However, when we do dimensionality reduction on the MNIST dataset, we can clearly see the dataset forming well-separated clusters in 2D. Does this mean that MNIST essentially lives on a 2-D manifold and the other 782 dimensions are superfluous?
These are both really good questions, and neither of them has an easy answer, mainly because the real world is messy.
1. I think what you are asking is: What if my data is inherently high-dimensional? Won't this mean that I mess up when I do dimensionality reduction? The answer is: it depends on what you want to do with your data, and what it means for it to be "inherently high-dimensional". In the real world, you will almost always be throwing away some information. And that could be beneficial if the information you are throwing away is not what you are ultimately interested in. To distinguish a cat from a dog, the background colour of an image might not be so relevant, so if dimensionality reduction throws away that information, that's great! But what if you were actually interested in detecting something in the background of the image? Then that would be super bad. So in practice, you should basically test whether the dimensionality reduction keeps or throws away the "information" you are interested in (the little classifier experiment under point 2 below is one way to do exactly this test).
There is also the concept of "proportion of variance explained", which is exactly a metric that can tell you how much of the variation in your original data you are keeping with the dimensionality reduction. Have a look at Section 10.2.3 in the ISL textbook (www.statlearning.com/).
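To make this concrete, here is a minimal sketch of computing the proportion of variance explained. I'm assuming scikit-learn here (that choice is mine, not something from the lectures), and I'm using the small 8x8 digits dataset that ships with scikit-learn as a stand-in for MNIST:

    # Proportion of variance explained by PCA components.
    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA

    X, _ = load_digits(return_X_y=True)  # 64-dimensional digit images
    pca = PCA().fit(X)                   # keep all the components

    # Cumulative proportion of variance explained:
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    print(f"First 2 components: {cumulative[1]:.1%} of the variance")

    # Smallest number of components retaining 95% of the variance:
    n_95 = int(np.argmax(cumulative >= 0.95)) + 1
    print(f"{n_95} components needed for 95% of the variance")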
2. This is very related to your first question: you qualitatively looked at the projection and then said "this looks pretty good". But if you look closely at those MNIST plots, you will see that the clusters aren't perfectly separated. So if you e.g. trained a linear classifier on top of the 2-D PCA-projected data, you might get okay accuracy (let's say 80%). But if you really wanted to get the best possible performance (say 95%), you would see that you actually need to keep a lot more dimensions, particularly to better capture the characteristics of digits lying on the boundaries between the clusters.
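Here is a rough sketch of that experiment, again using scikit-learn's small digits dataset rather than full MNIST, so the exact accuracies will differ from the illustrative numbers above. The idea is simply to sweep the number of PCA dimensions kept and watch the linear classifier's accuracy climb:

    # Linear classifier accuracy vs. number of PCA dimensions kept.
    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0
    )

    for n_components in [2, 10, 30, 64]:
        clf = make_pipeline(
            PCA(n_components=n_components),
            LogisticRegression(max_iter=1000),
        )
        clf.fit(X_train, y_train)
        print(f"{n_components:2d} dims: test accuracy = "
              f"{clf.score(X_test, y_test):.3f}")

You should see that 2 dimensions gives usable but clearly worse accuracy than keeping more of them.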
I hope that helps, but shout if it doesn't.