Why keep the perplexity value small (low) for dense regions?
Perplexity is, intuitively, a guess at the number of neighbors around a point. This is why we keep it low for dense regions: each point already has a tight cluster of points packed around it, so a small neighborhood is enough to capture the local structure.
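To make the "effective number of neighbors" reading concrete, here is a minimal sketch (not from the video; the function name and inputs are made up for illustration). t-SNE defines conditional probabilities over neighbors with a Gaussian kernel and sets perplexity to 2 raised to the Shannon entropy of that distribution:

```python
import numpy as np

def perplexity(sq_dists, sigma):
    """Perplexity = 2^H(P) of the Gaussian neighbor distribution.

    sq_dists: squared distances from one point to its candidate neighbors.
    sigma: Gaussian bandwidth for that point.
    """
    p = np.exp(-sq_dists / (2.0 * sigma ** 2))
    p /= p.sum()                                  # conditional p_{j|i}
    entropy = -np.sum(p * np.log2(p + 1e-12))     # Shannon entropy in bits
    return 2.0 ** entropy

# If 10 neighbors sit at identical distances, the distribution is uniform
# and the perplexity comes out at ~10 — i.e. "about 10 effective neighbors".
print(perplexity(np.ones(10), sigma=1.0))
```

In a dense region, many points sit at nearly equal small distances, so even a small sigma already yields many effective neighbors; a low perplexity target keeps the neighborhood local.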
@machinelearningmastery How do we know, a priori, whether regions are dense?
I have a question: why not represent the low-dimensional probability q with a Gaussian rather than using a t-distribution?
Glad you asked. In my video, I cover the Gaussian formula and compare it back to back with the t-distribution. The literature calls the Gaussian version of the formulation the SNE algorithm. In practice, SNE's embeddings lag behind t-SNE's in clarity when you have many dimensions and some non-linearity. Have a look at the video's formulation of SNE's q as it is compared against t-SNE's. Happy to answer any other questions.
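A quick sketch of the difference being discussed (illustrative only; the function names and distances are made up). SNE's low-dimensional affinity q uses a Gaussian kernel exp(-d²), while t-SNE uses the heavy-tailed Student-t (1 degree of freedom) kernel 1/(1 + d²). The heavier tail lets moderately dissimilar points be placed farther apart without their q collapsing to zero, which eases the crowding problem:

```python
import numpy as np

def q_sne(sq_dists):
    """SNE low-dim affinities: Gaussian kernel, normalized."""
    w = np.exp(-sq_dists)
    return w / w.sum()

def q_tsne(sq_dists):
    """t-SNE low-dim affinities: Student-t (1 dof) kernel, normalized."""
    w = 1.0 / (1.0 + sq_dists)
    return w / w.sum()

# Three pairs at increasing squared distance in the embedding.
d2 = np.array([0.5, 4.0, 16.0])
print(q_sne(d2))   # Gaussian crushes the far pair to near-zero mass
print(q_tsne(d2))  # the t kernel still gives the far pair noticeable mass
```

This is why the gradient under the t-distribution pushes dissimilar points apart more gently: distant pairs retain non-negligible q, instead of being exponentially suppressed as with the Gaussian.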