Slide 4: where does sigma come from? Is its value a parameter set from outside, or the result of some calculation based on the data?
And what is perplexity? What is its value, its relation to sigma, and its role in the equations?
Thank you for the clear explanations.
Perplexity is roughly the number of points in the neighborhood whose distances we are preserving.
Sigma is a normalizing factor (the bandwidth of the Gaussian); it is computed from the same data.
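To make that relation concrete, here is a minimal NumPy sketch of the standard t-SNE recipe (this is not the presenter's code; names like `find_sigma` and `perplexity_of` are made up for illustration): for each point i, binary-search sigma_i until the perplexity of the conditional distribution p(.|i) matches the user-chosen target.

```python
import numpy as np

def perplexity_of(distances_sq, sigma):
    """Perplexity 2**H of the conditional distribution p(j|i) for one point,
    given the squared distances to all *other* points and a candidate sigma."""
    p = np.exp(-distances_sq / (2.0 * sigma ** 2))
    p /= p.sum()                          # sigma acts here as the normalizing bandwidth
    entropy = -np.sum(p * np.log2(p + 1e-12))
    return 2.0 ** entropy

def find_sigma(distances_sq, target_perplexity, tol=1e-4, max_iter=50):
    """Binary-search sigma_i so that the perplexity of p(.|i) matches the target."""
    lo, hi = 1e-10, 1e10
    sigma = 1.0
    for _ in range(max_iter):
        perp = perplexity_of(distances_sq, sigma)
        if abs(perp - target_perplexity) < tol:
            break
        if perp > target_perplexity:      # too many effective neighbours -> shrink sigma
            hi = sigma
        else:                             # too few effective neighbours -> grow sigma
            lo = sigma
        sigma = (lo + hi) / 2.0 if hi < 1e10 else sigma * 2.0
    return sigma

# Hypothetical usage: one sigma per data point, perplexity set by the user.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
d_sq = np.sum((X[0] - X[1:]) ** 2, axis=1)   # squared distances from point 0 to the rest
print(find_sigma(d_sq, target_perplexity=30.0))
```

So perplexity is the external knob set by the user, and each sigma_i is the internal value derived from the data so that point i's neighborhood matches that perplexity.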
Thank you for this very instructive video !
Wow! Good tool for data dim. reduction. Thanks for sharing!
p(i|i) should be equal to 1, I suppose, because x_i - x_i = 0 and e^0 = 1.
It is set to zero, since the method only looks at similarity between distinct points.
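For reference, the standard t-SNE definition (van der Maaten and Hinton) makes this explicit: the normalizing sum skips k = i, and p(i|i) is simply defined to be zero.

```latex
p_{j|i} = \frac{\exp\!\left(-\lVert x_i - x_j \rVert^{2} / 2\sigma_i^{2}\right)}
               {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^{2} / 2\sigma_i^{2}\right)},
\qquad p_{i|i} := 0.
```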
Please state the relation between perplexity and variance.
Hi, thanks for the great video!
By the way, on page 5 of the ppt: smaller "coast" for representing widely separated data points...? Should that be "cost" or something?
Thank you! Yes, that is a typo; it should be "cost".
Thank you for the video! Could you provide a link to the slides, please?
Hi Antony Harnist, I am glad you liked it! Slides can be found here: github.com/Divyagash/t-SNE/blob/master/tSNE_Presentation.pdf
Great!
You made one mistake: it is not a conditional probability, it is something like a probability from j to i.
It is a conditional probability in the high-dimensional space and a joint probability in the low-dimensional space.
It basically means the probability of point j given point x_i.
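To spell out the distinction (standard t-SNE formulation, not notation taken from the slides): the high-dimensional conditionals p(j|i) are symmetrized into a joint distribution, while the low-dimensional similarities form a joint distribution built from a Student-t kernel with one degree of freedom.

```latex
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n},
\qquad
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^{2}\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^{2}\right)^{-1}}.
```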