Corrections:
6:17 I should have said that the blue points have twice the density of the purple points.
7:08 There should be a 0.05 in the denominator, not a 0.5.
Support StatQuest by buying my books The StatQuest Illustrated Guide to Machine Learning, The StatQuest Illustrated Guide to Neural Networks and AI, or a Study Guide or Merch!!! statquest.org/statquest-store/
Thanks very much for the informative lecture and it is really helpful. UMAP is more and more popular now, could you explain it and compare with tSNE as well? Thanks in advance.
@@linweitao6470 I should have a UMAP StatQuest ready in a few weeks. I'm working on it right now.
@@statquest Thanks again!
@@statquest UMAP is great, but I don't know if it is more popular. There are more stringent reductions out there, like ICA. I wonder what Josh thinks about it?
@@CompBioQuest I guess it largely depends on the field. Right now, genetics and molecular biology are going bonkers over UMAP. However, ICA is very interesting. Thanks to your question, I found this article which is fascinating: gael-varoquaux.info/science/ica_vs_pca.html
Thank you. I am not sure if you remember me from the PCA video. I have a job now. My job does not have a high salary, but I can now support you by donating and thank you that way. 😊
WOW! Thank you so much. And congratulations on getting a job!!! HOORAY!!! TRIPLE BAM! :)
@@statquest Keep doing great work sir! Also, it would be great if you could make a video about the comparison between clustering methods. 😁
@@tuongminhquoc Thanks and I'll keep that in mind!
I am always blown away by how you make statistics & machine learning algorithms so simple to understand and how you graciously share your knowledge. Keep up the great work man, you are awesome!
Thank you very much! :)
Whenever I find a statistics technique I have never seen in a scientific article, I always visit your channel. Thanks a lot!!
Happy to help! :)
I regret I can't put 1000 likes! I read about 20 articles about t-SNE, they are similar to one another, almost identical - and they don't get me closer to the point. But your video - I watched it 4 times (because the topic is hard, at least for me) while making some notes and drawing - and finally I understand how it works, up to the point that I can explain it to someone else. So many thanks to you!
HOORAY!!! TRIPLE BAM! I'm glad the video was helpful. BAM! :)
I never leave comments, but I really feel the need to thank you for being able to explain this in such a simple way
Thank you! :)
As entertaining as watching a Walt t-SNE movie!
You made me laugh out loud! BAM! :)
Best stat-word-play of the year! 😂
I'm writing this comment while having watched only half way into this video, which is pretty unusual for me!
It is so clearly explained! I once glanced at the t-SNE paper and didn't understand it. If this is what it does then this is how things like this should be explained!
Really, we need people explaining science like this! It's possible to read scientific papers, but what they fail to do is properly communicate the core idea to the reader so that the reader quickly grasps the big picture and the intent of the mathematical details without getting lost in the details.
Frequently, even a missing definition can make reading papers much harder for non experts.
I'm glad you liked this video so much! :)
Josh is so far my favorite YouTuber who is able to explain complex stats concepts so smoothly.
Thank you so much! :)
"This is Josh Starmer, and you're watching Tisney Channel!"
Triple BAM! :)
It's impressive how you managed to explain the essential concepts of this chain of algorithms in such a clear way! I'm sharing this video with my beginner fellows, who normally flee as soon as I say words like nearest-neighbor or stochastic.
Thank you very much!
Thank you very much! :)
🤣🤣🤣🤣 Is it that terrifying?!? Barbara Oakley, in her book "A Mind for Numbers", called them zombies 🤣🤣🤣
Great explanations! Can you please do a video explaining UMAP and potentially how it compares to t-SNE? Thanks!
+1
+1
+1
+1
+1
Josh.. Your explanation is always "simple and easy to understand", even for a layman. You are simply "The Life Saviour"!!!
Thank you so much :)
Hooray! I'm glad my video was helpful. :)
The only educational channel which brings a smile to my face.
bam!
I never knew machine learning could be as simple as... BAM
Thats like the most important lesson.
Double bam 💥
Just a random comment so that someone can say triple bam
Triple bam 💥
hurayyyy we have made it to the END !!!
I am a student in Japan.
I'm not good at English, but it was very easy to understand and I learned a lot:)
Awesome! :)
I really can't appreciate you enough for your videos.
Books and blogs only make sense after I watch your videos!
Thank you very much! :)
I was so confused about t-SNE until I watched this. It's clear and very easy to understand! Thank you! Like your BAM. :D
BAM! :)
It's rare to come across such a brilliant explanation.
Thank you! :)
Josh, i literally love your videos, they are really helping me get through my ADV CS degree. I am going to buy one of your shirts, and wear it on campus as a thank you!
That would be awesome!!! Thank you very much! :)
I just love the way you start all your videos! Stat-Questtttttt :)
BAM! :)
Very clearly explained!
Loved the way you explained such a complicated concept so intuitively.
Thank you.
Glad it was helpful!
Fantastic video. I really appreciate all the slides that you made to get the animation effect. It really helped. Possibly the best explanation of t-SNE around. Keep up the good work.
Very nice way of teaching ! ML concepts CLEARLY EXPLAINED and BAM adds lot of curiosity in the videos :) Thanks for your videos. And not to forget your songs are really nice :)
Thank you!
This explanation almost makes t-SNE sound like a clustering technique, not a reduction technique..... That said, this was by far the best explanation I've heard to date.
That's a good observation. In many ways t-SNE is a hybrid method that reduces dimensions by clustering.
@@statquest Now if you can explain how to interpret a t-SNE plot. This would help immensely, as it's virtually impossible to determine the correct perplexity number without understanding how to interpret the plot. This seems like one of those "black box" methods which we just trust. Keep up the great work!
Why I couldn't stop bamming the like button??!! You're the best Josh!!
Thanks!
The Best tutorial and explanation for TSNE so far! It's of great help! Thanks a lot!
Thanks! :)
you are the hero, keep explaining complex thing into simple. thankss
Thank you! :)
Excellently explained! I really like your simple, clear, concise explanation - those 3 factors make a world of difference. And, great animations.
Awesome, thank you!
Very well explained ! Your video was recommended to us by our professors at ETH-Zürich.:)
This is such an awesome explanation of t-SNE that I don't need to watch any other video or read any other website/book. I don't think there can be a better explanation. Superlike.
This is the best video for t-SNE that I have ever seen. Thanks a lot, man
You are incredible, Josh Starmer!! I loved this
Thank you! :)
Just heard about t-SNE and I did not quite understand how it works, so I crossed my fingers hoping that Josh did a video on this, and of course he did!! haha
I have my popcorn ready to enjoy this video :)
Worth it!
BAM! :)
Great as always. I've heard of t-SNE before, but this was my first real introduction to it. Definitely want to go look at some more resources now.
Awesome explanation, thank you so much! I read a few papers/books multiple times and barely have a clue, but with your vid I understand the concept just by watching it once!
Came here for understanding the t-SNE plots used in single cell transcriptomics - which I finally did, thanks! Overall, you helped me out already plenty of times!
To display cells during cell fate transition/acquisition, e.g. at different time points during neurodevelopment, pseudo-temporal ordering is often used.
Since scRNA seq is becoming more and more popular, this might be a good next topic
Same here, and I did not expect to understand so fast and clearly!
Hey, love your videos!
Just a typo but it should be 0.05 on the values to the right at 07:19. Confused me for a second so might clear things up for others.
Brilliant explanation, this has been bugging me all day, thank you!!
Glad it helped!
Thanks a lot. I really struggled to understand the concept first time I came across it in a book. Your video helped a lot. Great job!
Super Mega BAM !! So great at what you do as always ... Tons of love sent your way ! Keep up the amazing work :D
Thanks so much!!
Thanks for your great explanation. I just wonder, from 5:00 - 5:45, why, when you plot the distances on the normal curve, the red and the orange points are on different sides of the curve. I thought distance didn't have direction. Can you please explain the different directions of the red and orange points in more detail?
The normal curve is symmetrical, so we can put the dots on either side. In this case, I used both sides so that not all the dots would overlap.
@@statquest Yeah, I understand. Because we take the similarity values from the curve, right or left gives the same result. Thanks a lot. Your videos help me a lot in my machine learning studies.
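A quick way to check the symmetry point from this thread: the sketch below assumes the unscaled similarity is simply the height of a normal curve centered at 0, evaluated at the distance. The function name `unscaled_similarity` and the default `sigma` are made up for illustration and are not taken from the video.

```python
import math

def unscaled_similarity(distance, sigma=1.0):
    """Height of a normal curve centered at 0, evaluated at `distance`.

    `sigma` is a hypothetical standard deviation chosen for illustration.
    """
    return math.exp(-distance ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Plotting a point on the left (-1.5) or the right (+1.5) of the curve
# gives the same height, because only the squared distance is used.
left = unscaled_similarity(-1.5)
right = unscaled_similarity(1.5)
print(left == right)  # True

# Points that are farther away get a smaller unscaled similarity.
print(unscaled_similarity(0.5) > unscaled_similarity(2.0))  # True
```

So the side of the curve is purely a drawing choice; the similarity depends only on how far apart the two points are.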
Thanks for such a clear explanation. You know, your channel already in the top list for me and very soon I'll watch all your videos..
Hello Josh, thank you for coming with such incredible videos. Data scientist’s life becomes easy.😬
Thank you! :)
StatQuest with Josh Starmer Hi a request to do a tutorial of UMAP.
You make a complex idea so simple and understandable! Great video. Thanks a lot
Hi Josh, I can't thank you enough for how much I have benefitted from your videos even though I do data science as part of my day job. Thank you so much for sharing your knowledge!
One request for a video: could you do a video of when to use which methods / models in a typical data science problem? Much appreciated.
That's a great idea.
Thanks a lot!! These videos are much more clear than any article!
A video explaining UMAP (related to t-SNE) would be awesome !
I'm working on UMAP. For now, however, know that it is almost 100% the same as t-SNE. The differences are very subtle.
excellent explanation , this intuition helps to follow maths behind t-SNE
Excellent video! Perhaps you could add another video where you go through the actual algorithm and how the moves are actually computed.
yes!!! pleasee!!
Difficult concept made so simple. Just brilliant!!!!
Thanks a lot 😊!
Dude this is super clear. Love the content! BAM
Thank you very much! :)
One word reaction after watching this video --> AWESOME!!
Thank you so much 😀!
Kudos, I understood so effortlessly....triple BAM!!!
Thanks! :)
Great video - thank you! One small insertion that I think would improve it: at ~2:07, right after showing what projecting on to the X or Y axis would look like, show one more example of projecting onto an arbitrary line to try to retain as much variance as possible (basically PCA). I think this could be done in 15-20 seconds, and would be helpful in comparing t-SNE to one of its most popular alternatives, which is helpful in deciding *when* to use an algorithm - one of the hardest things for beginners like myself.
Thanks for the tip!
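For anyone curious what the suggested projection onto an arbitrary line might look like, here is a minimal sketch for 2-D points. It assumes the line is the direction of greatest variance (the first principal component) and uses the closed-form angle for a 2x2 covariance matrix. The function name and the toy data are hypothetical.

```python
import math

def project_onto_best_line(points):
    """Project 2-D points onto the line through their mean that keeps the most variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Entries of the 2x2 covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n

    # Angle of the direction with the largest variance.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(angle), math.sin(angle)

    # 1-D coordinate of each point along that direction.
    return [(x - mean_x) * ux + (y - mean_y) * uy for x, y in points]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]  # toy data on a diagonal
on_best_line = project_onto_best_line(points)
on_x_axis = [x for x, _ in points]

# The diagonal line keeps more of the spread than the X axis alone.
print(variance(on_best_line) > variance(on_x_axis))  # True
```

Unlike t-SNE, this kind of projection is linear, which is part of why the two methods can give very different pictures of the same data.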
Thanks really great videos understood concepts so well
Glad it was helpful!
"Clearly Expalined" indeed!
Wish I could *Triple Bam* like this video! Such a simple explanation. Thanks a lot Josh :-)
Glad you liked it!
Thank you so much for this great resource and how much investment you have made into it. I have understood this well.
Glad it was helpful!
i am a huge fan of this channel! greetings from brazil ^^
Muito obrigado! :)
Amazing work! perfectly explained!!!
Thanks a lot!
I am at the intro and love it already!
BAM! :)
Oh God, this is a great explanation. As Radel mentions below, it would be nice to have an extended video on the algorithm, like the one for PCA!!
Thank you! Yes, one day I'll break the actual equations down and do a "step-by-step" explanation of t-SNE.
Looking forward to this.
"Bam, I made that terminology up" :D :D , great vid, keep up the good work.
Thanks! 😁
I never thought I'd not understand a statquest video! :(
Bummer. What time point was confusing?
your explanation is very very good! thanks!!!
Thank you! :)
thank you so much for this nice explanation. will help me a lot in my exams
Glad to hear that!
Love the vid. I was wondering how tsne works and you broke it down great and the explanation for the t distribution was short and to the point.
Thank you! :)
great explanation especially for beginners.Thanks
Thank you! :)
Excellent work, thank you !!
Thanks!
Great videos! Great channel! Big thumbs UP!
Big thanks!
Thank you so much! Right now everyone in our department (Systems Genetics at NYU Langone) is using UMAP. There aren't many great videos about it - it would be awesome if you could help us understand what all the hype is about!
UMAP is on the to-do list. I hope to get to it in the spring.
Thanks for this wonderful video❤️
Glad you enjoyed it!
You speak like Kevin from The Office. Great explanation, thanks a lot:)
Thank you very much Josh . You made it easier to understand.
Hooray! I'm glad the video was helpful! :)
Thanks a million for this masterpiece !!!
Thank you!
Thank you a lot for the video Josh.
Let me point something out: at minute 10:40, it looks like t-SNE performs a sort of the matrix, instead of minimizing the loss function by gradient descent.
Good point. I represented it as a matrix because, internally, all of the similarity scores are maintained that way.
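A rough sketch of what such a matrix could look like for points on a number line. It assumes the standard t-SNE formulation, in which the low-dimensional similarity comes from a t-distribution with one degree of freedom and the whole matrix is scaled to sum to 1. The function name and the positions are invented for illustration; this is not the code behind the video.

```python
def t_similarity_matrix(positions):
    """Matrix of scaled t-distribution similarities for points on a number line."""
    n = len(positions)
    raw = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # a point's similarity to itself is left at 0
                squared_distance = (positions[i] - positions[j]) ** 2
                raw[i][j] = 1.0 / (1.0 + squared_distance)

    # Scale so that all of the entries together add up to 1.
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]

positions = [0.0, 0.5, 4.0]  # hypothetical positions on the number line
q = t_similarity_matrix(positions)

# Nearby points (0.0 and 0.5) are more similar than distant ones (0.0 and 4.0).
print(q[0][1] > q[0][2])  # True
```

In the usual description of the algorithm, gradient descent nudges the positions a little at a time so that this matrix looks more like the matrix computed from the original data; the matrix is the thing being compared, not the thing being sorted.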
Excellent intro to tSNE
Thank you! :)
Subscribed because that intro gave me life!
Ha!!! Thanks! :)
This is a great explanation thank you!
Glad you enjoyed it!
Hey, love your videos! We are actually using it to help explain key concepts in our application-focused courses. I'd love to see UMAP (similar to t-SNE), which is a bit more scalable.
Thank you so much! It's on the to-do list. :)
@@statquest Awesome! I'm using your content in my courses - Students love it. PCA, K-Means, & t-SNE. Will be using your ML videos as well. Your explanations are the best!
Amazing explanation! Thank you!
Great video. If you could just explain a bit how the shape of the normal curve is determined, that would be wonderful! I'm a bit confused there at 4:41.
The mean of the normal curve is 0, which is the distance between the point we are calculating similarities for and itself. The standard deviation is a function of the density of the points around it and, I believe, the perplexity fudge factor. I can't remember the formula off the top of my head, but the higher the density of points, the smaller the standard deviation, and the lower the density of points, the larger the standard deviation.
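The reply above does not give the formula, so the following is only a sketch of one common approach, assuming the standard t-SNE recipe: for each point, search for the standard deviation whose scaled similarities have a perplexity equal to a target chosen by the user. The function names, the search bounds, and the toy distances are all assumptions made for illustration.

```python
import math

def perplexity(distances, sigma):
    """Perplexity of the scaled similarities from one point to its neighbors."""
    # Subtract the smallest squared distance first so the exponentials
    # never all underflow to zero (this does not change the scaled values).
    smallest = min(d ** 2 for d in distances)
    weights = [math.exp(-(d ** 2 - smallest) / (2 * sigma ** 2)) for d in distances]
    total = sum(weights)
    probs = [w / total for w in weights]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2 ** entropy

def find_sigma(distances, target_perplexity, lo=1e-3, hi=1e3, steps=100):
    """Binary search for the sigma whose perplexity matches the target."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if perplexity(distances, mid) > target_perplexity:
            hi = mid  # curve is too wide: too many neighbors count
        else:
            lo = mid  # curve is too narrow: too few neighbors count
    return (lo + hi) / 2

dense = [0.1, 0.2, 0.3, 0.4]   # distances to neighbors in a tight cluster
sparse = [0.2, 0.4, 0.6, 0.8]  # same layout, but spread out twice as far

sigma_dense = find_sigma(dense, target_perplexity=3)
sigma_sparse = find_sigma(sparse, target_perplexity=3)
print(sigma_sparse > sigma_dense)  # True
```

In this toy setup the spread-out cluster ends up with a standard deviation about twice as large as the tight one, which matches the description above: lower density, wider curve.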
Fantastic video! Thanks so much
Thanks! :)
Hi Josh, great videos as always! I'm not sure if there's a video about this already, but could you do one with all the clustering or classification or dimensionality reduction methods compiled together and then compare their differences and similarities and talk about situations when we should use which? For example, after looking at many of the videos, I think I'm already a little lost on if I should use PCA or MDS or t-SNE on my data. Ty.
Thanks! I'll keep that in mind.
VERY CLEAR EXPLANATIONS :) THANK YOU FOR ALL YOUR VIDEOS
Fantastic explanation and comments. Thanks so much!
Thank you!! I'm glad you like the video. :)
Hi Josh, great video, many thanks! Anyway, I still don't get how you determine the distribution properties (like the standard deviation) for calculating the unscaled similarity between two points. When you introduced a cluster half as dense as the others, you used a normal distribution with the standard deviation doubled, which is quite intuitive. But you knew that this cluster is just half as dense as the others. The question is, how do you know the properties of these distribution curves?
You estimate it from the data.
I need to watch 3 more times to fully understand. TRIPLE BAM!!!
:)
Thank you Josh . I love the way you present concepts with simple examples.
Could you please explain how you decided to put the red dots on the left side, whereas the orange ones are on the right side, at 5:30?
It doesn't matter what side of the curve the points are on, since the distance from the y-axis values on the curve will be the same (normal curves are symmetrical). However, in order for the points to be easily seen, I spread them out on different sides rather than piling them all up on top of each other.
@@statquest Thank you again
Incredibly helpful and well presented. Thank you.
Just out of curiosity.... ....are there any plans to do a video on trajectory analysis? I'm doing an analysis on whether the floating properties of ducks and wood can be used to predict the outcome of being a witch or not.
Whatever your model, it will probably improve if you incorporate the average airspeed of a swallow.
Excellent explanation. Triple BAM !!!
Thank you!
Thanks for explaining this.
Thanks!
can you explain the math more?
Made it look real simple.. thanks!
Hooray! :)
Very clear explanations, thanks a lot!
Love it! A few things could still be clarified (please?):
At 07:40, which vector of distances must add up to 1 after scaling? The sum of distances from each point to all other points (regardless of cluster)?
Yes.
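To make the scaling step in this thread concrete, here is a minimal sketch. It assumes that, for one point of interest, the unscaled similarities to all of the other points are divided by their sum. The function name and the numbers are hypothetical, not the values from the video.

```python
def scale_similarities(unscaled):
    """Divide each unscaled similarity by the sum so the results add up to 1."""
    total = sum(unscaled)
    return [score / total for score in unscaled]

# Hypothetical unscaled similarities from one point to every other point.
unscaled = [0.24, 0.05, 0.01]
scaled = scale_similarities(unscaled)
print(abs(sum(scaled) - 1.0) < 1e-12)  # True

# Doubling every unscaled score (for example, a curve that is twice as tall)
# leaves the scaled scores unchanged, since only the proportions matter.
doubled = scale_similarities([2 * score for score in unscaled])
print(all(abs(a - b) < 1e-12 for a, b in zip(scaled, doubled)))  # True
```

This is why clusters with different densities can end up with comparable scaled scores: the scaling removes the overall height of the curve and keeps only the relative similarities.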
Hi Josh, quality content! This channel continuously helps me understand the underlying idea, so that the dry textbook explanations actually make sense. I still have a question. When you calculate the unscaled similarity score, how exactly do you determine the width of your Gaussian? I get it in the example, where we already know the clusters. If I only want to visualize the data without having pre-defined clusters, what happens then?
I talk more about the details of t-SNE and how it works in my videos on UMAP: ruclips.net/video/eN0wFzBA4Sc/видео.html and ruclips.net/video/jth4kEvJ3P8/видео.html
Thanks a lot, TRIPLE BAM for you!
thanks! they're the best ❤
Thanks!