I looked all over for SVD for 3 hours, and your video explained it so nicely in 10 minutes. Thanks.
no problem! happy to help
You have a gift for explaining things clearly! This is so much better than the 5 SVD videos I watched prior to this haha
Glad it was helpful!
bros who make youtube math tutorials are the real MVPs
thanks!
Just amazing explanation. I had a blurry understanding of svd after taking a class and your video made the concept absolutely clear for me. Thanks a lot.
Glad it helped!
Been watching your videos for months now. I very much enjoy how accessible your videos are for someone outside of data science. I generally like watching math videos from non-math educators because they strike a great balance in their explanations.
One thing I really enjoy about your videos is that at the end you bring it back to your field and explain why this is useful in your world.
The reduction in stored entries, whether for storage or for further calculations, makes the real-world application very tangible.
I think this video is amazing. I have been wanting to watch videos from this channel for the past 2 years but never could because I lacked the basic knowledge to gain from the explanations here. I was taught this concept in class very poorly, so I immediately knew I could finally come here. The way this ties in with PCA (if I am correct) and the ease with which the mathematical kinks were explained were phenomenal. Glad to finally benefit from this channel. Thanks a ton.
Absolutely love your videos! Just to clear up possible confusion for learners at 4:05: V^T V = I because of orthonormality, not merely independence, which is only a consequence. Great job!
Thanks for that!
Thank you. Short video that packs all the right punches.
You explained the most important things about SVD. Thank you
I was struggling to understand the concept in class, and this video made it very clear for me. Thank you so much. Keep them coming :)
Glad it helped!
I really like your videos! 👍 Methods are very clearly and concisely explained, and explaining the applications of the methods at the end also helps a lot to remember what they really do. The length of the video is also perfect! Thanks, and I hope to see more videos from you
Thanks man for posting! Loved the explanation!
I am glad for the note about 4:06; I freaked out when that was said. GREAT VIDEO!!
Once you learn PCA and revisit this video, everything really makes sense!!
Thanks so much for clearing up the concepts. Now I can connect the dots on why we use SVD in recommender systems. 👍
I like this explanation too, thank you. Wish I had discovered you earlier during my data analysis course, but oh well :P
I was really struggling with linear algebra; your videos are really a saviour.
Wow, now I have a real appreciation for SVD :)
Thanks Ritvik, I'm a PhD candidate from Malaysia. Your videos are helping me a lot.
It's my pleasure!
Unbelievable that it has only 9k views... The video is great!
Glad you liked it!
Very relevant subject right now. Thanks
The columns of M, before SVD, could represent features. Do the columns of U and V (the left and right singular vectors) carry any physical meaning? The video keeps two singular values; how many do people usually keep?
Thanks for the regular quality content !!
marker flip on point
Thanks, the applications I can think of right away are PCA and matrix factorization. What could be other possible applications?
Great vid man, keep up the good work!
Your explanation is awesome
Super clear! Thank you so much!
Super well explained.
This is really, really informative. Just one question: what are the sigmas? Are they eigenvalues from the SVD or something else? How did you get 2 and 3 in your example?
Thanks for a great video. Do you also have a video on how to find those lambda values?
I usually hear SVD used synonymously with PCA. The way you described it, SVD is like a compression of the data, but how is that different from PCA?
I am a big fan of your videos, but I think I liked the old format better, where you do the math step-by-step and write it on the whiteboard :/
Thanks for the comment! I was debating whether to go back to the marker and paper style where I would write more stuff in real time. This suggestion is very helpful to me.
@@ritvikmath thanks for the reply! I am especially grateful for your PCA math video since I am currently doing research with a functional data analysis algorithm that uses multivariate functional PCA and I've looked EVERYWHERE for an easy explanation. Your PCA video (and the required videos) is hands down the best explanation out there. I am forever grateful :-)
@@saraaltamirano I initially had the same PoV, but after consuming the whiteboard-type content for a while I've gotten used to it, and recently he has started moving away from the board at the end of the video so that we can pause and ponder it.
I greatly prefer this current style; with no need to spend time writing, you can concentrate more on explaining.
I feel like such a baby because I laughed every time you said u p and sigma p. Anyway, great video as always :).
Can we say that using SVD we are extracting the significant features?
Hi Ritvik, thanks for the very explanatory video. Really very helpful for understanding. However, when you say that you achieved a 75% computation reduction in this case, was it really because we assumed sigma(3) onwards to be approximately equal to zero? Does this assumption stray from reality, or is this how it always happens? Eager to hear your thoughts. Happy to learn from this video.
Also, it would be great if you could do a Moore-Penrose pseudoinverse video as well. TIA
Hi Ritvik, what if the data is well constructed and there are 10 significant non-zero singular values? What can we do about such data?
This is really helpful
The explanations are good but for Linear Algebra the best videos are from Prof. Gilbert Strang
I don't understand how to get the independent vectors of a matrix to find its rank, and what is meant by an independent vector.
Thank you so much!
I have always thought about such applications of matrices but never worked on finding out how. The beauty of maths.
Wait, what? At 4:06 you said that matrices with full rank always have their transposes as inverses?
Almost, it's a small but important distinction.
A full rank matrix has columns that are linearly independent of each other.
An orthogonal matrix (like the ones in this video) is full rank but also satisfies the property that its rows and columns are orthonormal: any two different rows/columns have dot product 0, and each has unit length.
So for an orthogonal matrix, like you said, its transpose is its inverse. But that's not generally true for any full rank matrix. Looking back, I did say the wrong thing, and I'll go correct it in the comments. Thanks for pointing it out!
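For anyone who wants to see this distinction concretely, here is a minimal numpy sketch (the matrices are made-up examples, not from the video): a full-rank matrix whose transpose is not its inverse, next to an orthogonal matrix where it is.

```python
import numpy as np

# Full rank but NOT orthogonal: the columns are linearly independent,
# yet A^T A is not the identity, so A^T is not the inverse of A.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
print(np.allclose(A.T @ A, np.eye(2)))     # False

# Orthogonal: the columns are orthonormal, so Q^T Q = I and Q^{-1} = Q^T.
theta = np.pi / 4
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print(np.allclose(Q.T @ Q, np.eye(2)))     # True
print(np.allclose(np.linalg.inv(Q), Q.T))  # True
```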
Awesome !!
All the 'u' are linearly independent of each other, which means if we multiply the matrix U by its transpose, we will get the identity matrix. I don't get why linear independence leads to orthonormality?
Forget it, I just saw the *note*.
Awesome 👍
Hi Ritvik, I might have missed it in your video, but how do you get sigma?
How do we get U and V?
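Not the author, but in case a concrete sketch helps with the two questions above: in practice the sigmas, U, and V usually come from a numerical routine such as numpy's SVD, and the sigmas are the square roots of the eigenvalues of M M^T (equivalently M^T M). The matrix below is a made-up example, not the one from the video.

```python
import numpy as np

# Made-up 2x3 example matrix, only to show where U, Sigma, V come from.
M = np.array([[3.0, 2.0,  2.0],
              [2.0, 3.0, -2.0]])

U, s, Vt = np.linalg.svd(M, full_matrices=False)  # thin SVD
print(s)  # the sigmas (singular values), largest first -> [5. 3.]

# They are the square roots of the eigenvalues of M M^T
# (M^T M gives the same nonzero values, plus extra zeros when M is not square).
print(np.sqrt(np.linalg.eigvalsh(M @ M.T)))  # [3. 5.] in ascending order

# Sanity check: U * diag(s) * V^T rebuilds M exactly.
print(np.allclose(U @ np.diag(s) @ Vt, M))  # True
```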
But don't orthonormal matrices have to be square?
Isn't this the thin form of the SVD? And aren't you using the numerical rank in the practical example? Because A = U [Σ; 0] V^T, but since Σ is n×n and the zero block is (m-n)×n, we can write U = [U_1 U_2] where U_1 is m×n and U_2 is m×(m-n), thereby A = U_1 Σ V^T. Also, the columns of U_2 will be in the null space of A^T (?). And you skip to the rank-truncated matrix instead of explaining how u_{r+1}, ..., u_m will be a basis for N(A^T) and v_{r+1}, ..., v_n a basis for N(A).
Also, I'm still unsure how the eigenvectors of A^T A and A A^T tell you the most important information in A. Are we projecting the data onto the eigenvectors like in PCA?
The eigendecomposition and SVD videos are some of the most compact and understandable videos I have found on those topics; they made the link between the change of basis to the eigenbasis, then calculating the linear transformation, then going back to the original basis much clearer to me. Thanks.
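Not the author, but a quick numpy sketch of the thin-vs-full point raised above may help (the matrix is my own rank-1 toy example, not from the video): with the full SVD, the trailing columns of U span N(A^T), the trailing rows of V^T span N(A), and the first r singular triples alone already rebuild A.

```python
import numpy as np

# Toy 3x2 matrix of rank 1 (second column = 2 * first column).
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=True)  # full SVD: U is 3x3, Vt is 2x2
r = int(np.sum(s > 1e-10))                       # numerical rank, here r = 1

# u_{r+1},...,u_m span N(A^T); v_{r+1},...,v_n (trailing rows of Vt) span N(A).
print(np.allclose(A.T @ U[:, r:], 0))   # True
print(np.allclose(A @ Vt[r:, :].T, 0))  # True

# The rank-truncated (thin) form already reconstructs A exactly in this case.
A_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]
print(np.allclose(A_r, A))              # True
```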
tysm
OMG again!!!
This is a ~75% reduction, from 1000 entries to 222.
True! Thanks for pointing that out. I think I meant that now you only need around 25% of the storage.
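For anyone checking the arithmetic: assuming, purely for illustration, a 10 x 100 matrix with only the top two singular values kept, the counts work out to the 1000 and 222 entries mentioned above.

```python
# Hypothetical sizes chosen to reproduce the 1000 -> 222 figure; the actual
# dimensions in the video may differ.
m, n, k = 10, 100, 2

full_entries = m * n                     # storing M itself: 1000 entries
truncated_entries = k * (m + n + 1)      # k columns of U, k columns of V, k sigmas: 222
print(truncated_entries / full_entries)  # ~0.22, i.e. you keep roughly a quarter of the entries
```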
Bro opens your head, puts the information in, and closes it.
Thank you! You have been a life saver 🛟
The best! This is a 101 on how to have fun with math in DS 🫰