Just want to say, you are a really gifted teacher. I have seen so many videos from this channel and all of them hit the right spot: clear, deep enough, but concise. And with only one whiteboard, animation is not even needed to understand your teaching. Very impressive. Thank you so much. I would love to support you and am looking forward to all your content in the future.
Agree partly, but would not use the word gifted. I guess he just put in the hours and learned to be good
You did want to say. That’s why you did.
Finally someone took this up !!!
Thank you so so much Ritvik, you are Amazing.
Thanks :)
One of the most intuitive videos I have seen yet, great job!
Hey Ritvik, I just discovered your channel and wanted to say it is super helpful for me.
Quick sidenote: at first I agreed with you that we need 9 data points to store the projected vectors, but now I think we only need 8. This is because the unit vector u needs only one number to store, namely the angle -- we already know its length is 1.
Thank you very much for making these videos. I am learning so much!
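A minimal sketch of the storage point above, assuming the data is 2-D (the vector values here are made up for illustration): a unit vector can be rebuilt from its angle alone, so one number is enough to store it. In n dimensions a unit vector needs n - 1 numbers, so the saving is always exactly one.

```python
import numpy as np

def unit_from_angle(theta):
    """Rebuild a 2-D unit vector from its angle (in radians) to the x-axis."""
    return np.array([np.cos(theta), np.sin(theta)])

u = np.array([3.0, 4.0]) / 5.0      # a unit vector stored as two numbers
theta = np.arctan2(u[1], u[0])      # the same direction stored as one number
u_rebuilt = unit_from_angle(theta)

print(np.allclose(u, u_rebuilt))
```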
I'm very new to the actual math behind these concepts, so first off want to give big thanks to this channel for your absolutely dedicated work.
I want to leave this comment just in case anyone else runs into the, admittedly novice, problem I experienced, and to show how I solved it. Sorry to all you bigwigs who are going to find this blatantly obvious, but I've never taken a linear algebra or calculus class before; totally self-study.
From the vector projection video we have the formulation: the projection of X onto vector V is called p and is equal to (X dot V / magnitude(V)) times (V / magnitude(V)):
p = ((X • V)/||V||)*(V/||V||)
When I saw the formulation here, as:
U1transpose * Xi * U,
p = U1T * Xi * U, I was totally lost and didn't see where that came from. Again, I know this is an incredibly stupid question now, but I first had to rewatch the vector projection video to make sure I didn't miss anything obvious, then break the question down into what I did and didn't understand, and finally arrived at how to make it all fit together.
Solution:
V/||V|| = U, directly from the vector projection video, and finally after some time staring at it
U1 • Xi = U1T*Xi
So we have
((Xi • U1)/||U1||)*(U1/||U1||) =
U1T * Xi * U,
Or to make your math come out exactly:
(U1T * Xi) * U = (Xi • U1) * U
And
((U1T * Xi) * U)/||U1|| = ((Xi • U1)/||U1||) * U
Sorry, again I know this is obvious now but just wanted to leave it in case some other newbie stumbles across the same "issue".
Thanks Ritvik!!!
First off, I really appreciate the kind words. And second, I am extremely grateful to you for writing all this out. People learn in different ways and neither I nor any other person/textbook/resource out there can account for all of them. I undoubtedly think your comment will help others.
@@ritvikmath to be honest, I should also make it clear that
U1T * Xi * U works perfectly if U1 = U = unit vector with ||U|| = 1, which you stated explicitly in your video was the case in your example. Therefore, my formulations only apply if you are too lazy to turn U1 into a normalized unit vector U (for example, I just used my vector V from the projections video without realizing that it should have been normalized first, lol, straight over my head). Turns out your answer needed no explanation from me, but I really am thankful that you provide this content; I love it!
My favorite challenge so far has been proving that the "derivative" of XT * X is 2X, from the Lagrange video, which like you said I would be able to handle on my own without you needing to prove it in the video, AND I DID! You are amazing sir, thanks
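For anyone who wants to check the two formulations from this thread numerically, here is a small NumPy sketch. The vectors `x` and `v` and the helper name `project` are made up for illustration; the only claim is that the general formula and the unit-vector shortcut agree once `v` has been normalized.

```python
import numpy as np

def project(x, v):
    """Vector projection of x onto v; v does not need to be unit length."""
    # ((x . v) / ||v||) * (v / ||v||) simplifies to ((x . v) / (v . v)) * v
    return (np.dot(x, v) / np.dot(v, v)) * v

x = np.array([2.0, 3.0])
v = np.array([4.0, 1.0])
u = v / np.linalg.norm(v)           # normalize first, so ||u|| = 1

p_general = project(x, v)           # works for any nonzero v
p_unit = (u.T @ x) * u              # u^T x is a scalar, then scale u

print(np.allclose(p_general, p_unit))
```

Skipping the normalization and computing `(v.T @ x) * v` directly would be off by a factor of ||v|| squared, which is the slip described above.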
Thank you!! Finally videos that connect LA to DS! Very good explanation and really like how you provide application/use-case before starting the lecture.
Thank you
Crystal clear, thank you for an amazing lecture.
You're very welcome!
@@ritvikmath albeit your camera isn't so clear. It goes out of focus a few times. Maybe it's on auto and focuses on you while you move back and forth but it should be manually set to a fixed distance.
Thank you!
Big thanks
For those who are checking videos for their ML exams. This is the best course for university students: concise and profound. Not so shallow as most top ML videos, but not so purely math-driven that it might be hard to understand.
I have watched this video several times, and each time I learned something new. So far my understanding is getting better. The best explanation of how and why.
Light example is quite intuitive!!! I just landed to watch your Eigen vector video but ended up watching all your videos...
Same! Thank you so much for the great content!
same here, just ended up watching all. It's so magic
Same here hahah :D
Side Note: The variable we were looking for was k. The answer you gave, by identification of the last result, is dot(x, v) / norm(v), which is totally correct. However, there is another way to get there pretty quickly.
The dot product between two vectors **is** the product between the norm of the projected vector and the norm of the vector on which we project (the dot product is commutative). From there we can easily see that, if we only need the norm of the projected vector (k), all we have to do is divide the dot product by the norm of the vector on which we project. Hence: dot(x, v) / norm(v).
Yep, that is proj_v (x) = x•v / ||v|| which will become the coordinate of x on the principal component axis.
This can also be seen from
x•v = ||x|| ||v|| cos theta
In PCA the direction we project onto will be an eigenvector of the covariance matrix. It's the optimal one for retaining info. Matrix multiplication is a bundle of dot products, so the loadings in PCA are eigenvector elements.
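A quick numerical check of the two routes to k mentioned in this thread, using made-up vectors: dividing the dot product by norm(v), and going through the angle with ||x|| cos theta. This is only a sketch of the identity, not anything taken from the video.

```python
import numpy as np

x = np.array([3.0, 4.0])
v = np.array([2.0, 0.0])

# Route 1: dot product divided by the norm of the vector we project onto.
k = np.dot(x, v) / np.linalg.norm(v)

# Route 2: x . v = ||x|| ||v|| cos(theta), so k = ||x|| cos(theta).
cos_theta = np.dot(x, v) / (np.linalg.norm(x) * np.linalg.norm(v))
k_from_angle = np.linalg.norm(x) * cos_theta

print(k, k_from_angle)
```

Note that k is a signed length: it comes out negative when the angle between x and v is larger than 90 degrees.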
Thank you for providing the actual derivation and intuition behind the equation.
Awesome ...this made vector projection very clear..I think the idea of deriving it in algebra makes it so much more clear
Beautiful and coherent explanation! Thank you! 💯
You're so welcome!
Love this!
Thanks!!
This guy is one of the heroes of the internet
thank you
You're welcome
Thanks Rithvik, your explanations are really good. If I may make a request, how about writing a book on the math behind ML? Or if there are any books you can suggest, kindly let me know. I would like to keep referring to them regularly. On another note, for people like me, who have spent a lot of their career in IT, sometimes going just one level lower would be helpful. A little bit more explanation or depth is required. For example, in your covariance video, I found it difficult to understand how the apple-banana covariance is applied to vectors. The multivariate formula especially is difficult to understand. If you could emphasize deriving the multivariate case from the univariate one in videos or blogs, that would be super helpful. I wish you good luck in your journey and am happy to discuss more if you have some free time.
this is an excellent explanation - thanks!
Awesome explanation. Thanks a lot!!
It helped me make more sense of linear regression!!!
Amazing content im into watching the 20th video now i guess
Why didn't you add a bar or some way to show p is a vector and d is a vector and u is a vector by using some common way of indicating it is a vector ? Just curious.
Just to clarify, could we say at the end there that the unit vector of v squared times x is the projection vector?
If you are watching this and you don't follow: I watched this when it first came out and I was in the same boat. I am coming back now to review some concepts and it all makes so much more sense!
I do not know how to thank you man. things get easier when I watch your videos. I really want to see this channel picking up with more subscribers and views.
I suggest you start charging for data science courses, like 10 streamed courses or key concepts in DS. I wish you well mate
Hi. Awesome video. If you could maybe suggest some books/ papers for each individual concept like PCA or SVM that would be wonderful for all the novice researchers like me out there.
Very good lecture ..People like u gift us more knowledge through these kind of videos..Thank you so much
God bless u mann !!! going through Data Science Basics series and it is wonderful !!
Thanks a lot for kind explanation. these lecture should be big help for my automatic appraisal business.
It doesn't seem to make sense to me: you do P * d = k*x*u - k*u*u = 0, and u*u equals 1, so you get P*d = k*x - k. But later you got P = (x*u)u -> x * v/||v|| * u; the latter formula also seems to indicate P = x according to your P*d theory?
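One way to write the orthogonality step out in full, assuming the notation used elsewhere in this thread (p = k u is the projection, d = x - p is the leftover part, and u is a unit vector). This is a sketch of the standard derivation, not a transcript of the whiteboard:

```latex
\begin{aligned}
p \cdot d &= (k\,u) \cdot (x - k\,u) \\
          &= k\,(u \cdot x) - k^{2}\,(u \cdot u) \\
          &= k\,(u \cdot x) - k^{2} = 0
          \quad\Longrightarrow\quad k = u \cdot x \quad (k \neq 0), \\
p &= (u \cdot x)\,u = \frac{x \cdot v}{\lVert v \rVert^{2}}\, v
   \qquad \text{with } u = \frac{v}{\lVert v \rVert}.
\end{aligned}
```

The second term carries k squared, not k, and u · x is a single number that scales u. It cannot be regrouped as x (u · u), so the result is not x in general.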
Thank you so much for your clear explanation
amazing my hero professor
Thanks!!
welcome :)
I can't help but wonder if there are some limitations to vector projection. Its working to compress data, but isn't some data lost in the process? Can we recover the X1 by X2 table from the vector projection method of storage if we wish to? That is to ask: how are the directions of the other vectors being stored within the vector projection?
It is the same as fitting a histogram of a population to a Gaussian normal distribution: the particulars of the data are lost but its features are retained, and it allows us to bypass privacy restrictions too.
brother u r doing a great job. it's a commentary on the system that u have so few views.
at 11:16 does it mean that if the target vector is shorter or longer such that the projection is unable to fall at the right angle, then projection cannot happen?
i love u
hey! really like all your awesome content. PCA brought me here... but please fix the camera focus :P
9:33 p+d=x? that can't be true! maybe you mean pythagorean theorem?
No bro,
Triangle law
p and d are vectors so walk p steps, turn, then walk d steps to get to the ultimate destination of x.
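To see the triangle law and the Pythagorean theorem side by side, here is a small NumPy sketch with made-up numbers. The vectors add (p + d = x), while the Pythagorean relation holds for their lengths, because p and d are perpendicular.

```python
import numpy as np

x = np.array([2.0, 3.0])
u = np.array([0.6, 0.8])            # unit vector we project onto

p = np.dot(x, u) * u                # projection of x onto u
d = x - p                           # what is left over

# Triangle law: as vectors, p + d gives back x.
print(np.allclose(p + d, x))

# Pythagoras: for the lengths, ||p||^2 + ||d||^2 = ||x||^2, since p . d = 0.
print(np.isclose(np.dot(p, d), 0.0))
print(np.isclose(np.dot(p, p) + np.dot(d, d), np.dot(x, x)))
```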
Great video. Confused regarding the memory reduction in the first half of the video. When projecting a data point (vector) onto another vector (v), there is no way to get the original data point back. Correct? I believe that in order to do so, one would also need to know its projection onto the vector normal to v.
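A minimal 2-D sketch of that point, with made-up numbers and a hand-picked unit normal `n`: keeping only the coordinate along u loses information, while keeping the coordinates along both u and its normal recovers x exactly.

```python
import numpy as np

x = np.array([2.0, 3.0])
u = np.array([0.6, 0.8])            # unit vector we project onto
n = np.array([-0.8, 0.6])           # unit vector normal to u (assumed 2-D)

a = np.dot(x, u)                    # coordinate of x along u
b = np.dot(x, n)                    # coordinate of x along the normal

x_from_one = a * u                  # lossy: the part along n is gone
x_from_both = a * u + b * n         # exact: both coordinates were kept

print(np.allclose(x_from_one, x), np.allclose(x_from_both, x))
```

Storing both coordinates costs as many numbers as the original point, so the memory saving comes precisely from discarding the normal component.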
Thank you for the video. MSC data science student
great explanation
Glad it was helpful!
Can someone pls explain how projecting X on the unit vector helped us??
why is p+d=x? It's supposed to be the Pythagorean theorem, p squared + d squared = x squared
thank you please make video for partial derivatives as optimisation require partial derivatives
went from hero to zero real quick hahahha
Isn't it more straightforward to first normalize v so that the length of the projection of x onto v is simply the dot product of x.v?
Want to learn more on auto regression, can u do more..?
more videos coming up soon!
(13:19) If p=(x.u).u then it follows that p= x.(u.u) == x
So is p=x? Doesn't look right!!
Yeah he made a mistake in the end.
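A quick check of the regrouping in question, with made-up numbers. In (x.u).u the first dot is a dot product, which returns a scalar, and the second is scalar-times-vector, so the brackets cannot be moved the way they can with ordinary multiplication:

```python
import numpy as np

x = np.array([2.0, 3.0])
u = np.array([0.6, 0.8])            # unit vector, so u . u = 1

p = np.dot(x, u) * u                # (x . u) u : scalar times vector
regrouped = x * np.dot(u, u)        # x (u . u) : this is just x

print(p, regrouped)
print(np.allclose(p, regrouped))
```

The two only coincide when x already lies along u.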
projection of a vector on an another vector is what??scalar or vector??
Vector
1. Vector projection
2. Scalar projection
Your videos are the best videos among all the videos I have watched. Could you please make a video on the math behind SVM
good idea! I'll look into it
Thank you so much. I have been flunking my Coursera LA quiz on this and was getting frustrated. Vector projection just seemed so arbitrary. Now, with the understanding from your vid that it is part of larger schemes to compress data and much more, I am much better motivated to continue studying LA.
Extra Special Easter Egg - I am watching season 3 of Silicon Valley. Right after your vid I watched the episode where the fired Nucleon programmers discover the missing link in their understanding of the "middle out" algorithm.
My name is Rithvik!!!!
nice !
YOUR TITLE IS WRONG. Vector Projection. When you are doing Scalar Projection. OOPS. it means you are doing PARROT TEACHING - NOT UNDERSTANDING.
Also, at the start of the video, you fail to emphasize that the angle of projection is always 90 degrees to the "onto vector" (later it is clear). You assume the viewer knows this, but you are ambiguous in the first diagram with the little x on each side. The focus is lost because you did not bridge the gap between "Scalar Projection" and "Vector Projection".
More confused...Cant get anything..I am not mathematically strong.
It's ok if you are not mathematically strong, we can all learn! Do you have any specific questions?
bump
could you explain IN AN INTUITIVE WAY why am i using dot products in the perceptron? does it have to do with projections?
def net_input(self, X):
    """Calculate net input"""
    return np.dot(X, self.w_[1:]) + self.w_[0]
mllog.github.io/2016/11/04/Python-ML-Chapter-2/#bellow-is-implemetation-of-the-perceptron-learning-algorithm-in-python
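One possible way to connect the two ideas, as a sketch with made-up weights and inputs (the `w_` layout, with the bias in `w_[0]`, is copied from the snippet above; everything else is hypothetical): the dot product with the weight vector is the scalar projection of each sample onto the weight direction, scaled by the weight norm. The net input therefore measures how far along that direction a sample lies, shifted by the bias.

```python
import numpy as np

w_ = np.array([-1.0, 3.0, 4.0])     # hypothetical: w_[0] is the bias, w_[1:] the weights
X = np.array([[1.0, 1.0],
              [-1.0, 0.0]])         # two made-up samples, one per row

net = np.dot(X, w_[1:]) + w_[0]     # same expression as net_input above

w = w_[1:]
scalar_proj = X @ w / np.linalg.norm(w)              # length of each sample along w
net_alt = scalar_proj * np.linalg.norm(w) + w_[0]    # rescale and add the bias

print(net, net_alt)
```

The sign of the net input then says which side of the decision boundary a sample falls on.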
nn