This amount of detail in such a short video, I call this glory.
The best explanation for the DLT.
Thank you for providing such a great quality of lectures 👏👏👍👍
😊 thanks
Thank you for your videos, one of the best lectures in Germany for robotics. I have learned a lot by watching your videos over the last two years.
Thanks
I can just say thank you, it has been the clearest explanation I have seen!
Could you please explain, with a real example, how we estimate the real-world coordinates of a certain car from 2D coordinates, in case I have a sequence of car images and I want to know its position in each image?
Exceptionally clear
You are the best!
Does this mean that DLT is not suitable for the standard checkerboard pattern? With the checkerboard pattern, all control points lie on a plane.
Yes
Hi Cyrill, I really like the lectures you are giving. Is the statistically optimal solution obtained through bundle adjustment (and I guess this is equivalent to doing a least-squares fit if we use more than 3 points from the world space/images)?
BA is a least squares approach and yes, you can exploit more points including uncertainties (I hope that answers the question).
Great video. I applied for your master's program in Bonn, but unfortunately I was not accepted.
Can you please do a video on Levenberg-Marquardt?
Thanks very much. I'm interested in applying this to camera and object motion solving from known 3D spatial coordinates and correlated 2D image tracking coordinates. How does this DLT method compare to the least-squares approximation method?
DLT works without an initial guess, which is a big advantage. Least squares will need one. However, LS will allow you to consider the non-linear parameters and to properly take uncertainties into account. So, starting with DLT for the initial guess and then refining with an LS approach could be the way to go (assuming you do not know your calibration params already).
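To make the DLT step concrete, here is a minimal sketch in Python/NumPy (variable names are mine, and point normalization, which you would want in practice, is omitted):

```python
# Minimal DLT sketch: estimate the 3x4 projection matrix P from
# n >= 6 world/image correspondences (points must not be coplanar).
import numpy as np

def dlt(X_world, x_img):
    """X_world: (n, 3) 3D control points, x_img: (n, 2) pixel coordinates."""
    rows = []
    for (X, Y, Z), (u, v) in zip(X_world, x_img):
        # Each correspondence yields two linear constraints on the 12 entries of P.
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z, -v])
    A = np.asarray(rows)
    # Null-space solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)  # P is homogeneous, i.e. defined only up to scale
```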
Would it suffice to use three or four points if they are along the axes of the world coordinate system, e.g. the corner of a room: o = [0,0,0], x = [0.2,0,0], y = [0,0.2,0], z = [0,0,0.2]? And eventually filling the ranks with 'reused' basis vectors (or the null vector?), since the other control points/vectors would be linear combinations of these basis vectors anyway?
No, you need to fix 11 DoF, so 6 points are needed (as each point generates a 2D observation vector, the pixel coordinates)
Hi, Cyrill. Awesome lecture and thank you for making this. But I ran into some weird trouble when I actually implemented a naive DLT in MATLAB. I basically generate a camera pose (R & t), 6 landmarks (not on the same plane) lying around the positive z-axis of the camera, and a K matrix with a positive camera constant. The t estimate I get is correct, but for R and K I need to multiply by the matrix you mentioned (R_z_pi = [-1 0 0; 0 -1 0; 0 0 1]) to make them match the ground truth. I am confused because what I understood from your tutorial is that in normal operation you get positive diagonal elements in K, and R_z_pi is only needed in the special case where you want a negative camera constant. Is my understanding wrong? Or is it correct and I just need to check the code again?
Sorry, I just found that MATLAB's default QR decomposition does not guarantee positive diagonal elements in R, so I manually fixed those negative signs (for both Q and R) and everything looks good now. Still not sure whether this is a correct method, because I have only tested it on several samples...
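For what it's worth, a sketch of that sign fix in Python (scipy.linalg.rq behaves like MATLAB's qr in that it enforces no sign convention; names are illustrative):

```python
# Decompose P = K [R | t] via RQ, forcing a positive diagonal in K.
import numpy as np
from scipy.linalg import rq

def decompose_projection(P):
    H, h = P[:, :3], P[:, 3]
    K, R = rq(H)                       # H = K @ R, but signs are arbitrary
    D = np.diag(np.sign(np.diag(K)))   # note D @ D = I
    K, R = K @ D, D @ R                # product unchanged, diag(K) now > 0
    if np.linalg.det(R) < 0:           # R must be a proper rotation
        R, h = -R, -h                  # flip the overall (homogeneous) sign of P
    t = np.linalg.solve(K, h)          # from h = K @ t
    return K / K[2, 2], R, t           # normalize so that K[2, 2] = 1
```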
We normally use a checkerboard for camera calibration and assume the world coordinates to lie on the board plane (Z_i = 0). How does the estimation of P work in this case? Also, we know the world points (because we know the checker dimensions), so we should be getting the true extrinsics, i.e. the absolute rotation and translation (just like OpenCV's solvePnP). Why is it mentioned that the P matrix is homogeneous?
I think checkerboard calibration uses a different algorithm.
Thank you for this video.
I have one question: how can we measure the coordinates of the control points of real objects that are used for estimating the camera parameters? How can we obtain these coordinates, which are stored in the vector X?
Isn't there any advantage in knowing the intrinsic parameters? Does using the intrinsics lead to more accurate solutions?
DLT estimates the extrinsics and intrinsics. In case you know them already, you do not need the DLT. Go for PnP solutions.
So PnP gives more accurate results for the extrinsic matrix compared to DLT? If not, there is no point in using the intrinsics.
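For reference, a minimal sketch of the PnP route with OpenCV (all values here are made up for the demo; the pixel observations are synthesized from a ground-truth pose with projectPoints):

```python
# Extrinsics from known intrinsics via OpenCV's solvePnP.
import cv2
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800, 240], [0, 0, 1]])  # known intrinsics
obj_pts = np.array([[0, 0, 4], [1, 0, 4], [0, 1, 5],
                    [0, 0, 6], [1, 1, 5], [1, 0, 6]], dtype=np.float64)

# Synthesize pixel observations from a ground-truth pose for this demo.
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.2, -0.1, 1.0])
img_pts, _ = cv2.projectPoints(obj_pts, rvec_gt, tvec_gt, K, None)

ok, rvec, tvec = cv2.solvePnP(obj_pts, img_pts, K, None)
R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
```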
Thank you for this lecture.
Do you have a video explaining how to get the projection matrix P, given the rotation matrix R and the camera matrix K? I have been trying to figure it out from my professor's slides for a week now. His slides say that P = diag(f,f,1)[I | 0] and that P = K[R|t]. ('R|t' means writing a vector into a matrix as an extra column; 'I|0' is never explained.) Do I need to multiply a 3x4 matrix with a 3x3 matrix? Do I pad the 3x3 matrix somehow?
Do I put the 3x3 matrix in front? 3x3 * 3x4?
diag(f,f,1) is a 3x3 diagonal matrix. I is the 3x3 identity matrix (no rotation). 0 is a 3x1 zero vector (no translation). K is the 3x3 intrinsic matrix. R is the 3x3 rotation matrix, and t is the 3x1 translation vector (a column).
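A minimal sketch of that composition in NumPy (values are illustrative):

```python
# Composing the 3x4 projection matrix P = K [R | t].
import numpy as np

K = np.array([[800.0, 0, 320],   # 3x3 intrinsics
              [0, 800, 240],
              [0, 0, 1]])
R = np.eye(3)                    # 3x3 rotation (identity: no rotation)
t = np.zeros(3)                  # 3x1 translation (zero: no translation)

Rt = np.hstack([R, t.reshape(3, 1)])  # [R | t]: append t as an extra column
P = K @ Rt                            # (3x3) @ (3x4) -> 3x4, no padding needed
```

So you do not pad the 3x3 matrix; you multiply K (3x3) from the left onto the 3x4 matrix [R | t].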
See my lecture on camera parameters here: ruclips.net/video/uHApDqH-8UE/видео.html
@CyrillStachniss Thank you very much.
If all the points lie on the same plane, then there is an infinite number of possible camera projection matrices, right? Since the nullity of the matrix will be strictly greater than 1?
You will simply have a rank deficiency in your matrix defining the linear system.
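A quick numeric illustration of that rank deficiency (NumPy; the projection and points are made up for the demo):

```python
# Coplanar control points make the DLT design matrix rank deficient.
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(10, 3))
X[:, 2] = 0.0                                 # force all points onto Z = 0
P = np.hstack([np.eye(3), [[0], [0], [5]]])   # some projection for the demo
x_h = (P @ np.c_[X, np.ones(10)].T).T
uv = x_h[:, :2] / x_h[:, 2:]                  # exact pixel observations

rows = []
for (Xi, Yi, Zi), (u, v) in zip(X, uv):
    rows.append([Xi, Yi, Zi, 1, 0, 0, 0, 0, -u*Xi, -u*Yi, -u*Zi, -u])
    rows.append([0, 0, 0, 0, Xi, Yi, Zi, 1, -v*Xi, -v*Yi, -v*Zi, -v])
# Rank is only 8 here, well below the 11 needed to determine P up to scale.
print(np.linalg.matrix_rank(np.asarray(rows)))
```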
R(z, 180deg), used at 30:10, is a rotation around the z-axis, i.e. the yaw angle, right? If the z-axis of the camera lies along the optical axis, then it rotates the x-y plane about the z-axis, and the image plane is still on the other side. Am I missing something here? I am a little confused; can I get some explanation on this?
You can completely skip your 3D imagination and resolve it in a different way:
Decomposing H^-1 leads to K^-1, the inverse calibration matrix with positive elements on the main diagonal and thus also for K. We, however, want the camera constant to be negative (so that we are in the coordinate system we want to use). So what can we do in order to achieve that?
We can introduce a 3x3 matrix, let's call it B, which has zero off-diagonal elements and -1, -1, 1 on the main diagonal, i.e. B = diag(-1,-1,1). Note that B is a rotation matrix (called R(z,pi) in the video). Also, note that B = B^T.
Now we can write H = K R = K B B^T R = K B B R. Our final calibration matrix, let's call it K', can now be written as K' = K B. The matrix R' = B R is a rotation matrix, as B and R are rotation matrices. Thus, through the introduction of B, we created a change in the orientation of the camera (R -> R') so that the calibration matrix now has the correct sign for the camera constant (K -> K'). That's it.
By design, the "magic" matrix B, which does that job for us, can be seen as a rotation by pi around the z-axis.
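A tiny numeric check of this argument (with illustrative values for K and R):

```python
# Verify: H = K R = (K B)(B R) with B = diag(-1, -1, 1).
import numpy as np

B = np.diag([-1.0, -1.0, 1.0])    # R(z, pi); note B == B.T and B @ B == I
K = np.array([[800.0, 0.5, 320],  # some K with positive diagonal
              [0, 810, 240],
              [0, 0, 1]])
R = np.eye(3)                     # any rotation matrix works here

K2, R2 = K @ B, B @ R             # K' now has a negative camera constant
assert np.allclose(K @ R, K2 @ R2)            # H is unchanged
assert np.allclose(R2 @ R2.T, np.eye(3))      # R' is still orthogonal
assert np.isclose(np.linalg.det(R2), 1.0)     # ...and a proper rotation
```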
@CyrillStachniss Makes sense. Thanks a lot for such an amazing course and for providing the explanation.
Given the projection matrix, how do we obtain the rotation matrix?
See video @24:30
Why don't you just stack the coordinates to get the matrices x and X and the equation x = MX, then take the Moore-Penrose pseudoinverse of X to get X^+, and find the best-fit M with M = x X^+?