9. Constraints: Visual Object Recognition
- Published: Nov 17, 2024
- MIT 6.034 Artificial Intelligence, Fall 2010
View the complete course: ocw.mit.edu/6-0...
Instructor: Patrick Winston
We consider how object recognition has evolved over the past 30 years. In alignment theory, 2-D projections are used to determine whether an additional picture is of the same object. To recognize faces, we use intermediate-sized features and correlation.
License: Creative Commons BY-NC-SA
More information at ocw.mit.edu/terms
More courses at ocw.mit.edu
R.I.P Patrick Winston
I wish the professor had extended the binary mask technique (for finding correlation) to higher dimensions and to non-binary cases.
Did you figure that out later?
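For anyone wondering what the non-binary case could look like, here is a minimal sketch using normalized cross-correlation on grayscale values. It is not taken from the lecture: the function names `ncc` and `best_match` and the toy image are made up for illustration, and a real system would use an optimized library routine instead of nested loops.

```python
import math

def ncc(patch, window):
    """Normalized cross-correlation of two equal-sized grayscale grids, in [-1, 1]."""
    p = [v for row in patch for v in row]
    w = [v for row in window for v in row]
    mean_p, mean_w = sum(p) / len(p), sum(w) / len(w)
    num = sum((a - mean_p) * (b - mean_w) for a, b in zip(p, w))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p) *
                    sum((b - mean_w) ** 2 for b in w))
    # A perfectly flat patch or window has no structure to correlate with.
    return num / den if den else 0.0

def best_match(image, patch):
    """Slide the patch over the image; return (score, row, col) of the best offset."""
    ph, pw = len(patch), len(patch[0])
    best = (-2.0, 0, 0)
    for r in range(len(image) - ph + 1):
        for c in range(len(image[0]) - pw + 1):
            window = [row[c:c + pw] for row in image[r:r + ph]]
            score = ncc(patch, window)
            if score > best[0]:
                best = (score, r, c)
    return best

# Hypothetical toy data: a small bright feature on a flat background.
image = [
    [10, 10, 10, 10, 10],
    [10, 10, 40, 90, 10],
    [10, 10, 20, 60, 10],
    [10, 10, 10, 10, 10],
]
patch = [[40, 90], [20, 60]]
score, row, col = best_match(image, patch)
```

Because `ncc` flattens both grids before comparing them, the same scoring function would work for 3-D volumes; only the sliding loop in `best_match` is specific to two dimensions.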
Interesting! The faces are easy to recognize when they're upside down or noisy, but not both. It seems that our brains rely on lower-level features specific to the individual face to recognize faces when they're upside down, and on higher-level specific features to recognize the face if the photo is noisy but right-side-up.
What I'm saying is that there are BOTH low-level and high-level features inside our brains that specifically identify Bill Clinton.
That's pretty accurate.
Nice view 0:30 1:17 showing us the professor instead of the material he shows to his students
Pc culture
This great man is worthy of being cloned. Your lectures are exquisite. Thanks, Mr. PATRICK WINSTON. Colombia resiste, May 2021.
Thank you for this great lecture.
I don't understand 17:20. Why are 3 objects sufficient for rotation about 3 axes plus translation, while 2 objects are sufficient for rotation about 1 axis plus translation? How so?
RIP.
6:30 Now Deep Learning is able to do this automatically. It is hard to describe the feelings I have when seeing this pre-deep learning era lecture with the current development of DL.
Eye enjoy this very much. :-)
He is comparing only two images to find the corresponding points in the third image. So shouldn't he select only two points and get all the other corresponding points in the third image directly? Why does he need to select three points to get the other corresponding points? #MIT
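A small numerical sketch may help with the "how many points" question. It covers only the simplest case and rests on stated assumptions: orthographic projection, rotation about the vertical axis only, and no translation. Under those assumptions the x-coordinate in a third view is a linear combination of the x-coordinates in two stored views, so two corresponding points are enough to solve for the two coefficients. Allowing translation or more rotation axes adds unknowns, and so needs more points. The names `project_x` and `fit_alpha_beta`, the sign convention, and the sample points are all made up for illustration.

```python
import math

def project_x(point, theta):
    """Orthographic x-coordinate after rotating about the vertical (y) axis by theta."""
    x, _, z = point
    return x * math.cos(theta) - z * math.sin(theta)

def fit_alpha_beta(xa, xb, xc):
    """Solve x_c = alpha * x_a + beta * x_b using the first two points (Cramer's rule)."""
    det = xa[0] * xb[1] - xa[1] * xb[0]
    if abs(det) < 1e-12:
        raise ValueError("the two chosen points do not determine alpha and beta")
    alpha = (xc[0] * xb[1] - xc[1] * xb[0]) / det
    beta = (xa[0] * xc[1] - xa[1] * xc[0]) / det
    return alpha, beta

# Hypothetical 3-D points on one rigid object.
points = [(1.0, 0.5, 2.0), (-0.5, 1.0, 1.0), (2.0, -1.0, 0.5), (0.3, 0.2, -1.5)]

xa = [project_x(p, 0.2) for p in points]  # stored view A
xb = [project_x(p, 0.9) for p in points]  # stored view B
xc = [project_x(p, 0.5) for p in points]  # unknown view C

# Fit on two correspondences, then predict every point in view C.
alpha, beta = fit_alpha_beta(xa, xb, xc)
predicted = [alpha * a + beta * b for a, b in zip(xa, xb)]
max_error = max(abs(p - c) for p, c in zip(predicted, xc))
```

If the third picture shows a different object, the coefficients fitted on the first two points will generally fail to predict the remaining points, which is what makes the check useful for recognition.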
thank you for sharing.
What software does Professor Winston use for the demonstrations?
Much of the material in 6.034 is reinforced by on-line artificial-intelligence demonstrations developed by us or otherwise available on the web. Those demonstrations developed by us are provided via the easy-to-use Java Web Start mechanism, which comes with the Java Runtime Environment, the so-called JRE. See the "Demonstrations" section of the course on MIT OpenCourseWare at: ocw.mit.edu/6-034F10.
Boom! Tetris for Jeff!
where
Somebody get this man some oxygen!
I don't think he needs oxygen anymore since he's dead.. dick
Where is lecture 8 ? Or is it mistakenly labeled ?
Here's lecture 8: ruclips.net/video/dARl_gGrS4o/видео.html. For more info and course materials, visit MIT OpenCourseWare at: ocw.mit.edu/6-034F10. Best wishes on your studies!
This lesson is great. I think the teacher is very tired :)
Think that's his style.. Since he's been like that from the very first lecture..
But is he useful, and can you do a project by the end of the course?
Can anybody explain why, in the projections, he *subtracts* the Ys*Sin(theta) instead of adding it? If we are in a vector space, subtracting Ys*Sin(theta) would mean our new point is going to be down and under our current point, and not up and above like it is shown on the graph... Did he make a mistake or did I just miss something?
I'm not familiar with linear algebra, so I'm not really sure. In fact, I don't really understand why the Ys*Sin(theta) term was included at all when Xs*Cos(theta) seemed to do just what he needed.
That said, focus on the fact that he's solving for Xa there. Whether Ys*Sin(theta) is included in the equation or not, it won't affect the point's upward position on the graph, only the horizontal position. It may also be helpful to note that he states that he subtracts because the Ys*Sin(theta) vector is going in the wrong direction.
It's a subtraction because what he's actually doing is rotating the entire triangle. I wasn't sure this was possible so I derived the expression for Xa from Xa = s*cos(theta_s+theta_a) and s = Xs/cos(theta_s), and when you do some manipulations you can achieve the same result. Because he's rotating the entire triangle, there is an x component now associated with the angle y_s is on. He takes the new x component of the old x component, Xs, shown by Xs*cos(theta_a), and the x component of the old y component, Ys, shown by Ys*sin(theta_a) to get the new resultant x component. Essentially he treats X_s as the hypotenuse to a new triangle, the x component of which is the same as Xa. I feel like I've explained this poorly, if you still need clarification flick me a pm and I'll upload a graph and my derivation somewhere.
The sign there is just a choice. He intuitively used minus because the new positions were getting "shorter".
I understand it this way: Take x_s first. x_s*cos(theta) gives you the projection of x_s "beneath" the xs (counter-clock-wise direction). Now, if you take x_s*cos(-theta) you take the projection of x_s in clock-wise direction which is what we want. But since cos(-theta) = cos(theta) you don't see a minus there. Now, for y_s, you have y_s*sin(-theta) which gives you -y_s*sin(theta).
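The derivation mentioned a few replies up can be written out explicitly. This is a sketch assuming a point at distance s from the origin at angle theta_s, rotated counter-clockwise by theta_a. The symbols follow that reply; the counter-clockwise convention is an assumption.

```latex
\begin{align*}
x_s &= s\cos\theta_s, \qquad y_s = s\sin\theta_s \\
x_a &= s\cos(\theta_s + \theta_a) \\
    &= s\cos\theta_s\cos\theta_a - s\sin\theta_s\sin\theta_a \\
    &= x_s\cos\theta_a - y_s\sin\theta_a
\end{align*}
```

So the minus sign falls out of the cosine angle-addition identity. Rotating in the opposite direction (replacing theta_a with -theta_a) turns it into a plus.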
very interesting, now back to my RPi3 and opencv AI logic. thanks.
I am just a B.Tech first-year student, with a small query. With orthographic projections, we take views from mutually perpendicular directions. If my coordinate system is set with its axes parallel to our viewing directions, won't the computation be much easier? The views along the x axis and along the y axis will always have the same z coordinate, and those along x and z will have the same y coordinate. So won't these conditions actually give the object in 3D (I mean a 3-dimensional array with known XYZ coordinates of all vertices)? We could then rotate it and check whether we can generate similar 2D images from the 3D view.
experts please check this one
jasdeep singh Grover try asking the question on stackoverflow, man.
Is that a useful course, mate?
Not sure what you are asking but you don't get to take "mutually perpendicular directions" of pictures in practice.
hahah, yeah, the power of storytelling :D not the power of love or power of dream :D is the real power! the other two are fake power, just for propaganda.
"It's still not solved." Why am I wasting my time watching this then? Maybe I should just go back to actually solving it and shit.
Just fed the image to ChatGPT. It recognizes it.