Real-Time Head Pose Estimation: A Python Tutorial with MediaPipe and OpenCV
- Published: 26 Aug 2024
- Inside my school and program, I teach you my system to become an AI engineer or freelancer. Lifetime access, personal help from me, and I will show you exactly how I went from a below-average student to making $250/hr. Join the High Earner AI Career Program here 👉 www.nicolai-ni... (PRICES WILL INCREASE SOON)
You will also get access to all the technical courses inside the program, including the ones I plan to make in the future! Check out the technical courses below 👇
_____________________________________________________________
In this video 📝 we are going to do Head Pose Estimation with MediaPipe and OpenCV in Python. We use MediaPipe's face mesh detection to get the landmark points that feed the pose estimation in OpenCV. MediaPipe has a lot of built-in, customizable machine learning solutions that we are going to take a look at in upcoming videos. MediaPipe is among the newest and fastest machine learning solution frameworks and runs on common hardware, as we will see throughout this tutorial.
If you enjoyed this video, be sure to press the 👍 button so that I know what content you guys like to see.
_____________________________________________________________
🛠️ Freelance Work: www.nicolai-ni...
_____________________________________________________________
💻💰🛠️ High Earner AI Career Program: www.nicolai-ni...
⚙️ Real-world AI Technical Courses: (www.nicos-scho...)
📗 OpenCV GPU in Python: www.nicos-scho...
📕 YOLOv7 Object Detection: www.nicos-scho...
📒 Transformer & Segmentation: www.nicos-scho...
📙 YOLOv8 Object Tracking: www.nicos-scho...
📘 Research Paper Implementation: www.nicos-scho...
📔 CustomGPT: www.nicos-scho...
_____________________________________________________________
📞 Connect with Me:
🌳 linktr.ee/nico...
🌍 My Website: www.nicolai-ni...
🤖 GitHub: github.com/nic...
👉 LinkedIn: / nicolaiai
🐦 X/Twitter: / nielsencv_ai
🌆 Instagram: / nicolaihoeirup
_____________________________________________________________
🎮 My Gear (Affiliate links):
💻 Laptop: amzn.to/49LJkTW
🖥️ Desktop PC:
NVIDIA RTX 4090 24GB: amzn.to/3Uc7yAM
Intel I9-14900K: amzn.to/3W4Z5Cb
Motherboard: amzn.to/4aR6wBC
32GB RAM: amzn.to/3Jt2XVR
🖥️ Monitor: amzn.to/4aLP8hh
🖱️ Mouse: amzn.to/3W501GH
⌨️ Keyboard: amzn.to/3xUGz5b
🎙️ Microphone: amzn.to/3w1F1WK
📷 Camera: amzn.to/4b4Ryr9
_____________________________________________________________
Tags:
#HeadPose #PoseEstimation #OpenCV #MediaPipe #ComputerVision #MachineLearning
Join My AI Career Program
www.nicolai-nielsen.com/aicareer
Enroll in My School and Technical Courses
www.nicos-school.com
Where is the source code of this project / tutorial?
@@Studio-gs7ye he never made it available...unless you pay
Hi Nicolai, I am working on the same problem, but I have some questions:
1. What if a person adjusts his seat? I think your solution will not work then.
2. What if a person adjusts his laptop screen? Then the code will also not work as required.
Can you please suggest how to tackle these problems?
Hello, I would like to add some details for anyone looking to implement this method. The MediaPipe docs state that the z coordinate is in roughly the same scale as the x coordinate, so you should also scale z with the image width. Next, the 3D points that you feed to solvePnP should be constant for every frame: keep the 3D points of a single frame and pass them with the 2D points of each new frame to the solvePnP function. The angles will then actually be in degrees and will not need denormalization (you should not multiply the angles by 360). You should, however, convert them to radians and use sin/cos functions to find the second (extended) point starting from the nose. The coordinates for this second point are (int(nose_landmark[0] - math.sin(y) * math.cos(x) * 50), int(nose_landmark[1] + math.sin(x) * 50)), where 50 is the 3D length of this line; you can change it to make the line larger or smaller. Don't hesitate to reply for more details!
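For anyone implementing the comment above, the endpoint formula can be sketched as a small standalone function (pure Python; the nose pixel coordinates and angles below are placeholder values, not from the video's code):

```python
import math

def nose_direction_endpoint(nose_2d, pitch_rad, yaw_rad, length=50):
    """Compute the 2D endpoint of the head-direction line starting at the
    nose landmark, following the sin/cos projection described above.
    pitch_rad (x) and yaw_rad (y) are head angles in radians."""
    x2 = int(nose_2d[0] - math.sin(yaw_rad) * math.cos(pitch_rad) * length)
    y2 = int(nose_2d[1] + math.sin(pitch_rad) * length)
    return (x2, y2)

# Head facing straight at the camera -> the line endpoint stays at the nose.
print(nose_direction_endpoint((320, 240), 0.0, 0.0))  # (320, 240)
```

The `length` argument controls how far the drawn line extends, matching the 50 in the formula above.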
Thanks a lot, please provide all the details u have
The results are very very bad tho when u scale the depth and have fixed 3d reference points. Can't be used for anything
@@NicolaiAI I get very satisfying results. I will update you when i upload my project on github!
great, please let me know when u have uploaded it! Appreciate it :)
@@NicolaiAI I also tried using fixed 3D points. The results are indeed bad for the x and y angles, but really good for the z angle (which cannot be said when using changing 3D points). Would love to know a "universal" solution to get good x, y, z angles.
Also, it turns out that solvePnP isn't very stable/robust. Do you know other methods that can be more robust?
I'm trying to use some of your tutorials to make a couple of addons for a 3d software called blender, I really appreciate how thorough you are in your explanation of your code!
Thanks a lot!
Great video! You could use the blue line as a mouse on screen! A nose mouse! Never have to take your fingers off the keyboard again! Input your monitor size and ratio, perform a little calibration, and you might have something. I wonder what you could do to stabilize the blue line... some filtering? That would make for a great video!
Thanks a lot! Great idea actually and a cool useful project
I didn't find this project on your GitHub?
Hi Nico, thank you for the informative video. Just a note - the yaw readings are wrong. The max output of the head yaw (rotation around the y axis) is 30 degrees, even when I turn my head 90 degrees relative to the camera.
Did you figure out what went wrong with the yaw readings? And what did you do to fix it, just multiplying it by 3 or something non-trivial?
I also had the same observation. Do you have any updates on this? Many thanks!
The same problem here. Why is he not answering this question? Has anyone been able to solve it?
If anyone has trouble with FACE_CONNECTIONS it is now FACEMESH_CONTOURS as far as I know
Thanks for this video. This helped me troubleshoot a project I'm working on.
I'm working on a dataset tool that processes images of faces and exports data of the landmarks and the rotation angles (the pitch, roll, and yaw of the head).
Did you find the source code?
Hey Hi, this is simply awesome, but while I'm running the code on my PC I'm getting a division-by-zero exception at line 116 (totalTime = end - start). Please give me clarity on this, Nic. Thank you
Hey. Keep up this good work. Your computer vision videos are really interesting.
Hi thank you very much! Really appreciate it
can't find that code on your github can give link of that plz i really need that.....
hi sir if I want to enhance this code to calculate depth to detect antispoofing, how can I do this?
Excellent, very helpful. Thank you for the tutorial.
Glad it was helpful!
I have a question. At line 83, why do we multiply by 360? angle[0] in the video is a sin or cos value, isn't it?
I did the same coding for my application but the line is really unstable!
You can try with an average filter for some simple smoothing
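A minimal sketch of such an average filter (a hypothetical helper, not from the video's code), which could be applied per frame to the projected line endpoint or the angle values:

```python
from collections import deque

class MovingAverage:
    """Simple moving-average filter to smooth jittery per-frame values,
    e.g. the projected nose-line endpoint coordinates or the head angles."""
    def __init__(self, window=5):
        self.values = deque(maxlen=window)  # keeps only the last `window` samples

    def update(self, value):
        self.values.append(value)
        return sum(self.values) / len(self.values)

# Feed one noisy value per frame; the output lags slightly but is smoother.
smooth_x = MovingAverage(window=3)
print(smooth_x.update(10))  # 10.0
print(smooth_x.update(20))  # 15.0
print(smooth_x.update(30))  # 20.0
print(smooth_x.update(40))  # 30.0  (window slides: averages 20, 30, 40)
```

A larger window gives a steadier line at the cost of more lag; one filter per coordinate is enough.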
bro can we train a model with python to get accurate results, what do u say?
Thanks for your tutorial. I guess it would be better to measure the computation time via the time.perf_counter() method. It may also be more correct to take the end time right before displaying the results, i.e. right before the cv2.imshow() call and not before the mp_drawing call; as a result, the (correct) FPS will be slightly lower.
The FPS should be calculated without any drawings or visualizations since only the algorithm should be timed. I'm using the perf counter in recent videos
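A minimal sketch of that timing approach, with a commented region standing in for the per-frame detection work (the zero-interval guard also addresses the division-by-zero error some viewers reported):

```python
import time

start = time.perf_counter()
# ... run only the face mesh + solvePnP computation here,
# not the drawing utils or the cv2.imshow() call ...
end = time.perf_counter()

total_time = end - start
# Guard against a zero interval before dividing.
fps = 1.0 / total_time if total_time > 0 else 0.0
print(f"FPS: {fps:.1f}")
```

time.perf_counter() uses the highest-resolution clock available, so a zero interval is far less likely than with time.time().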
Dude this video was great. Is there any other tutorials on how to do this for makeup try on?
Or implementing 3d object on the user's face?
Thanks for your tutorial
Thanks a lot for watching! Hope that u can use it
The video is really great. May I know where the center (origin) of the face is? I'm trying to invert the rotation angles to make the face point toward the camera. Thank you very much.
I think the roll (z) is not very well calculated, because when I print it and test it by moving my head it changes only a little, while the yaw (y) moves more than it does.
Also, when you are facing the camera, why does the nose line point a little bit to the right by default?
Hey Nicolai, it's a very well-explained video for head pose estimation. I just want to ask: if we don't face the camera, how do we indicate that the user is not looking at the camera?
Thanks a lot! If u are not facing the camera it won't be able to do detection, so I could just add another if statement for looking away
@@NicolaiAI Thank you for pointing that out. Now I know I shouldn't try to implement this, since it is not enough to solve my project problem - detecting where people in front of a mobile robot are looking. Could you please recommend some stuff for that?
@@iam.damian You could try to detect just a face. If it doesn't find a face then no one is in front of the mobile robot, or you can try to detect the iris...
I want to get the value of the roll movement of my head; which part should I change?
Hi Nicolai, I watched your video. It's very good. I have a few questions. I am trying to put the lines above the pupils and I want the program to tell where the person is looking. How can I do it?
Hello, shouldn't the positions of the 'img_h/2' and 'img_w/2' variables in the cam_matrix camera matrix on line 66 of the code be exactly reversed? Why is 'img_w/2' on the lower row?
how can i draw the bounding box?
Thank you for your video! But I get: module 'mediapipe' has no attribute 'solutions'. Where can I find it?
Good video, mate! Can I add closed-eye and yawning detection to this MediaPipe pipeline? Could you do a video about it?
Amazing Job . Thanks
Thanks for watching!
@@NicolaiAI Thanks you too
I need to add a time threshold for directions, e.g. when I turn right for two seconds it should output "turn right". How can I do that? Thanks a lot
Hi,
Thanks for sharing this video.
I just want to know: how can I achieve the same pose estimation using JavaScript?
The code hangs when there is more than one face in the camera frame (after adding the attribute max_num_faces=2). Please help. Is my CPU not strong enough, or what?
Hi Nicolai. I want to do the head pose estimation on an image. What changes should I make? Can you suggest?
really well explained. I would like to see how to integrate this into a game engine like UE-5 =)
Hello, What do you recommend to install cv2 and media pipe in? Do you recommend Docker?
Just anaconda for development
Hello! Thank you so much for this tutorial. I am comparing the output of your code with datasets that include the euler angle ground truths but no matter what I do, the angles never match up with the ground truths and I am not sure why.
I have used the pointing 04 dataset and the aflw2000 datasets but am having no luck with matching the predictions from your code to the given ground truths.
I updated your code for passing an image only instead of a video. Any advice you could give would be truly appreciated.
Hey man, I am also struggling with a similar problem, although I used some other code. Now I am about to watch this video and implement it. Did you prepare any notebooks by any chance? That would be a great help to me.
The same problem here. These x, y, z angles are not real angles. Does anyone have an idea how to solve it?
Can I make a program combining head pose estimation and iris detection with MediaPipe on a Raspberry Pi?
Yeah for sure. But I don't think u will get many fps
Thank you for the great video. I drew inspiration from it and wanted to use it to detect head nods and head shakes from the head pose. I got stuck, as there was no fixed pattern to decide whether it's a head shake or the person is genuinely looking left and right. Can anyone help me with how to detect a head nod and head shake?
Thanks for watching the video! U could try to look at the rate of change for the values and then based on that say if it's a shake or just looking in one direction
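One hypothetical way to turn that rate-of-change idea into code: count sign flips in the frame-to-frame yaw changes over a short window; repeated flips suggest a shake, while a monotonic change suggests the person is simply looking away. The thresholds below are made-up values to tune:

```python
def count_direction_flips(angles, min_step=2.0):
    """Count sign flips in frame-to-frame angle changes, ignoring
    deltas smaller than min_step degrees (jitter)."""
    flips, prev_sign = 0, 0
    for a, b in zip(angles, angles[1:]):
        delta = b - a
        if abs(delta) < min_step:
            continue  # too small to count as real movement
        sign = 1 if delta > 0 else -1
        if prev_sign and sign != prev_sign:
            flips += 1
        prev_sign = sign
    return flips

def is_head_shake(yaw_window, min_flips=2):
    """A shake reverses direction at least min_flips times in the window."""
    return count_direction_flips(yaw_window) >= min_flips

shake = [0, 15, -12, 14, -10, 0]   # yaw oscillates left/right: shake
look  = [0, 5, 10, 18, 25, 30]     # yaw increases steadily: looking away
print(is_head_shake(shake), is_head_shake(look))  # True False
```

The same logic on the pitch angle would distinguish a nod from looking up or down; window length and min_step depend on the frame rate.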
@@NicolaiAI Thank you for the response, I will try to add the rate of change. Moreover, I also tried to pose the problem as action detection, but no luck there either. Can you guide me on whether that could be done?
Do you end up not using nose_3d_projection? I see some pose estimation tutorials that use it, but from my own use it doesn't seem to be correct/robust.
is there a way to check what they are looking at?
Really great video! Is there any solution for head pose estimation on Android using MediaPipe?
Thanks a lot! Yeah u can get the face mesh detector for Android with mediapipe
Can you develop an app to detect a driver's head pose and trigger an alert if the driver's head deviates from the safe range?
Hi Nicolai, Can you help me? I am using this program on a Jetson Nano, how can I reduce GPU consumption ?
Hi, thanks for this tutorial. How do I use MediaPipe with a GPU?
Hi thanks a lot for watching! Have seen gpu support before but will def check up on it. Been some time since I have used mediapipe
What kind of camera do you use for such a high frame rate? Thanks.
It’s not the camera’s frame rate but the model’s
Thank you, sir
Thanks for watching!
Hi Nicolai Nielsen,
I have one error message as below:
INFO: Created TensorFlow Lite XNNPACK delegate for CPU
and the code is not running.
Can you help/guide me on what to do?
Thanks, waiting for your response.
Best regards,
Gul Rukh
Really interesting, thank you!
Could you tell me how to get Visual Studio Code's IntelliSense working so I don't have to use the full path for mediapipe all the time? I.e.
'p_face_mesh = mp.solutions.mediapipe.python.solutions.face_mesh' should be 'mp.solutions.face_mesh', but then I don't get any IntelliSense.
Can I get the code please if possible?
did you get the code ?
can you make it detect multiple faces?
perfect
Why do you multiply the nose_3d right-hand side by 3000?
Thanks!
Welcome!
Hi, is there some reason you multiply the nose_3d lm.z by 3000?
just to get better line visualisation for the direction
@@NicolaiAI Thanks a lot! When you append face_2D and face_3D, the z value is the only normalized value, whereas x and y are multiplied by the image width and height. Is it okay to do it like that? I just ask as I don't have as much background knowledge as you.
Hi, where is the GitHub link to the code for this project? I did not find it.
Hi, cool stuff. What camera do you use to get above 100 fps (frames per second), and at which resolution? Thx
It's not the camera's fps but the algorithm's. It's 640x480 resolution on a CPU
It is not dependent on the camera but on the CPU; I get near 200 fps
Hi Nico, love your work! What do you end up using nose_3d_projection for? I tried using that projection coordinate for p2, but the face orthogonal line isn't as expected.
Also, how would I project the other two orthogonal lines? The lines that go up (y) and to the side (x) of the nose, rather than outward (z)?
I have the same question, looking forward someone can help
Can this code be used using raspberry pi 4 with opencv and mediapipe install in it?
Yeah definitely!
Hi, is there any tutorial to do this head pose estimation without real-time input, using a video file as input?
Yeah u can just pass a video path to the video capture instead of a camera index
@@NicolaiAI Thank you so much
hello can you please help me with the input of video file
I want to correlate this with drowsiness detection based on head pose estimation. How do I proceed with that?
Me too
How do I run MediaPipe with a GPU? Any ideas?
Is this possible in JavaScript?
Hi Nicolai,
Could you kindly tell us the versions of the libraries like opencv-python, mediapipe, numpy, etc.?
Thanks, because I am facing cv2.error: opencv(4.5.5).
An earliest response will be highly appreciated.
I'm using opencv 4.5.2
@@NicolaiAI Ok, thanks. I updated my opencv-python to 4.5.2 but still have the following error:
AttributeError: 'NoneType' object has no attribute 'shape'
Please help me, I am in trouble. Thanks
@@bolzanoitaly8360 it's because the image is not loaded in
@@NicolaiAI Thanks. Of course, my path was incorrect: my path is video/cabin2.avi, but I had given videos/cabin2.avi (an extra s in video).
Thanks for your guidance, highly appreciated.
Best regards,
Gul Rukh
Can anyone tell me how to make it work for multiple faces at the same time?
Awesome!
When I run the code, only one face is detected. What changes are needed if I want to detect more than one face?
Have you succeeded in doing that?
I have the same issue ... Please tell me if you have solved that
@@dimosojunior I have the same issue ... Please tell me if you have solved that
Hello, where can I access the code you wrote?
Hi it’s all on my GitHub
@@NicolaiAI yeah I saw, btw the system works fine.
So can we also add eye tracking to this?
@@Samlorem007 I cannot find the code for this project on his GitHub. Under what name is the code file?
Thank you for your lessons and examples. I very much appreciate your videos. Due to my disease I have difficulty using a keyboard and mouse. I am using Ubuntu 20.04 MATE. I want to make myself a head mouse using your `headPoseEstimation.py` code. I can reach 22, -9 for the x axis and -22, 10 for the y axis by moving my head. I need to correlate those values with the screen size (1920x1080) and move the pointer. I also need to fix the jumpy x, y numbers for smooth pointer movement.
I need your ideas.
@@khaledjaafar113 thank you for caring. I still need help.
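A hypothetical sketch of the head-mouse mapping described above: linearly rescale the measured angle ranges (x in [-9, 22], y in [-22, 10], the commenter's values) onto a 1920x1080 screen and clamp; the jumpy values could then be smoothed with an average filter. Which angle maps to which screen axis is an assumption to verify:

```python
def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max], clamped."""
    value = max(in_min, min(in_max, value))
    scale = (value - in_min) / (in_max - in_min)
    return int(out_min + scale * (out_max - out_min))

def head_to_pointer(x_angle, y_angle, screen_w=1920, screen_h=1080):
    """Map the measured head angle ranges to pixel coordinates.
    Assumes y (yaw) drives the horizontal axis and x (pitch) the vertical."""
    px = map_range(y_angle, -22, 10, 0, screen_w - 1)
    py = map_range(x_angle, -9, 22, 0, screen_h - 1)
    return px, py

# Mid-range angles land near the center of the screen;
# out-of-range angles are clamped to the screen edges.
print(head_to_pointer(6.5, -6))    # (959, 539)
print(head_to_pointer(100, 100))   # (1919, 1079)
```

Moving the actual pointer would need a desktop automation library; the clamping also keeps the cursor on screen when the head briefly leaves the calibrated range.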
Hello Nicolai.
I'm a high school teacher in Turkey, and at our school we are making a project. In the project we want to use this head pose estimation code. We are going to attend a contest, and we will need your permission to use your code. I found nothing about copyright and I couldn't find any contact info, so I wrote here. I hope you reply soon; we don't have much time before the contest.
I want to estimate the faces of 4 people at the same time
can you share the codes?
Everything is on my GitHub in the description
Where is the source code?
Can I get your source code? :))
All of it is open source and u can just use the code
video : drive.google.com/file/d/1W0lNEoYS-Ka2Hj7h6AL3o7BqzXa_MvYe/view?usp=sharing
code : drive.google.com/file/d/1cUUzL18c1fxVeClYUeoGISwdU3DPolwD/view
Trying to apply the same concept to a hand palm, but the vector comes out from both sides, as in the link.
Any suggestion to make it come out only from the internal side of the hand (palm)?