Real-Time Head Pose Estimation: A Python Tutorial with MediaPipe and OpenCV
- Published: 26 Aug 2024
- Inside my school and program, I teach you my system to become an AI engineer or freelancer. Lifetime access, personal help from me, and I will show you exactly how I went from a below-average student to making $250/hr. Join the High Earner AI Career Program here 👉 www.nicolai-ni... (PRICES WILL INCREASE SOON)
You will also get access to all the technical courses inside the program, including the ones I plan to make in the future! Check out the technical courses below 👇
_____________________________________________________________
In this video 📝 we are going to do Head Pose Estimation with MediaPipe and OpenCV in Python. We use MediaPipe's face mesh detection to get the landmark points that feed the pose estimation in OpenCV. MediaPipe has a lot of built-in, customizable machine learning solutions that we are going to take a look at in upcoming videos. MediaPipe is among the newest and fastest machine learning solution frameworks and runs on common hardware, as we will see throughout this tutorial.
If you enjoyed this video, be sure to press the 👍 button so that I know what content you guys like to see.
_____________________________________________________________
🛠️ Freelance Work: www.nicolai-ni...
_____________________________________________________________
💻💰🛠️ High Earner AI Career Program: www.nicolai-ni...
⚙️ Real-world AI Technical Courses: (www.nicos-scho...)
📗 OpenCV GPU in Python: www.nicos-scho...
📕 YOLOv7 Object Detection: www.nicos-scho...
📒 Transformer & Segmentation: www.nicos-scho...
📙 YOLOv8 Object Tracking: www.nicos-scho...
📘 Research Paper Implementation: www.nicos-scho...
📔 CustomGPT: www.nicos-scho...
_____________________________________________________________
📞 Connect with Me:
🌳 linktr.ee/nico...
🌍 My Website: www.nicolai-ni...
🤖 GitHub: github.com/nic...
👉 LinkedIn: / nicolaiai
🐦 X/Twitter: / nielsencv_ai
🌆 Instagram: / nicolaihoeirup
_____________________________________________________________
🎮 My Gear (Affiliate links):
💻 Laptop: amzn.to/49LJkTW
🖥️ Desktop PC:
NVIDIA RTX 4090 24GB: amzn.to/3Uc7yAM
Intel I9-14900K: amzn.to/3W4Z5Cb
Motherboard: amzn.to/4aR6wBC
32GB RAM: amzn.to/3Jt2XVR
🖥️ Monitor: amzn.to/4aLP8hh
🖱️ Mouse: amzn.to/3W501GH
⌨️ Keyboard: amzn.to/3xUGz5b
🎙️ Microphone: amzn.to/3w1F1WK
📷 Camera: amzn.to/4b4Ryr9
_____________________________________________________________
Tags:
#HeadPose #PoseEstimation #OpenCV #MediaPipe #ComputerVision #MachineLearning
Join My AI Career Program
www.nicolai-nielsen.com/aicareer
Enroll in My School and Technical Courses
www.nicos-school.com
Where is the source code of this project / tutorial?
@@Studio-gs7ye he never made it available...unless you pay
Hi Nicolai, I am working on the same problem, but I have some questions:
1. What if a person adjusts his seat? I think your solution will not work then.
2. What if a person adjusts his laptop screen? Then the code will also not work as required.
Can you please suggest how to tackle these problems?
Hello, I would like to add some details for anyone looking to implement this method. The MediaPipe docs state that the z coordinate is in roughly the same scale as the x coordinate, so you should also scale z with the image width. Next, the 3D points that you feed to solvePnP should be constant for every frame: keep the 3D points of a single frame and pass them with the 2D points of each new frame to the solvePnP function. The angles will then actually be in degrees and will not need denormalization (you should not multiply the angles by 360). You should, however, convert them to radians and use sin/cos functions to find the second (extended) point starting from the nose. The coordinates for this second point are (int(nose_landmark[0] - math.sin(y) * math.cos(x) * 50), int(nose_landmark[1] + math.sin(x) * 50)), where 50 is the 3D length of this line; you can change it to make the line larger or smaller. Don't hesitate to reply for more details!
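For anyone implementing the comment above, the endpoint formula can be sketched as a small standalone function (pure Python; the nose pixel coordinates and angles below are placeholder values, not from the video's code):

```python
import math

def nose_direction_endpoint(nose_2d, pitch_rad, yaw_rad, length=50):
    """Compute the 2D endpoint of the head-direction line starting at the
    nose landmark, following the sin/cos projection described above.
    pitch_rad (x) and yaw_rad (y) are head angles in radians."""
    x2 = int(nose_2d[0] - math.sin(yaw_rad) * math.cos(pitch_rad) * length)
    y2 = int(nose_2d[1] + math.sin(pitch_rad) * length)
    return (x2, y2)

# Head facing straight at the camera -> the line endpoint stays at the nose.
print(nose_direction_endpoint((320, 240), 0.0, 0.0))  # (320, 240)
```

The `length` argument controls how far the drawn line extends, matching the 50 in the formula above.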
Thanks a lot, please provide all the details u have
The results are very very bad tho when u scale the depth and have fixed 3d reference points. Can't be used for anything
@@NicolaiAI I get very satisfying results. I will update you when i upload my project on github!
great, please let me know when u have uploaded it! Appreciate it :)
@@NicolaiAI I also tried using fixed 3D points. The results are indeed bad for the x and y angles, but really good for the z angle (which cannot be said when using changing 3D points). Would love to know a "universal" solution to get good x, y, z angles.
Also, it turns out that solvePnP isn't very stable/robust. Do you know other methods that can be more robust?
I'm trying to use some of your tutorials to make a couple of addons for a 3d software called blender, I really appreciate how thorough you are in your explanation of your code!
Thanks a lot!
Great video! You could use the blue line as a mouse on screen! A nose mouse! Never have to take your fingers off the keyboard again! Input your monitor size and ratio, perform a little calibration, and you might have something. I wonder what you could do to stabilize the blue line... some filtering? That would make for a great video!
Thanks a lot! Great idea actually and a cool useful project
I didn't find this project on your GitHub?
Hi Nico, thank you for the informative video. Just a note - the yaw readings are wrong. The max output of the head yaw (rotation around the y axis) is 30 degrees, even when I turn my head 90 degrees relative to the camera.
Did you figure out what went wrong with the yaw readings? And what did you do to fix it, just multiplying it by 3 or something non-trivial?
I also had the same observation. Do you have any updates on this? Many thanks!
The same problem here. Why is he not answering this question? Has anyone been able to solve it?
If anyone has trouble with FACE_CONNECTIONS it is now FACEMESH_CONTOURS as far as I know
Thanks for this video. This helped me troubleshoot a project I'm working on.
I'm working on a dataset tool that processes images of faces and exports data of the landmarks and the rotation angles (the pitch, roll, and yaw of the head).
Did you find the source code?
Hey Hi, this is simply awesome, but while I'm running the code on my PC I'm getting a division-by-zero exception at line 116 (totalTime = end - start). Please give me clarity on this, Nic. Thank you
Hey. Keep up this good work. Your computer vision videos are really interesting.
Hi thank you very much! Really appreciate it
can't find that code on your github can give link of that plz i really need that.....
hi sir if I want to enhance this code to calculate depth to detect antispoofing, how can I do this?
Excellent, very helpful. Thank you for the tutorial.
Glad it was helpful!
I have a question. At line 83, why do we multiply by 360? angle[0] in the video is a sin or cos value, isn't it?
I did the same coding for my application but the line is really unstable!
You can try with an average filter for some simple smoothing
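A minimal sketch of such an average filter (a hypothetical helper, not from the video's code), which could be applied per frame to the projected line endpoint or the angle values:

```python
from collections import deque

class MovingAverage:
    """Simple moving-average filter to smooth jittery per-frame values,
    e.g. the projected nose-line endpoint coordinates or the head angles."""
    def __init__(self, window=5):
        self.values = deque(maxlen=window)  # keeps only the last `window` samples

    def update(self, value):
        self.values.append(value)
        return sum(self.values) / len(self.values)

# Feed one noisy value per frame; the output lags slightly but is smoother.
smooth_x = MovingAverage(window=3)
print(smooth_x.update(10))  # 10.0
print(smooth_x.update(20))  # 15.0
print(smooth_x.update(30))  # 20.0
print(smooth_x.update(40))  # 30.0  (window slides: averages 20, 30, 40)
```

A larger window gives a steadier line at the cost of more lag; one filter per coordinate is enough.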
bro can we train a model with python to get accurate results, what do u say?
Thanks for your tutorial. I guess it would be better to measure the computation time via the time.perf_counter() method. It may also be more correct to take the end time right before displaying the results, i.e. right before the cv2.imshow() call and not before the mp_drawing call; as a result, the (correct) FPS will be slightly lower.
The FPS should be calculated without any drawings or visualizations since only the algorithm should be timed. I'm using the perf counter in recent videos
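A minimal sketch of that timing approach, with a commented region standing in for the per-frame detection work (the zero-interval guard also addresses the division-by-zero error some viewers reported):

```python
import time

start = time.perf_counter()
# ... run only the face mesh + solvePnP computation here,
# not the drawing utils or the cv2.imshow() call ...
end = time.perf_counter()

total_time = end - start
# Guard against a zero interval before dividing.
fps = 1.0 / total_time if total_time > 0 else 0.0
print(f"FPS: {fps:.1f}")
```

time.perf_counter() uses the highest-resolution clock available, so a zero interval is far less likely than with time.time().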
Dude this video was great. Is there any other tutorials on how to do this for makeup try on?
Or implementing 3d object on the user's face?
Thanks for your tutorial
Thanks a lot for watching! Hope that u can use it
The video is really great. May I know where the center (origin) of the face is? I'm trying to invert the rotation angles to make the face point toward the camera. Thank you very much.
I think the roll (z) is not very well calculated, because when I print it and test it by moving my head it changes only a little, while the yaw (y) moves more than it does.
Also, when you are facing the camera, why does the nose line point a little bit to the right by default?
Hey Nicolai, it's a very well-explained video for head pose estimation. I just want to ask: if we don't face the camera, how do we indicate that the user is not looking at the camera?
Thanks a lot! If u are not facing the camera it won't be able to do detection, so I could just add another if statement for looking away
@@NicolaiAI Thank you for pointing that out. Now I know I shouldn't try to implement this, since it is not enough to solve my project problem - detecting where people in front of a mobile robot are looking. Could you please recommend some stuff for that?
@@iam.damian You could try to detect just a face. If it doesn't find a face then no one is in front of the mobile robot, or you can try to detect the iris...
I want to get the value of the roll movement of my head; which part should I change?
Hi Nicolai, I watched your video. It's very good. I have a few questions. I am trying to put the lines above the pupils and I want the program to tell where the person is looking. How can I do it?
Hello, shouldn't the positions of the 'img_h/2' and 'img_w/2' variables in the cam_matrix camera matrix on line 66 of the code be exactly reversed? Why is 'img_w/2' on the lower row?
how can i draw the bounding box?
Thank you for your video! But I get: module 'mediapipe' has no attribute 'solutions'. Where can I find it?
Good video, mate! Can I add closed-eye and yawning detection to this MediaPipe pipeline? Could you do a video about it?
Amazing Job . Thanks
Thanks for watching!
@@NicolaiAI Thanks you too
I need to add a time threshold for directions, e.g. when I turn right for two seconds it should output "turn right". How can I do that? Thanks a lot
Hi,
Thanks for sharing this video.
I just want to know: how can I achieve the same pose estimation using JavaScript?
The code hangs when there is more than one face in the camera frame (after adding the attribute max_num_faces=2). Please help. Is my CPU not strong enough, or what?
Hi Nicolai. I want to do the head pose estimation on an image. What changes should I make? Can you suggest?
really well explained. I would like to see how to integrate this into a game engine like UE-5 =)
Hello, What do you recommend to install cv2 and media pipe in? Do you recommend Docker?
Just anaconda for development
Hello! Thank you so much for this tutorial. I am comparing the output of your code with datasets that include the euler angle ground truths but no matter what I do, the angles never match up with the ground truths and I am not sure why.
I have used the pointing 04 dataset and the aflw2000 datasets but am having no luck with matching the predictions from your code to the given ground truths.
I updated your code for passing an image only instead of a video. Any advice you could give would be truly appreciated.
Hey man, I am also struggling with a similar problem, although I used some other code. Now I am about to watch this video and implement it. Did you prepare any notebooks by any chance? That would be a great help to me.
The same problem here. These x, y, z angles are not real angles. Does anyone have an idea how to solve it?
Can I make a program combining head pose estimation and iris detection with MediaPipe on a Raspberry Pi?
Yeah for sure. But I don't think u will get many fps
Thank you for the great video. I drew inspiration from it and wanted to use it to detect head nods and head shakes from the head pose. I got stuck, as there was no fixed pattern to decide whether it's a head shake or the person is genuinely looking left and right. Can anyone help me with how to detect a head nod and head shake?
Thanks for watching the video! U could try to look at the rate of change for the values and then based on that say if it's a shake or just looking in one direction
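One hypothetical way to turn that rate-of-change idea into code: count sign flips in the frame-to-frame yaw changes over a short window; repeated flips suggest a shake, while a monotonic change suggests the person is simply looking away. The thresholds below are made-up values to tune:

```python
def count_direction_flips(angles, min_step=2.0):
    """Count sign flips in frame-to-frame angle changes, ignoring
    deltas smaller than min_step degrees (jitter)."""
    flips, prev_sign = 0, 0
    for a, b in zip(angles, angles[1:]):
        delta = b - a
        if abs(delta) < min_step:
            continue  # too small to count as real movement
        sign = 1 if delta > 0 else -1
        if prev_sign and sign != prev_sign:
            flips += 1
        prev_sign = sign
    return flips

def is_head_shake(yaw_window, min_flips=2):
    """A shake reverses direction at least min_flips times in the window."""
    return count_direction_flips(yaw_window) >= min_flips

shake = [0, 15, -12, 14, -10, 0]   # yaw oscillates left/right: shake
look  = [0, 5, 10, 18, 25, 30]     # yaw increases steadily: looking away
print(is_head_shake(shake), is_head_shake(look))  # True False
```

The same logic on the pitch angle would distinguish a nod from looking up or down; window length and min_step depend on the frame rate.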
@@NicolaiAI Thank you for the response, I will try to add the rate of change. Moreover, I also tried to pose the problem as action detection, but no luck there either. Can you guide me on whether that could be done?
Do you end up not using nose_3d_projection? I see some pose estimation tutorials that use it, but from my own use it doesn't seem to be correct/robust.
is there a way to check what they are looking at?
Really great video! Is there any solution for head pose estimation on Android using MediaPipe?
Thanks a lot! Yeah u can get the face mesh detector for Android with mediapipe
Can you develop an app to detect a driver's head pose and trigger an alert if the driver's head deviates from the safe range?
Hi Nicolai, Can you help me? I am using this program on a Jetson Nano, how can I reduce GPU consumption ?
Hi, thanks for this tutorial. How do I use MediaPipe with a GPU?
Hi thanks a lot for watching! Have seen gpu support before but will def check up on it. Been some time since I have used mediapipe
What kind of camera do you use for such a high frame rate? Thanks.
It’s not the camera’s frame rate but the model’s
Thank you, sir
Thanks for watching!
Hi Nicolai Nielsen,
I have one error message as below:
INFO: Created TensorFlow Lite XNNPACK delegate for CPU
and the code is not running.
Can you help/guide me on what to do?
Thanks, waiting for your response.
Best regards,
Gul Rukh
Really interesting, thank you!
Could you tell me how to get Visual Studio Code's IntelliSense working so I don't have to use the full path for mediapipe all the time? I.e.
'p_face_mesh = mp.solutions.mediapipe.python.solutions.face_mesh' should be 'mp.solutions.face_mesh', but then I don't get any IntelliSense.
Can I get the code please if possible?
did you get the code ?
can you make it detect multiple faces?
perfect
Why do you multiply the nose_3d right-hand side by 3000?
Thanks!
Welcome!
Hi, is there some reason you multiply the nose_3d lm.z by 3000?
just to get better line visualisation for the direction
@@NicolaiAI Thanks a lot! When you append face_2D and face_3D, the z value is the only normalized value, whereas x and y are multiplied by the image width and height. Is it okay to do it like that? I just ask as I don't have as much background knowledge as you.
Hi, where is the GitHub link to the code for this project? I did not find it.
Hi, cool stuff. What camera do you use to get above 100 fps (frames per second), and at which resolution? Thx
It's not the camera's fps but the algorithm's. It's 640x480 resolution on a CPU
It is not dependent on the camera but on the CPU; I get near 200 fps
Hi Nico, love your work! What do you end up using nose_3d_projection for? I tried using that projection coordinate for p2, but the face orthogonal line isn't as expected.
Also, how would I project the other two orthogonal lines? The lines that go up (y) and to the side (x) of the nose, rather than outward (z)?
I have the same question, looking forward someone can help
Can this code be used using raspberry pi 4 with opencv and mediapipe install in it?
Yeah definitely!
Hi, is there any tutorial to do this head pose estimation without real-time input, using a video file as input?
Yeah u can just pass a video path to the video capture instead of a camera index
@@NicolaiAI Thank you so much
hello can you please help me with the input of video file
I want to correlate this with drowsiness detection based on head pose estimation. How do I proceed with that?
Me too
How do I run MediaPipe with a GPU? Any ideas?
Is this possible in JavaScript?
Hi Nicolai,
Could you kindly tell us the versions of the libraries like opencv-python, mediapipe, numpy, etc.?
Thanks, because I am facing cv2.error: opencv(4.5.5).
An earliest response will be highly appreciated.
I'm using opencv 4.5.2
@@NicolaiAI Ok, thanks. I updated my opencv-python to 4.5.2 but still have the following error:
AttributeError: 'NoneType' object has no attribute 'shape'
Please help me, I am in trouble. Thanks
@@bolzanoitaly8360 it's because the image is not loaded in
@@NicolaiAI Thanks. Of course, my path was incorrect: my path is video/cabin2.avi, but I had given videos/cabin2.avi (an extra s in video).
Thanks for your guidance, highly appreciated.
Best regards,
Gul Rukh
Can anyone tell me how to make it work for multiple faces at the same time?
Awesome!
When I run the code, only one face is detected. What changes are needed if I want to detect more than one face?
Have you succeeded in doing that?
I have the same issue ... Please tell me if you have solved that
@@dimosojunior I have the same issue ... Please tell me if you have solved that
Hello, where can I access the code you wrote?
Hi it’s all on my GitHub
@@NicolaiAI yeah I saw, btw the system works fine.
So can we also add eye tracking to this?
@@Samlorem007 I cannot find the code for this project on his GitHub. Under what name is the code file?
Thank you for your lessons and examples. I very much appreciate your videos. Due to my disease I have difficulty using a keyboard and mouse. I am using Ubuntu 20.04 MATE. I want to make myself a head mouse using your `headPoseEstimation.py` code. I can reach 22, -9 for the x axis and -22, 10 for the y axis by moving my head. I need to correlate those values with the screen size (1920x1080) and move the pointer. I also need to fix the jumpy x, y numbers for smooth pointer movement.
I need your ideas.
@@khaledjaafar113 thank you for caring. I still need help.
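A hypothetical sketch of the head-mouse mapping described above: linearly rescale the measured angle ranges (x in [-9, 22], y in [-22, 10], the commenter's values) onto a 1920x1080 screen and clamp; the jumpy values could then be smoothed with an average filter. Which angle maps to which screen axis is an assumption to verify:

```python
def map_range(value, in_min, in_max, out_min, out_max):
    """Linearly map value from [in_min, in_max] to [out_min, out_max], clamped."""
    value = max(in_min, min(in_max, value))
    scale = (value - in_min) / (in_max - in_min)
    return int(out_min + scale * (out_max - out_min))

def head_to_pointer(x_angle, y_angle, screen_w=1920, screen_h=1080):
    """Map the measured head angle ranges to pixel coordinates.
    Assumes y (yaw) drives the horizontal axis and x (pitch) the vertical."""
    px = map_range(y_angle, -22, 10, 0, screen_w - 1)
    py = map_range(x_angle, -9, 22, 0, screen_h - 1)
    return px, py

# Mid-range angles land near the center of the screen;
# out-of-range angles are clamped to the screen edges.
print(head_to_pointer(6.5, -6))    # (959, 539)
print(head_to_pointer(100, 100))   # (1919, 1079)
```

Moving the actual pointer would need a desktop automation library; the clamping also keeps the cursor on screen when the head briefly leaves the calibrated range.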
Hello Nicolai.
I'm a high school teacher in Turkey, and at our school we are making a project. In the project we want to use this head pose estimation code. We are going to attend a contest, and we will need your permission to use your code. I found nothing about copyright and I couldn't find any contact info, so I wrote here. I hope you reply soon; we don't have much time before the contest.
I want to estimate the faces of 4 people at the same time
can you share the codes?
Everything is on my GitHub in the description
Where is the source code?
Can I get your source code? :))
All of it is open source and u can just use the code
video : drive.google.com/file/d/1W0lNEoYS-Ka2Hj7h6AL3o7BqzXa_MvYe/view?usp=sharing
code : drive.google.com/file/d/1cUUzL18c1fxVeClYUeoGISwdU3DPolwD/view
Trying to apply the same concept to a hand palm, but the vector comes out from both sides, as in the link.
Any suggestion to make it come out only from the internal side of the hand (palm)?