AI Face Body and Hand Pose Detection with Python and Mediapipe
- Published: 6 Jul 2024
- Want to start building body pose based apps?
Maybe you want to control your screen using nothing but gestures!
Well, Mediapipe and Python are the answer! In fact, in this video you'll learn the basics for getting started with Body pose detection, facial landmark estimation and hand pose detection using a single Mediapipe library and your webcam.
Behind the scenes, Mediapipe Holistic, the model shown in this video, uses deep learning models to accurately detect keypoints. Using this model you can begin to prototype a whole bunch of different use cases like touchless gesture control and human sentiment analysis, and you could even build your own exercise counter!
In this video, you'll learn how to:
1. Install Mediapipe and set up Mediapipe Holistic for Python
2. Access a real time video feed from your webcam using OpenCV
3. Detect and visualise facial landmarks, body poses and hand poses
Get the Code: github.com/nicknochnack/Full-...
Chapters:
0:00 - Start
2:49 - Installing Mediapipe and OpenCV
8:23 - Real Time Video Feed from Your Webcam using OpenCV
13:05 - Detecting Landmarks for Face, Body and Hand Poses
30:48 - Changing Landmark Colors and Styling
Oh, and don't forget to connect with me!
LinkedIn: / nicholasr...
Facebook: /
GitHub: github.com/nicknochnack
Patreon: / nicholasrenotte
Join the Discussion on Discord: / discord
Happy coding!
Nick
P.S. Let me know how you go and drop a comment if you need a hand!
This video keeps delivering even up to this day (2-3 years later)! Great!
Fantastic tutorial. Minor update: the mp_holistic model has changed the name for FACE_CONNECTIONS to FACEMESH_TESSELATION
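A small sketch of how code could cope with the rename mentioned above, so the same script runs against an older or a newer Mediapipe install. The helper name `face_connections` is hypothetical, and the two stand-in modules only imitate the attribute names from the comment; they are not the real library.

```python
from types import SimpleNamespace


def face_connections(holistic_module):
    """Return the face connection set under whichever name the module exposes.

    Prefers the newer FACEMESH_TESSELATION name and falls back to the
    older FACE_CONNECTIONS name used in the video.
    """
    for name in ("FACEMESH_TESSELATION", "FACE_CONNECTIONS"):
        connections = getattr(holistic_module, name, None)
        if connections is not None:
            return connections
    raise AttributeError(
        "module exposes neither FACEMESH_TESSELATION nor FACE_CONNECTIONS"
    )


# Stand-ins for an old and a new holistic module (illustrative only).
old_module = SimpleNamespace(FACE_CONNECTIONS=frozenset({(0, 1)}))
new_module = SimpleNamespace(FACEMESH_TESSELATION=frozenset({(0, 1), (1, 2)}))
```

With Mediapipe installed, passing the holistic module itself (the `mp_holistic` alias from the video) in place of the stand-ins should return the set to hand to the drawing call.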
YOU ARE A LIFE SAVER MY FRIEND!
I LOVE YOU
THXXXXX
Your life needs to be lived @@Tigas4ever
It's so early in the relationship but yes I love you too @@DaniELGKDG
The quality of this and your other videos is outstanding. Great job Nicholas.
Thanks so much @Hassanin!
OMG big thanks Nicholas! I have gone thru so many overwhelming docs still couldn't understand how Mediapipe works until I watched your video. Thanks for making it easy and interesting🥰
Best teaching I've seen on this subject so far on YouTube (and I've seen millions of them)... thank you for this... really... thank you!
Yay so keen you just released this! I was working through how to do body language recognition for the TV show Frasier.
CONG!!! Thanks so much, awesome, quick note, you might need to render the lines as bolder if your image res is larger!
Great video! A small optimization is to set "image.flags.writeable" to False before sending the image object to Holistic and flip the writeable flag back to True after the process() call. By doing this, you will 1) avoid copying the image data by passing it by reference and 2) reuse the same image object for rendering ;)
HOLDDDD UPPPP, did I just have THE @Jiuqiang Tang comment on this!? Thank you sooo much, love your work!
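The writeable-flag tip above can be sketched roughly like this. The wrapper name `process_read_only` and the `fake_process` stand-in are assumptions made for illustration; in the tutorial's loop the callable would be the Holistic `process()` method, and the frame would come from the webcam rather than `np.zeros`.

```python
import numpy as np


def process_read_only(image, process):
    """Run process(image) with the array marked read-only, then restore it.

    Marking the NumPy array read-only lets the callee take it by reference;
    flipping the flag back afterwards allows drawing on the same array.
    """
    image.flags.writeable = False
    try:
        return process(image)
    finally:
        image.flags.writeable = True


def fake_process(image):
    # Stand-in for the model call: it only reports what it received.
    return {"was_writeable": image.flags.writeable, "shape": image.shape}


frame = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder for a webcam frame
results = process_read_only(frame, fake_process)
```

The `try`/`finally` is a design choice rather than part of the original tip: it makes sure the frame becomes writeable again even if the model call raises.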
FANTASTIC! YES, would love to see a continuation of this series, possibly including live display and minmax detection of joint angles at selected joints, velocity of motion for selected landmarks.
Would also be great to see an auto-reframe/dynamic crop of a wide angle view to do landmark-detection-driven "auto tracking".
Can the hand and face landmarks from the body pose set be dropped/made invisible selectively?
The auto reframing would be awesome! I saw a colleague do it recently, will add it to the list. Did you mean the individual landmarks or the whole series e.g. drop all of the face landmarks?
This is a godsend and deserves more views dude. Fantastic job explaining and breaking down individual lines of code for beginners like me, as I'm working on a Final Year Project that detects pose landmarks and uses them to synthesise movements from images.
Thanks so much @Eu Yang Chai, gl with your capstone project!
@@NicholasRenotte Hey Nick, I would like to request a tutorial video on motion transfer using pose estimation (i.e. detecting pose landmarks and using it to synthesise movement on images) using Python and if possible, using Mediapipe as well :) let me know if this interests you, thanks! P/S I've also left you a message on LinkedIn because I desperately need help on that..
@@euyangchai9982 definitely, working on it atm! Exploring how to do it with Tensorflow using the Barracuda framework!
@@NicholasRenotte This is awesome! Thanks man, really appreciate it.
Thanks a lot for listening to my request, Mate. Love Ya.
Anytime my man! Glad you're enjoying it @Sujay.
It's very cool! There are a lot of videos like that, but your videos are the most interesting and you explain it clearly! Thanks! All works correctly and without mistakes and bugs.
Explored your channel today! Amazing content ! Was waiting for such channel since way long back !❤️
Thanks soo much @Amisha! Glad you enjoyed it!
Nicholas this is exactly what I was hoping you would cover. Love the sound of some examples of rep counters (squats, press-ups, etc.) or body language detection which sounds really interesting. I know there is a lot of research on eye-tracking in psychology I'm sure people in that field would love to see some of that. I would really like to see some onscreen angular output or a way of outputting that data to a file while tracking. The media pipe documentation outlines a Z component for the model which would be very interesting to see in terms of what the model sees in that direction or the computation of angular change during a movement.
Keep up the great work man I'm really enjoying the level you are pitching this at - the fact that I can follow along is a testament to your approach. :-)
Thank you so much @Richard, it means the world! Will be building on top of this, I really like the idea of bringing in the z axis into play. I'll brush up on my trig as well and get some angle calculations in there as well!
@@NicholasRenotte I would really appreciate that thanks ;-)
@@richieithaca I'm on it!
This tutorial was really helpful, thank you for your work making it! I saw some other commenters mention it too, but using this for realtime 2D/3D avatar manipulation sounds like a dream come true. I use some of the programs vtubers use for training students on motion capture, so it'd be awesome if there was a way to use mediapipe to generate the vmc protocol for something like vSeeFace! Looking forward to seeing if it could be possible!
Ha, definitely going to dig into the linkage to vtubing software @Rachel, the ones that I've seen so far don't seem to have open APIs though. Just realised vseeface is in unity. Will dig into it some more!
Amazing work Nicholas! I'm waiting to see how you apply this- such as using gaze tracking to learn if someone is paying attention during zoom calls or learning about your morale by figuring your emotion through facial landmarks. Side note: I'm working on some of this stuff myself and would love to discuss it with you if you are interested
Awesome work @Tejas, got a bunch more stuff coming in this space particularly for body language detection.
Looking forward to the series on applications built using Mediapipe. Thank you!
Awesome! Anything in particular you'd like to see?
@@NicholasRenotte Yes! Exercise counters perhaps? Like jumping jack counter or something of that sort? Thanks!
@@hsiaohsu4209 definitely, that's going to be first off the list!
This project is really great. Thanks for it. Please make a full series for this project.
Big fan of your work.
Definitely, I'm going to get onto it next @Nilutpol. Thanks sooo much!
Great video Nicholas! Love the content!!!
Right back at ya @Data 360 YP!
Dude this is fire, its amazing the stuff you can do with python, you are fire bro!
IKR, it's my absolute fav language! Thanks soo much @Bernado, plenty more to come!
This is insane bro! Thank you for knowledge sharing
Your quality of videos is improving 😉
Awesome video
:)
😅 phew! Hahaha thanks so much @Abdullah!
Awesome explanation and demo.
Rather than stopping the code and rerunning the release lines, I think pressing 'q' in the pop-up window (the 0xFF == 'q' check from the starting code) does that for you. By the way, thanks, great job! :)
Nice, I think sometimes it's a little glitchy on my PC. Looks like it works well when combined with cap.isOpened() as well.
Fantastic! Congrats man! 👏
Thank you sooo much @Andre, glad you enjoyed it!
Can’t wait for the full project 🔥🔥🔥
Yessss, anything you'd like to see as part of it @Daniel?
@@NicholasRenotte yes, I was thinking if you could include an Iris detection using the Mediapipe package
@@danieladama8105 agreed! It's only available in C++ atm but I actually found a workaround.
@@NicholasRenotte Yayyyyyyy!
@@danieladama8105 ayyyyeeee!
Amazing content Nicholas. Thank you!
Thanks soo much @Thirasha!
Thanks a lot, bro, I remember I said to you that please make a video on this topic and finally I got this. {Thanks a lot}😄
while True:
    print("Thanks a lot ")
youre_awesome = True
if youre_awesome:
    print('Anytime my man, a promise is a promise ✌️')
What amazing dreams can come true with this tutorial 😍.
Thanks Nicholas
Thanks for such explaining.
Great master @Nicholas! I had already seen some of #mediapipe but with your video I realized that I had only seen the tip of the iceberg, thanks a lot.
Heyyyy, thanks for checking it out Marcelo!
Your Videos are excellent!
Thanks a ton @01bit!
This is really cool. Also, Instead of copying the code, I wrote the code with the video, so that I can write my own comments.
Awesome work!
Man, thank you so much, you've done really great job! :))
Thanks sooo much @Dmitry! Glad you enjoyed it.
Thankyou very much. Keep up the good work really helped a lot of us ♥
thanks buddy , very helpful to understand
You, sir, have a really nice channel! I would love to see some tensorflow in react native, I really think that it has a lot of potential! Keep up the good work!
Thanks so much @kento, definitely still getting up to speed with React Native but it's definitely coming!
INTERESTING!!! YOU ARE A GEM 💎 NICHOLAS !!!!! Waiting for next video now :)
Thanks so much @Abhishek! Definitely, plenty more to come.
Heya @Abhishek, pt 2 is out! Body language decoding! ruclips.net/video/We1uB79Ci-w/видео.html
You are the best Nicholas keep it going 😍
Thanks soo much @DidU!
Bro you're the man and especially a genius 👏 great work, keep it up, would love to see the end result
Ayyye, thanks so much @Lorenzo! Definitely, got some sweet stuff planned with it!
Heya Lorenzo, the follow up is out! ruclips.net/video/We1uB79Ci-w/видео.html
It is awesome and marvellous to watch your contribution in spreading the knowledge. Your current presentation (video) is an inspiring start for our own innovative projects (apps). Based on your contribution many different applications can be created (no limits). Big thumbs up for you (unfortunately I can give only one) and the Google team for the effort in creating and sharing state of the art work. Have a nice day.
Agreed, it's amazing what you can do with the models that are already out there! I've actually already built a pipeline on top of this code that allows you to apply a custom classifier from the keypoints, possibilities are endless!
bro i'm a broke guy who cant afford rokoko and iphone x and you are my solution. love you
Wow!! I love this tutorial. It is very simplified and easy to follow and understand. I must say you are doing a great Job. Thank you for this
Amazing content!
Thanks soo much @Jie!
Thanks for the content as always. Could you suggest some tutorials of docker and amazon aws? In general it would be interesting to see the whole picture as well (meaning not only model creation but its deploy/implementation to production)
Its amazing. Will always appreciate it.
Awesome stuff @Priyam!
This is very helpful tutorial thanx for sharing brother.
thanks for this tutorial
Good tutorial video!! thumbs up for you
Thanks for this incredible tutorial
I thought it can be something in healthcare as it is tracking the body movement (bones) but I am not sure yet.
Anyway it's great
This is really informative!! Really helpful so here's a thanks from I.
Thanks so much @Cool Rock!
@@NicholasRenotte No u XD. Hopefully we can make proper use of this for school purposes.
@@coolrock3733 definitely, got a use case in mind?
Thankyou So much SIrr💌💌
damn, thank you so much! that helped me out a LOT!!
and really nice step by step explanations as well. love it!
So glad you enjoyed it @R J
Thank you for sharing
Please make a video about jumping, i.e. counting when both feet leave the ground.
Bro, This is Awesome
How about extending this project to create a sign language detection component that detects from the continuous video feed for which we don't need to label specifically for every gesture (like you did in the last live stream) or from a complex ready-made dataset from the internet that includes facial, hand and poses for recognition like British Sign Language, etc.
It'll be lit.
Can you guide me on how to build that sign language thing on top of this?
Heya @Sai, check this out: www.tensorflow.org/hub/tutorials/action_recognition_with_tf_hub
In a Jupyter notebook, you can create a "key" variable with a value of 0, then write your while loop with the condition key != 27. Inside the loop, you set key to cv2.waitKey(10). It will close the cam capture window when you press the Escape key.
Sweet suggestion @Amaury!
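A minimal sketch of the exit check behind both the Escape-key suggestion above and the 'q' check from the video. The helper name `should_quit` is hypothetical; the only library behaviour it relies on is that `cv2.waitKey` returns -1 when no key was pressed and a key code otherwise, so the function itself runs without OpenCV.

```python
ESC_KEY = 27  # key code for Escape, as used in the comment above


def should_quit(raw_key, quit_keys=(ESC_KEY, ord("q"))):
    """Return True when a value returned by cv2.waitKey means 'stop the loop'.

    -1 means no key was pressed. Only the low byte is compared because the
    return value can carry extra high bits on some platforms, which is why
    the tutorial code masks it with 0xFF.
    """
    if raw_key == -1:
        return False
    return (raw_key & 0xFF) in quit_keys


# Intended use inside the capture loop (not executed here):
#   key = 0
#   while cap.isOpened() and not should_quit(key):
#       ...read a frame, run the model, show the image...
#       key = cv2.waitKey(10)
```

Keeping the check in one place means the Escape and 'q' variants behave the same whether the loop runs in a script or in a notebook cell.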
Fantastic. Can you make a video of hand gestures?
Great introduction to mediapipe! I suppose it can also detect multiple persons? How can u detect if a person is lying on the ground? More videos on mediapipe would be great.
Thanks Henk! I believe the Holistic model is a single person model but you can use the pose detection model for multiple people! For detecting those lying on the ground you could apply a secondary ML model to classify based on keypoints, I've got something in this space coming!
In anticipation of your coming video!
btw: the landmark coordinates x, y, z are between [0,1]. Are these in inches, cm or..? If not, how can you map to cm for example?
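On the units question above: as far as I understand the Mediapipe output, x and y are fractions of the image width and height rather than inches or centimetres, so the direct conversion is to pixels. The sketch below shows that step; `to_pixels` is a hypothetical helper name. Getting to centimetres would additionally need a reference of known physical size at roughly the same distance from the camera, which is not covered here.

```python
def to_pixels(x, y, width, height):
    """Convert normalized landmark coordinates to integer pixel coordinates.

    x and y are expected in [0, 1]; values slightly outside that range
    (landmarks just off-screen) are clamped to the image border.
    """
    px = min(max(int(x * width), 0), width - 1)
    py = min(max(int(y * height), 0), height - 1)
    return px, py


# Example: a landmark in the horizontal centre, a quarter of the way down
# a 640x480 frame.
centre_point = to_pixels(0.5, 0.25, 640, 480)
```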
Would be awesome to see action detection done on top of this! :D
Working on it as we speak!
Great tutorial. Is there an example or tutorial on picking of object (=boxes/packages/parcel ) by hand, tracking the movement of the object and identifying the bin/shelf number in which it was dropped?
Dude, you're a mega genius
Thanks
Hello Nicholas! Thank you very much for this helpful tutorial
Do you have a nice solution to get the landmarks of the mouth and lips specifically?
Thank you
Anytime @Kadir, glad you enjoyed it!
Thanks 😊
Anytime @Ameer, glad you liked it!
That's Amazing
Thanks so much @Ahmed!
i had an idea using this for liveness detection
Great!
@Gustavo!! Thanks soo much!
Hi Nicholas, nice work
Thanks a bunch man!
Wow, great tutorial! Wondering if this can be used for VTubers! They wouldn't need to buy expensive tools for body & hand tracking
Great videos as always, I was just thinking how far away are we from using this in skill training? Could be something like record someone doing a task well, then compare a learner and calculate the distance from the 'ideal' as a metric of progress?
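One very rough way to turn the "distance from the ideal" idea above into a number is the mean distance between corresponding landmarks of a reference pose and a learner's pose. This is only a sketch under strong assumptions: `pose_distance` is a hypothetical name, both poses are plain lists of coordinate tuples in the same landmark order, and nothing is normalized for body size, camera position or timing.

```python
import math


def pose_distance(reference, attempt):
    """Mean Euclidean distance between corresponding landmarks of two poses.

    Both arguments are sequences of coordinate tuples, one tuple per
    landmark, in the same order. Lower values mean the poses are closer.
    """
    if len(reference) != len(attempt):
        raise ValueError("poses must contain the same number of landmarks")
    if not reference:
        raise ValueError("poses must contain at least one landmark")
    total = sum(math.dist(p, q) for p, q in zip(reference, attempt))
    return total / len(reference)


# Toy poses with two landmarks each, in normalized image coordinates.
ideal = [(0.50, 0.20), (0.50, 0.60)]
learner = [(0.50, 0.20), (0.53, 0.64)]
score = pose_distance(ideal, learner)
```

Tracking this score over repeated attempts would give a crude progress metric; a real comparison would likely need the poses aligned and scaled first.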
Enjoyed this video. I was able to keep up with the pace. However, I’m running this on a Raspberry Pi and I have some latency issues. Will it be possible to target the GPU instead of the CPU?
Hey Nicholas, Great work!
As a recommendation for your next video using mediapipe (python version): what about a face filter using Python - Mediapipe and OpenGL (not OpenCV)
Where you can add a filter to a specific part of the face ;)
More challenging right? :D
Keep up the good work
Got it coming up! Was actually investigating it today.
Oooh, yes with OpenGL too! 3d perhaps?
@@NicholasRenotte yeah 3D may look interesting
@@badaouiahmed2538 agreed!
If you ever wanted to help, I actually was trying to get the angle of the wrist but it is confusing to me, because of the way it draws. All I need is the segment from the elbow to the point in the wrist, and then from the wrist to the middle of the hand, to get the angle I need, but it appears to me that there is more than one connecting point in the wrist...?
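For the wrist-angle question above, the angle at a joint can be computed from three points: elbow, wrist and a point in the middle of the hand. The sketch below is plain vector maths and does not depend on Mediapipe; `joint_angle` is a hypothetical helper name. A likely reason for seeing more than one point at the wrist is that Holistic returns a wrist in both the pose landmarks and the hand landmarks, so it is worth picking one of them and using it consistently.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at point b, formed by the segments b->a and b->c.

    Works for 2D or 3D points given as tuples, e.g. a = elbow, b = wrist,
    c = middle of the hand.
    """
    ba = [p - q for p, q in zip(a, b)]
    bc = [p - q for p, q in zip(c, b)]
    norm = math.hypot(*ba) * math.hypot(*bc)
    if norm == 0:
        raise ValueError("points a and c must both differ from b")
    cosine = sum(u * v for u, v in zip(ba, bc)) / norm
    # Clamp to guard against tiny floating point overshoot before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


# Toy example: elbow to the right of the wrist, hand straight below it.
angle = joint_angle((1.0, 0.0), (0.0, 0.0), (0.0, 1.0))
```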
Hey thanks so much for the vid! How would you implement this on recorded video (say mp4 that’s extracted from youtube) as opposed to live video?
Hi Nicholas,
Thanks for doing this project. Extending this module to workout count would be really exciting. Looking forward to that video.
Also, I tried the discord link but it seems the link is broken hence I couldn't join.
Thanks soo much @Hitesh. Oh noooo, for real it's not working?
can you please do a video on Stroke detection using FAST (Facial drooping, Arm weakness, Speech difficulties and Time)
Hi great video ! I don't understand why there is no standalone app ready by now its amazing and crazy Ai !
IKR, there's so much possible with it!
Thank you for sharing this tutorial. I tried to run the code but I am facing difficulties to make it work. The window with the camera opens, however, the outline for face, hands and body does not show. I am using PyCharm.
Great video! A question: Can this method be used to convert head movement and facial expressions into keyboard input? If so, where can continue learning to develop this?
This video is great, but i have one question: is it possible to extract only one coordinates and print only that. For example, i want to print only the y value of my left hand. Thanks
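A sketch of pulling out a single coordinate, as asked above. It assumes the results object from the Holistic model exposes `left_hand_landmarks` with a `landmark` list (or `None` when no left hand is detected), and that index 0 is the wrist in the 21-point hand model. The helper name `left_hand_y` and the stand-in objects are made up so the example runs without a webcam.

```python
from types import SimpleNamespace

WRIST = 0  # index of the wrist in the 21-point hand model


def left_hand_y(results, index=WRIST):
    """Return the normalized y value of one left-hand landmark, or None.

    None is returned when no left hand was detected in the frame.
    """
    hand = getattr(results, "left_hand_landmarks", None)
    if hand is None:
        return None
    return hand.landmark[index].y


# Stand-ins that imitate the shape of the model output (illustrative only).
with_hand = SimpleNamespace(
    left_hand_landmarks=SimpleNamespace(
        landmark=[SimpleNamespace(x=0.4, y=0.7, z=0.0)]
    )
)
without_hand = SimpleNamespace(left_hand_landmarks=None)
```

Inside the capture loop, `print(left_hand_y(results))` would then print just that one value per frame.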
Hi Nicholas
Do you think it is feasible to project over the forearm a semi-transparent image (or a grid)?
I am doing research for my thesis in the field of therapeutic FES, and part of the problem I am trying to address is an effective routine for stimulating electrode placement. In order to work, they have to be near where the nerves "get into" the muscles.
As of now electrodes are placed randomly and then the patient has to undergo a very lengthy process for mapping electrode activations to hand/finger contraction/extension. So there is a lot of room for improvement. The topology of those points is relatively invariant among individuals (both healthy and not), so I think that being able to project a semi-transparent anatomic table over the forearm (even if it is not extremely precise) could significantly speed up the process for the subjects and enhance the efficiency of the stimulation, as electrodes are placed more easily near those "motor points"
I believe you could. Presumably the grid would need to be in a 3 dimensional vector space to be able to place it somewhat accurately. I'm experimenting with overlays as we speak, I've had a bit of a backlog but have a lot more planned in this field!
Questions:
Can we access and modify the architecture ?
Can we retrain on new data to make it work on new edge cases ?
1. For MP Holistic, no, but you can build custom models on top of it: ruclips.net/video/We1uB79Ci-w/видео.html
2. Yep, see link above
Hi Nicholas, can we use it to detect the outline of our body?
Instead of a webcam I want to use the live desktop screen to detect objects, is it possible? And show the rectangle on the desktop itself?
I would love to see an example of using this model to coach exercise form. For example you could show the model correct and incorrect form of a push-up. Then given a video input from a user the model would let you know whether your form is closer to the correct one or the incorrect one.
YES! Like a virtual coach. Nice! I'll add it to the list, definitely possible!
@@NicholasRenotte Exactly, awesome thanks!!
@@CalebSchantzChristFollower anytime! Actually pretty pumped to do that video!
Can you give little idea or make a separate video on how to apply clothes etc on human body using pose detection?
Thanks for nice tutorial! I have a question about mediapipe that if its possible to train the model on custom dataset provided by python language or just pre-trained model is available
The keypoint model, no. But you can use the mediapipe model as an input into another custom model!
🔥🔥🔥
Can you show us how to animate 3D object according to the detected body movements or at least the approach you would use?
(Like when they make 3D object dance like you instantly)
Thank you in advance🙏🏼
Working on it @Thuraya, going to be a long video though by the looks of it 😅
Hi Nicholas, thanks for the video. It is a great tutorial. I am new to Python coding and I have a project that I am dreaming of completing soon for some disabled students of mine. I used the code in PyCharm, but I suppose I shouldn't have done that, right? Because after I download mediapipe and opencv and write the code you provided on GitHub, I run it but my camera LED turns on for a sec and then off and I receive this
"INFO: Created TensorFlow Lite XNNPACK delegate for CPU. process finished with exit code 0" My camera never turns on. Do you happen to know what is it about? I appreciate any help
Hey Nicholas, I’ve looking for voice classification tutorials but there is nothing!, it would be great if u teach us with your own database!!
You got it, I'll add it to the list @Bernardo. Also saw an awesome example of voice cloning the other day as well!
Hey bro, how can I transfer this mocap data to a 3d programme like blender, do you have any tutorial on that?
can u also teach me how to detect certain parts of the face by using points number. im having trouble to locate points of my lips... i think i can recolor my lips if i can detect the lipsPoint
You can extract the keypoints from the detected landmarks. Might do a more detailed example of this soon @Biohazard Killer
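One way to go from "detected landmarks" to "just the lips", sketched under assumptions: Mediapipe describes face regions as sets of (start, end) index pairs, and to the best of my knowledge its face mesh module includes one for the lips (`FACEMESH_LIPS`). The helpers below, whose names are hypothetical, turn any such connection set into a list of point indices and pick those points out of a landmark list. A toy connection set is used so the example runs without the library.

```python
def landmark_indices(connections):
    """Return the sorted, unique landmark indices used by a connection set.

    `connections` is an iterable of (start_index, end_index) pairs.
    """
    return sorted({index for pair in connections for index in pair})


def select_points(landmarks, indices):
    """Pick the landmarks at the given indices, preserving index order."""
    return [landmarks[i] for i in indices]


# Toy data: five landmarks and a "region" made of three connected points.
toy_landmarks = ["p0", "p1", "p2", "p3", "p4"]
toy_region = frozenset({(1, 3), (3, 4), (4, 1)})
region_points = select_points(toy_landmarks, landmark_indices(toy_region))
```

With the real model, the landmark list would be the face landmarks from the results and the connection set the lips region; the selected points could then be converted to pixels and filled to recolor the lips.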
very good video, it helps a lot
by the way is it possible to make a hairstyle filter using mediapipe?
but wont looks like a wig?
Hey nicholas, Thanks for the Video ... is there anyway to modify the keypoint? I mean, during the video there are both pose and hand gesture for hands ( they are overlapped) ... the same for Face ( face landmarks and pose landmarks overlapped each other for keypoints in face ) ... is there any option to modify the points?
Thanks
Hey bro, i need exactly this, did you resolve the keypoints problem?
Hey could you help me out what all should I know to get an internship in computer vision. I know OpenCV and created some minor projects using haar cascade. What else should I know to get at least an internship in computer vision. Would really appreciate if you could help me out.
Heya @lucifer777, I would be taking a look at the requirements for the internship and try to meet those as a first step but also check this out, it's for DS/ML but applies to CV as well: ruclips.net/video/xSElsMUqFqI/видео.html
Thank you very much for this great tutorial.
Is there a software to make your own model (to use in game in unity)?
Thank you very much for your help!
I'm going to create a Cinema 4D plugin to detect face mocap
help plz
YESS! Sounds awesome! Planning on doing some stuff in that space when I get some bandwidth!
Nicholas, can you do a lesson about generative adversarial networks (GANs) using TensorFlow?
Yup! Definitely @Thirasha!
As per the previous comments. Is there the possibility to export usable data for blender (e.g. for shapekeys, point clouds or named empties)?
Possibly, I haven't dug into it enough yet to have a good understanding as to how to do it unfortunately Ian.
this looks better than the paid vtuber face tracking apps on the app store
Woah nice! Haven't checked them out but will take a look!