Hey there! Thanks for introducing me to the Marigold model. After testing it, I found it wasn't ideal for real-time applications, and even MiDaS fell short for my thesis project. Luckily, your videos led me to the perfect solution: "Depth Anything". I'm excited to fine-tune it for my project. Your content has been a huge inspiration for my work in Computer Vision. Keep up the great work! 👏👁🗨
Thanks a lot for the awesome words, mate! Yeah, Depth Anything can now run in real time with great performance compared to MiDaS. Huge leap, and I hope you can use it for some cool stuff!
Everything I'm learning about my current project keeps bringing me to your videos. Thanks for posting these in-depth videos. They have helped a lot during my learning process.
Thanks a ton, man! Happy you find them helpful.
For those who might be curious: I'm attempting to combine YOLO and depth estimation to identify products on a shelf and how many are required to fill vacant locations. Got annoyed having to manually recall, so the engineering brain kicked in :)
One of the best channels for AI, computer vision and deep learning.
Wow, thanks a lot for the nice words! Appreciate all of you.
How can I get depth information from the generated heat map, so that I know how far or near a particular object is relative to the camera?
Nice video. Question: how do I get metric values for a pixel from these models?
Thank you, sir, for the amazing video. Can we use this to identify spoofing? If yes, what threshold value should we use to decide that something is spoofing?
Awesome! How can I get the CUDA memory address before the data goes to the CPU? It's a waste of processing to do the image processing on the CPU. Anyway, awesome video!!
So, let's say I have a single normal camera and I take a photo of a certain object. Is it possible to get real-world coordinates of that object from the photo using this model?
How can we get a scalar distance value for a specific object from a depth video, if that's possible?
same question
@@ed6280 AI will get it with no problem.
You will need to extract the positions. Normally you would combine it with a segmentation model to do that.
@@NicolaiAI So does it give depth in meters by default? Can we get that without training?
If you want to measure the exact depth of an object, Depth Anything is not fully accurate. I read in an issue on the Depth Anything GitHub that someone has tested this: the error is fairly large when the object is more than 5 meters away. I think the model is mainly meant for 3D reconstruction rather than measuring actual depth.
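For the question above about getting a single distance value for one object, one simple approach is to take a robust statistic of the depth map inside the object's region. The sketch below is a minimal illustration, not the video's code: `object_depth` and the `(x1, y1, x2, y2)` box format are assumptions, the box could come from any detector such as YOLO, and the result is only in meters if the depth map itself is metric.

```python
import numpy as np

def object_depth(depth_map, box):
    """Median depth inside a bounding box (x1, y1, x2, y2) in pixel coordinates.

    Hypothetical helper: the box format is an assumption, and the value has
    the same units as depth_map (relative units for a relative-depth model).
    """
    x1, y1, x2, y2 = box
    h, w = depth_map.shape[:2]
    # Clamp the box to the image so slicing never goes out of bounds.
    x1, x2 = max(0, int(x1)), min(w, int(x2))
    y1, y2 = max(0, int(y1)), min(h, int(y2))
    if x2 <= x1 or y2 <= y1:
        raise ValueError("box is empty after clamping to the image")
    patch = depth_map[y1:y2, x1:x2]
    # The median is less sensitive than the mean to background pixels in the box.
    return float(np.median(patch))
```

With a segmentation mask instead of a box, the same idea applies by taking the median over `depth_map[mask]`.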
Excuse me, streamer. I'd like to ask how I can convert the output of the Depth Anything model into an actual depth map to obtain the real-world 3D coordinates of a specific pixel in the image. This is crucial for determining whether the model can be applied in real-world engineering projects.
If you want to measure the exact depth of an object, Depth Anything is not fully accurate. Someone tested this in an issue on the GitHub repo, and the error is fairly large when the object is more than 5 meters away. I think the model is mainly meant for 3D reconstruction rather than measuring actual depth.
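On the question of 3D coordinates for a specific pixel: if a metric depth value is available for that pixel and the camera intrinsics are known from calibration, the standard pinhole back-projection gives the point in the camera frame. This is a minimal sketch under those assumptions; `pixel_to_camera_xyz` is a hypothetical helper, `fx, fy, cx, cy` are assumed to come from your own calibration, and the raw relative output of the model cannot be plugged in as `depth_m` directly.

```python
import numpy as np

def pixel_to_camera_xyz(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into camera coordinates.

    Assumes an undistorted pinhole camera; depth_m is the distance along the
    optical axis (Z) in meters, fx/fy are focal lengths in pixels, and
    (cx, cy) is the principal point in pixels.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return np.array([x, y, depth_m], dtype=float)
```

The result is relative to the camera. Getting world coordinates would additionally need the camera's pose (extrinsics).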
I wanted to ask, is there already a version for stereo vision? And how can one calculate distances with an AI detection overlay to obtain object distances?
Thanks for this super video!
Thanks a lot! For that you will have to use the metric depth version of the model; check out their GitHub repo! I might do more videos about that as well.
@@NicolaiAI please do! I'd love to understand how metric depth works.
Hey, I just want to know what this model's performance will be on embedded devices like a Jetson or Raspberry Pi.
It can probably run a few frames per second on a Jetson.
Thanks for the reply 😊. Just one more question: can we get the depth data in, say, centimetres using this model?
Is there any way you could write code that imports a video and exports it as a depth map? Please let me know 🙏
I have set up both MiDaS and Depth-Anything following your instructions, and I am using them to do inference on an RTSP stream. But the Depth-Anything model is much slower than the MiDaS model in my setup. What could be the reason for this?
Any idea how I can use/implement this algorithm for a specific case in my bachelor's thesis?
Can these be converted to actual distances or do we need to use the metric depth model for that?
Can we print out the depth map as we did with the MiDaS model?
Thank you! Are these depth estimators fast enough to run on edge devices like iPhones?
They are the fastest by far, but they need to be optimized and exported to a format that's supported by apps. Once that's done they can run in real time, but it's not easy to do.
Ah, I see, thanks! :)
Thanks for this video!
I just need to ask on what hardware did you run this for real-time performance? And what FPS (or inference time in milliseconds) did you get for each model?
Thanks for watching! With just the raw model I get around 50 FPS or so. That can definitely be bumped up with optimization, removing the visualizations, etc. This is only a 25 MB model, so a very small vision model! It can run pretty fast. I'm running this on a 3070.
@@NicolaiAI wow, that's actually impressive! Thanks a lot.
@malek3764 Yeah, that's their small model. They also have way larger models, which can't run in real time. But now we have amazing results in real time.
@@NicolaiAI Based on your experiments with the models, how big is the gap in quality between the small one and the larger versions?
@@malek3764 Not much! Definitely go with the smaller models unless you do 2D-to-3D work or similar things where you want the highest detail.
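To compare FPS and per-frame inference time on your own hardware, a small timing loop is usually enough. This is a generic sketch, not the setup from the video: `measure_fps` is a hypothetical helper, and `infer` stands for whatever callable runs your model on one frame. For GPU models in PyTorch, the callable should synchronize (for example with `torch.cuda.synchronize()`) before returning, otherwise the timings can be too optimistic.

```python
import time

def measure_fps(infer, frame, warmup=5, runs=50):
    """Return (fps, ms_per_frame) for a single-frame inference callable.

    Hypothetical helper: infer(frame) is assumed to block until the result
    is ready. Warm-up runs are excluded from the timing.
    """
    for _ in range(warmup):
        infer(frame)
    start = time.perf_counter()
    for _ in range(runs):
        infer(frame)
    elapsed = time.perf_counter() - start
    ms_per_frame = 1000.0 * elapsed / runs
    fps = runs / elapsed
    return fps, ms_per_frame
```

Timing the model call separately from drawing and display makes it clear how much of the frame time goes to visualization.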
Hi Nicolai. Is there a model that can output the depth information of a point if we provide the pixel coordinates of that point?
I also want that, using a single camera.
The video is nice. Why don't you provide the required links in the captions?
Thanks! Will upload to my GitHub today! Had some urgent stuff that came up right at release
@@NicolaiAI Alright, fair enough. I've seen your other videos as well, for Stereo Cam Calibration and now this Mono Camera Depth Estimation. It would be SOO helpful if you provided links to the targeted GitHub repos in the description or comment section.
We literally are the ones who watch your videos full from the beginning to the end. So please, take care of that.
Thanks 😃
I appreciate all of you! I promise I'll do my very best and do that going forward.
@@keshav2136 github.com/niconielsen32/depth-anything
Not saying the whole idea for my bachelor's project just went down the drain, but I'm using stereo vision with an IR projector... Cool that you can now do it with just a single camera... yay! 🤣🤣
Haha sorry 😂
@@NicolaiAI Thanks for always posting the latest knowledge, it keeps us on our toes 💪
@@kirkeby7875 Many thanks for following along!
Nicolai Nielsen our beloved
Thanks a ton mate!
Can this code run in Python?
Yup this is running in python
@@NicolaiAI Oh, because the software icon I saw doesn't look like Python... 😅😅😅
Not anymore!
What’s new?
What is "relative depth"?
Depth relative to the camera/viewer, I would assume. Since depth can be captured in different ways, I think the term "relative" in this case is just a designation for the camera it is being captured on, mimicking the depth you'd produce naturally using both of your eyes. That's why this depth map is also used in 2D-to-3D video conversions, because it helps mimic a stereo view the way studios do it.
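Another common reading of "relative" is that the values are only defined up to an unknown scale and shift, so they order pixels from near to far but carry no units. Under that reading, a few known reference distances in the scene are enough to fit the scale and shift with least squares. The sketch below assumes the model output behaves like inverse depth (larger value means closer), which is how MiDaS-style relative models are usually described; `fit_scale_shift` and `relative_to_metric` are hypothetical helpers, and the fit is only as good as the reference measurements.

```python
import numpy as np

def fit_scale_shift(relative, known_depth_m):
    """Fit scale and shift so that scale * relative + shift ~= 1 / depth.

    relative: model outputs at a few reference pixels (assumed inverse depth).
    known_depth_m: measured distances in meters at the same pixels.
    """
    relative = np.asarray(relative, dtype=float)
    inv_depth = 1.0 / np.asarray(known_depth_m, dtype=float)
    # Solve [relative, 1] @ [scale, shift] = inv_depth in the least-squares sense.
    A = np.stack([relative, np.ones_like(relative)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, inv_depth, rcond=None)
    return float(scale), float(shift)

def relative_to_metric(relative, scale, shift):
    """Convert relative (inverse-depth-like) values to meters using the fit."""
    inv_depth = scale * np.asarray(relative, dtype=float) + shift
    return 1.0 / inv_depth
```

At least two reference points at different distances are needed. The fit is tied to the image it was made on and would generally need redoing for other frames or scenes.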