Chat with an image | LangChain custom tools tutorial | Python Streamlit | Computer vision
- Published: Jul 7, 2024
- Code: github.com/computervisioneng/...
0:00 Intro
0:54 Start
2:29 Project overview
8:13 Main process
12:38 Auxiliary functions
27:06 LangChain custom tools
35:58 Create agent
51:32 Demo
52:40 Let's have some fun!
54:22 Outro
#computervision #python #webapp #computervisiontutorial #langchain #streamlit
Did you enjoy this video? Try my premium courses! 😃🙌😊
● End-To-End Computer Vision: Build and Deploy a Video Summarization API bit.ly/3tyQX0M
● Hands-On Computer Vision in the Cloud: Building an AWS-based Real Time Number Plate Recognition System bit.ly/3RXrE1Y
● Machine Learning Entrepreneur: How to start your entrepreneurial journey as a freelancer and content creator bit.ly/4bFLeaC
All my premium courses are available to the Computer Vision Experts in my Patreon. 😉
www.patreon.com/ComputerVisionEngineer
Very cool project, video deserves many many views! Subscribed.👍
Thank you! Glad you enjoyed it! 😃💪
awesome, your hard work is much appreciated!
😃 Thank you! Glad you enjoyed the video! 🙌
Looks like a great project!! Gracias!!
Yeah it is a great project to get more familiar with LangChain! 😃🦾 De nada!
It's great, it makes the knowledge easier and more interesting. Thank you very much.
You are welcome! Glad you enjoyed it. 🙌
Awesome work Sir 💯
Thank you! 😃
Waooo, this looks awesome💥
😃 It is a very cool project!! Glad you enjoyed it, Sreekar! 🙌
It would be awesome to add functionality that lets users edit the uploaded image via prompt: "Cut the [detected object]", "Change the lighting from day to night", etc.
Yeah, it would be awesome to add additional image processing functionalities! I will continue improving this project in future tutorials. 💪💪
Great Project😃😃
Yeah I enjoyed it a lot! 😃💪
Hi, thank you for the awesome video. I do have a question. I understand the concept of building an agent with LangChain. In your example, the function returns the caption of an image. The caption is "a man on a horse with a dog". If I were to query the color of the dog, it would not be able to re-process the image and focus solely on the dog, right? If so, what's the purpose of using a LangChain agent and an LLM? Wouldn't it be better to run image-to-caption and store the caption in a normal database? Or am I missing something here?
Hey, this tutorial is an example regarding how to use LangChain in a computer vision project. I agree we could continue working on this project and add more features to it. 💪🙌
How can we fine-tune the model on custom data? It seems to be specific to the training dataset associated with OpenAI.
I have a question.
What if we don't create the object detection and image captioning tools? Will it still answer our queries related to object detection and captioning?
Nope, without those tools it won't answer any query related to object detection and captioning.
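A plain-Python sketch of why the answer above holds: an agent can only call capabilities that are registered as tools. This is not LangChain's API; the `Tool` class, the `dispatch` function, the tool names, and the two placeholder functions are hypothetical stand-ins, and no model is loaded.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    """Hypothetical stand-in for an agent tool: a name, a description, a callable."""
    name: str
    description: str
    run: Callable[[str], str]


def fake_caption(image_path: str) -> str:
    # Placeholder for a captioning model call (assumption: no model is loaded here).
    return f"caption for {image_path}"


def fake_detect(image_path: str) -> str:
    # Placeholder for an object detection model call (assumption).
    return f"objects in {image_path}"


def dispatch(tools: Dict[str, Tool], tool_name: str, image_path: str) -> str:
    """Run the requested tool, or report that the capability is not available."""
    tool = tools.get(tool_name)
    if tool is None:
        return f"no tool named '{tool_name}' is registered"
    return tool.run(image_path)


# Registry keyed by tool name; removing an entry removes that capability.
tools = {
    t.name: t
    for t in (
        Tool("image_captioner", "Describes an image.", fake_caption),
        Tool("object_detector", "Lists objects in an image.", fake_detect),
    )
}
```

In a real agent the LLM chooses the tool from its description; here the name is passed in directly to keep the sketch small.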
Bro, please, please create a video on how to deploy a custom-trained YOLOv8 model on a Jetson Nano.
I have to submit my final year project within a week and I can't find help with that. Please do it.
Include the installation of transformers in your requirements, from the main Hugging Face repo.
Oh, I missed it! Thank you for the heads up! I will update the requirements file shortly. 🙌
Sir, you have set the lane crossing detection video to private. Can you please make it public again? My work depends on it and that video has helped me a lot. Thank you.
I am preparing a more recent version of the lane crossing detector 😃. It will be available soon! 🙌💪
Can we do this using YOLOv8 with our custom dataset, and ask about that product?
Do you mean creating a custom tool to perform object detection with yolov8? Yes, it is possible. 🙌
Love your videos
Is there an alternative to ChatGPT that you recommend from Hugging Face?
Take a look at HuggingChat. 😃🙌
When I try to publish the repo to GitHub so that we can host it on Streamlit Cloud, it gives an error because the OpenAI secret key is used directly in the code. How can we fix that?
Okay, I fixed that error by using an environment variable instead of exposing the OpenAI secret key on GitHub, but now it gives an error saying I have reached the quota for the OpenAI API key, even though I have not used it at all.
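A minimal sketch of the environment-variable approach described above, using only the standard library. The function name `load_openai_key` and the variable name `OPENAI_API_KEY` are assumptions for illustration; the tutorial's repository may name things differently.

```python
import os


def load_openai_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is missing."""
    # Reading from the environment keeps the key out of the committed source.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting the app."
        )
    return key
```

Set the variable in the shell or in the hosting platform's secrets settings before starting the app. The quota error is a separate issue tied to the OpenAI account, and this sketch does not address it.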
Hello, I followed this tutorial and made the app. But whatever question I ask, the response is the caption (except for object detection). For example, when I asked it to generate a story, it returned the caption of the image. I have done everything exactly as shown here.
Thank you for your feedback, I will try to test it soon and I will update the code if needed.
@@ComputerVisionEngineer I have faced the same issue.
Which model have you used?
I am using the model facebook/detr-resnet-50 for object detection and Salesforce/blip-image-captioning-large for image description. 💪
Can we do this with a free Hugging Face model instead of OpenAI? OpenAI requires a subscription.
Yes, we could use a free alternative instead of OpenAI. 🙌
@@ComputerVisionEngineer will you please implement something instead of OpenAI? I really liked the project and tried my best but couldn't make the project run with a free model.
I'm getting this error. PermissionError: [Errno 13] Permission denied: 'D:\\Courses\\Computer_vision_engineer\\Ask_image_question\\tmp3chlmrdj'
I have granted read write permission to the folder. Need help
What is your OS?
@@ComputerVisionEngineer Windows
@@ComputerVisionEngineer I fixed it. It was due to a NamedTemporaryFile() issue. I replaced it with:
with open("temp.jpg", "w+b") as f:
cool, thanks for the update!
@@ComputerVisionEngineer no problem.
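A possible alternative to the fixed `temp.jpg` filename in the thread above, sketched with the standard library only. On Windows, a file created by `tempfile.NamedTemporaryFile` with the default `delete=True` cannot be opened a second time by name while it is still open, which is a likely cause of the `PermissionError`. The helper name `save_upload_to_temp` is hypothetical and not part of the tutorial's code.

```python
import os
import tempfile


def save_upload_to_temp(data: bytes, suffix: str = ".jpg") -> str:
    """Write bytes to a uniquely named temp file and return its path."""
    # delete=False keeps the file on disk after the handle is closed,
    # so other code can reopen it by path, including on Windows.
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as f:
        f.write(data)
    return f.name


# Usage: the caller is responsible for removing the file when done.
path = save_upload_to_temp(b"example image bytes")
try:
    with open(path, "rb") as fh:
        content = fh.read()
finally:
    os.remove(path)
```

Unlike a fixed filename, a unique temp path avoids two sessions overwriting each other's upload, at the cost of explicit cleanup.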
I really love the projects you're working on and sharing the knowledge