How to Use LLM Vision to Analyze Camera Images and Video in Home Assistant

  • Published: Jan 24, 2025

Comments •

  • @michaelsleen  1 month ago +1

    👉 Let me know any creative ways you’re using LLMs or would like to use them in your home automations!

    • @brian7android985  1 month ago

      I am working towards the house finding a book for me. It should then flash the nearest LED.
      [I have lots of books, but also lots of LEDs.]

    • @michaelsleen  22 days ago

      Sounds like an interesting project. Good luck!

    • @SBinVancouver  3 days ago

      Having trouble getting this set up with OpenAI - haven't found a working example on YouTube yet.

  • @dandixonus  1 month ago +3

    Identify birds at a feeder
    I just used this to process images from a camera I pointed at my bird feeder to identify the common bird name (and the scientific name). When motion is detected, I capture an image from my Reolink camera and send it to Google for processing. When the bird name comes back, I copy the file to a new file with the bird's name in it.
    My prompt is this:
    - If there's a single bird in this photo, write the bird type, scientific name
    - If more than 1 bird, for each type: write bird type, scientific name, number of this type
    - If there is no bird, write "No Bird Found"
    - If you don't know the bird type, write "Bird Type Unknown"
    This video was super helpful. Thank you!
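
    For anyone wanting to reproduce this, here is a minimal sketch of such an automation, assuming the LLM Vision integration's llmvision.image_analyzer action with a Gemini provider already configured; the entity IDs and notify target are placeholders, and field names can vary between LLM Vision versions:

    alias: Identify birds at the feeder
    triggers:
      - trigger: state
        entity_id: binary_sensor.feeder_motion      # hypothetical motion sensor
        to: "on"
    actions:
      - action: camera.snapshot                     # grab a still from the camera
        target:
          entity_id: camera.bird_feeder             # hypothetical camera entity
        data:
          filename: /media/images/feeder_snapshot.jpg
      - action: llmvision.image_analyzer            # send the snapshot to the LLM
        data:
          provider: Google                          # exact value depends on your LLM Vision setup
          message: >-
            If there's a single bird in this photo, write the bird type and
            scientific name. If more than 1 bird, for each type write the bird
            type, scientific name, and count. If there is no bird, write "No
            Bird Found". If you don't know the bird type, write "Bird Type
            Unknown".
          image_file: /media/images/feeder_snapshot.jpg
        response_variable: response                 # holds the model's reply
      - action: notify.mobile_app_my_phone          # hypothetical notify target
        data:
          title: Bird feeder
          message: "{{ response.response_text }}"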

  • @sagar93kamat  9 days ago

    Great video! I successfully set mine up by following your video. Thank you so much for this!

  • @michaelrich4872  1 month ago

    Got it up and running thanks to your tutorial. Hopefully I can get it up and running for vehicles too; just waiting for it to trigger. Next step is figuring out how to get it to ignore our vehicles: coming home is relatively easy, leaving not so much.

    • @michaelsleen  1 month ago

      I am glad to hear the tutorial was helpful!

    • @AdiGraham29  10 days ago

      I'm also looking to do this; can you ping me if you make any progress? Cheers

  • @robinpillekers4871  1 month ago +1

    Hi, great vid you have here. Do you know a way to retrieve the message variable input / metadata (from the normal HA Automation flow) into a Node-RED flow (payload)?

    • @michaelsleen  1 month ago +1

      You know despite years of playing with HA, I’ve yet to dive into Node-RED.

  • @leightonevans1071  21 days ago +1

    How do I know what file path to use for a snapshot image? Can someone give a step-by-step guide?
    On a side note, I find most HA videos assume learners know the steps in between!

    • @michaelsleen  21 days ago

      It's really up to you to decide. As an example, you could use the path /media/images/doorbell_snapshot.jpg for the camera.snapshot action, and then the path /media/local/images/doorbell_snapshot.jpg for the action that sends the image in a push notification to your phone. You can use something other than 'images' or 'doorbell_snapshot.jpg' if you prefer another name or organization structure.
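
      For reference, a sketch of how those two paths line up in the actions (the camera entity and notify service below are placeholders):

      - action: camera.snapshot
        target:
          entity_id: camera.doorbell                        # placeholder camera entity
        data:
          filename: /media/images/doorbell_snapshot.jpg     # path where the file is written
      - action: notify.mobile_app_my_phone                  # placeholder notify service
        data:
          message: Motion at the front door
          data:
            image: /media/local/images/doorbell_snapshot.jpg  # same file, as served to the app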

  • @oxxide  21 days ago

    This is amazing, but my issue is it doesn't show the full response. I can see full responses for notifications from other apps, so I don't think it's a phone issue. It cuts off; for example, my last notification read "The image shows a front porch with a brick wall, a door with a securit....", which I fixed by just making 2 notifications. For some reason, though, the date and time aren't working for me. I also have a G4 doorbell, and last_changed doesn't get a date in the automation but does when I check it with templates.

    • @michaelsleen  21 days ago

      What you’re describing may just be the expected behavior. You can try telling the LLM in your prompt to limit the description to a certain number of characters. But if it’s too long, long pressing on the notification on iOS at least should reveal the full message and show a larger image of the snapshot photo. To my knowledge, there is nothing on the HA side that would limit the amount of text shown - this is enforced by Apple. See here for an example: drive.google.com/drive/folders/1KwET0jCuqH9LbVqm8VUQTJprmTYaZtus

    • @hebsclips  11 days ago

      @@michaelsleen This happened to me as well on Android. I ended up using this prompt to limit it to 250 characters; the last line has it append the timestamp, which is included in the 250-char limit.
      Describe the image in a short and concise message, no more than 250 characters. If you see people, describe their appearance. Describe what they are doing. If they are holding something, describe what they are holding. If it is a vehicle, describe the vehicle color and make and model. Sound like Kanye West. Provide the date and time in dd/mm/yyyy hh:mm am/pm format, at the very beginning of the description.
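
      One caveat worth noting: the model has no reliable clock, so asking it for the date and time can yield a wrong timestamp. A safer variant is to prepend the time with a Home Assistant template in the notification action itself, e.g. (assuming LLM Vision exposes the reply as response.response_text):

      message: "{{ now().strftime('%d/%m/%Y %I:%M %p') }} {{ response.response_text }}"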

  • @AlanJones-vc9yr  21 days ago

    Hi. Great video and thank you. I am using a Ring doorbell camera. All works fine except the notifications via the HA app on my iPhone are truncated, so I only get the first 100-ish characters of the description. Help please, TIA.

    • @michaelsleen  21 days ago

      What you’re describing may just be the expected behavior. You can try telling the LLM in your prompt to limit the description to a certain number of characters. But if it’s too long, long pressing on the notification on iOS at least should reveal the full message and show a larger image of the snapshot photo. To my knowledge, there is nothing on the HA side that would limit the amount of text shown - this is enforced by Apple. See here for an example: drive.google.com/drive/folders/1KwET0jCuqH9LbVqm8VUQTJprmTYaZtus

    • @AlanJones-vc9yr  20 days ago

      @@michaelsleen Hi, thanks for the reply. The message is actually truncated prior to receipt by the HA app on my iPhone. It stops mid-word. I have tried requesting a shorter response, which reduces the characters but still truncates the message??? Flummoxed :-(

  • @Daveyboi1981  11 days ago

    Hiya, the response variable is coming up as not defined, like it's either not being set or the next action can't read it.

    • @michaelsleen  11 days ago

      Does it come up that way when you click Run Action in the automation editor? If so, that makes sense.

    • @Daveyboi1981  11 days ago

      @michaelsleen Even when I run the whole automation.

    • @michaelsleen  11 days ago

      When you say "run the whole automation," do you mean clicking Run Actions in the editor, or do you mean when you walk in front of the camera, triggering the motion event?

    • @Daveyboi1981  11 days ago

      @@michaelsleen Both.

    • @michaelsleen  10 days ago

      Able to drop your automation YAML here using Pastebin?

  • @davidlucas1844  8 days ago

    Hi, great video. Got 95% of it working, apart from sending the image. It must be the directory, but I can't work it out. Also, it doesn't show me the full notification, as it has "..." at the end, and when I tap the notification it just takes me to my Home Assistant page. Can anyone offer advice on the image and the "..."? Thank you

    • @michaelsleen  8 days ago

      Are you using iOS or Android? I've seen some folks with Android struggle a bit more. Go to the folder in your config where you are trying to save the image snapshot, and see if the image is actually being saved there. If it is and you're still not seeing it in the notification sent to your phone, another option is to write image_file as /media/images/snapshot.jpg and filename as /media/images/snapshot.jpg and image as /media/local/images/snapshot.jpg.
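
      In other words, a sketch of the three places the path appears under that /media layout (keys as used by camera.snapshot, the LLM Vision action, and the companion-app notification, respectively):

      filename: /media/images/snapshot.jpg       # camera.snapshot: where the file is written
      image_file: /media/images/snapshot.jpg     # LLM Vision: reads the same file from disk
      image: /media/local/images/snapshot.jpg    # notification data: URL path served to the app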

    • @davidlucas1844  8 days ago

      @@michaelsleen Thanks Michael, I'm using iOS. When I go into traces, at the last step I see the following, which doesn't look correct after the .jpg. Is that right? I'm using the right directory:

      Executed: 17 January 2025 at 12:26:05
      Result:
      params:
        domain: notify
        service: mobile_app_daves_iphone
        service_data:
          message: >-
            Several cars, including a silver sedan, a white SUV, and a maroon
            sedan, are parked in front of modern, two-story homes with solar
            panels on their roofs. 10:42 AM (01-17-25)
          title: Front Door Motion
          data:
            image: /config/www/tmp/doorbell_snapshot.jpg?1737116765.744563
        target: {}
      running_script: false
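
      One possible explanation, assuming the snapshot really is written under /config/www: Home Assistant serves that folder at the /local/ URL path, so the notification's image key needs the URL form rather than the filesystem path (the ?1737116765... suffix is just a cache-busting timestamp and is harmless):

      data:
        image: /local/tmp/doorbell_snapshot.jpg   # /config/www/... on disk is /local/... as a URL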

  • @wscottfunk  1 month ago +1

    Hey Michael, is the image retention automatically managed, or is there a "purge" option to set a maximum number of snapshots to retain or to overwrite the oldest? I could see this taking up file storage if the images aren't deleted/overwritten. Nice job with the tutorial. Much appreciated.

    • @michaelsleen  1 month ago +1

      Each image overwrites the prior. Thanks!

  • @FawziBreidi  1 month ago

    I was trying to make the voice assistant run a vision LLM of some sort, to pass the prompt to the vision LLM after it captures the image, but I was unable to make the assistant trigger a script. For example, I would like to ask my assistant if there is any car parked outside in my garage and have it analyze the image. Sorry for the long message, but if this is possible, please let us know!

    • @michaelsleen  1 month ago

      What is the trigger for your automation? Do you have a camera with vehicle detection? You can try giving LLM Vision a prompt like, "Tell me if there is a car in the image. If so, describe what it looks like in one sentence."

    • @FawziBreidi  1 month ago

      @@michaelsleen I was researching it, and it looks like we need to create new intents to instruct it to run the vision LLM. Might be a good idea to research and do.

  • @VWTesla  1 month ago

    So, I'm intrigued by this video. My use case is to characterize my incoming USPS mail. I'm using the "Mail and Packages" HACS integration to generate an MP4 containing the USPS mail being delivered (USPS Informed Delivery). Now, I already see the MP4 on my Home Assistant dashboard upon hitting a button. What I want to do is have AI read the images and let me know if they're addressed to "Resident" or "Home Owner" versus myself, my wife, or my kids. I'm currently using a home-grown solution in Python (Pytesseract), but I believe there might be a better solution?

    • @michaelsleen  1 month ago

      Interesting. If you feed the images to the AI model you can try giving it a prompt like that.

  • @88Snipi88  20 days ago

    Is it possible to send a message only when the postman arrives?

    • @michaelsleen  20 days ago +1

      In the LLM description you can tell it to only comment if the mail carrier is seen (it correctly identifies ours every time). Then you would need some kind of condition to only proceed with the rest of the automation if the response variable for the LLM contains a response. I haven’t tried this exactly.
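
      A sketch of such a condition, assuming LLM Vision returns its text in response.response_text and the prompt instructs the model to reply with nothing unless the mail carrier is seen:

      - condition: template
        value_template: "{{ (response.response_text | default('')) | trim | length > 0 }}"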

  • @alexfroehlich711  1 month ago

    How do you generate a preview (snapshot) of the camera feed in the Home Assistant notification? I use the exact same camera with Protect, and am on iOS. I can long-press the notification, which opens up the Protect app, but no previews. I bought your code and followed it line for line. Everything is working besides that. Thanks!

    • @alexfroehlich711  1 month ago

      Disregard. Realized I have to use my domain link like you did (through my Cloudflare tunnel); I couldn't just use my internal IP. Thanks

    • @michaelsleen  1 month ago +1

      Glad to hear it’s working!

    • @alexfroehlich711  1 month ago

      @@michaelsleen “A barefoot dude, looking like he just lost a bet, flips off the security camera with a surprised Pikachu face.” I love these responses 😂 this is quickly becoming my favorite automation already.

  • @Shaq2k  1 month ago

    Nice. Can you train the language models? If you have 2 cats, for example, can the language model be trained to know their names and differentiate them?

    • @michaelsleen  1 month ago

      There was a similar discussion about using this to train facial recognition in a post I made on Facebook. See here: facebook.com/share/p/12BUp2ESnMz/?mibextid=WC7FNe

  • @iMazTV  1 month ago

    Let’s go!! 🔥

  • @dimitrisdimitriou9769  1 month ago

    Thank you, I set everything up and it's working very well, but the notification on Android has a problem: when the notification has an image, only two lines of the message are shown.

    • @michaelsleen  1 month ago

      I’ve heard others with Android also talk about this limitation. I’m on iOS and do not have this limitation. Otherwise I’m glad it’s working for you!

  • @alingabrielafloarei3499  1 month ago

    Great video. Is this not the same as generative AI?

    • @michaelsleen  1 month ago

      Generative AI is a broader term that includes LLMs. So, the use of LLMs as shown in this video falls within the scope of Gen AI.

  • @mutley247365  23 days ago

    Hey, I've followed this to the letter. First off, I'm not getting any image through to my phone (Samsung), but I am getting the AI-generated response. Is this possible on Android devices?

    • @michaelsleen  23 days ago

      Yes, it's possible. The only Android nuance I'm aware of is that the AI-generated response may be cut off if it's too long. Check to see if the camera image is being saved to the folder path you put in your automation for generating the camera snapshot.

  • @samiam732  1 month ago

    Do you think HA will ever be made easier to use? I resist it and don't like it because it doesn't seem very user-friendly. I like the way Homey seems, but it's expensive.

    • @michaelsleen  1 month ago +2

      Yes, I expect such things to get easier to use over time. In my ~3 years using Home Assistant, so many things have gotten easier. It’s really come a long way.

    • @kevinallen500  25 days ago

      Define easy. Even as they improve, it's not an off-the-shelf, plug-in product; you need to have enough tech experience to use it. Plug and play is likely years away. There are a ton of videos on how to set it up, but again, you need to have some tech experience.

  • @marcusagren2838  25 days ago

    On my OnePlus 9 Pro (Android), the LLM text doesn't really fit in the notification if I also attach the snapshot. I only see "Here is a description of the image in one sentence..." There's no way to expand the notification so I can see the full description.

    • @michaelsleen  25 days ago

      I’ve seen others with Android say something similar. This is not an issue on iOS, and I don’t have an Android phone.

    • @marcusagren2838  25 days ago

      @@michaelsleen Yeah, maybe a bug in the Android companion app or OS. For now, I split the notification into two: one for the snapshot and one for the LLM.

  • @PhilBlancett  25 days ago

    Are you deleting the images after so many accumulate?

    • @michaelsleen  25 days ago

      I believe only the most recent image is saved.

    • @PhilBlancett  25 days ago

      @@michaelsleen You should check that, because I believe you need to add another script to make sure, or your drive is going to fill up (eventually).

    • @michaelsleen  24 days ago +1

      I double-checked. Each image overwrites the prior.

  • @mazi2be  1 month ago

    Is the Gemini API free, or is there a limited number of free prompts? How does it work?

    • @michaelsleen  1 month ago +1

      I am using the Google Gemini API Free Tier, so it doesn’t cost me anything. There are rate limits, but I’ve yet to hit them.

  • @Shunopoli  1 month ago

    I bought the YAML, and no matter what I do I get: Message malformed: template value is None for dictionary value @ data['actions'][3]['data']

    • @michaelsleen  1 month ago

      Reach out on the Contact page, and I'll get you sorted out: shop.michaelsleen.com/pages/contact

  • @fightingmajor  1 month ago

    Getting this error from using your code: Error rendering data template: UndefinedError: 'response' is undefined

    • @michaelsleen  1 month ago

      Did you try naturally triggering the automation? For example, if your automation is set to trigger based upon motion or a person detected at the camera, try re-creating that by walking in front of the camera and see if it works. If I just click “run” to test out the notification, I also get that error because no response variable exists yet from LLM Vision. But the automation works perfectly for me every time it is naturally triggered by a person being detected at my front video doorbell. And I know several others are using my code successfully. Let me know so I can get you sorted out!

    • @RakshitPithadia  1 month ago

      @@michaelsleen Was facing this same error and realized the actual automation works :)
      Thanks for making this detailed video!

    • @eierund  1 month ago

      @@michaelsleen I'm confused. Why would the response variable not exist if the automation is triggered manually? It would still run the LLM integration first and therefore, create the response variable, no?

    • @michaelsleen  1 month ago

      @@eierund You can run the entire automation, or you can run specific actions within the automation. If you run the entire automation, it should work. But if you run just the action where it sends a notification to your phone, that will not work, and instead present the error message: Error rendering data template: UndefinedError: 'response' is undefined. Regardless, the automation itself still works when triggered.
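
      To make the scoping concrete, here is a sketch of the action order (names are placeholders): the notify step can only render the template because the analyzer step before it declared response_variable, which is why testing the notify action on its own raises UndefinedError:

      actions:
        - action: llmvision.image_analyzer          # runs first and defines 'response'
          data:
            message: Describe what the camera sees in one sentence.
            image_file: /media/images/doorbell_snapshot.jpg
          response_variable: response
        - action: notify.mobile_app_my_phone        # placeholder; valid only after the step above
          data:
            message: "{{ response.response_text }}"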

  • @h3ld3rk1d  25 days ago

    Hello, I'm trying to send to Telegram but no luck. Can you send the response to Telegram? How? Thanks

    • @michaelsleen  25 days ago +1

      I do not use Telegram so cannot comment on it.

    • @h3ld3rk1d  24 days ago

      @@michaelsleen Thanks, my friend. I will try to find a solution.

  • @Tripitakabc  10 days ago

    Perhaps change the title of the video to remove 'and Video'. I just watched to find out how to use the stream analyser, only to find that you only cover snapshots.

    • @michaelsleen  10 days ago

      Thanks for the feedback. The video covers various use cases, but the hands-on example focuses on snapshots as you mention.

  • @TheRealDanielsan  22 days ago

    I can't believe you paywalled the YAML...

    • @michaelsleen  22 days ago

      I share everything you need to know both in my video and in a written article on my website, all for free. It is not necessary to pay a small fee for the code, but for those who want it to be as quick and easy as possible, I make that option available, and I'm not the only one to do so. Thanks for watching.

    • @aijii  18 days ago

      I can't believe you're complaining about a few bucks

  • @clsferguson  1 month ago +10

    Selling the automation? Really?

    • @michaelsleen  1 month ago +12

      Producing quality reviews and tutorials requires a large investment of time. My videos are free and show everything you need. To make it even easier for others, I invest additional time in creating and sharing the Blueprints, code, etc., and you can access these for a small fee.

    • @clsferguson  1 month ago +9

      Better to invest in yourself. Quality generates subscribers/views. Rely on the potential ad revenue/sponsors.
      More people will watch/share/continue to watch if you don't put the YAML behind a paywall.
      This is a hot topic right now, and someone else will outrun you in views because of it.

    • @EricHernandez91  1 month ago +6

      @@clsferguson What's wrong with him selling a shortcut to people who don't want to sit through an entire video that shows you exactly how to do it for free?

    • @Delyn  1 month ago +1

      @@clsferguson Tell me you're not a creator without telling me you're not a creator.

    • @clsferguson  1 month ago

      @@Delyn Hmm, curious: do you charge for any Home Assistant automations you have written?

  • @beecee7359  5 days ago

    Move the mic away from your throat