Wow, great experiment! At the end, I also wondered about the accuracy, so if it's an interesting topic for you, I'd be grateful if you shared your findings.
Thank you!
You can see the accuracy at 10:57 ruclips.net/video/pH07mng2jBU/видео.htmlsi=pGX2A9TTy_gcHFqc&t=657 .
The most common differences are punctuation and lowercase/uppercase.
However, I didn't test a real-life scenario.
The YouTube video has professional sound with a speech from a professional actor.
I don't know what the transcription quality would be in noisy spaces 😂
I'll let you know if I try it :)
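For anyone who wants to quantify the difference rather than eyeball it, here's a rough sketch of how the comparison could be scored. It assumes the third-party jiwer package (my choice, not something from the video) and strips exactly the punctuation/case differences mentioned above before computing the word error rate:

```python
import string

import jiwer  # assumption: pip install jiwer, not used in the video


def normalize(text: str) -> str:
    # Lowercase and drop punctuation so only real word differences count.
    return text.lower().translate(str.maketrans("", "", string.punctuation))


reference = "Hello, World! This is a test."
hypothesis = "hello world this is a test"

# Word error rate after normalization; 0.0 means the only differences
# were punctuation and casing.
print(jiwer.wer(normalize(reference), normalize(hypothesis)))  # -> 0.0
```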
Is there any way to cut the chunking time of 10 seconds down?
OpenAI Whisper doesn't support streaming. There are some third-party libraries that add this feature, but I don't know about their performance.
You can find a link to one of them in the comments here.
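In the meantime, the workaround from the video is plain fixed-size chunking. Here's a minimal sketch of that loop, assuming the openai-whisper and sounddevice packages; the model size and recording setup are my own choices, not necessarily what the video used:

```python
import sounddevice as sd  # assumption: used here for mic capture
import whisper

SAMPLE_RATE = 16_000  # Whisper expects 16 kHz mono audio
CHUNK_SECONDS = 10

model = whisper.load_model("tiny")  # smaller models are the realistic choice on a Pi

while True:
    # Record one 10-second block (blocking call) as float32,
    # the format transcribe() accepts directly.
    chunk = sd.rec(CHUNK_SECONDS * SAMPLE_RATE, samplerate=SAMPLE_RATE,
                   channels=1, dtype="float32")
    sd.wait()
    result = model.transcribe(chunk.flatten(), fp16=False)  # fp16=False on CPU
    print(result["text"])
```

As discussed further down, fixed-size chunks can cut words in half, so treat this as a starting point, not a reliable implementation.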
Would it be easy to pass the transcription to Llama to summarize it, create a task list, etc.?
It shouldn’t be a problem.
There are a lot of technical issues with the transcription, as Whisper tries to transcribe sounds that aren't voices.
But I haven’t tried this.
@itkacher Yeah, but I meant on the RPi itself, with a Llama instance which might run on the Hailo?
I haven't tried Llama. I saw that people run it on a CPU, and it was very slow.
Sorry, I have no idea if it supports Hailo.
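If anyone wants to experiment with the summarization idea anyway, here's an untested sketch of wiring the transcript into a local Llama model. It assumes an Ollama server running on its default port with a model already pulled (e.g. "ollama pull llama3"); the model name and prompt are placeholders, this runs on the CPU (so expect it to be slow on a Pi), and whether any of it can use the Hailo is an open question:

```python
import requests

transcript = "...your Whisper transcription here..."  # placeholder

# Ollama's default local endpoint; "llama3" stands in for whatever
# model you actually have pulled.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",
        "prompt": "Summarize this transcript and extract a task list:\n" + transcript,
        "stream": False,
    },
    timeout=300,
)
print(response.json()["response"])
```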
Nice work!! So if you put that text through translation and played it back as sound out... you would have live translation. If you do such a project, I'm highly interested to see the results :)
Thank you! To be honest, there are plenty of such solutions on the market.
Just Google "ai live translation". However, it's not so simple, and the devil is in the details.
The transcription worked perfectly fine on a speech from Netflix. In real life, sounds and noises will add some false words.
Additionally, the narrator's quality does matter.
Then, the translation works great, but it also produces a lot of false translations.
So it will work, but the quality wouldn't be that good.
And the process requires something more powerful, like an Nvidia Jetson, Xavier, etc.
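That said, the shortest path to trying it: Whisper can translate speech into English directly with task="translate" (English only as the target). Here's a bare-bones sketch of the translate-and-speak idea; the text-to-speech half uses the third-party pyttsx3 package as a stand-in, and "speech.wav" is a placeholder file. On real, noisy audio you'd hit exactly the false-word and false-translation issues described above:

```python
import pyttsx3  # assumption: any offline TTS engine would do here
import whisper

model = whisper.load_model("small")
tts = pyttsx3.init()

# task="translate" makes Whisper output English text directly,
# whatever language is spoken in the file.
result = model.transcribe("speech.wav", task="translate")  # placeholder file

tts.say(result["text"])  # "give that as sound out"
tts.runAndWait()
```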
How can I reach you? I have some questions.
I haven't received any requests on LinkedIn, so I assume you've figured out all the questions :)
Chunking in 10-second blocks is no good; you can cut in the middle of a word, and the model will just guess from context. It's better to use the wrapper project whisper_streaming, which handles streaming audio much more correctly.
You are right, the 10-second approach could cut in the middle of a word.
However, I didn't find a native solution from OpenAI
(docs: platform.openai.com/docs/guides/speech-to-text#improving-reliability )
The purpose of the video was to test performance, and I'm sure a wrapper doesn't improve it.
I guess they "feed" content a few times to cover the "cut" case, but that's an additional operation that will consume even more CPU.
So if someone is looking for a reliable implementation - yes, they should think about it.
Thank you for noticing it :)
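For the curious, here's roughly what that "feed content a few times" guess could look like: carry an overlap between chunks so a word cut at a boundary is heard in full in the next chunk. This is a naive illustration, not how whisper_streaming actually works internally, and the extra audio is exactly the extra CPU cost mentioned above:

```python
import numpy as np

SAMPLE_RATE = 16_000
OVERLAP_SECONDS = 2
OVERLAP_SAMPLES = OVERLAP_SECONDS * SAMPLE_RATE

tail = np.zeros(0, dtype=np.float32)  # audio carried over from the last chunk


def with_overlap(chunk: np.ndarray) -> np.ndarray:
    """Prepend the previous chunk's tail so boundary words are heard in full."""
    global tail
    padded = np.concatenate([tail, chunk])
    tail = chunk[-OVERLAP_SAMPLES:]  # remember our own tail for the next call
    return padded  # 12 s of audio instead of 10 -> more CPU per chunk
```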
I gave up on Whisper and Faster-Whisper for my voice assistant RPi; it's slow, inaccurate, and has some hallucinations. For some reason, Google speech recognition is much faster and more accurate lol
Hm… I haven’t tried it. Will do
Thanks for pointing it out!
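For anyone else who wants to try the Google route from the comment above, here's a minimal, untested sketch using the third-party SpeechRecognition package; note that, unlike Whisper, it sends the audio to Google's free web API, so it needs an internet connection:

```python
import speech_recognition as sr  # pip install SpeechRecognition

recognizer = sr.Recognizer()
with sr.AudioFile("speech.wav") as source:  # placeholder WAV file
    audio = recognizer.record(source)

# recognize_google() calls Google's free web speech API.
print(recognizer.recognize_google(audio))
```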