How to Install & Use Whisper AI Voice to Text
- Published: Jun 19, 2024
- In this step-by-step tutorial, learn how to transcribe speech into text using OpenAI's Whisper AI. Whisper AI is an AI speech recognition system that can transcribe and translate audio files in approximately 100 different languages.
📚 RESOURCES
- Install Python: www.python.org/
- Install PyTorch: pytorch.org/get-started/locally/
- Install Chocolatey: chocolatey.org/
⌚ TIMESTAMPS
00:00 Introduction
00:40 Install overview
01:00 Install Python
02:31 Install PyTorch
03:55 Install Chocolatey package manager
04:53 Install ffmpeg
05:28 Install Whisper AI
05:59 Transcribe one file
07:18 Output files
07:58 Transcribe multiple files
08:39 Available models
09:51 Transcribe in other languages
10:31 Translate to English
11:06 Help
11:40 Quality
12:04 Uninstall
12:14 Wrap up
📺 RELATED VIDEOS
- Run Whisper AI in the cloud for free using Google Colab: • Best FREE Speech to Te...
😢 Uninstall instructions:
- Uninstall Whisper AI
In command prompt, enter:
pip uninstall openai-whisper
- Uninstall ffmpeg
In command prompt, enter:
choco uninstall ffmpeg
- Uninstall Chocolatey
In File Explorer, delete the folder:
"C:\ProgramData\chocolatey"
- Uninstall PyTorch
In Command Prompt, enter:
pip3 uninstall torch torchvision torchaudio
- Uninstall Python
Go to Installed Apps in Windows Settings, search for Python and Python Launcher, click the three dots, and then uninstall.
📩 NEWSLETTER
- Get the latest high-quality tutorial and tips and tricks videos emailed to your inbox each week: kevinstratvert.com/newsletter/
🔽 CONNECT WITH ME
- Official web site: www.kevinstratvert.com
- LinkedIn: / kevinstratvert
- Discord: bit.ly/KevinStratvertDiscord
- Twitter: / kevstrat
- Facebook: / kevin-stratvert-101912...
- TikTok: / kevinstratvert
- Instagram: / kevinstratvert
🎒 MY COURSES
- Go from Excel novice to data analysis ninja in just 2 hours: kevinstratvert.thinkific.com/
🙏 REQUEST VIDEOS
forms.gle/BDrTNUoxheEoMLGt5
🔔 SUBSCRIBE ON RUclips
ruclips.net/user/kevlers?...
🙌 SUPPORT THE CHANNEL
- Hit the THANKS button in any video!
- Amazon affiliate link: amzn.to/3kCP2yz (Purchasing through this link gives me a small commission to support videos on this channel -- the price to you is the same)
#stratvert #whisperai #openai - Science
Run Whisper AI in the cloud using Google Colab (requires no install and is also free): ruclips.net/video/8SQV-B83tPU/видео.html
Didn't work for me. I just get error reports
Works great for me using Co-Lab. Or on my hard drive. Both work great.
But here's something:
I have multiple gmail accounts. And I have a number of tools, add-ons, extensions to Google Drive/Docs/Sheets, including Co-Lab, Apps Scripts, etc.
And I initially set them all up on one google account. But when I go to set up those same tools in my other google drive accounts, I get an error message, and can't do it.
It seems that I can't have stuff in Co-Lab, for example, in more than one google account.
Is there a way, with the Windows installation, to use Whisper OFFLINE?
@@francescooliva5951 once you install, you can use offline.
@@KevinStratvert So the only time I go online is to download the pre-trained model for the first time (tiny/medium/large, according to my choice)? I have an AMD Radeon 530 GPU, but Whisper doesn't seem to detect it. In fact, Task Manager shows 99% CPU usage. What is the average time to transcribe a medium-sized file?
Gosh, Kevin, this is the first video I've seen of yours and I am mightily impressed! I've been in IT for over 30 years and I can tell you that your presentation is one of the leanest and meanest I've ever seen. What a great contribution this is to the community. Thank you very much!
For those having issues with "file doesn't exist": make sure you add the file extension at the end, even if the name you see doesn't show it. For example, if your file is named "file" and it's an mp3, then you must type whisper file.mp3. Hope this helps, because this was not specified.
I need help: "FP16 is not supported on CPU; using FP32 instead". What does this mean?
@@lauram14 Nothing serious, it just uses more RAM and runs slower.
Thank you, I was stuck for two weeks and now it works.
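To make the reply above concrete: the warning appears because half-precision (FP16) math is meant for a GPU, so on a CPU Whisper falls back to 32-bit floats. Here is a minimal sketch, assuming the `openai-whisper` Python package is installed and that `transcribe` takes `fp16` as a decoding option (on the command line the matching flag should be `--fp16 False`). `decode_options_for` and `transcribe_file` are hypothetical helper names, not part of Whisper.

```python
def decode_options_for(device: str) -> dict:
    """Ask for half precision only on a CUDA GPU; use FP32 everywhere else."""
    return {"fp16": device == "cuda"}


def transcribe_file(path: str, model_name: str = "small", device: str = "cpu") -> str:
    # Imported lazily so this file still loads where whisper is not installed.
    import whisper

    model = whisper.load_model(model_name, device=device)
    # Passing fp16=False on a CPU avoids the fallback warning.
    result = model.transcribe(path, **decode_options_for(device))
    return result["text"]
```

Calling `transcribe_file("file.mp3")` on a CPU-only machine should then run without the FP16 message; the output is the same either way, the option only silences the warning.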
still facing the issue for m4a file... is it possible we need to give only certain file types
wait why are you here?
Thank you for doing a complete walkthrough, unlike so many other RUclipsrs who act like they're being thorough but later find out they're skipping small but essential steps as if we already know!
This is probably my favorite video on RUclips ever. It is amazing. It takes a process that I found complicated and turns it into easy to follow steps. It actually takes what could be stress inducing and makes it relaxing with some unintentional ASMR presentation. Very well done.
Amazing walkthrough. Thank you. You've made something that would have been overwhelming for me and taken me hours (if I could do it at all) seem so easy and I was done in under half an hour!!
Thank you , Kevin, for sharing your knowledge and teaching skills in this and your other RUclips contributions. I followed this RUclips video to the letter and was able, with only a few hiccups (of my making), to transcribe very important audio files my wife recorded on her iPad. My Win 10 PC did the job flawlessly to my wife's stringent specifications. Happy wife, happy home. I first tried your excellent video, "Audio to Text" which was satisfactory for very small audio files due to the limited capacity available through Microsoft. The AI system worked very well on a 6 mb audio file (four pages of text in a MS Word file). I haven't yet tried a larger file size but believe it would work fine for larger files. Again thank you for all you do, for sharing your selfless talents and wonderful passion for what you present.
Hi Kevin. Been watching you for awhile and just want to say thanks for all the explanations. Concise and interesting. You've helped me a lot and, again, I thank you. Keep it rolling!
I had previous success with your Stable Diffusion video for a local install. It was the only one I found that was clear and perfectly detailed! This video also was excellent, I just followed your step by step instructions and everything is working great!
Amazing job Kevin. My first attempt at installing Whisper was bad, but your video had me running in no time.
You know, this is one of those videos that you wish you could like 100 times. Much appreciated, man. Amazing video. Thank you so much. Subscribed
Wow, many thanks Kevin. I had my own videos that I was planning to do Voiceover and found it very difficult to listen to and translate the video, this way I was able to generate Arabic text and it is pretty good and even the translate feature to English is excellent. This video solved a lot for me, and I have tested it, and very promising. Many thanks again.
Dude, I can't tell you how many times you've saved me in my IT job. I'm an AV specialist and have never used Python, but with your videos I've been able to use Whisper on Google Colab for short pieces and, with this, on a 40-minute piece, with no struggle. Your written and video guides are pretty incredible.
YOUR VIDEO IS AMAZING!!! It helped me so much with learning languages. I used this Whisper program to convert speech to text, and then I used ChatGPT as a super translator. IT IS ABSOLUTELY AMAZING. Thanks to this video I did 4 days' worth of work in 1 day. The quality of Whisper is absolutely amazing. Kevin Stratvert is the BEST, thank you
Another incredibly useful video and so very easy to follow as well! It works perfectly for my large assembly recordings. Thanks so much Kevin. You're such a great teacher, I just love your stuff!
It worked after some serious debugging but couldn't have done it without this video. Thank you a ton!!
Thanks again, Kevin, for a very useful video. Nice to see Python at work. It reminds me of old-time programming - at least a little. I am 71 and wrote my first program using punch cards... :)
I dropped my whole stack of punch cards once :)
I was just telling my buddy about that. I think AI is going to be as big a jump as punch cards/numbered lines to named variables was
@@noreenstxs9605 I did that back around 1970!
I'm 72... Cards.. IBM 1130 Fortran Apple 2 Pascal 😅
@@hubertmallard7254 Yeah, same here. Programmed in Fortran, Cobol, Pascal, etc. What about the TRS-80? Remember those?
This was indeed a helpful video, even if I wish you skipped package managers for ffmpeg installation. I got Whisper installed and working, testing transcription on a recording of a 70 minute meeting. With a fairly muscular PC, I tried with small, medium and large models. Surprisingly I got more accurate results with small, in addition to quicker results. Great tool, wonderful intro.
I was recently thinking how great it would be to have Whisper local, instead of online only. And, voila, here's Kevin! Readin' minds, and don't even know it; well, you do now. Thanks!
Many thanks for another excellent video. Some of the versions from this video have been updated but I was able to find the ones you mentioned so everything is working as expected.
I teach English (and digital literacy) and sometimes wanted a transcript for an interesting podcast. This is great as it is free, has no time limit and offers other languages which I am keen to test soon. I also like your video which shows how to use the online version of Word.
Btw, I use some of your tutorials in the classroom for my MS app classes and the students love your videos too. The only adjustment I do is slow down the playback as it is sometimes a little too fast for my learners :).
Many thanks again and please keep up the good work!
It worked on Python 3.11.4 and the latest PyTorch! I used a CPU and a 1 minute speech took 4 minutes to be transcribed using the small model, 10 minutes using the medium. The installation in the cloud (00:35) is much faster, with the result in under 1 minute. The medium model can recognize technical words. Thanks for showing this tool.
I use this to watch movies in other languages and this has boosted my language skills more than anything. I feel like language learners are never thought of because they make up such a small percentage of any user base and are quite silent. Thanks Kevin.
Thank you very much Kevin. Your channel helps even laymen like myself appear like tech nerds when I share these solutions with friends. And I always recommend your channel to them.
This is the first time I got a video playing straight after its release!
As an educator, I really like your style of explaining. Thanks
UPDATE:
This is truly the holy grail.
For technical writers, journalists, people who do tons of interviews that need accurate transcription. For paralegals.
This is a game-changer.
I had used the one via Co-Lab before, per Kevin's instructions.
But you are limited to three transcripts a day or something.
With this on my HARD DRIVE, I can translate multiple files.
I assume there's no cap, no limit.
Getting the transcript in all those multiple formats Kevin shows in the video? Almost too good to believe.
I don't have a dedicated graphics card, so I chose "CPU." (Hence the slowness, I reckon.)
I DO have an i7 processor.
But it's a laptop with only 8GB of RAM, and no ability to add more.
I want a desktop so that I can upgrade RAM, get a dedicated graphics card, upgrade processors, etc.
For more of this kind of thing. Automation. Some heavy lifting.
------------------
Okay. Seems to be running. Slowly, but running.
I now have at least two different versions of Python installed on my PC. Installed 3.10.10 just for Whisper. Already had 3.11 to run globally.
I always make choosing an installation location more complicated than it has to be.
But I don't want to run into compatibility problems with the various versions of Python -- plus, I don't know what the implications are as far as Environment Variables, and the fact that the various versions all have to call ffmpeg, chocolatey, or selenium, or whatever.
I installed 3.11 in the default location for Program Files.
I installed 3.10.10 in a folder directly on the C drive that I created for it, called python-3-10-10.
I think that part of the key to success here is following kevin's protocol of going to the folder where the audio files are at, and typing CMD directly into the address bar FROM THERE. (I've seen one or two other vids about python. No one mentioned this good tip.)
Anyway, with my limited knowledge, I think it's like this:
I've installed the following globally:
chocolatey
ffmpeg
Python 3.11
PyTorch
Then, I've installed 3.10 locally, in a folder on the C drive.
I bring my audio files into that 3.10 folder, enter CMD into the address bar there, and all is well.
I'm running 3.10, and still, I guess, accessing all those global resources that I need to.
I have a similar hardware setup (no CUDA, only CPU), and I've been wondering how long it takes to transcribe a 1-hour video file using the large model (--model large). What's been your experience?
@@antipupsz2411 Yup, I believe it was faster using Co-Lab. The advantage of using it on your hard drive, though, is transcribing multiple files. Set and forget it, go outside.
@@antipupsz2411 Hey, how do you manage to transcribe 1 hour? I tried a 1-hour .mkv file, but every time it only transcribes 1 minute :(
@@etnisu You have to wait a lot for it to keep transcribing
@@antipupsz2411 Hey I'm transcribing one hour as well and it's been like 4 hours and only now it's halfway. I'm using medium model with my 6GB GPU and this is very slow. How long did it take for you?
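For the "set and forget it" batch approach mentioned in this thread, the `whisper` command accepts several audio files in one call. Below is a rough sketch that collects the audio files in a folder and builds that command line. The extension list is an assumption (ffmpeg reads many more formats) and `build_whisper_command` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed list of extensions; extend it to match your own recordings.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4", ".mkv"}


def build_whisper_command(folder: str, model: str = "small") -> list[str]:
    """Build one `whisper` command line covering every audio file in `folder`."""
    files = sorted(
        str(p)
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    if not files:
        raise FileNotFoundError(f"no audio files found in {folder}")
    # --output_dir keeps the transcripts next to the recordings.
    return ["whisper", *files, "--model", model, "--output_dir", folder]
```

The resulting list can be handed to `subprocess.run(..., check=True)`. On a CPU-only machine a long recording may look stuck for a while, so it is worth letting the command finish before judging the result.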
This is why I love internet! To execute a neural network you just have to follow simple guidelines!
There are issues and stuff to figure out yourself, but this is such a great jumpstart!
This is one of the best step-by-step instructions I've ever seen. Thank you!
Thank you so much. Great instructions with exactly the right level of detail. Got whisper running on first try.
It is really amazing how good it is at transcribing songs! Using that for my home build arranger/karaoke keyboard :)
Thank you! This is exactly what we needed to transcribe our tiny DnD podcast!
It's a great help to sort and summarize important info from a vidz 😊. Thank you mr. Kevs!
Thank You! I was able to transcribe my mp3 file. Excellent technology for next week's online course.
Incredibly helpful. Thank you.
Whenever I want to use some (free and very useful) open-source tool, I'm always baffled by how difficult and unintuitive it is to get it running by yourself
So useful and clearly presented, never stop making videos
This exercise gave me some solid experience troubleshooting errors. I had to pull teeth to get Homebrew (using a Mac) to install properly, and then had an SSL certificate error, but Google & Stack Overflow came to the rescue, and Whisper is working like a charm. Thanks for the great video!
By the way, if anyone gets an SSL certificate error using Python3 (which apparently is common), just enter the following in terminal, exactly as written (but check your version*):
/Applications/Python\ 3.11/Install\ Certificates.command
* Just adjust the version number to match your release, in the example above, I updated it to 3.11
people like you further motivate me to share my knowledge with the internet. Thank you so much! you have saved me a ton of time.
I have this problem but when I try your solution the terminal says: "no such file or directory: /Applications/Python" Do you know how to fix that?
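The "no such file or directory: /Applications/Python" message suggests the shell cut the path at the space, which happens when the backslashes are lost while copying. One way around this is to look the script up instead of typing the path. This is a small sketch that assumes the standard python.org layout on macOS; `find_certificate_installers` is a hypothetical helper name.

```python
from pathlib import Path


def find_certificate_installers(applications_dir: str = "/Applications") -> list[Path]:
    """Return every 'Install Certificates.command' found under 'Python X.Y' folders."""
    root = Path(applications_dir)
    if not root.is_dir():
        return []  # not macOS, or a non-standard install location
    return sorted(root.glob("Python */Install Certificates.command"))
```

Whatever path it prints can be pasted into Terminal wrapped in double quotes, for example "/Applications/Python 3.11/Install Certificates.command", so the spaces no longer need escaping.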
Thank you Kevin for sharing your walkthrough. I've been looking at paid platforms for transcription. So easy when you know how
bless your soul, my assignment would've never been submitted on time if it weren't for this video 🙏
Running flawlessly for me. What a fantastic guide. I had to download latest version of pip to work but no hitches installing anything for me.
Wow! Really impressed how quick and easy this was. Would love a follow up video on how to incorporate something like pyannote to this so that we can also have speaker diarization!
Excellent how-to, easy to follow and descriptive. Thanks!
Thanks so, so much for this great programme. Right now I am running an English school. Covid-19 has really hit my business badly. I will share this useful app to help my students. Again, thanks so much.
Thank you Kevin for what you do. I followed the instructions. I added the following in case some newbies wanted this.
I installed Python version 3.11.5 in Windows 11 and it works fine. In Windows Explorer, I created a folder under the C: Drive called Whisper. I then copied my mp3 audio file (from data drive) to C:\Whisper, typed in cmd in the address field to bring up the Command Prompt, and then typed
whisper filename.mp3 --model medium [and then Enter].
A 36-minute conversation (50mb) took a little over 39 minutes to run. I then cut all the files from C:\Whisper and pasted them into a folder on my data drive. Then I copied the text version into a version of Word that I don’t pay a monthly fee for and saved it. 😊
Hope this helps someone.
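The timings people report in these comments are easier to compare as a ratio of processing time to audio length. Here is a small sketch of that arithmetic; the function names are made up for illustration, and an estimate only holds for the same machine and the same model.

```python
def real_time_factor(audio_minutes: float, processing_minutes: float) -> float:
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    if audio_minutes <= 0:
        raise ValueError("audio_minutes must be positive")
    return processing_minutes / audio_minutes


def estimate_minutes(audio_minutes: float, factor: float) -> float:
    """Rough processing time for another file on the same machine and model."""
    return audio_minutes * factor


# The 36-minute conversation above took a little over 39 minutes on the medium model.
factor = real_time_factor(36, 39)
# Scale that ratio to a hypothetical 60-minute recording.
print(round(estimate_minutes(60, factor)), "minutes, roughly")
```

By the same measure, the 1-minute clip that took 4 minutes on a CPU has a factor of 4, while the 45-minute webinar done in 11 minutes on a GPU comes in well below 1.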
I tried Python 3.11.5 too, but every time i go in my C:\Whisper folder and type in CMD where I type in Whisper test.wav it says:
FileNotFoundError: [WinError 2] The system cannot find the specified file
Do you know a solution?
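Two common causes of this error seem to be that ffmpeg is not on PATH (Whisper calls ffmpeg to decode the audio) or that the file name was typed without its real extension. The sketch below checks both; it is an assumption-laden diagnostic rather than anything Whisper provides, and `diagnose` is a hypothetical helper name.

```python
import shutil
from pathlib import Path


def diagnose(audio_name: str, folder: str = ".") -> list[str]:
    """Return readable reasons why `whisper <audio_name>` might fail in `folder`."""
    problems = []
    # Whisper shells out to ffmpeg; a missing ffmpeg can surface as [WinError 2].
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg is not on PATH")
    target = Path(folder) / audio_name
    if not target.is_file():
        # File Explorer may hide extensions, so look for files with the same base name.
        stem = Path(audio_name).stem
        candidates = sorted(p.name for p in Path(folder).glob(stem + ".*"))
        hint = f" (did you mean: {', '.join(candidates)})" if candidates else ""
        problems.append(f"{audio_name} not found in {folder}{hint}")
    return problems
```

Running `diagnose("test.wav", r"C:\Whisper")` returns an empty list when both checks pass. If ffmpeg is reported missing, reopening the Command Prompt after installing it is worth trying, since PATH changes only apply to new windows.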
My brother you have saved me literally over a thousand hours of work. This made a life-changing improvement on my productivity
Outstanding tutorial as always Kevin. Thank you. I used this to transcribe my recording of a 45-minute webinar so I could read along and highlight as I listened to the replay. It took just 11 minutes on my high-end gaming computer with a Geforce RTX-3060 Ti graphics card. Very useful tool!. ‼
Sounds great, which model did you use? The default small model or a larger one?
@@generalgeert I used whisper --model medium
Which CPU did you have for that transcribe? Thank you
I have an RTX 7800xt, but when transcribing it is the CPU that does the work... how do I use the GPU?
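A quick way to see whether Whisper can use a GPU at all is to ask PyTorch directly. This is a minimal sketch, assuming PyTorch is installed as in the video; `pick_device` is a hypothetical helper name.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed in this Python environment
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints "cpu" on a machine with an NVIDIA card, the installed PyTorch is probably a CPU-only build, and reinstalling it with a CUDA option from the PyTorch page linked in the description may help. The standard CUDA builds target NVIDIA hardware, so AMD Radeon cards like the one mentioned earlier in the comments will most likely keep running on the CPU under Windows. When a GPU is detected, `whisper file.mp3 --device cuda` should select it explicitly.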
dude thank you this actually worked compared to other tutorials!
amazing tutorial. Thank you for this super high quality well thought out tutorial. went super smooth.
Great instructional video. Clear and informative. Thank you.
Awesome tutorial. Thanks Kevin. Whisper AI is an amazing tool.
Love these videos, Kevin Keep them coming man!
Awesome! Thank you so much. You helped me actually get this to work (after watching several other videos!).
Crystal clear tutorial. Worked the first time trying. Thanx buddy! 😁
Great to hear!
Very useful thanks. As always very clear succinct videos
Kevin, you are my tech genius! That came in the right time.
Thanks heaps for your amazing video:)
You are a legend. What an amazingly helpful and easy to listen to tutorial on this.
Very cool my dude, thank you for helping with this. I would have never gotten this on my own
Brooo! The CMD trick is so good!
Appreciate your teaching Kevin, love and respect from Singapore :)
Wow, this really works, and it's explained in great detail!! You could never find such a detailed tutorial in China.
This is great, but I wish there was a way to output in a Word document and segment by speaker - more a comment on whisper functionality than the video. Great work!
Why don't you use a GUI for Whisper? No one has time to play with the console... regards! :)
Bro I dunno what to say but this is the thing that I have been looking for. Thank you a lot.
Well, I got it to work so I'm good. Your instructions are excellent!
THANK YOU - this tutorial is fantastic.
Thank you for all the time you put into making this step-by-step guide, all worked, yay! It did, however, take over 2.5 hours to transcribe a 40-minute interview in .wav. Is that how it's supposed to be? Anyone else noticing similar sluggishness? 🤔
Incredible!! Thank you so much, I understood everything, it was super clear. Well done, keep it up.
Thank you for the details. I like how logical your tutorial is and how it explains things from the basics. I am curious about how the text is split into clips: what is the split based on? If the audio is a two-person conversation, will each clip correspond to one person? I am stuck on speaker identification using Whisper.
Awesome work Kevin. Subscribed
Amazing! Thanks for such a helpful video, dude!
Thanks for the clear instructions to use the tool. It works on python 3.11.1 also albeit with a few errors that can be ignored
I'm running it on python 3.11.5 error free.
Well, it is a wonderful video and useful too, but it's taking longer time to load the transcript. Thanks to you Kevin!!
Thank you for great video, Kevin!
Holy... I've been following you for quite some time now and I have to say, you lost me on this one. I'm sure there's another way I can accomplish this, not to say you are wrong, or giving bad advice or whatever, in fact just the opposite, you explained it perfectly and of course I have no doubt it's doable. In fact I'm writing this to get you more comments on the video. Great job, it's just one I'll pass on.
God bless you! Thank you for explaining the process in a simple and easy to follow way.
Excellent tutorial. Great job. Thank you
Thanks, Kevin. Super helpful!
This was very helpful. Thanks a lot!
It's working !!! Thank you for help ))
I was able to transcribe and translate audio with Whisper!! Thank you so much!!
Amazing video dude, thanks!
Fantastic video. 💯 I was wondering if there's a way to have the transcribe italicize or bold words based on the audio. I know - asking for a bit much. But it doesn't hurt! I'm a voiceover artist and sometimes books I read have emphasis, questions etc. Please let me know and again, thank you for all the digestible videos you upload. 👍
extremely good tutorial! Thank you!
Worked for me. Thanks... good content.
Works perfect! Thanks!
Fantastic video. I'm going to grab the transcript and start installing on another i7 laptop and see what happens. Thank you sir!
Your content is a jewel, ty!
great explanation and all straightforward
just 2 words for you.. you are incredibly awesome.
Kevin, you're such a sweetheart... just when I needed transcribing software...
Voilà!!! Here you are with the solution..
Kevin, are you reading my mind? Answer me
Nice one bruh... you make everything seem easy.. And working SMART Muah!!! Kevin Kevin!!!!! Thank you ❤
Hi Kevin, thank you for the training video.
Good tutorial! Easy to follow
Thanks! Worked perfectly ;)
Thank you so much for this video! It was extremely helpful! I have a quick question though. Is there a way for Whisper to cluster the different voices in the audio and identify different speakers?
Just awesome!!! Thanks a TON :buddy )
Super helpful, thank you so much
Thanks kevin excellent video, and excellent way to explain.
I would like to know whether it's possible to generate just one file per audio file and add a little formatting to it, such as "\n" between the lines, or have the file saved under a specific name at the end of the process?
Thanks again
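On the question above: the command line has an `--output_format` option (for example `--output_format txt`) that limits the output to one file type, and `--output_dir` to choose where it goes. For a custom file name and custom line breaks, the Python API returns a dictionary with a "segments" list that can be written out by hand. This is a sketch under those assumptions; the helper names and the `_transcript.txt` suffix are made up for illustration.

```python
from pathlib import Path


def segments_to_text(result: dict) -> str:
    """Join Whisper segments into one string, one segment per line."""
    return "\n".join(seg["text"].strip() for seg in result["segments"])


def save_transcript(result: dict, audio_path: str, suffix: str = "_transcript.txt") -> Path:
    """Write a single text file next to the audio file, named after it."""
    out = Path(audio_path).with_name(Path(audio_path).stem + suffix)
    out.write_text(segments_to_text(result), encoding="utf-8")
    return out
```

Here `result` would be whatever `model.transcribe("talk.mp3")` returns, so `save_transcript(result, "talk.mp3")` leaves a single `talk_transcript.txt` beside the recording.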
Thank you for that guide: simple and to the point, but full of info. Liked.
And yes, I've installed and used Whisper, and it works. In places it drops the correct word endings or picks the wrong letter, but the transcription quality is insanely good, even for Russian, under normal conditions.
Absolutely amazing, but I have one question. What is the most stable version of Python to run since 3.10.10 is the older version and there are updates such 3.12.x? Do you recommend to try the latest or which one is the most stable and the newest? Thanks again for this which saves me much in my doctoral research projects.
Worked for me, thank you.
Thanks. Your videos are easy to understand as they are explained step-by-step . Is it possible to make a video on Whisper JAX? Thanks once again