How to Convert Speech to Text for FREE Using Whisper AI & Google Colab (Step-by-Step Tutorial)

  • Published: Jan 25, 2025

Comments • 39

  • @ElleWang
    @ElleWang  3 months ago +2

    00:00 Intro: Turning audio into text for free
    00:09 No Downloads: No local installation needed
    00:17 Whisper AI & Colab: Using Whisper AI with Google Colab
    00:50 Google Colab Setup: How to use Google Colab
    01:58 Runtime Options: Choosing CPU vs GPU
    03:37 Install Packages: Setting up Whisper and FFmpeg
    04:25 Upload Files: Adding audio/video files to Colab
    05:25 Choose Model: Picking the right Whisper model
    06:16 Run Transcription: Executing transcription
    07:07 File Outputs: Different file types explained
    07:44 Avoid File Loss: Save files before Colab resets
    Thank you for watching! Let me know if you have any questions down in the comment section! 😀
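
    Editor's note: the "Choose Model" and "Run Transcription" steps listed above can be sketched as a small helper that assembles the Whisper command line. This is only a sketch, not the exact script from the video: it assumes the open-source `whisper` command-line tool and FFmpeg are already installed in the session, and `lecture.wav` and `transcripts` are hypothetical names used for illustration.

```python
import shlex

# Model sizes offered by the open-source Whisper package;
# larger models are slower but usually more accurate.
MODELS = ("tiny", "base", "small", "medium", "large")


def build_whisper_command(audio_path, model="medium", output_dir="."):
    """Return the argument list for one Whisper CLI transcription run."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    return ["whisper", audio_path, "--model", model, "--output_dir", output_dir]


# "lecture.wav" and "transcripts" are placeholder names, not from the video.
cmd = build_whisper_command("lecture.wav", model="medium", output_dir="transcripts")
print(shlex.join(cmd))
```

    The printed line could be pasted into a Colab cell with a leading `!`, or the list could be passed to `subprocess.run`.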

  • @piqueselio
    @piqueselio 19 days ago

    Interesting video, well done and explained. Thanks, it was helpful to me.

    • @ElleWang
      @ElleWang  18 days ago

      So glad to know it was helpful to you! Feel free to share any questions you have. 😊

  • @Tom77889
    @Tom77889 5 days ago

    Can you also make a Google Colab on a free talking avatar? Thank you.

  • @waichow5887
    @waichow5887 12 days ago

    Hi, is there a way to get a transcript from a YouTube video that doesn't have transcription built in? For example, one where CC captions weren't activated. Thanks

    • @ElleWang
      @ElleWang  12 days ago +1

      Hi there, you can record the audio on your local computer, then use the method in the video to get the transcription.

  • @RisottosWife
    @RisottosWife 3 months ago

    Thank you so much! I really appreciate it!

    • @ElleWang
      @ElleWang  3 months ago +1

      Thank you for your kind comment! So glad it was helpful to you!

  • @ilham-z5j5m
    @ilham-z5j5m 6 days ago

    idk what is wrong with mine, i can't do it. i've got 500 audio files to transcribe

    • @ElleWang
      @ElleWang  6 days ago

      Did you get an error message? Sometimes uploading your audio files to the folder takes a while. You will need to wait to run the script until after your audio upload has completed.
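
      Editor's note: for a large batch such as 500 files, one possible approach is to list only the audio files that do not yet have a transcript, so that an interrupted run can be resumed. This is a sketch under assumptions: the helper name, the extension list, and the folder layout are hypothetical, and it assumes each transcript is saved as a `.txt` file named after the audio file.

```python
from pathlib import Path

# Assumed set of audio extensions; adjust to match the uploaded files.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac"}


def pending_audio_files(audio_dir, output_dir):
    """List audio files in audio_dir with no matching .txt file in output_dir."""
    audio_dir, output_dir = Path(audio_dir), Path(output_dir)
    pending = []
    for path in sorted(audio_dir.iterdir()):
        if path.suffix.lower() not in AUDIO_EXTENSIONS:
            continue  # skip anything that is not an audio file
        if (output_dir / f"{path.stem}.txt").exists():
            continue  # a transcript already exists from an earlier run
        pending.append(path)
    return pending
```

      Each returned path could then be passed to the transcription command one at a time, after the upload has fully finished.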

  • @steffenb.4322
    @steffenb.4322 1 month ago

    Hi ElleWang, I have now installed Colab, but I can't find it afterwards in the selection. What is the problem?

    • @ElleWang
      @ElleWang  1 month ago

      That sounds rare. Make sure you log in to the same Gmail account and try refreshing. (It doesn't work on a tablet or phone, so try using the browser on a PC/Mac.) Good luck!

  • @yoyoschmo1
    @yoyoschmo1 24 days ago

    What about YouTube? I have the link, or the video downloaded on YouTube, but can’t get the video to my computer

    • @ElleWang
      @ElleWang  24 days ago

      Great question! Whisper AI can't transcribe directly from a YouTube link. I'd first check whether YouTube's own transcription is satisfactory. You can click on the "description" of any YouTube video, scroll all the way down, and click on "Show Transcript". Then you will find the transcription on the right-hand side. Hope this helps!

  • @Tom77889
    @Tom77889 5 days ago

    Hi! Can you make a Google Colab on free TTS à la ElevenLabs?

  • @Abhijit-12361
    @Abhijit-12361 22 days ago

    What about the large model?

    • @ElleWang
      @ElleWang  22 days ago

      The large model takes longer to process. I’d recommend testing the large model on audio with heavy accents or strong background noise. Or, if you are not satisfied with the result from the medium model, switch up to large. Hope this helps!

    • @Abhijit-12361
      @Abhijit-12361 21 days ago

      @@ElleWang There's another project called Seamless M4T which has speech-to-speech translation, but it's not installing on Colab. Do you have any idea about it?

    • @ElleWang
      @ElleWang  21 days ago

      Thank you for asking about Seamless M4T! Yes, it's a powerful tool, but I should mention that speech-to-speech tasks require significantly more GPU resources than speech-to-text. That's why it might struggle on Colab. I'd recommend running it locally on your own GPU if possible.

  • @c4uk1
    @c4uk1 3 months ago +1

    Hey, thank you so much for this video, so helpful.
    Just wondering:
    Do you know a method to convert a foreign-language speech video to English text, please?

    • @ElleWang
      @ElleWang  3 months ago +1

      Thank you for your comment! Yes, you can use Whisper AI to translate other languages into English as well. It depends on what source language you are using. You can check the URL in the description to find the command for it. I may also plan a tutorial video on translation with whisper. :-)

    • @c4uk1
      @c4uk1 3 months ago

      @@ElleWang Thank you for your response. Found the command in the URL. Will give it a try and see how it does. Thank you :)
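
      Editor's note: the command from the linked URL is not reproduced in this thread. As a rough illustration only, the open-source Whisper command-line tool has a `--task translate` option that outputs English text, and a `--language` option for naming the source language. The helper name and the file name `interview.mp4` below are hypothetical.

```python
def build_translate_command(audio_path, source_language=None, model="medium"):
    """Return the Whisper CLI arguments for translating speech into English."""
    cmd = ["whisper", audio_path, "--model", model, "--task", "translate"]
    if source_language:
        # Naming the source language skips automatic language detection.
        cmd += ["--language", source_language]
    return cmd


# "interview.mp4" is a placeholder file name used only for illustration.
print(" ".join(build_translate_command("interview.mp4", source_language="Chinese")))
```

      Leaving `source_language` unset lets the tool try to detect the language on its own, which may be less reliable on noisy audio.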

  • @formationWPfacile
    @formationWPfacile 1 month ago

    Thank you Elle. I found a Chinese drama on YouTube (Mandarin-speaking) and I would like to get the Chinese text to convert it into Pinyin to learn Mandarin Chinese.
    Do you think it's possible with this tool? The YouTube video has English captions and Chinese characters (transcription is only in English), but I need Pinyin to learn how to pronounce each word. If I get the text of the video in Chinese, I know how to convert it into Pinyin with free websites. Watching dramas is a great way to study a language.

    • @ElleWang
      @ElleWang  1 month ago

      Yes, you can use Whisper AI to transcribe videos in Chinese and then translate them into English, also using Whisper. (You can check out another video on my channel that focuses on the "translation" function of Whisper AI.) :-)

    • @formationWPfacile
      @formationWPfacile 1 month ago

      @@ElleWang Thank you very much, I did it and it worked fine! Very impressive tool. I'm also building a language app, so this kind of AI tool can help me a lot.

    • @formationWPfacile
      @formationWPfacile 1 month ago

      @@ElleWang it's cool to make captions for videos in different languages. Subtitles took a long time to make when I was doing that on my Commodore Amiga in the '80s ^^

    • @ElleWang
      @ElleWang  24 days ago

      So glad you found the video helpful! Yes, indeed, it used to take forever to manually transcribe and translate. Your language app idea sounds fascinating! Good luck with everything!

    • @formationWPfacile
      @formationWPfacile 24 days ago

      @@ElleWang Thank you. My app will be free, with no ads. I hope to offer 300 free lessons with the help of AI. I'm trying to make a conversational live cartoon: for example, in a restaurant, you speak and answer the waitress/waiter, then depending on your answers or questions, the next dialogue is different. I'm trying to reproduce a real situation. If you know an AI that can do this (create animated cartoons, voice recognition, text to speech), pls let me know. :)

  • @FundacionUPIPADE
    @FundacionUPIPADE 3 months ago +1

    Thank you so much for the tool! However, I uploaded a large wav file (a one-hour lecture) and the text was incomplete. Do you know what could have happened?

    • @ElleWang
      @ElleWang  3 months ago

      Thank you for your comment! Re: your question - It might be a Google Colab session timeout. Sometimes the Wi-Fi connection can affect it. I've processed a 2-hour wav file successfully in the past. Good luck!
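
      Editor's note: if a session timeout is the cause, copying finished outputs out of the temporary workspace at least preserves whatever was produced before the reset. The sketch below uses only the Python standard library. The function name and extension list are hypothetical, and the idea that `backup_dir` points at a mounted Google Drive folder (for example after calling `drive.mount("/content/drive")` from `google.colab`) is an assumption, not something shown in this thread.

```python
import shutil
from pathlib import Path

# Assumed set of output file types produced by the transcription run.
TRANSCRIPT_EXTENSIONS = {".txt", ".srt", ".vtt", ".tsv", ".json"}


def back_up_transcripts(work_dir, backup_dir):
    """Copy transcript files from work_dir into backup_dir; return the copied names."""
    work_dir, backup_dir = Path(work_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(work_dir.iterdir()):
        if path.is_file() and path.suffix.lower() in TRANSCRIPT_EXTENSIONS:
            shutil.copy2(path, backup_dir / path.name)  # keeps file timestamps
            copied.append(path.name)
    return copied
```

      Running this right after each transcription finishes, rather than at the end of a long session, reduces how much can be lost.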

  • @tarekolya9457
    @tarekolya9457 2 months ago

    I can’t find it in Google Drive on my iPad

    • @ElleWang
      @ElleWang  2 months ago +1

      Hi there, you will need to use a desktop/laptop to use Google Colab. :-)

    • @tarekolya9457
      @tarekolya9457 2 months ago +2

      @ElleWang Thanks for your quick response 👍

  • @MohammedKhan-qs5nr
    @MohammedKhan-qs5nr 1 month ago

    it took me 4 mins to transcribe a 30 sec audio wav file, is that expected? also, does this work with aac files? and thank you so much for sharing this

    • @MohammedKhan-qs5nr
      @MohammedKhan-qs5nr 1 month ago

      Sorry, I was using the regular CPU for this; when I switched to the T4 GPU it took 47 secs. But do I have to install whisper every time I transcribe a file? It showed me the error /bin/bash: line 1: whisper: command not found, and it only went away when I reran the install command

    • @ElleWang
      @ElleWang  1 month ago

      Yes, you need to run the same script, including the install lines, every time you start a new Google Colab session. And yes, use the “T4 GPU” whenever you can! :-)

    • @MohammedKhan-qs5nr
      @MohammedKhan-qs5nr 1 month ago

      @@ElleWang Thank you
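
      Editor's note: the "whisper: command not found" error above means the tool is missing from the current session. A quick check before transcribing can make that obvious. This sketch uses only the Python standard library; the function name and the default tool list are hypothetical choices for illustration.

```python
import shutil


def missing_tools(tools=("whisper", "ffmpeg")):
    """Return the command-line tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Re-run the install cell first; missing:", ", ".join(missing))
else:
    print("whisper and ffmpeg are available")
```

      If anything is reported missing, re-running the install lines from the script should bring it back for the current session.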

  • @existentialbaby
    @existentialbaby 1 month ago

    i am running faster-whisper on my 2016 entry level potato laptop