Live Speech to Text with Watson Speech to Text and Python | FREE Speech to Text API

  • Published: Oct 19, 2024

Comments • 337

  • @StevePGLy
    @StevePGLy 3 years ago +14

    Hey Nicholas. I am Steve, and I'm on a learning journey to become a data scientist. I would like to say that I really appreciate what you are sharing! I have been studying pretty much by watching your great videos :D

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +2

      Thanks so much Steve! So glad you're enjoying the videos!!

  • @TimDownsAnimation
    @TimDownsAnimation 3 years ago +8

    I'm an amateur at this stuff, but I'm trying to design a proof-of-concept that proposes a hybrid of something like this with lip-reading AI to produce real-time subtitles for the deaf and hard-of-hearing. In my research, I came across your channel and I love it! You're great at explaining things and it's easy to follow along. Instant sub!

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Ayyyy, welcome to the channel. Also, that sounds like a sick MVP!

    • @KANEDAX1987
      @KANEDAX1987 3 years ago +1

      I am actually trying to do something similar. I am trying to create something that turns lip reading into audio, instead of subtitles, for mute people. Pretty similar but different at the same time.
      Nice. Also an amateur.

    • @byiringirooscar321
      @byiringirooscar321 2 years ago

      Hey dear, I am about to implement that kind of project. I need your advice and methodology.

    • @TimDownsAnimation
      @TimDownsAnimation 2 years ago

      @@byiringirooscar321 oh I have no idea. By “proof of concept” I meant I was making a short film for a school project in animation and VFX lol. Sorry

    • @aoeu256
      @aoeu256 7 months ago

      Use AI image recognition? Use someone else's project? You could also use something like a one-handed keyboard to disambiguate, for example lip sounds (B P M F V), roof of the mouth (T D N S Z), velar (K G NG X), H; vowels would be mouth size, e.g. A is big, U is rounded lips. Detect nasal airflow to detect M, N, NG; you can detect eye movement for B vs P, S vs Z, T vs D, etc. It would also be nice to have a full-face anime girl mask and girl voice... @@byiringirooscar321

  • @guyincognito1985
    @guyincognito1985 3 years ago +3

    Thanks for zooming in to make the text larger, and for using a larger font in VS Code. Also thanks for alerting me to pipwin!

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Anytime @Guy Incognito, I actually picked it up while installing it, wanted to call it out to ensure it all worked!

  • @haisomeone2218
    @haisomeone2218 2 years ago

    You don't know, or someone didn't tell you, that you are the best.

  • @andresg297
    @andresg297 3 years ago +1

    This channel is so underrated

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      It does an innate split @Mert, the default is set to 70%. More detail here: pycaret.org/train-test-split/

  • @thinamG
    @thinamG 3 years ago +2

    This looks awesome, Nicholas. Thanks for sharing!

  • @nitinpatel35
    @nitinpatel35 3 years ago +1

    Those who are getting the following error while running the code - 'Handshake status 403 Forbidden' and have selected 'eu-gb' as the region, please change the region and it should work for you. I have encountered a similar problem. After setting up my service at the 'us-east', it worked for me.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      This is awesome! Thanks so much for sharing @Nitin!

  • @OliverHiggins
    @OliverHiggins 3 years ago +3

    Great video. I got the 403 issue too (based in Sydney); the only location that would work was us-south.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Thanks for the share, yeah looks like it's a common issue across the regions but us-south is up.

    • @OliverHiggins
      @OliverHiggins 3 years ago

      @@NicholasRenotte if it's set to record and it doesn't hear anything, does it process nothing or does it still send something? I.e., if I ran a loop that records for 2-3 seconds every few seconds, would it eat through the allocated time from IBM?

    • @OliverHiggins
      @OliverHiggins 3 years ago

      @@NicholasRenotte almost need a software vox or something 🤔

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      @@OliverHiggins let me get back to you on that, will reach out to the product teams!

  • @viniciuslongo4622
    @viniciuslongo4622 3 years ago +1

    Hands down the best ML youtube channel

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Ahhhh shucks @Vinícius, you're too kind! Thanks sooo much!

  • @ravindunawanjana7050
    @ravindunawanjana7050 3 years ago

    I've already started following your videos and this is a great video, Nick. Cheers!

  • @avnishat24
    @avnishat24 2 years ago +1

    @Nicholas You said this can be used for a live transcript of a meeting, but when I tried it, it only transcribes the voice from my microphone and not the other voices in the meeting, so how can it be used for transcribing a meeting? I would love to use this code to transcribe the entire meeting session.

  • @sayelfujael6378
    @sayelfujael6378 3 years ago +2

    I'll check it! Great work man.

  •  1 month ago

    Hi Nicholas. First of all, I'd like to say I appreciate all the effort you put into your videos. They have helped me in various scenarios. I would like to make a small request. Would you do a video on Meta's SeamlessM4T model and fine-tuning it, if possible? Thanks in advance

  • @WisKy64VT
    @WisKy64VT 1 year ago

    Nice! what if you wanted to have it automatically trigger when someone starts talking, and stops after like 2 seconds of silence?
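One way to approach the "start on speech, stop after about 2 seconds of silence" idea is a simple energy gate over the raw audio chunks. The sketch below is only an illustration under stated assumptions: it expects 16-bit mono PCM chunks (the kind PyAudio can return), and the names `rms`, `SilenceGate`, and the threshold of 500 are hypothetical and would need tuning for a real microphone.

```python
import array
import math


def rms(chunk: bytes) -> float:
    """Root-mean-square level of a chunk of 16-bit signed PCM samples."""
    samples = array.array("h", chunk)  # "h" = signed 16-bit, native byte order
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class SilenceGate:
    """Hypothetical helper: waits for speech, then stops after max_silence seconds of quiet."""

    def __init__(self, threshold: float = 500.0, chunk_seconds: float = 0.1,
                 max_silence: float = 2.0):
        self.threshold = threshold          # assumed level; tune per microphone
        self.chunk_seconds = chunk_seconds  # duration of each chunk fed in
        self.max_silence = max_silence
        self.started = False                # becomes True once speech is heard
        self.silence = 0.0                  # seconds of quiet since last speech

    def feed(self, chunk: bytes) -> bool:
        """Return True while recording should continue, False when it should stop."""
        if rms(chunk) >= self.threshold:
            self.started = True
            self.silence = 0.0
            return True
        if not self.started:
            return True  # still waiting for the first speech
        self.silence += self.chunk_seconds
        return self.silence < self.max_silence
```

In a recording loop, each chunk read from the microphone would be passed to `feed`, and chunks would only be forwarded to the service once `started` is True, which would also avoid streaming silence.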

  • @fjizq
    @fjizq 2 years ago +4

    Good evening. Thanks for the video and your explanations! I've got this error: [Errno 11001] getaddrinfo failed. Any clue? Thanks a lot!

    • @9000richi
      @9000richi 2 years ago

      I'm stuck with the same problem, have you found a solution?

    • @fjizq
      @fjizq 2 years ago

      @@9000richi Not really. I was trying to make it work in different computers, but I usually received the 403 error. I think all of them have to do with the region you base your stt service in, but I was not able to solve it

    • @9000richi
      @9000richi 2 years ago

      @@fjizq I tried with 2 regions and none of them work, so tbh I don't know if that's the problem, but thanks for answering.
      I'm going to try another method to use the API, no Python. I'll let you know if I find one.

    • @andreabussolan2832
      @andreabussolan2832 2 years ago

      @@9000richi I'm also stuck with that error and I tried different regions. Did you find a solution?

    • @9000richi
      @9000richi 2 years ago

      @@andreabussolan2832 Not really, haven't looked into it a lot. I've been focused on other parts of my code, leaving this for last. Sorry.
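For anyone debugging this: "[Errno 11001] getaddrinfo failed" is the Windows error for a hostname that could not be resolved, so it is worth checking whether the host in the service URL resolves at all before looking at the API key or region. The sketch below is a minimal check using only the standard library; `host_of` and `can_resolve` are hypothetical helper names, and the URL you pass in should be the one your script actually uses.

```python
import socket
from urllib.parse import urlparse


def host_of(url: str) -> str:
    """Extract the hostname from an https:// or wss:// URL."""
    return urlparse(url).hostname or ""


def can_resolve(host: str, port: int = 443) -> bool:
    """Return True if DNS can resolve the host, False if the lookup fails."""
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror:
        # This is the failure that surfaces as "getaddrinfo failed".
        return False
    return True
```

If `can_resolve(host_of(url))` is False, the likely causes are a typo or outdated hostname in the URL, or a DNS, proxy, or VPN problem on the machine, rather than anything in the transcription code.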

  • @pauljones5476
    @pauljones5476 2 years ago

    Hi Nicholas,
    I've watched a few videos on how to use speech to text, with the majority being with Google. I must say your explanation is more detailed, and the audio quality is very good (crisp and clear). It seems like Watson is easier to install than Google's version. I'd like to install Watson on my MacBook. Is there anything that I need to do differently, or can I follow the exact steps you used to get it installed on my Mac?
    Will Watson detect speech if I have a video playing on my Mac and then transcribe that audio to text?
    Kind Regards
    Paul

  • @aldorojas1918
    @aldorojas1918 3 years ago +1

    I've got one question. I use TextNow to call people from my computer for interviews, but I would like to transcribe not my voice (through the microphone) but what the people I'm calling are saying over this program (TextNow). Is that possible, and how can I do that? Thanks

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Would suggest recording the video from TextNow then running it through a transcription tool like Watson STT. Got a vid on the channel about how to do it from video!

  • @ComputerScienceSimplified
    @ComputerScienceSimplified 3 years ago +1

    Amazing video, keep up the incredible work! :)

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Thanks so much @Computer Science Simplified!

    • @rachidaboussaid501
      @rachidaboussaid501 2 years ago

      @@NicholasRenotte
      Hey dear friend Nicholas, we appreciate your work too, thank you for that.
      I want to ask you: how can I get the real-time speech-to-text result in the browser rather than the terminal, to avoid copy-pasting or going back and forth between cmd and the browser? For example, I'd like to use the Google Translate extension in Chrome to quickly translate a part of the text (speech) with its click-to-translate feature. Can I use Jupyter Notebook for that rather than VS Code?
      In other words, can I run the command "python transcribe.py -t 20", for example, in Jupyter Notebook and show the result in the browser rather than the terminal (cmd)?

  • @ajkadhim6058
    @ajkadhim6058 3 years ago +1

    thank you so much for an excellent video. When it is transcribing live, it outputs several lines as you are speaking. However, after "done recording", you see the complete output of the text. When I am running the command, I do not have that finalized output at the end. Is there a way to get that? thank you again in advance.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Heya AJ, as in you don't see it or need the output saved somewhere?

    • @Louisljz
      @Louisljz 3 years ago +2

      @@NicholasRenotte same problem here too. I don't see the final text come out at the end, after the text "done recording". It says: on_close() takes 1 positional argument but 3 were given. Thanks in advance.
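A possible explanation for the missing final text, offered as an assumption rather than a confirmed diagnosis: if the script relies on the websocket-client package, newer releases appear to call the close callback with three arguments (the socket, a status code, and a message), so a handler defined as `on_close(ws)` raises this TypeError before it can print the transcript. The sketch below shows a handler that tolerates both call styles; `FINALS` is a hypothetical stand-in for wherever the script collects final results.

```python
# Hypothetical store of final result payloads, appended to elsewhere in the script.
FINALS = []


def on_close(ws, close_status_code=None, close_msg=None):
    """Print the full transcript when the socket closes.

    The two extra parameters default to None, so this works whether the
    websocket library passes one argument or three.
    """
    transcript = "".join(
        result["results"][0]["alternatives"][0]["transcript"]
        for result in FINALS
    )
    print(transcript)
    return transcript
```

Pinning an older websocket-client release might also avoid the error, but widening the signature keeps the handler independent of the installed version.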

  • @jianbintang554
    @jianbintang554 8 months ago

    Nick, we need an updated version of this :) I tried it, and the existing one doesn't seem to work any more.

  • @adibokay
    @adibokay 3 years ago +3

    This is so helpful, thank you Nick! I am trying to build a software that can convert ASL to text on the screen that can help ease communication on platforms such as Zoom and Microsoft Teams. Would be great if you could show us how to build it ^^

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Got something coming soon on advanced conversion!

  • @arif5615
    @arif5615 3 years ago +1

    Hey Nicholas, great job. One thing: may I know, are there any videos of you doing live speech-to-text translation with Python?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Heya @Arif, this one?

    • @arif5615
      @arif5615 3 years ago +1

      @@NicholasRenotte yeah this is live STT, what I mean was live speech translation. You speak English and then, directly translate to other languages in text. 😊

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      @@arif5615 oh, long day at work, wasn't paying attention. Ah, nope, no vid on it yet!

  • @TheTomdeaf
    @TheTomdeaf 3 years ago +2

    I am deaf. I want live subtitles (not a transcript) in a small window that always stays on top in Z-order. Can you or someone here show me or give tips on how to adapt it?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Heya @The Tomdeaf, you could take this code and apply it into something like a GUI. Tkinter perhaps?

    • @TheTomdeaf
      @TheTomdeaf 3 years ago

      I don't know anything about Python, only C++ and WinAPI. Can you give me some details in this direction?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      @@TheTomdeaf oh got it, could probably build a GUI with C++ as well. There's a fully documented API here: cloud.ibm.com/apidocs/speech-to-text

  • @shoaqa16
    @shoaqa16 3 years ago +1

    amazing video, thank you
    i'm having a problem with the last text is not being printed and idk why
    i added a print statement to on_close to see if it reaches it but it didn't. even though it was working fine before then it stopped so idk what to do :(

  • @gustavoluz8983
    @gustavoluz8983 3 years ago +1

    Great job Nicholas! What about consuming a YouTube live stream as the audio input? Any thoughts on that?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Ooooh, a live stream feed. I think you might be able to do it using PyAudio. Need to dig into it a little more though.

    • @gustavoluz8983
      @gustavoluz8983 3 years ago +1

      @@NicholasRenotte Nice suggestion, thanks for the answer!
      I found some Reddit discussion but it seems kind of messy. The best way would be to integrate with the YouTube API, but I guess the delay would be very high and the integration is not very clear. My idea would be to subtitle and translate a YouTube live stream in real time (maybe I should try with Twitch or others).

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      @@gustavoluz8983 ya, I've added it to my video list. I think it'd be sick to do a video on it!

    • @gustavoluz8983
      @gustavoluz8983 3 years ago +1

      @@NicholasRenotte i will be the first to watch it! will try to develop some things on my free time and if i succeed i put it at my forked repo and let you know, thanks

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      @@gustavoluz8983 yess, thanks so much excited to hear about it!

  • @manikagarwal5415
    @manikagarwal5415 3 years ago +1

    list index out of range
    on_close() takes 1 positional argument but 3 were given
    I got these errors, and it is also not recording my audio. Please help me out.
    Previously I got an error with pyaudio while importing, but that was resolved by declaring a variable for it.

    • @darkmasterbatista2815
      @darkmasterbatista2815 7 months ago +1

      [Errno 11001] getaddrinfo failed
      on_close() takes 1 positional argument but 3 were given. I got this one as well, did you solve it?

  • @ranjansutradhar1046
    @ranjansutradhar1046 3 years ago +2

    Thank you, but I want to save the speech output as a text file. How do I do that? Could you please answer?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Another subscriber shared the code. Add this in transcribe.py at line 126:
      with open('output.txt', 'w') as out:
          out.writelines(data['results'][0]['alternatives'][0]['transcript'])

    • @ranjansutradhar1046
      @ranjansutradhar1046 3 years ago

      @@NicholasRenotte bro, I tried what you mentioned in the comment above, but I'm getting this error: out.writelines(data['results'][0]['alternatives'][0]['transcript'])
      NameError: name 'data' is not defined. The data variable is not recognised. I have tried as many configurations as I could, but I still get this error. Would you be able to fix this? I'm doing a project on speech-to-text transcription and then summarization of the transcription.

    • @ranjansutradhar1046
      @ranjansutradhar1046 3 years ago +2

      It's done , thank you buddy

    • @esmahanaldoseri8151
      @esmahanaldoseri8151 3 years ago

      @@ranjansutradhar1046 Hi, can you share how you solved the error? I got the same one too.

    • @ranjansutradhar1046
      @ranjansutradhar1046 3 years ago +1

      @@esmahanaldoseri8151 paste the same code snippet given by @Nicholas Renotte in the reply above into transcribe.py at line 98, after the print statement, and indent it to match the surrounding structure. If it runs successfully, please let me know here, okay? Thanks
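The `NameError: name 'data' is not defined` in this thread suggests the snippet was pasted somewhere the response variable is not in scope. A less position-dependent option is a small helper that takes the parsed response dict as an argument and can be called from whichever callback receives it. This is only a sketch: `extract_transcript` and `append_transcript` are hypothetical names, and the dict layout (`results`, `final`, `alternatives`, `transcript`) is assumed from the snippet quoted above and the usual shape of Watson STT responses.

```python
def extract_transcript(data: dict, only_final: bool = True) -> str:
    """Return the top transcript from one response dict, or "" if there is none."""
    results = data.get("results") or []
    if not results:
        return ""  # avoids "list index out of range" on empty responses
    first = results[0]
    if only_final and not first.get("final", False):
        return ""  # skip interim hypotheses so lines are not repeated
    alternatives = first.get("alternatives") or []
    if not alternatives:
        return ""
    return alternatives[0].get("transcript", "")


def append_transcript(data: dict, path: str = "output.txt") -> str:
    """Append the final transcript (if any) to a text file and return it."""
    text = extract_transcript(data).strip()
    if text:
        # "a" appends; the "w" mode in the quoted snippet overwrites the file each time.
        with open(path, "a", encoding="utf-8") as out:
            out.write(text + "\n")
    return text
```

Calling `append_transcript(data)` right where the script already prints each result should then accumulate one line per finalized phrase.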

  • @amitjena1556
    @amitjena1556 8 months ago

    Hey Nicholas, how does it work on my Teams meeting or YouTube?

  • @valeriofaraone388
    @valeriofaraone388 3 years ago +1

    hey nicholas thank you very much, you are great.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      You're most welcome @Valerio, glad you enjoyed it!

  • @潘凡雯SHAIKRESHMAPARVEENQ36
    @潘凡雯SHAIKRESHMAPARVEENQ36 2 years ago +1

    Hi, I am getting this error ->"[Errno 11001] getaddrinfo failed ", when running transcribe.py file in the last ,please suggest the needful
    Thank you

    • @seymuromarov9287
      @seymuromarov9287 7 months ago +1

      Hi, I also have such problem, did you solve it?

  • @NeoAAnderson
    @NeoAAnderson 5 months ago

    How do I do this in django? This is perfect, I need to include it in my final paper

  • @nadaessam4603
    @nadaessam4603 3 years ago +1

    Thanks for sharing this video. I have a question: when I try to install pyaudio I get the error 'pipwin is not recognized as an internal or external command', so what should I do? Thanks in advance

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Try installing pipwin @Nada, check this out: pypi.org/project/pipwin/

    • @tkipkemboi
      @tkipkemboi 3 years ago

      pip install pipwin

  • @anishaudayakumar1778
    @anishaudayakumar1778 3 years ago +2

    Amazing Tutorial !!! I'm stuck with Pyaudio installation in my windows :( And when I tried with colab my final step throws "OSError: No Default Input Device Available"... Any leads to help?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Heya @Anisha, this won't work in Colab, you'll need access to the microphone from your local machine!

  • @thibautbouexiere1881
    @thibautbouexiere1881 3 years ago +2

    Hey Nicholas, thanks for your video.
    I'm a code beginner. How can I get that transcription written in another text editor like Google Doc?
    I'm trying to find a solution to subtitle a live conference.
    Thank you so much.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Heya @Thibaut, do you need to subtitle live or just in post?

    • @thibautbouexiere1881
      @thibautbouexiere1881 3 years ago

      Hey @@NicholasRenotte
      I need to subtitle live, the way it would work, you know, on a YouTube video being subtitled live.
      How can I add my subtitles to the live video?
      Thank you!!

  • @houralghasham
    @houralghasham 3 years ago +1

    Thank you for your hard work. I’m wondering if there is a way to merge it with watson assistant chatbot so it will give me a response?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Sure can, there's actually a voice agent integration available for WA that's specifically designed for it.

  • @ibrahimisrafilov1248
    @ibrahimisrafilov1248 3 years ago +1

    Nic, I'm trying to use the device output (speakers) as the input wave, but I get low-quality transcription. I think it's because of the quality of the waves, since audio from the speakers doesn't register as well as from the mic. What do you suggest I do?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Heya @Ibrahim, does it need to come through the speakers or could you use a recording perhaps?

    • @ibrahimisrafilov1248
      @ibrahimisrafilov1248 3 years ago +1

      @@NicholasRenotte From speakers :)

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      @@ibrahimisrafilov1248 hmmm, that's a tough one, I wonder if you could tap into the output signal and use that instead?

    • @ibrahimisrafilov1248
      @ibrahimisrafilov1248 3 years ago +1

      @@NicholasRenotte That's exactly what I did. I used the sounddevice lib and set the output device (basically the speakers) as the input. However, the quality is not as good as it was in the MP4, so the transcription is not good enough. I was thinking it could be due to the Hz,

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      @@ibrahimisrafilov1248 hmmm, yeah that's a tough one! I think the audio signal would impact the result significantly.

  • @zeroranger
    @zeroranger 3 years ago +2

    I couldn't run it because it said that there was no SSL available :(
    Please help!

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Heya @JumpNShootMan, was there a broader error you can share?

  • @razmandhamarasheed4325
    @razmandhamarasheed4325 3 years ago +1

    Hey man, great work! I love your explanation. Want to get into data science and learning it from you makes it easy. I have a request though, are you able/willing to look more into process time series data where you can make continuous predictions based on historical data?
    Anyways, keep up the great work!

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Definitely! Just checking, have you seen this: ruclips.net/video/KvLG1uTC-KU/видео.html

    • @razmandhamarasheed4325
      @razmandhamarasheed4325 3 years ago +1

      Hi @@NicholasRenotte, thank you for your reaction, yes I think that was the first one I saw from your work. I kinda get the batch like prediction, where you have a csv file or something else to predict in batches. But is it possible to get a continuous prediction that kinda adapts to new situations. For example: predicting process behavior like the effect of heat and fluid flow on pressure in a continuous fashion. If you are interested I can give a more detailed description of my question.
      Anyways, thanks for your reaction as a follower I appreciate your work. Keep it up.
      Razmand

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      @@razmandhamarasheed4325 so predicting in real time? As in, new data comes in and you forecast the next couple of time steps?

    • @razmandhamarasheed4325
      @razmandhamarasheed4325 3 years ago +1

      @@NicholasRenotte, yes exactly. I see that the problem is more of a programming one, but so far I have not seen anyone on YouTube try to automate the predictions. In my case I would like to use these predictions for a continuous stream of data coming from an SQL server (Pi OSISOFT). I would appreciate it if you could explain how this would be set up, like in all your other explanations.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      @@razmandhamarasheed4325 hmmm, to be honest if I was just running on a SQL server table I would have a stored proc set up to reforecast each time new data was added to the table. That would probably be the easiest way rather than having to work with a stream.

  • @enigmaticpuzzle9654
    @enigmaticpuzzle9654 3 years ago +1

    also, I need assistance with how to set up the command prompt and vs visual code. To make it clear I am stuck from 7:41 to forward. I need help with that. Thank you

  • @rangadiyyala7546
    @rangadiyyala7546 3 years ago +1

    This looks really cool, Nicholas. How can we make speaker audio as input, here we are using mic as input...

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +2

      Would need to do some digging into this? Want me to make a vid on it?

    • @rangadiyyala7546
      @rangadiyyala7546 3 years ago

      Yeah try to make a video on that it would be supercool

    • @funkedelic_bob
      @funkedelic_bob 3 years ago

      ​@@NicholasRenotte Did you ever manage to put together a video or dig into this? I'm also looking for it to transcribe whatever the system audio is playing. Thanks!

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      @@funkedelic_bob never got around to it Justin, might bump it up on the list, keen to get back into some of the Watson stack.

  • @oluwatimilehinfolarin5758
    @oluwatimilehinfolarin5758 2 years ago +2

    Thank you Nicholas for the wonderful video.
    But I got an error - [Errno 11001] getaddrinfo failed.
    How can I solve this problem? It's urgent. Thank you.

    • @praveenlokku
      @praveenlokku 1 month ago

      im also getting the

    • @praveenlokku
      @praveenlokku 1 month ago

      if you resolved this help me out!!!!!11111

  • @pragatisharma3703
    @pragatisharma3703 3 years ago +1

    Hey Nicholas. I am stuck with "Handshake status 401 Unauthorized" can you plz help me out.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Heya @Pragati, just double checking you've updated the API key? If so, try creating an instance in a different region.

  • @srivatsavm3892
    @srivatsavm3892 2 years ago

    Can I get a code where I get only one line where the words are added as I speak(instead of the whole line printing again and again)?
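One common terminal trick for this is to redraw the same line with a carriage return instead of printing a new line for every interim result. The sketch below assumes the script exposes the current interim transcript as a string; `render_line` and `show_interim` are hypothetical names, not part of the tutorial's code.

```python
import sys


def render_line(text: str, width: int = 80) -> str:
    """Build a string that overwrites the current terminal line.

    Keeps only the last `width` characters and pads with spaces so that
    leftovers from a longer previous line are erased.
    """
    clipped = text[-width:]
    return "\r" + clipped.ljust(width)


def show_interim(text: str, width: int = 80, stream=sys.stdout) -> None:
    """Write the interim transcript in place, without a trailing newline."""
    stream.write(render_line(text, width))
    stream.flush()
```

Calling `show_interim(current_text)` for each interim result, and printing a plain newline once a result is final, should give a single line that grows as you speak.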

  • @عمرالقرني-ه6ي
    @عمرالقرني-ه6ي 3 years ago +1

    Hey nicholas thank you for this video
    I Have a Problem with this code when I run it it give me HandShake 401 unauthorized Problem
    Can you tell me How can I fix this problem
    I am sure of my apikey and region

  • @SaumyaSharma007
    @SaumyaSharma007 2 years ago +1

    Thanks Nicholas Sir for this awesome video 😀
    Plzzzz can u help me in this,
    For example if a village person is not able to speak Hindi language then is it possible to design a solution that converts local ethnic language into Hindi or English text format and that too in live conversation... Plzzzz Sir reply.....
    U have already made a video on where u have converted eng audio of "Hello World" into Hindi text.....But that was not live....
    Plzz I just want your views about this.... Will be waiting for your reply 🥺

  • @Cdawgw
    @Cdawgw 3 years ago +1

    Hey Nicholas, great tutorial, I am working on a project that can classify information based on a key pressed and then stores it. Imagine pressing 'B' key and the streamed text will flow into text file 1, but when pressing the 'N' key on my keyboard it would stream to text file 2. Could you show how the text can actually be retrieved? Thank you so much you saved my project!!

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Hmmmm, I'm not too sure I get the project. So it would allow you to break up the speech into multiple files?

    • @Cdawgw
      @Cdawgw 3 years ago

      Yes so imagine you to take notes but only from a specific part of a sentence (part of a string), that is one of the key things I am missing with speech transcription. I could do learning and repetition using books, read out important parts and highlight parts of that sentence using for example a key press on my computer. Hope that makes sense! Thanks for the input!🚀🚀

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      @@Cdawgw ah, got it! Could add some logic into the code to capture your keypress and change the routing of the output in response.
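The routing logic suggested here could look roughly like the sketch below: a small object remembers which file is currently selected, a key press switches the selection, and each transcribed phrase is appended to the selected file. Everything in it is hypothetical (the `TranscriptRouter` name, the key-to-file mapping), and capturing the actual key press is left out because it depends on the platform and library used.

```python
class TranscriptRouter:
    """Hypothetical helper that appends text to the file chosen by the last key press."""

    def __init__(self, targets: dict, default_key: str):
        # Map lowercase keys (e.g. "b", "n") to output file paths.
        self.targets = {key.lower(): path for key, path in targets.items()}
        if default_key.lower() not in self.targets:
            raise KeyError(f"unknown default key: {default_key!r}")
        self.current = default_key.lower()

    def select(self, key: str) -> bool:
        """Switch the active target; return False for keys that are not mapped."""
        key = key.lower()
        if key in self.targets:
            self.current = key
            return True
        return False

    def write(self, text: str) -> str:
        """Append one line of transcript to the active file and return its path."""
        path = self.targets[self.current]
        with open(path, "a", encoding="utf-8") as out:
            out.write(text + "\n")
        return path
```

With `router = TranscriptRouter({"b": "file1.txt", "n": "file2.txt"}, "b")`, the key handler would call `router.select(key)` and the transcription callback would call `router.write(transcript)`.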

  • @oswaldmboussa3798
    @oswaldmboussa3798 3 years ago +2

    Thank you sir for this great video! Can it work with streamlit?

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      I believe so, will add it to the list of upcoming vids @Oswald!

  • @ameerazam3269
    @ameerazam3269 3 years ago +1

    Already i cover this but credit goes to you sir

    • @ameerazam3269
      @ameerazam3269 3 years ago +1

      because of you that work

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Awesome work @Ameer, so you're using?!

    • @ameerazam3269
      @ameerazam3269 3 years ago

      @@NicholasRenotte yes already I did by watching previous video and deploy on herokuapp ..already send you my work linkedin

  • @yashuandchikusfunworld3208
    @yashuandchikusfunworld3208 3 years ago +1

    Thanks for the nice explanation. When I ran last statement to run the program, getting error as "Handshake status 401 Unauthorized
    on_close() takes 1 positional argument but 3 were given". Can somebody help me on this.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Try using the us-south region!

    • @yashuandchikusfunworld3208
      @yashuandchikusfunworld3208 3 years ago

      Thanks… I just realized that my stt service was not launching properly.. so i restarted that and its working now..

  • @mailtoraj76
    @mailtoraj76 2 years ago

    Great work!!, but can I get the final text somewhere into the file preferably in JSON? I need to pass this to my app. Advise pls.

  • @mazaharhulhaque4482
    @mazaharhulhaque4482 3 years ago +1

    I want to transcribe names I speak in microphone. How can I give some meta data (kind of hint) to improve the transcription

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Can fine tune the model using Watson STT! medium.com/ibm-data-ai/tune-by-example-how-to-tune-watson-text-to-speech-for-better-intonations-bcee8404d927

  • @lenover12
    @lenover12 3 years ago

    this looks really awesome, I was wondering if it was possible to output the phonemes instead of text. that would be something I could really use in a project!

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Ooooh, I don't think that's possible unfortunately @lenover12.

  • @el3412
    @el3412 3 years ago +2

    great video thank you !
    How can I print text in .text file ?
    I use out.writelines( ) ,but there’s error.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago

      Might be easier to record the live audio then output: ruclips.net/video/A9_0OgW1LZU/видео.html

    • @samarqasem9558
      @samarqasem9558 3 years ago

      Did you figure it out? I'm going through the same thing :(

    • @el3412
      @el3412 3 years ago +1

      @@samarqasem9558 yes, add this in transcribe.py at line 126:
      with open('output.txt', 'w') as out:
          out.writelines(data['results'][0]['alternatives'][0]['transcript'])

  • @gokulkaruna1
    @gokulkaruna1 3 years ago +5

    "Handshake status 403 Forbidden" ERROR while running the transcribe.py file.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Heya @Gokul, can you double check your URL and API key. Normally this error is due to a slight typo in either.

    • @pidpyd7759
      @pidpyd7759 3 years ago

      @@NicholasRenotte Still Not Working :/

    • @frenchcoder-developpementw2429
      @frenchcoder-developpementw2429 3 years ago

      @@pidpyd7759 Problem solved. I changed the region

    • @pidpyd7759
      @pidpyd7759 3 years ago

      @@frenchcoder-developpementw2429 what did you change it to?

    • @frenchcoder-developpementw2429
      @frenchcoder-developpementw2429 3 years ago

      @@pidpyd7759 I changed the region to us-south

  • @shikhajoshi8961
    @shikhajoshi8961 5 months ago

    Can i get a little help.. while installing pyaudio it is showing error while building wheels

  • @avnishat24
    @avnishat24 3 years ago +1

    This is very useful. Thanks a lot. I am facing "Handshake status 403 Forbidden
    on_close() takes 1 positional argument but 3 were given" error and based on below comments I did try changing region to "us-south" but still I get the same error. Tried multiple times creating new service in IBM cloud in us-south. As region / api-key does not seem to be an issue here I would like to know if I must change the url in the code as I saw few comments related to changing url. If yes which file has the url to be changed? When i ran a sample test with curl command using an input audio file (example shown in ibm cloud page) I do get a response of the transcription test.

    • @NicholasRenotte
      @NicholasRenotte 3 years ago +1

      Try using the us-south region instead!

    • @avnishat24
      @avnishat24 3 years ago +1

      @@NicholasRenotte Thanks. Changing the region did not help, but I changed the URL in the transcribe.py file and it fixed the 403 error. I still get "on_close() takes 1 positional argument but 3 were given" but at least it works. This one transcribes only the microphone voice. Is there any way to transcribe all voices in the meeting? If there are 2 or more people in the meeting I would love to transcribe all the voices.

    • @sarindrathereserandriambel417
      @sarindrathereserandriambel417 2 years ago +1

      @@avnishat24 hey can you help me to fix this too, I did not manage to solve this problem

    • @rafaelprudencioleite7291
      @rafaelprudencioleite7291 2 года назад +1

      @@avnishat24 How did you solve it?

    • @michelemetta23
      @michelemetta23 2 года назад

      How do you solve it? I don't know..
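
For readers asking which URL was changed: the fix reported above was to point transcribe.py at the endpoint that matches the service instance. A minimal sketch of what such a URL looks like follows, assuming the per-instance WebSocket endpoint pattern shown on the IBM Cloud credentials page. The function name, the instance ID placeholder, and the default model name are illustrative assumptions, not code from the video's repository.

```python
def build_recognize_url(region, instance_id, model="en-US_BroadbandModel"):
    """Build a Speech to Text WebSocket URL for one service instance.

    Assumptions: `region` is the location shown for the service in
    IBM Cloud (for example "us-south"), `instance_id` is the ID that
    appears at the end of the service URL in the credentials page, and
    `model` is whichever model name the account currently offers.
    """
    host = "api.{}.speech-to-text.watson.cloud.ibm.com".format(region)
    return "wss://{}/instances/{}/v1/recognize?model={}".format(
        host, instance_id, model
    )


# Hypothetical instance ID, for illustration only.
print(build_recognize_url("us-south", "my-instance-id"))
```

If the URL printed here differs from the one hard-coded in the script, copying the service URL from the IBM Cloud credentials page (with `https` swapped for `wss`) is likely the safer option.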

  • @souparnaroy5283
    @souparnaroy5283 3 года назад +1

    Hi Nicholas, I followed along with your video but when I try to run live transcription in the end it shows:
    Handshake status 403 Forbidden
    on_close() takes 1 positional argument but 3 were given
    Any idea how I can go about solving this?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Can you try using the us-south region?

    • @souparnaroy5283
      @souparnaroy5283 3 года назад

      @@NicholasRenotte Thanks mate. Changed that and it's working now.

    • @brown_canadian
      @brown_canadian Год назад

      Hey, I am getting the error, but I already had the us-south. Any fix?
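
A note on the second line of that error: `on_close() takes 1 positional argument but 3 were given` usually means the installed websocket-client package calls the close handler with three arguments (the socket, a status code, and a message) while the script defines it with one. It appears after the connection has already closed, so fixing it will not by itself clear a 403. A minimal sketch of a handler that tolerates both call styles follows; the names `FINALS` and `LAST` mirror the `on_close` snippet quoted elsewhere in this thread and are assumptions about the script's internals.

```python
FINALS = []   # final results collected while streaming (assumed name)
LAST = None   # most recent interim result, if any (assumed name)


def on_close(ws, close_status_code=None, close_msg=None):
    """Print the full transcript; works with 1-argument or 3-argument calls."""
    global LAST
    if LAST:
        FINALS.append(LAST)
        LAST = None
    transcript = "".join(
        result["results"][0]["alternatives"][0]["transcript"]
        for result in FINALS
    )
    print(transcript)
    return transcript
```

The two extra parameters default to `None`, so the same function keeps working if an older version of the library calls it with the socket only.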

  • @meg33333
    @meg33333 2 года назад

    Which algorithm is used in this speech to text?

  • @Prateikx
    @Prateikx Год назад

    How to transcribe from the stream of the audio in real time whisper AI model?

  • @cajwan
    @cajwan 3 года назад +1

    Hi! thanks for the video.
    is it possible to convert the recorded speech into a text file?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Yup, another subscriber shared the code. Add this in transcribe.py at line 126:
      with open('output.txt', 'w') as out:
          out.write(data['results'][0]['alternatives'][0]['transcript'])
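
One thing to watch with the snippet above: opening the file in `'w'` mode inside the message handler overwrites it on every result, so only the last phrase survives. A minimal sketch of an alternative is to collect the final results and write them once at the end. The helper name and the shape of the result dictionaries are assumptions based on the indexing used in the snippet.

```python
def save_transcript(finals, path="output.txt"):
    """Write one line per final result and return the lines written."""
    lines = [
        result["results"][0]["alternatives"][0]["transcript"].strip()
        for result in finals
    ]
    with open(path, "w", encoding="utf-8") as out:
        out.write("\n".join(lines) + "\n")
    return lines
```

Calling `save_transcript(FINALS)` from the close handler would be one place to hook this in, assuming the script keeps its final results in a list.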

  • @kiss-bws
    @kiss-bws 3 года назад +2

    Bro, nice video, but I want to train my voice assistant for better accuracy, and I also want it to predict. Please make another tutorial on that, and reply if you are going to make one. Also, you've got another subscriber 🙂

  • @esmahanaldoseri8151
    @esmahanaldoseri8151 3 года назад +1

    Hi, I got this error. Can you please help?
    Error:
    A connection attempt failed because the connected party did not properly respond .........

    • @esmahanaldoseri8151
      @esmahanaldoseri8151 3 года назад +1

      Finally fixed this problem. It worked well, but it is not concatenating the whole text together.
      Error:
      on_close() takes 1 positional argument but 3 were given

  • @katherinezhang7194
    @katherinezhang7194 3 года назад +1

    Thanks for sharing this! Question: when I run pip install -r requirements.txt I get ERROR: Command errored out with exit status 1: - do you know how to fix this? (I'm on a Mac.) Thanks!!

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Heya @Katherine, is there a larger error? If not, can you try individually installing the packages from the requirements.txt file, ideally one should error out, then we can work through it!

    • @katherinezhang7194
      @katherinezhang7194 3 года назад +1

      @@NicholasRenotte Thanks ! (Was missing Homebrew Portaudio - it now works finally !) When I run python transcribe.py -t 20 it's showing ImportError: No module named configparser - any ideas ?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@katherinezhang7194 try running pip install configparser

    • @keatlck
      @keatlck 3 года назад

      @@katherinezhang7194 python3 transcribe.py -t 20

  • @Kishi1969
    @Kishi1969 3 года назад +1

    Wow, you are amazing, but could you explain what we can do once our free minutes in IBM Watson run out?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      You can upgrade the plan or you might need to delete and create a new free tier but keep in mind that the API key will change!

    • @Kishi1969
      @Kishi1969 3 года назад

      @@NicholasRenotte Thanks for your response. I'm also having a problem with the FFmpeg setup you showed; I messaged you about it earlier but got no response. I can re-send it if you want, sir.

  • @aprosflumine9074
    @aprosflumine9074 3 года назад +1

    Is there any way to make the computer read this text aloud after you've recorded it?

    • @Van088
      @Van088 3 года назад

      Same question pls

  • @Snakebite0
    @Snakebite0 8 месяцев назад

    Sir, I'm trying to fine-tune Mozilla DeepSpeech with my custom data in Colab.
    It's not working.
    I tried it in many ways, but it's still not working.
    Can you give me any ideas? Or can you do a video on it 🥺

  • @manishsharma2211
    @manishsharma2211 3 года назад +1

    Handshake status 403 Forbidden Error while running the transcribe.py
    I have double checked the API and Region
    Any help Nic ?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Manish, can you try using a different region when you setup your service? Try us-east.

  • @-alfeim2919
    @-alfeim2919 3 года назад +1

    when I wrote
    python transcribe.py -t 20
    i got 7 errors on the
    configparser.py
    file, and all of them are:
    Refactor this function to reduce its Cognitive Complexity from 20 to the 15 allowed
    any help

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Hmmm, haven't encountered that one before, can you share the full error? Any changes made to the baseline code?

    • @-alfeim2919
      @-alfeim2919 3 года назад +1

      @@NicholasRenotte It turns out that I'd forgotten to add the apikey, and there were other errors as well, but thankfully, after a week of trying, I figured it out and it worked!! Many thanks for your concern and your amazing job!!

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      @@-alfeim2919 stoked you got it working! Nice!!

  • @AmandeepKaurDhillon
    @AmandeepKaurDhillon 3 года назад +1

    Hi sir, I really like your coding style, but this code is returning the following error: "Handshake status 403 Forbidden". Any suggestions?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Heya, try spinning up an STT service in another region. This error sometimes pops up.

    • @AmandeepKaurDhillon
      @AmandeepKaurDhillon 3 года назад

      @@NicholasRenotte thanks, Yes, solved with changing the location

  • @vmars316
    @vmars316 2 года назад

    Can this live speech be used for youtube videos ?

  • @usus8420
    @usus8420 6 месяцев назад

    hi great works but what about smartphones?

  • @__Hrishi__
    @__Hrishi__ 3 года назад +1

    (Handshake status 403 Forbidden
    on_close() takes 1 positional argument but 3 were given)
    How to resolve this error??

  • @KuzieJames
    @KuzieJames 3 года назад

    i get a "certificate verify failed" error when I try this. i'm very familiar with using the ibm python sdk's so i'm wondering if this authentication method is not correct because they ask you to use their authenticator nowadays. any idea why my authentication might be failing? i've double and triple checked my url and api key everything should be good. thanks in advance!

  • @nitinpatel35
    @nitinpatel35 3 года назад +1

    I am trying to pickle my final text output to process it using the NLP service. However, I am not able to. Could anyone guide how can I pickle the final text output, please?

    • @nitinpatel35
      @nitinpatel35 3 года назад +2

      I found it. Just in case anyone is looking for here are the details.
      Replace the on_close function with the code below. This will generate a transcribe.pkl file in the same folder, which you can use for further analytics.
      def on_close(ws):
          """Upon close, print the complete and final transcript."""
          global LAST
          if LAST:
              FINALS.append(LAST)
          transcript = "".join([x['results'][0]['alternatives'][0]['transcript']
                                for x in FINALS])
          print(transcript)
          with open("transcribe.pkl", "wb") as file:
              pickle.dump(transcript, file)

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Thanks for sharing @Nitin

    • @esmahanaldoseri8151
      @esmahanaldoseri8151 3 года назад

      @@nitinpatel35 Hi, I added the code you mentioned above, but it says "pickle is not defined".
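
The "pickle is not defined" message means the module was never imported: the snippet above calls `pickle.dump` but does not include `import pickle`. A minimal, self-contained sketch of saving and reloading the transcript follows. The helper names are hypothetical, and the file name is taken from the comment above.

```python
import pickle  # the import that the snippet above leaves out


def save_transcript_pickle(transcript, path="transcribe.pkl"):
    """Serialize the transcript string to `path`."""
    with open(path, "wb") as fh:
        pickle.dump(transcript, fh)


def load_transcript_pickle(path="transcribe.pkl"):
    """Read back a transcript saved by save_transcript_pickle."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

In transcribe.py itself, adding `import pickle` next to the other imports at the top of the file should be enough for the posted `on_close` to run.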

  • @vamsivuyyuru671
    @vamsivuyyuru671 3 года назад

    Hi Nicholas,
    Thank you for the video, as always it is crystal clear and short. I have tried to implement and facing an issue, Could you please guide me?
    Error:
    [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1123)
    on_close() takes 1 positional argument but 3 were given
    I have tried the solution from the comments such as to change the regions, but none of them worked. Please help.
    Thank you in advance.

  • @stephanetollec9776
    @stephanetollec9776 3 года назад

    Hi Nicholas, thanks for sharing this content.
    I successfully installed the required software, but when running the python transcribe.py -t command,
    I get an error message "Handshake status 403 Forbidden". Looking closer, it seems the URLs are deprecated. Is there a new transcribe.py available with new URLs?
    Thanks for your help.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Stéphane can you try setting up the API in an alternate region?

  • @ziyadcodes
    @ziyadcodes 3 года назад +1

    NICHOLAAAS help 😥, so I was getting the 403 error so I changed it to London which is closest to where I live then I tried again and now I'm getting error 503 service unavailable please anyone help

    • @ziyadcodes
      @ziyadcodes 3 года назад

      so I somehow got it to work😅🥳, all I did was that I wrote py -m pip install -r requirements.txt and it told me that requirement already satisfied then I wrote python transcribe.py -t 10 and it gave me error 403 instead of error 503 which meant that I was using the wrong region when I was 100% sure that I was using the right one, so I decided to try us-south because I saw a comment saying that it worked for him, so I used it and then I typed py -m pip install -r requirements.txt again but then after doing that I wrote python transcribe.py -t 10 and It worked, somehow😂
      CONCLUSION:
      Use us-south no matter what, then reinstall the requirements, then type python transcribe.py -t 10 and pray that it works ( :

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      I love the journey here 😂, thanks for sharing. US-South seems to be the way to go!

  • @sameermishra3598
    @sameermishra3598 3 года назад +2

    Sir, I'm getting Handshake error 403 forbidden

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Sameer, can you try creating a new instance in a different region?

  • @BtechF15
    @BtechF15 6 месяцев назад

    really really love you bro

  • @samuelsilva4665
    @samuelsilva4665 3 года назад

    Hey. I'm getting this "Handshake status 403 Forbidden" when I try to run the command. I've already tried setting the model to match my region, but I'm still getting the same problem.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Samuel, can you try the us-south region. Looks like some of the others were dropping out.

    • @samuelsilva4665
      @samuelsilva4665 3 года назад

      @@NicholasRenotte If I'm not wrong, I think I did that when I was trying to use this approach to create the voice assistant, and it didn't work (I believe I actually tried all the regions), but I'll try again. Thanks for helping.

    • @samuelsilva4665
      @samuelsilva4665 3 года назад

      @Zain Lokhandwalla Yeah, that's what happened to me too. I've tried lots of different regions and it didn't work.

  • @mauriceneveling6472
    @mauriceneveling6472 Год назад

    Can i Connect this to a Phone call?

  • @raha5985
    @raha5985 3 года назад +1

    I followed your code and everything works, but for some reason the output does not show when I'm done recording.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Got any errors?

    • @raha5985
      @raha5985 3 года назад

      @@NicholasRenotte no

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@raha5985 is your mic connected? Might be muted.

    • @raha5985
      @raha5985 3 года назад

      @@NicholasRenotte No, it's not muted. Everything I say shows up, but when the recording is done, the final result doesn't show.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@raha5985 hmmm, I'm not too sure unfortunately. If the API keys and the URLs are set and you're not getting errors, kinda hard to debug.

  • @canislupus2661
    @canislupus2661 2 года назад

    I get this error. I would highly appreciate it if you could help. I double-checked everything, so I followed the steps correctly, and I also tried a different region.
    [Errno -2] Name or service not known
    on_close() takes 1 positional argument but 3 were given

    • @kevwesophia
      @kevwesophia Год назад

      Hello please, did you figure out the problem and solution cause i am also getting the same issue

    • @canislupus2661
      @canislupus2661 Год назад

      @@kevwesophia no sorry. i assume there are far more effective options nowadays already though

  • @nighthawk6414
    @nighthawk6414 2 года назад +1

    If you're on a Mac and have issues installing PyAudio, run these commands:
    brew update
    brew install portaudio
    brew link --overwrite portaudio
    pip install pyaudio ;)

  • @YuvarajSR-m3l
    @YuvarajSR-m3l Год назад

    Hey Nicholas, while creating account in IBM cloud, it shows error. How can we solve this

    • @hadjerBrioua
      @hadjerBrioua 11 месяцев назад

      did you solve it?

    • @YuvarajSR-m3l
      @YuvarajSR-m3l 11 месяцев назад

      @@hadjerBrioua Not solved; I went with another platform.

    • @hadjerBrioua
      @hadjerBrioua 11 месяцев назад

      @@YuvarajSR-m3l Which platform did you use? And did it give the same results and efficiency?

  • @NitheshS-gm6cf
    @NitheshS-gm6cf 6 месяцев назад

    I cannot create my IBM account; they are asking for a credit card.

  • @سناءدهلوي
    @سناءدهلوي 3 года назад +1

    how could i save the output here as a text ?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      You can tweak the underlying code to output the results as a text file once the connection closes.

  • @abdullahhashmi5423
    @abdullahhashmi5423 2 года назад

    [Errno 11001] getaddrinfo failed
    on_close() takes 1 positional argument but 3 were given
    I keep getting this error. Can you help? I also tried three different regions, as you suggested in the comments below, but it didn't work.

    • @seymuromarov9287
      @seymuromarov9287 7 месяцев назад

      Hi, I have the same problem, did you solve it?

  • @sindugokulapati9834
    @sindugokulapati9834 3 года назад +1

    Hey, I got the following error:
    Handshake status 403 Forbidden
    Could anyone please help me out?

  • @fatimahjabr1269
    @fatimahjabr1269 3 года назад

    Great video, thanks :)
    I'm having trouble with the last step (Running Live Speech to Text).
    It first had a problem with importing configparser, but I googled the solution and solved it.
    now it says:

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Got the rest of the error for me?

    • @fatimahjabr1269
      @fatimahjabr1269 3 года назад

      @@NicholasRenotte I have no idea what happened to the rest of the comment 😂!
      Anyway, the error I'm having now is:
      Traceback (most recent call last):
        File "transcribe.py", line 27, in <module>
          import pyaudio
      ImportError: No module named pyaudio
      even though I downloaded it successfully. I also tried to download it again just to be sure, and the requirement was already satisfied. I don't know what to do!
      I'm using a Mac, so pipwin is not working.

    • @fatimahjabr1269
      @fatimahjabr1269 3 года назад +1

      To anyone having the same problem, I FIGURED IT OUT!
      All you have to do is add the number 3, so it will be:
      python3 transcribe.py -t 10
      And of course you can change the number of seconds.

    • @jordondraggon1459
      @jordondraggon1459 2 года назад

      @@fatimahjabr1269 Bit late, but how did you solve the configparser issue, if you remember? I get a "no module found" error when importing it, even though I've downloaded it successfully.
      Using Python 2.7.15:
      File "transcribe.py", line 19, in <module>
      import configparser
      ImportError: No module named configparser
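
The traceback above is what Python 2 produces: the standard-library module is called `ConfigParser` there and was renamed to `configparser` in Python 3, which is why running the script with `python3` resolved it for others in this thread. A minimal sketch of an import that works on both follows. The config file name `speech.cfg` and the `[auth]` section with `apikey` and `region` keys are assumptions about the script's configuration, so check them against the actual file.

```python
try:
    import configparser  # module name on Python 3
except ImportError:
    import ConfigParser as configparser  # module name on Python 2


def read_auth(path="speech.cfg"):
    """Return (apikey, region) from an INI-style file with an [auth] section."""
    parser = configparser.RawConfigParser()
    # read() returns the list of files it managed to parse.
    if not parser.read(path):
        raise IOError("could not read config file: %s" % path)
    return parser.get("auth", "apikey"), parser.get("auth", "region")
```

Running the script under Python 3 is still the simpler route, since the rest of the code may rely on Python 3 features as well.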

  • @amritkumar8876
    @amritkumar8876 3 года назад

    I tried to run transcribe python file but it showed error ( handshake status 403 forbidden) . Please help 🙏

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Can you try using the us-south region for me @Amrit?

  • @mohamedhasib5037
    @mohamedhasib5037 3 года назад +2

    i have this error and i can't solve it
    Handshake status 403 Forbidden

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Mohamed, can you double check your APIKEY and Region are correct?

    • @mohamedhasib5037
      @mohamedhasib5037 3 года назад +1

      @@NicholasRenotte
      i checked them many times but i don't know where is the problem

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@mohamedhasib5037 can you show me the full output?

    • @andreasweilinghoff9075
      @andreasweilinghoff9075 3 года назад

      @@NicholasRenotte I've got exactly the same problem using Windows 10 with Python 3.9.1.. I also checked the apikey and the region as specified in my cloud profile

    • @allyg1383
      @allyg1383 3 года назад

      @@NicholasRenotte
      Probably a problem in the region? I chose en-GB, otherwise I come from Slovenia.

  • @21day5
    @21day5 2 года назад

    I'm getting this error: Handshake status 403 Forbidden

  • @fuzzyreplex2033
    @fuzzyreplex2033 2 года назад

    I'm getting the error '[Errno 11001] getaddrinfo failed on_close() takes 1 positional argument but 3 were given'. Any information from anyone would be much appreciated.

    • @fuzzyreplex2033
      @fuzzyreplex2033 2 года назад

      I've tried changing regions and I couldn't get it to function. My main desire is to get a solid base for speech to text and then have the command prompt return to me individual words that I've said. I want to use this data to code some trigger words which will activate commands.
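
The trigger-word idea described above can be sketched independently of the speech service: once a transcript string is available, split it into words and look each one up in a table of commands. The function name, the command table, and the action labels below are all hypothetical.

```python
import re


def find_triggers(transcript, commands):
    """Return the actions for every trigger word, in spoken order.

    `commands` maps a lowercase trigger word to an action label.
    """
    words = re.findall(r"[a-z']+", transcript.lower())
    return [commands[word] for word in words if word in commands]


# Hypothetical command table, for illustration only.
COMMANDS = {"open": "open_app", "stop": "halt"}
print(find_triggers("Please open the browser and STOP", COMMANDS))
```

Matching whole words rather than substrings avoids false hits such as "stopwatch" triggering "stop"; multi-word triggers would need a different matching step.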

  • @nourahsaad7573
    @nourahsaad7573 3 года назад

    Hi, I did every step, but it's not working and I don't know where the problem is. Could you help me please :(

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Got any errors for me?

    • @nourahsaad7573
      @nourahsaad7573 3 года назад

      @@NicholasRenotte no such file or directory

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@nourahsaad7573 you're running the script from the right directory?

  • @zainmzameer
    @zainmzameer 2 года назад

    Great Video, you might want to mention that the speech recognition models are now updated and the one shown on the video is no longer working.

  • @pasindumanodya5517
    @pasindumanodya5517 Год назад

    Is it possible to create a IBM account free now ??

  • @sadiasultana667
    @sadiasultana667 3 года назад

    hi, hope you are well. This is such a wonderful video. Today I saw this video and I tried but when I tried to run transcribe.py it gave me the error
    Handshake status 401 Unauthorized
    on_close() takes 1 positional argument but 3 were given
    can you help me?

    • @sarindrathereserandriambel417
      @sarindrathereserandriambel417 2 года назад

      Hey, have you found a solution to fix this?

    • @sadiasultana9775
      @sadiasultana9775 2 года назад

      @@sarindrathereserandriambel417 not yet

    • @sarindrathereserandriambel417
      @sarindrathereserandriambel417 2 года назад

      @@sadiasultana9775 and what have you done to do live stream speech recognition? have you seen another way besides that???

    • @sadiasultana667
      @sadiasultana667 2 года назад

      @@sarindrathereserandriambel417 I did everything according to the video instructions, but I did not try it another way. One more thing: when I used on_close() with fewer arguments, it gave another type of error.

  • @fariaamir1139
    @fariaamir1139 3 года назад +1

    Great video and easy instructions. Can I integrate Text to Speech in this program?

  • @aiyazm7278
    @aiyazm7278 3 года назад

    I got an error: Handshake status 403 Forbidden