Great video - but can you answer something? If you chunk and transcribe, and the goal of this whole thing is to make subtitle files, how do you handle the timestamps for each transcription? Usually, if the video is 3 minutes long and you don't chunk it, you can get subtitles from the WEBVTT easily. But if you make 3 chunks, each chunk's transcription timestamps run from 0:00 to 1:00, so merging the WEBVTTs into a unified SRT file gets confusing. Would love a tutorial on this.
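One way to handle this (not shown in the video): shift every cue in each chunk's VTT by that chunk's offset into the original audio, then concatenate. A minimal sketch in plain Python, assuming fixed 60-second chunks, HH:MM:SS.mmm cue timestamps, and hypothetical file names chunk0.vtt, chunk1.vtt, chunk2.vtt:

```python
import re

# Matches a VTT cue timing line such as "00:00:01.000 --> 00:00:04.000"
CUE_RE = re.compile(r"(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})")

def shift_timestamp(ts, offset_sec):
    """Shift a timestamp like '00:01:02.500' forward by offset_sec seconds."""
    h, m, s = ts.split(":")
    total = int(h) * 3600 + int(m) * 60 + float(s) + offset_sec
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"

def merge_vtt(chunk_files, chunk_seconds=60):
    """Concatenate per-chunk VTT files, offsetting each chunk's cues."""
    out = ["WEBVTT", ""]
    for i, path in enumerate(chunk_files):
        offset = i * chunk_seconds
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                m = CUE_RE.match(line)
                if m:
                    out.append(f"{shift_timestamp(m.group(1), offset)} --> "
                               f"{shift_timestamp(m.group(2), offset)}")
                elif line != "WEBVTT":  # drop each chunk's own header
                    out.append(line)
    return "\n".join(out)

print(merge_vtt(["chunk0.vtt", "chunk1.vtt", "chunk2.vtt"]))
```

The same shifted cues can then be renumbered sequentially and rewritten with comma decimal separators (HH:MM:SS,mmm) to produce a single SRT file.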
interesting! thanks for the explanation. i wish you explained how it could be done for real time transcribe by using a Mic instead of audio file
I will look into it.
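A rough sketch of one way to approximate it in the meantime: the endpoint doesn't stream, so you can record fixed windows from the microphone and transcribe each one in turn. This assumes the sounddevice and scipy packages are installed and the openai 0.27-style API used in the video:

```python
import openai
import sounddevice as sd
from scipy.io import wavfile

openai.api_key = "YOUR_API_KEY"

SAMPLE_RATE = 16000
SECONDS = 10  # transcribe in fixed 10-second windows

while True:  # Ctrl+C to stop
    # Record one window from the default microphone (blocking)
    audio = sd.rec(int(SECONDS * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=1, dtype="int16")
    sd.wait()

    # Whisper needs a file, so write the window to disk first
    wavfile.write("window.wav", SAMPLE_RATE, audio)
    with open("window.wav", "rb") as f:
        result = openai.Audio.transcribe("whisper-1", f)
    print(result["text"])
```

Not true real-time transcription, but close enough for dictation-style use.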
Great tutorial! Question: does it only translate to English? What would I have to change in the code to translate to other languages? Thanks!!
As far as I am aware, only English is supported. One thing you can do is translate the output to another language using either the Google Cloud or Azure translation API.
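For example, a minimal sketch of that second step using the google-cloud-translate package (v2 client), assuming the GOOGLE_APPLICATION_CREDENTIALS environment variable already points at a service-account key:

```python
from google.cloud import translate_v2 as translate

client = translate.Client()  # picks up credentials from the environment

transcript = "Hello, welcome to the channel."  # text returned by Whisper
result = client.translate(transcript, target_language="de")
print(result["translatedText"])
```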
Video seems great, but for us newbies, you might have to explain certain things... Like what is the blue screen where you typed in the API key and model ID...
Great video, thank you for this! And thank you for including closed captions!
Thanks for your comment!
Great video! Thanks for informing people!
Do you know, if you wanted to identify the specific time for a specific dialogue, how you could do that? If I wanted to get the start/end timestamps of a specific sentence, would that be possible?
I'm still trying to figure those things out. From my understanding, OpenAI will extend the functionalities in future updates.
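One thing that already works: ask the endpoint for response_format='verbose_json', which returns per-segment start/end times (in seconds) that you can search for a sentence. A sketch, assuming the openai 0.27-style API from the video and a hypothetical audio.mp3:

```python
import openai

openai.api_key = "YOUR_API_KEY"

with open("audio.mp3", "rb") as f:
    # verbose_json adds a "segments" list with per-segment timings
    result = openai.Audio.transcribe("whisper-1", f,
                                     response_format="verbose_json")

target = "the sentence you are looking for"
for seg in result["segments"]:
    if target.lower() in seg["text"].lower():
        print(f"{seg['start']:.2f}s -> {seg['end']:.2f}s: {seg['text']}")
```

Segment boundaries won't always line up exactly with sentence boundaries, so a fuzzy match may work better than a plain substring test.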
Great video! But I have a question. How do you run Python code blocks in the VS Code terminal? Is it some kind of extension?
No extension needed! This video might help ruclips.net/video/VXJChVF28jw/видео.html
@@jiejenn Thanks a lot!
Amazing video man thank you
Jie, Awesome tutorial. Thanks
👍
Thanks. Is the Whisper API free?
I have some larger mp3 files that I have extracted from YouTube videos. I am using your chunking method in order to meet the file size requirements for the Whisper API, but I'm not sure how to create the VTT files on the individual chunks in a way that will allow me to reassemble all of the separate VTT files into a single file where the timing matches the original mp3 file before chunking. Do you have any code for this?
Thank you so much for this video. It's great!
I will have to look into it.
Why can't we use translate and transcribe in one Python file?
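You can; both methods live in the same openai module, so a single script works. A minimal sketch (openai 0.27-style API as in the video; audio.mp3 is a hypothetical file name):

```python
import openai

openai.api_key = "YOUR_API_KEY"

# Transcription: text in the audio's original language
with open("audio.mp3", "rb") as f:
    transcript = openai.Audio.transcribe("whisper-1", f)

# Translation: the same audio rendered as English text
# (reopen the file, since the first call consumed the handle)
with open("audio.mp3", "rb") as f:
    translation = openai.Audio.translate("whisper-1", f)

print(transcript["text"])
print(translation["text"])
```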
Thank you Jie Jenn :)
👍
Thank you for this video! Unfortunately I'm getting an error with the 'Audio.transcribe' method:
"AttributeError: type object 'Audio' has no attribute 'transcribe'" Does anyone happen to know what the problem is?
/edit: the problem was my outdated openai version. I was on 0.26, but you need 0.27.
Hard to tell without looking at your script. If you print(dir(openai.Audio)), what members does the output show?
@@jiejenn Thank you for your reply! I've already solved it by now :) my openai version was outdated (0.26 instead of the required 0.27)
@@WolverineAndSloth Cool. Glad you solved your problem. This is something I don't think I would have been able to figure out.
Hi all, I just followed the tutorial and it ran into an error: exceeded your current quota... But I only gave it a 10-sec video. How come?
Did you have any luck resolving this? Currently having the same issue.
How do I rename "assistant:" to the name I want, e.g. "Alexa:"?
I don't think that feature is supported currently.
Hide your key... You probably realized that already, though.
Thanks for the reminder. Yeah, key is deleted right after the tutorial.
@@jiejenn I panicked :) Saw the key before I finished the vid :) Glad you are safe!!!
I prefer Amazon Polly
I will have to check it out.