- 33 videos
- 16,730 views
Tech Giant
Nigeria
Joined Aug 5, 2023
- Tech reviews
- Unboxing of cool gadgets and drones
- AI & coding tutorials
🤖💻🚁
Don't forget to subscribe and ring the bell to never miss an update!
Stay Tech-Savvy🤟🏻
Fish Speech v1.4 by @FishAudio High Quality Voice Cloning TTS Model
NOTE: This is just an update to my previous video on setting up Fish Speech v1.2 on Macbook M1 Pro
I'll be setting up Fish Speech v1.4 by @FishAudio. This is a great TTS model trained on 700k hours of audio data in multiple languages (English, Japanese, German, French, Spanish, Korean, Arabic, and Chinese), and it performs wonderfully at voice cloning and TTS generation. The only downside: it is CPU intensive, and I recommend you get a beefy GPU if you intend to use it as part of your AI system's TTS engine. In this video I set it up on my MacBook M1 Pro, and inference took a while, which I had to speed up so as not to waste your time.
🔗 LINKS
Code Repo: github.com/brainia...
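Since the model runs on CPU but is much happier on a GPU, the usual pattern is to pick the device string once at startup. Here is a minimal sketch of that fallback logic; the CUDA → MPS → CPU order and the `torch` wiring shown in the comments are my assumptions, not the exact code from the video:

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose the best inference device, preferring GPU backends over CPU."""
    if cuda_available:
        return "cuda"   # NVIDIA GPU
    if mps_available:
        return "mps"    # Apple Silicon GPU
    return "cpu"        # slow fallback, as used on the MacBook in the video

# With PyTorch installed, this would typically be wired up as:
# import torch
# device = pick_device(torch.cuda.is_available(),
#                      torch.backends.mps.is_available())
```

Passing the resulting string wherever the code currently hard-codes device="cpu" is what the device="cuda" comments below are asking about.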
202 views
Videos
High Quality Voice Cloning TTS Model - Fish Speech by Fish Audio
740 views · 1 month ago
NOTE: This video is part of the Text-to-speech Comparison Series. I'll be setting up Fish Speech by Fish Audio. This is a great TTS model trained on 300k hours of English, Japanese, and Chinese audio data, and it performs wonderfully at voice cloning and TTS generation. The only downside: it is CPU intensive, and I recommend you get a beefy GPU if you intend to use it as part of your AI system...
Setting up a Realistic Text-to-speech; Bark (by Suno AI) locally
1.1K views · 2 months ago
NOTE: This video is part of the Text-to-speech Comparison Series. I'll be setting up Bark, a transformer-based text-to-audio model created by Suno AI. Bark can generate highly realistic, multilingual speech as well as other audio, including music, background noise, and simple sound effects. The model can also generate nonverbal cues, such as laughter, sighing, and crying. 🔗 LINKS Code Repo: github...
Simple AI Agent/Chatbot | MegaMind | I/O with Whisper.cpp & Piper TTS
333 views · 3 months ago
MegaMind AI is a barebones AI assistant split into three sections, STT | LLM/MLLM | TTS, and built as a structured chat agent with the Langchain framework, together with a few Python libraries and whisper.cpp to enable fast transcription of the user's speech depending on the user's PC hardware. 🔗 LINKS Project's Github: github.com/brainiakk/megamind.ai Whisper CPP (CoreML Support) Example Git...
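The STT | LLM | TTS split described above can be sketched as a simple one-turn pipeline. The three stage functions here are stand-in stubs, not MegaMind's actual components:

```python
from typing import Callable

def run_assistant(
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
    audio_in: bytes,
) -> bytes:
    """Run one assistant turn: speech -> text -> reply -> speech."""
    user_text = stt(audio_in)      # e.g. whisper.cpp transcription
    reply_text = llm(user_text)    # e.g. a Langchain structured chat agent
    return tts(reply_text)         # e.g. Piper TTS synthesis

# Stub usage; real implementations would call whisper.cpp, an LLM, and a TTS engine:
audio_out = run_assistant(
    stt=lambda audio: "hello",
    llm=lambda text: f"you said: {text}",
    tts=lambda text: text.encode(),
    audio_in=b"\x00\x01",
)
```

Keeping each stage behind a plain function like this is what lets the STT, LLM, and TTS engines be swapped independently.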
Speech to text with Whisper CPP in a Python Project (with CoreML/Apple Silicon Support)
757 views · 3 months ago
Let's set up Whisper.cpp locally in a Python project and run audio transcription, converting our speech to text, with Python's subprocess package. 🔗 LINKS Github Repo: github.com/brainiakk/Whisper-CPP-CoreML-Example Whisper CPP Github: github.com/ggerganov/whisper.cpp Main Video: ruclips.net/video/PDz_9wChvcs/видео.html 🔗 MY LINKS Twitter: x.com/alhajibrain Instagram: techgia...
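The subprocess approach described above can be sketched like this. The binary path, model path, and flags follow whisper.cpp's common CLI usage (`-m` model, `-f` audio file, `-nt` no timestamps), but the exact paths are assumptions about your local checkout:

```python
import subprocess

def build_whisper_cmd(audio_path: str,
                      model_path: str = "models/ggml-base.en.bin",
                      binary: str = "./main") -> list[str]:
    """Assemble the whisper.cpp CLI invocation; -nt suppresses timestamps."""
    return [binary, "-m", model_path, "-f", audio_path, "-nt"]

def transcribe(audio_path: str) -> str:
    """Run whisper.cpp on a 16 kHz WAV file and return the transcript text."""
    result = subprocess.run(build_whisper_cmd(audio_path),
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Separating command construction from execution keeps the invocation easy to adjust for CoreML or CUDA builds of the binary.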
Gemini 1.5 Pro (latest) with Langchain's ChatVertexAI Package
265 views · 4 months ago
In this video, we'll set up the Langchain Google Vertex AI (Python) package, and we'll also go through the steps to create a Google Cloud console project and service account and enable the Vertex AI API. We'll also build a pretty simple chatbot while trying out the different versions of @Google's Gemini AI (Gemini 1.5 Pro 001, Gemini 1.5 Flash 001, Gemini Pro). 🔗 LINKS Langchain Structured Chat Agent...
Vision Tool & Screenshot Tool for Langchain Structured Chat Agent (Powered by Gemini 1.5 Pro)
320 views · 4 months ago
NOTE: This is a basic overview of a structured chat agent using the Langchain framework; the multimodal LLM powering the agent and tools is @google's Gemini 1.5 Pro preview. You can play around with the code after cloning the GitHub repository below. 🔗 LINKS Setting up Text To Speech (Piper TTS): ruclips.net/video/hQe861JElXc/видео.html Github Repo: github.com/brainiakk/langchain-structured-chat-a...
Installing Piper Text To Speech Engine (on a Macbook w/ Apple Silicon)
825 views · 4 months ago
NOTE: This video is part of the Text-to-speech Comparison Series, but it also addresses the issue of installing Piper TTS (which requires Piper phonemize) on a MacBook arm64 🔗 LINKS Github Repo: github.com/brainiakk/youtube/tree/main/tts-comparison Original Piper Phonemize issue link: github.com/rhasspy/piper-phonemize/issues/14 🔗 MY LINKS Twitter: x.com/alhajibrain Instagram: _al...
Setting up Openvoice version 2 and MeloTTS for AI voice cloning
6K views · 5 months ago
NOTE: This video is part of the Text-to-speech Comparison Series. I'll be setting up OpenVoice version 2 and MeloTTS by MyShell AI. We'll follow their documentation closely and set up MeloTTS to work independently as a text-to-speech engine and as the base speaker TTS for the OpenVoice voice cloning engine, so you can easily integrate it into your AI application. 🔗 LINKS Code Repo: github.com/brainia...
WizardLM2 function call - Using Llama 3 Tokenizer & Langchain's Pydantic OpenAI function converter
113 views · 5 months ago
LLAMA 3: function calling review using llama index framework and Ollama locally.
555 views · 5 months ago
Port Harcourt City (Nigeria) | A Brief Cinematic OVER-view 🤣
97 views · 6 months ago
CFly Faith 2 Pro Drone Review: Advanced Features, 540º Obstacle Avoidance & Impressive Performance!
524 views · 8 months ago
Speedybee Bee35 Pro FPV Frame Review: Durable, Protective, and Feature-Packed!
1K views · 9 months ago
Honest Walksnail Avatar HD Pro Kit Review: Disappointing Range, and a Watery Demise 💔
110 views · 10 months ago
Freestyle using the @flyfishrc VoladoR II VD5 and video shot with the @CADDXFPV Walnut
16 views · 10 months ago
Exploring the Landscapes of Uniport with the CFLY Faith 2 Pro Drone
296 views · 10 months ago
@flyfishrc Volador II VD5 Maiden flight
15 views · 10 months ago
Bro, are you Nigerian?
Curious, what is this explorer window that shows the content being populated as you run the commands?
What are the required machine configs for this? I'm running out of memory on a Tesla T4 (16 GB) with CUDA, and on my 28 GB of RAM with CPU.
I ran it on my MacBook M1 Pro, but it also supports CUDA if your GPU supports it; in the video I switched to CPU to run it on my Mac.
Does it support Bahasa Indonesia?
Does it have a docker setup?
not sure
Hello pro, big big fan! It finally runs perfectly for me 😍 thank you for this awesome tutorial. Just one more question: I have a 4090 GPU on my laptop; what do I need to run this on the GPU? I tried changing device="cpu" to device="cuda" in the code. I have 2 GPUs, the Intel built-in one and an Nvidia GeForce RTX 4090. Thank you in advance, pro 😀
I use a MacBook, that's why I switched from CUDA to CPU. You can reach out to Nvidia support if you're having trouble with their hardware, but make sure you've installed the necessary drivers for that graphics card so you can use CUDA before reaching out to them.
@@techgiantt I have CUDA set up and it works fine with many other projects, but what changes do I need to make to run this on the GPU?
Please, pro, make a full, clear tutorial.
Hello pro, I followed your tutorial on Fish Speech. Please make a full, clear tutorial about this model, v1.4.
We're still waiting for a clear tutorial on this model, please, pro.
Hello Sir, thank you so much for this tutorial. I'm wondering if you can do the cloning using Bark (the one you used in the last video), so the reading also sounds more human?
Hello, I tried all the solutions you gave me, still the same errors. Please, I need your help. Best regards.
Hello pro, good tutorial, but I have a problem; can you review it with me? I did almost everything like you did, but it still shows me: No module named 'fish_speech.conversation'. I checked the path and everything is good.
Did you use version 1.2 (that's what I used)? There is a new version (v1.4).
@@techgiantt Yes, I used 1.4, still the same problem. Can I have your contact number? I need your help, pro.
@@techgiantt I need your contact number, I need your help.
@@mahaltech Just get version 1.2 from their github releases page, and follow the other steps I showed in the video.
@@techgiantt There is some difference.
The project can also read PDF documents, and I'm willing to connect it all into a super all-in-one project with a UI this month.
I would also love a UI to select languages between your outputs and select characters between your character .TXT prompts. And lastly, a selector for your vault .TXT files, to tell the LLM what you expect the AI to know about your needs.
Thanks. I think you need a better 🎤 mic; your voice is so low it will probably put people to sleep. Voice quality is more important than video. Good luck.
Sorry about the inconvenience; I had some problems with the mic I/O. I'll switch it out in the next video.
Great job! By the way, did you try vLLM to load the Mistral LLM? vLLM can start an OpenAI API-compatible server with: python -m vllm.entrypoints.openai.api_server --model mistralai/Mistral-7B-Instruct-v0.3. Then, with from langchain.chat_models.base import BaseChatModel, create a custom class CustomVLLM(BaseChatModel) and use llm = CustomVLLM(base_url="localhost:8080/v1").
@@baoxinping3081 I’ll check it out, I’ve been postponing checking it out for a while. I’m also working on updating the project, maybe I’ll cram it into one video.
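As an alternative to subclassing BaseChatModel, the OpenAI-compatible endpoint that vLLM serves can be called directly over HTTP. A stdlib-only sketch, assuming a server like the one in the comment above is running locally (note vLLM's default port is 8000; adjust base_url to match your server flags):

```python
import json
from urllib import request

def chat_payload(model: str, user_msg: str) -> dict:
    """Build a minimal OpenAI-compatible /chat/completions request body."""
    return {"model": model, "messages": [{"role": "user", "content": user_msg}]}

def ask_vllm(user_msg: str,
             model: str = "mistralai/Mistral-7B-Instruct-v0.3",
             base_url: str = "http://localhost:8000/v1") -> str:
    """POST a chat request to a locally running vLLM OpenAI-compatible server."""
    req = request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(chat_payload(model, user_msg)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the wire format matches OpenAI's, any OpenAI-style client (including Langchain's OpenAI chat model pointed at a custom base_url) should work the same way.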
awesome
Having issues with piper, not able to find the module regardless of my attempts.
I didn't set up piper in this video
@@techgiantt No worries, I was having some bizarre dependency issues with my venv; I did a clean pip install of the requirements and everything ran smoothly besides some ALSA issues, but I cleaned it all up. Thanks.
Guys, make sure you're using Python 3.10 or it gets weird. But also, this was an amazing tutorial! Thanks!!
is it utilizing the GPU of the apple silicon?
♥
As far as I understand, this project is for Linux and macOS only, since coremltools only works on them. But the video is still good.
For those who don't want to mess around with a complex installation, use Pinokio
Nice work man
Thanks man
While executing python -m main, I'm getting the following error: cannot import name 'TTS' from 'melo'
@@SurrenderToAction Did you install MeloTTS? It's in the modules directory, and you need to install it in the virtual environment you set up.
@@techgiantt As you do at 3:48 in the video, right? Yes, I did that, but I had a bunch of issues. Maybe because I was using the latest version of Python, and went to 3.9 later..
@@SurrenderToAction Check ruclips.net/video/UsuuSgnOJxg/видео.html
@@techgiantt Yeah, I did rename it to "melo". Maybe it's better to do everything from scratch. One more day..
@@techgiantt I'm getting this error when installing melo: ERROR: Failed building wheel for tokenizers
I got this error when generating the audio file: ../lib/libespeak-ng.dylib' (no such file)
Why did you mount the GPS like that......... it's too bad, the frame comes with the mount, bro.
The GPS mount is too small for the standard GPS size.
Wow, This is amazing 😮😮
Thanks! Can I run this on CPU? Because I don't have a GPU.
I think by default it runs on CPU, just ignore the CoreML support bit.
A bit confusing. What's the relation between MeloTTS and OpenVoice V2?
MeloTTS can act as a standalone text-to-speech engine or as the base speaker for OpenVoice v2. OpenVoice is both a TTS and a voice cloning engine. OpenVoice v1 can do without MeloTTS as the base speaker.
@@techgiantt Thanks for your reply. I'm able to play English voice without any issue. But when I play Chinese, I got the following error message: RuntimeError: Placeholder storage has not been allocated on MPS device! Any suggestion? Thanks.
Does it work well on audio in other languages? I have tried Bark and Tacotron 2 but did not get good results for Hindi. Thanks for the video, keep giving good content 😊
I think it's mostly English, Japanese, Chinese, French, Spanish, and Korean that are supported, but it also has an Indian accent.
Hello brother, can you share the code?
For the text-to-speech or the AI agent using tools? I already have a video on the TTS, with a GitHub repo in the description of that video.
Is there a way to make these TTS models more expressive?
Yes, but you need a beefy GPU to use it with an AI model, since you won't want extra latency. I'll create a video for that.
@@techgiantt I think it would be amazing if they could act, expressing emotions (anger, sadness, sorrow, compassion, confidence, hesitation, shyness, embarrassment, bravado, whisper, fear, shout, laugh, etc.): moods and personality expressed via voice.
@@komakaze1 exactly my thought. Is there any other alternative out there, one that doesn't cost 20 bucks per month?
Try making it run with Ollama, Phi-3, and LLaVA for vision (or Phi-3 Vision) if possible. Also, where can we get the code? I also have a set of tools that can be used with Ollama and Phi-3; hit me up to talk about it if you want!!!
Use Wizard, it's better as an assistant. I'm looking for a project that shares the computer screen, and it describes it and helps with code.
good bro!
Thanks 🔥
Cool campus
I'd like to see more on how to achieve this. Is each agent a different LLM instance? Thank you!
I can complete the build and install, all good. But when I run Piper to generate speech, I cannot get past this: Error processing file '/usr/share/espeak-ng-data/phontab': No such file or directory. And macOS will not allow installing anything into /usr/share (yes, I know of the workaround for that, but I'm not willing to do it).
I mentioned a different way to install piper locally if you don't want to use Virtualenv, I think it's in the readme.md.
@@techgiantt THANK YOU! I missed that part.
I get an error when I build with make after running cmake -Bbuild -DCMAKE_INSTALL_PREFIX=install: "make: cmake: No such file or directory" and "make: *** [all] Error 1".
When installing Piper phonemize?
No, that all seems to work. It is when building by using "make."
There's no dot after the "make" command. It's piper-phonemize that I set up with that command, after checking out to a new branch from a commit ID.
You can remove the build directory and also the install directory in the pp folder if you have issues with cmake.
Bro, you need to create your own model. First choose your model type (me, I use Mistral) and stay with it: merge your way up and train for all your use cases, hence becoming a powerful model. It's a personal AI in the end; you can do all your tutorials with your model, so do not spend money on outside models. I made the swahilibots to begin, but they also need training; I used my super model as the base, so it's just about giving the model enough samples in Swahili for the brain to convert itself. We cannot start from scratch (that wastes money too, bro). Good work on the tutorial! These are just components to add onto your model until you train it to perform these tasks internally. So if you create examples of functions, function calls, and completions in the prompts, the model will later use the Python functions you created inside the prompt in its other methods. Essentially, we can train the model to perform a single function that we create in Python, as long as we give it many examples of its execution (inputs and outputs, as well as requests for data), and the function will be recalled in the brain and used to produce the answer, like an internal app (so we can secretly train these functions in). We can use these function calls with their verbose outputs to create examples of their use, and use this verbose output as data to fine-tune the model on the task. huggingface.co/collections/LeroyDyer/swahilibots
Could you share the code please
Definitely, I'm currently creating a video breaking it down, because there are different ways to achieve this. Once I'm done I'll share the link to the repo here as well.
@@techgiantt Thank you brother, write me if you have time and get money ))
Could you share the code
github.com/brainiakk/langchain-structured-chat-agent
Is there a download link for the zip file?
Yup, the link to the GitHub repo is in the description
great bro!
Great job. Can you show how a PDF or TXT file can be uploaded and used instead of cutting and pasting or typing text? Most videos show short phrases, but if you want a paper or a document made into speech, how would you go about doing this? Thanks again.
Interesting 🤔 I might do a video on it, but it's as simple as adding a function that parses the text from the PDF or TXT and passes it on to the text-to-speech function. It can be passed in chunks, maybe paragraph by paragraph. You could also accept the document's file path as input when you run the Python script, or make it more lively by using the TTS function to say "please provide the file you want me to read" while a file dialog opens your file explorer. It all depends on what you want, but it's possible.
What operating system are you running? Will this work for Windows?
I'm using OSX (Apple Macbook), I think it should work fine on windows
@@techgiantt Appreciate the response! I'm going to give it a shot. Subbed!
@@justindaniels5923 Thanks for the sub
Thanks for the library that made it possible for us to do Llama 3 ❤
Hello, first of all, thank you for the tutorial. Currently there are not many out there 🙂 But when running, I get this error: Loaded checkpoint 'modules/openvoice/checkpoints_v2/converter/checkpoint.pth' missing/unexpected keys: [] []. Any idea what might be wrong? Thank you!
Did you set the directories up exactly as I did? Also, make sure you copied the downloaded checkpoints_v2 folder to the openvoice directory properly and if you did all that, you could go to their huggingface page and redownload the Openvoice V2 converter/checkpoint.pth to replace the old one. I'll add their huggingface link in the description.
Hold on, do you mean: [ ] ? Because what I'm seeing in your comment is this: [] []
@@techgiantt Yes, exactly! Maybe a copy and paste issue.
@@eucharisticadoration Then that's not an issue. I don't know why they didn't hide that output in this version. Just let it run; if there were a missing key, it would be written in the square brackets like a list, but since they're empty, that means everything is okay.
@@techgiantt Ok, thank you very much! Finally I realized that I had to change some more things (paths) in voice.py, and now it is running 🙂
Great. Thank you.
You are welcome
The video is really dim, unviewable.
My apologies; I think something went wrong in post-production. I'll keep that in mind when uploading the next video, because I'll be updating the codebase. Thanks for letting me know.
hey, cool video! do you have source code published somewhere?
github.com/brainiakk/youtube/tree/main/llama3-function-calling
@@techgiantt Thanks, much appreciated :)