Learn How he reproduced Karpathy's GPT-2 for Audio!!!

1littlecoder

4.5K views

Published: 14 Jun 2024
🔗 Links 🔗
Building GPT2o - Part 1 : Audio
/ building-gpt2o-part-1-...
GPT-2 for Audio - github.com/nivibilla/build-na...
Srinivas Billa Twitter
x.com/sbeastwindy
Srinivas Billa Linkedin
/ srinivasbilla
Andrej Karpathy's GPT-2 Video - • Let's reproduce GPT-2 ...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - ko-fi.com/1littlecoder
🧭 Follow me on 🧭
Twitter - / 1littlecoder
Linkedin - / amrrs
Science

Comments • 33

@Srinivas_Billa 20 days ago ⁺¹⁹
Thanks for having me!
@thecandidcrood 20 days ago ⁺¹
Crazy work bro
@spencerfunk6697 19 days ago
Do you have a Discord?
@petersobolewski1354 19 days ago
Thanks for sharing this with us!
@maulikmadhavi 16 days ago
thanks for sharing your code and explanation!
@maulikmadhavi 16 days ago
wonder if any augmentation can help to overcome overfitting issue
@CubicPostcode 19 days ago ⁺²
I was thinking about how OpenAI could come up with nice voices without being prone to lawsuits. I came up with this idea: generate voices randomly, and provably so, so that it would be possible to prove the voices were generated randomly. People could then upvote or downvote voices, and the most popular ones according to the crowdsourced polling would be the ones featured in the app. Since the voices were randomly generated, no one could say they were an imitation of someone. It would also be no fault of OpenAI that people preferred some of them. This seems a better approach than just allowing users to upload a sample of their desired voice, since it avoids misuse such as deepfaking.
@mshonle 20 days ago
Very cool! I wonder if you could leverage available text models to do something like the model mashups or Franken-merges? For example, if you do a LoRA-like fine-tuning, but focused on all layers, with additional layers added to both ends (to translate from the audio encodings to the pretrained model’s hidden embeddings and then back to decodable audio again).
@christaylor-gz6mi 18 days ago ⁺¹
Thank you! This is fantastic content!
@1littlecoder 18 days ago
Glad you enjoyed it!
@chickenp7038 20 days ago
awesome video.
@maulikmadhavi 16 days ago ⁺¹
thanks for sharing 😊
@1littlecoder 16 days ago
Thanks for watching!
@user-en4ek6xt6w 20 days ago ⁺¹
Mind blowing
@1littlecoder 20 days ago
🚀
@petersobolewski1354 19 days ago
How about training it on animal sounds? Will it learn to speak with them?
@1msirius 20 days ago
cool bro!
@1littlecoder 20 days ago
✌
@satyamtiwari3839 20 days ago
It is inspiring
@haileycollet4147 16 days ago
HF datasets/jhu-clsp/seamless-align-expressive
The English half of this is ~3500 hours I think.
@WebWizard977 20 days ago ⁺¹
It feels like TTS that converts audio to text and then sends it to a GPT server 🤔
@1littlecoder 20 days ago
This is one single native audio model
@WebWizard977 20 days ago
@@1littlecoder wow that's amazing
@efexzium 20 days ago
It's a Large Multimodal Model?
@WebWizard977 20 days ago ⁺¹
@@efexzium ya sure brother, I have my own LLM also, but thanks for the theory update
@efexzium 20 days ago
@@WebWizard977 me too, but I just cherry-pick the best model from the internet.
@xXWillyxWonkaXx 16 days ago
Forgive me for saying this, but why is this "a great project"? I fail to understand. So he built an audio-in, audio-out model similar to GPT-4o's audio feature? Is that it? We're getting an open-source model.
@TheRealUsername 20 days ago
GPT-4o ?
@1littlecoder 20 days ago
He mentions that's his motivation
@TheRealUsername 20 days ago
@@1littlecoder I mean, it could be the same tech behind GPT-4o
@Srinivas_Billa 20 days ago ⁺¹
This is a very naive way to do it, but yeah. There are probably other optimisations to make, but I'd rather have something to play with than not, I guess?
@TheRealUsername 20 days ago
@@Srinivas_Billa so that does mean the open-source community is capable of building something similar to GPT-4o
@Srinivas_Billa 20 days ago ⁺²
@@TheRealUsername of course! I plan to try video next and then ultimately combine them together. The tools are there to do it. Compute is the issue.
