
Deduct OpenAI GPT-4o's Neural Network Architecture

  • Published: 16 Aug 2024
  • “GPT-4o, ... new flagship model that can reason across audio, vision, and text in real time.”
    GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction: it accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation. It matches GPT-4 Turbo performance on English text and code, with significant improvement on non-English text, while also being much faster and 50% cheaper in the API. GPT-4o is notably stronger at vision and audio understanding than existing models.
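The "any combination of text, audio, image, and video" input described above can be illustrated with a minimal sketch of a Chat Completions request payload mixing text and image content parts. The image URL is a placeholder and no request is actually sent; this only shows the shape of a multimodal message.

```python
# Sketch of a multimodal request payload for GPT-4o via the OpenAI
# Chat Completions API. The image URL below is a placeholder; the
# payload is only constructed, never sent.
def build_multimodal_request(prompt: str, image_url: str) -> dict:
    """Assemble a chat-completions payload mixing text and image inputs."""
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_multimodal_request(
    "Describe this chart.", "https://example.com/chart.png"
)
print(payload["model"])                        # gpt-4o
print(len(payload["messages"][0]["content"]))  # 2
```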
    #google #gpt4o #gpt-4o #openai #whisper #multimodal #audio #voicemode #voiceengine #voice #speech #projectastra #astra
    ==
    Olewave offers avant-garde bespoke solutions for proprietary data labeling, normalization, and transformation.
    Tired of inaccurate transcriptions and frustrating APIs? Olewave offers a superior solution with:
    • AI-powered Accuracy: Transcribe any audio, regardless of language, dialect, accent, or topic, with exceptional accuracy. We surpass the competition in understanding even the most challenging recordings.
    • Detailed Insights: Gain valuable insights with word/character-level confidence scores, precise timestamps, and advanced speech analytics.
    • Privacy Guaranteed: Keep your data secure. Integrate our powerful data labeling tool directly into your platform, eliminating risks associated with external APIs.
    • Competitive Pricing: Enjoy high-quality service at accessible prices, outperforming both tech giants and human-intensive transcription solutions.
    Ready to experience the difference? Don't settle for mediocrity. Contact info@olewave.com and give us a try!
    Customized Large-Scale Datasets
    Olewave delivers customized, labeled, and validated large-scale real-world NLP/CV/speech/multimodal datasets covering scenarios such as dictation and conversation in multiple accents, dialects, and languages, and diverse topics such as education, finance, legal, entertainment, healthcare, retail, and customer service.
    Our datasets include:
    • topic-specific text datasets for training your own LLM/ChatGPT/LLaMA model;
    • visual/video/image datasets with tags/prompts for training your own CV/SAM model;
    • speech/audio datasets in different languages and dialects for training your own ASR/Whisper/SeamlessM4T/TTS model;
    • and multimodal datasets.
    We continuously collect timely data in languages including Brazilian Portuguese, Latin American Spanish, Arabic, Southeast Asian languages, Chinese, Japanese, and Korean.
    Faster and more affordable data delivery than traditional data vendors;
    more effective and efficient than traditional data vendors.

Comments • 5

  • @reynoldsVincent · 2 months ago

    Working my way through this; it looks like you did a lot of insightful work. Sub'd, thanks!

  • @haibinwu1568 · 2 months ago · +1

    What do you think is the most likely tokenizer for audio in GPT-4O? What do you think are the possible solutions? Could it be BASE TTS style tokens or Codec-style tokens?

    • @olewave · 2 months ago · +1

      What do you think is the most likely tokenizer for audio in GPT-4o? What do you think are the possible solutions?
      > I think you meant 'encoder' rather than 'tokenizer'. I covered that in the video; please watch it.
      Could it be BASE TTS style tokens or Codec-style tokens?
      > That is very specific; I am not an OpenAI employee. I can only infer their architecture, not their implementation.
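For readers unfamiliar with the "codec-style tokens" mentioned in this exchange, here is a toy, hedged illustration of the idea: each fixed-size frame of audio samples is mapped to the index of its nearest codebook vector (plain vector quantization). Real neural audio codecs use learned encoders and residual vector quantization; the tiny codebook and frame size here are purely hypothetical.

```python
# Toy "codec-style" audio tokenization: one codebook index per frame via
# nearest-neighbor vector quantization. The codebook is hypothetical;
# real codecs learn theirs and stack multiple residual quantizers.
from math import dist

CODEBOOK = [
    (0.0, 0.0), (1.0, 1.0), (-1.0, -1.0), (1.0, -1.0),
]

def tokenize(samples, frame_size=2):
    """Split samples into frames and emit one codebook index per frame."""
    tokens = []
    for i in range(0, len(samples) - frame_size + 1, frame_size):
        frame = tuple(samples[i:i + frame_size])
        tokens.append(min(range(len(CODEBOOK)),
                          key=lambda k: dist(frame, CODEBOOK[k])))
    return tokens

def detokenize(tokens):
    """Reconstruct a (lossy) waveform by concatenating codebook vectors."""
    out = []
    for t in tokens:
        out.extend(CODEBOOK[t])
    return out

audio = [0.1, -0.2, 0.9, 1.1, -0.8, -1.2]
toks = tokenize(audio)
print(toks)              # [0, 1, 2]
print(detokenize(toks))  # [0.0, 0.0, 1.0, 1.0, -1.0, -1.0]
```

The reconstruction is deliberately lossy: a language model would operate on the integer token sequence, and a decoder would map tokens back to audio.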

  • @augmentos · 2 months ago

    Great video ser thanks subd

    • @augmentos · 2 months ago

      Would love you to expand on how they ‘optimize heavily’ to run very fast