MatMul-Free Language Modeling: New Ways of LLM Training & Inference

  • Published: 15 Jul 2024
  • In this tutorial, I dive deep into the world of scalable MatMul-free language modeling. You'll learn the basics of matrix multiplication (MatMul), its role in neural networks and large language models, and the challenges it presents. Discover how MatMul-free language models operate, leveraging BitLinear layers with ternary weights to achieve impressive efficiency and performance (a minimal code sketch follows the links below).
    I'll also explore the GPU-efficient implementation that reduces memory usage by up to 61% during training and significantly improves inference speed, as well as the custom FPGA hardware solution designed for brain-like efficiency.
    If you find this video helpful, please like, comment, and subscribe to my channel for more tutorials!
    JOIN THE DISCORD: / discord
    Join this channel to get access to perks:
    / @aianytime
    To further support the channel, you can contribute via the following methods:
    Bitcoin Address: 32zhmo5T9jvu8gJDGW3LTuKBM1KPMHoCsW
    UPI: sonu1000raw@ybl
    GitHub: github.com/AIAnytime/MatMul-F...
    #ai #llm #aiagents
  • Science
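
For a concrete picture of the BitLinear idea mentioned in the description, here is a minimal PyTorch sketch. This is not the authors' implementation: the class name, the per-tensor scaling scheme, and the straight-through estimator shown here are illustrative assumptions, and in the actual GPU/FPGA implementations the ternary product is handled by specialized add/subtract kernels rather than a plain `F.linear` call.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BitLinear(nn.Module):
    """Minimal sketch of a linear layer with ternary weights.

    Weights are quantized on the fly to {-1, 0, +1} (scaled by their
    mean absolute value), so a dedicated kernel could replace every
    multiplication with an add, a subtract, or a skip. A straight-through
    estimator keeps the full-precision weights trainable.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Per-tensor scale: mean absolute value of the weights
        # (an assumed, simple choice of scaling).
        scale = w.abs().mean().clamp(min=1e-5)
        # Ternarize: round(w / scale), clipped to {-1, 0, +1}, rescaled.
        w_q = (w / scale).round().clamp(-1, 1) * scale
        # Straight-through estimator: use w_q in the forward pass,
        # but let gradients flow to the full-precision weights.
        w_q = w + (w_q - w).detach()
        # Plain matmul here for clarity; a MatMul-free kernel would
        # exploit the ternary structure instead.
        return F.linear(x, w_q)


if __name__ == "__main__":
    layer = BitLinear(512, 512)
    y = layer(torch.randn(2, 512))
    print(y.shape)  # torch.Size([2, 512])
```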

Comments • 3

  • @ozne_2358
    16 days ago

    I was hoping for a more in-depth description of the architecture. For example, I looked at the paper and I understand the equations on pp. 6 and 7. However, I do not understand how they connect to each other: they even use the same symbol g_t as... an output in both cases.

  • @khaledbouzaiene3959
    14 days ago

    The link in the description isn't working.