Training Large Language Models on Kubernetes - Ronen Dar, Run:ai

  • Published: 12 Nov 2023
  • Large Language Models (LLMs) are emerging as the biggest technology breakthrough since the launch of the iPhone. LLMs are huge, and training them requires massive amounts of data and compute. LLM training is often carried out on bare-metal servers with workload schedulers from the high-performance computing world, such as Slurm. In this talk, we present the challenges involved in pre-training LLMs in general and on Kubernetes in particular. We discuss best practices for network optimization, distributed resource management, scheduling, and code manipulation. We provide scripts based on NVIDIA’s Megatron Transformer framework with pre-made configurations, data pre-processing workflows, and a training setup that makes it easy for users to quickly start LLM training on K8s (see the illustrative sketch below). We also provide benchmark results comparing training throughput between bare-metal and K8s-based environments for models such as GPT, T5, and BERT, across a varying number of GPU nodes.
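The talk's own Megatron scripts and configurations are not reproduced here. As a rough illustration only, the sketch below assumes a Kubernetes indexed Job (or a Kubeflow PyTorchJob) that injects MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE, and LOCAL_RANK into each worker pod, and shows how a training entrypoint could bring up a NCCL process group from those variables before handing off to the actual model code.

```python
# Minimal sketch (not the speaker's actual scripts): a distributed training
# entrypoint for GPU worker pods on Kubernetes. It assumes the Job spec or
# operator injects the standard torch.distributed rendezvous variables:
#   MASTER_ADDR, MASTER_PORT - address of the rank-0 pod
#   WORLD_SIZE               - total number of workers
#   RANK, LOCAL_RANK         - this worker's global rank and GPU index
import os

import torch
import torch.distributed as dist


def init_distributed() -> int:
    """Initialize the NCCL process group from environment variables."""
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    local_rank = int(os.environ.get("LOCAL_RANK", 0))

    # Bind this process to its GPU, then join the process group.
    # init_process_group defaults to the env:// rendezvous, which reads
    # MASTER_ADDR and MASTER_PORT from the environment.
    torch.cuda.set_device(local_rank)
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)
    return local_rank


if __name__ == "__main__":
    local_rank = init_distributed()
    print(f"rank {dist.get_rank()}/{dist.get_world_size()} ready on GPU {local_rank}")
    # ... a real pretraining loop (e.g. Megatron-style GPT/T5/BERT) would run here ...
    dist.destroy_process_group()
```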
