I'm not sure I caught the question at 24:03, but it looked like the comparison was between vLLM and TRT-LLM. A few months ago, when we worked with the Llama 2 70B model for RAG-based systems, we found that vLLM's KV caching mechanism handled memory usage better. Since our inference server now supports vLLM, we generally default to vLLM engines. A benchmark by Ray (a few months old, so it might be outdated) also showed vLLM outperforming TRT-LLM in quite a few scenarios.
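For context, here is a rough sketch of how we spin up a vLLM engine for this kind of workload; the model id, parallelism degree, and memory fraction are illustrative assumptions rather than our exact production settings:

```python
from vllm import LLM, SamplingParams

# Illustrative settings only; actual values depend on hardware and serving stack.
llm = LLM(
    model="meta-llama/Llama-2-70b-chat-hf",  # assumed HF model id
    tensor_parallel_size=4,                  # split 70B weights across 4 GPUs
    gpu_memory_utilization=0.90,             # GPU memory fraction for weights + paged KV cache
    max_model_len=4096,                      # cap context length to bound KV cache growth
)

sampling = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate(["Summarize the retrieved passages: ..."], sampling)
print(outputs[0].outputs[0].text)
```

The `gpu_memory_utilization` knob is what makes the paged KV cache behaviour so predictable for us on long RAG contexts.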
This is a great video, and the presenters clearly know what they are talking about. Could you expand on which options people generally leave at their defaults (but shouldn't) when trying to optimise LLMs with TRT-LLM?