Generative AI Inference Powered by NVIDIA NIM: Performance and TCO Advantage

  • Published: Sep 30, 2024
  • Visualize the impact of high-performance generative AI inferencing with NVIDIA NIM microservices. This video showcases how NIM's prebuilt, optimized microservices outperform popular alternative inferencing engines, delivering up to 3x more tokens per second of throughput on the same NVIDIA accelerated infrastructure.
    The video demonstrates the benefits of NIM microservices through a crossword puzzle-solving application powered by LLMs, scaling concurrent LLM requests from 50 to 200. Watch the throughput advantage grow as the inference workload increases: processing more tokens per second on the same infrastructure means more generative AI applications served at a lower overall TCO.
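    For readers who want to reproduce a similar concurrency sweep themselves, here is a minimal sketch, not NVIDIA's benchmark harness. It assumes a NIM (or other OpenAI-compatible) endpoint at http://localhost:8000/v1; the base URL, model name, and prompt are placeholders you would swap for your own deployment.
    ```python
    # Minimal concurrency-scaling sketch: fire N simultaneous chat requests
    # and report aggregate tokens/second. Endpoint and model are assumptions.
    import asyncio
    import time

    from openai import AsyncOpenAI

    client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

    async def one_request(prompt: str) -> int:
        """Send one chat completion and return its completion-token count."""
        resp = await client.chat.completions.create(
            model="meta/llama3-8b-instruct",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
            max_tokens=128,
        )
        return resp.usage.completion_tokens

    async def measure(concurrency: int) -> float:
        """Run `concurrency` requests in parallel; return tokens/second."""
        start = time.perf_counter()
        counts = await asyncio.gather(
            *[one_request("Give a 5-letter word meaning 'fast'.")
              for _ in range(concurrency)]
        )
        elapsed = time.perf_counter() - start
        return sum(counts) / elapsed

    async def main() -> None:
        # Mirrors the 50 -> 200 concurrent-request sweep shown in the video.
        for concurrency in (50, 100, 200):
            tps = await measure(concurrency)
            print(f"concurrency={concurrency}: {tps:.1f} tokens/s")

    asyncio.run(main())
    ```
    Running the same sweep against two engines on identical hardware gives a like-for-like tokens-per-second comparison of the kind the video visualizes.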
    0:15 - Value of optimizing generative AI inference for maximum performance
    0:28 - Overview of NIM microservices (nvda.ws/4bZLY9E)
    0:44 - Demo of a crossword puzzle-solving application deployed with NIM and popular alternative inferencing software
    1:33 - 2.4x more tokens per second when solving nearly 50 crosswords
    1:41 - 3x more tokens per second when solving 225 crosswords
    2:03 - Impact on business productivity
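    To make the TCO claim concrete, here is a hypothetical back-of-the-envelope calculation. The workload size and per-GPU baseline throughput below are assumptions for illustration; only the up-to-3x speedup comes from the video.
    ```python
    # Hypothetical TCO illustration: at 3x tokens/s on the same GPUs,
    # a fixed workload needs roughly one third of the GPU capacity.
    workload_tokens_per_s = 30_000   # assumed steady-state demand
    baseline_tps_per_gpu = 500       # assumed alternative-engine throughput
    speedup = 3.0                    # the up-to-3x figure from the video

    gpus_baseline = workload_tokens_per_s / baseline_tps_per_gpu          # 60 GPUs
    gpus_nim = workload_tokens_per_s / (baseline_tps_per_gpu * speedup)   # 20 GPUs
    print(f"GPUs needed: {gpus_baseline:.0f} -> {gpus_nim:.0f} "
          f"({1 - gpus_nim / gpus_baseline:.0%} fewer)")
    ```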
    Get started today at ai.nvidia.com: nvda.ws/3Y2Po7U
    Developer resources:
    ▫️ Learn more about NIM: nvda.ws/3yqsuNw
    ▫️ Join the NVIDIA Developer Program: nvda.ws/3OhiXfl
    ▫️ Access downloadable NIM microservices on the API catalog: nvda.ws/4bZLY9E
    ▫️ Read the Mastering LLM Techniques series to learn about inference optimization, LLM training, and more: resources.nvid...
    #inferencemicroservices #inferenceoptimization #api #selfhosting #modeldeployment #aimodel #LLM #generativeai #aimicroservices #nvidianim #generativeaideployment #aiinference #productiongenai #enterprisegenerativeai #acceleratedinference #nvidiaai #apicatalog
