How to evaluate Artificial Intelligence?

Ricardo Santos

354 views

Published: Jul 25, 2024
The state of the art in language models shows that AI still has a long way to go. Researchers are designing new evaluation methods to quantify the performance of Large Language Models (LLMs) and to identify their strengths and limitations.
In this video we explore the new LLM evaluation methods based on the paper "A Survey on Evaluation of Large Language Models" and answer the question of why you should not trust AI.
Video title: How to evaluate AI?
Watch my latest video: The Great Leap! From Developer to AI Engineer - • ¡El Gran Salto! De Des...
824 Views - Feb 26, 2024
Help me reach my subscriber goal!: ||||||...... 17% ............... 17.4K/100K
-------------------------------------------------- -----------------------------------
Resources
- A Survey on Evaluation of Large Language Models: arxiv.org/abs/2307.03109
-------------------------------------------------- -----------------------------------
Sections:
0:00 Introduction
0:52 Evaluation of AI models
1:34 What are the tasks that LLMs perform?
2:06 Performance in NLP tasks
2:49 Performance in ethics and bias
3:24 Performance in social sciences
4:01 Performance in natural sciences and engineering
4:29 Performance in medicine
4:48 Performance in agent tasks
5:23 Performance in other tasks
6:07 Where to evaluate LLMs?
7:17 How to evaluate LLMs?
8:36 Summary of findings in the evaluation of LLMs
9:58 Conclusions
-------------------------------------------------- -----------------------------------
Music:
Legend Has It - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/3UN60C...
Lucky Stars - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/70f90U...
Stop The Clock - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/2fainn...
No Introduction - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/4SMBTz...
Rise Up - Harris Heller
Provided by Streambeats
Listen: open.spotify.com/track/4DqeLS...
-------------------------------------------------- -----------------------------------
Networks:
GitHub: github.com/Tibiritabara
LinkedIn: / ricardosantosdiaz
Instagram: / tibiritabara90
-------------------------------------------------- -----------------------------------
Thanks for watching the video!
#ai #llm #software
Science

Comments • 6

@RicardoSantosDiaz 11 months ago ⁺¹
LLMs (Large Language Models) are here to stay, but before mass adoption we need to identify their serious flaws and risks to society, and put significant effort into their continuous improvement and evaluation to ensure a positive impact. We must accept that we are still very far from that.
@S4z4kku 11 months ago ⁺¹
Very good information. Everything associated with AI has been glorified so much that nobody talks about these important technical details that have yet to be addressed.
@RicardoSantosDiaz 11 months ago
Certainly, we need to keep an objective perspective, but the sensationalist noise from the press is often louder.
@adriipinto 11 months ago
🙌🏽🙌🏽🙌🏽
@angelicasantos568 11 months ago
wooow
@OTTOALACCION28 11 months ago
Excellent. Artificial intelligence cannot be more intelligent than us human beings.
