Hanna Hajishirzi (AI2) - OLMo: Findings of Training an Open LM
- Published: 28 Apr 2024
- Talk from the Open-Source Generative AI Workshop at Cornell Tech.
Speaker: homes.cs.washington.edu/~hann...
Slides - drive.google.com/file/d/1BlHJ...
Not a fan until they bring something new to the architecture. Compare it to Striped Hyena/Mamba/RWKV. Whatever kneading they do with the training datasets, in the end it's worse than the Apache-licensed Mistral.
There are hundreds of new papers and thousands of older papers whose ideas were never implemented in big text-generation models. Yet here we see yet another decoder-only Transformer model.
While I'm interested in different architectures, my guess is that they will end up performing similarly to decoder-only Transformers at the end of the day. Changes in data and the amount of training seem to have a larger impact on the actual performance of the model. While Mistral / Llama 3 are extremely good, we do not really know why. Presumably it is due to their data ingestion processes.