New xLSTM explained: Better than Transformer LLMs?

  • Published: 30 Sep 2024
  • JUST days ago a new alternative to transformer LLMs was published: xLSTM, and in particular mLSTM. The Matrix Long Short-Term Memory (mLSTM) network is an advanced variation of the traditional Long Short-Term Memory (LSTM) model. The core idea of mLSTM is an "accumulated covariance" memory update combined with exponential gating functions. I explain it in detail in this video and compare it to the classical attention mechanism.
    The actual performance can't be independently evaluated at the moment, since the research paper was just published. I will keep you informed.
    mLSTM differentiates itself by employing a matrix-based memory: instead of the scalar memory cell of the classical LSTM, it maintains a matrix-valued state that is updated with an outer-product ("covariance") rule, controlled by exponential input and forget gates and a sigmoid output gate. This configuration allows the mLSTM to store and retrieve information using matrix operations, facilitating a more intricate interaction between the inputs and the recurrent network's hidden states.
    One of the most significant innovations of mLSTM is its ability to capture and represent more complex relationships and dependencies within the data. By using a matrix to represent its memory state, mLSTM can encapsulate relationships across multiple dimensions of the input simultaneously, increasing the network's representational power and computational efficiency, especially for tasks involving high-dimensional data such as natural language processing and multivariate time-series analysis. This matrix approach not only deepens the interaction with the data within each cell but also allows the network to model interactions across different features of the data (a minimal code sketch of the update rule follows after the credits below).
    All rights w/ authors:
    xLSTM: Extended Long Short-Term Memory
    arxiv.org/pdf/...
    #airesearch
    #ai
    #newtechnology
  • Science
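
For the technically curious, here is a minimal NumPy sketch of a single mLSTM recurrence step, following my reading of the update rule in the xLSTM paper. The function and variable names are my own; the query/key/value projections and the gate values are assumed to be produced upstream by learned linear layers (omitted here).

    import numpy as np

    def mlstm_step(C, n, q, k, v, i_gate, f_gate, o_gate):
        # One mLSTM step (illustrative sketch, not the authors' code).
        # C: (d, d) matrix memory; n: (d,) normalizer state.
        # q, k, v: (d,) query/key/value projections of the current token.
        # i_gate, f_gate: scalar input/forget gates (the input gate is
        # exponential in the paper); o_gate: (d,) sigmoid output gate.
        d = k.shape[0]
        k = k / np.sqrt(d)  # key scaling, as in attention
        # "Accumulated covariance": add the outer product of value and key.
        C = f_gate * C + i_gate * np.outer(v, k)
        n = f_gate * n + i_gate * k
        # Read from the matrix memory with the query; the normalizer term
        # keeps outputs stable despite the unbounded exponential gate.
        h_tilde = C @ q / max(abs(n @ q), 1.0)
        return C, n, o_gate * h_tilde

    # Toy usage with random projections:
    d = 4
    C, n = np.zeros((d, d)), np.zeros(d)
    rng = np.random.default_rng(0)
    for _ in range(3):
        q, k, v = rng.normal(size=(3, d))
        C, n, h = mlstm_step(C, n, q, k, v,
                             i_gate=np.exp(0.5), f_gate=0.9, o_gate=np.ones(d))

Note the contrast with attention: the per-step state (C, n) has a fixed size, so inference cost does not grow with sequence length, and because the memory update has no hidden-to-hidden nonlinearity, training can be parallelized, as discussed in the paper.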

Comments • 12

  • @first-thoughtgiver-of-will2456 3 months ago +1

    this just makes me want to innovate off mamba

  • @propeacemindfortress 4 months ago +1

    nice, my favorite timeseries staple gets an upgrade 😄
    awesome find, and big big thanks for sharing

  • @wiktorm9858 4 months ago +1

    Is there a ready-made pytorch implementation of this?

  • @denishclarke4470 3 months ago

    Hey, please provide the slides

  • @davidhauser7537 4 months ago

    very cool

  • @timothywcrane 4 months ago

    I hope this resets the audio industry as well. LSTMs are great for melody prediction etc... I wonder how this new modeling will be applicable and expandable in scope.

    • @Dom-zy1qy 4 months ago

      I haven't had much luck creating a good model to predict melodies. Any resources you recommend?

    • @timothywcrane 4 months ago

      @Dom-zy1qy check out @ValerioVelardoTheSoundofAI

  • @thedoctor5478 4 months ago

    woh woh. did you forget to say a little something at the beginning of the video?

    • @thomasmitchell2514 4 months ago +1

      Hahaha my wife rolls her eyes when I say it along with him after gleefully clicking on a new upload 😅
      Also I can’t help echoing “beautiful” out loud even with headphones on 😂

    • @JonathanYankovich 4 months ago

      He said it :)

    • @和平和平-c4i 4 months ago

      @thomasmitchell2514
      What are you all talking about? What is the funny part? All I see is machine learning stuff...