Many-Shot VISUAL ICL is amazing! (Stanford)

  • Published: 20 May 2024
  • Many-shot visual in-context learning (ICL) is amazing! Especially when working with ICL+ (1 million token context length) like Gemini 1.5 Pro; also already tested for GPT-4o.
    An amazing alternative to fine-tuning VLMs and LLMs.
    A new study by Stanford University shows the potential of new long-context VLMs, also with regard to visual information (images). Tests include up to 1000 images in a single prompt, with batched queries, and the models perform!
    Multimodal many-shot in-context learning is tested at extreme context lengths (1 million tokens and more), using the complete length of the prompt.
    The study establishes that multimodal foundation models can effectively leverage many-shot ICL, showing substantial performance gains and efficiency improvements. This paves the way for enhanced adaptability and accessibility of large multimodal foundation models in practical applications.
    All rights w/ authors:
    Many-Shot In-Context Learning
    in Multimodal Foundation Models
    arxiv.org/pdf/2405.09798
    #airesearch
    #ai
    #visual
  • Science
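The many-shot setup described above (hundreds of labeled image demos followed by a batch of queries in one long-context request) can be sketched as below. The prompt structure and the `{"type": "image", ...}` part format are illustrative assumptions, not the paper's code or any provider's actual API.

```python
# Hypothetical sketch of a many-shot multimodal ICL prompt with batched
# queries: interleave (image, label) demos, then append several unlabeled
# query images so one long-context request answers many queries at once.

def build_many_shot_prompt(demos, queries):
    parts = ["You will see labeled examples, then answer the queries."]
    for i, (image, label) in enumerate(demos, 1):
        parts.append({"type": "image", "data": image})  # demo image
        parts.append(f"Example {i} label: {label}")     # its label
    for j, image in enumerate(queries, 1):
        parts.append({"type": "image", "data": image})  # query image
        parts.append(f"Query {j}: what is the label?")
    return parts

# Usage: 3 demos and a batch of 2 queries -> one interleaved request.
demos = [(b"img0", "cat"), (b"img1", "dog"), (b"img2", "cat")]
prompt = build_many_shot_prompt(demos, [b"q0", b"q1"])
```

In the paper's largest settings the demo list would hold up to ~1000 images; batching queries amortizes the cost of that shared demo context.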

Comments • 11

  • @propeacemindfortress
    1 month ago +1

    ohhh I can imagine a lot 😂
    Great presentation, looking forward to the next series.

  • @fabriciot4166
    1 month ago +1

    Excellent channel! I totally agree with your last observation. We seem to have lost sight of the fact that you could take the most "intelligent" or expert person in the world in a given science/discipline, and they would still be far from knowing everything about everything. I think that once an LLM learns enough about language, its relationships, and a bit more, it doesn't seem very natural to keep "pushing" training data into it covering all the information circulating out there. Consider a "simple" use case (even though, despite the power of today's models, it is still not possible to have "enough" confidence) such as a customer service assistant: it must clearly know how to address the client correctly (or with the personality most appropriate to the use case), it must manage a simple but stable dialogue (without hallucinations or oddities), and the rest, "the fine task" of the assistant, will be something very specific. And, on a more philosophical note if you like, you can't have everything: either you have something quite good in general but not so good on specific issues, or you have something very good on a specific issue (an expert) that is only average, like most of us, in general knowledge. Excellent videos; I get a little lost in some of them, but most are understood and enjoyed a lot. Thank you, a big hug.

  • @DewEfresh
    1 month ago +1

    I'm interested in pre-training as well. What technique would you use? If one of the goals is to be as efficient as possible, you could possibly use ReLoRA, GaLore, or FSDP QDoRA. If fine-tuning is just an extension of pre-training (with slight differences), these could all be options. You could also throw 1.58-bit LLMs into the mix, which could be trained at FP8.
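The methods this comment names (ReLoRA, GaLore, QDoRA) all build on low-rank ideas for parameter-efficient training. A minimal NumPy sketch of the underlying low-rank-adapter trick, purely illustrative and not any of those libraries' actual APIs:

```python
import numpy as np

# Low-rank adapter sketch: freeze the pretrained weight W and train only
# two small factors A and B, so the effective weight is W + A @ B.
rng = np.random.default_rng(0)
d, r = 1024, 8                          # hidden size, adapter rank
W = rng.standard_normal((d, d))         # frozen pretrained weight
A = rng.standard_normal((d, r)) * 0.01  # trainable down-projection
B = np.zeros((r, d))                    # trainable up-projection, init 0

def forward(x):
    # Only A and B would receive gradients during adaptation.
    return x @ (W + A @ B)

x = rng.standard_normal((1, d))
# With B initialized to zero, the adapter starts as an exact no-op.
assert np.allclose(forward(x), x @ W)

# Trainable parameter count drops from d*d to 2*d*r.
full, lora = d * d, 2 * d * r
```

With d = 1024 and rank 8 this trains roughly 1.6% of the full matrix's parameters, which is why such methods make fine-tuning (and, per ReLoRA, even pre-training) far cheaper.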

  • @norman9174
    1 month ago

    good video

  • @kenchang3456
    1 month ago +4

    Interesting, but (and I'm no expert) for ICL, wouldn't you have to re-evaluate the context every LLM session, whereas with fine-tuning that context is essentially baked into the fine-tuned model? I think ICL would be more flexible in accommodating changes to the context, but I wonder what the token cost would be when ICL requires a very large number of tokens and that context is re-evaluated every session.
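The token-cost concern in this comment can be made concrete with back-of-envelope arithmetic. All numbers below are illustrative assumptions (not measurements from the paper), but they show why the paper batches queries against one shared demo context:

```python
# Assumed costs: ~1000 image demos at ~260 tokens per image, one query
# image per question, 100 questions to answer in total.
demo_tokens = 1000 * 260   # many-shot demo context
query_tokens = 260         # one query image
sessions = 100

# Naive ICL: the full demo context is re-sent and re-evaluated per query.
icl_total = sessions * (demo_tokens + query_tokens)

# Batched queries: 50 queries share one demo context per request,
# so the demo context is only paid for sessions // batch times.
batch = 50
batched_total = (sessions // batch) * demo_tokens + sessions * query_tokens
```

Under these assumptions, batching cuts the total from about 26M tokens to about 0.55M, a ~47x saving, though fine-tuning still wins once the same fixed context is reused over enough queries.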

    • @marvinkunz843
      1 month ago +3

      It is true that ICL increases the number of tokens required.
      But at the same time, it allows for dynamic selection of examples and can be handled much more flexibly than fine-tuning. Medprompt had a very interesting technique for example selection that added a lot to it, in my opinion.

    • @TheReferrer72
      1 month ago +1

      @@marvinkunz843 Yep, I think the key to these findings is that it's a much quicker and more flexible alternative to fine-tuning the model.
      You can see from the graphs displayed that they had lots of fun varying the batch sizes.
      This is an amazing video.

  • @propeacemindfortress
    1 month ago +1

    For your last question... sending data to SF is a bad idea...

  • @pensiveintrovert4318
    1 month ago

    Who is really doing the work? You, with your many examples, or the LLM? You might as well just give it the answer.

    • @code4AI
      1 month ago

      You just discovered the phenomenon of overfitting, where the LLM learns the answers rather than the solution path itself. That is exactly why we have to be really careful with fine-tuning or ICL+, so we do not slip into overfitting, which has been a standard topic for the last two years. Thanks for your comment.

    • @pensiveintrovert4318
      1 month ago

      @@code4AI I wasn't really making a point about overfitting. When one writes a paper, one has unlimited time to play around with creating examples for the ICL. If I have to generate examples for every random query, where is the time saving? It is no longer the general solution an LLM is supposed to be.