Wondering how you can fine-tune LLMs? Take a look here to see how this is done with LoRa, a popular fine-tuning mechanism: ruclips.net/video/CNmsM6JGJz0/видео.html
Video mistakes:
- At 2:30 the sum should be over j, not over i (see the sketch below). Thanks @mriz for noticing this!
- The probability distribution after selecting the top-3 words at 4:10 is not accurate; the correct values are sunny - 0.46, rainy - 0.38, the - 0.15. Thanks @koiRitwikHai for noticing this!
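For reference, a minimal sketch of the corrected formula, assuming the standard temperature-scaled softmax where the denominator sums over every index j in the vocabulary:

```python
import numpy as np

def temperature_softmax(logits, temperature=1.0):
    """p_i = exp(z_i / T) / sum_j exp(z_j / T) -- the sum in the denominator runs over j."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()          # subtract the max for numerical stability
    exps = np.exp(scaled)
    return exps / exps.sum()        # denominator: sum over all j, i.e. the whole vocabulary

print(temperature_softmax([2.0, 1.0, 0.5], temperature=0.7))
```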
Great introduction with a clear and simple explanation/illustration. Thanks!
Thanks! Glad you found it helpful! :)
This is a really clear explanation of this concept. Loved it. Thanks!
Thanks! Happy to hear that you liked the explanation! :)
Thanks! Top p and Top k were easy to understand.
You're welcome! I'm glad to hear that those concepts were clear and easy to understand. If you have any more questions or need further clarification on this topic, feel free to ask! :)
video helped me a lot! thanks
Glad it helped! :)
Hello, in top-p, which of the 4 words will be chosen? Is it picked randomly between "sunny", "rainy", "the" and "good"?
Yes, it's random according to their distribution.
@@datamlistic so they are randomly selected, but more probable words have a higher chance of being selected?
@@Annaonawave exactly :)
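In case it helps to see the mechanics, here is a minimal top-p (nucleus) sampling sketch; the words and probabilities below are made-up illustration values, not the exact ones from the video:

```python
import numpy as np

def top_p_sample(words, probs, p=0.9, rng=None):
    """Keep the smallest set of most-probable words whose cumulative probability
    reaches p, renormalize them, and sample one word from that distribution."""
    rng = rng or np.random.default_rng()
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]                      # most probable first
    cutoff = np.searchsorted(np.cumsum(probs[order]), p) + 1
    kept = order[:cutoff]                                # indices of the kept words
    kept_probs = probs[kept] / probs[kept].sum()         # renormalize over the kept words
    return words[rng.choice(kept, p=kept_probs)]

words = np.array(["sunny", "rainy", "the", "good", "cat"])
probs = [0.40, 0.30, 0.15, 0.10, 0.05]                   # illustrative values only
print(top_p_sample(words, probs, p=0.90))
```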
Thank you for the video and the explanation of the three types of sampling for LLMs. When sampling with Temperature, Top-K and Top-P, are you using or enabling all three sampling methods at the same time?
For example, if I chose to do Top-K sampling for controlled diversity and reduced nonsense, does that mean that I will choose a low temperature as well?
Glad it was helpful! Yes, you can combine multiple sampling methods at the same time. :)
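For anyone wondering what combining them looks like in code, here is a minimal sketch using the Hugging Face transformers generate() API; "gpt2" and the parameter values are just placeholders I picked for illustration, not settings recommended in the video:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "gpt2" is used here only as a small, widely available placeholder model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The weather today is", return_tensors="pt")
outputs = model.generate(
    **inputs,
    do_sample=True,       # sample instead of always picking the most probable token
    temperature=0.7,      # reshape the distribution (lower = more conservative)
    top_k=50,             # keep only the 50 most probable tokens
    top_p=0.9,            # then keep the smallest set with cumulative probability >= 0.9
    max_new_tokens=20,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```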
Let's say I use top_k=4, does the model sample 1 word out of the 4 most probable words randomly? If not, what happens?
That's exactly what happens! The model samples 1 word out of the 4 most probable words, according to their distribution (i.e. the higher the probability of a word, the more likely it is to be sampled).
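A minimal sketch of that step, assuming top_k=4 and some made-up word probabilities (not the numbers from the video):

```python
import numpy as np

rng = np.random.default_rng()

words = np.array(["sunny", "rainy", "the", "good", "cat", "blue"])
probs = np.array([0.35, 0.25, 0.15, 0.10, 0.10, 0.05])   # illustrative values only

k = 4
top_idx = np.argsort(probs)[::-1][:k]                # indices of the 4 most probable words
top_probs = probs[top_idx] / probs[top_idx].sum()    # renormalize over those 4 words
chosen = rng.choice(top_idx, p=top_probs)            # more probable -> more likely to be picked
print(words[chosen])
```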
Hi there, this was a great introduction. I am working on a recommendation query using Gemini; would you be able to help me fine-tune for the optimal topK and topP? I am looking for an expert in this to be an advisor to my team.
Unfortunately my time is very tight right now since I am working full time as well, so I can't commit to anything extra. I could however help you with some advice if you can provide more info.
The probability distribution you get after selecting top-3 words at 4:10 is not accurate. The probabilities, after normalizing the 3-word-window, should be sunny-0.46, rainy-0.38, and the-0.15.
Yep, that's correct. Thanks for the feedback! I created/recorded the video over a longer period of time and it seems that I used two versions of the numbers in doing that (and forgot to make any updates). I'm sorry if this has caused any confusion. I will add some corrections about this issue in the description/pinned comment.
P.S. Maybe it would be a good idea to round up one of the probabilities you enumerated, so they sum up to 1.
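To spell out the renormalization arithmetic: assuming the pre-selection probabilities were roughly 0.30, 0.25 and 0.10 (my guess for illustration, not confirmed numbers from the video), the corrected values fall out like this:

```python
# assumed pre-selection probabilities, for illustration only
top3 = {"sunny": 0.30, "rainy": 0.25, "the": 0.10}

total = sum(top3.values())                       # 0.65
normalized = {word: p / total for word, p in top3.items()}
print({word: round(p, 2) for word, p in normalized.items()})
# -> {'sunny': 0.46, 'rainy': 0.38, 'the': 0.15}
```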
2:30
Bro, you're wrong, the sum is not over i, but over j.
Yep, that's correct. Thanks for the feedback and sorry if this confused you! I will add a note about this mistake in the pinned comment. :)