GPT-2 (the basics for understanding GPT-3)
- Published: Feb 6, 2025
- GPT-3 is a highly capable NLP deep learning model. In order to understand GPT-3 or its later versions, we should first understand its foundation, GPT-2. I cover how GPT-2 achieved zero-shot learning and set high scores on multiple NLP benchmarks, and give examples of how GPT-2, as a single model, can perform multiple NLP tasks without fine-tuning.
3:49 So the GPT-2 shown here is actually GPT-1, right..? Thanks for the easy explanation.
Hello Sir,
if we take a question similarity task, the input to BERT is:
CLS token + Question 1 + SEP + Question 2 + SEP
I read that the input to GPT-2 is:
Question 1 + Question 2 + CLS token.
Is this correct?
If yes,
should we use the CLS token to represent the input for classification, as we do in BERT?
Thanks for a good question. BERT and GPT are different, and the CLS token only exists in BERT. You can use the last token from GPT for classification, but the result may be worse than BERT's. The GPT-2 research paper does not cover a question similarity task, so there is no official answer to your question, and I honestly don't know what to expect from GPT for the input q1 + q2 + special token. I think a possible solution for question similarity is to use good sentence embeddings and check their similarity, or to use a Siamese network to see if the pair is similar. I hope this answer at least gives some direction for your question.
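For what it's worth, here is a minimal sketch of the sentence-embedding route mentioned above, assuming the sentence-transformers library; the checkpoint name and the 0.7 threshold are illustrative choices, not something from this thread:

```python
# Minimal sketch of question similarity via sentence embeddings.
# Assumes the sentence-transformers library; checkpoint and threshold
# are illustrative assumptions, not from the original thread.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

q1 = "How old are you?"
q2 = "What is your age?"

# Encode both questions into fixed-size sentence embeddings.
emb1, emb2 = model.encode([q1, q2], convert_to_tensor=True)

# Cosine similarity close to 1.0 suggests the questions are paraphrases.
score = util.cos_sim(emb1, emb2).item()
print(f"similarity = {score:.3f}, similar = {score > 0.7}")
```

A Siamese setup would instead fine-tune the encoder so that similar pairs land close together under this same cosine score.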
@TheEasyoung Thank you for your prompt reply.
I am not sure about the last token used by GPT-2 for classification. Do you mean the EOS token, or should we append a CLS token to the end of the input after adding it to the vocabulary, then use the representation of this CLS token for classification as is done in BERT?
Will the output in GPT-2 be output[0][:,-1], the output embedding for the last token? In BERT, the [CLS] embedding is output[0][:,0], the output for the first token, which feeds the pooled output.
With regard to padding:
In BERT, the pad token is appended to the end of the input (on the right), whereas in GPT-2 the pad token is placed at the beginning!
Thanks in advance.
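For reference, a rough sketch of the indexing and padding points raised above, assuming the Hugging Face transformers API (the model names are just the standard base checkpoints, not something prescribed in this thread):

```python
# Sketch of pulling a sentence-level representation from GPT-2 vs BERT.
# Assumes the Hugging Face transformers API.
import torch
from transformers import GPT2Tokenizer, GPT2Model, BertTokenizer, BertModel

text = ["how are you", "how are you doing"]

# --- GPT-2: no CLS token, so use the hidden state of the last real token.
gpt2_tok = GPT2Tokenizer.from_pretrained("gpt2")
gpt2_tok.pad_token = gpt2_tok.eos_token   # GPT-2 has no pad token by default
gpt2_tok.padding_side = "left"            # left padding keeps the last real token at index -1
gpt2 = GPT2Model.from_pretrained("gpt2")

batch = gpt2_tok(text, return_tensors="pt", padding=True)
with torch.no_grad():
    out = gpt2(**batch)
gpt2_repr = out.last_hidden_state[:, -1]  # == out[0][:, -1], last-token embedding

# --- BERT: the [CLS] token is the *first* token, position 0.
bert_tok = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

batch = bert_tok(text, return_tensors="pt", padding=True)  # pads on the right
with torch.no_grad():
    out = bert(**batch)
bert_repr = out.last_hidden_state[:, 0]   # == out[0][:, 0], the [CLS] embedding
```

Left padding matters for GPT-2 because with right padding [:, -1] would pick up a pad token instead of the last real word.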
While BERT has a CLS token for classification, GPT-2 doesn't, unless you train it with a CLS token at the end. The more GPT-2-like way is to format each pair as plain text with the label at the end, like the lines below (see the sketch after this reply):
train data 1: how are you, s1, how are you doing, s2, true
train data 2: i am a boy, s1, thanks, s2, false
You will need to make sure you have enough data for generative training if you do it the GPT-2 way.
BERT is pretrained for classification with the CLS token, so BERT should be easy for you to fine-tune and use, whereas you won't find a GPT-2 ready-made for your use case.
I hope this answers your question.
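As a rough illustration of the generative formatting described in the reply above, here is a small sketch; the separator strings (s1, s2) and label words follow the example lines, but the helper function itself is hypothetical:

```python
# Sketch of serializing question pairs into plain text for generative
# GPT-2 fine-tuning, following the format shown in the reply above.
# The helper and exact separators are illustrative assumptions.
def to_training_text(question1: str, question2: str, is_similar: bool) -> str:
    label = "true" if is_similar else "false"
    return f"{question1}, s1, {question2}, s2, {label}"

pairs = [
    ("how are you", "how are you doing", True),
    ("i am a boy", "thanks", False),
]

# Each line becomes one language-modeling example for fine-tuning.
for q1, q2, y in pairs:
    print(to_training_text(q1, q2, y))
```

At inference time you would feed everything up to "s2," and let the fine-tuned model generate the word true or false.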
@TheEasyoung Many thanks for your clarification.