OpenAI CLIP Embeddings: Walkthrough + Insights

  • Published: 8 Sep 2024

Comments • 6

  • @johntanchongmin
    @johntanchongmin  5 months ago

    At 58:22, the weights W_i and W_t are the projections from the image model output and the text model output into the shared embedding space (which allows a change in embedding dimension). This makes it possible to use more generic text and image models with different output dimensions, since both can be mapped to the same embedding dimension.
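
    A minimal sketch of these projections, assuming hypothetical output and embedding dimensions (the variable names and sizes below are illustrative, not taken from the paper):

    ```python
    import torch
    import torch.nn.functional as F

    # Hypothetical sizes: the image and text encoders may have different output dimensions.
    d_image, d_text, d_embed = 768, 512, 256

    # W_i and W_t project each encoder's output into the shared embedding space.
    W_i = torch.randn(d_image, d_embed)
    W_t = torch.randn(d_text, d_embed)

    image_features = torch.randn(8, d_image)  # batch of image encoder outputs
    text_features = torch.randn(8, d_text)    # batch of text encoder outputs

    # Project into the shared space and L2-normalise so dot products become cosine similarities.
    image_embed = F.normalize(image_features @ W_i, dim=-1)
    text_embed = F.normalize(text_features @ W_t, dim=-1)

    print(image_embed.shape, text_embed.shape)  # both (8, 256)
    ```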

  • @johntanchongmin
    @johntanchongmin  5 months ago

    For the loss function at 1:00:15, they use Cross Entropy Loss with unnormalised logits as the input: the cosine similarity matrix is scaled by an exponential of the learnable temperature term t. That is why the resultant cosine similarity matrix needs to be multiplied by this exponential term. Inside the Cross Entropy Loss function, each exponentiated logit is then divided by the sum of the exponentiated values of all the other input terms (i.e. normalised by a softmax). See pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html for details. (A code sketch of this loss follows the reply below.)

    • @johntanchongmin
      @johntanchongmin  5 months ago

      CLIP's loss function has also been described as InfoNCE loss, a common loss term for contrastive learning.
      See builtin.com/machine-learning/contrastive-learning for details.
      It is essentially Cross Entropy over cosine similarity terms, which is what is done in CLIP.
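
      A minimal sketch of this symmetric cross-entropy / InfoNCE loss over temperature-scaled cosine similarities (the function name, shapes, and toy values are illustrative, not taken from the official CLIP code):

      ```python
      import torch
      import torch.nn.functional as F

      def clip_style_loss(image_embed, text_embed, log_temperature):
          # image_embed, text_embed: (N, d), assumed already L2-normalised,
          # so image_embed @ text_embed.t() is an N x N cosine similarity matrix.
          logits = torch.exp(log_temperature) * (image_embed @ text_embed.t())  # unnormalised logits
          labels = torch.arange(logits.size(0), device=logits.device)           # matching pairs lie on the diagonal
          loss_images = F.cross_entropy(logits, labels)      # classify each image against all texts
          loss_texts = F.cross_entropy(logits.t(), labels)   # classify each text against all images
          return (loss_images + loss_texts) / 2

      # Toy usage with random, normalised embeddings.
      img = F.normalize(torch.randn(8, 256), dim=-1)
      txt = F.normalize(torch.randn(8, 256), dim=-1)
      log_temp = torch.tensor(2.6593)  # log(1 / 0.07), the initial value reported in the CLIP paper
      print(clip_style_loss(img, txt, log_temp))
      ```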

  • @johntanchongmin
    @johntanchongmin  5 months ago

    1:07:31 This is a mistake on my end - this is not the ImageNet Supervised Learning model. Li et al. is actually the Visual N-gram model, which predicts n-grams (sequences of n words) for each picture. arxiv.org/pdf/1612.09161.pdf
    Here, I believe they did not re-implement that model (its performance is quite low, at 10+% accuracy on ImageNet); rather, they only borrowed its approach of using the class name text directly, and applied that to CLIP.
    Basically, the paper was misleading - they did not even need to refer to Li et al. for that chart, as the methodology is totally different. It is just CLIP with ImageNet class names and no added prompt engineering.
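
    A hedged sketch of that comparison using the open-source openai/CLIP package (the image path and class list are placeholders): one run scores the raw class names directly, the other wraps them in a simple prompt template, which is the added prompt engineering referred to above.

    ```python
    import torch
    import clip
    from PIL import Image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("ViT-B/32", device=device)

    image = preprocess(Image.open("dog.jpg")).unsqueeze(0).to(device)  # placeholder image
    class_names = ["dog", "cat", "car"]                                # placeholder class names

    # Class names used directly (the Visual N-Grams comparison setting) ...
    text_plain = clip.tokenize(class_names).to(device)
    # ... versus a simple prompt template (CLIP-style prompt engineering).
    text_prompt = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)

    with torch.no_grad():
        for text in (text_plain, text_prompt):
            logits_per_image, _ = model(image, text)
            print(logits_per_image.softmax(dim=-1))
    ```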

  • @Qzariuss
    @Qzariuss 5 months ago +1

    going to try this tomorrow

  • @johntanchongmin
    @johntanchongmin  5 months ago

    The Jupyter Notebook code can be found here if you want to run your own experiments too:
    github.com/tanchongmin/TensorFlow-Implementations/tree/main/Paper_Reviews/CLIP/CLIP%20Code