Cosine Similarity | Natural Language Processing | Socratica

  • Published: 29 Sep 2024

Comments • 15

  • @Socratica
    @Socratica  7 months ago +3

    Introducing Socratica COURSES
    www.socratica.com/collections

  • @MakeDataUseful
    @MakeDataUseful 7 months ago +5

    Fantastic video, great to see another Socratica video in my feed

  • @jagadishgospat2548
    @jagadishgospat2548 7 months ago +3

    Keep em coming, the courses are looking good too.

  • @Insightfill
    @Insightfill 7 months ago +2

    This is phenomenal! Here I was, thinking we were just going to talk about the small-angle approximation (cos a ≈ 1) from trig. Bonus!

    • @Socratica
      @Socratica  7 months ago +3

      It was a fun surprise to learn about this technique 💜🦉

    • @Insightfill
      @Insightfill 7 months ago +2

      @Socratica It's fun when you hear of similar analysis being done to uncover ghostwriters or shared authorship. Shakespeare, Rowling, and The Federalist Papers all come to mind.

  • @juanmacias5922
    @juanmacias5922 7 months ago +2

    So cool; being able to find similarities between books from neighboring time periods was fascinating.

    • @Socratica
      @Socratica  7 months ago +3

      It really makes us curious about a lot of the more recent writers: could you use this to find out which older writers influenced them?

  • @AndrewMilesMurphy
    @AndrewMilesMurphy 5 months ago

    That's a very intuitive and helpful explanation, thank you. But pray tell, prithee even, is not some relationship between words in individual sentences what we would prefer (smaller angles)? It seems odd to me that when creating embeddings we're focused on these huge arcs rather than the smaller arcs that build understanding at a more basic level. The threshold for AI in GPT-3 seems to have been a huge amount of text, but isn't there some way to make that smaller? For most of us, that's the only way we can even contribute, as we just don't have the computer hardware.
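
    As a rough illustration of the angle idea (a toy sketch, not from the video: the sentences and the `cosine_similarity` helper below are made up), two short texts that share more words have word-count vectors separated by a smaller angle, so their cosine comes out closer to 1:

    ```python
    # Toy example: cosine similarity between two short texts via word-count vectors.
    # More shared words -> smaller angle -> value closer to 1.
    from collections import Counter
    import math

    def cosine_similarity(text_a, text_b):
        a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
        dot = sum(a[w] * b[w] for w in set(a) | set(b))
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b)

    print(cosine_similarity("the cat sat on the mat", "the cat lay on the mat"))   # high
    print(cosine_similarity("the cat sat on the mat", "stock prices fell today"))  # 0.0
    ```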

  • @ahmedouerfelli4709
    @ahmedouerfelli4709 6 months ago

    I don't like removing "stop words" from the statistics, because their frequency is still meaningful. Even though everybody uses the word "the" frequently, some use it much more than others, and that is a characteristic that should not be ignored.
    So instead, I would suggest performing some kind of normalization, such as dividing each word count by the average occurrence rate of that particular word in natural language.
    Instead of raw word counts, the vector coordinates would then be the relative use rate of each word in the book compared to its average use rate in general language.
    That would make for a much more precise comparison, because it is not only stop words that are very common; some words are inherently much more common than others.
    Although I have not run the experiment, I suspect that this way everything would have a much lower cosine similarity.
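
    A minimal sketch of this normalization idea (everything here is hypothetical: the `REFERENCE_RATE` table, the word counts, and the helper names are invented for illustration, not taken from the video):

    ```python
    # Sketch of the suggested normalization: compare each word's rate in a book to a
    # (hypothetical) general-English rate, then take cosine over those ratios.
    import math

    # Invented reference rates: expected occurrences per 1,000 words of general text.
    REFERENCE_RATE = {"the": 60.0, "of": 30.0, "whale": 0.01, "ship": 0.1}

    def normalized_vector(word_counts, total_words):
        """Coordinate = (occurrences per 1,000 words in this book) / general rate."""
        vec = {}
        for word, count in word_counts.items():
            general = REFERENCE_RATE.get(word)
            if general:                     # skip words with no reference rate
                vec[word] = (count / total_words * 1000.0) / general
        return vec

    def cosine(u, v):
        dot = sum(u[w] * v[w] for w in set(u) & set(v))
        norm_u = math.sqrt(sum(x * x for x in u.values()))
        norm_v = math.sqrt(sum(x * x for x in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    # Made-up word counts for two books.
    book_a = normalized_vector({"the": 1200, "whale": 40, "ship": 15}, total_words=20000)
    book_b = normalized_vector({"the": 1100, "of": 600, "ship": 5}, total_words=18000)
    print(cosine(book_a, book_b))
    ```

    As the comment predicts, dividing out the general-language rates would likely pull the similarities down, since ubiquitous words no longer dominate the dot product.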

  • @OPlutarch
    @OPlutarch 6 months ago

    Very useful info, and the approach was excellent. Very fun, too.

  • @danielschmider5069
    @danielschmider5069 7 months ago

    Pretty good, but the visualization of the results could have used something other than a table. That way, you wouldn't have to explain why the diagonal is 1 and why every number appears twice (mirrored across the diagonal). You'd end up with just 45 rather than 100 data points, and could then compare the "top 10" across the different measurements. This would be much easier to follow.

    • @Socratica
      @Socratica  7 months ago +3

      Interesting!! We'd love to see a sketch of what you have in mind!
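
      One way such a view could look (a hypothetical sketch: the book titles and similarity values below are random placeholders, not results from the video) is to keep only the 45 pairs above the diagonal and sort them:

      ```python
      # Keep the 45 unique book pairs from a symmetric 10x10 similarity matrix and
      # print the top 10. Titles and numbers here are random placeholders.
      import itertools
      import numpy as np

      titles = [f"Book {i}" for i in range(1, 11)]
      rng = np.random.default_rng(0)
      sim = rng.uniform(0.3, 0.9, size=(10, 10))   # stand-in for the real cosine matrix
      sim = (sim + sim.T) / 2                      # make it symmetric
      np.fill_diagonal(sim, 1.0)                   # every book matches itself exactly

      pairs = [(sim[i, j], titles[i], titles[j])
               for i, j in itertools.combinations(range(10), 2)]   # 45 pairs, no diagonal
      for score, a, b in sorted(pairs, reverse=True)[:10]:
          print(f"{a} vs {b}: {score:.3f}")
      ```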

    • @cryptodashboard1173
      @cryptodashboard1173 2 months ago

      @Socratica Please upload more videos on AI and machine learning.

  • @orangeinfotainment620
    @orangeinfotainment620 6 months ago

    Thank you