Introduction to NLP | How to Train Custom Word Vectors

  • Published: 5 Apr 2020
  • #nlp #word2vec #python
    In the last video, we learned how to use pre-trained word vectors. Here, I show how to train your own word vectors using the gensim library (a short sketch follows the description below).
    For more videos please subscribe -
    bit.ly/normalizedNERD
    Support me if you can ❤️
    www.paypal.com/paypalme2/suji04
    www.buymeacoffee.com/normaliz...
    NLP playlist -
    • Introduction to NLP
    Source code -
    github.com/Suji04/NormalizedN...
    Data source -
    megagonlabs.github.io/HappyDB/
    Facebook page -
    / nerdywits
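    A rough sketch of the gensim training step covered in the video (the toy sentences and hyperparameters below are illustrative, not the notebook's exact code; argument names follow gensim 4.x):

```python
from gensim.models import Word2Vec

# Each sentence is a list of tokens; in the video the corpus comes from HappyDB.
sentences = [
    ["i", "went", "for", "a", "walk", "with", "my", "dog"],
    ["my", "dog", "makes", "me", "happy"],
]

# vector_size / window / min_count are illustrative values (gensim 4.x names;
# older gensim 3.x uses `size` instead of `vector_size`).
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)

# The trained vectors live in model.wv
print(model.wv["dog"][:5])
print(model.wv.most_similar("dog", topn=3))
```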

Comments • 21

  • @shrutiiyyer2783
    @shrutiiyyer2783 2 years ago

    More such videos please! This is much better than the Udemy courses, even the paid ones.

  • @ARSHABBIR100
    @ARSHABBIR100 4 years ago

    Excellent. Thanks for uploading. Kindly make more videos on building a chatbot.

    • @NormalizedNerd
      @NormalizedNerd  4 years ago

      It's on my wish-list too! Keep supporting.

  • @vishnuprabhaviswanathan546
    @vishnuprabhaviswanathan546 2 years ago

    Can you show how to calculate the similarity of 2 words using a custom-trained word2vec?

  • @Lotof_Mazey
    @Lotof_Mazey 1 year ago +1

    Sir, kindly guide: how can I use pre-trained word embedding models for local languages (or languages written in Roman script) that aren't covered by the pre-trained models? Do I have to use an (untrained) embedding layer to create the embedding matrices for a local language? How can I benefit from pre-trained models for a local language?

    • @NormalizedNerd
      @NormalizedNerd  1 year ago +1

      Hi, unfortunately there aren't many pre-trained word embeddings for romanized non-English languages. You can search, and if you find something you can fine-tune it on your data. But I don't think there's an easy way to use English models on romanized non-English languages.
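      If a full gensim Word2Vec model (not just exported vectors) exists for that language, one way to fine-tune it on your own data looks roughly like this (the file name and sentences below are placeholders; gensim 4.x API):

```python
from gensim.models import Word2Vec

# Hypothetical pre-trained model file; fine-tuning needs the full model, not just KeyedVectors.
model = Word2Vec.load("pretrained_word2vec.model")

# Toy romanized local-language sentences (placeholders).
new_sentences = [
    ["aaj", "ka", "din", "acha", "tha"],
    ["mujhe", "yeh", "gaana", "pasand", "hai"],
]

model.build_vocab(new_sentences, update=True)  # add new words to the existing vocabulary
model.train(new_sentences, total_examples=len(new_sentences), epochs=model.epochs)
```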

  • @MrStudent1978
    @MrStudent1978 4 years ago +1

    Very nice explanation! I have a question: in cell no. 50, what is the sense behind "trainable = False"? The video is about training a custom word2vec, so why False?

    • @NormalizedNerd
      @NormalizedNerd  4 years ago +1

      @Gurpreet Singh
      I understand your confusion.
      We actually train our word vectors in cell 46. In cell 50, we are building the embedding layer that will be placed just before the LSTM units. Remember that the embedding layer is nothing but the learned word vectors (in matrix form)!
      So if we set trainable = True on the embedding layer, then Keras will train the embedding layer (i.e. the word vectors) again while performing backprop on the LSTM. We don't want that.
      I hope it's clear to you now.
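      A minimal sketch of that idea (variable names and shapes are illustrative, Keras 2.x-style API, not the notebook's exact code):

```python
import numpy as np
from tensorflow.keras.layers import Embedding

# Rows of embedding_matrix would be the word2vec vectors trained earlier,
# ordered by the tokenizer's word_index (random placeholder here).
vocab_size, vector_dim = 10000, 100
embedding_matrix = np.random.rand(vocab_size, vector_dim)

embedding_layer = Embedding(
    input_dim=vocab_size,
    output_dim=vector_dim,
    weights=[embedding_matrix],  # initialise with the trained word2vec matrix
    trainable=False,             # freeze it so backprop on the LSTM doesn't overwrite the vectors
)
```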

    • @MrStudent1978
      @MrStudent1978 4 years ago

      @NormalizedNerd Thanks for your response! I got it now.

  • @vishnuprabhaviswanathan546
    @vishnuprabhaviswanathan546 2 years ago

    Please show how to custom-train BERT embeddings.

  • @rushikeshkulkarni7758
    @rushikeshkulkarni7758 1 year ago

    Why didn't we use sklearn's train_test_split?

  • @hanjes4793
    @hanjes4793 3 years ago +1

    Hello, I've got a question: in the train/test split cell, where does 'word_index' come from? Thanks.

    • @NormalizedNerd
      @NormalizedNerd  3 years ago

      It's the Keras Tokenizer that gives us the 'word_index'.
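      For reference, a tiny illustration of where word_index comes from (toy sentences, not the notebook's data):

```python
from tensorflow.keras.preprocessing.text import Tokenizer

sentences = ["i felt happy today", "my dog made me happy"]  # toy corpus
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)

print(tokenizer.word_index)  # e.g. {'happy': 1, 'i': 2, 'felt': 3, ...}
```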

  • @WhatsAI
    @WhatsAI 4 years ago

    Great video my friend!

  • @s.m.saifulislambadhon2654
    @s.m.saifulislambadhon2654 4 years ago +1

    Bro, in cell no. 44, what is the purpose of the tokenizer when we already tokenized the sentences into words in the preprocessing part?

    • @s.m.saifulislambadhon2654
      @s.m.saifulislambadhon2654 4 years ago

      Would you please explain cell no. 44 a bit more? I think this is the most important part that I'm missing.

    • @NormalizedNerd
      @NormalizedNerd  4 years ago +1

      Great point! In NLP preprocessing, tokenization makes it easier to clean the text; for that I generally use the nltk library.
      In cell 44, I did the tokenization with the Keras Tokenizer, which gives us two handy tools: word_index & texts_to_sequences. These help us create the tensors easily. So yes, the tokenization is redundant here, but I did it anyway to make our life easier :D
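      A small sketch of those two in action (toy data, not the notebook's exact variables):

```python
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

sentences = ["i felt happy today", "my dog made me happy"]  # toy corpus
tokenizer = Tokenizer()
tokenizer.fit_on_texts(sentences)

sequences = tokenizer.texts_to_sequences(sentences)          # words -> integer ids via word_index
padded = pad_sequences(sequences, maxlen=6, padding="post")  # equal-length input for the LSTM
print(padded.shape)  # (2, 6)
```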

    • @s.m.saifulislambadhon2654
      @s.m.saifulislambadhon2654 4 years ago +1

      Thanks for the explanation

  • @coxixx
    @coxixx 4 years ago +1

    Would you show how to train custom word vectors with GloVe using Python?

    • @NormalizedNerd
      @NormalizedNerd  4 years ago

      That's actually very easy. Just make your corpus (a .txt file), then use the official repo to train a GloVe model on it: github.com/stanfordnlp/GloVe
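      Once the repo has produced its vectors.txt output, the vectors can be loaded back in Python, for example with gensim 4.x (a hedged sketch, not the video's code; the file name follows the repo's demo script):

```python
from gensim.models import KeyedVectors

# GloVe's text output has no header line, hence no_header=True (available in gensim >= 4.0).
glove_vectors = KeyedVectors.load_word2vec_format("vectors.txt", binary=False, no_header=True)
print(glove_vectors.most_similar("happy", topn=5))
```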