Thank you for watching this video! This was part of my preparation for the AWS Machine Learning Specialty exam.
If you liked this video, check out a related one here:
- NLP with Tensorflow and Keras. Tokenizer, Sequences and Padding (ruclips.net/video/qw7rkwsk0oc/видео.html)
Your IDF was wrong. If IDF is based on "number of docs containing term / total number of docs", the log of that ratio is always less than or equal to 0. IDF must be based on "total number of docs / number of docs containing term".
He probably forgot the inverse part.
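To make the point in this thread concrete, here is a minimal sketch comparing the two ratios. The function names and the 10-document corpus are hypothetical, chosen only for illustration, and the unsmoothed log form is assumed.

```python
import math

def idf(total_docs: int, docs_with_term: int) -> float:
    """Unsmoothed IDF: log(total docs / docs containing the term)."""
    return math.log(total_docs / docs_with_term)

def swapped_idf(total_docs: int, docs_with_term: int) -> float:
    """The ratio with numerator and denominator swapped (not an inverse frequency)."""
    return math.log(docs_with_term / total_docs)

# Hypothetical corpus: 10 documents, the term appears in 2 of them.
correct = idf(10, 2)          # positive: a rarer term gets a higher weight
swapped = swapped_idf(10, 2)  # negative: the swapped ratio can never exceed 1

print(f"log(N / df) = {correct:.3f}")
print(f"log(df / N) = {swapped:.3f}")
```

Since the number of documents containing a term can never exceed the total, the swapped ratio is at most 1 and its log is at most 0, which is the objection raised above.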
Short, precise, and easy-to-understand tutorial. Thanks!
IDF = total number of docs / number of docs containing term
Great video! There's an error, though: IDF = total number of docs / number of docs containing term.
Great video! Thank you, man, for the efficient explanation. I'm from Turkiye. I like your videos.
Thanks for watching! Appreciate your feedback! :)
Who came here from Guruja? We're going to win. SEFAZ, here we come, we're going to pass! Let's go!
Amen!
I think you got the IDF part wrong; the numerator and denominator should be the other way around.
There's an error at 4:29 when you describe the IDF calculation. The 'total number of documents in the corpus' is the numerator, not the denominator. I guess picking an example where the word frequency and the number of documents are not the same number (here, 2) would have helped. Thanks!
People are saying the IDF calculation was wrong? If IDF = N / |{d element of D: t element of d}|, that is, N documents divided by the number of documents that contain the term, then this will obviously give us 2/2. What is wrong here? Some people propose 2/5, but then, why 5? The term "fox" appears 5 times across all documents, that is true, but the total number of documents that contain the term "fox" is still 2.
it is wrong
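The distinction argued over here, between how many times "fox" occurs and how many documents contain it, can be sketched in a few lines. The two documents below are hypothetical stand-ins (the video's exact texts are not reproduced in this thread); they are built only so that "fox" occurs 5 times across 2 documents.

```python
import math

# Hypothetical two-document corpus, already tokenized.
docs = {
    "d1": "the quick fox met a lazy fox".split(),
    "d2": "a fox chased a fox past a third fox".split(),
}
term = "fox"

# Total occurrences of the term across the corpus (not used by IDF).
total_occurrences = sum(tokens.count(term) for tokens in docs.values())

# Document frequency: number of documents containing the term at least once.
document_frequency = sum(1 for tokens in docs.values() if term in tokens)

# Unsmoothed IDF uses document counts only: log(N / df).
idf_fox = math.log(len(docs) / document_frequency)

print(total_occurrences, document_frequency, idf_fox)
```

Under these assumptions the denominator is 2 (documents), not 5 (occurrences), so the ratio is 2/2 and its log is 0.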
wow, clearly the best explanation
Thanks a lot! :)
10q
Thank you for your effort for this content!
In this example, the TF-IDF score doesn't reflect that the word "fox" appears more times in d2.
And therefore it loses that information that could help to distinguish d1 and d2
term frequency does that
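A rough sketch of both sides of this exchange, using a hypothetical two-document corpus in which "fox" is more frequent in d2. Term frequency does differ between the documents, but with the unsmoothed IDF a term present in every document gets weight 0, so the product is 0 for both. The smoothed formula shown is one common variant and is an assumption here, not something taken from the video.

```python
import math

# Hypothetical tokenized documents; "fox" occurs 2 times in d1 and 3 times in d2.
docs = {
    "d1": "the quick fox met a lazy fox".split(),
    "d2": "a fox chased a fox past a third fox".split(),
}
term = "fox"
n_docs = len(docs)
df = sum(1 for tokens in docs.values() if term in tokens)

# Term frequency: occurrences of the term divided by document length.
tf = {name: tokens.count(term) / len(tokens) for name, tokens in docs.items()}

# Unsmoothed IDF is log(2/2) = 0, so TF-IDF collapses to 0 for both documents.
idf_plain = math.log(n_docs / df)
tfidf_plain = {name: value * idf_plain for name, value in tf.items()}

# Assumed smoothed variant: log((1 + N) / (1 + df)) + 1, which stays positive.
idf_smooth = math.log((1 + n_docs) / (1 + df)) + 1
tfidf_smooth = {name: value * idf_smooth for name, value in tf.items()}

print(tf)
print(tfidf_plain)
print(tfidf_smooth)
```

With the smoothed variant the score for d2 stays higher than for d1, so the difference in term frequency is preserved.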
Does TF-IDF still work for optimizing content for better ranking?
I think there is an error in the logarithm part of the IDF calculation. We have a total of 5 occurrences of "fox" in the corpus, so I think it should be log(5/2).
I think it should be log(2/5)
No
"The big D"
Fantastic Explanation !!!
Thank you for feedback! :)
Which software are you using for explaining?
For this tutorial: simple PowerPoint and Camtasia
Great video! Can you share your slides, if possible?
Sadly I don't have slides for that, just this video... :/
Pause the video, take a screenshot. Paste it into PowerPoint. Voila!
sarunas pao religion
great content! thank u!
You forgot to remove stop words and perform lemmatization and stemming before calculating the term frequency, so the entire problem inevitably comes out wrong.
Extremely well explained!
Really appreciate your feedback, thank you for watching! :)
@DataScienceGarage Clear explanation, but it's wrong, dude.
Great video thanks!
Thanks for watching! Hoping it was useful. :)
nice! easy explanation :)
Thanks for watching! :)
Very Helpful thanks
I think the IDF calculation is explained wrongly. It's just the opposite of what he said for the denominator and numerator.
Thank you
Thanks for watching this! :)
Fix your video. In the IDF calculation you swapped the numerator and denominator.
Excellent
Thanks for watching!
The big D
Love from india
Thanks for watching this!
Just be aware that 2 / 2 = 1, not 0 like you hear in the video.
Hi! I have no idea where you saw 2/2=0 in this video... There was log(2/2)=0, which is true.
@DataScienceGarage Check 4:54
@YouPI227 ...but while I said "two divided by two equal to zero", I was pointing to log(2/2)=0. Log(1)=0.
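For anyone following this exchange, a quick check of the arithmetic (variable names are illustrative only):

```python
import math

ratio = 2 / 2                # the plain division gives 1.0
log_ratio = math.log(ratio)  # the logarithm of 1 is 0.0, in any base

print(ratio, log_ratio)
```

So both statements hold: 2 / 2 is 1, and the log of that ratio, which is what the IDF uses, is 0.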
great
your IDF calculation is wrong