I read other people's tutorials regarding this topic and by far this is the best and easiest tutorial on bleu score. Thanks a lot.
When it comes to translation, there can be more than one correct answer. The BLEU (BiLingual Evaluation Understudy) score measures how good a translation is by comparing it with reference translations provided by actual people. Modified precision is used to calculate the BLEU score.
Modified precision (word by word) = sum of clipped counts / total number of words in the candidate translation, where each word's count in the candidate is clipped at the maximum number of times it appears in any single reference.
Modified precision on bigrams is where you take two consecutive words at a time (like a sliding window) and then calculate using the same formula (but for two-word phrases this time).
The same goes for n-grams.
If the output is exactly equal to one of the references, all modified precision values (for 1, 2, ..., n-grams) = 1.0.
Combined BLEU score = BP * exp(sum of log(p_k) / n), where k goes from 1 to n and BP = brevity penalty (it penalizes translations that are too short, because short translations have a higher chance of getting high modified precision scores).
BP = 1 if output (machine translation) length > reference (human translation) length.
BP = exp(1 - (reference length / machine translation length)) otherwise.
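The summary above can be sketched in Python. This is a minimal illustration per the original Papineni et al. (2002) formulas (geometric mean via logs, BP = exp(1 - r/c)), not the NLTK or sacrebleu implementation; all function and variable names are my own:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All consecutive n-word phrases in a token list (the 'sliding window')."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    """Clipped n-gram precision: each candidate n-gram counts at most as many
    times as it appears in any single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())

def bleu(candidate, references, max_n=4):
    """BP * exp(mean of log precisions), i.e. BP times the geometric mean."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # log(0) is undefined; real toolkits smooth this instead
    c = len(candidate)
    r = min(len(ref) for ref in references)  # shortest reference length
    bp = 1.0 if c > r else math.exp(1 - r / c)
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

cand = "the cat is on the mat".split()
refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
print(bleu(cand, refs))  # exact match with a reference -> 1.0
```

Note the exact-match case from the summary: all modified precisions are 1.0, c equals r, so BP = exp(0) = 1 and the score is 1.0.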
Thank you and also your voice is so calming
There is an error: BP should be exp(1 - ref_output_length / MT_output_length).
Please upload the full series. Eagerly waiting.
Looking forward to your upload of full series of sequence models~
I think the brevity penalty factor has to be
exp(1 - reference_output_length / MT_output_length) if MT_output_length <= reference_output_length.
Yes, this is a typo. The original paper also sums over log(p) to keep BLEU between 0 and 1.
According to the original paper, shouldn't the BP in the "otherwise" condition be exp(1 - reference_output_length / MT_output_length)?
Yes. It was corrected later in the Coursera course.
Please upload the full series !!!
Thank you.
Great video, but there's an error I've seen in every resource I've looked at. Had to find out from reading the original paper. Cumulative Bleu score = BP × exp( 1/n x sum(log(Pn))).. the log is an important difference. Video was great tho! I've seen like 5 resources that seem to have left the log out
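To illustrate the difference the log makes, here is a hypothetical numeric example (the precision values are made up, not from the video):

```python
import math

# hypothetical modified precisions p1..p4 and a brevity penalty of 1
ps = [0.8, 0.6, 0.4, 0.3]
bp = 1.0

# without the log (the erroneous formula): can exceed 1, so it's not a valid score
without_log = bp * math.exp(sum(ps) / len(ps))

# with the log (the paper's formula): the geometric mean of the precisions
with_log = bp * math.exp(sum(math.log(p) for p in ps) / len(ps))

print(without_log)  # ~1.69, outside [0, 1]
print(with_log)     # ~0.49, equal to (0.8 * 0.6 * 0.4 * 0.3) ** 0.25
```

Since exp(mean of logs) is exactly the geometric mean, the log version is guaranteed to stay in [0, 1] whenever each precision does.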
Lectures L03 and L04 are missing from this week's playlist.
Will the full series of Sequence Models be uploaded soon?
did you find it ?
Is the whole course in Python or Octave?
thank you andrew.
At 8:14 there is a mistake: the count clip for "the mat" should be 2, shouldn't it?
I'm afraid not. It should be the maximum appearance in any one reference sentence.
Brevity penalty factor:
If len(MT_output) == len(ref_output),
then exp(1 - m/r) also equals 1, right?
Yes. But the equation has a typo. You should swap numerator and denominator.
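A quick check of the brevity penalty with the numerator and denominator the right way around (a sketch; `c` and `r` are my shorthand for candidate and reference lengths):

```python
import math

def brevity_penalty(c, r):
    """BP = 1 if the candidate is longer than the reference,
    else exp(1 - r/c), per Papineni et al. (2002)."""
    return 1.0 if c > r else math.exp(1 - r / c)

print(brevity_penalty(10, 7))  # c > r  -> 1.0, no penalty
print(brevity_penalty(7, 7))   # c == r -> exp(0) = 1.0, as discussed above
print(brevity_penalty(5, 10))  # short candidate is penalized: exp(-1) ~ 0.37
```

With the fraction swapped (c/r), equal lengths would still give 1, but short translations would be rewarded instead of penalized, which is exactly backwards.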
Why only up to p_4 n-gram, if there are 6 words in reference #1? Up to p_5 is better, no?
There's an awful lot of 5-grams! Papineni et al (2002) states "... as can be seen in Figure 2, the modified n-gram precision decays roughly exponentially with n...", so I expect that 5-grams are a pain to calculate, and they don't add much precision to the score.
Kindly upload full series
please upload full series of sequence models. waiting for it.
ruclips.net/p/PLBAGcD3siRDittPwQDGIIAWkjz-RucAc7
Hey thanks for the reply
You can do the whole course with the Jupyter Notebook projects and quizzes and get a certification on Coursera.
@@adamishay808 The playlist does not exist.
thank you . we need more example
Thanks for uploading this series... please also add a course about natural language processing, which Prof. Andrew mentioned in the last part of the full deep learning series on Coursera :)) Thanks
wtf is wrong with your english??
@@Ahmed-fj5jq No, it's not... I recommend you buy it. It's worth every penny you spend.
"on the" appears twice?
Sentence 1 = 1 "on the", sentence 2 = 1 "on the".
For unigrams, he mentioned that we take the maximum count across references.
Thus, the clipped count for "on the" = 1.
@@jackyangara1 "On the" is bigram right?
@@BalaguruGupta yes
For the clipped count, we take the maximum occurrence of the n-gram in any single reference, not the total count summed across the references.
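The clipping rule discussed in this thread can be shown with a small `Counter`-based sketch (the sentences are illustrative, not the exact ones from the video):

```python
from collections import Counter

def bigrams(tokens):
    """Consecutive two-word phrases."""
    return [tuple(tokens[i:i + 2]) for i in range(len(tokens) - 1)]

refs = ["the cat is on the mat".split(),
        "there is a cat on the mat".split()]
candidate = "on the on the cat".split()

cand_counts = Counter(bigrams(candidate))  # ("on", "the") appears twice
# clip each candidate bigram at its max count in any single reference
clip = {g: max(Counter(bigrams(r))[g] for r in refs) for g in cand_counts}

# each reference contains "on the" once, so its clipped count is 1, not 2:
print(clip[("on", "the")])  # -> 1
```

So even though the candidate repeats "on the", it only gets credit for it once, because no single reference contains it more than once.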
There is a high-pitched sound in the audio. It is so annoying.
0:45 Reference 2 is not perfectly fine.
1:59 Didn't know that he uses Slack
I thought my Slack gave me a notification
2:40
Is there a Chinese version of this video by Andrew Ng (老吴)? I really can't understand this one.
I think it's great! Very well explained.
One of the most boring lecturers I've ever seen! He's great, though.
Nice one, you made me laugh, idiot!
Poorly explained, and the formula is wrong. Andrew is overrated.
Thank you so much, or else my calculation would have been wrong.