BERT Research - Ep. 1 - Key Concepts & Sources

  • Published: 22 Dec 2024

Comments •

  • @zhou7yuan · 4 years ago · +11

    Significance [0:08]
    Research Posts [1:27]
    BERT Mountain [3:02]
    Can we skip over LSTMs? [5:14]
    BERT Paper [5:50]
    BERT Repo [7:15]
    BERT Announcement Post [7:40]
    Attention is All You Need (Transformer) [8:11]
    The Annotated Transformer [8:42]
    Jay Alammar's Posts [10:28]
    Sequence Models on Coursera [11:23]
    Next Up [13:19]

  • @davidz6828 · 5 years ago · +6

    Hi Chris, I read your articles on BERT before and have learned a ton from them. Can't believe you have videos as well. Thanks for sharing the knowledge!

  • @thalanayarmuthukumar5472 · 4 years ago · +6

    A very no-nonsense way of presenting the work you are doing. It felt like I was right there studying with you. Thanks. I am planning to go through the rest of your videos on my journey to learn BERT.

    • @ChrisMcCormickAI · 4 years ago · +2

      Thanks so much Thalanayar! I'm so glad the videos are helping you on your BERT journey! :D

  • @nana-xf7dx · 2 years ago · +1

    Your explanation is super clear, and I like the BERT mountain, which shows what I need to understand first.

  • @jingyingwang767 · 4 years ago · +1

    OMG, that BERT Mountain picture at the beginning is exactly what I've been conceptualizing!! I love this series of videos! Thanks a lot!

  • @AbdelhakMahmoudi · 4 years ago · +11

    Hi Chris, I like the way you explain things. I like visual explanations, and the BERT mountain was "all I need" :D! Thanks a lot.

  • @prakashkafle454 · 3 years ago

    "Token indices sequence length is longer than the specified maximum sequence length for this model (1312 > 512). Running this sequence through the model will result in indexing errors." I get this error message while doing news classification.
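
    A minimal sketch of the usual workaround, assuming the Hugging Face transformers tokenizer (the model name and article text are placeholders): truncate (or chunk) anything longer than BERT's 512-token limit before it reaches the model, so the over-length sequence never triggers the indexing error.

    from transformers import BertTokenizer

    # Hypothetical fix for the warning above: cap every article at 512 tokens.
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

    article = "..."  # a news article that tokenizes to more than 512 tokens

    encoded = tokenizer(
        article,
        truncation=True,       # drop everything past max_length
        max_length=512,        # BERT's maximum sequence length
        padding="max_length",  # pad shorter articles up to 512 (optional)
        return_tensors="pt",
    )
    print(encoded["input_ids"].shape)  # torch.Size([1, 512])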

  • @kooshatahmasebipour690 · 4 years ago

    Feeling so lucky to find your website, resources, and channel. Thanks a lot!

  • @leliaglass1568 · 5 years ago · +8

    Thanks for making this video; I am enjoying the series. I would especially like to see hands-on demos as Jupyter notebooks! :)

  • @vinayreddy8683 · 4 years ago

    Like the way you teach. Not many people are teaching NLP, so it's good to have a person like you.
    Btw, 1000th subscriber.

  • @rabirajbanerjee3872 · 3 years ago

    Awesome series. I have a basic idea of how the attention mechanism works, but this builds on the concepts.

  • @tobiasgiesemann2180 · 4 years ago

    Hi Chris, thanks so much for the video.
    I actually got stuck on the same line in the BERT paper, where it says "we will omit an exhaustive explanation". From there I went down the BERT mountain and finally got to your video, so thanks a lot for picking me up on the way.
    Looking forward to the rest of the series!

  • @viiids · 4 years ago

    I understand RNNs, LSTMs, bidirectional LSTMs, and attention, yet I still found the BERT paper hard to read and had the exact same feeling as the mountain you drew. This video and the subsequent one are making me much more confident about BERT; I'm hoping to watch the 3rd video in the morning. Thanks for this contribution, your explanation is very concise.

    • @ChrisMcCormickAI · 4 years ago

      Glad I'm not the only one! Thanks for your comment :)

  • @8g8819 · 5 years ago

    Hi,
    Please keep going with this (hands-on) series; I'm pretty sure you will help lots of people out there!!

    • @ChrisMcCormickAI · 5 years ago

      Thanks giavo, I'll keep them coming!
      Anything in particular that you'd like to see explained?
      Thanks!

    • @8g8819 · 5 years ago · +1

      @ChrisMcCormickAI I think BERT research is perfect for now. Soon there is going to be wide application of BERT in the NLP area, and research like this is perfect for anyone who wants to understand all of its aspects (plus general aspects such as word embeddings and what exactly the attention mechanism is...). It would be great to talk about how we can adapt BERT to a certain domain with domain-specific terms...
      Also, personally I would like to understand how to use BERT to compute similarity between two documents (I've already tried cosine similarity based on TF-IDF, chi-square, and KeyGraph-based keywords, but I'm still not happy with the results).
      Thanks again!
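
      For the document-similarity question, a rough sketch of one common approach, assuming the Hugging Face transformers and PyTorch packages (the texts and the pooling choice are illustrative, not Chris's method): mean-pool BERT's last-layer token vectors into one vector per document and compare the documents with cosine similarity.

      import torch
      from transformers import BertTokenizer, BertModel

      tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
      model = BertModel.from_pretrained("bert-base-uncased")
      model.eval()

      def embed(text):
          """Mean-pool the last-layer token embeddings into one document vector."""
          enc = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
          with torch.no_grad():
              out = model(**enc)
          return out.last_hidden_state.mean(dim=1).squeeze(0)  # shape: [768]

      doc_a = "The central bank raised interest rates again this quarter."
      doc_b = "Borrowing costs climbed after another rate hike."

      score = torch.nn.functional.cosine_similarity(embed(doc_a), embed(doc_b), dim=0)
      print(f"cosine similarity: {score.item():.3f}")

      Models fine-tuned for similarity (the Sentence-BERT family) tend to give better scores than raw BERT here, but the plumbing looks the same.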

  • @JJ_eats_wings · 4 years ago · +1

    Hahaha, burst into laughter at 3:47. Chris, you are exactly right: I started researching BERT and then just kept bouncing from topic to topic (as a beginner to deep NNs).

  • @learn2know79 · 2 years ago

    Excellent work... it's very informative, especially the prerequisite domain knowledge area. Waiting to see more from you

  • @AbdennacerAyeb · 4 years ago

    Thank you a lot.
    You are making it easier for us to understand hard topics.

  • @riasingh2558 · 5 years ago · +1

    Hi Chris,
    Firstly, thanks a lot for writing the most comprehensive blog post, extremely helpful. I have been following it to understand BERT more closely.
    Secondly, besides creating word and sentence vectors by using different pooling strategies and layers, could you please extend the blog post by showing how to compute the word attentions and their respective positions?
    Thanks!
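
    Not the blog post itself, but a small sketch of how per-token attention weights and the positions they point at can be pulled out, assuming the Hugging Face transformers BertModel with output_attentions=True (the sentence and the choice to average the last layer's heads are illustrative):

    import torch
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
    model.eval()

    enc = tokenizer("The bank raised interest rates.", return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)

    # out.attentions is a tuple with one tensor per layer,
    # each shaped [batch, num_heads, seq_len, seq_len].
    last_layer = out.attentions[-1][0]   # [num_heads, seq_len, seq_len]
    avg_heads = last_layer.mean(dim=0)   # average over heads -> [seq_len, seq_len]

    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    for pos, (token, row) in enumerate(zip(tokens, avg_heads)):
        target = row.argmax().item()
        print(f"{pos:2d} {token:12s} attends most to '{tokens[target]}' (position {target})")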

  • @chronicfantastic · 5 years ago · +4

    Great video, thanks for this. Looking forward to the rest of the series!

  • @keshavramaswamy6217 · 4 years ago · +7

    You, sir, are a legend in your own right! Keep up all this work you are doing! At some point it would be helpful if you could put together a guide to effective science writing like yours! :)

  • @akbarghurbal · 4 years ago

    Thanks a lot for your videos. It's almost the end of 2020, and there are still no books on Amazon about BERT!

  • @praveenchalampalem4038 · 4 years ago

    Wonderful explanation, Chris!!!!

  • @chuanjiang6931 · 2 years ago

    What is the difference between attention and self-attention?

  • @syedhamza3314 · 1 year ago

    Hi Chris, absolutely amazing series on Transformers. I have a question about how transformers handle variable-length inputs. Suppose I set max_length for my sequences to 32 and feed in input_ids and attention_mask for only 32 tokens during training, where some of those tokens can be padding tokens, since each sequence won't be exactly 32 tokens long. Now, for BERT the default max_length is 512 tokens, so my question is: does the transformer implicitly add 512-32 padding tokens so that MHA is calculated over 512 tokens (even though it will not attend to positions holding the padding token ID)? If that's the case, then are we not updating the parameters directly attached to the remaining 512-32 positional vectors?
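
    A small sketch of what actually happens, assuming the Hugging Face transformers implementation (the sentences are placeholders): the model only sees the sequence length you give it; nothing is implicitly padded back out to 512, and the padded positions that are present get masked out of the attention via attention_mask.

    import torch
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased")
    model.eval()

    batch = ["A short sentence.", "A slightly longer example sentence for this batch."]

    # Pad only up to the longest sequence in the batch (capped at 32 here);
    # the model never sees, or attends over, positions 32..511.
    enc = tokenizer(batch, padding=True, truncation=True, max_length=32,
                    return_tensors="pt")
    print(enc["input_ids"].shape)     # e.g. torch.Size([2, 12]) -- not [2, 512]
    print(enc["attention_mask"][0])   # 1s for real tokens, 0s for padding

    with torch.no_grad():
        out = model(**enc)
    print(out.last_hidden_state.shape)  # [2, seq_len, 768]; MHA runs over seq_len only

    Under those assumptions, the positional embeddings for the unused positions never enter the forward pass, so they receive no gradient and are simply not updated during fine-tuning.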

  • @binwangcu · 4 years ago · +1

    10:16 "all these sound very discouraging" - says Chris :)

  • @adityasoni121 · 4 years ago · +1

    Cool Video Chris!

  • @Ramm165 · 4 years ago

    Hi Chris, thanks for the wonderful video. I would like to know whether the topics covered in the ebook are different from the videos. Thank you.

  • @akhilsebastian3804 · 3 years ago

    Hi Chris, I am back at your first video again after a year. I guess this time I'll be able to follow you better.

  • @mahdiamrollahi8456 · 3 years ago

    So, BERT is a model different from other language models like word2vec or GloVe, right?

  • @flamingflamingo4021 · 4 years ago

    Do you have a playlist for all the episodes regarding BERT? It'd be really organized and helpful.

  • @CristianTraina · 3 years ago

    Really great content! Does anyone know how I can contact Chris? I need to ask permission to use and quote some of his work.

  • @geo2073 · 4 years ago · +1

    great content Chris!

  • @akshayklr057 · 4 years ago

    I would appreciate it if you could cover other models as well; these tutorials are good for a noob to start with.

  • @bitbyte8177 · 4 years ago

    What a great video! You earned a new subscriber.

  • @aytuncun6910 · 4 years ago

    Hi Chris, thanks for the post. Feeling lucky I've found your videos. Currently, I'm going through what you've been through basically. Can't wait to watch the whole series. Have you tried Google's Natural Questions challenge yet? Thanks again.

  • @hanman5195 · 4 years ago

    Hi Chris, this is a really amazing explanation.
    Can you please help me with how to use this BERT model with LIME to explain the model?
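
    A rough sketch of one way to wire a fine-tuned BERT classifier into LIME's text explainer, assuming the lime and transformers packages are installed; the checkpoint name is a placeholder for whatever sequence-classification model you fine-tuned, and the class names are illustrative.

    import torch
    from lime.lime_text import LimeTextExplainer
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_name = "textattack/bert-base-uncased-SST-2"  # placeholder fine-tuned classifier
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    def predict_proba(texts):
        """LIME passes in a list of perturbed strings; return class probabilities."""
        enc = tokenizer(list(texts), padding=True, truncation=True, max_length=128,
                        return_tensors="pt")
        with torch.no_grad():
            logits = model(**enc).logits
        return torch.softmax(logits, dim=1).numpy()

    explainer = LimeTextExplainer(class_names=["negative", "positive"])
    explanation = explainer.explain_instance(
        "The movie was surprisingly good.", predict_proba, num_features=6)
    print(explanation.as_list())  # (word, weight) pairs driving the prediction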

  • @aanchalagarwal6886 · 4 years ago

    Hey, the link to your blog page is throwing a 404: Page Not Found error. Could you please help me with this problem?

  • @azizbenothman5374 · 4 years ago

    I gave you the 800th like, good work

  • @kingeng2718 · 4 years ago · +1

    Nice Job, Thanks a lot for sharing

  • @swapnil9047 · 4 years ago

    Hi Chris,
    Great video!
    Do you have a Medium/Twitter account where we can follow your latest work in data science?

  • @mahadevanpadmanabhan9314 · 4 years ago

    What an amazing effort. Super.

  • @abeersalam1623 · 10 months ago

    Sir, I'm new to this field. My research topic is automatically evaluating essay answers using BERT. What should I learn in advance so that I pick up only the main points related to my research and don't get distracted by too much information? Also, could you please give me your email? I want to consult you.
    Thank you.

  • @mourady.650 · 4 years ago

    Hello Chris, thanks for this beautiful series. You described the training tasks as fake/bogus tasks. I prefer to call them proxy tasks - as in proxy war, but for good purposes. :)
    What do you think?

  • @kraken1350 · 4 years ago

    Could I use a BERT model on a language like Arabic?
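
    Yes, there are checkpoints that cover Arabic. A tiny sketch using the multilingual checkpoint, assuming the Hugging Face transformers library (bert-base-multilingual-cased was pretrained on 104 languages, Arabic included); Arabic-specific models such as AraBERT are also worth looking at.

    from transformers import BertTokenizer, BertModel

    # bert-base-multilingual-cased covers 104 languages, including Arabic.
    tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
    model = BertModel.from_pretrained("bert-base-multilingual-cased")

    enc = tokenizer("مرحبا بالعالم", return_tensors="pt")  # "Hello, world" in Arabic
    out = model(**enc)
    print(out.last_hidden_state.shape)  # [1, seq_len, 768]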

  • @田英俊 · 4 years ago

    Thank you!

  • @felipeacunagonzalez4844 · 4 years ago

    Thank you sir!