Transformers for beginners | What are they and how do they work

  • Published: 19 Jan 2025

Comments • 152

  • @lyeln
    @lyeln 11 months ago +22

    This is the only video around that REALLY EXPLAINS the transformer! I immensely appreciate your step by step approach and the use of the example. Thank you so much 🙏🙏🙏

    • @CodeWithAarohi
      @CodeWithAarohi  11 months ago +3

      Glad it was helpful!

    • @Reem.alhamimi
      @Reem.alhamimi 8 months ago

      Exactly

    • @napoleanbonaparte9225
    @napoleanbonaparte9225 5 months ago

    Truly, I went through several Medium blogs and videos, but this lecture gave me immense clarity on each step of the Transformer. Thank you.

  • @MrPioneer7
    @MrPioneer7 8 months ago +3

    I had watched 3 or 4 videos about transformers before this tutorial. Finally, this tutorial made me understand the concept of transformers. Thanks for your complete and clear explanations and your illustrative example. Especially, your description of query, key, and value was really helpful.

  • @americanfinancial7511
    @americanfinancial7511 5 months ago +1

    Very well explained. Most people do not explain the transformer the way you did. You made it easy for new students to learn. Thanks.

  • @mdfarhadhussain
    @mdfarhadhussain a year ago +4

    Very nice high-level description of the Transformer.

  • @Zohranishrat
    @Zohranishrat a month ago

    You're a lifesaver. Thank you sooo much. I've tried GPT and different articles, but it's only now that I'm getting the whole concept.

  • @VishalSingh-wt9yj
    @VishalSingh-wt9yj a year ago +1

    Well explained. Before watching this video I was very confused about how transformers work, but your video helped me a lot.

  • @AI_Adhyayana
    @AI_Adhyayana 9 months ago +1

    I came across this video by accident; it is very well explained. You are doing an excellent job.

  • @AbdulHaseeb091
    @AbdulHaseeb091 10 months ago

    Ma'am, we are eagerly hoping for a comprehensive Machine Learning and Computer Vision playlist. Your teaching style is unmatched, and I truly wish your channel reaches 100 million subscribers! 🌟

    • @CodeWithAarohi
      @CodeWithAarohi  9 months ago +1

      Thank you so much for your incredibly kind words and support!🙂 Creating a comprehensive Machine Learning and Computer Vision playlist is an excellent idea, and I'll definitely consider it for future content.

  • @bidishamukherjee3051
    @bidishamukherjee3051 2 months ago

    Great explanation Aarohi. Thank you.

  • @harshilldaggupati
    @harshilldaggupati a year ago +1

    Very well explained, even for such a niche viewer base. Please keep making more of these.

  • @chandankumar-j3h5d
    @chandankumar-j3h5d a month ago

    So nicely explained. Thank you so much.

  • @UjjwalSolanki-j2b
    @UjjwalSolanki-j2b 8 months ago +2

    Can you please tell us the input to the masked multi-head attention? You just said "decoder". Can you please explain? Thanks.

  • @satishbabu5510
    @satishbabu5510 8 months ago

    Thank you very much for explaining and breaking it down 😀 Your explanation is easy to understand compared to other channels. Thank you very much for making this video and sharing it with everyone ❤

  • @SureshNair-i2q
    @SureshNair-i2q a month ago

    Thank you for explaining so well.

  • @servatechtips
    @servatechtips a year ago

    This is a fantastic, very good explanation.
    Thank you so much for the good explanation.

  • @sukritgarg3175
    @sukritgarg3175 11 months ago +1

    Great video, ma'am. Could you please clarify what you said at 22:20 once again... I think there was a bit of confusion there.

  • @shaminMohammed-s9s
    @shaminMohammed-s9s 8 months ago

    Wow... you are amazing. Thank you for the clear explanation.

  • @sahaj2805
    @sahaj2805 10 months ago

    The best explanation of the transformer that I have found on the internet. Can you please make a detailed, longer video on transformers with theory, mathematics, and more examples? I am not clear on the linear and softmax layers and what is done after that, how training happens, and how transformers work on test data. Can you please make a detailed video on this?

    • @CodeWithAarohi
      @CodeWithAarohi  10 months ago +1

      I will try to make it after finishing the work already in my pipeline.

    • @sahaj2805
      @sahaj2805 10 months ago

      @@CodeWithAarohi Thanks, I will wait for the detailed transformer video :)

  • @imranzahoor387
    @imranzahoor387 11 months ago

    Best explanation. I watched multiple videos, but this one made the concept clear. Keep it up.

  • @vasoyarutvik2897
    @vasoyarutvik2897 a year ago

    Very good video, ma'am. Love from Gujarat. Keep it up.

  • @exoticcoder5365
    @exoticcoder5365 a year ago

    Very well explained! I could instantly grasp the concept! Thank you, Miss!

  • @pandusivaprasad4277
    @pandusivaprasad4277 a year ago

    Excellent explanation, madam... thank you so much.

  • @ykakde
    @ykakde 6 months ago

    Nice explanation of such a complex topic.

  • @jaideepraulji1395
    @jaideepraulji1395 5 months ago +1

    Well explained.

  • @user-dl4jq2yn1c
    @user-dl4jq2yn1c 8 months ago

    Best video ever, explaining the concepts in a really lucid way, ma'am. Thanks a lot, please keep posting. I subscribed 😊🎉

  • @aditichawla3253
    @aditichawla3253 a year ago

    Great explanation! Keep uploading such nice informative content.

  • @akera2775
    @akera2775 4 months ago

    Lovely and deep explanation.

  • @soravsingla6574
    @soravsingla6574 a year ago +1

    Hello Ma’am
    Your AI and Data Science content is consistently impressive! Thanks for making complex concepts so accessible. Keep up the great work! 🚀 #ArtificialIntelligence #DataScience #ImpressiveContent 👏👍

  • @sumankumari-gl3ze
    @sumankumari-gl3ze 5 months ago

    You explained it very nicely.

  • @debarpitosinha1162
    @debarpitosinha1162 10 months ago

    Great explanation, ma'am.

  • @MinalMahala
    @MinalMahala 9 months ago

    Really very nice explanation ma'am!

  • @BharatK-mm2uy
    @BharatK-mm2uy 10 months ago

    Great Explanation, Thanks

  • @princekhunt1
    @princekhunt1 a month ago

    Nice tutorial

  • @afn8370
    @afn8370 8 months ago

    Your video is good and the explanation is excellent; the only negative for me was the background noise. Please use a better mic with noise cancellation. Thank you once again for this video.

    • @CodeWithAarohi
      @CodeWithAarohi  8 months ago

      Noted! I will take care of the noise :)

  • @sukumarane2302
    @sukumarane2302 3 months ago

    Well explained. Thank you 🙏

  • @sairampenjarla
    @sairampenjarla 7 months ago +2

    Hi, good explanation, but at the end, when you explained what the input to the decoder's masked multi-head attention would be, you fumbled and didn't explain it clearly. The rest of the video was very good.

    • @CodeWithAarohi
      @CodeWithAarohi  7 months ago

      Thank you for the feedback!

    • @anandkumar-lq3dt
      @anandkumar-lq3dt 3 months ago

      The initial input to the decoder comes from the encoder output; after that, the decoder consumes the previously generated decoder output. The decoder generates one word at a time.
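
      As a rough sketch of that loop (my own illustration, not from the video; the model interface here is hypothetical):

          def greedy_decode(model, src_ids, bos_id, eos_id, max_len=50):
              # model(src_ids, tgt_ids) is assumed to return one row of
              # next-token probabilities per target position (hypothetical).
              tgt_ids = [bos_id]                   # decoder starts with <bos> only
              for _ in range(max_len):
                  probs = model(src_ids, tgt_ids)  # encoder output feeds the decoder
                  last = probs[-1]                 # distribution for the next token
                  next_id = max(range(len(last)), key=last.__getitem__)  # argmax
                  tgt_ids.append(next_id)          # feed the new token back in
                  if next_id == eos_id:            # stop at <eos>
                      break
              return tgt_ids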

  • @sahaj2805
    @sahaj2805 10 months ago

    Can you please make a detailed video explaining the "Attention Is All You Need" research paper line by line? Thanks in advance :)

  • @Sam-yy4tw
    @Sam-yy4tw 6 months ago

    Great work, ma'am.

  • @Reem.alhamimi
    @Reem.alhamimi a month ago

    The best

  • @MAHI-kj5tg
    @MAHI-kj5tg a year ago

    Just amazing explanation 👌

  • @sanjiwaneeayurvedic3199
    @sanjiwaneeayurvedic3199 2 months ago +1

    I'm not clear about the input to the masked attention layer.

  • @akramsyed3628
    @akramsyed3628 a year ago +2

    Can you please explain 22:07 onward?

    • @UnchartedExperience
      @UnchartedExperience 5 months ago

      She is not going to reply; she only replies to praise comments and ignores questions lol... I know she messed up at the end and didn't know what to say, but overall it was a nice attempt... Additionally, she entirely skipped CROSS-ATTENTION and just talked around the concept without introducing the terminology.

  • @parrotsafari6329
    @parrotsafari6329 2 months ago

    Great

  • @vimalshrivastava6586
    @vimalshrivastava6586 a year ago

    Thanks for making such an informative video. Could you please make a video on the transformer for image classification or image segmentation applications?

  • @praveenchandra-i8f
    @praveenchandra-i8f 5 months ago

    This explanation is nice. Can you do a practical video on how to implement this transformer model for sentiment analysis in Python?

  • @_seeker423
    @_seeker423 11 months ago

    A question about query, key, and value dimensionality.
    Given that a query is a word that is looking for other words to pay attention to, and a key is a word that is being looked at by other words, shouldn't the query and key be vectors whose size equals the number of input tokens, so that when the dot product between query and key is taken, the querying word can be correctly (positionally) matched with each key to get the self-attention value for that word?

    • @CodeWithAarohi
      @CodeWithAarohi  11 months ago +1

      The dimensionality of query, key, and value vectors in transformers is a hyperparameter, not directly tied to the number of input tokens. The dot product operation between query and key vectors allows the model to capture relationships and dependencies between tokens, while positional information is often handled separately through positional embeddings.
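
      As a quick sketch of why that works (my own illustration, not from the video): with seq_len tokens and a chosen head size d_k, the product Q @ K.T already yields a (seq_len, seq_len) score matrix, so the positional pairing comes from the matrix product rather than from the vector size:

          # Minimal scaled dot-product self-attention, using NumPy only.
          import numpy as np

          seq_len, d_model, d_k = 4, 8, 8         # d_k is a hyperparameter
          x = np.random.randn(seq_len, d_model)   # embedded input tokens
          Wq = np.random.randn(d_model, d_k)      # learned projection matrices
          Wk = np.random.randn(d_model, d_k)
          Wv = np.random.randn(d_model, d_k)

          Q, K, V = x @ Wq, x @ Wk, x @ Wv        # each (seq_len, d_k)
          scores = Q @ K.T / np.sqrt(d_k)         # (seq_len, seq_len) pairwise scores
          weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)  # row softmax
          out = weights @ V                       # (seq_len, d_k) attention output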

  • @bijayalaxmikar6982
    @bijayalaxmikar6982 a year ago

    Excellent explanation.

  • @manishnayak9759
    @manishnayak9759 a year ago

    Thanks Aarohi 😇

  • @blindprogrammer
    @blindprogrammer 7 months ago

    Very high level, but perfect!

  • @soravsingla6574
    @soravsingla6574 a year ago

    Very well explained

  • @TheMayankDixit
    @TheMayankDixit a year ago

    Nice explanation Ma'am.

  • @farzanehpishnamaz
    @farzanehpishnamaz a year ago

    Hello and thank you so much. One question: I don't understand where the numbers in the word embedding and positional encoding come from.

  • @MuqadasGull-x3z
    @MuqadasGull-x3z 8 months ago

    It's great. I have only one query: what is the input to the masked multi-head attention? It's not clear to me; kindly guide me on this.

  • @animexworld6614
    @animexworld6614 7 months ago

    Great Content

  • @anandtewari8014
    @anandtewari8014 6 months ago

    I think the input to the masked multi-head attention may not have been explained correctly.

    • @CodeWithAarohi
      @CodeWithAarohi  6 months ago

      Thank you for your message. Please share the details.

  • @burerabiya7866
    @burerabiya7866 11 months ago

    Can you please upload the presentation?

  • @akshayanair6074
    @akshayanair6074 a year ago

    Thank you. The concept has been explained very well. Could you please also explain how these query, key and value vectors are calculated?

  • @thangarajerode7971
    @thangarajerode7971 a year ago

    Thanks. Concept explained very well. Could you please add one custom example (e.g., finding similar questions) using Transformers?

  • @mahmudulhassan6857
    @mahmudulhassan6857 a year ago

    Ma'am, can you please make one video on classification using multi-head attention with a custom dataset?

  • @palurikrishnaveni8344
    @palurikrishnaveni8344 a year ago

    Could you make a video on image classification with a vision transformer, madam?

  • @_Who_u_are
    @_Who_u_are 8 months ago

    Thank you so much

  • @nikhilrao20
    @nikhilrao20 a year ago

    I didn't understand what the input to the masked multi-head self-attention layer in the decoder is. Can you please explain?

    • @CodeWithAarohi
      @CodeWithAarohi  a year ago +1

      In the Transformer decoder, the masked multi-head self-attention layer takes three inputs: Queries (Q), Keys (K), and Values (V).
      Queries (Q): These are vectors representing the current positions in the sequence. They are used to determine how much attention each position should give to other positions.
      Keys (K): These are vectors representing all positions in the sequence. They are used to calculate the attention scores between the current position (represented by the query) and all other positions.
      Values (V): These are vectors containing information from all positions in the sequence. The values are combined based on the attention scores to produce the output for the current position.
      The masking in the self-attention mechanism ensures that during training, a position cannot attend to future positions, preventing information leakage from the future.
      In short, the masked multi-head self-attention layer helps the decoder focus on relevant parts of the input sequence while generating the output sequence, and the masking ensures it doesn't cheat by looking at future information during training.
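
      As a rough sketch of the masking step (my own illustration, not from the video), future positions are set to -inf before the softmax so that their attention weights become zero:

          # Causal (look-ahead) mask on self-attention scores, using NumPy only.
          import numpy as np

          seq_len = 4
          scores = np.random.randn(seq_len, seq_len)        # stand-in for Q @ K.T / sqrt(d_k)
          mask = np.triu(np.ones((seq_len, seq_len)), k=1)  # 1s above the diagonal = future
          scores = np.where(mask == 1, -np.inf, scores)     # hide future positions
          weights = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)
          # row i now attends only to positions 0..i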

  • @KavyaDabuli-ei1dr
    @KavyaDabuli-ei1dr 11 months ago

    Can you please make a video on BERT?

  • @_seeker423
    @_seeker423 11 months ago

    Can you also talk about the purpose of the 'feed-forward' layer? It looks like it's only there to add non-linearity. Is that right?

    • @abirahmedsohan3554
      @abirahmedsohan3554 10 months ago

      Yes, you can say that... but maybe also to make key, query, and value trainable.
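
      As a rough sketch (my own illustration, not from the video), the position-wise feed-forward layer is just two linear maps with a ReLU between them, applied to each token independently:

          # Position-wise feed-forward network, using NumPy only.
          import numpy as np

          def ffn(x, W1, b1, W2, b2):
              # x: (seq_len, d_model); the hidden layer is wider (d_ff > d_model)
              h = np.maximum(0, x @ W1 + b1)   # ReLU supplies the non-linearity
              return h @ W2 + b2               # project back down to d_model

          d_model, d_ff, seq_len = 8, 32, 4
          x = np.random.randn(seq_len, d_model)
          out = ffn(x,
                    np.random.randn(d_model, d_ff), np.zeros(d_ff),
                    np.random.randn(d_ff, d_model), np.zeros(d_model))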

  • @kadapallanithin
    @kadapallanithin a year ago

    Could you explain it with Python code? That would be more practical. Thanks for sharing your knowledge.

  • @niluthonte45
    @niluthonte45 a year ago

    Thank you, ma'am.

  • @EmpoweringHub24
    @EmpoweringHub24 a year ago

    Hello ma'am, is this transformer concept the same as for transformers in NLP?

    • @CodeWithAarohi
      @CodeWithAarohi  a year ago

      The concept of "transform" in computer vision and "transformers" in natural language processing (NLP) are related but not quite the same.

  • @minhaledits301
    @minhaledits301 5 days ago

    Our ma'am must have studied from you too, but I never understood anything in her class.

    • @CodeWithAarohi
      @CodeWithAarohi  5 days ago

      Ohh... so did you understand it from the video?

    • @minhaledits301
      @minhaledits301 5 days ago

      @CodeWithAarohi Yes, my exam is tomorrow, and thank you.
      May Allah keep you happy ♥️

    • @CodeWithAarohi
      @CodeWithAarohi  5 days ago

      Good luck for your exam 😊

  • @rj00502
    @rj00502 4 months ago

    Doing phenomenal work.

  • @saeed577
    @saeed577 11 months ago

    I thought this was about transformers in CV, but all the explanations were in NLP.

    • @CodeWithAarohi
      @CodeWithAarohi  11 months ago

      I recommend you understand this video first and then watch this one: ruclips.net/video/tkZMj1VKD9s/видео.html After watching these two videos, you will properly understand the concept of transformers used in computer vision. Transformers in CV are based on the idea of transformers in NLP, so it's better for understanding if you learn them in that order.

  • @mohdUbaidWani
    @mohdUbaidWani a year ago

    How can I get the PDFs, ma'am?

  • @Red_Black_splay
    @Red_Black_splay 9 months ago

    Gonna tell my kids this was Optimus Prime.

    • @CodeWithAarohi
      @CodeWithAarohi  9 months ago

      Haha, I love it! Optimus Prime has some serious competition now :)

  • @skyeyes4757
    @skyeyes4757 a month ago

    Why don't you try explaining in Hindi? We can understand English, but for a new topic it is hard to go from English to a mental picture.

    • @CodeWithAarohi
      @CodeWithAarohi  a month ago +1

      Hindi tutorial: ruclips.net/video/uJhVLjZfmo8/видео.html

  • @jagatdada2.021
    @jagatdada2.021 a year ago

    Please use a mic; the background noise is irritating.

  • @_Who_u_are
    @_Who_u_are 8 months ago

    Speaking in Hindi would be better.

  • @digambar6191
    @digambar6191 6 months ago

    Thank you, ma'am.