Mamba Language Model Simplified In JUST 5 MINUTES!

  • Published: 13 Dec 2024

Comments • 12

  • @zagoguic
    @zagoguic 10 months ago +3

    Great video! Keep making them!

  • @doublesami
    @doublesami 7 months ago +1

    Very informative, looking forward to the in-depth video on Vision Mamba or VMamba

    • @analyticsCamp
      @analyticsCamp  7 months ago

      Thanks for watching and for your suggestion. Stay tuned :)

  • @optiondrone5468
    @optiondrone5468 11 months ago +1

    Thanks for this video, keep up the good work.

  • @kvlnnguyieb9522
    @kvlnnguyieb9522 9 months ago

    A great video. Next video, maybe you can explain the details of the selective mechanism in code

    • @analyticsCamp
      @analyticsCamp  9 months ago

      Great suggestion! Thanks for watching :)

  • @nidalidais9999
    @nidalidais9999 10 months ago +1

    I liked your style and your funny personality

    • @analyticsCamp
      @analyticsCamp  10 months ago

      Thanks for watching, I love your comment too :)

  • @ln2deep
    @ln2deep 11 months ago +1

    It's a bit unclear to me how the Mamba architecture works recurrently when looking at the architecture at 5:30. What is the input here: the whole sequence or individual tokens? Surely it would have to be the whole sequence for Mamba to build a representation recurrently, but then it seems strange to have a skip connection on the whole sequence. I think I've missed something.

    • @analyticsCamp
      @analyticsCamp  11 months ago +1

      Hi, thanks for your comment. I mentioned that delta discretizes the input (the word sequence split into tokens, ...), and that, at every step of the hidden-state update, the model takes into account the previous hidden state and the current input word. I will try to post an update on this, maybe reviewing the entire article if I can. Please do let me know if you are interested in any particular topic for a video.
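
To make the recurrence described in this reply concrete, here is a minimal NumPy sketch of a selective state-space update processed one token at a time. It assumes the standard Mamba-style formulation (delta, A, B, C plus a skip term D); the sizes and parameter names (W_delta, W_B, W_C) are illustrative placeholders, not notation from the video:

```python
import numpy as np

# Minimal sketch of a selective SSM recurrence, one token per step.
# Sizes and parameter names are illustrative assumptions, not the video's notation.
rng = np.random.default_rng(0)
d_model, d_state, seq_len = 4, 16, 6               # channels, state size per channel, tokens

A = -np.exp(rng.normal(size=(d_model, d_state)))   # state matrix (kept negative for stability)
D = rng.normal(size=(d_model,))                    # skip connection applied to the current token
W_delta = rng.normal(size=(d_model, d_model))      # projections that make delta, B, C input-dependent
W_B = rng.normal(size=(d_model, d_state))
W_C = rng.normal(size=(d_model, d_state))

x_seq = rng.normal(size=(seq_len, d_model))        # token embeddings, one row per token

h = np.zeros((d_model, d_state))                   # hidden state carried across the sequence
ys = []
for x_t in x_seq:                                  # the per-step input is a single token, not the whole sequence
    delta = np.log1p(np.exp(x_t @ W_delta))        # softplus step size, depends on the current token ("selective")
    B_t = x_t @ W_B                                # input projection, token-dependent
    C_t = x_t @ W_C                                # output projection, token-dependent

    A_bar = np.exp(delta[:, None] * A)             # zero-order-hold discretization of A
    B_bar = delta[:, None] * B_t[None, :]          # discretized input matrix

    h = A_bar * h + B_bar * x_t[:, None]           # previous hidden state + current token drive the update
    y_t = (h * C_t[None, :]).sum(axis=1) + D * x_t # readout plus the skip connection on this token
    ys.append(y_t)

y = np.stack(ys)                                   # (seq_len, d_model) outputs, built recurrently
```

On this reading, the skip connection acts per token (the D * x_t term), so it does not need to see the whole sequence at once.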