The Next 100x - Gavin Uberti | Stanford MLSys #92

  • Published: 29 Dec 2024

Comments • 10

  • @sucim · 10 months ago +4

    Very interesting and well presented!

  • @jaytau · 10 months ago +9

    Would it be possible to use an external mic for the speaker and for the person asking the question?
    It's quite challenging to hear.

  • @vicaya · 10 months ago +3

    37:40, as you already realized, LLMs (and the transformer architecture in general) are memory-constrained, so the extra FLOPS are wasted until TSMC productizes SOT-MRAM. Groq, with its SRAM, is a more realistic short-term approach for small models.
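
    A rough roofline check makes the memory-bound point concrete. The hardware and model numbers below are illustrative assumptions (roughly H100-class accelerator, 70B-parameter FP16 model), not figures from the talk or the comment:

        # Back-of-envelope roofline check: is autoregressive LLM decoding
        # memory-bandwidth-bound? All numbers are illustrative assumptions.
        peak_flops = 1.0e15   # ~1 PFLOP/s dense FP16 (H100-class, assumed)
        mem_bw = 3.35e12      # ~3.35 TB/s HBM bandwidth (assumed)
        machine_balance = peak_flops / mem_bw  # FLOPs per byte needed to stay compute-bound

        # Decoding one token at batch size 1 with a 70B-parameter model:
        params = 70e9
        bytes_moved = params * 2  # every FP16 weight streamed from memory once per token
        flops = params * 2        # ~2 FLOPs per parameter (multiply + add)
        intensity = flops / bytes_moved  # = 1 FLOP per byte

        print(f"machine balance: {machine_balance:.0f} FLOPs/byte")  # ~300
        print(f"decode intensity: {intensity:.0f} FLOP/byte")        # ~1

    With an arithmetic intensity of ~1 FLOP/byte against a machine balance of ~300, per-token latency is set by weight streaming rather than compute, which is why faster memory (SRAM, or the SOT-MRAM mentioned above) matters more than extra FLOPS at small batch sizes.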

  • @ZevUhuru · 6 months ago +6

    Great talk, but why didn't anyone ask questions about competition? What is to prevent Nvidia, AMD, or Intel from producing niche chips like this? With their R&D teams, quality-assurance systems, warranties, and supply chains, they have likely already thought of this, and if not, they should be able to deploy a more competitive and reliable solution fast.
    That being said, I really appreciate Gavin breaking down the history here; I learned a lot of new things.

    • @manonamission2000 · 5 months ago +3

      Corporations tend to move slowly... it is less expensive (relatively, in money and time) for a nimble company to attempt to innovate like this... also, the gamble is that the Sohu platform becomes so appealing that it ends up as an acquisition target... again, both are simply bets... not without risk.

    • @ZevUhuru · 5 months ago

      @manonamission2000 Sure, you could argue that some incumbents like Blockbuster moved slowly while the rising leader Netflix transitioned to online, on-demand content.
      However, unlike on-demand streaming, generative AI is the most revolutionary technology of our time, and if this direction were so promising yet as simple as creating a niche chip focused solely on transformers, you'd think Intel and AMD, with their massive R&D teams, would already be doing it to get an edge on Nvidia.
      These serious business questions should have been asked. I'll do more research, but it's hard to take any of this seriously if such a basic question couldn't be asked and answered.

    • @TheAIEpiphany · 4 months ago +3

      Legacy: they can't break their CUDA software stack, so it'd basically be a big investment to build everything from scratch.

  • @georgehart5182 · 10 months ago +8

    It's cool, but this is going to be a long road. The main problem is software at the IR level (e.g., CUDA), not necessarily hardware. Many companies that can make interesting transistor permutations have been doing so for a long time, and they are not magically "accelerating superintelligence". This is a software-ecosystem problem more than anything else. Good luck.

  • @nauy · 7 months ago +3

    Nice history lesson. Nothing about the ‘next 100x’ promised in the title.

  • @briancase6180 · 6 months ago

    Dude, you're at Stanford; I think students know what an inverter does. This was an ML seminar talk? How? And how did this have anything to do with the topics explicitly raised in the abstract? Just asking... And, BTW, HBM isn't the only type of memory that's relevant, especially for inference, which is the focus of his company.