Cerebras @ Hot Chips 34 - Sean Lie's talk, "Cerebras Architecture Deep Dive"

  • Published: Nov 21, 2024

Comments • 22

  • @centuriomacro9787
    @centuriomacro9787 2 years ago +5

    Very interesting presentation, thanks.

  • @piscocuk2011
    @piscocuk2011 8 months ago +1

    00:04 Cerebras aims to revolutionize AI compute with a co-designed architecture
    02:06 Architecture focused on neural networks
    06:25 Memory bandwidth enables full performance in neural network computation.
    08:36 Cerebras core hardware architecture flexibility
    13:08 Cerebras chip has 84 dies with 850,000 cores on a single 300mm wafer.
    15:27 Homogeneous array of cores across the wafer for unprecedented fabric performance
    19:21 Cerebras architecture utilizes dataflow mechanisms for weight computations
    21:12 Single chip enables high-performance neural networks
    25:02 Scalable clustering and wafer-scale chips enable large model access to everyone
    Crafted by Merlin AI.

  • @whyjay9959
    @whyjay9959 1 year ago +4

    Hi. There's something that a few people were wondering about: Why is the Wafer-Scale Engine square? Since it looks like there's room for ~28 more complete, attached tiles.

    • @CerebrasSystems
      @CerebrasSystems 1 year ago +7

      It's a good question! The answer is rather prosaic, we're afraid. If the WSE weren't rectangular, power delivery, I/O, mechanical integrity and cooling would become much more difficult, to the point of impracticality.
      Take a look at the virtual teardown on our website and you may get a feel for some of these challenges: www.cerebras.net/cs2virtualtour
      The upshot is that a mere 850,000 cores will just have to suffice. ;)

    • @whyjay9959
      @whyjay9959 1 year ago +2

      @@CerebrasSystems I think I get the idea, thanks.

    • @AbeDillon
      @AbeDillon 4 months ago +1

      ​@@CerebrasSystems Would it be possible to lop off some of those edge tiles to make mini engines?

  • @xeusai
    @xeusai 7 months ago

    I didn't catch much about the routing protocol, or how the dies actually communicate on the WSE-2. You guys have a lot going on, congratulations 🎊 😊

  • @christopherkeates4147
    @christopherkeates4147 2 months ago

    Incredible work. How do you scale a trained model down so that you can put it in something smaller and run inference real-time for control of a system?

  • @RalphDratman
    @RalphDratman 1 year ago +2

    Is the CS-2 used only for training?
    Will a time come when, for massively concurrent inference, this architecture will be applicable?

    • @CerebrasSystems
      @CerebrasSystems 1 year ago +2

      Hi Ralph, good question. The vast bulk of our customers have used our systems for training LLMs or for HPC applications.
      We have had a couple of projects using it for inference, like one with Lawrence Livermore National Laboratory where they offloaded an unwieldy inference step from many nodes of their Lassen supercomputer to one of our systems. You can read the case study here: www.cerebras.net/cerebras-customer-spotlight-overview/spotlight-lawrence-livermore-national-laboratory/
      But in principle, our architecture should make a terrific concurrent inference platform because we can run many (hundreds or even thousands, depending on the model) in parallel across our massive array of cores.

  • @JoeLion55
    @JoeLion55 7 months ago

    Re: the die-to-die interface at about 15:15.
    You mentioned you use an upper metal layer to cross the scribe lines between the dies. What does the reticle look like for this? Is this a regular mask, but the alignment for the mask is just offset so it straddles the scribe lines for the rest of the wafer? Is this something TSMC does regularly for other products? Or is this a new process to have reticles on the same wafer that don’t align on top of each other?

  • @808bigisland
    @808bigisland 2 years ago +2

    Aloha and thanks! Way to go! Just imagine what you will be doing ten years from now! Do you have a public roadmap?

    • @CerebrasSystems
      @CerebrasSystems 2 years ago

      Thanks, 808 Big Island! Sadly, no public roadmap. You'll just have to keep watching!

  • @xeusai
    @xeusai 7 months ago

    Was wondering if MemoryX is actually an independent device outside of the WSE-2 wafer? The fact that it has better sparse performance at the hardware level is very interesting.

  • @CaseyKoh
    @CaseyKoh 2 months ago

    What is the yield of that wafer, sir? Thank you.

  • @WoodyDataAI
    @WoodyDataAI 2 months ago

    Super fast, lightning-speed AI system. Great!

  • @billykotsos4642
    @billykotsos4642 2 years ago +3

    👀👀👀👀👀👀

  • @hg6996
    @hg6996 1 month ago +1

    If this WSE is really that good, why is nobody talking about Cerebras AI while Nvidia is still printing money?

    • @jhockey11liu91
      @jhockey11liu91 1 month ago

      Because they are f-u-c-k up

    • @Marqui17
      @Marqui17 1 month ago

      Because today's biggest models don't fit on one Cerebras chip.

    • @hg6996
      @hg6996 1 month ago

      @@Marqui17 Hmm. So it's not possible to put together more of them in order to make the models fit on such a system?

    • @Marqui17
      @Marqui17 1 month ago

      @@hg6996 I guess you should be able to interconnect them and split the model across them, but then you are introducing the same complexities Nvidia has, taking away Cerebras' main advantage.