Fundamentals of GPU Architecture: Introduction

  • Published: 16 Nov 2024

Comments • 52

  • @sushobhanapatra4690
    @sushobhanapatra4690 3 years ago +17

    Enjoying this at 3am on a Saturday. Keep up😁!

  • @ericminnerath4892
    @ericminnerath4892 5 years ago +22

    Very interesting! I'm an undergrad comp sci major here at UMN, saw your reddit post, and love this content. I've always been a PC nerd, so actually learning about this stuff is so cool. Makes me more interested in it as I progress through college. Keep up the good work!

    • @CoffeeBeforeArch
      @CoffeeBeforeArch  5 years ago +2

      Thanks, fella! Glad you are enjoying it!

    • @kunalkokate
      @kunalkokate 1 month ago

      Another ECE grad major here from UMN. Loving your content! @CoffeeBeforeArch

  • @biocode5441
    @biocode5441 3 years ago +10

    Although not as popular, and probably not worth that much views-wise, for the few people who watch this it's quite helpful to have a person with experience comment on material from a book in a "read and comment" fashion; it's as if the book were a person that doesn't just speak in fixed, typeset words :))

    • @u263a3
      @u263a3 3 years ago +2

      This is like fine wine for those who really care about the subject

  • @ribethings
    @ribethings 1 year ago +3

    This is a gold mine. Thank you!

  • @areklachowicz
    @areklachowicz 4 years ago +4

    Amazing, I'd love to watch your series

  • @ebramful
    @ebramful 5 years ago +8

    Great video and content. Thank you.

  • @justcurious1940
    @justcurious1940 2 months ago

    This is exactly what I was looking for, thank you for sharing.

  • @delphicdescant
    @delphicdescant 1 year ago +5

    Now, in mid-2023, have there been any substantial changes in the industry that might require changes to any information in this video series?
    Or is everything presented in this series still just as up-to-date as it was 4 years ago?

    • @ChimiChanga1337
      @ChimiChanga1337 11 months ago

      Hardware architecture doesn't change as rapidly as software.

  • @saeed6971
    @saeed6971 4 years ago +2

    Thank you for explaining these.

  • @GagandeepSingh-me4qt
    @GagandeepSingh-me4qt 9 months ago

    Thank you, very informative

  • @raghul1208
    @raghul1208 2 years ago

    awesome playlist

  • @LinBond
    @LinBond 4 years ago +1

    nice clip, thanks Nick

  • @podilasahithi7
    @podilasahithi7 4 years ago +5

    This is the best online resource I've found for learning GPU architecture and related topics. One quick question: why can't a CPU have as many cores as a GPU? I mean, what's the necessity of a GPU apart from the specific usage each is designed for?

    • @CoffeeBeforeArch
      @CoffeeBeforeArch  4 years ago +8

      CPU cores are designed for a (mostly) different purpose than GPU "cores" (they're difficult to compare directly). A large fraction of the silicon in a CPU core goes toward mining instruction-level parallelism (ILP), while GPU "cores" are optimized for data-level parallelism (DLP). For example, GPUs have neither branch prediction nor out-of-order execution. They also have some significant differences in cache coherence (GPUs often have non-coherent L1 caches).
      Ultimately, GPUs are mostly designed for massively parallel applications, while CPUs are more general purpose and have many more optimizations for single-threaded performance (not everything benefits from parallelism). The sketch below contrasts the two styles.
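      A minimal CUDA sketch of that contrast, built only on the points above (the kernel and variable names are made up, not from the video or the book): the CPU version is one thread walking the whole array, while the GPU version spreads the same work across thousands of lightweight threads, one element each.

        // Sketch only: the same vector add as a sequential CPU loop and as a
        // data-parallel CUDA kernel (one element per thread).
        #include <cuda_runtime.h>

        // CPU version: a single thread walks the array, relying on ILP and caches.
        void vec_add_cpu(const float *a, const float *b, float *c, int n) {
            for (int i = 0; i < n; i++) c[i] = a[i] + b[i];
        }

        // GPU version: each thread handles one element (DLP); no branch
        // prediction or out-of-order execution is needed to keep it busy.
        __global__ void vec_add_gpu(const float *a, const float *b, float *c, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) c[i] = a[i] + b[i];
        }

        int main() {
            const int n = 1 << 20;
            float *a, *b, *c;
            cudaMallocManaged(&a, n * sizeof(float));
            cudaMallocManaged(&b, n * sizeof(float));
            cudaMallocManaged(&c, n * sizeof(float));
            for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

            // Launch enough 256-thread blocks to cover all n elements.
            vec_add_gpu<<<(n + 255) / 256, 256>>>(a, b, c, n);
            cudaDeviceSynchronize();

            cudaFree(a); cudaFree(b); cudaFree(c);
            return 0;
        }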

    • @podilasahithi7
      @podilasahithi7 4 years ago +2

      @@CoffeeBeforeArch thank you. That was a very quick response 👏

    • @CoffeeBeforeArch
      @CoffeeBeforeArch  4 years ago +1

      @@podilasahithi7 Happy to help!

  • @vexury
    @vexury 5 months ago +1

    Is the document seen in the video accessible?

  • @akintoyeopeyemi2587
    @akintoyeopeyemi2587 3 years ago

    If you look at the microphone for too long, you begin to see an alien with eyes and mouth

  • @antoinedevldn
    @antoinedevldn 3 years ago

    Very awesome series! Thanks

  • @gareththomas3234
    @gareththomas3234 7 months ago

    It seems this is the only dedicated book on GPU architecture - there is a chapter in another book from 2022

  • @MLSCLUB-t2y
    @MLSCLUB-t2y 2 months ago

    many thanks

  • @eaemmm333
    @eaemmm333 3 years ago

    thank you for such great videos

  • @abhishektyagi7513
    @abhishektyagi7513 4 years ago +1

    Why is an SM called a core? As far as I know, SMs have several cores inside them. Are all these cores single threaded? I mean, does a core work on only one thread?
    Btw, great content! I really appreciate the time you have put in to explain everything in detail.

    • @CoffeeBeforeArch
      @CoffeeBeforeArch  4 years ago +5

      The definition of "core" ends up being fairly arbitrary (the same can be said of threads, which are incredibly different between CPUs and GPUs). There are a few logical things that make an SM like a core: it has a private L1 cache, its own private register file, compute resources, and work (thread blocks) scheduled to it.
      CUDA cores are perhaps better referred to as execution units; they're where instructions get mapped during execution. The sketch below queries some of these per-SM resources.
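      As a rough illustration of the SM-as-core view (an assumption built on the reply above, not something shown in the video), the CUDA runtime exposes the per-SM resources mentioned here and they can be printed like this:

        // Sketch only: print the per-SM resources (register file, shared
        // memory, resident threads) that make the SM the "core"-like unit.
        #include <cstdio>
        #include <cuda_runtime.h>

        int main() {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, 0);  // properties of device 0

            printf("SMs:                   %d\n", prop.multiProcessorCount);
            printf("Max threads per SM:    %d\n", prop.maxThreadsPerMultiProcessor);
            printf("Registers per SM:      %d\n", prop.regsPerMultiprocessor);
            printf("Shared memory per SM:  %zu bytes\n", prop.sharedMemPerMultiprocessor);
            return 0;
        }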

    • @abhishektyagi7513
      @abhishektyagi7513 4 years ago

      @@CoffeeBeforeArch Are the execution units single threaded? I am imagining the latency thing like this: when a thread has to get data, it is stalled, but other threads are working simultaneously, so the overall throughput is better since many operations are finishing even if one thread is stalled. Is this correct?

    • @CoffeeBeforeArch
      @CoffeeBeforeArch  4 years ago +1

      @Abhishek Tyagi GPUs fetch instructions at warp granularity. If that instruction is a load that has to go to main memory, a warp scheduler will try to fetch instructions from different warps (if available) that can make forward progress. If an instruction has already made it to an execution unit, it already has the data it needs (with the exception of loads/stores in the load/store queue).

    • @abhishektyagi7513
      @abhishektyagi7513 4 years ago

      @@CoffeeBeforeArch Thanks!

  • @mytech6779
    @mytech6779 11 months ago +2

    This video is the missing link needed by programmers that want to expand beyond traditional CPU applications.

  • @koysdo
    @koysdo 2 years ago

    What program are you using to read and annotate the book? Many thanks for the video as well.

    • @Benderhino
      @Benderhino 3 months ago

      Have you found out yet? I want to use this too

  • @georgebenson3826
    @georgebenson3826 2 years ago +1

    👍

  • @hanwang5940
    @hanwang5940 5 years ago +2

    How does multithreading hide off-chip latency?
    EDIT: I tried looking online for resources to read but couldn't find any that explain this :(

    • @CoffeeBeforeArch
      @CoffeeBeforeArch  5 years ago +6

      If some threads miss in the caches and are waiting on a response from DRAM, you can swap in other threads that have useful work to do instead of just stalling. One of the nice things about GPUs is that context switching between threads is a cheap operation, because all the threads already have their own private sets of registers in the massive register files. So when a long-latency memory access occurs for one warp, a new one can immediately be swapped in to make forward progress. The sketch below shows the usual way this is exploited: launch far more warps than there are execution units.
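      A small CUDA sketch of how that plays out in practice (an assumption, not taken from the video; the kernel and variable names are made up): a memory-bound kernel is launched with far more warps than the GPU has execution units, so the warp schedulers always have candidates to swap in while other warps wait on DRAM.

        // Sketch only: a memory-bound copy kernel, heavily oversubscribed so
        // stalled warps can be hidden behind warps whose data has arrived.
        #include <cuda_runtime.h>

        __global__ void copy_kernel(const float *in, float *out, int n) {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n) out[i] = in[i];  // the load often goes all the way to DRAM
        }

        int main() {
            const int n = 1 << 24;
            float *in, *out;
            cudaMalloc(&in, n * sizeof(float));
            cudaMalloc(&out, n * sizeof(float));

            // Tens of thousands of warps: while some wait on memory, the
            // schedulers issue instructions from others, hiding the latency.
            copy_kernel<<<(n + 255) / 256, 256>>>(in, out, n);
            cudaDeviceSynchronize();

            cudaFree(in); cudaFree(out);
            return 0;
        }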

  • @anonviewerciv
    @anonviewerciv 4 years ago

    Getting started with bitcoin mining.

  • @JP-re3bc
    @JP-re3bc 6 months ago +2

    Too many names and hand-waving. Could go directly into concepts and examples instead.

  • @adrienloridan
    @adrienloridan 1 year ago

    30-second ads every 5 minutes! Wtf

    • @bosnian1
      @bosnian1 2 months ago

      saw 1 ad in total

  • @carrapaz3645
    @carrapaz3645 4 years ago

    awesome video! I just find it slightly annoying how often you say "you know", but maybe it's just me ;-P
    hyped to continue the series

  • @mantan_rtw
    @mantan_rtw 4 months ago

    It is a complete waste of time to talk about GPUs without referring to the type of workload they run. The start of the video says it will not talk about graphics, OK, bizarre, because that is the most important workload for a GPU. Maybe you don't know the GFX rendering pipeline. For compute, start with compute shaders and what they do, then go to how a GPU executes a compute shader. Talking about bits and pieces of the H/W units that form a GPU in isolation is utter nonsense.

    • @CoffeeBeforeArch
      @CoffeeBeforeArch  4 months ago +2

      So this series follows the book "General-Purpose Graphics Processor Architectures", and these kinds of GPUs do not contain any GFX rendering pipelines (e.g., parts like the H100 are still called GPUs but are built solely for HPC and ML workloads, without any support for graphics).