AVX512 Properly Explained! - Performance and Syntax Analysis

  • Published: Jul 15, 2024
  • // Join the Community Discord! ► / discord
    Advanced Vector Extensions, a.k.a. AVX, is an extension to the x86 instruction set architecture, designed to make SIMD possible within the CPU core itself!
    Building a budget PC can be tough. Not only are GPUs and CPUs incredibly expensive, but they can also be hard to find on a budget... But there are tips and tricks for finding your dream budget GPU and pairing it with a CPU that will give you the performance you want!
    Also, if you're reading this far - I've got an RTX 2060 review coming!
    Have a Great Day!
    - Proceu
    Timestamps:
    0:00 Intro
    0:46 Preface
    1:22 What is AVX?
    3:05 Parallel vs AVX code paths
    3:33 Why SIMD ISAs?
    4:07 Intel MMX
    4:26 MMX vs SSE & AVX
    5:17 Uses for Double-Precision
    5:34 Why use the CPU?
    6:31 immintrin.h
    6:50 The Code
    11:36 The Benchmarks
    14:22 Conclusions
    15:49 Guinea Pig Cam
    #AVX #AVX2 #AVX512
  • Science

Comments • 31

  • @rn3srk1 9 months ago +19

    Criminally underrated techtuber

  • @MarekKnapek 8 months ago +5

    Recently, I implemented the Serpent symmetric block cipher (an AES candidate) in portable C using 32-bit unsigned integers. Then I ported it to 128-bit SSE2 for 4× the performance, and later to 256-bit AVX2 for an additional 2× speedup on top of SSE2. That thing scales like magic. I don't have a computer capable of AVX-512 integer operations. Reading Intel's intrinsics docs isn't as difficult as I initially feared.

  • @problematic3255 3 months ago +5

    I... wh... okay, THIS is how my friends feel when I start talking about various CPUs, GPUs, and other hardware... This whole video felt like I was listening to an entirely different language. Am I just stupid or something for not being able to pick up on context clues?

    • @ProceuTech 3 months ago +5

      You're not stupid. This isn't common-sense information, and a lot of it builds on other knowledge; without that background, you end up with holes in your understanding. It's kind of my fault for not explaining things well enough.

    • @problematic3255 3 months ago +2

      @@ProceuTech I mean, I know AVX-512 starts with 11th gen and then disappears, and is replaced and/or updated to a new thing on 12th gen when looking at instruction sets, but that's as far as my knowledge goes lol. I should've looked up videos on instruction set basics before looking up the new stuff lol

    • @ProceuTech 3 months ago +3

      AVX-512 is a weird one, because you're right. Intel didn't support it officially on 12th Gen (but it was still active in hardware in early revisions available to the public), and it's been entirely fused off with 13th and 14th Gen. The next generation of AVX, called AVX10, aims to fix this by allowing programmers to still utilize AVX-512 code, but it will be able to “double pump” an AVX2 hardware data path, similar to what AMD does with Zen 4. Weird stuff, but you're not stupid for not understanding!

  • @KvapuJanjalia 2 days ago +2

    The fact that the current high-end gaming Intel CPU (14900K) does not support AVX512 is insane.

  • @nate6908 7 months ago +6

    How is AVX-512 used for inference?
    From what I understood in this video, AVX-512 enables you to execute a multiply (or accumulate) instruction on eight double-precision floats (8*64=512, thus the name).
    So could models quantized to int8 then execute 64 int8 operations with a single instruction, instead of decoding the same instruction 64 times?
    The company neuralmagic even goes the route of saying CPU inferencing is the way forward,
    but even with "64-wide SIMD", I thought GPUs were still much more parallel.
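
The lane arithmetic in the comment above (8*64=512, and 64 int8 values per register) can be checked with a minimal C++ sketch. The helper name `lanes` is hypothetical and only illustrates how many elements fit in one register; it does not issue any AVX instructions, and real int8 inference typically also relies on dedicated instructions such as those in AVX-512 VNNI.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: number of elements ("lanes") of type T that fit in
// one vector register of the given width in bits.
template <typename T>
constexpr std::size_t lanes(std::size_t register_bits) {
    return register_bits / (sizeof(T) * 8);
}

// 512-bit registers (AVX-512)
static_assert(lanes<double>(512) == 8, "eight 64-bit doubles");
static_assert(lanes<float>(512) == 16, "sixteen 32-bit floats");
static_assert(lanes<std::int8_t>(512) == 64, "sixty-four 8-bit integers");

// 256-bit registers (AVX/AVX2) hold half as many of each
static_assert(lanes<double>(256) == 4, "four 64-bit doubles");
static_assert(lanes<std::int8_t>(256) == 32, "thirty-two 8-bit integers");
```

The checks run at compile time, so the file only needs to compile for the arithmetic to be confirmed.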

  • @ChrisM541 8 months ago +3

    You have to understand that, in its most basic form, shifting from an 8-bit to a 16-bit CPU carries an automatic 'SIMD' upgrade to all the widened registers, for rather obvious reasons. With today's 64-bit CPUs, adding separate large registers and applicable opcodes (opcodes which have become more complex/powerful) can, and does, have the effect of stalling a general move to CPUs wider than 64 bits.
    Today, we're merely extending hybrid architectures, and today's 'large register' extensions are our means to do that.

  • @dennysgrimaldi9623 8 months ago +4

    Nice video, deserves more views

  • @yumenokoyume 4 months ago +3

    I'm no programmer, but I wonder what happens if you run an AVX2 program on a processor that doesn't support it. Like an Intel i5 3470.

    • @ProceuTech 4 months ago +3

      It will hit an illegal-instruction fault and crash :(

    • @yumenokoyume 4 months ago +3

      @@ProceuTech Thanks for the reply. I'm using a 3rd Gen i5 for my video editing and VFX. But the Adobe 2024 installers won't let me proceed with the install because AVX2 is not supported. I'm just kinda curious what would happen if I ran the program. 🤣🤣

    • @ferna2294 2 months ago +3

      @@yumenokoyume Usually they program their apps with some fallback ability when it comes to the LATEST tech, so someone whose hardware is a couple of generations older can also use the app. However, since it's been more than 10 years since the standardization of AVX2, Adobe probably doesn't care about backwards compatibility anymore.
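
The fallback approach described in this thread can be sketched as a runtime check followed by a dispatch. This is a minimal sketch, not how Adobe or any specific application does it: `cpu_has_avx2`, `sum_scalar`, `sum_avx2`, and `sum` are hypothetical names, and the check assumes GCC or Clang on x86, where the `__builtin_cpu_supports` builtin is available. On other compilers or architectures the sketch conservatively reports no AVX2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if the running CPU reports AVX2 support.
// Assumption: GCC or Clang targeting x86; anything else returns false.
inline bool cpu_has_avx2() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// Portable scalar fallback that runs on any CPU.
inline std::int64_t sum_scalar(const std::int32_t* data, std::size_t n) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Placeholder for a real AVX2 implementation. It forwards to the scalar
// version so that this sketch compiles and runs everywhere.
inline std::int64_t sum_avx2(const std::int32_t* data, std::size_t n) {
    return sum_scalar(data, n);
}

// Dispatcher: picks the code path once the CPU's capabilities are known,
// so an unsupported instruction is never executed.
inline std::int64_t sum(const std::int32_t* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    return cpu_has_avx2() ? sum_avx2(data, n) : sum_scalar(data, n);
}
```

A program that skips this kind of check and executes an AVX2 instruction on a CPU without AVX2 is stopped by the operating system with an illegal-instruction error.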

  • @cdriper 8 months ago +3

    vector::at in a high-performance loop? )

    • @ProceuTech 8 months ago +3

      You could also theoretically do a vector.data()+sizeof(int32_t)*i;

    • @cdriper 8 months ago +3

      @@ProceuTech vector::at validates the index on each access; vector::operator[] doesn't. (Pass the vector by reference to simplify access to operator[]; moreover, prefer passing by reference if a null value is not expected.)
      But yeah, the more important point here is that without good optimization, each indexed access to an array means "offset + index*sizeof(element)".
      Also, it's not a good idea to put a condition inside a loop, because in that case performance depends on another optimization: branch prediction inside the CPU.
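
The three access styles discussed in this thread can be compared side by side. This is a minimal sketch with hypothetical function names, not the code from the video. Note that pointer arithmetic on a typed pointer already scales by the element size, so `v.data() + i` addresses element `i` without an explicit `sizeof` factor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// at(): validates the index on every access and throws on failure.
inline std::int64_t sum_checked(const std::vector<std::int32_t>& v) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v.at(i);
    }
    return total;
}

// operator[]: no bounds check; the loop condition guarantees validity.
inline std::int64_t sum_unchecked(const std::vector<std::int32_t>& v) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Raw pointer: p + i advances by i elements (i * sizeof(int32_t) bytes).
inline std::int64_t sum_pointer(const std::vector<std::int32_t>& v) {
    const std::int32_t* p = v.data();
    assert(p != nullptr || v.empty());
    std::int64_t total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += *(p + i);
    }
    return total;
}

// Demonstrates the check that at() performs: one past the end throws.
inline bool at_throws_past_end(const std::vector<std::int32_t>& v) {
    try {
        (void)v.at(v.size());
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

All three sums return the same value; the difference is the per-access validation in `at()`, which an optimizer may or may not be able to remove.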

  • @youtubeshadowbannedmylasta2629 8 months ago +3

    and it just makes performance worse.

    • @nidalspam509 6 months ago +5

      Not in the case of Zen 4 from AMD.

    • @MrKatoriz 1 month ago +3

      Intel's garbage nodes that can't just not melt upon seeing an AVX-512 instruction are the reason (the entire CPU downclocks for a significant amount of time as soon as an AVX-512 instruction is executed).

    • @Antagon666 1 day ago

      @@nidalspam509 It makes the performance the same as with AVX2, because the functional units are only 256 bits wide, so the workload is split into two 256-bit operations.
      It can be argued that having full 512-bit FUs running at 70% clock speed is still better than 256-bit FUs at 100% clock speed.
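
The trade-off in this comment can be put into numbers with a deliberately idealized model, assuming throughput is proportional to vector width times relative clock speed. The 70% figure is the commenter's; the function names are hypothetical, and the model ignores memory bandwidth, instruction mix, and how long the reduced clock persists.

```cpp
// Idealized model: work done per unit time is proportional to
// vector width (bits) times relative clock speed.
constexpr double relative_throughput(double width_bits, double clock_fraction) {
    return width_bits * clock_fraction;
}

// Speedup of configuration A over configuration B under that model.
constexpr double speedup(double width_a, double clock_a,
                         double width_b, double clock_b) {
    return relative_throughput(width_a, clock_a) /
           relative_throughput(width_b, clock_b);
}

// The comment's figures: 512-bit units at 70% clock vs 256-bit units at 100%.
constexpr double kFullVsHalfWidth = speedup(512.0, 0.70, 256.0, 1.00);
static_assert(kFullVsHalfWidth > 1.39 && kFullVsHalfWidth < 1.41,
              "about 1.4x under this model");
```

Under these assumptions, 512 × 0.7 = 358.4 against 256 × 1.0 = 256, which is roughly a 1.4× advantage for the full-width units, and only for the vectorized portion of the code.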

  • @MZRFaith 6 months ago +5

    Most emulation requires AVX-512 to run stably. To me, the Intel CPUs have been trash in performance; the 5600X I have is way better.

    • @Adamchevy 5 months ago +5

      This is the sole reason I haven't upgraded from my 11900K. Why Intel went backwards on this I will never understand. I guess emulation isn't something they care about.

    • @CaptainScorpio24 4 months ago +4

      @@Adamchevy My i7-12700 non-K has AVX-512 😊

    • @Adamchevy 4 months ago +2

      @@CaptainScorpio24 The early ones do, but the later ones do not. And it isn't in the 13th or 14th gen. Of course, with emulation under attack, it might not be that important a year from now. But when you use RPCS3 it makes a huge difference.

    • @stevensv4864 3 months ago +2

      Bro, the 5600X IS trash compared to 12th, 13th, and 14th gen Intel, even without AVX-512 😂

    • @stevensv4864 3 months ago +1

      @@CaptainScorpio24 Can you test God of War 3 with the same settings as my videos?