Crazy! AMD's Milan-X Delivers 1.5GB of L3 Cache to EPYC Servers

  • Published: 21 Oct 2024

Comments • 203

  • @JeffGeerling
    @JeffGeerling 2 years ago +139

    When a CPU looks large in comparison to Patrick's presence... you know it's serious (and crazy expensive) business.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +22

      Ha! Thanks Jeff.

    • @bitterrotten
      @bitterrotten 2 years ago +7

      I'm looking forward to seeing Wendell in one of your videos.

    • @OK_ACME
      @OK_ACME 2 years ago +4

      Jeff, is that called a balanced load? Physical size versus presence... :D

    • @mzamroni
      @mzamroni 2 years ago

      Server CPUs indeed give more profit per transistor than desktop CPUs, and even more than desktop GPUs.
      No wonder RDNA2 availability is so scarce.

    • @shephusted2714
      @shephusted2714 2 years ago

      i blame these developments on fat old white men - they will be refurbs in 4-5 yrs - it is going to take 5-10 yrs for bigger more significant breakthroughs but they will happen and make this current paradigm look petty - more than likely #tco #roi #cots #refurb long tail #release eng

  • @AntsAasma
    @AntsAasma 2 years ago +35

    These CPUs have as many cores as Opterons had bits and as much cache as Opterons had RAM. How time flies.

    • @RobBCactive
      @RobBCactive 2 years ago +2

      Hell, Zen3 micro-op cache is larger than the RAM on my first micro 😆😆😁

  • @kwinzman
    @kwinzman 2 years ago +54

    Oh my god he is always so hyped making those videos. Love the energy!

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +20

      How can you not be with crazy CPUs like this?!?!

    • @kayakMike1000
      @kayakMike1000 2 years ago +2

      @@ServeTheHomeVideo they are VERY omg these days ....

  • @MarkUseBlender00
    @MarkUseBlender00 2 years ago +21

    The old days: people used ten floppies to store their OS
    Early 2000s: people ran their OS on hard drives
    21st century: people put their OS in RAM
    The year 2023: people run their OS in L3 cache

    • @Алексей-й3ю5ч
      @Алексей-й3ю5ч 1 year ago +1

      And what's next? Half-Life: Alyx in the CPU cache of a standalone headset?))
      I think auto-translate will manage.

    • @Алексей-й3ю5ч
      @Алексей-й3ю5ч 1 year ago +1

      Maybe some old-school programmers will write insanely optimized asm for this kind of power.
      Like those enthusiasts who wrote their own OS in asm. I don't remember the name of that OS.

  • @jksoftware1
    @jksoftware1 2 years ago +5

    NICE!!!! I watch @Level1Techs Wendell as well and have been watching him since he was part of another channel... Nice to see you have him on as a guest.

    • @tanmaypanadi1414
      @tanmaypanadi1414 2 years ago +1

      we don't talk about the other channel 🤫

  • @Jsteeeez
    @Jsteeeez 2 years ago +42

    Wendell is making the rounds today!

    • @d00dEEE
      @d00dEEE 2 years ago +11

      First he's a Potato, now he got Served.

    • @JeffGeerling
      @JeffGeerling 2 years ago +6

      @@d00dEEE Level1Home

    • @MarkRose1337
      @MarkRose1337 2 years ago +1

      @@JeffGeerling Potato1Home

    • @samlebon9884
      @samlebon9884 2 years ago +3

      He probably got a lot of cache.

  • @johndoh5182
    @johndoh5182 2 years ago +9

    This is the market where I knew 3D cache was going to be beneficial, more so than on a PC, but as shown it depends on the use case, which is why stacked cache isn't going to end up on every IC AMD makes, as some people I've debated with have suggested.
    The problem with this tech is not just the stacking technology, it's also creating a flat surface on top of the entire package, which means having to add material over the core die, and I assume that material is designed to conduct heat as quickly as possible since heat now has to travel further before a heat exchanger (the cooler) can wick it away.
    Exciting times though, because stacking technology along with interconnect technology adds a lot of tools chipmakers can use, and I expect the next 5 years will see the performance curve for compute power take another good jump.
    THANKS AMD! Intel wasn't getting us anywhere quickly. AMD has pushed the world of compute ahead by about 15 years' worth, or, counting from Zen 2 through when Zen 4 and Zen 5 come out, over 6 years, and if you want to add "the core wars", where Intel couldn't ignore high-core-count computing anymore, then it goes back to the beginning of Zen.
    I would say Intel brought big-little to x86-64, but to me that STILL seems like a response to power consumption more than anything else. I KNOW that for servers it's going to become more important, but for desktop, not so much. And the reason it will be more important for servers is very high core count computing. Stacked cache opens the door even more for CPUs pushing a couple hundred cores. I think the bigger thing Intel has brought in is accelerators. How this plays out is pretty obvious. It will be interesting to see how AMD responds to this with their recent purchase of Xilinx. Memory bandwidth is still one more issue for high-core-count server CPUs, but that's going to get there soon.

  • @gmscott9319
    @gmscott9319 2 years ago +7

    1:40 Normally, I get suspicious when I hear something like, "we finished dinner, went back to his hotel room, and filmed a quickie." ;)
    Awesome video! Crazy micro-tech is becoming more common these days; I can't wait to see what our future holds!

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +6

      Imagine having to tell people you make videos that have demographics of like 97%+ adult males in places where people do not understand what servers and switches are.

  • @ewenchan1239
    @ewenchan1239 2 years ago +5

    If you want to see something that's relatively crazy from the HPC world/space: there are actually applications that use a hybrid MPI/OpenMP software architecture, where MPI is used between nodes but OpenMP handles the parallelisation within a node.
    The purpose of this is to reduce the MPI overhead, which really starts to kick in at 2^7 (128) cores and above because of the number of pair-wise connections you have to make with MPI.
    So imagine that you are running a vehicle crash simulation or a computational fluid dynamics simulation where it thinks that it has 1.5 GB of "RAM", when really that's L3 cache, which would be just crazy.
    And then, to be able to cache that much data - that's just insanity!!!
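
    For anyone curious what that hybrid pattern looks like, here is a minimal sketch, assuming an MPI implementation and OpenMP are available (the loop body, names, and sizes are placeholders, not anything from the video): MPI ranks handle the inter-node split, OpenMP threads fan out inside each rank, and the only MPI traffic is a single collective rather than thread-level pair-wise messaging.

    // hybrid.c -- toy hybrid MPI+OpenMP reduction; build with: mpicc -fopenmp hybrid.c -o hybrid
    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, nranks;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        const long N = 1 << 20;          /* per-rank work size (placeholder) */
        double local = 0.0;

        /* Intra-node parallelism: OpenMP threads share this rank's slice */
        #pragma omp parallel for reduction(+:local)
        for (long i = 0; i < N; i++)
            local += (double)(rank * N + i) * 1e-9;

        /* Inter-node parallelism: a single MPI collective across ranks */
        double total = 0.0;
        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("ranks=%d threads/rank=%d total=%f\n",
                   nranks, omp_get_max_threads(), total);
        MPI_Finalize();
        return 0;
    }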

  • @benjamintrathen6119
    @benjamintrathen6119 2 years ago +30

    Patrick and Wendell are like two peas in a nerdalicious pod. You both are great.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +9

      It was fun just hanging out at AMD last week with him. The last time we had a meal together was in Taipei in like 2019.

  • @massfrommars
    @massfrommars 2 years ago +1

    Patrick gets more and more hyped about self-introduction with each video. I love that start. It's like one of those catchy tunes that gets and stays stuck in one's head forever.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +1

      I am one of the few people that gets genuinely excited about servers

  • @GMTX-qj8or
    @GMTX-qj8or 2 years ago +1

    Legend, big hug from Australia, great work :)

  • @shinokami007
    @shinokami007 2 years ago +2

    Nerds! love to see you guys having fun, thanks, stay safe and strong ;)

  • @slithery9291
    @slithery9291 2 years ago +21

    Are they the fire, water, earth and wind stones from '5th Element' ?
    Leeloo being the 5th?
    Multipass....
    Do I win the 3080?

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +10

      You got it!

    • @d00dEEE
      @d00dEEE 2 years ago +6

      @@ServeTheHomeVideo Got what? The 3080 or the Multipass?

    • @linusidle
      @linusidle 2 years ago +2

      @@d00dEEE yes

  • @billymania11
    @billymania11 2 years ago +2

    A good video. I'll never use these CPUs, but it's nice to see the details. These are crazy powerful.

  • @setharnold9764
    @setharnold9764 2 years ago +5

    Hey Patrick and crew, fun video, thanks.
    The kernel mitigations for the various microarchitectural flaws are pretty expensive; I think some of the fixes dump processor caches on syscall entry and exit. I wonder if turning off the mitigations would show even more benefit from this crazy cache. Since YouTube eats many of my comments with links, I'll reply with a link and hope it works.

    • @sfalpha
      @sfalpha 2 years ago

      If workloads aren't doing that much context switching, it won't affect performance much. Most of the mitigations only flush cache lines when a context switch happens.

    • @setharnold9764
      @setharnold9764 2 years ago

      @@sfalpha yeah, perhaps the math-heavy benchmarks won't show much difference. But some of those benchmarks are very syscall heavy. Check vmstat 1 output when running your favorite benchmarks to see just how many context switches there are.

    • @setharnold9764
      @setharnold9764 2 years ago +1

      Ha, yeah, it looks like YouTube ate my comment with the new Linux kernel command line. Search for "make Linux fast again" and add the usual commercial TLD.

  • @ByrnesPCGarage
    @ByrnesPCGarage 2 years ago +5

    One of my favorite graphics cards of all time, the 8800 GTX, had 768 MB of VRAM lol.

  • @behold_band
    @behold_band 2 years ago +7

    Hey, this is Patrick from STH, watch me wave around $17,000 like it ain't shit.

  • @paulm7267
    @paulm7267 2 years ago +1

    Love the 5th element blocks in the background

  • @MK-xc9to
    @MK-xc9to 2 years ago +2

    Short version: it depends on your workload. I bet that some software companies will optimize their software for it; you can achieve relatively big performance gains if your software is cache intensive and you adjust the working-set size to the available cache size so that the software can basically live in the cache.
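
    As a rough illustration of software "living in the cache", the classic trick is blocking/tiling so the inner loops only touch a working set that fits in cache. This is a generic sketch, not anything tied to Milan-X; N and BLOCK are made-up placeholders you would tune to the real cache size.

    /* Toy cache-blocked matrix multiply (C = C + A*B); all sizes illustrative.
       The BLOCK x BLOCK tiles are chosen so the hot data stays cache-resident. */
    #include <stddef.h>

    #define N     2048
    #define BLOCK 128   /* tune so ~3 * BLOCK*BLOCK*sizeof(double) fits in cache */

    void matmul_blocked(const double *A, const double *B, double *C) {
        for (size_t ii = 0; ii < N; ii += BLOCK)
            for (size_t kk = 0; kk < N; kk += BLOCK)
                for (size_t jj = 0; jj < N; jj += BLOCK)
                    /* everything touched inside this tile fits in cache */
                    for (size_t i = ii; i < ii + BLOCK; i++)
                        for (size_t k = kk; k < kk + BLOCK; k++) {
                            double a = A[i * N + k];
                            for (size_t j = jj; j < jj + BLOCK; j++)
                                C[i * N + j] += a * B[k * N + j];
                        }
    }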

  • @dentjoener
    @dentjoener 2 years ago +6

    I would really be interested in Java and .NET benchmarks. That giant amount of cache will help a lot with the pointer heavy nature of garbage collected languages (I think)

  • @AdamsWorlds
    @AdamsWorlds 2 years ago +2

    Yay Wendell

    • @trakkasure
      @trakkasure 2 years ago

      WendOS ?
      Play a game of telephone: WendOS => WindOS => Windows... NOOOOOO!

  • @sfalpha
    @sfalpha 2 years ago +7

    No performance improvement on NGINX is no surprise. It mostly does network things and moves data around.
    But application servers running Java, PHP, Python, Node, etc. will see some improvement from the larger L3 cache if the JIT-compiled code was already saturating L3 before the upgrade to Milan-X.

  • @keyboard_toucher
    @keyboard_toucher 2 years ago +5

    I like big caches and I cannot lie.

  • @wingn3849
    @wingn3849 2 years ago +1

    This man talks with more enthusiasm about these things than I do about juicy burgers.

  • @nekomakhea9440
    @nekomakhea9440 2 years ago +2

    Patrick casually flashing over $16,000 of silicon bling in the intro XD
    That's like $10 per MB of cache

  • @RichardPeterShon
    @RichardPeterShon 2 years ago +2

    Wow! L3 is soooo big! That means more power and it will get hot!

  • @MarkRose1337
    @MarkRose1337 2 years ago +2

    I love big caches and I cannot lie

  • @oskarelmgren
    @oskarelmgren 2 years ago +1

    "Hey we should film something, so we went back to his hotel room" :D :D :D

  • @woldemunster9244
    @woldemunster9244 2 years ago +1

    Nice Fifth Element props in the background.

  • @Darkknight512
    @Darkknight512 2 years ago +5

    As an FPGA developer, I must say, this is the first time I have seen RTL development tools in a bunch of CPU benchmarks.

    • @mzamroni
      @mzamroni 2 years ago +2

      FPGA development is the most stressful programming I've done.
      I used VHDL in 2002 for my college final project to design a USB hub controller.
      Each compilation took 10 minutes and then simulation took 5 minutes, which meant that in 1 hour I could only do 4 edits 😅

    • @Darkknight512
      @Darkknight512 2 years ago

      @@mzamroni That's not very long actually. A compile for me at work is 2-7 hours, and we don't even use the huge chips. These are medium-sized FPGAs.

    • @mzamroni
      @mzamroni 2 years ago +1

      @@Darkknight512 But my friends whose final projects were regular C-based software compiled their code in seconds.
      That comparison was stressful, considering I had to graduate before my scholarship ended 😂

    • @ClannerJake
      @ClannerJake 2 years ago +1

      @@mzamroni But fun fact: if you can do FPGA development, then you're basically a necromancer to older gamers and nerds :)

    • @seldompopup7442
      @seldompopup7442 2 years ago

      I think Xilinx actually has an answer record saying you probably shouldn't use too many threads for implementation, as that would just make the cores fight for memory bandwidth.

  • @dimensional7915
    @dimensional7915 2 years ago +6

    there are a number of indie games that I can now store entirely in L3 Cache. who even needs local storage when you have all the ram and Cache in the world

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +6

      Crazy to think about! I can certainly remember days when having 1.5GB of RAM was almost impossible.

    • @NVMDSTEvil
      @NVMDSTEvil 2 years ago

      @@ServeTheHomeVideo Hmm.. wonder how running some games in software rendering mode would perform with these.

    • @LtdJorge
      @LtdJorge 2 years ago

      @@NVMDSTEvil Badly; the frequency is too low for such a low core count (64). GPUs are a lot more parallel, so you have to compensate with GHz, which the thermals from stacking prevent you from achieving.

  • @g.paudra8942
    @g.paudra8942 2 years ago +3

    It's coming!

  • @theK594
    @theK594 2 years ago +1

    Great stuff guys! I would only disagree on "low power optimized". If you can collapse your scale-out workload from 10 servers to 5 with somewhat hungrier Milan-X CPUs, then in the big picture a lot of power is saved!

  • @nemesis1588
    @nemesis1588 2 years ago +2

    The only way to get 768MB of L3 cache is to have all 8 CCD chiplets populated, which means that the 7373X only has 2 cores active on each CCD. I wonder if they are able to bin the CCDs before they add the V-Cache to them. This may also explain why there is not an 8-core X part; it would only have 1 active core per CCD (which would likely cause too much of a thermal imbalance on the CCD to use with V-Cache).

  • @brianmccullough4578
    @brianmccullough4578 2 years ago +2

    L1Tech baby! Woooooooo!

  • @spiralout112
    @spiralout112 2 years ago +2

    ERRMAHGERRDD it's Wendawg! I've been wondering when my 2 fav tech youtubers would get together on something! Although if there was a video just shooting the shit that also had Dr. Cutress involved, that would be pretty sweet too.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +3

      Ian stayed with me in December when he was in Austin. We just forgot to shoot something. It will happen

  • @sadiethomas335
    @sadiethomas335 2 years ago

    Wendell is making the rounds today!
    🐻

  • @virtualinfinity6280
    @virtualinfinity6280 2 years ago +1

    First of all, I like the content both of you usually present on the internet. Thanks for that. But in the case of Milan-X, both of you totally missed the point :)
    The big thing with Milan-X is that it finally implements one of the "holy grails" of silicon manufacturing: stacked dies with through-silicon vias (TSVs for short). TSV is just one of those breakthrough technologies that changed manufacturing significantly, just like CMOS, non-silicon interconnects, EUV lithography, etc. With TSVs, you can finally "split" a chip onto multiple dies while maintaining full-speed interconnects at on-silicon electrical loads. In essence, AMD just managed to implement a way of massively increasing silicon real estate without making bigger chips.
    Milan-X is just a technology demonstrator. Nothing about the EPYC architecture has changed; they just used the tech to slap on more L3 cache. Which makes sense if you are trying to figure out whether you got all the kinks of the package assembly process right.
    What can be expected in the future is a radically re-architected version of EPYC: remove all L3 cache from the processor chiplets and use the gained real estate to significantly increase L1/L2 cache sizes. L3 is then implemented in chip stacks on top of the processor chiplets, or, also valid in EPYC's architecture, the entire L3 cache is stacked on top of the IO die. L2 and especially L1 caches are very costly in terms of silicon real estate, which is why we have stuck with ancient L1 cache sizes for decades. Somewhat the same applies to L2 caches. With TSVs, this is going to change radically and will significantly affect the performance of upcoming chips. I expect it will take a few more years before this is put into effect in full swing, but when it is, we will get really fast CPUs.
    This is what makes Milan-X so significant. Milan-X will go down in history as one of those landmark chips that really changed the industry. IMHO, that should have been pointed out in the video.

  • @pilsen8920
    @pilsen8920 2 years ago +3

    Wendell !

  • @dkvello
    @dkvello 2 years ago +1

    I'd love to see a comparison between the "normal" Rome 7302/7302P, Milan 7313/7313P, and the Milan-X 7373X. Single-socket, 16-core servers have been very popular with service providers and others that need to synchronize their VMware licensing with the licensing models for the virtual workloads (with Microsoft licensing, 16 cores total in a physical server is cost-optimal for MS products). Also, the 7373X is the highest cache-per-core configuration. As for upgrades, many of the servers running Rome 7302/7302P CPUs today can be easily upgraded to e.g. a 7373X with a simple CPU swap.

  • @__--JY-Moe--__
    @__--JY-Moe--__ 2 years ago +1

    yup! just flash those in front of my R7 3700x eyes! where's Ryker! just work me over!! super presentation! big league!

  • @Prophes0r
    @Prophes0r 2 years ago +6

    Leeluu Dallas MultiPass!

  • @zunriya
    @zunriya 2 years ago +5

    Around 70% of CPU power usage goes to data movement, so less movement is better; efficient cache design and size beat recalling data from external memory like RAM or storage, where latency hits performance very hard.

  • @barney9008
    @barney9008 2 years ago +4

    Crazy to think you could run a whole server on a 500W PSU 13 years ago; now it's perhaps 250W per CPU.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +4

      For a modern top-SKU server, even without accelerators/storage, we now budget 1kW.

    • @macicoinc9363
      @macicoinc9363 2 years ago +2

      @Ray Johnson What the hell do you have in your desktop? Do you have like 4 3090s?

  • @spambot7110
    @spambot7110 2 years ago +1

    Something I'm thinking about: as cache gets larger, performance improves, but only really once a process has been running long enough for its most frequently accessed data to have been accessed and cached. My assumption is that from a cold start, this CPU's performance looks exactly the same as an equivalent CPU with a more normal amount of cache; both start out really slow, and then the speed rapidly increases as more and more memory accesses turn into cache hits, until it reaches some steady state when the cache is "full". The difference is just that the ramping period continues for longer with Milan-X, leading to a higher steady-state IPC at the expense of taking longer to hit that steady state.
    This is just my intuition smashing together the fact that the memory bus is the same and the fact that the cache is larger, therefore the time for the cache to fill is longer. My big question is: does this mean context switches between processes are gonna get even more expensive? Or am I overestimating how disruptive context switching is to the cache?

    • @spambot7110
      @spambot7110 2 years ago

      To be clear, I'm not asking if a much larger cache would increase the absolute cost of context switches; it seems pretty clear that the absolute worst-case scenario would be exactly equivalent performance. I'm just wondering if the relative improvement from increasing the cache size will be more sensitive to context switches (say, a process performing amazingly with a fully primed cache, but the first few percent of that process's time slices plus the moments after running a syscall are spent running with a high cache miss rate until it gets back in the groove).

    • @blipman17
      @blipman17 2 years ago

      A byte of memory is loaded once the first read or write is done to that byte. The funny thing is that since CPU memory is chunked into "cache lines", the 63 bytes right next to it are also loaded, meaning a lot of stuff will probably already be in cache before it's accessed for the first time. But essentially you're right. Cache is loaded on demand, and only after the first load can the speedups of cache be realized, unless the programmer does some preloading, but that's a whole other can of worms.
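
      To make the cache-line point concrete, here is a tiny sketch (the 64-byte line size is assumed rather than queried, and the names are made up): a sequential walk reuses each fetched line for 64 byte-sized accesses, while striding by the line size pays roughly one miss per access on a cold cache.

      /* Illustrates 64-byte cache lines (size assumed, not queried). */
      #include <stddef.h>

      #define LINE 64
      #define N (64 * 1024 * 1024)

      long sum_sequential(const char *buf) {   /* ~1 miss per 64 accesses */
          long s = 0;
          for (size_t i = 0; i < N; i++)
              s += buf[i];
          return s;
      }

      long sum_strided(const char *buf) {      /* ~1 miss per access */
          long s = 0;
          for (size_t i = 0; i < N; i += LINE)
              s += buf[i];
          return s;
      }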

  • @johndoh5182
    @johndoh5182 2 years ago +1

    So, use cases: engineering and some DB. For a DB that's optimized to keep data in RAM, I could see this helping. How much? You still have to sync the data between RAM and storage. It will be interesting to see.

  • @OTechnology
    @OTechnology 2 years ago +11

    That's insane. When can we have enough cache to not need RAM on consumer PCs lol.

    • @mzamroni
      @mzamroni 2 years ago +4

      It's only 768 MB.
      We still need RAM.
      It's very large for a CPU cache but small for RAM.
      Low-end PCs have 8 GB of RAM.

    • @levifig
      @levifig 2 years ago +3

      That’s basically where Apple is (kinda) headed with the M1… ;)

    • @System0Error0Message
      @System0Error0Message 2 years ago

      @@mzamroni That's plenty to run my website, both the web server and PHP, excluding the database. I can just imagine the amount of PHP performance I could get from this without needing any RAM. Just Debian; lsphp uses 80MB of RAM.

  • @AOTanoos22
    @AOTanoos22 2 years ago +2

    I'm so disappointed that AMD didn't give us 3D V-Cache with the 5000 series Threadripper Pro after such a long delay. Gonna stick with my 3000 Threadripper Pro until V-Cache finally arrives.

    • @mzamroni
      @mzamroni 2 years ago +1

      A Supermicro EPYC ATX motherboard is only $700.
      I will go with EPYC rather than Threadripper Pro 5000.

  • @wowfubar
    @wowfubar 2 years ago +1

    Hi Wendell

  • @joels7605
    @joels7605 2 years ago +2

    With that much L3 it would be neat if AMD allowed them to boot without RAM. It would be handy for troubleshooting.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +1

      You could do that on the old Xeon Phi's actually.

    • @joels7605
      @joels7605 2 years ago +1

      @@ServeTheHomeVideo That's neat. I didn't know that.

  • @dhwanitchem
    @dhwanitchem 2 years ago +6

    Cash per core vs. cache per core

  • @LoneRiderz
    @LoneRiderz 2 years ago +2

    I see Wendell, I click.

  • @jdbb3gotskills
    @jdbb3gotskills 2 years ago +1

    I hope we see GB cache in desktop CPUs soon.

  • @pkt1213
    @pkt1213 2 years ago +2

    I've been lusting all day. Might put some 7373X in my server purchase. Why not?

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +1

      It may also come down to availability for these. We have to wait and see how that pans out.

    • @pkt1213
      @pkt1213 2 years ago +1

      Yeah. I don't need it. I'll put it in my request for bid as a nice to have but not required.

    • @pkt1213
      @pkt1213 2 years ago

      @@ServeTheHomeVideo The refresh cycle at work is not great. 2x E5-4617 v1 in a 4-socket server (for some reason) and DDR-1333. I started last year, and when I asked for new VMs, they had to be put on a different machine that was new enough to support ESXi 6.7. 😱 I managed to budget for a replacement for this year and want to ensure I have enough overhead for 5 years.

  • @trakkasure
    @trakkasure 2 years ago +2

    I don't need it, I can't afford it... but I want it.

  • @oah8465
    @oah8465 2 years ago +1

    I guess this would have a fantastic impact on interpreted languages like PHP and Python.

  • @jms019
    @jms019 2 years ago +1

    IBM had a gigabyte of cache on their Z systems some years back.

  • @ATGEnki
    @ATGEnki 2 years ago +1

    Presenting: one Wendell (Patrick for scale, colorized 2022)

  • @GustavoNoronha
    @GustavoNoronha 2 years ago

    Do you have the configuration for the Linux kernel that you use for your benchmarks somewhere, maybe on the main site? I'd like to run a couple comparisons based on your data =)

  • @DileepB
    @DileepB 2 years ago +1

    The SPECint rate of the 7773X is almost identical to the 7763. Looks like the larger L3 compensates for the lower core frequency. Just an observation.

    • @Soras_
      @Soras_ 2 years ago

      A larger L3 surely makes the CPU work a lot harder.

  • @shawncarroll5255
    @shawncarroll5255 2 years ago

    Tokamak twisted plasma flows and realtime adjustment of the magnetic fields? Or is it more of a clock speed function?

  • @denvera1g1
    @denvera1g1 2 years ago +1

    I was under the impression that with the extended cache pool, the processor would run hotter at lower frequencies because each thread is... parked(?) less often waiting on code to come from RAM/ROM, meaning it can't hit the higher speeds without better cooling.

  • @gh975223
    @gh975223 2 years ago +1

    So when are we going to get this 1.5GB of cache on consumer CPUs, aka Ryzen and Threadripper, where we need it more than companies do!

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago

      Probably some time before Ryzen gets 1GB.

    • @gh975223
      @gh975223 2 years ago

      @@ServeTheHomeVideo Yeah, we know, but what we as consumers really need is 128 PCIe lanes minimum, aka 7 slots with 16 PCIe lanes each, with GPUs that actually install in 1 slot (the reason is VFIO and NVMe RAID cards as well as USB add-ons; I wish they would give each USB port a separate PCI ID to pass USB ports through to a VM one by one).

  • @BennyTygohome
    @BennyTygohome 2 years ago +1

    Very interesting

  • @pdk005
    @pdk005 2 years ago +2

    Awesome video and very thorough. Multipass!

  • @niks660097
    @niks660097 10 months ago

    For something like emulation this cache can do wonders; imagine emulating a PS3 and using the L3 cache as emulated SPU memory...

  • @EyesOfByes
    @EyesOfByes 2 years ago +2

    *"DOOM must on L3 cache be run"* - Cato The Elder"

  • @PreybirdMKII
    @PreybirdMKII 2 years ago +1

    Love the 4 elements, you're just missing Leeloo.

  • @shadowarez1337
    @shadowarez1337 2 years ago +2

    How soon till they adopt a newer standard for transfers from memory to the CPU? Maybe instead of a trace, have a link that can be upgraded over time.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +2

      Maybe thinking co-packaged optics? See my SC21 chip discussion with Raja Koduri for that.

    • @shadowarez1337
      @shadowarez1337 2 years ago +1

      @@ServeTheHomeVideo Thank you, I definitely will. I keep hearing and seeing how these traces are becoming a bottleneck; the signaling for PCIe 3-4 was a lot of work, so I can only imagine PCIe Gen 6.
      Going to be an interesting few years. Also, what do you think the availability of even the lowest-end 12-core EPYC will be? Thinking of building a proper server for my needs and keeping my DS1621+ as a failover once I build out a nice strong 10Gb network.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +2

      It is sometimes cheaper to just buy a full EPYC server than to build one. Getting Rome is also sometimes faster/ less expensive.

    • @shadowarez1337
      @shadowarez1337 2 years ago +3

      @@ServeTheHomeVideo Thank you, I'll definitely look into what's available, as it's time to build a real server; as great as an ASUS X570-I Gaming is, I'd rather have a dedicated solution.
      Thank you for all this content, serving it up in a way even I can understand 👍

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +3

      Next week we will have an ASUS Alder Lake build that is going to be fun.

  • @Isaac-X113
    @Isaac-X113 2 years ago +3

    Wendell doesn't fly; he travels via server rack. Silly mistakes.

  • @kayakMike1000
    @kayakMike1000 2 years ago +1

    Siemens? We like Rockwell Automation.

  • @matty1234a1
    @matty1234a1 2 years ago +5

    $8800 USD, only like a $1000 premium over regular Milan.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +1

      Yes. It is very interesting from a product positioning perspective.

    • @matty1234a1
      @matty1234a1 2 years ago +1

      @@ServeTheHomeVideo Considering all the extra 7nm wafer used, NRE, assembly complexity, and yield loss, I imagine their gross margin per chip is probably way lower. Seems to be more of a "get it out there in everyone's racks" price.

  • @adokapo
    @adokapo 2 years ago +1

    Nice shirt.

  • @KingLarbear
    @KingLarbear 2 years ago

    This looks good in 2160p

  • @uncrunch398
    @uncrunch398 2 years ago +2

    I remember a time (the year is vague) when 1.5GB was a common maximum amount of RAM a server would support. It's crazy to consider that's hardly usable in a phone these days.

    • @Kobrar44
      @Kobrar44 2 years ago +3

      It isn't "hardly usable", the software is just garbage.

    • @uncrunch398
      @uncrunch398 2 years ago +1

      @@Kobrar44 I wonder how much of it is simple DB engines needlessly locking data into RAM. That and excessive IO are why I dislike them.

    • @amogusamogus8490
      @amogusamogus8490 2 years ago

      @@Kobrar44 Manjaro runs well on my 2GB RAM laptop

  • @redtails
    @redtails 2 years ago

    CPU vendors are coming up with crazy concepts in order to compete with GPUs. I wonder whether it's a lost cause from the get-go. Some of those benchmarks would be 10x faster on a GPU anyway.

    • @mzamroni
      @mzamroni 2 years ago +2

      There are many software use cases that are not suitable for GPUs.
      Web server software (httpd, JSP, PHP, etc.) can't run on a GPU.

  • @germz1986248
    @germz1986248 2 years ago +2

    My first dual-"core" PC (dual Slot 1 Pentium III) only had 2GB of RAM lol

  • @System0Error0Message
    @System0Error0Message 2 years ago

    but will it store?

  • @Nobe_Oddy
    @Nobe_Oddy 2 years ago

    BIG BADA-BOOM!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    oh no... wait...
    MULTI-PASS!!!!!
    (the BEST part of the movie is the very beginning when Korben Dallas (Bruce Willis) gets a 'knock' on his door and looks out the peephole to see no one there, but when he opens it there's a guy with a gun and a picture of the empty hallway stuck to his headband, so that when Korben looked out the peephole he just saw the picture...... BUAHAAAA!!!! That whole scene is HILARIOUS!!!!! - "TAKE IT!! I DON'T NEED IT!!!" HAHAHAHAHA!!!!!)

  • @narobii9815
    @narobii9815 2 years ago

    cash is important because it allows you to make payments for things even if the power is out.

  • @scheimong
    @scheimong 2 years ago +2

    AMD: "More cash good. Cache good too."

  • @cfwin1776
    @cfwin1776 2 years ago +1

    “Your room or my room”? As long as you two are happy. Wink wink nudge nudge.

  • @SaberusTerras
    @SaberusTerras 2 years ago +1

    I need to see your multipass, sir...

  • @KingLarbear
    @KingLarbear 2 years ago +1

    Does that say $8800 for that cpu, holy sheetz

  • @joshhardin666
    @joshhardin666 2 years ago +1

    Isn't stacking another die on top of a die that already has trouble getting rid of its heat problematic? What keeps the CPU dies from cooking themselves with SRAM dies on top of them? This is the same question I have with the Zen 3 3D products coming any day now. Speaking of which, I've recently given thought to a Threadripper Pro or EPYC workstation because of the PCIe lanes that both Intel and AMD seem dead set against providing on Ryzen, and AMD seems keen to murder the consumer HEDT parts, which drives me crazy because there just isn't enough PCIe for adding several NVMe adapters, dual or quad 4K60 video capture cards, multiple GPUs for CUDA acceleration, 10G networking (which SHOULD be onboard everything now and very much isn't), etc. A Threadripper 3970X on Zen 2 looks stupid in terms of compute power next to a 5950X Ryzen part, and even the recently announced Ryzen Pro stuff is going to be Zen 3 based instead of waiting and releasing a Zen 5 part at the same time as the Ryzen parts drop in Q4. Ugh. The CPU market is driving me batty. 16/0/4 plus an NVMe or 3 (2 of which are shared), or 8/8/4 lanes, just aren't enough for high-end video production and NVMe storage needs. And don't get me started on Ryzen's lack of ECC support smh.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +4

      AMD is stacking the SRAM die above the cache portion of the CCD, not over the compute cores. Part of the reason is to manage heat.

  • @camerontgore
    @camerontgore 2 years ago +2

    Rome wasn't built in a day, but it was launched in one 😁

  • @tbone020ify
    @tbone020ify 2 years ago +1

    The easter egg is from The Fifth Element: it's THE STONES. You should be protecting them.

  • @MatthewWalster
    @MatthewWalster 2 years ago +8

    Multipass!

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  2 years ago +3

      Nice! 100%

    • @MatthewWalster
      @MatthewWalster 2 years ago +1

      @@ServeTheHomeVideo and here's me thinking it was a pun on multipathing logic for the Infinity Fabric :P

  • @flywheeldk
    @flywheeldk 2 years ago +1

    F****** Hell - I remember when 32 kilobytes was a big deal.

  • @МаксСоловьев-щ1ь
    @МаксСоловьев-щ1ь 2 years ago

    Why is he always so happy? :))

  • @alystair
    @alystair 2 years ago +1

    Wait why are the earth/fire/water/air stones from the 5th element in the thumbnail of this video? LOL

  • @TotesCray
    @TotesCray 2 years ago +1

    The Elements! BZZZZZZZZZZ

  • @samlebon9884
    @samlebon9884 2 years ago +2

    I am sure Intel will stack some Bitcoin on top of their CPUs.

  • @System0Error0Message
    @System0Error0Message 2 years ago

    but will it cash?

  • @mikebruzzone9570
    @mikebruzzone9570 2 years ago +1

    Holding out on us Patrick, stringing out the serial; late market, SRAM bit charge not scaling vs. logic, and it's not analog, those are circuit bridges of various sorts. Leading, lagging, balanced, various amplified frequency emissions, that sort of thing. Well, an 8C 3D late-market Win 10 system CPU, the 5800X3D at $449, is a good price on AMD to channel cost : price / margin; the component price from TSMC to AMD is $210, + x1.55 for AMD high-volume customers and x2 to lower-volume resellers. Wendell, it's spatial: x:y:z multi-planes, mathematical determination of x:y:z to calculate and compare to deliver a result is one application space, then also databases, online transactional processing, spreadsheets, modeling, simulation, etc. And if you want 256 MB over 96 MB of L3, get a Milan 73F3 for $1655, which is also a good price on AMD margin sacrifice for the EPYC 73F3. Price analysis, yeah, I'm waiting, you know how to get in touch with me. Wendell, deep knowledge, come on, OLTP and OpenMP is not deep knowledge. Subject knowledge takes work; Wendell and Patrick, create the benchmark standards and quit talking around it, or other creative and less-than-creative engineers will just move in on you and take your ideas that you both did not implement on. I can do that because I state ARs for engineers to do that, so get to work! mb

    • @mikebruzzone9570
      @mikebruzzone9570 2 years ago

      Right, structured vs. unstructured data; unstructured needs some processing first on the coprocessor, setting it up to feed the CPU. mb

    • @mikebruzzone9570
      @mikebruzzone9570 2 years ago

      Real-life examples: that's your job, Patrick and Wendell, to take that AR. What are you waiting for? mb

  • @wishusknight3009
    @wishusknight3009 2 years ago +1

    5th element

  • @mzamroni
    @mzamroni 2 years ago

    AMD could reduce EPYC power consumption if the IO chiplet used a better manufacturing process.
    GF 12nm is ancient compared to TSMC 7nm.

    • @pieluver1234
      @pieluver1234 2 years ago

      Are they still using GF 12nm? I thought they moved to TSMC 10nm or 7nm for the IO dies

    • @mzamroni
      @mzamroni 2 years ago

      @@pieluver1234 Wikipedia even says 12nm for the desktop IO chiplet and 14nm for the server IO chiplet
      en.m.wikipedia.org/wiki/Zen_3

    • @tim3172
      @tim3172 2 years ago

      What is the actual power consumption of the IO die and why is it so much higher than the *checks notes* 6.6 watts AMD states it uses?
      Please let them know that they can get from 6.6 to 5 watts if they triple the cost of the IO die manufacturing process.

  • @meme-rp5ww
    @meme-rp5ww 2 years ago +1

    if the x86 front end is bypassed, hmmmm, ALPHA, 2/3 of engineers

  • @wayland7150
    @wayland7150 2 years ago +1

    Rome wasn't launched in a day.