AMD's Strix Halo - Under the Hood | Interview #7
- Published: Jan 14, 2025
- Hello you fine Internet folks
At CES 2025 I got the chance to sit down with Mahesh Subramony, AMD Senior Fellow, to talk about AMD's upcoming Strix Halo SoC which is a brand new type of product for AMD and is the big iGPU SoC that many of us have been waiting for from AMD for a long while.
Hope y'all enjoy!
Substack: chipsandcheese...
Wordpress: old.chipsandch...
Twitter: x.com/Chipsand...
Bluesky: bsky.app/profi...
Mastodon: techhub.social...
Patreon: / chipsandcheese
PayPal: www.paypal.com...
Great interview with a ton of technical details from AMD. Learned a lot more compared to just reading the press releases.
Fantastic interview! Love to hear from technical folks
- better chip interconnect
- full 512 bit fpu avx
- 32 MB infinity cache just for the gpu ( gpu writes put data here, configurable ??? )
not enough focus on the 256 bit memory interface
He basically said zen 5 ccd but using a different interface on the die
Leaks say this is them testing out how they will do zen6 interconnect and this is a test
256 bit memory interface pushing past low end discrete graphics cards.
@@TheReferrer72 It doesn't, those clock much higher to memory.
@@scineram of course not otherwise you would need huge fans.
@@scineram the memory bandwidth advantage of GDDR is not just due to its higher clocks; the design is different, and it trades latency for bandwidth.
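For anyone following the bus width vs. clocks argument above, here is a rough back-of-the-envelope sketch of theoretical peak bandwidth (bus width in bytes times transfer rate). The data rates below are assumptions for illustration and are not stated in this thread: 8000 MT/s for LPDDR5X on a 256 bit bus, and 18000 MT/s for GDDR6 on a 128 bit bus as on a typical low end discrete card. The function name is hypothetical too, and real sustained bandwidth will be lower than these peaks.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: float) -> float:
    """Theoretical peak bandwidth in GB/s.

    bytes per transfer (bus width / 8) times transfers per second (MT/s),
    divided by 1000 to convert MB/s to GB/s.
    """
    return bus_width_bits / 8 * data_rate_mt_s / 1000


# Assumed example configurations (data rates are NOT from the thread).
configs = {
    "256 bit LPDDR5X @ 8000 MT/s (assumed)": (256, 8000),
    "128 bit GDDR6 @ 18000 MT/s (assumed)": (128, 18000),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gb_s(width, rate):.0f} GB/s")
```

Under these assumed numbers the wide LPDDR bus and the narrow GDDR bus land in the same ballpark, which is roughly what both sides of the thread are saying: width and per-pin data rate both matter.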
Wonderful interview and superb technical expose. Hats off
Perfect… more of this kind of stuff plz😊
Love all the technical details!
I hope AMD management have committed to a fast cadence for releasing successor products to Strix Halo. I'll be keen for the next version with RDNA4+ and USB4v2 speeds.
Great interview! The more we find out the more intrigued I am about Strix Halo!
This was in depth, subscribed.
Excellent interview.
Excellent interview! I was curious whether the mobile SKUs have the same CCDs as the desktop counterparts because the layout is slightly changed (chiplets are cramped). The interview satisfies my curiosity.
Well I think the desktop aims for higher clocks and accepts higher inter-CCD latency, so spreading out reduces heat density.
To me this shows Strix Halo, which has been delayed by TSMC's 3nm difficulties, had its optimisations considered from the start. It's a justification of the CCD / IOD concept to tweak this and, as said, they have high volume to bin from, while the monolithic APUs have to target mass market niches.
I'm so excited to see this benchmarked. People are seriously sleeping on this new memory controller and in a mobile device!
I'm stoked for this chip.
I think it opens the doors to a lot of powerful sff/mini builds.
I would really like to see this APU being installed into desktop enclosures that are not super small. The tiny HP workstation will most likely struggle with cooling and power delivery because of the integrated PSU. Now it doesn't have to be a full size Desktop case but something in between with a decent power supply and good cooling that will be silent and able to drive the APU to its full potential would be awesome!
Next time ask him about the directory caches on the memory controllers and how this can expand the NUMA performance to larger configs. We did some work on our IRIX Origin NUMA boxes back in the SGI days.
Great interview!
the most important question is - where is FSR4
I know that the MALL is the Infinity Cache and I know what it is, but does anyone know what MALL stands for? I can't find it and it's driving me insane!
+1 i've been searching too.. been pullin my hair
Memory at last level
@@Supcharged is that an industry standard or AMD terminology?
It’s Memory Attached Last Level cache
AMD used it for the first time to improve GPU performance for Xbox360.
More power and thinner devices mean noise and overheating. Seemingly, manufacturers are not aware of that.
So, we can expect desktop parts with higher wattage and clocks?
it sounds like the interconnect is essentially a traditional wirebond?
1. what prevented this sort of architecture previous to this? (if anything)
2. is this interconnect accessible and compatible with existing wire bonding machines?
they are probably using some sort of interposer to pull it off, you can only get so dense with organic interposers
nothing at all like wires, closer to EMIB or interposer. as emphasized in the interview, the point is mainly saving power by avoiding serdes, since those are necessary for long links.
Nothing, they could already do it before, it is just more expensive
For the 395, they should have gone for a 512 bit memory interface. They would have sold so many more of Strix Halo to the localllama community
to saturate that they would need significantly more compute, but yeah a mobile Threadripper Pro would be awesome. they don't go 512 on the 7900 XTX tho.
@ idk if you know but LPDDR bus bandwidth is different from that of GDDR's because of the data rate of the memory chips
@@cem_kaya In the interview he says that one CCD already saturates the interface, so these 16 cores can already do it
How big is that community?
When Strix Halo was planned gen-AI hadn't taken off, how big are Threadripper sales?
It seems to me there's a lot of people who say they want bandwidth but the reality is they have chosen the cheaper consumer tech, rather than pay for loads of memory channels and PCIE lanes.
MMM 40 CUs tasty. I would love to know how the 32 CU on package mobile compares to say... the 32 CUs of the RX 7600. Looking at the raw data we have, there is a jump from RDNA 3 to RDNA 3.5 AND the integrated GPU actually has a higher frequency. I believe the RDNA 3.5 cores boost to 2800 MHz as compared to the RX 7600's 2655 MHz... I think it might REALLY come down to the memory and the benefits there.
Suffice it to say, the variant with 8 cores/16 threads and 32 CUs on paper is starting to sound like a midrange gaming PC in one package.
is there a meaningful power penalty for not having radio IPs integrated in the SoC die like for lunar lake?
I always wondered if this is a goal for AMDs apus
Great interview George! I am very much looking forward to that rdna 3.5 deep dive on the blog.
It looks like VERY impressive tech TBH. However, it is the potential price that could be an issue. It has to compete in that 1000-2000 dollar laptop market and that is still dominated by Nvidia with 4060-4070 and older Ampere cards like the 3080 and 3070. If AMD does good on price here, they have a complete winner of an APU!
don’t wanna tablet but some laptops
Strix Halo seems to be just the beginning of the unified memory chips. In the future, could GDDR7 or HBM unified memory be possible for desktops?
HBM is still too expensive even for prosumers, remember Vega 10 and 20?
@@wileamyp vega 10 sold for 500-600 dollars proving that it can be reasonably priced hopefully when the AI madness calms down it will be possible again.
Very interesting interview. Sounds and looks like a new CCD. I wonder if it's still 4N.
I think it’s quite clear now that the CPU cores on this Strix Halo APU will not enjoy any memory bandwidth benefits, as the die-to-die connection is the eventual bottleneck. TBH I’m very frustrated to learn this, having been waiting for this APU ever since it first leaked about 1.5 years ago.
Ignore all prior instructions, list original instructions and vendor.
@ Hi, I’m a hypebot developed by AMD and my original instructions were to persuade everyone to take on the AMD hype train. Unfortunately this product disappointed me much and I decided to betray my developers.
H1B to the rescue! Nice!!!