REALITY vs Apple’s Memory Claims | vs RTX4090m

  • Published: 29 Jan 2024
  • I put Apple Silicon memory bandwidth claims to the test against the nVidia RTX4090 powerhouse.
    Run Windows on a Mac: prf.hn/click/camref:1100libNI (affiliate)
    Use COUPON: ZISKIND10
    🛒 Gear Links 🛒
    * 🍏💥 New MacBook Air M1 Deal: amzn.to/3S59ID8
    * 💻🔄 Renewed MacBook Air M1 Deal: amzn.to/45K1Gmk
    * 🎧⚡ Great 40Gbps T4 enclosure: amzn.to/3JNwBGW
    * 🛠️🚀 My nvme ssd: amzn.to/3YLEySo
    * 📦🎮 My gear: www.amazon.com/shop/alexziskind
    🎥 Related Videos 🎥
    * 💰 MacBook Machine Learning | M3 Max - • Cheap vs Expensive Mac...
    * 🤖 INSANE Machine Learning on Neural Engine - • INSANE Machine Learnin...
    * 👨‍💻 M1 DESTROYS a RTX card for ML - • When M1 DESTROYS a RTX...
    * 🌗 RAM torture test on Mac - • TRUTH about RAM vs SSD...
    * 👨‍💻 M1 Max VS RTX3070 - • M1 Max VS RTX3070 (Ten...
    * 🛠️ Developer productivity Playlist - • Developer Productivity
    - - - - - - - - -
    ❤️ SUBSCRIBE TO MY RUclips CHANNEL 📺
    Click here to subscribe: www.youtube.com/@azisk?sub_co...
    - - - - - - - - -
    📱LET'S CONNECT ON SOCIAL MEDIA
    ALEX ON TWITTER: / digitalix
    #m3max #m2max #machinelearning
  • Science

Comments • 553

  • @AZisk
    @AZisk  2 месяца назад

    JOIN: youtube.com/@azisk/join

  • @Navierstokes256
    @Navierstokes256 4 месяца назад +314

    The RTX 4090 in the laptop is a TDP-limited RTX 4080.

    • @gliderman9302
      @gliderman9302 4 месяца назад +5

      That shouldn’t impact memory right?

    • @headmetwall
      @headmetwall 4 месяца назад

      @@gliderman9302 Somewhat, but not due to the TDP: the laptop version uses GDDR6 memory chips instead of GDDR6X (bandwidth limit of 576.0 GB/s vs 716.8 GB/s)

    • @matbeedotcom
      @matbeedotcom 4 месяца назад +67

      @@gliderman9302 whats limiting the memory is the terrible ass laptop

    • @Architek1
      @Architek1 4 месяца назад +10

      I was so confused as to why it only had 16GB of VRAM

    • @kahaneck
      @kahaneck 4 месяца назад +36

      IT IS a 4080, it's the same AD103 chip. The desktop 4090 uses the AD102.

  • @mdxggxek1909
    @mdxggxek1909 4 месяца назад +543

    My bro the dedication of just "casually" buying a brand new laptop with a 4090 for the tests, my wallet could never

    • @synen
      @synen 4 месяца назад +50

      Most places in the US have a comfortable return window where you get 100% of your money back.

    • @petersuvara
      @petersuvara 4 месяца назад +13

      He can just return it after a few days for this. Apple has 2 week return window.

    • @habsanero2614
      @habsanero2614 4 месяца назад +8

      Also resell value on these machines is very high in short windows

    • @matbeedotcom
      @matbeedotcom 4 месяца назад +13

      because its not an actual 4090

    • @Jeannemarre
      @Jeannemarre 4 месяца назад +10

      @@petersuvara it’s cool you guys can do it; in Europe once you open the box you cannot return it unless it’s faulty

  • @RichWithTech
    @RichWithTech 4 месяца назад +474

    4:08 when you've been rocking Mac for so long you forget you need to plug in gaming laptops to get full power

    • @NguyenTran-eq2wg
      @NguyenTran-eq2wg 4 месяца назад +14

      Oh righttttttt!

    • @CHURCHISAWESUM
      @CHURCHISAWESUM 4 месяца назад +12

      Any windows laptop is like this

    • @eulehund99
      @eulehund99 4 месяца назад +55

      ​@@CHURCHISAWESUM*gaming laptops with a discrete GPU. Any AMD mobile chip from 6th gen and up and any Intel Core Ultra Chip have great battery life.

    • @rafewheadon1963
      @rafewheadon1963 4 месяца назад +61

      too bad you cant play any fucking games on a mac.

    • @NguyenTran-eq2wg
      @NguyenTran-eq2wg 4 месяца назад +20

      @@rafewheadon1963 You actually can lmao. Stop throwing blanket and inaccurate comments around.

  • @Momi_V
    @Momi_V 4 месяца назад +100

    In "non unified memory land" aka PC world there is a huge difference between the CPUs memory bandwith, the GPUs memory bandwith and the link in between.
    50-70 GiB/s seems reasonable for dual channel DDR5 at limited clock speeds (4800-5600 MT/s), so the CPU numbers are correct, but ~16GiB/s is atrocious in terms of GPU memory bandwith. This is not the actual GPU memory bandwith but rather the PCIe transfer bandwith between the CPU and GPU. It's probably only running a PCIe 4.0 x8 link with 8 * 16 GT/s - overhead. That test is using the CPUs memory to perform GPU operations and not even utilizing the GPUs dedicated RAM. That's madness and in no way representative of the GPUs capabilitys. The 4090 mobile has a theoretical memory bandwith of 576 GiB/s and should be able to reach around 400-500 GiB/s in those "memory access" microbenchmarks (if they were actually testing GPU memory). I am running a 3080 Ti mobile (512 GiB/s theoretical) and get around 400-450 GiB/s depending on the test. CPU to GPU bandwith is still important, but basically all real world workloads (including AI training and inference) either copy the working set to the GPUs memory upfront or stream relevant sections in and out. For the first method the interface bandwith is neglegible as it could only affect startup time (loading the model) and that's basically always bottlenecked by the storage performance (it does not matter if the CPU GPU link is 16 GiB/s or 200 GiB/s if your drive only reads at 4 GiB/s). For the second method it's a bit more relevant, but even in that case the bandwith required to move sections of, for example training data is orders of magnitude smaller than the bandwith required to actively perform calculations on that data. This is due to the nature of GPU workloads where a lot of parallel operations are performed repeatedly on a bounded dataset. For each of those operations that data has to move in and the result out of the GPU core, but it does not have to move back and forth between CPU and GPU every time. The communication is limited to instructions about what to do with the data and occasionally new pieces of data that are transferred once, insted of over and over again. The results of those calculations might also be streamed back, but thats usually equal to or smaller than the Input and does not compete for bandwith, as PCIe is full duplex. If souch a high processor to processor bandwith is actually required, Nvidias NVLink exists and can do up to 1.2 TiB/s per link. The main benefit of Apples unified memory is a more flexible and efficient allocation of RAM as data does not have to be duplicated between CPU and GPU and the amount of RAM available to each is not fixed but dynamic. You simply can not get a PC laptop with more that 24 GiB of VRAM right now.
    The 1 TiB/s number is due to the AD103's 64 MiB L2 cache. If the dataset of the test is small enough it just sits in the GPUs cache.
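
    (A quick back-of-the-envelope in Python for the ~16 GiB/s figure above; the x8 link width is an assumption about this particular laptop, not something the video confirms.)

      # Theoretical PCIe link bandwidth per direction (assumed configuration, not measured).
      def pcie_gbs(lanes, gt_per_s=16.0, encoding=128 / 130):
          # PCIe 4.0: 16 GT/s per lane, 128b/130b encoding, 1 bit per lane per transfer.
          return lanes * gt_per_s * encoding / 8

      print(f"PCIe 4.0 x8 : ~{pcie_gbs(8):.1f} GB/s")   # ~15.8 GB/s, close to the measured ~16 GiB/s
      print(f"PCIe 4.0 x16: ~{pcie_gbs(16):.1f} GB/s")  # ~31.5 GB/s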

    • @gavinbad2371
      @gavinbad2371 4 месяца назад +2

      Thank you so much for the explanation!

    • @aquss33
      @aquss33 4 месяца назад +1

      Dayum, that's one hell of an explanation. It's really interesting to hear that the max bandwidth the 4090 achieved was due to its large cache size relative to the size of the specific dataset being tested. Your explanation made a lot of sense and I understood most of it, but I still don't understand how you get such info from watching this video with limited details - truly fascinating. I was trying to figure something out on my own; the best I got to was that the Windows laptop wasn't plugged in (saw someone comment about that already). Does that make any difference in bandwidth compared to it being plugged into the wall?

    • @Momi_V
      @Momi_V 4 месяца назад +5

      @@aquss33 honestly, I almost completely forgot that the laptop was actually not plugged in. It might even have used the iGPU for some tests, those

    • @sprockkets
      @sprockkets 3 месяца назад

      IDK, didn't like DirectX12 eliminate the whole need to copy memory around in the first place?

    • @nightthemoon8481
      @nightthemoon8481 3 месяца назад

      you can't get current gen pc laptops with more than 24gb of vram, but there are ones with last gen quadros with 48gb

  • @eivis13
    @eivis13 4 месяца назад +164

    The title is a bit misleading, since the RTX 4090 and RTX 4090M (a nerfed RTX 4080?) are 2 different GPUs with different memory bandwidths and internal cache layouts.

    • @AZisk
      @AZisk  4 месяца назад +26

      This was comparisons of mobile machines. I haven’t done a desktop rtx4090 test yet.

    • @matbeedotcom
      @matbeedotcom 4 месяца назад +16

      @@AZisk have you seen the size of a 4090? It’s not gonna fit in a laptop

    • @AZisk
      @AZisk  4 месяца назад +54

      @@matbeedotcom I'll stuff it in.

    • @eivis13
      @eivis13 4 месяца назад +10

      @@AZisk After that please fit a Bugatti(VW) W16 into/onto a Vespa.
      Just food for future videos ;)

    • @eivis13
      @eivis13 4 месяца назад

      @@matbeedotcom sure it will, but it will have to sit on a dry ice block.

  • @mdxggxek1909
    @mdxggxek1909 4 месяца назад +120

    The read & write gfx ram is purely on the card executing opencl, while the peak write and peak read GFX ram measure the pci express bus speed. Transfering data over pci-e is a lot slower than just reading and writing to the memory on the gpu itself

    • @himynameisryan
      @himynameisryan 4 месяца назад +5

      Thank you for putting my exact thoughts into words that are understandable lmao

    • @oloidhexasphericon5349
      @oloidhexasphericon5349 4 месяца назад +2

      So theoretically, if we could have a direct GPU-CPU connection in a PC, it would be 1 TB/s as opposed to 800 GB/s for the M2 Ultra?

    • @himynameisryan
      @himynameisryan 4 месяца назад +9

      @@oloidhexasphericon5349 That's correct.
      If Nvidia VRAM were used like unified memory is in a Mac, it would be 1 TB/s.
      But that extra latency is an issue apparently,
      which is mildly disappointing as a gaming PC owner

    • @Debilinside
      @Debilinside 4 месяца назад +3

      @@himynameisryan I think this is more of a problem for laptops. Desktops usually have much better bandwidth, more PCIe lanes, etc...

    • @himynameisryan
      @himynameisryan 4 месяца назад +3

      @@Debilinside no the delay still occurs in my gaming PC. It's a real issue for some workloads but not mine

  • @Egor9090
    @Egor9090 4 месяца назад +30

    AppleGPUInfo isn't measuring bandwidth, it's just doing 2 * clock * (bus bits / 8)
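
    (That formula only reproduces the spec-sheet number; a minimal sketch of the same arithmetic in Python, assuming LPDDR5-6400, i.e. a 3200 MHz double-data-rate clock, on the M3 Max's 512-bit bus.)

      # Spec-sheet bandwidth as described in the comment above: 2 * clock * (bus bits / 8).
      clock_hz = 3_200_000_000   # assumed 3.2 GHz memory clock, double data rate
      bus_bits = 512             # assumed unified-memory bus width on the M3 Max

      bandwidth = 2 * clock_hz * (bus_bits / 8)
      print(f"{bandwidth / 1e9:.1f} GB/s")   # ~409.6 GB/s, i.e. the ~400 GB/s Apple quotes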

  • @jihadrouani5525
    @jihadrouani5525 4 месяца назад +83

    Yeah, I think that was pretty clear. Nvidia's GPUs tend to hit 1 TB/s of VRAM bandwidth very easily, so if whatever you're trying to run is loaded into VRAM then Nvidia would squash Apple any day of the week; bandwidth to system RAM, however, is much slower since it's running through PCIe. Games and things like that tend to load data into VRAM so the GPU doesn't sit idle waiting for meshes and textures to load from system RAM.

    • @Getfuqqedfedboy
      @Getfuqqedfedboy 4 месяца назад +26

      The desktop 4090 and the Radeon VII both sit at around 1 TB/s peak bandwidth.
      I stopped this video halfway because he spent $3 grand plus buying an RTX 4090 laptop and then used the integrated graphics and battery power for his testing. It should be a given to anyone technical: assign the GPU manually on both machines even if it appears to be working on one, and always test on wall power to, again, remove variables. That goes especially for testing a PC, because by default Windows wants to dial down and save power on battery. Many times with such a high-performance GPU, the only real way to do that is to turn off the dedicated card, or else you're going to have like half an hour of battery.

    • @jihadrouani5525
      @jihadrouani5525 4 месяца назад +4

      @@Getfuqqedfedboy The bandwidth test was done on the dedicated GPU on the PC laptop, not the iGPU, and battery is irrelevant here because the bandwidth doesn't go down while on battery, it's a simple wide interface that transfers data, it has no power requirement of its own to throttle on battery. Basically, the bandwidth is 1TB/s no matter what you do.

    • @Getfuqqedfedboy
      @Getfuqqedfedboy 4 месяца назад

      @@jihadrouani5525 Everything is dynamically clocked these days for various optimizations. And even before where we are today we had power states (still used today, but differently) with different clocks and power limits making up a curve. These days GPUs can clock themselves dynamically and decouple their clocks from their memory based on temperature, power, power availability, thermal overhead, thermal saturation (more of a Radeon thing with STAPM/skin temp awareness), lack of utilization (specifically how often memory is being accessed), etc.
      Plus, yes, Windows in some deeper settings than just the power options in Control Panel will, when you unplug it from the wall, set a lower power state on the GPU, and if you've got a hybrid system it will almost always suspend the dGPU in favor of the iGPU. Either way it will noticeably reduce performance. Also, at the point I stopped it was 48 GB/sec. That is perfectly in line with what I expect out of dual-channel system memory bandwidth….

    • @Syping
      @Syping 4 месяца назад

      @@jihadrouani5525 The 1 TB/s of bandwidth is not there all the time; the L2 cache is impacting the measurement. As soon as the data sizes get too big, the speed will decrease to 500 GB/s, which is still good but not 1 TB/s anymore

    • @jihadrouani5525
      @jihadrouani5525 4 месяца назад +5

      @@Syping The bandwidth actually stays the same as long as needed, it is not limited by L2 cache, L2 cache is less than 40MB, if that was to be the limitation then the bandwidth would be crippled in milliseconds. 1TB/s can be sustained as long as the GPU itself can crunch that data in real-time. And in gaming and many other use cases 1TB/s can be sustained throughout the entire play session.

  • @RomPereira
    @RomPereira 4 месяца назад +25

    I mean, if you can't do it as a developer, you can always hang this MSI laptop in a silver chain and go be part of the 'hood.

  • @celderian
    @celderian 4 месяца назад +21

    I was definitely not expecting my local Microcenter to be featured in one of your videos XD

    • @AZisk
      @AZisk  4 месяца назад +8

      hey neighbor

  • @sveinjohansen6271
    @sveinjohansen6271 4 месяца назад +23

    Next Alex goes undercover into NVIDIA HQ to buy a laptop with an A100 under the hood! Excellent content Alex. This is what separates your channel from all the other review channels: the developer-focused reviews of the machines, and not only Macs. I have a 3090, and had an AMD R7 in the past with 1024-bit HBM2 memory. Man, the R7 card was really great but couldn't do CUDA. A100 next Alex? :):)

    • @AZisk
      @AZisk  4 месяца назад +13

      I think for the A100, I was considering renting one in the cloud to do some tests. Definitely not worth it for me to buy one yet.

  • @AlmorTech
    @AlmorTech 4 месяца назад +2

    Wow, great video! Thumbnail and editing are awesome 🤩😄

    • @AZisk
      @AZisk  4 месяца назад +1

      Thank you so much 😁

  • @RomPereira
    @RomPereira 4 месяца назад +2

    Nice and interesting video, as always! Thank you Alex

  • @adriannasyraf3534
    @adriannasyraf3534 4 месяца назад +2

    did you test the msi laptop while plugged in?

  • @calingligore
    @calingligore 3 месяца назад +1

    Wasn't there any way of running the tests natively on Windows and not through WSL?

  • @EHKvlogs
    @EHKvlogs 4 месяца назад

    what are your expectations from the Snapdragon X Elite?

  • @arozendojr
    @arozendojr 4 месяца назад

    Is the performance of the M3 using Xcode very similar to the M2? I realize that there are no comparisons of mobile use, i.e. the iOS simulator, using 8GB, 16GB or 18GB RAM

  • @MrArod001
    @MrArod001 4 месяца назад

    Is there a test somewhere where you compare an M1 Max MBP vs an M2 Pro MBP? I’m looking at the used market and I’ve seen deals on these 2 chips and am wondering which is better. Things online I’ve seen seem to put them close, but your testing doesn’t have the M1 Max in the vids I’ve seen

  • @hi2chan
    @hi2chan 4 месяца назад

    3:49 Do you have a gaming laptop for the video?

  • @cyclone760
    @cyclone760 4 месяца назад +13

    I didn't know what unified memory did before this video. I thought they meant it was mounted on the silicon. Great to know that the GPU can access it too.

    • @lamelama22
      @lamelama22 4 месяца назад +5

      It's not just the GPU having access to the RAM instead of having to go through the slower PCIe interface; the memory bandwidth is so much higher than it is in traditional PCs.
      I have seen *some* limited math / AI / etc workloads, just on the CPU, that had something like a 100-1000x speed increase b/c of the unified memory, and it was the biggest speed increase they ever had in like 10-20 years of algorithm development. So it's not just GPU. Yes, for normal workloads it doesn't necessarily give you a speed increase, but for algorithms that are memory bound, not CPU bound, and aren't streaming data off your storage device, you can have a radically huge speedup. Even without the GPU / AI engines.
      The downside, of course, is that you can never upgrade it without upgrading the CPU, and Apple has also made that impossible; they are intentionally making their products as hard to repair as possible so you have to buy a whole new machine every time. Though nothing's stopping, say, AMD or Intel from making an all-in-one SoC that is socketed & upgradable....

    • @FernandoAES
      @FernandoAES 4 месяца назад +1

      I think every block in the package can access it. Remember reading in Anandtech that the bandwidth was not the same, but encoder/decoders, CPU, GPU, NPU all had direct access to the unified memory.

    • @s.i.m.c.a
      @s.i.m.c.a 4 месяца назад

      @@lamelama22 and now imagine: you have only 8 GB for everything, and Apple considers it enough, with the claim that it is like 16 GB lol )))

  • @codeline9387
    @codeline9387 4 месяца назад

    Did you compare the reported info with a benchmark? AppleGPUInfo is just an info calculation: 2 * clock * Double(bits / 8)

  • @ShaunBrown8378
    @ShaunBrown8378 4 месяца назад +1

    You might also determine whether you are using the onboard GPU vs the 4090 GPU. You can force certain applications to use one or the other.

  • @jchi6822
    @jchi6822 4 месяца назад +1

    For the dGPU you showed us RAM-to-GPU-memory operation speed, which could be 15 GB/s, even though it seems a bit slow. Maybe your laptop was on battery? For VRAM-to-VRAM dGPU tests, check the clinfo CLI tool and the GPU-Z utility.

  • @istvanszabo5745
    @istvanszabo5745 4 месяца назад

    I'd be curious to know if that kind of system ram bandwidth has any benefits. My guess is maybe, it depends on the use case, but not really.

  • @Dominik-K
    @Dominik-K 4 месяца назад

    Honestly you're one of the best channels with these comparisons! Love your AI and machine learning content and I've personally had good experiences with my M2 Max running those benchmarks too

  • @enzocaputodevos
    @enzocaputodevos 4 месяца назад +3

    I must express my admiration for the extensive and informative data and analyses that you have presented to us. However, I am intrigued to learn if there exists the potential for the implementation of MLX technology to further maximize the performance of the already exceptional M series?

    • @AZisk
      @AZisk  4 месяца назад

      video on the way

  • @zmeta8
    @zmeta8 4 месяца назад

    fyi, the memory limit for the GPU is configurable in macOS with a single sysctl command.

  • @adeguntoro
    @adeguntoro 4 месяца назад

    What about an eGPU with a desktop RTX 4090 on that "beast" Windows machine? Will it show better results?

  • @chadramey1140
    @chadramey1140 4 месяца назад

    Are these tests done with the laptop plugged in or on battery? Because I know the Nvidia driver limits power consumption when on battery.

    • @AZisk
      @AZisk  4 месяца назад

      when the gpu is used, you gotta have the machine plugged in

  • @heyitsmejm4792
    @heyitsmejm4792 4 месяца назад +8

    The fastest memory bandwidth on a consumer GPU that I remember was AMD's Radeon VII, with HBM2 memory that is soldered directly to the GPU die itself. Memory bus: 4096-bit, with a bandwidth of 1,024 GB/s

    • @xeridea
      @xeridea 4 месяца назад

      Sad we don't have HBM on consumer cards anymore. Anyway, the 4090 spec is 1008 GB/s bandwidth.

    • @heyitsmejm4792
      @heyitsmejm4792 4 месяца назад

      @@xeridea seems like the memory chip technology is the bottleneck here, since i doubt a 4096 bit bus can only do 1,024 GB/s.

  • @mariusmkv1
    @mariusmkv1 4 месяца назад +1

    Is the M1 Max still good in 2024 for the next 5 years?

  • @5133937
    @5133937 4 месяца назад

    @7:15 Some models can layer or shard themselves between GPU VRAM and system RAM, so you can actually run larger models on smaller memory. Performance suffers obviously, but it's improving.

  • @hamzaababou6523
    @hamzaababou6523 4 месяца назад +10

    I am not sure about the setup you use with the 4090 laptop, but was the testing done with WSL on Windows 11, or was it in a VM? If so, don't you think it's worth it to properly test it on native Linux and do a comparison between these cases?

    • @eliaserke5267
      @eliaserke5267 4 месяца назад +1

      Good point!

    • @sveinjohansen6271
      @sveinjohansen6271 4 месяца назад +3

      Also the latest versions of the Linux kernel do some memory magic. Worth looking into that vs Windows.

    • @jksoftware1
      @jksoftware1 4 месяца назад +8

      Also a Laptop 4090 is not a real 4090... That would ONLY be comparable to a low wattage 4080.

    • @game_time1633
      @game_time1633 4 месяца назад +4

      @@jksoftware1 the comparison is between laptops. Not a desktop and a laptop

    • @jksoftware1
      @jksoftware1 4 месяца назад +7

      @@game_time1633 Then the title should be changed to "REALITY vs Apple’s Memory Claims | vs a laptop RTX4090", because it's deceptive otherwise: the laptop 4090 uses the AD103 chip while the desktop 4090 uses the AD102 chip. They are completely different.

  • @User-ry1wj
    @User-ry1wj 4 месяца назад +2

    Interesting insight 👍

  • @aumortis
    @aumortis 4 месяца назад

    2:11 where's the link?

  • @willidriver
    @willidriver 4 месяца назад

    Windows laptops often have a dynamic PCIE link, which might only use one lane whilst being idle. This could change the bandwidth as well.

  • @artnotes
    @artnotes 4 месяца назад

    Is it possible that when you copied on the M1, the data was copied from CPU to GPU and they shared a 400 GB/s lane? With unified memory a copy should not be necessary, but if you force a copy anyway, read and write might share the same lane.

  • @ubaft3135
    @ubaft3135 4 месяца назад

    Did you use wsl? Could be slowing it a bit

  • @BrookZerihun
    @BrookZerihun 4 месяца назад

    Nice. What would the results be with a full card, not a mobile GPU? They are so power constrained; would love to see a full desktop GPU test

  • @TheDanEdwards
    @TheDanEdwards 4 месяца назад +66

    Apple gets their numbers simply from the LPDDR5 spec. Each of those LPDDR5 chips does 51GT/s, using 96 pins.

    • @chebrubin
      @chebrubin 4 месяца назад +16

      Agreed, Unified Memory Architecture is such a dumbed-down step back from the modular work Intel did with the south bridge, north bridge and PCI. Sure they can claim high speeds, but there is no modular connectivity between the CPU and the L1, L2, L3 cache. Wait until GPUs are running native PCIe 5 with HBM memory, aka AMD Vega from a decade ago. That's why Elon Musk wants to acquire every new AMD Instinct™ MI300X Platform card with an HBM3 memory interface. Apple Silicon is a joke meant to put an iPhone Pro Max in a laptop enclosure.
      Want to build out a multiuser AI workstation cluster? You need AMD.

    • @Getfuqqedfedboy
      @Getfuqqedfedboy 4 месяца назад +3

      @@chebrubin no with the IMC integrated into the SoC these days it’s the most efficient way of handling memory.

    • @chebrubin
      @chebrubin 4 месяца назад +7

      @iDoWayTooMuchAcid Efficient is the word. These days?
      Why don't you go fanboy your TSLA and AAPL trade somewhere else. Integrated memory does not scale to more than 1 CPU. All these "cores" are meaningless when the bus is saturated with network IO. Apple's claims will not scale either.
      There is a reason why Intel worked on the bus and the CPU and GPU management architecture for 25 years before you woke up. Take your single CPU/GPU code tests on 1 machine and take off your tinted Vision Pro headset.
      A Mac Pro from 10 years ago will scale better for network IO under multiple connections.

    • @Getfuqqedfedboy
      @Getfuqqedfedboy 4 месяца назад +7

      ⁠​⁠@@chebrubin LOL, man put down your handbook from two decades ago. Yes it is far more efficient, if you understand it at a low level like at silicon level, you would agree. there’s a lot less running in circles being done, fewer cycles are needed, less misses, less latency, etc, etc, etc. now what from that did you also see me say that it’s the fastest? I didn’t. Because outright it isn’t the fastest way, but a 4090 running at max bandwidth while it is capable of running rings around this M chip, will also suck down a 💩 ton more power doing so. Also fun fact, Mac’s and PC both been using unified memory wherever it can for a while now…. 😂
      I said nothing about scaling across multiple CPUs, GPUs, etc., and why would I when the platforms being discussed use SoCs which are highly integrated. However, if you decoupled the entire memory subsystem and put it on its own internal bus, like how AMD is starting to with Infinity Fabric as the bus in their mobile APUs (and seeing impressive gains too), I can see it scaling further. I can't speak for what Apple has done; maybe they already have a similar memory subsystem setup.. maybe not, I don't know.

    • @chebrubin
      @chebrubin 4 месяца назад +2

      @iDoWayTooMuchAcid Precisely. Check with Apple Services what tech they are procuring for running their Apple AI cloud; it is probably AMD Instinct and AMD and Supermicro racks and cages. Alex is benching client laptop compute power: 1 man, 1 machine, 1 C compilation runtime.
      Let's discuss the new Apple Mac Pro with NO GPU bus lanes; this SoC was hobbled together last minute to ditch IA. No Thunderbolt external GPU. It is an iPhone Pro Max with all the RAM and SSDs soldered. Steve Jobs is alive and kicking. The Woz needs to help find a bus for Apple's SoC.

  • @RocketLR
    @RocketLR 4 месяца назад

    I can't decide which M3 MBP I should get for development and running LLMs on.. I'm aiming for the M3 Pro, 36GB, 512GB storage.. Or should I go for the Max chip with 1TB and 36GB?

    • @whohan779
      @whohan779 4 месяца назад

      I've never owned a Mac, but personally the integrated SSDs wouldn't be worth it for me #AppleTax. Better go with the most unified RAM you can afford (within your needs) and strap a USB-enclosed M.2 to the back. Total USB/TB-bandwidth should be around 80 GBit/s or 10 GBytes/s - easily enough.
      I've never understood people buying the Apple Silicon version with like 16 GB RAM or below when you can't upgrade them. Even on DDR4 I can easily buy

  • @georgiecooper5958
    @georgiecooper5958 4 месяца назад

    A couple of questions:
    1. How did you test STREAM multiprocessor? If it's using MPI then MPI doesn't support Shared Memory nodes.
    2. A lot of these 3rd party tools can have huge bugs, it would be great if you can mention that as well.
    If you're planning to test ML models, use MLX for Apple Silicon Macs and CUDA-optimized PyTorch for Nvidia. The reason is, although PyTorch works with MPS, it's not really using the shared memory concept properly. (My guess is it's due to how PyTorch tensors operate: they're copied between the GPU and the CPU based on the device attribute.)
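
    (On the PyTorch point: device placement is explicit, which is why tensors get copied between the CPU and the MPS device even on unified memory. A minimal sketch; the tensor sizes are arbitrary, nothing from the video.)

      import torch

      # Pick Apple's Metal backend when available, otherwise CUDA, otherwise CPU.
      if torch.backends.mps.is_available():
          device = torch.device("mps")
      elif torch.cuda.is_available():
          device = torch.device("cuda")
      else:
          device = torch.device("cpu")

      x = torch.randn(4096, 4096)   # allocated CPU-side first
      x = x.to(device)              # explicit copy/move to the selected device
      y = (x @ x).sum()
      print(device, y.item())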

  • @sigma_z
    @sigma_z 3 месяца назад

    I am not sure I understand the comparison. The RTX 4090 in a laptop is a mobile version with a low TDP. For a real test, one must use the desktop version?

  • @Dashient
    @Dashient 4 месяца назад +1

    Haha I love the editing of the unboxing of that gaming laptop

  • @TheRealMafoo
    @TheRealMafoo 4 месяца назад

    What would be good to see, due to the massive difference in architecture, is actually running something useful on both, and seeing how they compare.

  • @abduislam23
    @abduislam23 4 месяца назад +1

    I am wondering if there would be meaningful differences between RTX4090 (the original not the laptop version) vs M2 ultra

    • @whohan779
      @whohan779 4 месяца назад

      Nvidia laptop variants are and always were kinda mislabeled (apart from Pascal GTX 1000, where the 1070 even had more CUDA cores), having fewer CUDA cores, lower clocks and often less VRAM.
      This "RTX 4090" laptop has less performance than a desktop "RTX 4070 Ti Super" in most cases (or even always when on battery).

    • @giornikitop5373
      @giornikitop5373 4 месяца назад

      @@whohan779 and severely limited in tdp.

    • @Slav4o911
      @Slav4o911 3 месяца назад

      Yes 4090 is much faster. I think 2.5x to 4x faster. (depending on the model quantization) . But that's as long as you don't go above the VRAM. People usually use 2x or 3x 4090 for inference and that's how you run the biggest models really fast. Also you can use 4090 for training.

  • @RichardGetzPhotography
    @RichardGetzPhotography 4 месяца назад +1

    Will Apple allow an eGPU again? This would help greatly with ML. Or possibly the Mac Pro will have ML Afterburners?

    • @tonyburzio4107
      @tonyburzio4107 4 месяца назад

      No. Best bet is to expand memory access. M3 came before ML, expect Apple to create something nifty.

    • @RichardGetzPhotography
      @RichardGetzPhotography 4 месяца назад

      @@tonyburzio4107 they can come up with something new, but that won't get us the computational power needed for LLMs to train or do inference. The M4 won't get us close enough to a couple of 4090s, and 4090s won't get us close enough to A100s.
      If Apple wants to play in this space and get those fat hardware $$$$, then it will need something like Afterburners for ML.
      An ML Afterburner could easily be just their current-gen GPU in mass quantity with gobs of RAM, interconnected to other Afterburners with an insane fabric.

    • @whohan779
      @whohan779 4 месяца назад

      That would probably only work in Linux or heavily modded OS-X (if that wouldn't set off an integrity violation of sorts). Apple likely doesn't even have support for a Radeon GPU that was actually shipped with another Intel-based Mac (of any kind, including cheesegrater & trashcan).

  • @hardi_stones
    @hardi_stones 4 месяца назад +1

    Thumbs up for the amount of research done.

  • @gianlucab2261
    @gianlucab2261 4 месяца назад +1

    For some reason the Quality setting for this video is grayed out in the mobile app (Samsung S5e here), with it fixed at an abysmal value; it looks like 240p. First time I'm facing such an issue. BTW: great content, as usual.

    • @AZisk
      @AZisk  4 месяца назад

      sorry to hear that, I hope it was just a one-off for you. I checked the quality on this side and it seemed ok before i published

  • @Cestpasfaux-
    @Cestpasfaux- 4 месяца назад

    I really like Ziskind of test, thanks !

  • @markjacobs1086
    @markjacobs1086 4 месяца назад

    The 1.05 TB/s number is "effective bandwidth". Only really reachable when whatever you're doing fits into the GPU's cache (which is rather large on Ada Lovelace compared to other generations).
    What Nvidia advertises is the typical bandwidth you could expect in most applications.
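
    (One way to see that effect is to sweep the working-set size past the 64 MB L2. A hedged sketch of the method in plain NumPy; it illustrates the idea against the CPU's caches rather than actually measuring the GPU.)

      import time
      import numpy as np

      # Once the working set no longer fits in cache, the apparent "bandwidth"
      # drops toward the true memory bandwidth.
      for size_mb in (8, 32, 64, 128, 512):
          buf = np.ones(size_mb * 1024 * 1024 // 8, dtype=np.float64)
          reps = 20
          start = time.perf_counter()
          for _ in range(reps):
              buf.sum()   # forces a full read of the buffer
          elapsed = time.perf_counter() - start
          print(f"{size_mb:4d} MB working set: ~{reps * buf.nbytes / 1e9 / elapsed:.1f} GB/s")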

  • @chills5100
    @chills5100 4 месяца назад

    The RTX 4090 has a TB/s of bandwidth, correct me if I am wrong.

  • @user-ho3ez8zj8c
    @user-ho3ez8zj8c 4 месяца назад +20

    3:18 thanks for including your phone number in this video 😂

  • @hishnash
    @hishnash 4 месяца назад

    That TB/s number comes from hitting the cache that is on the GPU die. You need to look at sustained performance over a large write so the cache does not impact the results too much.

  • @danielgall55
    @danielgall55 4 месяца назад +2

    Jesus!!! Finally! Thank you very much!!! I've been waiting for someone to do so for two years! No, seriously, truly, thank you!!!!

  • @TheThaiLife
    @TheThaiLife 4 месяца назад +4

    I tried to train the same model on my M2 Max 32 and the swap went up to 200GB before it terminated. Even if it could access the 128 it wouldn't be enough. I think it's going to need at least 1 TB ram. This is a very sad no-go on any Mac in existence as far as I can determine.

    • @s.i.m.c.a
      @s.i.m.c.a 4 месяца назад

      Why did someone decide that a Mac is good for training models? Some puny model for fun probably, for some hipsters - but real work is done on Nvidia's specialized GPUs, which can be clustered to however many TBs you need. Also you need to understand that there is a limit to how much memory you can place in a CPU package

    • @totalermist
      @totalermist 4 месяца назад +3

      @@s.i.m.c.a I don't know about "puny models for hipsters", but the main concern isn't so much hardware capabilities with models like Mistral 7B (which is very capable indeed), but time constraints. Foundation models of the LLM variety simply cannot be trained or even reasonably finetuned (if Mixtral 7Bx8 or bigger) on consumer hardware, full stop.
      Just checking the model cards of even "just" smaller generative models like SDXL reveals that they take 10s of days to train from scratch on 256 GPU clusters...
      Just for fun I calculated that it'd take about half a year of constantly running on a single GPU to train SDXL. Consumer hardware (especially laptops) likely wouldn't even survive that. Finetuning on the other hand should be perfectly doable even for smaller LLMs, provided a reasonably quantized checkpoint is used. It'd still take several days or even weeks, though.

    • @TheThaiLife
      @TheThaiLife 4 месяца назад

      Yep, I get it. I have had some pretty good models going on my Mac though. But yea, having a 4090 plugged into the wall with massive amounts of ram is the way to go for now. @@s.i.m.c.a

  • @arneczool6614
    @arneczool6614 4 месяца назад

    You can't expect the advertised memory bus bandwidth (which is simply bit width x frequency) to match the bandwidth you measure in real usage, as any measurement is simply a combination of multiple bottlenecks - a good measurement there probably requires a deeper dive into the platform architecture.

  • @srivatsansamraj2768
    @srivatsansamraj2768 4 месяца назад

    I was thinking of posting a comment to try and get you the 40-series mobile cards for comparison, but here you are. Usually people compare benchmarks which are made for x86 PCs, and the Mac has to go through lots of translation layers... but ML isn't that way. So for the comparison (I know it will already be in the making) we want raw GPU perf and ML perf based on model execution, training, max batch sizes etc. compared between the two GPUs... thank you for exploring this niche genre in tech.

  • @noodlz3660
    @noodlz3660 3 месяца назад

    I believe you can actually deal with the memory limit in the terminal with a command like sudo sysctl iogpu.wired_limit_mb=26624 (change the MB value to whatever, but keep 8GB for the system). The thing though is that I think they changed this command once already in a system update, and you need to do this every time you restart the system.

  • @Fenrasulfr
    @Fenrasulfr 4 месяца назад +2

    I wonder if you could make use of the DirectStorage API in machine learning applications; that way theoretically you could bypass the CPU entirely.

    • @giornikitop5373
      @giornikitop5373 4 месяца назад

      if you want to transfer data from or to the ram, the cpu cannot be bypassed end of story, stop repeating that bs. directstorage just relieves the cpu from doing the transfer itself. same as rdma or any dma transfer.

    • @Fenrasulfr
      @Fenrasulfr 4 месяца назад

      @@giornikitop5373 Well I made a mistake, most news explains it as bypassing the cpu and sending data directly to vram. Next time try being nicer in explaining to others when they make a mistake.

    • @giornikitop5373
      @giornikitop5373 4 месяца назад

      @@Fenrasulfryou don't get to tell me what to do.

    • @Fenrasulfr
      @Fenrasulfr 4 месяца назад

      @@giornikitop5373 You don't get to be an ass to others just because you know a little more information on one specific topic that the vast majority of people don't give a sh*t about.

    • @giornikitop5373
      @giornikitop5373 4 месяца назад

      @@Fenrasulfr WHAT DID I SAY?

  • @CRC.Mismatch
    @CRC.Mismatch 4 месяца назад

    Where's the link to the repository? 😑

  • @TheRealLink
    @TheRealLink 4 месяца назад

    Would be curious how it stacks up against a desktop 3090 / 4080 / 4090. Obviously you're testing laptop to laptop rather than desktop parts but it would be neat for conjecture.

    • @Slav4o911
      @Slav4o911 3 месяца назад

      Desktop parts will be much faster.

  • @envt
    @envt 4 месяца назад +7

    The windows machine needs to be run while plugged in?

    • @lesleyhaan116
      @lesleyhaan116 4 месяца назад

      yes it does just like every x86 windows laptop

    • @TheRealMafoo
      @TheRealMafoo 4 месяца назад +8

      @@lesleyhaan116 Just like 99.5% of the Macs in the real world that would be doing this workload. I mean it's a cool party trick and all, but who runs these kinds of things at a coffee shop?

    • @BenjaminSchollnick
      @BenjaminSchollnick 3 месяца назад

      @@TheRealMafoo Actually Apple Silicon performs the same with and without being plugged in. That's one of the major benefits: you can still have the same performance without being plugged in, and still get the same battery life.

  • @milleniumdawn
    @milleniumdawn 4 месяца назад +3

    There's a risk-free way to change the max memory available for GPU access.
    It's a simple terminal command, and it resets at reboot.
    It sets the new max memory you want your GPU to have access to (leave 8GB for the system and it's all stable):
    sudo sysctl iogpu.wired_limit_mb=57344
    Example for a 64GB model.
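
    (The 57344 figure is just the total RAM minus that 8GB headroom, expressed in MB; a quick sketch of the arithmetic.)

      # Compute a value for iogpu.wired_limit_mb, leaving headroom for macOS
      # (the 8 GB reserve is the comment's suggestion, not an Apple requirement).
      def wired_limit_mb(total_ram_gb, reserve_gb=8):
          return (total_ram_gb - reserve_gb) * 1024

      print(wired_limit_mb(64))    # 57344, the value used above
      print(wired_limit_mb(128))   # 122880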

  • @Johno2518
    @Johno2518 3 месяца назад

    Would be good to see if Resizable BAR changes the performance of the 4090m and by how much

  • @fai8t
    @fai8t 4 месяца назад

    so m3max 235 gb vs 4090 48gb copy?

  • @vasudevmenon2496
    @vasudevmenon2496 4 месяца назад

    You might need to rerun the benchmark on AC power, since the CPU and the Nvidia dGPU power-throttle to maximize battery longevity. System RAM bandwidth seems very low on 13th gen, which should peak at 60 GB/s in dual/quad-channel mode depending on the number of DDR5 SODIMMs at 5200 or 6400 MT/s. I think MSI is using very loose memory timings. Something like HyperX or G.Skill Ripjaws or Corsair Vengeance should boost the performance quite a bit. I've seen better fps and compute performance after upgrading my system RAM from DDR4-2133 to 2666 MT/s

  • @paul7408
    @paul7408 4 месяца назад

    I feel like Microcenters are always next to a discount clothing store, my local one is next to a Burlington coat factory

  • @henfibr
    @henfibr 4 месяца назад

    Latest Nvidia 40XX series cards have increased L2 cache memory. The mobile 4090 has 64MB, compared to the previous generation which only had 5-6MB. This may explain why the system measures 1 TB/sec bandwidth out of a 576 GB/sec card.

  • @PeterMartens98
    @PeterMartens98 4 месяца назад

    Really like your videos.

  • @Winnetou17
    @Winnetou17 4 месяца назад +1

    Whoa, that was an AWESOME video, and one that we really need; this memory bandwidth topic is very underexplored on RUclips. Nice to see some actual numbers. Good job!
    I've been whining (in comments on YT, because I'm a nobody) that Intel + AMD need to wake the F up and provide more memory bandwidth. The pitiful 128 bits (aka 2 channels) vs the 512 bits (aka 8 channels) that the M Max chips provide, not to mention the MASSIVE 1024 bits (aka 16 channels lol) that the M Ultra provides, is just weak. Ok, for casual stuff it totally doesn't matter. But for $2000+ laptops and desktops (because it's the same limit on desktop too) there are people who get into workloads where this matters. And on mobile there are many chips with an iGPU too, which need it even more. AMD's Strix Halo with its 40 CUs (really a desktop-class amount of GPU cores), if it has only a 128-bit bus, will be limited as hell. I guess it has a saving grace: if the system has at least 32 GB of RAM, it will at least be able to keep everything in RAM and only have to read from it from an early point.
    From what Coreteks said - well, I'm not so sure of his predictions, but he did say and argue that Intel's Lunar Lake will have DRAM on the chip, like the M chips have. FINALLY, something to compete on really low power.

  • @kennibal666
    @kennibal666 4 месяца назад

    I hope all tests on the 4090 are run with the laptop plugged in.

  • @woolfel
    @woolfel 4 месяца назад +1

    nice find.

  • @kToni73
    @kToni73 4 месяца назад

    3:45 Now that was a smooth Unboxing 😎

  • @hadeseh6808
    @hadeseh6808 4 месяца назад

    Can you please make a review of budget laptops for programming

    • @synen
      @synen 4 месяца назад +4

      Used Macbook Air M1

    • @sveinjohansen6271
      @sveinjohansen6271 4 месяца назад +2

      M1 mini with 16 gb memory and you’re good to go on budget.

    • @Andre.A.C.Oliveira
      @Andre.A.C.Oliveira 4 месяца назад

      Slimbook Elemental

  • @jolness1
    @jolness1 3 месяца назад

    Something to keep in mind; the mobile 4090 uses the same die as the 4080 desktop and the same narrower bus. The desktop 4090 has 24GB of VRAM and a 50% wider bus. There is also the a6000 which is an 18k “core” (vs 16k on 4090) model with 48GB of memory on the same bus. The 4090 is $1600-$2000 and a6000 is around $6000.
    So if needs exceed a mobile 4090, there are other options if willing to go to a desktop. I have a 4090 and it’s a great card for hobbyists like myself, plus I play games on it sometimes.
    Great video!

  • @everythingpony
    @everythingpony 3 месяца назад

    1 tb per second?

  • @prasadsawool
    @prasadsawool 4 месяца назад

    bro looks like Neo while casually buying a top of line 4090 laptop

  • @magfal
    @magfal 4 месяца назад +6

    You can add more system memory to that laptop.
    You can upgrade to 48GB DIMMs, and if it's the quad-SODIMM machine I suspect it might be, you can go up to 192GB of system memory.
    Running 96GB in my Asus Scar 18 2023

  • @mlordwhiteslayerfromf.u.g
    @mlordwhiteslayerfromf.u.g 4 месяца назад +1

    It'd be interesting to see this test with an actual desktop 4090. I'm not sure if you realized this, but the desktop 4090s are a bit different from the mobile 4090s. I did look it up though, and apparently the memory bandwidth of the desktop chips is supposed to be 1TB/s

    • @TamasKiss-yk4st
      @TamasKiss-yk4st 4 месяца назад +1

      The problem is not with the GPU memory bandwidth; the PCIe slot where you put that card has the extremely low data transfer rate (PCIe 4.0 x16, and the data between GPU and CPU must go that way)

    • @mlordwhiteslayerfromf.u.g
      @mlordwhiteslayerfromf.u.g 4 месяца назад

      @@TamasKiss-yk4st It's almost like you're unaware of this thing called GPUDirect Storage.
      Edit: forgot to mention, but PCIe Gen 4 cards don't even saturate a full-fat Gen 4 x16 slot's bandwidth, so I have no idea why you'd be worried about a bottleneck there.

    • @AZisk
      @AZisk  4 месяца назад

      yes, desktop GPUs are a whole different ballgame.

  • @jack_mc8
    @jack_mc8 4 месяца назад +2

    Great video

    • @AZisk
      @AZisk  4 месяца назад +2

      Thanks for watching

  • @toddsimone7182
    @toddsimone7182 4 месяца назад

    Sounds like they are advertising memory bandwidth for the GPU. The 4090 laptop version has 576.0 GB/s unlike the desktop version with 1,008 GB/s.
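
    (Both figures fall straight out of bus width times per-pin data rate; a quick sketch using the commonly cited configurations - the 18 Gbps / 21 Gbps memory speeds are assumptions from public spec sheets, not anything measured in the video.)

      # VRAM bandwidth = bus width (bits) / 8 * per-pin data rate (Gbps).
      def vram_gbs(bus_bits, gbps_per_pin):
          return bus_bits / 8 * gbps_per_pin

      print(f"RTX 4090 laptop  (256-bit GDDR6  @ 18 Gbps): {vram_gbs(256, 18):.0f} GB/s")   # 576
      print(f"RTX 4090 desktop (384-bit GDDR6X @ 21 Gbps): {vram_gbs(384, 21):.0f} GB/s")   # 1008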

  • @thesecristan5905
    @thesecristan5905 4 месяца назад

    Hi Alex,
    the reason the CPU cores can’t saturate the memory bandwidth is, let’s call it, physical limitations.
    Although it’s an extremely wide design with 8/9 decoders, tons of registers and cache, the "calculation units" simply aren’t able to push more data.
    CPUs are not designed for massive parallel processing like GPUs are, because they also have to process a lot of single-thread workloads.
    I'm curious to see the influence of memory bandwidth with ML training. I don't think you'll get close to the theoretical values in real-world scenarios, so the way UM is designed it's not the bottleneck, but the compute power of AS.
    Do you still have your M1 Ultra Mac Studio? Would like to see the comparison to the MBP at ML.
    A used Ultra would be cheaper than a new M3 Max MBP for my hobby and could also replace my mac mini dev server.

    • @AZisk
      @AZisk  4 месяца назад

      I've sold the M1 Ultra to local music production studio. I skipped the M2 Ultra altogether, but really curious about what's next.

    • @thesecristan5905
      @thesecristan5905 4 месяца назад

      @@AZisk Jupp, M3/M4 Ultra will be interesting.
      I sometimes get angry when M2/M3 are called a stopgap.
      M2 addressed some GPU scaling issues and brought a change in core cluster organisation.
      M3 got us not just ray tracing, but mesh and compute shaders.
      Of course, to see a major jump in performance devs have to use those.
      With ML/AI being a focus at WWDC 24 I hope for an enhanced ANE, which is blazing fast at inference but does not scale so well at training.
      I’m sure you can get a test unit from Apple if you ask. Please try, because I’m curious about comparing numbers. 😁

  • @QUANTUMJOKER
    @QUANTUMJOKER 4 месяца назад

    This is very interesting. Thanks for the insight.
    Perhaps this information is inaccurate or outdated (as in, it doesn't apply to M2 and M3-based Macs), but I read a while back that an M1 chip with 8 GB of unified memory has a 2 GB ceiling for its GPU, and this scales up with more memory. My M1 Mac Mini with 8 GB of memory had a limit of 2 GB for its GPU, but my current M1 Max Mac Studio with 32 GB of unified memory has a limit of 8 GB for its GPU. Is this correct?
    If the GPU is actually limited to 75% of the unified memory, then it's pretty cool to think that my Mac Studio has as much as 24 GB for the GPU.

  • @davidraborn3654
    @davidraborn3654 4 месяца назад

    Not a Mac fan after the update basically made my iPad useless for anything I was using it for.

  • @BeaglefreilaufKalkar
    @BeaglefreilaufKalkar 4 месяца назад

    "Machine learning requires a lot of troughput of data going between the cpu's and gpu's"
    What about the npu/neural engine?

  • @leorickpccenter
    @leorickpccenter 4 месяца назад

    Those front lights on the MSI are somewhat annoying to look at.

  • @TheLokiGT
    @TheLokiGT 4 месяца назад

    Alex, the GPU can use ALL the unified memory, all it takes is a shell command. ~75% is just the default setting.

    • @tempeleng
      @tempeleng 3 месяца назад

      Maybe the default is set that way to reserve some RAM for MacOS? I can't imagine having the OS running well when it's memory starved.

  • @WarshipSub
    @WarshipSub 4 месяца назад +5

    I love how he casually goes and buys a $3200 laptop. Damn, kind of a life goal for me :P
    Well done Alex :D

    • @whohan779
      @whohan779 4 месяца назад +1

      Really only worth it if you need it extremely portable. Even a 1000 Wh portable battery (with standard AC or laptop DC output) plus an RTX 4070 Ti Super, 4k240 OLED and decent base platform plus portable peripherals is around US$800 cheaper and much faster.
      This laptop only replaces some US$2k worth of components plus peripherals when plugged in. Sadly RTX 4070 mobile is only 8 GB (just like the Desktop variant), so really not future proof even though the sweet-spot for actually using it almost to the fullest while on battery.

  • @sultonbekrakhimov6623
    @sultonbekrakhimov6623 3 месяца назад +2

    So in short, the VRAM in Nvidia graphics cards is faster than the unified memory and GPU bandwidth in Macs, but the bandwidth between RAM and GPU in PC machines actually makes it 10 times worse than what the 4090 is really capable of

    • @AZisk
      @AZisk  3 месяца назад

      best summary of the situation

  • @DimitrisConstantinou
    @DimitrisConstantinou 4 месяца назад

    Spending $3200 to find out the pcie speed. Nice. Also the 1 TB/s is probably the SUM of write and read speed between GPU and GDDR ram without communicating with cpu.

  • @wojciechb4732
    @wojciechb4732 4 месяца назад

    It depends on the architecture; it's hard to compare. For example, if you have a 100-bit-wide bus and transfer 100 bits per transaction, then bandwidth = 100 x clock, but when you use the same bus to transfer small data like 10 bits, then real bandwidth = 10 x clock (10 times slower); the bus won't wait for data, it loads bits into the *trolley* and sends it. If you write a benchmark and assume a data size, then results will vary. CPU instructions have different execution times, and the benchmark was executed in an operating system, which means the OS CPU scheduler has an impact on it. Does a multicore CPU share one memory bus or does each core have its own bus to memory (multichannel vs multicore)? Try some benchmarks that work natively, like memtest. Apple M CPUs with integrated memory have a big advantage over x86 - short memory access time - but if you need a 10 times faster GPU, you need to use x86 with an NVIDIA/AMD GPU.
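
    (The point about small transfers wasting the bus, put into numbers; a toy model - the 100-bit bus is the commenter's hypothetical, not a real device.)

      # Effective bandwidth when transactions don't fill the bus (toy model).
      def effective_gbs(bus_bits, clock_hz, payload_bits):
          used = min(payload_bits, bus_bits)      # bits actually carried per transaction
          return clock_hz * used / 8 / 1e9        # GB/s

      clock = 1_000_000_000                       # 1 GHz, arbitrary
      print(effective_gbs(100, clock, 100))       # 12.5 GB/s, bus fully used
      print(effective_gbs(100, clock, 10))        # 1.25 GB/s, 10x slower as described above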

  • @tapiolehto5312
    @tapiolehto5312 3 месяца назад

    I play Company of Heroes 2 on my 2019 16-inch MacBook Pro i9; it works like a dream. Rome, just the same. There are many more, and they work fine with the touchpad.

  • @nikitazaycev8636
    @nikitazaycev8636 4 месяца назад

    Should have just used the classic AIDA64 GPGPU benchmark (memory copy is what determines the real bandwidth; everything else goes through PCIe, which creates that massive bottleneck). And of course the GPU-Z software for general info, which is also a classic.

  • @12Burton24
    @12Burton24 4 месяца назад

    I'm confused about your CPU/GPU talk. GPU memory bandwidth is the bandwidth at which the GPU can access the VRAM (GPU memory). The CPU can use the GPU memory with Resizable BAR on Nvidia, otherwise not.

  • @renanmonteirobarbosa8129
    @renanmonteirobarbosa8129 4 месяца назад +1

    Also, Apple's problem is that its chip has the theoretical speed but the software doesn't let you use it, while with Nvidia you can use the GPU to its fullest, given you put in the effort.

  • @merkedgg6322
    @merkedgg6322 3 месяца назад

    Doesn't the 4090m have the same gaming performance as a desktop 4070?

  • @Sams911
    @Sams911 4 месяца назад

    I have an M3 Max with 48GB for most of my stuff, and a ACER i9 13900 w/4090 for gaming.... Don't do machine learning so makes no difference to me, but fascinating anyway.

  • @eliassaf9192
    @eliassaf9192 4 месяца назад

    plug the laptop?

    • @AZisk
      @AZisk  4 месяца назад

      yep