This is really a great video. The problem I have is that all my development is on a laptop, and I think this is wrong. The conundrum is simple: I will present my work, that's a given, so how do I develop on a much more powerful desktop and still have the ability to present my work? I hate PowerPoints of screenshots; I want to really show what I'm doing.
For small LLMs, you are correct. For 13B parameter or larger LLMs a maximum spec'd Mac Studio M2 Ultra or MacBook Pro M3 MAX will outperform the best Windows-based solution you can build. Of course, the new Copilot+ PCs running Snapdragon X Elite CPUs will also outperform the desktop build you've recommended when running 3B to 3.8B parameter LLMs.
The M3 Pro being slower/not much faster in some tests is probably because of the slower RAM. I'd be interested to see how 30 and 40 series cards stack up, but considering the cost of the laptops already, this is quite the effort, so no complaints.
@@kborak I'm not a mac user, I wouldn't buy Apple hardware for love or money. But the chips are still pretty good so it's interesting to see how they stack up to a better GPU for this kind of workload.
On my M1 Max 64GB I'm getting 8208 on Core ML Neural Engine. My Core ML GPU score falls more in line at 6442. All this while powering 3 screens, watching YouTube and a Twitch stream. Not that I expect those things to add much load, but it is nice to have a machine that can basically do everything at once with near-zero penalty.
Finally somebody explains this properly, not like all the other YouTubers who only use these machines to create videos.
Nice! Missed you buddy!
I'm sure this video touched your heart.
I actually thought this was your video when it popped in my feed!
@@anthonypenaflorI don’t have such a beautiful desk.
Yo, it's been a while since I saw my teacher. Nice to see you again, and good video by the way. More blessings, bro.
So cool that you used 10 shades of green in your graphs. It's very convenient to distinguish
You’re right, I made a mistake here - I only really noticed this reviewing the video, I guess since I made it I could tell the difference. Next time the graphs will be easier to distinguish!
besides that, the video is awesome and very informative
you're a clown
@@hyposlasher switch up is crazy
@@ExistentialismNeymarJunior wdym?
In the process of learning ML/Ai related tasks. Based on your experience would you prefer a 13” MBP M2 24GB RAM ($1,299 new) or a 14” MBP M3 Pro 18GB RAM ($1,651 used)?
The 24GB of RAM would allow you to load larger models. But it also depends on how many GPU cores the two laptops have. Either way, both are great machines to start learning on
Please redo using MLX as that's what the developers using this laptop will probably be using.
Especially since this week Apple released MLX with quantization support and other stuff.
Fantastic idea! I started with TensorFlow/PyTorch since they're most established. But MLX looks to be updating fast.
Not even that much; it doesn't come close to those who really use TensorFlow and PyTorch. Besides, if you have your production environment in the cloud, those two libraries are better integrated than MLX. And for quick deployments you already have containers preconfigured and optimized for those libraries and CUDA, since cloud servers are dominated by NVIDIA, not Apple's "Neural Engine".
Appreciate the hard work. But please consider using a better color scheme for the bars; they all look the same.
1) How exactly do you SSH into your remote NVIDIA setup? Via VS Code? 2) For a remote NVIDIA setup, is Windows OK or should it be Linux-based?
I believe you can also target pytorch to run on Apple silicon's NPU rather than the GPU. And I am sure it will perform better. Though not sure about how much memory the NPU has access to. It will be great if you can explore this and do a video on it.
If portability isn't a requirement, then the Mac Studio Ultra should be considered with its 60 GPU cores and 800GB/s memory bandwidth.
Is your test using the M-series GPU? Are TensorFlow and PyTorch optimized for Apple silicon GPUs?
Thanks, Daniel, for the video and for sharing the links to the materials. You're a legend. Got an M3 Pro 14" (11-core CPU, 14-core GPU, 18GB) last month and have been wondering whether it was an optimal move.
Surprised that you did not include RAM bandwidth at the beginning. Whenever you do non-batched inference, memory bandwidth becomes your main constraint instead of GPU performance, as shown in your M1 Pro to M3 Pro comparison. llama.cpp's M-series benchmarking shows really nicely why the M3 Pro's 150GB/s memory (instead of the M1 Pro's 200GB/s) is the problem, not its (faster) GPU. If one just does inference and has large models requiring lots of RAM, the M2 Ultra really shines with its 800GB/s of memory bandwidth. Totally agree that with learning and batching it's different, and NVIDIA's new GPU performance blows away Apple silicon.
That's YouTube-quality education: good enough, but most of the time it's missing crucial details, and mistakes like that twist the truth, especially for performance. Although this person has studied and gets paid a BIG salary to know such details... weird, but I put it down to a simple human mistake. Still a good video!
Woah, I didn't know about the lower memory bandwidths between the M1/M3. Thank you for the information. I just wanted to try raw out-of-the-box testing. Fantastic insight and thank you again.
NVIDIA's GPU performance falls on its face once the LLM's size exceeds the video card's onboard RAM.
@@gaiustacitus4242 Yes, but you can split layers across multiple cards. For me, I decided on an M2 Max 96GB Mac Studio rather than a 1kW+ heater of a PC, even though in pure GPU horsepower the 4090 is much faster. And I never regretted it.
Correction: I now regret my M2 Max decision since last week, because Apple/macOS Sequoia will finally do nested virtualization, but only on M3 and above. With this I have hopes of virtualized GPUs at some point. NVIDIA/CUDA was always virtualizable and works in Docker containers/VMs.
@@andikunar7183 Even with two NVIDIA 4090 GPUs, a 70B parameter LLM will still yield lower performance than a high-end M-series Mac.
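On the layer-splitting point above, here is a minimal sketch of naive model parallelism in PyTorch. The stage names and layer sizes are made up for illustration, and it falls back to CPU so it runs without two GPUs; libraries such as Hugging Face Accelerate automate this kind of placement (e.g. `device_map="auto"`).

```python
import torch
import torch.nn as nn

# With two GPUs you would use "cuda:0"/"cuda:1"; fall back to CPU otherwise
# so this toy example runs anywhere.
dev0 = torch.device("cuda:0" if torch.cuda.device_count() >= 2 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() >= 2 else "cpu")

# First half of the "model" lives on dev0, second half on dev1.
stage1 = nn.Sequential(nn.Linear(16, 32), nn.ReLU()).to(dev0)
stage2 = nn.Sequential(nn.Linear(32, 4)).to(dev1)

x = torch.randn(8, 16, device=dev0)
h = stage1(x).to(dev1)  # activations are copied between devices here
y = stage2(h)
print(y.shape)  # torch.Size([8, 4])
```

The per-layer copies between cards are exactly the overhead people weigh against a single large unified-memory pool.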
This is interesting! Between the M3 Pro 16GB (150GB/s) and M3 Max 32GB (400GB/s), and considering the M1 Pro 32GB (200GB/s), would you suggest that RAM capacity is a more important factor for these ML tasks than memory bandwidth? Or something else?
I would be keen to see a test between an M3 Pro 32GB and your M1 Pro 32GB to see whether the 50GB/s difference in memory bandwidth makes any real-world difference. (Also one less GPU core but a faster boost clock in the M3 Pro.)
You missed memory bandwidth; the M1 Pro has higher bandwidth than the non-Max M3 MacBooks.
Thank you! I didn't know this. Very strange to me that a two-year-old chip has higher bandwidth than a brand-new chip.
Would be interesting to see how the 128GB version of M3 Max performs compared to the RTX cards on very large datasets, since 75% ~ 96GB could be used as vram in that Apple Silicon.
Nice video, sir. I am buying a MacBook Pro M4 Max hoping that I can do data science, AI, and machine learning easily. Could you please recommend how much RAM I might need on this M4 Max so that I don't need to invest again for at least 5 years? Also, can the 14-inch handle it, or would you prefer the 16-inch? Would your best suggestion be to invest in a 35GB M4 Max and use Amazon's cloud for heavier deep learning, or something else?
It would be my pleasure if you could respond soon.
I am planning to buy the M3 Pro. Which one should I go for, the 30-core GPU or the 40-core GPU? My use will be running some prototype LLM models.
Thank you Daniel, a thorough research for ML engineers. This research is worth a conference session 💪
Hey Daniel, consider trying their MLX versions, as some of the models see performance gains as high as 4x compared to their torch counterparts.
Does MLX work with Llama 2?
@@siavoshzarrasvand Yup, and much, much faster than llama.cpp.
Good to see you again you made machine learning and ai fun
Nice! Could you add some insights regarding thermals and throttling on the Macs?
Can you also make comparison with Neural Engine of M processors?
One thing is clear: even as a PC person, the Mac has a steep advantage with the M3's dynamic RAM-to-VRAM allocation and low power draw. Sure, they don't have NVIDIA's hardware or software, but for some AI users the entry price for the VRAM is a winner.
Great video. Could you please update us if the new MLX changes the results or your conclusion at all? Would love to know if the M-series chips are as good as others are saying.
Hi Daniel, will you be teaching something more than image classification? You are the best programming teacher I have ever followed. Looking forward to your new deep learning course on ZTM.
Do Apple silicon chips put the workload on the neural cores themselves, or do the cores need to be specifically invoked via an SDK from the code? What was the workload on those during each test? I wonder if they were invoked at all. If they were, it sounds like they don't matter compared to the GPU; however, it's claimed they can do something like 17 TOPS, which outperforms any Google Coral. Moreover, Apple claims the neural cores are 60% faster on the M3 compared to the M1. Confused now.
In this video, the M3 base model has only 8GB RAM and the M1 Pro has 32GB. What if I'm choosing between an M3 base with 16GB RAM and an M1 Pro that also has 16GB; should I still go for the M1 Pro? Thanks
I'd love to see you test the M3 Ultra with 64 GB RAM when it comes out, I am using the M2 Studio Ultra at present and wonder if it will be worth upgrading. Running batches, it gets warm, but I've never heard its fan yet.
This video is exactly what I was searching for. Thank you so much for providing such clear and useful information.
So cool! Are you able to run these tests on an M3 Max chip with a maxed-out RAM configuration? Could it be more "usable" than, say, a 4090 with "only" 24GB of dedicated VRAM?
I bought the M1 Max with 64GB RAM and a 32-core GPU. Like you, I am still extremely satisfied with my purchase two years later.
Question: I like your setup using the Apple machine in conjunction with a box with that RTX 4090 installed.
Would that setup run in parallel with my GPU cores? And similarly, if I added equivalent RAM to that box, would it work together with my installed 64GB?
Thanks for making such insightful deep-dive videos about these M Chips. I wonder if Apple will ever open their NPU to more APIs. Right now, MLX is starting to gain traction in the open-source ML community, but it still can't tap into the Neural Engine for inference, so we’re still stuck with a slower GPU for Macs.
The M4 chip already has a Neural Engine that can compute 38 TOPS, and it's just sitting there doing nothing while the GPU does all the work when running ML inference. It would give Macs a huge boost in ML performance if they opened that up.
Is it a good idea to go for a new M3 MacBook Air with 16GB to start learning ML?
Yes that would be a perfect laptop to start learning ML. You can get quite far with that machine. Just beware that you might want to upgrade the memory (RAM) so you can use larger models.
@@mrdbourke Sir, I have an M3 Air 16GB and a MacBook Pro M3 Pro 18GB.
Which should I go for if I am starting to learn ML and want to grow in it long term? The price difference between the two is 30,000/-. Please advise, thank you.
@@Rithvik1-v3i You don't need such heavily powered machines to start learning ML. Just use Google Colab to learn. Then, once you implement projects, you will understand which is better.
@@Rithvik1-v3i I'm planning to get the M3 Air with 16Gs of RAM
The advances in MLX warrant an update to this great video! It's getting REALLY good at performing some tasks, and I'd love to see how all these machines perform with MLX framework, since on my iPhone 15 Pro I'm now able to run Llama3.1-8B-q4 at around 8.5 toks/sec, Llama3.2-3B-q4 at 20 toks/sec, and Llama3.2-1B-q4 at a whopping 49 toks/sec, something impossible just a few months ago!
When you have the same chip, you hit the silicon lottery: one machine will have a better GPU while the other has a better CPU, depending on dead transistors and little lottery-based differences. So I'm not surprised that an M3 Pro and M3 Max with the same Neural Engine perform differently. The silicon lottery is a real thing that will always be a factor in computing. Great video, by the way, and very informative.
Can you try the same tests on M3 Ultra with 196GB RAM?
you’re a great teacher! extremely clear
OH FINALLY, waiting for that. U are king bro
Very nice comparison. The label colors didn't help much in understanding which is which, though.
The M series doesn't allow external GPUs, so how do you hook up a 4090? This would make a good video.
Why didn't you compare M1 max?
I have the exact same Macbook Pro 32GB 16 Core GPU !!!!
Wondering, will running this fry your macbook?
Glad to see you back.
What is the name of this monitoring tool he is using in the terminal?
Would you be able to benchmark a maxed-out Mac Studio with M2 Ultra, 192GB RAM, and 76 GPU cores against the NVIDIA cards?
I think the misunderstanding is generally Base/Pro/Max is the performance spread. M1/M2/M3/M4 yes, do have mild improvements each successive generation, but an M1 Max will still probably outperform a base M3/M4.
Hi Daniel, love your video. Can you please suggest which laptop is good for deep learning: Mac, Windows, or Linux?
At one point you say the bottleneck is memory copies from CPU to GPU and back, but the M-series doesn't have to do memory copies because it's all shared memory. In fact, one of the first optimizations for code on Apple Silicon is removing all the memory copying code because it's an easy gain. Have you accounted for this in either your code or the library code you're using, or both?
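For what it's worth, this is easy to see from user code: on the "mps" backend you can allocate tensors directly on the device instead of allocating on the CPU and transferring. A minimal sketch (toy sizes of my own choosing), with a CPU fallback so it runs anywhere:

```python
import torch

# On Apple silicon, PyTorch's "mps" backend draws from the unified memory
# pool, so creating a tensor directly on the device avoids the explicit
# host-to-device copy that torch.randn(...).to(device) implies.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# Allocate on the target device in one step:
x = torch.randn(1024, 1024, device=device)
y = x @ x  # runs on the GPU when device is "mps"
print(y.device)
```

Whether the benchmark library in the video also avoids redundant copies is exactly the question this comment raises.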
How come the M3 Max is slower than the regular M3 and the M3 Pro in the PyTorch test?
Thanks for the video and the nice in-depth comparisons. I thought GPUs were for playing games and the M series had dedicated multicore Neural Engines.
Question: how did you get PyTorch/TensorFlow running on the M3 Max chip? Is there no current support?
Can you try Llama 2 70B with 128GB on the M3 Max?
Hi sir! Can you suggest the proper/recommended way to install TensorFlow on a MacBook?
Hey Daniel, just wondering: can I fine-tune a Llama 13B-parameter model on an M3 Pro with a 14-core GPU, 11-core CPU, and 18GB RAM?
Sir, I follow all of your blogs, videos, etc.
I want to be an ML Engineer, so I enrolled in your 'Complete ML and Data Science' course on ZTM.
What a marvellous way of teaching ❤❤
Thanks mate so much for doing this.
Can you test the Snapdragon X Elite against the Apple M3 when it comes out in a laptop, please? They say it has a better NPU than the M1, M2, and M3. I'm planning to buy a deep learning laptop next year.
The comparison between the M1 Pro and M3 Pro is not ideal. The M3 Pro you are testing is the binned version with only 14 GPU cores, but you're comparing it to the full M1 Pro. To get accurate performance measurements, it's best to measure both full chips rather than the binned version; that way we can truly see whether the memory bandwidth makes any difference for machine learning.
Thanks a lot for the valuable information. You saved me a tonne of time to come to a conclusion. cheers mate
I am a medical doctor with a recently acquired Ph.D. in pharmacology. I am currently engaged in clinical research, focusing on identifying factors that lead to therapeutic failure in patients with various conditions. My work involves analyzing patient data files that include sociodemographic information, pathological records, clinical data, and treatment details. These datasets typically contain between 100 and 2,000 variables per patient, with a maximum of 1,000 patients in an ideal scenario. I will be using R and RStudio to process and analyze this data in various ways.
Based on your experience, could you suggest a computer configuration capable of handling this type of data processing efficiently?
Thanks in advance!
R is such a light workload that any M-series device would handle that for you, no problem. Since you haven't specified what you mean by analyze or what procedures you'll run on the data, I figure it's going to be something very light and easy to run (otherwise you would surely have specified it). So anything, absolutely anything, will do here. Get an M4 chip with its strong single-core performance: either a base model M4 MacBook Pro, or wait for an Air if you can handle their screens.
Excellent comparison, thanks 😊
Hi Daniel! What a great PyTorch tutorial you have made. Thanks for that! Also thanks for the speed-comparison video. Could you record a video comparing the speed of the different Colab tiers? I mean free, $10, and $50. The M3 Max and your Titan (which you have already done) could also be added. Maybe one of your friends has a $50 account and can run those tests for you [for all of us :)]
Can you do the same for the Intel Core Ultra 7 155H?
7B parameters / 250,000,000 (i.e. divide by 25 and delete 7 zeroes) = 28GB, which is close enough as simple maths for estimating the memory needed for a model's parameters (at fp32, i.e. 4 bytes per parameter).
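That rule of thumb can be written out with the standard bytes-per-parameter for common precisions (the dict and function names here are my own; real usage adds overhead for activations and KV cache):

```python
# fp32 weights take 4 bytes per parameter, so memory_GB ≈ params / 250,000,000.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "q4": 0.5}

def model_memory_gb(n_params: float, dtype: str = "fp32") -> float:
    """Estimate weight memory in GB (1 GB = 1e9 bytes here, for easy maths)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

print(model_memory_gb(7e9))          # 28.0 -> matches the 7B fp32 estimate
print(model_memory_gb(7e9, "fp16"))  # 14.0
print(model_memory_gb(7e9, "q4"))    # 3.5
```

The q4 row is why a 7B model that needs 28GB in fp32 fits comfortably on an 8GB machine once quantized.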
Very helpful thanks Daniel. I was going to race out and buy an M3 to do my ML work, but I will hold off for now. I suspect Apple will do something to help boost performance considerably on the software side, but who knows.
Happy Christmas Sir❤❤
Happy Christmas legend!
Finally a useful video. Too many “reviews” focus solely on content creators. Now I know I can do light ML on my Mac. And do the heavy lifting with my 30 series RTX card.
Should I buy an Air M3 8/256 or a Windows i7 with an RTX 4050 GPU for artificial intelligence?
Bro, can you please tell me which laptop I should go for for machine learning: Windows or M3?
Windows - specifically, the MSI Titan.
Your tests just prove how bullcrap synthetic benchmarks are. Love your work.
Your M1 Pro's RAM is about twice that of your M3 Pro, so maybe that is why it performs better than the latter.
Yeah, you're right. I also just found out that the M1 Pro has higher memory bandwidth than the M3 Pro (200GB/s vs 150GB/s), thanks to another comment. That likely adds to the performance improvement on the M1. Strange to me that a two-year-old chip can comfortably outperform a newer chip.
I have only a 16GB M1 Pro; on the first 2 benchmarks I get similar or slightly faster speeds. I will try to run the other benchmarks; I got sidetracked modifying the 1st benchmark to run on a quad GTX 1070 setup.
Tell us the difference between the M1 Max (44 GPU cores, 32GB RAM) and the M2/M3 Max, please.
what are the parameters that you used for powermetrics? I liked the monitoring you had in terminal.
I used the asitop (github.com/tlkh/asitop) library for monitoring in terminal
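For anyone else looking: asitop installs via pip and wraps Apple's `powermetrics`, which requires sudo. A minimal command sketch:

```shell
# Install and launch asitop (Apple Silicon only; needs sudo for powermetrics)
pip install asitop
sudo asitop
```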
I would like to see whether you've tried Intel's GPUs.
This is excellent!
Do you have a master's/PhD degree in ML? Does your job require a data science degree?
There is a CoreML optimization for PyTorch on Apple Silicon. Was this used?
Do you have more details on this? I've looked for something like this before, and all I can find is something that lets you convert PyTorch models to CoreML, or info on PyTorch using the GPU but not the ANE. But I'm probably missing something!
Are they faster than the Intel Core Ultra 7 155H or Core Ultra 9 185H?
IMHO MacBooks are only inference machines, not training machines. They're great for running 7B, 13B, or 30B LLMs locally (depending on your amount of RAM), or for quick student-style training on something like MNIST. I personally write training code and run experiments with a small batch size on my M1 Pro, then copy the code to my 3090 PC and run long training with a bigger batch size and fp16. While the PC is busy, I run the next experiments in parallel on the laptop. If you load your main laptop with a big training run, you'll have an uncomfortable experience if you want to browse, game, etc. in parallel with training.
Many thanks for the video 🙏
PLEASE update for M4
Although it's nice to see vision models, most people wanted to see inference with transformer LLMs, then fine-tuning (LoRA, SFT). Llama 2 Q4_0 is hardly a test; even an 8GB Mac with Metal can run that. I'd like to see different quants at 33B and 70B with different loaders: AWQ, GPTQ, ExLlama, etc.
I think it would be interesting if you standardized your measures by memory and number of cores.
Which one would you prefer buying now:
M3 (8GB RAM, 512GB SSD) at $2,045, or
M1 (64GB RAM, 512GB SSD) at $2,286?
Has anyone tried the same stuff on MLX? I'm wondering if it makes things faster; I had insanely fast inference using it on Q4 Mistral.
Thank you for the knowledge, it really gave me insight.
Bought an Nvidia RTX 4070 Ti Super. This video was very helpful.
This is really a great video. The problem I have is that all my development is on a laptop, and I think this is wrong. The conundrum is simple: I will present my work, that's a given, so how do I develop on a much more powerful desktop and still have the ability to present my work? I hate PowerPoints of screenshots; I want to really show what I'm doing.
How about connecting with ssh to your desktop from your laptop?
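One common setup for that (hostname, user, and port below are placeholders, not from the thread): develop on the desktop and forward a Jupyter or demo server to the laptop over an SSH tunnel, so you can present live from the laptop while the desktop does the work.

```shell
# On the laptop: forward local port 8888 to the desktop's Jupyter server
ssh -L 8888:localhost:8888 user@desktop-host
# Then open http://localhost:8888 in the laptop's browser to present live.
```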
These machines are great as laptops; as desktops, an Intel 14th-gen i9 plus an Nvidia GPU smokes them.
For small LLMs, you are correct. For 13B parameter or larger LLMs a maximum spec'd Mac Studio M2 Ultra or MacBook Pro M3 MAX will outperform the best Windows-based solution you can build.
Of course, the new Copilot+ PCs running Snapdragon X Elite CPUs will also outperform the desktop build you've recommended when running 3B to 3.8B parameter LLMs.
You posted the single-core CPU scores for the M3 Macs, that's why they are all the same pretty much.
Could you elaborate on that? Are you referring to the ML timings?
The M3 Pro being slower/not much faster in some tests is probably because of the slower ram. I'd be interested to see how 30 and 40 series cards stack up but considering the cost of the laptops already this is quite the effort so no complaints.
my 6750xt will beat these things lol. You macboys are so lost in the woods.
@kborak I'm not a Mac user; I wouldn't buy Apple hardware for love or money. But the chips are still pretty good, so it's interesting to see how they stack up against a better GPU for this kind of workload.
... and what about the M2 Pro Mac?
Thank you, I was hoping someone would look into how these machines perform on ML, not only video processing. The results are quite disappointing.
I don't think the choice of the graph colors was good
On my m1 max 64GB... I'm getting 8208 on Core ML Neural Engine... My Core ML Gpu falls more in line at 6442... All this while powering 3 screens. Watching youtube and a twitch stream. Not that I expect those things to add much load... But it is nice to have a machine that can basically do everything at once with near zero penalty.
Great job! Would be great to include some popular Windows laptops as well in the comparison :)
So macbooks are shitty for AI?
Thank you for the video.
From my experience, TensorFlow's optimization is a little better than PyTorch's for convolutional models.