This is EXACTLY what I have been searching for the last 10 days!
@ 0:43 The base model has TB4, not TB5. The quoted example uses not the $600 base version but the $1,400 Pro version, which is therefore able to use TB5.
You are right, but even TB4 should be enough for daisy-chaining.
What I came to say
Thanks!
If your compute needs actually call for clustering to increase performance, then you should have already exceeded the performance of any maxed-out single node. Budgets far above us normies.
The real problem, though, is that Apple has had hardware enterprise wanted to use before and did not support that use at all. They target consumers, not fleets/data centers. They discontinued their Xserves long ago.
@@ianTnai The node topic is just an answer to the rip-off mentality in Apple's pricing policy, which charges 6 to 8 times the appropriate price for storage and RAM. A maxed-out Mac Mini M4 (double the storage and double the RAM) costs more than two base Mac Mini M4s, where you get a second PCB, a second PSU, a second case, a second software license, a second power cord, a second box, and the ability to set up a second workspace.
If you price all those items according to Apple's own scheme (PCB $50, case $100, software license $100, power cord $30, packaging $10), they add up to roughly $300. So the maximum acceptable upgrade price for going from 16GB to 32GB of RAM plus 256GB to 512GB of storage would be roughly $300 in total, and I bet upgrade prices of $100 for each increment (still massively overpriced, by a factor of 4) would be a great success as shelf products (manufactured with much higher productivity than custom builds).
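To make the arithmetic explicit, here is the same back-of-the-envelope as a short Python sketch. The $600 base price and all component prices are the rough assumptions above, not Apple's actual bill of materials.

# Back-of-the-envelope for the "fair upgrade price" argument above.
# All prices are rough assumptions, not Apple's actual costs.
BASE_PRICE = 600  # USD, base Mac Mini M4 (16GB RAM / 256GB SSD)

duplicated_parts = {  # hardware a second base unit duplicates
    "PCB": 50,
    "case": 100,
    "software_license": 100,
    "power_cord": 30,
    "packaging": 10,
}

overhead = sum(duplicated_parts.values())  # 290 USD
upgrade_ceiling = BASE_PRICE - overhead    # ~310 USD for the RAM+SSD bump

print(f"Duplicated-hardware overhead: ${overhead}")
print(f"Implied ceiling for +16GB RAM / +256GB SSD: ${upgrade_ceiling}")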
You're not chaining together $600 units that use Thunderbolt 5. Those units are Thunderbolt 4.
Apple should compete with Nvidia by making racks of AI hardware.
Deploying your own AI has never been easier since the arrival of Ollama and Hugging Face.
hang face brotha🤙
I wanna know what the poster with the price tag says in full!
What they don't realize is they're putting hot air on top of their Minis with this stacked setup.
They're aware bro... It's just for a pic
Would that same stack turned horizontal be better?
@@mvargasmoran Horizontal, and opposite-facing for each bank. Like a server farm.
Yeah, consuming 3 watts at idle is a real problem /s. THE EFFICIENCY, SIR.
Drill holes and install case fans.
Exo adds a lot of overhead, which slows down tokens-per-second responses. Adding more than 4 Mac Minis requires a hub, which also slows things down. Maybe Exo will get better, but only time will tell.
Yeah, efficient CPU and GPU muscle stacking, but system memory bandwidth is not that great in the Mini. Huge bottleneck for AI. I don't think this is really a good option.
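A rough roofline sketch makes the bottleneck concrete: single-stream LLM decoding has to read essentially all model weights from memory for every token, so bandwidth caps tokens per second. The bandwidth numbers below are approximate published specs, and real throughput is lower.

# Roofline upper bound: tokens/s <= memory bandwidth / model size.
def max_tokens_per_sec(model_gb: float, bandwidth_gb_s: float) -> float:
    return bandwidth_gb_s / model_gb

MODEL_GB = 8.0  # e.g. an 8B-parameter model quantized to 8 bits
for name, bw in [("M4 (base Mini)", 120), ("M4 Pro", 273), ("RTX 4090", 1008)]:
    print(f"{name:>14}: <= {max_tokens_per_sec(MODEL_GB, bw):6.1f} tok/s")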
I remember another YouTuber specifically testing this, and at some point he just ran the same LLM on the terminal of ONE Mac Mini and it was four (4) times faster. Unless the cluster software has drastically improved since then, you're probably right.
@@TimoBirnschein Exo is slower; MLX is faster but requires more setup.
Links to the tweaks and articles, man.
Waiting for the M4 Studio, and then we'll stack.
Did you have to use their os/software? Did you have to create an Apple account?
Asahi Linux does not yet support the M4, but it's on their roadmap. For now you would have to use macOS.
You are wrong, the $600 Mac Mini does not have Thunderbolt 5, only 4.
Do the stacked Mac Minis work as one computer?
Not in the way you may think; they form a computing cluster. The Mac Minis communicate collectively with each other to solve a distributed problem.
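For example, with a tool like Exo you start the daemon on every Mini, the nodes discover each other over the network, and the cluster then serves a ChatGPT-compatible API you can query from any machine. A minimal sketch; the port and model id below are assumptions, so check your Exo version's startup log and docs for the actual values:

# Assumes `exo` is already running on each Mini and the nodes have
# auto-discovered each other; the URL and model id are assumptions.
import requests

EXO_URL = "http://localhost:52415/v1/chat/completions"  # check your startup log

resp = requests.post(EXO_URL, json={
    "model": "llama-3.2-3b",  # hypothetical id; use a model your cluster serves
    "messages": [{"role": "user", "content": "Summarize what a cluster is."}],
}, timeout=300)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])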
If anyone tries this… please let us know! Looking to run something along the lines of Llama 3.2 90B FP16…. (It's the FP16 bit that will create the need for 4 MMPs (Mac Mini Pro) to build the required ~200GB memory bank!)
Very curious… obviously, via Exo et al., you will be able to pool your unified RAM. The question is… do the 4 M4 Pro chips also all get to pitch in on the TPS (tokens per second) number? Or is it all bottlenecked by only utilizing 1 CPU?
Yes, at FP16 it's going to be slow… but all we need is 7.74 tps! 4 would work (though you'd age twice as fast working with it!)
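The ~200GB figure checks out as a back-of-the-envelope. The 1.1× overhead factor for KV cache and runtime is a rough guess, and this ignores macOS's limit on how much unified memory the GPU may claim:

import math

params_b = 90                 # Llama 3.2 90B
weights_gb = params_b * 2     # FP16 = 2 bytes/parameter -> 180 GB of weights
needed_gb = weights_gb * 1.1  # ~198 GB with rough KV-cache/runtime overhead

MINI_PRO_RAM_GB = 64          # maxed-out M4 Pro Mac Mini
print(f"Weights: {weights_gb} GB; with overhead: ~{needed_gb:.0f} GB")
print(f"Minis needed: {math.ceil(needed_gb / MINI_PRO_RAM_GB)}")  # -> 4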
When you say, "all we need is 7.74 tps," what did you mean by that? Why that rate of tokens-per-second specifically?
Expensive way to do it.
How many Rs in Strawberry?
Enough Rs to make an AI question its life choices
So, Apple, there you have your use case for a new Mac Pro variant!!!!
How are you linking them? Do you need software to combine the power?
What’s the double pendulum next to you?
Yes AI homelab is cool but this is what I would like to know :-D
swinging sticks
Bullshit. It is like buying a stack of Toyota Priuses to haul a new sofa from IKEA...
Classic Apple pricing. Baseline but OK performance for $, but to get the performance Apple boasts about, well then, you're looking at $$$$$!! They must use extortion as their core business model.
Is there still a shortage of Raspberry Pis? Otherwise, why are people using these? This seems crazy!
Except the Mini with TB5 is $1,400, not $600. Although even with TB4 it's faster than 10Gb Ethernet.
My mind is blown... How rich these people are 🥲
You can get three to four base-model Mac Minis for the price of an RTX 4090, and these will run much larger models. A bit slower than the RTX 4090, but quieter, and they draw less power. Ideal for homelabbers with an interest in AI.
Cannot wait for the $1,000 Mac Mini RACK servers!!!!
Racknex of Austria offers rackmount enclosures for the Mac Mini. You can fit two Minis in 1.33U; they also have covers to make it 2U.
Spinal tap
No, please stop, these things are "while supplies last"...
Imagine stacking 4 × M4 Ultra Mac Studios with 256GB unified memory each (1TB total)... OR 4 × M4 Max Mac Studios with 128GB each (512GB total). That's some SERIOUS AI workload capacity, a huge LLM, and memory bandwidth 🤣
Here's the recalculated comparison based on the updated Mac Mini price of ₹60,000 and a more optimized multi-GPU setup using a shared motherboard and minimal storage:
1. M4 Mac Mini Cluster
• Price: ₹60,000 per unit.
• Number of Units: ~25 units (₹60,000 × 25 = ₹15,00,000).
• Specifications per Unit:
• CPU: 10-core Apple M4 (4 performance, 6 efficiency cores).
• GPU: Integrated 10-core Apple GPU.
• RAM: 16GB unified memory.
• Storage: 256GB SSD.
• Power Consumption: ~40W per unit (~1,000W for 25 units).
• Cluster Total:
• CPU Cores: 250 cores (10 × 25).
• GPU Cores: 250 cores (10 × 25).
• RAM: 400GB unified memory (16GB × 25).
• Storage: 6.4TB SSD (256GB × 25).
• Performance Notes:
• Pros: macOS-native optimization, efficient power usage, and great for lightweight AI/ML workloads.
• Cons: Limited GPU power for deep learning or robotics. Integrated GPUs are far slower than discrete GPUs.
2. Custom Multi-GPU System with RTX 4090s
• Optimized Build:
• Motherboard: ASUS TRX40 or equivalent (supports 4 GPUs, ~₹70,000).
• CPU: AMD Ryzen Threadripper 3960X (24 cores, ~₹1,00,000).
• GPUs: 4× NVIDIA RTX 4090 (~₹1,75,000 per GPU × 4 = ₹7,00,000).
• RAM: 64GB DDR4 (~₹30,000).
• Storage: 1TB NVMe SSD (~₹8,000; storage for OS and minimal datasets).
• PSU: 1600W Gold (~₹30,000).
• Chassis and Cooling: ~₹50,000.
Total for 4-GPU Setup: ~₹10,00,000.
Add-ons: ₹5,00,000 for extra GPUs (e.g., 2 more 4090s or upgrading storage).
• Cluster Configuration:
• GPUs: 4 RTX 4090s (16,384 CUDA cores each; total 65,536 CUDA cores).
• GPU Memory: 96GB GDDR6X (24GB × 4 GPUs).
• CPU Power: 24 cores, 48 threads.
• RAM: 64GB DDR4.
• Storage: 1TB expandable.
• Power Consumption: ~1,600W for full load.
Performance Comparison
Feature | Mac Mini Cluster (25 units) | Multi-GPU Setup (1 system, 4 GPUs)
GPU Power | 250 integrated GPU cores | 65,536 CUDA cores (4× RTX 4090)
GPU Memory | 400GB unified (shared with CPU) | 96GB GDDR6X (dedicated)
CPU Power | 250 Apple cores | 24 high-performance cores
RAM | 400GB unified | 64GB DDR4 (expandable)
Storage | 6.4TB SSD | 1TB NVMe (expandable)
Training Speed | ~10-15× slower for large models | Optimized for deep learning tasks
Power Draw | ~1,000W | ~1,600W
Cost Scalability | Low (more devices needed) | High (add GPUs incrementally)
Key Insights
1. GPU Performance: The 4090s dominate in AI/ML tasks due to CUDA core count, memory bandwidth, and optimized software (PyTorch/TensorFlow).
2. Cost Efficiency: A single 4-GPU setup is more cost-efficient for high-end tasks than 25 Mac Minis.
3. Scalability: Add more GPUs to the system later for increased performance without major additional costs.
4. Energy Usage: Mac Mini clusters are more efficient but can’t handle large AI/ML datasets effectively.
Recommendation
• For High-End AI/ML: Go with the multi-GPU system with RTX 4090s.
• For General Workloads/Power Savings: Opt for the Mac Mini cluster.
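To reproduce the cluster-side totals above, or swap in your own unit count and local pricing, here is a small Python sketch; the 40W figure is a rough load estimate, not a measured number:

# Cluster totals from base M4 Mac Mini per-unit specs (16GB / 256GB).
unit = {
    "price_inr": 60_000,
    "cpu_cores": 10,    # 4 performance + 6 efficiency
    "gpu_cores": 10,
    "ram_gb": 16,
    "ssd_gb": 256,
    "watts": 40,        # rough load estimate per unit
}
N = 25
totals = {k: v * N for k, v in unit.items()}

print(f"{N} units: Rs {totals['price_inr']:,} | {totals['cpu_cores']} CPU cores | "
      f"{totals['gpu_cores']} GPU cores | {totals['ram_gb']} GB RAM | "
      f"{totals['ssd_gb'] / 1000:.1f} TB SSD | ~{totals['watts']:,} W")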
Is this based on a real-world setup, or did you just calculate this? I'm mostly curious how you determined the performance difference (such as tokens per second) between such setups.
Wow, stack them to make an AI hub! Who cares? What is an AI hub? Do people really do this? 😂😂😂
This is a waste of money. Who even buys Apple for anything besides setting their money on fire?
developers
Four base-model Mac Minis are about the price of one RTX 4090 (if you are lucky), with a crappy PC around it to make it actually do something. And an Exo cluster made of these four Minis can not only run larger models than that PC; they are also quieter and draw less power.
@20windfisch11 Why would you compare the price to a GPU instead of other desktop/mini pc system? Are you trying to be misleading?
@@bftjoe Because a ton of models, importantly Qwen and DeepSeek, work just fine with a ton of system RAM and a fast CPU.
@@bftjoe I actually said that the PC was needed for the GPUs to do anything.
If you are a hobbyist with interest in AI and want to self-host, the Mac Mini is currently great value.