I suggest waiting before buying anything. This technology is too new to invest in; instead, stick with APIs or rent a server. At the pace it's progressing, the next generation of PCs will be ten times better for half the cost.
Soo true
I didn't need to watch this, but I did, and I can say this is all on point and 100% accurate. Well done.
I am using an NVIDIA Jetson Nano 8GB; it runs the Llama 7B model quite smoothly. I also tested an NVIDIA Jetson Orin 64GB: it runs a 34B LLM (LLaVA) at around 7-10 tokens per second. I also tried loading the Llama 70B model, but that was pretty slow (1.x tokens per second).
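A rough sanity check on why those Jetson results come out that way. This is a sketch under my own assumptions (4-bit quantized weights, ~20% runtime overhead), not exact figures:

```python
# Does a quantized model fit in a device's (unified) memory?
# Assumptions: 4-bit weights, ~20% overhead for KV cache and runtime.

def fits_in_memory(params_b: float, mem_gb: float,
                   bits: float = 4, overhead: float = 1.2) -> bool:
    """True if quantized weights (plus overhead) fit in mem_gb of memory."""
    needed_gb = params_b * bits / 8 * overhead
    return needed_gb <= mem_gb

print(fits_in_memory(7, 8))    # Jetson Nano 8GB: a 7B model fits (~4.2 GB)
print(fits_in_memory(34, 64))  # Jetson Orin 64GB: a 34B model fits easily
print(fits_in_memory(70, 64))  # 70B squeezes in at 4-bit (~42 GB), but slowly
```

Fitting in memory is only half the story: once loaded, the Orin's memory bandwidth is what caps the 70B model at ~1 token/second.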
Not to mention how much your power bill will go up when running models locally. At this point I really recommend APIs and cloud GPUs until the tech is more refined; otherwise you will burn through too much money.
Wait for ARM-compatible laptops like the Snapdragon X Elite with Linux support; then I will buy one.
Can you make a video about cloud platforms for doing high-end AI development?
The M2 Ultra has 36.1 TOPS and 192GB of RAM (about 170GB usable as VRAM), but its memory bandwidth of 800 GB/s is half that of the NVIDIA 4090. Unfortunately.
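Memory bandwidth matters because LLM decoding is usually memory-bound: each generated token streams roughly the whole set of weights through memory. A back-of-envelope ceiling, with my own illustrative numbers (a 70B model quantized to 4-bit, ~35 GB of weights):

```python
# Upper-bound decode speed for a memory-bound LLM:
# tokens/sec <= memory_bandwidth / bytes_read_per_token (~= model size).

def est_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough ceiling on tokens/sec when each token streams the full weights."""
    return bandwidth_gb_s / model_size_gb

# 70B model at 4-bit quantization ~= 35 GB of weights (my assumption)
m2_ultra = est_tokens_per_sec(800, 35)    # ~22.9 tok/s ceiling
rtx_4090 = est_tokens_per_sec(1008, 35)   # ~28.8 tok/s ceiling, if it fit in 24GB
print(round(m2_ultra, 1), round(rtx_4090, 1))
```

Real throughput lands well below these ceilings, but the ratio between devices tracks the bandwidth ratio fairly well.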
How much is a 4090 with 170GB of VRAM? 🤔 And what about the power consumption, CPU, I/O…
@@originalmagneto there is no 4090 with that amount of VRAM.
@@originalmagneto the M2 Ultra draws around 120 watts in total under full load; at idle, about 5 to 20.
@@MeinDeutschkurs that’s laughable 😬 How much for 7×4090s with 24GB of VRAM each? And then, how much to run those guys for a day? 🤣
@@originalmagneto well, you can bundle NVIDIA 4090s, but what is your use case? A bit of RAG, some agents? How many requests per minute do you need to serve? What model are you going to use? What's laughable is your imprecise question. And no, there is no 4090 with more than 170GB of VRAM.
If you don't know how to calculate the power draw, ask ChatGPT.
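The power math for a multi-4090 rig is short enough to do in a few lines. The figures here are my assumptions, not anyone's measured numbers: ~450 W TDP per card, ~300 W for the rest of the system, $0.15/kWh:

```python
# Rough daily electricity cost for an n-GPU rig running at full load 24h.
# Assumptions (mine): 450 W per GPU, 300 W for CPU/board/fans, $0.15/kWh.

def daily_cost_usd(n_gpus: int, gpu_watts: float = 450,
                   system_watts: float = 300, usd_per_kwh: float = 0.15) -> float:
    total_kw = (n_gpus * gpu_watts + system_watts) / 1000  # watts -> kW
    return total_kw * 24 * usd_per_kwh                     # kWh * price

print(round(daily_cost_usd(7), 2))  # 7x 4090 at full load: ~$12.42/day
```

Idle or lightly loaded cards draw far less, so real-world cost depends heavily on duty cycle.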
thank you thank you thank you thank you thank you thank you thank you
I think AMD Ryzen AI is the best; I hope a 64GB RAM version will be available.
Opinions on price/value:
Surface Pro, Copilot+ PC
$2.6k
Snapdragon X Elite (12-core), OLED display
Platinum
Wi-Fi
32GB RAM
1TB SSD
I don't think it's good value. I would go for the Asus Zephyrus G16 (2024): OLED 240Hz gaming laptop, Intel Core Ultra 9, 32GB LPDDR5X, GeForce RTX 4080, 1TB SSD. That would be a better option.
@@AICodeKing I'll look into that one, thanks!
bro this video is right on time! thanks thanks
PC with 128GB RAM and dual GPUs (private AI server) -> tunnel -> access from a laptop/mobile device.
Alienware i9 + 4090: is this a good option? Downsides I have come across: battery life is about 4 hours when running LLMs, and the Wi-Fi range doesn't match the Apple M3.
What if all LLMs could be quantized to 1.58 bits without losing accuracy? Then even a potato GPU might be able to run them.
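Assuming the comment refers to ternary (~1.58 bits/param, BitNet-style) quantization, the size math shows why it would matter. A sketch with hypothetical model sizes:

```python
# Weight storage at different bit widths (ternary ~= log2(3) ~= 1.58 bits).
# Sizes are approximate; 1 GB taken as 1e9 bytes.

def weights_gb(n_params_b: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB for n_params_b billion parameters."""
    return n_params_b * bits_per_param / 8  # billions of params * bytes/param

for params in (7, 34, 70):
    print(params, round(weights_gb(params, 16), 1), round(weights_gb(params, 1.58), 1))
# A 70B model drops from ~140 GB at fp16 to ~13.8 GB -- potato-GPU territory.
```

The catch, as noted in the reply, is whether accuracy actually survives; ternary schemes so far require training the model that way rather than quantizing after the fact.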
Yes, it's a good thought, but I don't think it's quite possible without a loss in accuracy.
@@AICodeKing I see. Do you think something else might get us there?