Been waiting with excitement for your thoughts on the value of the new cards for AI!
Thank you for taking the time and energy to post them!
😬
I'd stand in line with you in Houston. I think the cap on the price will be the system builders, though. I didn't worry about buying a 4090 when the price spiked; I just bought a $3,000 PC with a 4090. I did this throughout the GPU shortage era. Because Nvidia will feed MSI, Asus, Alienware, and even iBuyPower, your best bet is to buy a PC with a 5090 instead of feeding the scalpers if you can't get one near MSRP.
Yeah, I think that's what I may do.
Good call. Will do a heads-up post before heading to Houston.
The DIGITS supercomputer is what I will get later this year; everything else is too expensive. 128GB of VRAM for $3k is a no-brainer.
Agreed that the 5090 will be unobtainium. I picked up four 3090s pretty cheap, ~$650 each, but had to repaste and replace a few fans on them. Totally worth it. I'm running TabbyAPI so I can do speculative decoding, and it cranks out around 15 t/s using Llama 3.3 70B at 5.0bpw; with a 1B (Llama 3.2) draft model in front I get around 30 t/s for smaller conversational prompts. I haven't done any fine-tuning yet and haven't run long conversation chains, but I'm very happy with the quality of the responses and the performance so far. I'm still using Open WebUI. I'd love to get your opinion on LibreChat. It has been kind of a PITA to get up and running, and I don't plan on using any outside services, so I don't know how much value it offers over Open WebUI.
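If anyone wants to poke at a setup like this, here's a minimal client-side sketch. It assumes TabbyAPI is serving its OpenAI-compatible API on port 5000 (the default, if I remember right) with the 70B model and 1B draft model already configured server-side in config.yml; the model name and key below are placeholders:

from openai import OpenAI

# Point the standard OpenAI client at the local TabbyAPI server.
client = OpenAI(
    base_url="http://localhost:5000/v1",  # TabbyAPI's OpenAI-compatible endpoint (default port assumed)
    api_key="your-tabby-api-key",         # placeholder; use the key your TabbyAPI install generated
)

# Speculative decoding happens server-side; the client call is unchanged.
resp = client.chat.completions.create(
    model="Llama-3.3-70B-Instruct-5.0bpw",  # placeholder for whatever model is loaded
    messages=[{"role": "user", "content": "Explain speculative decoding in two sentences."}],
    max_tokens=200,
)
print(resp.choices[0].message.content)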
I still think the 3090 makes the most sense as a 24GB GPU for inference, by a long shot. Love mine. Agreed, total PITA to clean and repad, but they run so well now that I expect to get a lot of use out of them.
A 5090 @ $2,000 is crazy, and when the gamer gets it home and wonders why his PC suddenly turns off or blue-screens, he's going to realize his 800W PSU wasn't enough...
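Rough power-budget math below; the 575W TGP is Nvidia's announced figure for the 5090, and the rest are typical estimates, not measurements:

# Why an 800W PSU falls over: sum the system's estimated peak draw.
GPU_TGP_W = 575   # RTX 5090 announced total graphics power
CPU_W     = 150   # high-end CPU under gaming load (estimate)
REST_W    = 75    # board, RAM, drives, fans (estimate)
TRANSIENT = 1.2   # GPUs spike above TGP for milliseconds (estimate)

peak_w = GPU_TGP_W * TRANSIENT + CPU_W + REST_W
print(peak_w)  # 915.0 -> well past what an 800W unit can supply at peak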
These power requirements are getting out of hand. Back when I used to mine with ASICs, 1000W meant about $95/month in power consumption.
Between the VRAM limitations and the watts-to-real-world-utilization ratio, it almost seems like two older cards combined can do more and cost less monthly.
I hope we will get a 24GB 5080 Ti later this year!
That would be a real 5080, as the current 5080 is a 5070 with a fake name. 50% of the specs of a 90-class card and an 80 name don't add up. Mega-corp lies.
I feel like they are going to decide a lot about Super or Ti versions based on the performance of the 5080 and 5090 launch. An underwhelming 80 gets a Super; an underwhelming 90 gets a Ti. At least I have this hopium.
I am just feeling more and more that I made the right choice buying (at MSRP) and then keeping my 4090. I think it will be my six-year card. It's so good for everything.
They are FP8 monsters, and I am very unsure about the 5090's raw raster performance with no DLSS now that the Wukong demo gameplay video dropped and it showed a 9x bump. Gaming-optimized gains, but maybe nothing meaningful on pure compute, would make the 4090 still a top choice for non-gamers. They really need to let me borrow a 5090 to get some stats.
26:15 You can actually replace the RAM chips on a 2080 Ti and double the RAM. I don't know if it's cost-effective, but it's definitely a thing.
The 50 series looks interesting for AI on paper, but I don't think they're even worth looking at for the average person until the supply chain catches up and eventually saturates the market (probably right before the 60 series drops). Personally, I'm more interested in investing in a pair of RTX A6000s. Should still be very performant (especially compared to my pair of Tesla P40s currently in use), and 96GB of VRAM to boot instead of "just" 64 with a pair of 5090s. Good luck securing your slice of unobtainium; I'll be curious to see actual performance metrics comparing it to the 3090 and 4090.
Do you guys just buy expensive GPUs like this for testing, or do you actually make money with LLMs (by building SaaS or something like that)?
I want a rackmount Project DIGITS Pro with 512GB.
You can buy two and link them together, at least that is what NVIDIA says.
15:50 Do people just have no memory? They said a $500 3070 would be like a 2080 Ti, and it was. I don't know why people are so quick to dismiss this. I'm very intrigued by the 5070 Ti for self-hosting AI.
For gaming it was due to DLSS acceleration; raw raster, however, is a different measure, and the 2080 Ti was still faster than the 3070 in that metric.
Same.
I want to get a 5070 Ti to add to my 4070 Ti so I can run multiple models simultaneously, or one really big model. Of course, using a model router and agents (rough sketch below).
Then render Skyrim on my 3060 with one of those 2,000-mod mod packs.
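Something like the sketch below is what I mean by a router. It assumes each card runs its own OpenAI-compatible server (two Ollama or TabbyAPI instances, say); the ports, model names, and length heuristic are all made up for illustration:

from openai import OpenAI

# Hypothetical: the 4070 Ti serves a small fast model, the 5070 Ti a big one.
BACKENDS = {
    "fast":  {"client": OpenAI(base_url="http://localhost:8001/v1", api_key="none"),
              "model": "llama3.2:3b"},
    "smart": {"client": OpenAI(base_url="http://localhost:8002/v1", api_key="none"),
              "model": "llama3.3:70b"},
}

def route(prompt: str) -> str:
    # Crude heuristic: short prompts go to the small model, everything else
    # to the big one. A real router would use a classifier or agent framework.
    backend = BACKENDS["fast" if len(prompt) < 200 else "smart"]
    resp = backend["client"].chat.completions.create(
        model=backend["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(route("What's the capital of France?"))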
I want a 5060 Ti Super with 24GB of VRAM. Between chip clamshelling and 3GB GDDR7 memory modules, it can be done (quick math below)! Could help with prices on the used 3090 and 4090 cards, too. Lots of us just need the memory on a modern RTX card. Half the speed is fine if it means a third to a quarter of the price.
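The back-of-envelope math, assuming the 5060 Ti keeps a 128-bit bus like the 4060 Ti did (the bus width is my assumption):

# Each GDDR7 package is a 32-bit device, so a 128-bit bus takes 4 chips,
# or 8 in clamshell mode (one extra chip per channel on the PCB's back side).
BUS_WIDTH_BITS = 128   # assumed, matching the 4060 Ti
CHIP_WIDTH_BITS = 32

def capacity_gb(chip_gb: int, clamshell: bool) -> int:
    chips = BUS_WIDTH_BITS // CHIP_WIDTH_BITS
    if clamshell:
        chips *= 2
    return chips * chip_gb

print(capacity_gb(2, clamshell=False))  # 8  -> plain 2GB modules (the 4060 Ti 8GB layout)
print(capacity_gb(3, clamshell=False))  # 12 -> 3GB GDDR7 modules alone
print(capacity_gb(3, clamshell=True))   # 24 -> clamshell + 3GB modules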
I hope this pushes the P- and A-series server cards down to something reasonable. I know VRAM is king for what I do. Intel's 24-gig workstation card also looks good on paper; I would love to see what you think of it.
Is the $200 rig still worth building, or should I wait for the 5060/70?
I hope they drill some ventilation holes in that Digits box.
But it's gold. I want to drill the holes out myself 🤣 (kidding... but with 2x QSFP at least, that thing runs hot at full tilt).
I wonder what the difference in AI TOPS is going to be. Nvidia is showing a 2x-3x increase over the 40 series; how will that translate into an Ollama workflow?
Their TOPS figure appears to be FP4, which I'm not sure llama.cpp (what Ollama uses) supports, compared to FP8. Getting 2x just by going from FP8 to FP4 is one of the issues with the stats reporting. We need hard numbers for AI specifically; they need to send me a test unit 😜
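Quick illustration of why the precision switch muddies the comparison; the numbers are invented to show the arithmetic, not real specs:

# Halving precision doubles nominal TOPS on the same silicon, so an FP4
# figure can't be compared directly against last gen's FP8 figure.
tops_40_series_fp8 = 100.0   # invented baseline
tops_50_series_fp4 = 250.0   # invented "2.5x" headline figure

fp8_equivalent = tops_50_series_fp4 / 2       # restate the claim in FP8 terms
print(fp8_equivalent / tops_40_series_fp8)    # 1.25 -> ~25% real uplift, not 2.5x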
I think the Project DIGITS machine will be the best option for AI self-hosting. 32GB is still not enough.
Best content of 2025. You win. Excellent analysis.
To make it short: the 5070/5070 Ti/5080 will only be good for image generation and things in that corner. As soon as you want to do animations, you need a 4090/5090/A100 at least (unless you want to do really low-resolution animations, in which case you could go with a 16GB VRAM card). I just hope they somehow bump it up a notch, either with Ti/Ti Super versions or the 60 series in a few years.
Yeah, 35 t/s is barely fast enough; I tried the 3.3 70B at 16.5. Just too slow for production work. I'm hoping the 5090 will deliver the needed speed.
Wow, I was planning to buy the 5090, since I use VRAM-hungry programs, and I had calculated about €3,200 as a starting price (in Spain). But $5,000? That's a lot! If so, I'll have to consider buying a 4090. But well, let's see on January 30.
As long as the 5090s aren't tested for REAL speed advantage and for coil whine (which is a HUGE dread of mine ever since the 40-series cards started sounding like in-case metal detectors), I'll gladly stay on my 4090. Took me long enough to get one that didn't have noticeable coil whine.
Not going to have supply issues this time around.
Nvidia could have competed with Apple for the 3nm process and charged really crazy prices for a genuine, substantial generational uplift.
Instead, it chose to remain on the 4nm process, sell cheaper cards, make an incremental (not generational) performance upgrade, and not leave money on the table for scalpers this time around.
It's unlikely we will see these out of stock for more than 2-4 weeks. Don't FOMO into paying scalpers or higher prices this time.
I bet they do
The difference between the 4090 and the 5090 will be about 13 to 33%.
It all depends on the CUDA cores.
My dream AI workstation has 1TB+ of unified memory, so I can effectively run any and every AI/AGI, uncensored, locally and privately. Hopefully the next DIGITS, with the new Rubin AI GPU and a new CPU, will have 512GB (or more) of unified memory per unit! The future is bright! :)
The memory needs to be extremely fast, though!
@Machiavelli2pc It's weird that Project DIGITS uses LPDDR5X. Not 6X, 5X.
I won't buy a 5090 but will run my two 3090 cards for a while. I still think waiting for the M4 Ultra is due =)
I'm curious about an M4 Ultra Studio or Pro being paired with the new Asus XG Mobile that has Thunderbolt 5 and the 24GB laptop 5090. I don't know how long Apple can hold out not supporting Nvidia hardware at the high end. The M4 Ultra may be too soon for that pivot, though.
Bro, I got five 4090s for MSRP at/after launch. They will be available if you try.
The 4090 was actually the deal of the 4000 series. This time the 5090 seems like the worst deal in the stack.
I'll be in Houston in line for the 5090.
Things are shaping up for the Houston waiting line 😃
@DigitalSpaceport Oh yes! I'll be there with a warm coat and a LOT of coffee... laptop for gaming while we wait!
AMD's next-gen GPUs (not their 9000 series, the one after) could become the de facto consumer-level AI cards if they become liberal with their VRAM. 32GB is still not enough for a quality mid-sized model. By the time that gen comes out, I predict that fact will become more relevant to the layman with the onset of AI gaming.
64GB 5090 Ti. 128GB Titan RTX 5000.
The future we deserve.
If your primary use is gaming, the 5070 Ti is a better deal than the 5080. The 5090 is the best deal for gamers who want to future-proof themselves for 6-10 years. If I get the 5090, I am keeping it for 10 years before upgrading again.
I want two DIGITS
I think they said 2, but I hope it turns out to be like 16 that can be networked.
For people still sitting on a 3060 or 3070 like me (I own a 3060 Legion 5 laptop), it's time to upgrade.
The 5070 looks to be a crazy upgrade. And the prices, for some reason, are fine? I mean, even if you buy a laptop? $1,200 for a 5070 laptop? That's unheard of. Back in the day I paid €1,300 for a 3060 laptop, and the 3070 one was like €1,500.
Yes, it's not the "raw" performance of a 4090, but I actually don't care at all. Rasterization and native resolution have been dead for years now, and people who make a big deal out of it are just butthurt.
Jensen and Nvidia are on top; they think Tesla has competition and that they can make inroads into the automotive market. Dream on, Jensen, dream on. Tesla will dominate. Period.
Sniffing for Nvidia shills ....