Downgrading My GPU For More Performance
- Published: 14 Nov 2024
- Checking out an older Nvidia Tesla card that can meet my needs for AI.
○○○ LINKS ○○○
Nvidia Tesla M40 ► ebay.us/ED5oqB
Nvidia Tesla P40 ► ebay.us/HWpCZO
○○○ SHOP ○○○
Novaspirit Shop ► teespring.com/...
Amazon Store ► amzn.to/2AYs3dI
○○○ SUPPORT ○○○
💗 Patreon ► goo.gl/xpgbzB
○○○ SOCIAL ○○○
🎮 Twitch ► / novaspirit
🎮 Pandemic Playground ► / @pandemicplayground
▶️ novaspirit tv ► goo.gl/uokXYr
🎮 Novaspirit Gaming ► / @novaspiritgaming
🐤 Twitter ► / novaspirittech
👾 Discord chat ► / discord
FB Group Novaspirit ► / novasspirittech
○○○ Send Me Stuff ○○○
Don Hui
PO BOX 765
Farmingville, NY 11738
○○○ Music ○○○
From Epidemic Sounds
patreon @ / novaspirittech
Tweet me: @ / novaspirittech
facebook: @ / novaspirittech
Instagram @ / novaspirittech
DISCLAIMER: This video and description contain affiliate links, which means that if you click on one of the product links, I’ll receive a small commission.
I picked one up on eBay for $45 shipped. I also had an FTW 980 Ti cooler lying around. As long as the cooler fits the stock PCB of any 970 to Titan X card, you can just swap it. You may need to cut out or re-solder the 12V power connector in the other orientation though; in my case I moved it from the back to the top. I also thermal-glued heatsinks onto the backplate, because not being in a server case means the VRAM gets warm.
holy moly bro, $45? any link or tip to get one that cheap? ty and hope u enjoy it x)
@@yungdaggerdikkk Newegg has them at about that price
Running Stable Diffusion, does it run out of VRAM at 12GB or at 24GB?
The tech docs claim it is really two systems, each with its own CUDA cores and VRAM, etc...
Turing, not TURNing lol
He's a Pro, don't tell him he's wrong 😂
you just turinged him on
😂
SD, GPT, and other AI apps _still_ not taking advantage of AI Tensor cores...
Literally what they were invented for.
As far as I know, llama.cpp can use tensor cores
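For anyone curious, here is a minimal sketch of GPU offload with the llama-cpp-python bindings, assuming a CUDA-enabled build; the model path is a placeholder for whatever GGUF file you actually have:

# Sketch: offload a GGUF model to the GPU with llama-cpp-python (CUDA build assumed).
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-13b.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # -1 offloads every layer; a 13b 4-bit model fits comfortably in 24GB
    n_ctx=2048,       # context window
)
out = llm("Q: What is a Tesla P40? A:", max_tokens=64)
print(out["choices"][0]["text"])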
I'm using a trio of P40s in my headless Z840, kinda risking running into the PSU's power limit, but there's nothing like having a nearly real-time conversation with a 13b or 30b parameter model like Meta's LLaMA.
I am looking into buying a Z840 also, how are you able to keep the P40s cool enough?
@@jaffmoney1219 Air ducting and cranking the PCIe zone intake fans to 100%. If you buy the HP-branded P40s, supposedly their BIOS will tell the motherboard to ramp the fans automatically. I'm using a pair supposedly from PNY, so I don't know.
@@KiraSlith Hello! Can you make a short video on how this works for you, on the hardware side and with a language model such as LLaMA?
If you can't or don't want to make a video, could you briefly describe your hardware configuration here, and what is best to buy for this?
I'm looking at an older platform: LGA 2011-v3 with an 18-22 core CPU, a gaming motherboard from ASUS or ASRock, and 128/256GB of DDR4 ECC RAM. At first I wanted to buy a modern video card from the RTX 30xx/40xx line, but then I came across Tesla server accelerators, which have a large amount of VRAM (16/24/32 GB)
and go for about 150/250/400 euros here.
Unfortunately there is very little information, and the videos you find on YouTube mostly run Stable Diffusion, which gives quite poor results even on a Tesla V100, which an RTX 3060 outperforms.
Thanks in advance!
@@strikerstrikerson8570 Sure, when it comes down for maintenance next. It's currently training a model. If you want new cards only and don't have a fat wallet to spend from, you're stuck with Consumer cards either way. Otherwise, what you want depends entirely on what your primary goal is. Apologies in advance for the sizable wall of text you're about to read, but it's necessary to understand how to actually pick a card.
I'll start by breaking it down by task demand:
- image recognition and voice synthesis models want fast CUDA cores but still benefit from higher core counts, and the larger the input or output, the more VRAM they need.
- Image generation and voice recognition models also want fast CUDA cores, but their VRAM demands expand exponentially faster.
- LLMs want enough VRAM to fit the whole model uncompressed and lots of CUDA cores. They aren't as affected by core speed but still benefit.
- Model training always requires lots of VRAM and CUDA cores to complete in a reasonable amount of time. Doesn't really matter what the model you're training does.
Some models bottleneck harder than others (though the harshest bottleneck is always VRAM capacity), but ALL CUDA Compute capable GPUs (basically anything made after 2016) are able to run all models to some degree. So I'll break it down by their degree of capability, within their same generation and product tier.
- Tesla cards have the most CUDA cores and VRAM, but have the slowest cores and require your own high CFM cooling solution to keep them from roasting themselves to death. They're reliably the 2nd cheapest card option for their performance used and the only really "good" option for training models.
- Tesla 100 variants trade VRAM capacity for faster HBM2 memory, but don't benefit much from that faster memory outside enterprise environments with remote storage. They're usually the 2nd most expensive card in spite of that.
- Quadro cards strike a solid balance between Tesla and Consumer. Fewer CUDA than Tesla but more than Consumer. Faster CUDA cores than Tesla but slower than Consumer. More VRAM than consumer, but usually less than Tesla. Thanks to "RTX Experience" providing solid gaming on these cards too, they're the true "Jack of all trades" option and appropriately end up with a used price right in the middle.
- Quadro "G" variants (eg GP100) trade their VRAM advantage over consumer for HBM2 VRAM at absurd clock speeds, giving them a unique advantage in Image generation (and video editing). They're also reliably the most expensive card in their tier.
- Consumer cards are the best used option for the price if you want bulk image generation, voice synthesis, and voice recognition. They're slow with LLMs, and if you try to feed them a particularly big model (30b or more) they'll bottleneck even more harshly on their lacking VRAM (be it capacity or speed), with the potential to bottleneck even further by paging out to significantly slower system RAM. (A rough VRAM fit check is sketched below.)
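As a rough companion to the breakdown above (my own rule of thumb, not from the comment): a small Python sketch that estimates whether a model's weights fit in a card's VRAM, using PyTorch only to read the card's capacity.

# Rough sketch: will an LLM fit in this card's VRAM?
# Assumption: weights take params * bytes-per-param, plus ~20% overhead for the
# KV cache and activations. Real usage varies by backend and context length.
import torch

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def fits_in_vram(params_billions, quant, device=0):
    need = params_billions * 1e9 * BYTES_PER_PARAM[quant] * 1.2  # +20% overhead
    total = torch.cuda.get_device_properties(device).total_memory
    print(f"need ~{need / 1e9:.1f} GB, card has {total / 1e9:.1f} GB")
    return need < total

fits_in_vram(30, "int4")  # a 30b model in 4-bit on a 24GB M40/P40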
Stuffed a z440 mobo into a 3u case, will be putting 2x p40s in here shortly.
I also got myself an M40 a few months ago, but cooling it with air is not really a good solution in my opinion. I was lucky enough to get a Titan X (Maxwell) water block from EK for 40€/~44 USD. With it, the card runs perfectly and tops out at 60°C / 140°F under full load.
If you are not so lucky, I would still recommend using these AiO CPU to GPU adapters (e.g. from NZXT).
Air cooling is comparatively huge and extremely loud (most of the time).
I did get myself a P40 for 170€. RTX 2080 gaming performance and 24GB GDDR5 at 694.3 GB/s. Stable Diffusion on my 2080 runs around 5-10x faster than on the P40. But it would make a good price/performance cloud gaming GPU.
They are going for $50 currently. Get a server rack and fill them up!
If you put a p40 with a 3090 will it be bottlenecked at p40 speeds or will it be an average?
So according to Nvidia's own specs, the M40 uses the same board as the Titan X and the 900 series. So theoretically, any cooling system that works for either of those two should also work on the M40.
Great explanation. Basically gamers vs AI hackers. The AI models want to fit into VRAM, but are huge, so the 8GB or 12GB VRAM cards can't run them. Getting a new, huge-VRAM GPU is hella expensive right now, so an older card with lots of VRAM works. Also, gamers tend to overclock/overheat, but the Tesla and Quadro cards are usually datacenter liquidations, so there's less risk of getting a fried GPU. BTW: the P40 is a newer version of the M40.
I just bought an RTX 4090 last night and all the parts for a new desktop (i9-13900K, MSI MEG Z790, 128GB DDR5, 4 Samsung 990 Pros) just to do SD and AI, maybe overkill
Dude you’re loaded 😁$
Definitely not overkill if it's for professional use
Tesla P40 24gb cards are on ebay for sub $200 now. Considering one for my server
For anyone wanting to do this: I found the best cooling solution is a Zotac GTX 980 AMP Edition 4GB model. It has the exact same footprint, the circuit board is nearly identical, and it bolts right on with very few modifications. You will need to use parts from both the Tesla and the Zotac GPU to make it work. Been running mine for a while now without issue.
I have been planning this move as well, since the M40 is dirt cheap on eBay. But I worry about one thing you did not touch on in this video (or at least I did not notice if you did): how did you solve the power cabling issue? I believe the M40 does not take a regular PCIe GPU power cable but needs something different, an 8-pin cable?
That's right, the Tesla M40 and P40 use an EPS (aka "8-pin CPU") cable, which can thankfully be resolved using an adapter cable. Just a note, the 6-pin PCI power to 8-pin EPS cables some chinese sellers offer should ONLY be used with a dedicated cable run from the PSU to avoid cable meltdowns! Thankfully this isn't an issue if you're using a HP Z840 (which also conveniently solves the airflow issue too), or a custom modular PSU with plenty of PCI power connections, but it can quickly become an issue for something like a Dell T7920.
Guys, have you ever heard about mining GPUs? If you have, you have a solution for maybe 10 or 20 cards at once, so why complain about 2 or 3 cards?
You say you need a newer motherboard to use the P40. Does any motherboard with PCIe x16 3.0 work?
Yes, as long as it supports above 4g decoding
Also don't forget the fan.
Got my Tesla M40 a while back, and now have a fan cooling on it (EVGA SC GTX 980ti cooler) to mess around with, but just seeing the power consumption 😅😅
I just purchased a Tesla P4 some weeks ago, and I'm having a blast with it. The low-profile card even fits in the QNAP 472XT chassis. Passthrough works fine (minor tweaks). Currently compiling a kernel to get support for vGPU (if I ever succeed).
I got to ask. Why do you say it needs PCIE Gen 4 and a newer motherboard? Documentation says it's PCIe 3
So, I picked up a P40 after watching this video... Thanks! Do you have any videos that talk about loading these LLMs, or if I should go with linux/windows/etc... maybe install Jetpack from the Nvidia downloads? I've screwed around a little with hugging face, and that made me want to get the card to run better models, but rabbit hole after rabbit hole, I'm questioning my original strategy.
I'm glad you were able to pick up a P40 and not the M40, since the Pascal arch can run 4-bit modes, which covers most LLM models. LLMs change so rapidly I can't even keep up myself, but I have been running the Docker container from github.com/Atinoda/text-generation-webui-docker . But yes, this is a deep rabbit hole, I feel your pain
Easiest out-of-box apps for running local LLMs are GPT4All and AnythingLLM. Huggingface requires lots of hugging to not sink into rabbit holes :) The apps like I mention keep things simple. Both have active Discord channels that are helpful too.
Remember how much it was at the time?
@@l0gic23 180 bucks locally here off of fb marketplace.
How does the p40 perform for video editing and 3D design programs like Blender?
You can use both cards simultaneously. There will be two CUDA devices.
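A quick sketch of what those two CUDA devices look like from PyTorch; which card ends up as cuda:0 depends on the system:

# Sketch: list the CUDA devices a P40 + 3090 box exposes, then target one explicitly.
import torch

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"cuda:{i} -> {props.name}, {props.total_memory / 1e9:.1f} GB")

x = torch.randn(1024, 1024, device="cuda:1")  # place work on the second card (assumes 2+ GPUs present)
print(x.device)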
The Tesla K80 with 24GB VRAM is described as a setup of 2 systems, each with its own CUDA cores and VRAM. When running Stable Diffusion, does it behave as one GPU with 24GB or does it behave as 2? Does it run out of VRAM at 12GB or 24GB in image production?
That's exactly my question.
2:21 Actually, that Tesla card has 1150 more cuda cores than that 2070...
3,072-1,922= 1150
The only thing I'm curious about is how well it can mine. 🤔
If anything, why the hell wouldn't you just get a 3090ti? It has 10,496 CUDA cores, which is far and beyond the Tesla in both work and gaming capability.
If it's due to sheer price, I get it, but the specs are still beyond what you currently have.
Cost:Performance...
I have an ASUS H410 HDV M.2 (Intel chipset); is compatibility good with the Tesla M40?
ty
But wait, I was under the impression that both the M40 and the P40 are dual-GPU cards, so the 24GB of VRAM is split between the two GPUs. Or am I mistaken? When I look up the specs it looks like only 12GB per GPU.
The M40 and P40 are single-GPU cards.
I think you are talking about the K80 GPU.
I went with a K80, but Stable Diffusion only runs with torch 1.12 and CUDA 11.3, and right now it only uses 12GB, half the memory and one of the two GPUs, because the K80 is dual-GPU. The M40 should allow a modern CUDA and Nvidia driver, and no workaround is needed to access the full 24GB like on the K80.
Thank you, I have been looking for this info
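One common way to pin a job to a single GPU (for example, one half of a dual-GPU K80) is the CUDA_VISIBLE_DEVICES environment variable; a small sketch, set before anything CUDA-related is imported:

# Sketch: restrict a process (e.g. a Stable Diffusion launcher) to one GPU.
# Must be set before torch/CUDA initialises; "0" is the index of the GPU to keep visible.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
print(torch.cuda.device_count())   # now reports 1
print(torch.cuda.get_device_name(0))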
Does it use the whole 24GB of VRAM? Because it's basically multiple GPUs put together, is the VRAM working as one?
You can't just plug that card in and go. There are driver issues. Did you get it working?
Owner of P40 and 3090 in the same PC.
No problems whatsoever, just install Studio driver
Excellent video.
In my case I have a workstation with an msi X99A TOMAHAWK motherboard with an Intel Xeon E5-2699 v3 processor, (and I currently use 3 monitors). Because of this I installed a GPU, AMD firepro w7100 which works very well for me in Solidworks.
The RAM is Non-ECC 32 gigabytes.
The problem is that I am learning to use ANSYS, and this software is married to Nvidia. For GPU calculation acceleration, looking at the ANSYS GPU compatibility lists, I see that the K80 is supported, and taking into account the second-hand price, I am interested in purchasing one.
How can I configure my system to install an Nvidia Tesla K80 and have the AMD GPU keep driving my monitors as it currently does? The Nvidia K80 has 24 GB of RAM; can this be affected when using it in conjunction with the AMD GPU that only has 8 GB? Would the K80 be restricted to the RAM of the FirePro W7100?
My PSU is 700 watts.
Thank you.
Great video. Thanks!!
Home Depot has free delivery.
😂
With them having the choice of worst wood, no thanks
The P40 is good, but my only concern is that it has probably been used for mining
Mining has been proven to not cause any more significant wear than regular duty cycles..
In fact, in some situations the mining rig would be a cleaner and safer environment than in a PC case, on the floor in some persons home with toddlers sloshing their chocky milk around, for example 😂
Just buy a used RTX 3090 for $500. Works great with generative art, LLMs, etc.
After watching your video I tried to do the same, but I had a problem... I have an HP DL380 server and I purchased the Nvidia Tesla P100 16GB, but I can't find the power cable.
Watching other people, I am afraid to buy the wrong one and fry my server... can you please tell me the right cable to buy?
A PC doesn't need an HDMI output to boot. Any display interface is OK: VGA, DVI, or DP.
What's the idle power consumption of the M40? I'm thinking of using it in my server but can't find details on the internet. Thanks
What is the "idle" power draw of that?! If it's on 24/7 in a server, can it power down? Can't find info on that online.
My M40 idled at ~30 watts, P40 is closer to 20
So P40 > M40?
Yes
@@b_28_vaidande_ayush93 For training or FP16 inference get the P100; it has decent FP16 performance. The P40 is horrible at that, it was specialised for INT8 inference.
The P40 is already way better for the price; also, if you wanted more CUDA cores you could have gotten 2 K80s for the same price
My 4090 takes 2 seconds to make a 512x512 at 25 steps. It only has 24GB of VRAM, which means I can only make like 2000x2000 images with no upscaling
How does this compare to an RTX A2000?
Can you suggest a desktop workstation that can include a Tesla M40? Thank you so much
Look for an HP Z840, but buy the GPU separately, because you are probably going to pay way more if it's included.
I have one of these cards; how do I use it on an Ubuntu 22.04 computer?
m40?
What about HBCC on a Vega 64 for an "unlimited" boost into system RAM? Albeit a little slower, but with video out etc.
I have a server that I'm going to repurpose as a video renderer with a multiple storage drive bay (24). I wanted to know if this is possible? Would I need Proxmox etc.? Would the P40 model be sufficient?
I have a video on this topic using Tdarr
I'm about to try it with a 32 gb Radeon Instinct Mi 50.
Is the 8GB low-profile card good for my SFF Dell? I like my RX 550, but I could play a lot more stuff. I bet I could play Starfield at 1080p on low on the 8GB M4... is it worth the 90 bucks?
Now, I have Fit, what’s its comparison? 🤭
The Ford F150 of graphics cards. Slick!
That is a heck of a lot cheaper than the 3090s
Kind of sad that the price of these cards in my region is ridiculous... it's actually cheaper to get an RTX 3090 second-hand rather than getting the P40... and the M40 is double the price compared to the one in this video...
P100 would be better for stable diffusion
isn't 4090 faster?
Finally, I will be able to fine-tune and upgrade my gynoid. BTW the 3090 has 10,496 CUDA cores, and it's about $850 at the cheapest on the market brand new.
CUDA cores, my friend. I have this card on my table right now.
You could've made an actual comparison.
Was anyone able to get this server graphics card to play video games? Or is it only able to work the way you have it, running tasks? It's a "smart" card, like how cars are able to drive themselves.
All Tesla cards can play games; the problem with them is cooling, because there is no heatsink fan. You have to either buy your own 3D-printed shroud or have a server that blows air across the chassis.
I never knew. THANKS.
Great video👍
Can you run a stable diffusion test and show us how to set it up please!
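Not from the video, but a minimal sketch of such a test using Hugging Face's diffusers library; the checkpoint ID is a placeholder for whichever SD 1.5 model you use, and FP32 is chosen because Pascal cards like the M40/P40 have weak FP16 (swap in torch.float16 on newer cards):

# Minimal Stable Diffusion smoke test with diffusers (a sketch, not the video's exact setup).
# pip install torch diffusers transformers accelerate
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # placeholder: any SD 1.5 checkpoint
    torch_dtype=torch.float32,          # FP32: Pascal FP16 is very slow
).to("cuda")
pipe.enable_attention_slicing()         # trims peak VRAM, roughly what --medvram does in the webui

image = pipe("a tesla p40 gpu on a workbench", num_inference_steps=25).images[0]
image.save("test.png")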
I have an NVIDIA GeForce GTX 1080 Ti with 3584 CUDA cores, and I was thinking it is so old lol
Bought a Maxwell and he's bragging about it. At least get a Pascal...
Yay rmiddle
Second!
First!
Try --medvram or --lowvram. 24GB should be able to get 2048x2048 with --lowvram.