TheDataDaddi
  • Videos: 62
  • Views: 250,119
GPU Performance Benchmarking for Deep Learning - P40 vs P100 vs RTX 3090
In this video, I benchmark the performance of three of my favorite GPUs for deep learning (DL): the P40, P100, and RTX 3090. Using my custom benchmarking suite, BenchDaddi, I assess the performance of these GPUs across three major DL architectures: CNN, RNN, and Transformers. Whether you're a data scientist, a machine learning engineer, or just an AI enthusiast, this comparison will provide valuable insights into the capabilities of these GPUs.
In this video, you'll discover:
Benchmark Tests: Detailed performance benchmarks across various AI/ML/DL workloads.
Analysis & Insights: In-depth analysis of the results, highlighting strengths and weaknesses.
Use Case Suitability: Recommendations on w...
Views: 3,751
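As context for where throughput numbers like the ones in this benchmark video come from, here is a minimal sketch of how a suite in the spirit of BenchDaddi might measure samples/sec for a training step. The names `measure_throughput` and `fake_step` are hypothetical for illustration, not the suite's actual API:

```python
import time

def measure_throughput(step_fn, batch_size, n_iters=100, warmup=10):
    """Rough samples/sec measurement (hypothetical helper, not BenchDaddi's API)."""
    for _ in range(warmup):
        step_fn()  # warm-up iterations excluded from timing
    start = time.perf_counter()
    for _ in range(n_iters):
        step_fn()
    elapsed = time.perf_counter() - start
    # total samples processed divided by wall-clock seconds
    return (n_iters * batch_size) / elapsed

# Stand-in step function; a real suite would run an actual training step
# for a CNN, RNN, or Transformer on the GPU under test.
fake_step = lambda: sum(range(1000))
rate = measure_throughput(fake_step, batch_size=32)
```

In a real run, `step_fn` would be a forward/backward pass, and results would be aggregated per GPU, model, and precision, as in the charts discussed in the comments below.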

Videos

An Open Source GPU Benchmarking Project: BenchDaddi
574 views • 2 months ago
In this video, I'm excited to introduce BenchDaddi, an innovative open-source GPU benchmarking suite that I've developed. BenchDaddi empowers users to analyze GPU performance across various AI tasks with precision and ease. Throughout this tutorial, I'll walk you through the suite's repository structure, setup, and installation process, showcasing how to leverage its benchmarking scripts tailor...
10G Networking Demystified: Tips, Cost, and Real-World Use Cases
489 views • 3 months ago
Are you considering upgrading your network to 10G but unsure where to start? Look no further! In this comprehensive guide, I'll demystify the world of 10G networking, providing you with practical tips, cost considerations, and real-world use cases to help you make an informed decision. Specific Topics Covered: - 10G Networking Basics - SFP vs SFP vs QSFP - SFP Module Types and Considerations - ...
Setting Up External Server GPUs for AI/ML/DL - RTX 3090
4.1K views • 4 months ago
🚀 Join me on a deep dive into setting up external GPUs for high-performance AI computations! In this comprehensive guide, I'll walk you through connecting powerhouse GPUs like the NVIDIA RTX 3090 with NVLink to servers such as the Super Micro 4028GR-TRT and Dell Power Edge R720. We'll cover everything from hardware requirements to configuration steps. ⚙️ What you'll learn: - The advantages of u...
8 GPU Server Setup for AI/ML/DL: Supermicro SuperServer 4028GR-TRT
7K views • 4 months ago
In Part 2 of our series, we're diving into the nuts and bolts of setting up the Supermicro SuperServer SYS-4028GR-TRT for optimal AI/ML/DL performance. Discover the step-by-step process to configure this powerhouse server, from installing the GPUs to optimizing software and hardware settings. Learn how to harness the full potential of up to 8 dual-slot GPUs for unparalleled computational power ...
Your New 8 GPU AI Daily Driver Rig: Supermicro SuperServer 4028GR-TRT
2.5K views • 4 months ago
Welcome to Part 1 of our exciting series, where we embark on a deep dive into the transformative world of artificial intelligence, machine learning, and deep learning with the Supermicro SuperServer SYS-4028GR-TRT. In this episode, we're set to explore the myriad benefits of making this powerhouse server your go-to daily driver for all your AI/ML/DL endeavors. Uncover the incredible capabilitie...
Navigating the AWS Applied Science Internship Interview: My Experience
143 views • 4 months ago
Description: In this video, I share my journey through the AWS Applied Science interview process, providing insights and tips for aspiring candidates. From the initial application to the final decision, I'll take you step-by-step through each stage, highlighting the challenges and learning experiences along the way. Whether you're an aspiring applied scientist or just curious about the tech int...
Boomers, Millennials, Gen Z: Who Faced Toughest Home Buying Hurdles? - A Data-Driven Exploration
865 views • 6 months ago
Welcome to my deep dive into the evolving landscape of home buying! In this video, I take a unique, data-driven approach to explore one of today's most pressing questions: Is it harder to buy a home now compared to previous years (1967-2022)? Join me as I dissect trends in rising home prices and stagnant wage growth, topics that have been widely discussed on social media and beyond. My journey ...
Throttle No More: My Strategy for GPU Cooling in Dell PowerEdge
1.8K views • 6 months ago
Join me on a deep dive into managing and optimizing GPU cooling in the Dell PowerEdge server series. In this video, I share my personal experiences and strategies for preventing thermal throttling to maintain peak server performance. I'll show you how I interact with the server fans by SSHing directly into the server's iDRAC, and how to access the fans using the local command line with Dell's i...
Setting Up a Full Bitcoin Core Node with Docker and Ubuntu 22.04
3.1K views • 7 months ago
🚀 Join us as we dive into the world of cryptocurrency and blockchain technology! In this comprehensive tutorial, we're setting up a full Bitcoin Core node inside a Docker container on Ubuntu 22.04. Whether you're a crypto enthusiast, a developer, or just curious about how Bitcoin nodes work, this step-by-step guide is tailored for you! 🔍 What You'll Learn: Introduction to Bitcoin Core: Understa...
AI/ML/DL with the Dell PowerEdge R720 Server - Energy, Heat, and Noise Considerations
2.5K views • 7 months ago
Dive deep into the world of high-performance computing with our thorough examination of the Dell PowerEdge R720 Server, a powerhouse for AI, ML, and DL applications. This video is not just a guide; it's an insightful exploration into the server's operational dynamics, focusing on power consumption, heat production, and acoustic management. What You'll Uncover in This Video: Power Consumption In...
Best AI/ML/DL Rig For 2024 - Most Compute For Your Money!
17K views • 7 months ago
🚀 Dive into the future of AI/ML/DL computing with our in-depth analysis of the best rigs for 2024! In this video, we take a close look at the powerhouse Dell PowerEdge R720 server, equipped with top-tier components, and see how it stacks up against custom-built setups and cloud computing solutions. 🔍 Components Breakdown: Dell PowerEdge R720 Server: A robust foundation for demanding AI tasks. ...
iDRAC7 Setup and Access Guide on Dell PowerEdge R720 Server
1.1K views • 8 months ago
Embark on a journey to harness the full potential of your Dell PowerEdge 12th Generation server with our comprehensive iDRAC7 setup and access guide! In this tutorial, we guide you through the intricate steps of configuring and accessing iDRAC7, the Remote Access Controller that empowers seamless server management. Whether you're a seasoned IT professional or a tech enthusiast, this video break...
Mastering iDRAC7: Unlocking Virtual Console & Enterprise Features on Dell PowerEdge R720
736 views • 8 months ago
Dive into the world of enterprise server management with our comprehensive guide on setting up iDRAC7 for Dell PowerEdge R720 servers. In this tutorial, we'll walk you through the step-by-step process of configuring iDRAC7 to harness the full potential of virtual console and other advanced features. Whether you're a seasoned IT professional or a server enthusiast, our detailed instructions appl...
DIY Home Network Monitoring System - Raspberry Pi 4, Smart Switch, and Wireshark
4.5K views • 8 months ago
🌐 Join me on an exciting journey as we craft a powerful Home Network Monitoring System using a Raspberry Pi 4, a Smart Switch, and the Wireshark network protocol analyzer tool! 💻🛠️ In this step-by-step guide, I'll walk you through the process of setting up and configuring the Raspberry Pi 4 to serve as a dedicated network monitor. We'll harness the capabilities of a Smart Switch with port mirro...
Dell PowerEdge R720 GPU Deep Learning Upgrade: Installing Dual Tesla P40s with NVIDIA Drivers
7K views • 8 months ago
Dell PowerEdge R720XD GPU Upgrade: Installing Tesla P40 with NVIDIA Drivers
7K views • 8 months ago
Server Data Recovery: How to Save Your System & Extract Data from a Non-Bootable Server
138 views • 8 months ago
Installing DUAL Tesla P100 GPU on Dell PowerEdge R720 Server with Driver Installation
8K views • 9 months ago
Installing Tesla P100 GPU on Dell PowerEdge R720 Server with Driver Installation
7K views • 9 months ago
AI/ML/DL GPU Buying Guide 2024: Get the Most AI Power for Your Budget
64K views • 9 months ago
Detectron2 Setup with Docker: Simplify Computer Vision - Fix Bugs & Dependencies (2023)
1.1K views • 9 months ago
Neo4j - WebSocket Connection Failure - How to Fix!!!
2.3K views • 10 months ago
Neo4j - Introduction to Querying with Real World AirBnB Data
144 views • 10 months ago
2FA Setup For Linux Home Lab or Server - Cisco Duo
1.4K views • 11 months ago
Why 3.5" Caddy (Tray) Lights Blink Amber After 2.5" SSD Replacement
123 views • 1 year ago
How to Configure a Virtual Drive?!
162 views • 1 year ago
Upgrade Your Server's Old 3.5" HDD To 2.5" SSD
2.2K views • 1 year ago
Great Budget PC Build For The Data Science/ML Beginner: P3 - Software
720 views • 1 year ago
Great Budget PC Build For The Data Science/ML Beginner: P2 - The Build
1.7K views • 1 year ago

Comments

  • @vulcan4d
    @vulcan4d 2 days ago

    So for FP16 it is an obvious choice, but what about INT8/Q8 models; how is that affected? Would love to see a comparison of Q8 and Q4 models between these cards, as there is very little info out there.

  • @Fachowiec998811
    @Fachowiec998811 3 days ago

    Is the final result DLperf score?

  • @javanerd05_29
    @javanerd05_29 3 days ago

    I don't have a GPU in my device. How can I run it?

  • @NeophytosChristou
    @NeophytosChristou 4 days ago

    Hello! What’s the wecommended amount of detotated wam I should add to the server?

  • @stunchbox7564
    @stunchbox7564 7 days ago

    What is the size of the flash drive used for the ventoy setup?

  • @stunchbox7564
    @stunchbox7564 7 days ago

    I love the dad part

  • @alzeNL
    @alzeNL 7 days ago

    Thanks for the info on the riser needed for 2x P100; really good series of informative videos you have here.

  • @RotiSlay
    @RotiSlay 8 days ago

    Great video^^, getting tired of seeing those GPU comparison videos where all they think about is gaming. It popped up right in time as I'm thinking of building a new PC. I was thinking about buying the 4060 Ti with its 16 GB VRAM to help with my thesis research; I assume the 16 GB would be really helpful for ML/DL (I used to have a 1660 with 6 GB VRAM and it was horrendous XD), but also good enough for my daily use such as streaming and editing. I'm on a tight budget, so should I squeeze a bit more to get that $470 card (the price in my country right now), or should I just wait for the RTX 50 series to come out later, hoping the older-gen prices drop?

  • @ultradroid4k
    @ultradroid4k 8 days ago

    Question regarding deep learning/AI: say I use my current old PC setup (Ryzen 3700X, 32 GB RAM, and an RTX GPU), and later I want to upgrade to the AM5 chipset, probably a Ryzen 7900 or maybe a Ryzen 9900. How does the data migration work? Can it easily be done, or is it better to upgrade the specs now before running the deep learning model?

  • @marcsanmillan3165
    @marcsanmillan3165 9 days ago

    Really enjoyed the video! Do you know if the drive bays on that server are compatible with NVMe drives? I’m thinking about getting that server but I can’t find that piece of information. Thanks!

  • @nmihaylove
    @nmihaylove 9 days ago

    There is something I don't understand. The P100 has 100x the performance of the P40 in FP16, yet this doesn't show up anywhere. What gives?

    • @vulcan4d
      @vulcan4d 3 days ago

      P40 is better at fp32 and int8 but it sucks at fp16. P100 is the opposite.

  • @TazzSmk
      @TazzSmk 9 days ago

    1080Ti is also a Pascal gpu, "only" 11GB vram, but also a viable card for the bandwidth

  • @AnishGoyal-q9s
    @AnishGoyal-q9s 9 days ago

    What do you think is better for running LLMs: 2 L40 or 1 H100?

  • @ГолосКалифорнии
    @ГолосКалифорнии 11 days ago

    Great explanation. Thanks 🙏

  • @DragonsR4Ever2
    @DragonsR4Ever2 11 days ago

    Has anyone attempted connecting more than 4 Tesla P40s to the R720XD? I've got 4 connected now, but when I originally had one of them connected to the lower PCIe slot on riser 2, I got a PCIe training error. 2 on riser 1 and 1 each on risers 2 and 3 seems to work fine. I've got another P40 in the mail that I will attempt to connect to the last slot on riser 1, but it would be awesome if I could have 6😀 or even 8. However, this machine does not support bifurcation, and adding PCIe switches to the 2 full-bandwidth PCIe slots is too expensive.

  • @stevenhe3462
    @stevenhe3462 11 days ago

    Good explanation!

  • @johnshaff
    @johnshaff 12 days ago

    Does the R720 have a GPU shroud? The R740 and greater do, as part of a GPU enablement kit. The regular shroud cuts airflow to the PCIe risers.

  • @yilso8663
    @yilso8663 14 days ago

    Nice

    • @TheDataDaddi
      @TheDataDaddi 13 days ago

      Thanks so much for the kind words!

  • @DrDipsh1t
    @DrDipsh1t 14 days ago

    Colton from hardware haven showed a nice trick in his videos for removing thermal paste: use a coffee filter instead of paper towel. It's more abrasive, durable, and doesn't leave any flakes on whatever you're cleaning!

    • @TheDataDaddi
      @TheDataDaddi 13 days ago

      Ah this is an excellent tip! Will definitely come in handy in the future. Thanks so much for the comment here!

  • @rukitorin1998
    @rukitorin1998 17 days ago

    I'm torn choosing a GPU, for AI use first (koboldcpp + SillyTavern) and gaming second. My choices were a 3060 12GB at first, then the 4060 Ti 16GB stood out, but then the 4070 Ti Super got recommended to me. I intend to use the card for at least 3-5 years. The only thing limiting me is my small budget. I could buy the 3060 now and the 4060 Ti after a few weeks, while I wait and watch out for deals on the 4070...

    • @TheDataDaddi
      @TheDataDaddi 13 days ago

      Hi there. Thanks for the comment! I am not super familiar with koboldcpp or SillyTavern, so please take what I say with several grains of salt, but from my brief research they need an AI model integrated in some way. I am assuming you want to host this locally. For that I would go with the GPU with the largest VRAM, so the 4060 Ti 16GB is the clear choice in my book.

    • @rukitorin1998
      @rukitorin1998 13 days ago

      @@TheDataDaddi yeah. Basically locally hosting an AI model on my PC. I'm not really into machine learning as of yet. Currently I have a 2060 in my PC and 6 GB isn't really enough. Another question: is AMD not a good alternative?

    • @TheDataDaddi
      @TheDataDaddi 13 days ago

      @@rukitorin1998 It can be, but AMD is generally not as easy to use for machine learning. However, it may be applicable for your use case. AMD definitely seems to offer better price to performance, but there are still a lot of bugs from what I understand. I also cannot strongly recommend AMD because I have not personally dabbled there. What I am telling you now is just based on feedback I have received from viewers.

    • @rukitorin1998
      @rukitorin1998 18 hours ago

      @@TheDataDaddi thanks for responding, sorry for the late reply. :D I will be buying the 4060 Ti 16GB soon when I find a cheaper price point. Living in the Philippines, prices are somewhat higher: 30k PHP ($519.08) to 32k PHP ($553.68) is the range I'm looking at right now.

  • @DoomhauerBTC
    @DoomhauerBTC 17 days ago

    Excellent video. I very much appreciate the time and research that you put into this.

  • @anthonyhoward1296
    @anthonyhoward1296 18 days ago

    I have the same server, but I can't find the 20-core CPU E5-2670v2 you mention. Can you provide some links? Thanks for your time and info!

    • @TheDataDaddi
      @TheDataDaddi 13 days ago

      Hey there! Thanks for the comment. Yeah, I have gotten a couple of comments on this, and I think I misspoke in the video. I think I meant 20 logical cores, not physical. Sorry for the confusion here.

  • @bazgo-od7yj
    @bazgo-od7yj 19 days ago

    Where did you get the $149 figure for the P100 from? I got one recently, but for about $40 more.

    • @TheDataDaddi
      @TheDataDaddi 13 days ago

      Hey there. Thanks for the comment! The prices vary almost daily, so it's hard to say why the price is so different. However, I got the values for this video from eBay with respect to my region. I found the lowest price from a reputable seller (i.e., good ratings and more than 500 reviews).

  • @MJOLNIR242
    @MJOLNIR242 19 days ago

    I actually just purchased a Dell Precision 7820 workstation off eBay for $300. It came with 2 Xeon Silver 4114s, 32 GB RAM, a Radeon Pro WX 2100, and a 500 GB SSD. I upgraded the CPUs to 2x Xeon Gold 6138 (20 cores each) for a total of 40 cores / 80 threads. Cost me $70 for those two, and it's a workhorse. It's practically a server in a workstation that could easily run AI/ML tasks, all for under $500. Also, I think your E5-2670v2 is a 10-core CPU, so it would be 20 cores total, not 40.

    • @TheDataDaddi
      @TheDataDaddi 13 days ago

      Awesome! That is gonna be a great setup for you. Yes, I think I misspoke in the video. I probably meant 20 logical cores not physical. Sorry for that.

  • @joeeey0079
    @joeeey0079 20 days ago

    Maybe it’s a beginner question: what is the difference between your 1-year-old video on a beginner’s ML rig and this server? I assume the earlier one is more like a sandbox for machine learning hobby projects, while this server is more for local hosting of LLMs. Or maybe they are in different price ranges too. Could you please clarify?

    • @TheDataDaddi
      @TheDataDaddi 13 days ago

      Hi there. Thanks so much for the question! Yep, that is a fairly accurate view. The first ML rig, the custom build, is more for smaller/single projects. It has also worked well for me as a nice test bed before scaling up projects. I think it was about $1000 at the time I made the video. Here is what it would cost today: pcpartpicker.com/user/sking115422/saved/#view=zrXn23 The server in this video was designed for large-scale DL applications: working with some of the smaller open-source LLMs, diffusion workloads, running many projects at once (this one has been a godsend), and large-scale GNNs. I would say this server is equipped to do pretty much anything except work with very large LLMs (think Llama 70B, as that pretty much requires a cluster). I believe this rig cost me about $8300 to build. However, a large portion of that was RAM, and I also bought RTX 3090s instead of P40s or P100s, so you could scale things down a bit and save a lot of money here. I chose these specs for particular use cases that may not be applicable to everyone. I believe I address that in part of the video series if you are interested in more details.

  • @cmdr_talikarni
    @cmdr_talikarni 21 days ago

    Things have changed, time to step into an Arc A770. Will need a slight adjustment with the tools but now you can get the higher end GPU for a lot less money.

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hi there. Thanks so much for the comment! This is great to hear! What AI/ML/DL applications have you used the ARC A770 for? I would be super curious to know if you have tried it out for anything yet!

  • @viraldailyz
    @viraldailyz 21 days ago

    I would only want to run large language models, are the p40s good enough or do I need the p100s? PS: Great video, very informative!

    • @viraldailyz
      @viraldailyz 21 days ago

      and do I need the 3090 then?

    • @viraldailyz
      @viraldailyz 21 days ago

      and do i need the 3090s then? :) Thank you

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hi there. Thanks so much for your comment and kind words! So if your goal is to work exclusively with LLMs, I would go with the RTX 3090 or the RTX Titan; both are well suited for LLM workloads.

  • @bitcode_
    @bitcode_ 23 days ago

    AFAIK you cannot run CUDA versions higher than v10 on these; could someone confirm?

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hi there. Thanks so much for the comment! I have successfully run CUDA 12.2 with no issues. I have not tried any newer versions though as of yet.

  • @DragonsR4Ever2
    @DragonsR4Ever2 23 days ago

    I recently got an R720XD and fitted a few P40s externally. All was well until I added a 4TB TeamGroup QX SSD. On reboot, the created virtual disk disappears and the physical disk shows as failed. It works fine on a fresh start, but I still get a blinking yellow light on the drive and a few error symbols. Should I use software RAID or a different card, or any other workarounds? Anyone have this problem? I'd hate to have to buy a different SSD😢

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hi there. Thanks so much for your comment! I have a video related to this topic. Link is below: ruclips.net/video/oQqcXyS2WQ4/видео.htmlsi=iisyvWNAJlWJwQQD If it is just an error light on the drive, you should be fine. I have been using TeamGroup drives for a while now and have been very happy with them. If there are issues booting after installation, there is likely a problem with the drive and you might want to have it replaced. I believe TeamGroup offers at least a 3-year warranty on most of their products. Might be worth reaching out to have it replaced if there are issues.

    • @DragonsR4Ever2
      @DragonsR4Ever2 19 days ago

      @@TheDataDaddi I ended up getting it working by flashing the PERC H710 Mini D1 with LSI IT-mode firmware and installing the boot images using the guide on fohdeesha. It's working great, and I now have 4 P40s mounted on a GPU rack above the case. I still haven't gotten the fans not to run at 100% yet, but when I get some extra time I'll continue digging into the IPMI tool. I've gotten it to return the fan info, but it doesn't seem to respond to turning off automatic fan control or manually controlling the fan speed.

    • @DragonsR4Ever2
      @DragonsR4Ever2 17 days ago

      I resolved my fan problem by using RACADM. I cleared the iDRAC logs and disabled the PCIe fan response. Fans are running at 10% at idle now. I watched your video on installing the iDRAC Service Module, and I'd like to note that version 10.x has an Ubuntu-specific package, so you don't have to do any converting. I am running Ubuntu 20, so all I did was run the shell script and voila, all done.
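For readers following this thread, manual fan control on 12th-gen PowerEdge servers is commonly done with raw IPMI commands. The sketch below builds the raw byte sequences widely shared in the homelab community; these are unofficial, vary by server generation, and should be verified against your own iDRAC before use:

```python
# Community-documented (unofficial) raw IPMI commands for Dell PowerEdge
# fan control. Verify these byte sequences for your specific generation;
# Dell does not officially support them.
DISABLE_AUTO = ["ipmitool", "raw", "0x30", "0x30", "0x01", "0x00"]

def set_fan_percent(pct):
    """Build the raw command that pins all fans to pct% duty cycle."""
    if not 0 <= pct <= 100:
        raise ValueError("fan duty cycle must be 0-100%")
    return ["ipmitool", "raw", "0x30", "0x30", "0x02", "0xff", hex(pct)]

# e.g. run DISABLE_AUTO first, then set_fan_percent(10) for the ~10% idle
# speed mentioned above (via subprocess.run, or over the network with
# ipmitool -I lanplus -H <idrac-ip> -U <user> -P <pass>).
```

Re-enabling automatic control (`ipmitool raw 0x30 0x30 0x01 0x01`) is advisable before any heavy GPU workload, since a pinned low duty cycle can allow thermal throttling.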

  • @Emerson1
    @Emerson1 26 days ago

    great video

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hi there. Thanks so much for the kind words! Really appreciate you watching!

  • @iasplay224
    @iasplay224 28 days ago

    Can you try to share the .deb packages? I cannot make alien work; it says the following: "unpacking of XXX failed at /usr/share/perl5/Alien/Package/Rpm.pm". Looked around, can't seem to find a solution.

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hey there. Thanks so much for the comment! Unfortunately, I cannot seem to find the .deb package for this solution. However, here is an alternative way that you might be able to convert the RPM to DEB: stackoverflow.com/questions/61932118/convert-rpm-to-deb-without-alien Let me know if it works!

  • @darnell8897
    @darnell8897 28 days ago

    TDD, great effort. The great P40 vs. P100 question has been heating up for a while now and is only growing. Good on you for getting some good information out to the community. Some observations: Is the "Scaled Throughput per $ By GPU" chart accurate? Is it actually measuring 'Scaled CPU Throughput / $'? I could be misinterpreting the chart... forgive me if that's the case. Dividing the 'Scaled Throughput by GPU' number in the top chart (@17:23) by the prices you gave ($161.99, $149, $819.95): P40 0.034508 / 0.03315; P100 0.043020 / 0.04101; RTX 3090 0.020684 / 0.01855 (making the P100 about double the _value_ of the RTX 3090 vis-a-vis scaled CPU). Value is the meat and potatoes of these benchmarks for a lot of us, so I figured I'd ask for clarification in case others are confused as I am. A very minor observation, and again I might be misinterpreting: the graphic representation of the 'Broad Performance' chart (@17:23 for reference) doesn't seem to correlate with the labelled scale on the left (or the relative proportions for a single bar either...). For instance, the blue 30.42 for the dual RTX 3090s looks way more than double the turquoise 16.96 for the single card, and also nowhere near the 47-ish labelled on the left of the chart. I'm no data scientist, so maybe exaggerating proportions is common practice to highlight differences? You have the best videos around about the relative price:performance of these cards, so keep 'em coming.

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hi there. Thanks so much for the comment! 1) I think this is related to a similar question from another viewer, so I will post the response here: this has to do with the way the average is calculated. The CPU-scaled throughput for each scenario is divided by the price of the GPU(s) for each scenario (GPU, number of GPUs, model, precision, task, etc.). For more specific details, please take a look at the raw data in the Google Sheet link in the video description; it should make things clearer if there is confusion here. After some review, in a fair number of cases the LSTM did not see much performance benefit over the CPU. This drove the CPU-scaled throughput per dollar way up nominally, and these values artificially dragged up the overall average. I tested this by switching the aggregation method to median rather than average, and the values are more in line with what you would expect based on your observation. You can also see this if you just look at the BERT or ResNet50 model scenarios; those are much more in line with expectations. In summary, the numbers do appear to be correct even if they are higher than expected globally. The data when training the LSTM was significantly different from the other models, which leads me to believe there may be some issue with the model setup, dataset, or hyperparameters (or some other reason I am missing). My gut tells me I just did not use large enough batch sizes or a large enough dataset. In any case, more digging will be required to understand why this occurred. 2) This does look weird. The values listed are for each part of the bar, so you would need to add both together for the top of the bar to align with the Y-axis. I have since changed it to 2 separate bars; if you visit the report now, you should find it much more readable. Sorry for the confusion. Really appreciate the kind words! Thanks so much for watching.
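The mean-versus-median effect described in this reply is easy to reproduce. The sketch below uses made-up throughput-per-dollar numbers (not the video's actual data) with a single LSTM-style outlier to show how the average gets dragged up while the median stays representative:

```python
from statistics import mean, median

# Illustrative (made-up) throughput-per-dollar values across benchmark
# scenarios: most cluster together, but one scenario is an outlier
# because the GPU barely beat the CPU baseline there (as with the LSTM).
per_dollar = [0.033, 0.035, 0.031, 0.034, 0.40]  # last value: the outlier

avg = mean(per_dollar)    # dragged up by the single outlier
med = median(per_dollar)  # robust to it
assert med < avg          # the median stays near the typical scenarios
```

This is why switching the report's aggregation from average to median brings the numbers back in line with per-scenario expectations.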

  • @jdcodersteinersky7257
    @jdcodersteinersky7257 28 days ago

    Confused and very new to this, but interested in a GPU build. Reading about NVLink on NVIDIA's site, it says it was first introduced with the P100, but in the specs you showed, only the 3090 shows "Yes" for it. Does the model of the P100 you tested lack that? If so, it seems a model with NVLink could potentially still be a good deal. Great content. Thanks!

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hi there. Thanks so much for the question! There is definitely a lot of confusion around which GPUs support NVLink; I myself was confused until recently. Basically, for the Pascal series, NVLink only exists for the SXM form factor. If you go with PCIe, like most people, there is no option for NVLink (or at least none that I have ever seen). So to clarify: both the P100 and P40 GPUs tested in this video are not NVLinked.

  • @Seventeen76
    @Seventeen76 28 days ago

    I scored a 3080 10GB for $353.00 out the door a few weeks back to replace/alternate with a 3060 12GB.

    • @TheDataDaddi
      @TheDataDaddi 21 days ago

      Hi there. Thanks so much for the comment! Man! That is an excellent price. Great pick up! Hope it works well for you!

  • @Ricky_III
    @Ricky_III 29 days ago

    Just picked up a 3060 12GB for under $200. Glad to see it stacks up with a P40; just unfortunate that the P40 is about the same price with more RAM. I just couldn't have those loud small fans in my PC.

    • @TheDataDaddi
      @TheDataDaddi 29 days ago

      Hi there. Thanks for the comment! Yeah, the RTX 3060 is a good choice for sure, and it has 2nd-gen tensor cores as an added bonus. Definitely agree that having noisy fans in a small PC is not the best way to go.

  • @dslkgjsdlkfjd
    @dslkgjsdlkfjd 1 month ago

    damn daddi that video was great

    • @TheDataDaddi
      @TheDataDaddi 1 month ago

      Hi there! So glad you found this video helpful. Really appreciate the kind words!

  • @gdmax5
    @gdmax5 1 month ago

    Great job dude, one of the cleanest explanations I have seen lately ❤

    • @TheDataDaddi
      @TheDataDaddi 1 month ago

      Hi there. So glad you enjoy the content! Really appreciate the kind words!

  • @darrylosterland276
    @darrylosterland276 1 month ago

    This is great, thank you for sharing! Same system here, and this works amazingly. Appreciate the time and effort you put into this!

    • @TheDataDaddi
      @TheDataDaddi 1 month ago

      Hi there! Thanks so much for the comment. Happy to do it. So glad this was able to help you!

  • @callmebigpapa
    @callmebigpapa 1 month ago

    Again this is great content! Commenting for the Algo.

    • @TheDataDaddi
      @TheDataDaddi 1 month ago

      Thanks again for commenting! Can't tell you how much I appreciate the support!

    • @callmebigpapa
      @callmebigpapa 1 month ago

      @@TheDataDaddi I just bought a P100 based on your testing and my budget. Already have a P40 for inference.

    • @TheDataDaddi
      @TheDataDaddi 1 month ago

      @@callmebigpapa Awesome man! Hope it works out well for you. Let me know if you have any questions on setup or anything.

  • @callmebigpapa
    @callmebigpapa 1 month ago

    This is great content keep it up! Lik'd and Sub'd

    • @TheDataDaddi
      @TheDataDaddi 1 month ago

      Hi there. Thank you so much for the like, sub, and the kind words. Really appreciate the support!

  • @Guiltia_Sin_Python
    @Guiltia_Sin_Python 1 month ago

    3080 Ti secondhand or 4060 Ti new? Which is best for ML or DL?

    • @TheDataDaddi
      @TheDataDaddi 1 month ago

      Hi there. Thanks so much for the question! This is a tough one; I think you really can't go wrong either way. Both are solid choices for ML/DL GPUs. However, for about the same price (I checked just now, and they seem to be about the same price on eBay in my area), I think I would go with the RTX 3080 Ti. You get much better performance, and the jump from 12 GB to 16 GB of VRAM is not significant enough to outweigh the performance difference.

  • @dleer_defi
    @dleer_defi A month ago

    I just had an R730xd arrive at my house today. I want to do a deep learning build. How would you rate the dual Tesla P40s after a few months of use?

    • @TheDataDaddi
      @TheDataDaddi A month ago

      Hi there. Thanks for the question! I am really happy with them. I still think that they are the best GPUs for the price at this current moment.

  • @emonsysemonsys
    @emonsysemonsys A month ago

    Can we reach out to you for university AI requirements?

    • @TheDataDaddi
      @TheDataDaddi A month ago

      Absolutely. Please feel free to contact me any way you like. All of my contact information can be found in my YouTube bio. I will also paste it below for convenience. 🐦 X (Formerly Twitter): @TheDataDaddi 📧 Email: skingutube22@gmail.com 💬 Discord: discord.gg/RyRHEn3yMx

  • @sandeepvk
    @sandeepvk A month ago

    First thing I would do is train a model using free cloud resources or a laptop GPU. By then you can tell if this is a career you want to pursue.

    • @TheDataDaddi
      @TheDataDaddi A month ago

      Hi there. Thanks for the comment! I would definitely agree here. If you are not really sure whether AI/ML/DL is a career or a passion for you, taking advantage of resources like Google Colab, Kaggle, and others like them would be a great way to figure things out before purchasing any hardware. Once you decide that you do want to work a lot in this area, though, I think buying your own hardware makes the most sense right now for the majority of people.

  • @tsclly2377
    @tsclly2377 A month ago

    Thanks!

    • @TheDataDaddi
      @TheDataDaddi A month ago

      Hi there! Thank you so much. This is amazing. Really really appreciate your generosity. Can't tell you how much this helps the channel.

  • @tsclly2377
    @tsclly2377 A month ago

    Hey... what do you think about the AMD Radeon Instinct MI line of GPUs? I'm starting out with the P40 also, but as you said in your last video, one has to seek out the program versions that don't use FP16 on the P40, or convert all those code lines and operations to FP32 (and hope for the best!?). I have an aversion to TLC NVMe and always look at the petabyte write levels the NVMe drives give, settling on the S35XX-P37XX X4-X8 (and Optane) PCIe drives for the GPU dumps and fast loads. Me: going ML350 G9 (still PCIe 3); my SSDs are slower SLCs in RAID 5, and I'm adding InfiniBand 40Gb for the second machine (the winter heaters). I estimate my cost is three times yours, because I bought the units two-three years ago with 128GB memory (installed).

    • @TheDataDaddi
      @TheDataDaddi A month ago

      Hi there. Thanks so much for the comment! A lot to unpack here.

      1) I want to start by saying that, in my experience, my DL models tend to use FP32 by default. Maybe with some of the LLMs this is not the case; I am honestly not sure. Your point still stands, though: the P40 is generally bad at FP16 or mixed precision (MP) operations, or put another way, you do not get any speedup from using FP16 or MP with this GPU. Check out my video here for the actual numbers: ruclips.net/video/OCx2xr5Xaj8/видео.html The point I am trying to make is that while the P40 gives you no speedup for FP16 or MP, you do not lose any speed either, so it is not nearly as bad as the theoretical specs indicate. Also, for many programs you would not need to seek out specific versions that are FP32 compatible; it would actually be the opposite. You would need to find versions that use FP16 or MP by default. And even if a program defaulted to FP16 or MP, the P40 (based on my testing) would run it just as well as if it were FP32.

      2) If you are going to go the AMD route, the MI series is probably the best way to go. AMD GPUs in general offer a ton of compute for the money and look really attractive on paper. However, based on the feedback I have gotten from viewers who have gone this route, it is a pain to get things set up correctly. I will paste below the exact conversation I had with a viewer recently.

      """
      > What are your thoughts on AMD GPUs in general?

      There is so much to say here I don't even know where to start. AMD is not worth it if you value your time, but once it's working it is fairly decent and a good alternative to Nvidia. I built my PC around a year ago, and back then I would not have recommended it due to the lack of online resources. But since then, backends and troubleshooting support have gotten significantly better. The hardware is great! It's the software, optimization, and lack of general AMD support that make it incredibly time consuming to deal with. The documentation for Instincts is so poor that I had to use Nvidia's P100 airflow recommendations to get even an idea of how to build my system haha. MLC LLM is a promising project that is optimizing LLM inference though, and it is looking to be an alternative to current CUDA wrappers.

      > How buggy is ROCm? Is it still bad as of now?

      ROCm has improved a lot since then. In fact, ROCm itself (the non-driver portion) is generally working very well! The problem comes from their closed-source proprietary drivers, most infamously amdgpu-dkms. You need very specific kernels and very specific distributions to successfully build the kernel module. Not only that, AMD refuses to fix the infamous GPU reset bug, which makes stopping and starting VMs incredibly frustrating. For modern AMD cards you should run into fewer problems, but for Instinct cards you will need to do a lot of troubleshooting. Motherboard support is also very poor for Instinct cards: you will run into the GPU reset bug, trouble initiating the GPU, trouble detecting the GPU, and so much more. It took me weeks of non-stop work to get x2 MI100s even detected last year, and countless hours of headache and trial and error without any help from AMD. I tried messaging their customer service and only got this response: "Your motherboard is unsupported. I can't help you." Unfortunately, the bugs keep coming. To get amdgpu-dkms working on Debian 12 I needed to troubleshoot a driver build problem; AMD shipped corrupted build packages. Pain lol. I'm currently dealing with a problem where the amdgpu driver keeps failing to get user pages, and it is extremely inconsistent to get llama.cpp running. I could keep going, but I think you get the point. haha
      """

      Personally, I have not gone down the AMD rabbit hole at this point, for many of the reasons mentioned above. However, when I get more time I would like to. I think if you could ever get things running, AMD might be a much more cost-effective way to go. It's just a question of how much time it is worth to work through all the bugs.

      3) Damn, that is awesome. Love your setup. Are you running a cluster, or do you have them as two separate machines? Yeah, I bet you paid quite a bit more a couple of years back. Interestingly, I feel like a lot of servers and adjacent hardware came down in price significantly in recent years, but has started to get more expensive again with the AI craze.
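To put numbers on why the P40's spec sheet looks worse than its real-world behavior, here is a quick sketch comparing approximate peak-throughput figures taken from NVIDIA's public datasheets. Treat the TFLOPS values as ballpark; exact figures vary by board and clock speed:

```python
# Approximate peak throughput in TFLOPS, from public NVIDIA spec sheets.
# Exact figures vary by board and clocks; treat these as ballpark numbers.
specs = {
    "P40":      {"fp32": 11.8, "fp16": 0.18},  # GP102 runs FP16 at 1/64 rate
    "P100":     {"fp32": 9.3,  "fp16": 18.7},  # GP100 has native 2x FP16
    "RTX 3090": {"fp32": 35.6, "fp16": 35.6},  # 1:1 outside the tensor cores
}

for gpu, s in specs.items():
    print(f"{gpu}: FP16 runs at {s['fp16'] / s['fp32']:.2f}x its FP32 rate")
```

On paper the P40's FP16 path is crippled, but since most frameworks default to FP32, and in my testing FP16/MP workloads on the P40 simply run at roughly FP32 speed rather than slower, the practical penalty is much smaller than these ratios suggest.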

  • @TuMusicaTV
    @TuMusicaTV A month ago

    I can't find the custom cable guy's link

    • @TheDataDaddi
      @TheDataDaddi A month ago

      Hi there. Thanks for letting me know! Here is the link to the cable that has worked best for me: a.co/d/0317ir6Y

  • @kazadori164
    @kazadori164 A month ago

    very informative, but this guy talks like Elon; have to watch at 1.25-1.5x speed.

    • @TheDataDaddi
      @TheDataDaddi A month ago

      Hi there. Thanks so much for the feedback! Lol. Not sure if being compared to Elon is a good thing here, but I do know I am a bit slow when presenting. This is feedback I have received in the past. I am working on making my videos faster and more concise, and on talking faster. I hope you found the video useful despite my delivery. I appreciate the feedback again and your view!

  • @gamingthunder6305
    @gamingthunder6305 A month ago

    I'm considering a second P40. Do you know if ComfyUI supports a dual setup?

    • @TheDataDaddi
      @TheDataDaddi A month ago

      I am not super familiar with ComfyUI. From my research, though, it does look like it supports a dual-GPU setup. It looks like you can run separate instances on each GPU, and also run a distributed setup by using the ComfyUI_NetDist extension. Here are some links that might be helpful to you: github.com/comfyanonymous/ComfyUI/issues/155 github.com/city96/ComfyUI_NetDist github.com/comfyanonymous/ComfyUI/discussions/836
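As a rough sketch of the separate-instances route, you can pin each process to one GPU with CUDA_VISIBLE_DEVICES. The ports and the main.py path below are placeholders for illustration, so adjust them to your own ComfyUI install:

```python
import os
import subprocess  # used in the commented launch lines below

def gpu_env(gpu_index: int) -> dict:
    """Copy the current environment, pinning the child process to one GPU."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env

# Hypothetical: one independent ComfyUI instance per P40, on separate ports.
# Uncomment and adjust the path/port for your install.
# subprocess.Popen(["python", "main.py", "--port", "8188"], env=gpu_env(0))
# subprocess.Popen(["python", "main.py", "--port", "8189"], env=gpu_env(1))
```

Each instance then sees exactly one GPU as device 0, so the application itself needs no multi-GPU awareness.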

    • @gamingthunder6305
      @gamingthunder6305 29 days ago

      @@TheDataDaddi I was not aware of the NetDist node. That solves a problem for me. Thank you for your response.