A Journey with Llama3 LLM: Running Ollama on Dell R730 Server with Nvidia P40 GPU & Web UI Interface

  • Published: 26 Oct 2024

Comments • 27

  • @kenpark4783
    @kenpark4783 5 months ago +1

    I subscribed. My 730 is on the way now and I have my 2 P40s sitting on my desk. I was amazed to find your videos of the same setup I have planned. Thanks for doing these. So far it has been really informative. I'm a Windows guy with only limited exposure to Ubuntu so far, so it has been really nice having a sanity check for that as well.

    • @MukulTripathi
      @MukulTripathi  5 months ago +1

      You'll love the setup. If your 730 has an enterprise iDRAC license, you'll be able to remote start/stop it as well. The cables that go into the P40 and riser are usually cheaply made, so that's something you'll have to keep in mind too.

    • @kenpark4783
      @kenpark4783 5 months ago

      @@MukulTripathi I've been researching the power cable issues and am going to pick a proven vendor if I can't find genuine OEM. I've seen a few references posted on Reddit. Anyway, I'm looking forward to watching the rest of the series. Thanks again!

    • @MukulTripathi
      @MukulTripathi  5 months ago

      I'm glad there's an audience for this stuff! I'm planning to make a series on the research work behind the AI articles I've published.

  • @ThomasBattle-fh8gg
    @ThomasBattle-fh8gg 5 months ago

    So I'm building this system right now and your video is a godsend. Thank you.

  • @QuantumLeapEvent
    @QuantumLeapEvent 5 months ago

    Subscribed! Great stuff, and it inspired me to make a similar setup with an r730xd (after I figure out a way to remove what seems to be a fixed-in-place dual-drive SSD rear flex bay to make space for the 2nd GPU). How are you keeping the system cool and the fan noise minimal with two GPUs and an LLM running? Or what do you suggest for keeping the system as cool as possible? Glad I found you! Thank you for making these videos!

  • @renobodyrenobody
    @renobodyrenobody 13 days ago

    Ok, thanks. More or less what I was experimenting with, and it is nice to see your explanations. That said, I think the video lacks a comparison with using the CPU instead of your GPU. Thanks anyway.

    • @MukulTripathi
      @MukulTripathi  13 days ago +1

      I have a bunch of videos that you'd like if you're interested in server builds. LLMs love CUDA cores, so I build these with Nvidia GPUs. I agree I could have done some comparisons with CPUs, for sure!

  • @saniel_cz
    @saniel_cz 3 months ago

    How many tokens per second did you achieve during inference? I am thinking about buying a P40 because of the VRAM, but I would like to know how fast its inference is. I can't find this anywhere, so your insight would be greatly appreciated :)
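[Editor's note] For anyone wanting to measure this themselves: Ollama's generate API reports `eval_count` (tokens produced) and `eval_duration` (nanoseconds), from which tokens/sec follows directly. A minimal sketch, assuming an Ollama server on its default port `localhost:11434` with a `llama3` model pulled:

```python
# Sketch: measure tokens/sec from Ollama's /api/generate response,
# which includes eval_count (tokens) and eval_duration (nanoseconds).
import json
import urllib.request

def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Convert Ollama's token count and nanosecond duration to tokens/sec."""
    return eval_count / (eval_duration_ns / 1e9)

def benchmark(prompt: str, model: str = "llama3") -> float:
    """Send one non-streaming generate request and return tokens/sec.
    Assumes an Ollama server is running at the default local address."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request("http://localhost:11434/api/generate", data=body)
    resp = json.load(urllib.request.urlopen(req))
    return tokens_per_second(resp["eval_count"], resp["eval_duration"])

if __name__ == "__main__":
    print(f"{benchmark('Why is the sky blue?'):.1f} tokens/sec")
```

Running `ollama run llama3 --verbose` in a terminal prints a similar "eval rate" figure without any code.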

  • @stamy
    @stamy 6 months ago

    Very nice video!
    What happens when you download the 70b Llama LLM file and try to put this 40GB file into the 24GB of VRAM on your GPU? Does it work?
    I am asking because I tried to use the 70b LLM on my CPU, and despite having 32GB of RAM it was not enough :)

    • @MukulTripathi
      @MukulTripathi  6 months ago +2

      I'll address this in the next video :) Essentially, you need two P40 GPUs to fit a 70b model. Two P40s give you 48GB of VRAM, and Llama3 70b, at about 40GB, fits on them perfectly.
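[Editor's note] The sizing above can be checked with back-of-the-envelope arithmetic: a quantized model needs roughly (parameters × bits per weight ÷ 8) bytes for weights, plus some headroom for the KV cache and CUDA buffers. A rough sketch, assuming Ollama's default llama3:70b is ~4-bit quantized and a notional 4GB of overhead:

```python
# Rough VRAM estimate for a quantized model.
# Assumptions: ~4.5 effective bits per weight for a Q4-class quant,
# plus a notional 4GB of overhead for KV cache and CUDA buffers.

def model_vram_gb(params_billion: float, bits_per_weight: float,
                  overhead_gb: float = 4.0) -> float:
    """Rough GB of VRAM: weights (params * bits / 8) plus fixed overhead."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb

# 70B at ~4.5 bits -> ~39GB of weights plus overhead: fits across
# two 24GB P40s (48GB total), but not on a single card.
print(model_vram_gb(70, 4.5))  # prints 43.375
```

The same formula explains why an 8b model runs comfortably on one P40 while 70b needs the pair.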

    • @wlgt3257
      @wlgt3257 6 months ago

      @@MukulTripathi looking forward to it man.

    • @MukulTripathi
      @MukulTripathi  6 months ago

      I am almost done with it. It'll be coming soon :)

  • @sridevmisra
    @sridevmisra 6 months ago

    Informative video!!

  • @danv8086
    @danv8086 4 months ago

    Is this using 1 or 2 GPUs? I did a CPU-only experiment, which was unusable for anything practical, but I've been looking at P40s and P100s.

    • @MukulTripathi
      @MukulTripathi  4 months ago

      I have done two videos: one with a single GPU and another with a dual-GPU setup. With dual GPUs it uses them both.

  • @jacksonpham2974
    @jacksonpham2974 2 months ago

    How do you pass the P40 through to an Ubuntu VM?

    • @MukulTripathi
      @MukulTripathi  2 months ago

      I used ESXi's passthrough feature for it.
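[Editor's note] Once ESXi passthrough is configured, a quick sanity check inside the Ubuntu VM is to confirm the P40 shows up on the PCI bus before installing drivers. A small sketch (the helper and sample output line are illustrative, not from the video):

```python
# Sketch: verify a passed-through NVIDIA GPU is visible inside the VM
# by scanning `lspci` output for NVIDIA controller entries.
import subprocess

def find_nvidia_devices(lspci_output: str) -> list[str]:
    """Return the lspci lines that mention an NVIDIA controller."""
    return [line for line in lspci_output.splitlines() if "NVIDIA" in line]

if __name__ == "__main__":
    out = subprocess.run(["lspci"], capture_output=True, text=True).stdout
    for dev in find_nvidia_devices(out):
        print(dev)
```

If the P40 appears (as a "3D controller"), the driver and CUDA toolkit can be installed; `nvidia-smi` should then report the card.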

  • @mk.host.here.
    @mk.host.here. 4 months ago

    How do you pass the GPU through to a virtual machine?

    • @MukulTripathi
      @MukulTripathi  4 months ago +1

      I show a step-by-step installation and GPU passthrough guide in ESXi in my previous videos. There are two of them; they are an hour long, but jam-packed with information.

    • @mk.host.here.
      @mk.host.here. 4 months ago

      @@MukulTripathi thanks

    • @jacksonpham2974
      @jacksonpham2974 1 month ago

      @@MukulTripathi Can you please give me your video links for that? I need to see the GPU and CUDA for the P40 in an Ubuntu VM.

    • @MukulTripathi
      @MukulTripathi  1 month ago

      Here is the server build playlist:
      ruclips.net/p/PLteHam9e1Fecmd4hNAm7fOEPa4Su0YSIL
      Here is the video in that playlist covering the ESXi VM setup you're looking for:
      ruclips.net/video/BO5YPIToJKo/видео.html

  • @callmebigpapa
    @callmebigpapa 3 months ago

    I have the same setup! Lik'd and Sub'd to see where this channel goes!