How to Run Any LLM using Cloud GPUs and Ollama with Runpod.io

  • Published: 1 Oct 2024

Comments • 20

  • @ZodakZach  1 month ago  +2

    My question is: why would you use Runpod and still pay their rate, when you could just put a Llama 405B (or whatever model) on an AWS server and deploy it yourself, only being charged for hosting that AWS server? That would probably be cheaper, and is probably what Runpod is doing anyway.

    • @TylerReedAI  1 month ago

      Generally they are doing the same thing, yes. I just made this video on Runpod because it is a little simpler to set up compared to AWS, which can be daunting to some people, but I don't disagree with you. I haven't run the numbers to compare prices, though! The bigger the model, the higher the cost, so you just need to be careful of that.

  • @Larimuss  2 months ago  +3

    Can you do a new guide for text gen UI as well, please? TheBloke's doesn't work anymore.

  • @chukypedro818  10 days ago

    When you close the terminal tab, the model stops running. That is not cool; the endpoint should keep working unless I choose to shut down the pod.
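
One likely cause, if the setup follows the video and `ollama serve` is launched directly in the pod's web terminal, is that the server is a child of that shell and dies when the tab closes. A minimal sketch of one workaround, detaching the process with Python so it survives the terminal closing (the log path and approach are assumptions; tmux or a Runpod template that autostarts Ollama are alternatives):

```python
import subprocess

# Launch "ollama serve" detached from the current shell so it keeps running
# after the terminal tab is closed (start_new_session=True is roughly the
# same as running it under nohup/setsid).
with open("/workspace/ollama.log", "ab") as log:  # log path is just an example
    subprocess.Popen(
        ["ollama", "serve"],
        stdout=log,
        stderr=subprocess.STDOUT,
        start_new_session=True,
    )
print("ollama serve is running in the background; the terminal can be closed.")
```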

  • @attilavass6935  6 months ago  +3

    How do Runpod serverless and pods differ in this use case, e.g. in costs? How can we minimize our costs, e.g. by stopping the pod after usage?

    • @TylerReedAI  6 months ago  +1

      Well, the idea is that if you don't have a local machine that can run models well (if at all), then depending on the model you need, you can 'rent' a cheap server on this platform. The one in my example, while up and running, was $0.79 per hour. If I stop it, it says it costs $0.0006 per hour. So the cost of holding onto it until you want to run it again, without actually terminating it, is minimal.
      I will look into scheduling the servers (if it's possible), so that, like in AWS, you can have one run for a certain amount of time per day.
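
On the cost point: a stopped pod only bills the small storage rate (the $0.0006/hour figure above), so stopping it as soon as you are done is the main lever. A minimal sketch of doing that programmatically with the runpod Python SDK; the pod ID is a placeholder and the exact call names should be checked against the current runpod package docs, so treat this as an assumption rather than the canonical workflow:

```python
import os
import runpod  # pip install runpod

# Authenticate with your Runpod API key (read from an environment variable here).
runpod.api_key = os.environ["RUNPOD_API_KEY"]

POD_ID = "your-pod-id"  # placeholder: copy the real ID from the Runpod console

# Stop the pod when you're done; you then pay only the small storage rate.
runpod.stop_pod(POD_ID)

# Later, bring it back up when you want to run the model again.
# runpod.resume_pod(POD_ID, gpu_count=1)
```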

  • @BradDStephensAIFPV  6 months ago  +2

    Since you can run a Python file there in Runpod, I'm assuming you can also serve a Gradio UI from there? Kinda like in your YouTube service video. I really appreciate all of your hard work on your channel. One of my favorite AG-centric channels.

    • @HistoryIsAbsurd  6 months ago  +1

      Yes, you should be able to do that for sure.

    • @TylerReedAI  6 months ago

      Yes, you absolutely should be able to do this! Thank you, I appreciate it 👍
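
For anyone trying this: a Gradio app running on the pod can simply call the local Ollama HTTP API and serve its own UI on an exposed port. A minimal sketch, assuming Ollama is already serving on its default port 11434, the model name is whatever you pulled, and port 7860 was added to the pod's exposed HTTP ports (all of these are assumptions, not steps from the video):

```python
import gradio as gr
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "llama3"  # placeholder: use whichever model you pulled with `ollama pull`

def ask(prompt: str) -> str:
    # Non-streaming generate request against the local Ollama server.
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "prompt": prompt, "stream": False},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["response"]

demo = gr.Interface(fn=ask, inputs="text", outputs="text", title="Ollama on Runpod")
# Bind to 0.0.0.0 so Runpod's proxy can reach the UI on the exposed port.
demo.launch(server_name="0.0.0.0", server_port=7860)
```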

  • @jarad4621  4 months ago

    Hi, what is the difference between this method and using vLLM, which I saw in the Runpod data centric video? Which way is better?

  • @lololoololdudusoejdhdjswkk347  4 months ago  +1

    Is it possible to host the server here, or is the pod just used for fine-tuning and training models?

    • @TylerReedAI  4 months ago

      You can absolutely host a server here!

    • @lololoololdudusoejdhdjswkk347  4 months ago  +1

      Just found out how and got it working: apparently you need to host it on port 80, but I hadn't selected that option when I created the GPU pod.

    • @TylerReedAI  4 months ago

      Ah, gotcha. I'm glad you got it figured out 👍
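
For reference, what this thread is describing: when you create the pod you choose which HTTP ports to expose, and Runpod then proxies each exposed port at a URL of the form https://<pod-id>-<port>.proxy.runpod.net. Ollama also needs to listen on all interfaces (e.g. OLLAMA_HOST=0.0.0.0) rather than only on localhost. A small sketch of checking such an endpoint from your own machine; the pod ID and port are placeholders (the commenter above exposed port 80 instead):

```python
import requests

# Placeholders: substitute your own pod ID and the HTTP port you exposed
# when creating the pod.
POD_ID = "abc123xyz"
PORT = 11434
BASE_URL = f"https://{POD_ID}-{PORT}.proxy.runpod.net"

# Ollama answers a plain GET on its root with a short status string,
# so this is a quick way to confirm the port is really reachable.
resp = requests.get(BASE_URL, timeout=30)
print(resp.status_code, resp.text)
```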

  • @johnbarros1  6 months ago  +1

    This is the sauce! Thank you! 🙏🏾

  • @MichaelTrader1  5 months ago

    Is it possible to use a model on the server and pass it to the local Ollama, to use it in any software locally?

    • @TylerReedAI  5 months ago  +1

      Yeah, so I think if you had an API to retrieve something from the runpod.io LLM and then bring it back locally for anything, then absolutely. You would just need the Runpod URL to send the request to. Hope that made sense. I do plan on making a video where we have something more 'production'-ready.
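
Concretely, one way to do this is to point an Ollama client on your local machine at the pod's URL instead of localhost, so any local software that talks to Ollama can use the remote model. A minimal sketch, assuming the `ollama` Python package is installed locally, the pod exposes Ollama through Runpod's proxy, and `llama3` has been pulled on the pod (the URL and model name are placeholders):

```python
from ollama import Client  # pip install ollama

# Point the client at the remote pod instead of the default http://localhost:11434.
# Placeholder URL in Runpod's proxy format: https://<pod-id>-<port>.proxy.runpod.net
client = Client(host="https://abc123xyz-11434.proxy.runpod.net")

reply = client.chat(
    model="llama3",  # placeholder: whichever model was pulled on the pod
    messages=[{"role": "user", "content": "Summarize what Runpod does in one sentence."}],
)
print(reply["message"]["content"])
```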

  • @JSON_bourne  6 months ago  +1

    Thanks!