Run Your Own Private ChatGPT, Free and Uncensored, with Ollama + Open WebUI

  • Published: 29 Jun 2024
  • Run an Uncensored Private ChatGPT Clone on your Computer for Free with Ollama and Open WebUI
    In this video, we'll see how you can use Ollama and Open WebUI to run a private ChatGPT clone locally for free, using open-source large language models (LLMs) such as Meta's Llama 2 and Mistral.
    This lets you try out different models, and even use uncensored models.
    Don't send your private data to OpenAI's ChatGPT or Anthropic's Claude.ai; keep it private on your PC or Mac.
    👍 Please like if you found this video helpful, and subscribe to stay updated with my latest tutorials. 🔔
    ❤️ You can support this channel by buying me a ☕: buymeacoffee.com/codesfinance
    For written instructions and additional details, check out my blog post: vincent.codes.finance/posts/o...
    🔖 Chapters:
    00:00 Intro
    00:50 Ollama
    03:54 Ollama CLI
    07:15 Open WebUI
    07:48 Open WebUI & Docker Install
    10:26 Open WebUI Chat
    13:23 Modelfiles and Prompts
    16:10 Outro
    🍺 Homebrew installation commands:
    brew install ollama
    brew install docker # For the background process
    brew install --cask docker # For Docker Desktop
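    🐳 Open WebUI Docker command (the one given in the Open WebUI README at the time of writing; check their GitHub page for the current version. The UI is then served at http://localhost:3000):
    docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main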
    🔗 Video links:
    Ollama: ollama.com/
    Open WebUI GitHub: github.com/open-webui/open-webui
    Open WebUI: openwebui.com/
    Docker: www.docker.com/
    🐍 More Vincent Codes Finance:
    - ✍🏻 Blog: vincent.codes.finance
    - 🐦 X: / codesfinance
    - 🧵 Threads: www.threads.net/@codesfinance
    - 😺 GitHub: github.com/Vincent-Codes-Finance
    - 📘 Facebook: / 61559283113665
    - 👨‍💼 LinkedIn: / vincent-codes-finance
    - 🎓 Academic website: www.vincentgregoire.com/
    #ollama #llama3 #openwebui #chatgpt #llm #gpt #opensourceai #llama2 #mistral #bigdata #research #researchtips #uncensored #private #claude #anthropic #llama #openai #opensource
  • Science

Comments • 79

  • @VincentCodesFinance • 1 month ago

    👍 Please like if you found this video helpful, and subscribe to stay updated with my latest tutorials. 🔔
    ❤ You can support this channel by buying me a ☕: buymeacoffee.com/codesfinance

  • @goldmund22 • 2 months ago

    Thanks for this in-depth video. So far you are the only one who has explained the whys and the how of downloading Docker. Will subscribe for more.

  • @drumbyte • 3 months ago +3

    Great video, very informative! Thank you for sharing!

  • @KevlarMike • 2 months ago +1

    Thanks, this is exactly what I was looking for.

    • @VincentCodesFinance • 2 months ago +1

      Happy to help! Make sure to try the new Llama 3 that was recently released; it's a big step up from the Llama 2 used in the video.

  • @cucciolo182 • 3 months ago +1

    Just what I was looking for. Thank you very much.

  • @zandanshah • 3 months ago

    Keep the good work going, 5 stars!!!

  • @AvgJane19 • 3 months ago

    THIS IS SO AWESOME TY SO MUCH!!

  • @tiffanyw3794 • 1 month ago

    Thank you, this is the best video explaining how to do this.

  • @englishmimics • 1 month ago

    Thanks for sharing! Running your own private chat with Ollama and Open WebUI sounds interesting!

    • @VincentCodesFinance • 1 month ago

      It is! In my latest video I show an even simpler UI called Msty, a desktop app that comes bundled with Ollama (or can use your existing Ollama). If you don't mind using a closed-source UI, it's the nicest one I've seen so far: ruclips.net/video/REEYqYEtqAc/видео.html

    • @englishmimics • 1 month ago

      @@VincentCodesFinance That sounds cool! It's awesome that you're checking out simpler UI options. Thanks for sharing!

  • @lalpremi • 3 months ago

    Thank you for sharing, looking forward to testing this on my Debian workstation :-)

  • @nfrcomputers • 3 months ago

    Awesome, it works on my main Win 10 PC, which is now getting a little old (i7 4790K, RX 580, 16 GB DDR3). Gonna build a home server PC running Linux with 64 GB of RAM soon; hopefully that can handle it better. Definitely slower than ChatGPT right now. It runs a little faster if I run it directly from the terminal instead of using Docker/Open WebUI.

    • @VincentCodesFinance • 3 months ago +1

      For a Linux PC you'll also want a GPU with a lot of VRAM. On a Mac the RAM is shared between the CPU and GPU, so there is only one RAM number to care about.

  • @froggy5967 • 3 months ago

    Thanks for the video. Just a question about your keyboard: what keycaps and switches do you use, and did you lube them?

    • @VincentCodesFinance • 3 months ago

      I have a few different keyboards. The one I was using while recording this video is a Keychron K8 with Gateron brown switches (no custom lube).

  • @robwin0072 • 5 days ago

    Thank you for this walk-through.
    At 01:43 you spoke of 'chat' tags.
    Two things:
    1. There was a 'text' tag three lines above 'chat'; what benefits come with the 'text' command-line instruction?
    2. I did not notice at what point you copied and pasted the 'chat' command-line instruction.

    • @VincentCodesFinance • 2 days ago +1

      1. Text models are optimized for text completion instead of chat-style querying. For most ChatGPT-style use cases, the chat variant is the one you want.
      2. You can copy any of the tags listed there (you should use llama3 now, not llama2, or one of the newer uncensored models). The only thing to be aware of is that the command that gets copied is "ollama run modelname" instead of "ollama pull modelname". The run command will trigger a pull if necessary and download the model, but it will also load the model into memory and make it available in the command line.
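
      To illustrate the difference (a quick sketch, using llama3 as the example model):

      ollama pull llama3   # downloads the model weights only
      ollama run llama3    # pulls if needed, loads the model, and starts an interactive chat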

  • @commanderpaladin • 3 months ago

    Hi. Amazing video! WOW! By the way, a guy on YouTube called Network Chuck copied your video and idea and tutorial 😠

    • @VincentCodesFinance • 3 months ago +1

      Thanks, glad you enjoyed it! I saw his video; it covers the same topic, but I can't claim that he "copied" my video. In any case, looking at my stats, a lot of my views are coming from people watching his video, so I'm winning as well 😀

    • @commanderpaladin • 3 months ago

      @@VincentCodesFinance ❤️

  • @slawomirr12 • 3 months ago +4

    These models are trained with a limited amount of information. So they don't learn from chatting with users, from their questions, their prompts, or the tasks they give them to solve, do they? They're trained once and then given out to the public. It means we are actually still very far from real artificial intelligence that gathers its own experience, learns from its own errors, etc. To train them and make them better you still need a company like Meta and their compute farms, at least when it comes to big models like LLMs. Can someone correct me if I am wrong here?

    • @VincentCodesFinance • 3 months ago +2

      You need large infrastructure to train a base model like llama2, but what you are describing can be achieved (in theory) with fine-tuning. I'm still early in my fine-tuning experiments. If you want to try it yourself and are on a Mac, this is a pretty decent tutorial: apeatling.com/articles/simple-guide-to-local-llm-fine-tuning-on-a-mac-with-mlx/

    • @SiliconSouthShow • 2 months ago

      I have found that by tweaking how you use documents in OWU you can have it learn (sorta): it knows the data and uses it alongside its existing knowledge in long-running interactions.

    • @kc-jm3cd • 2 months ago

      You can’t run it anyway

  • @Laszlo34 • 3 months ago +1

    This does work! Pretty nerdy! Umm... so I tried llama2-uncensored (because why would I choose to keep information from myself?? duh). Yeah, it's... not very bright:
    >>> How many grams in 1L of water?
    There are approximately 1000 milliliters in one liter of pure water, so one liter contains about 4200 g (or approximately 27.5 kg) at room
    temperature.
    >>> You are very wrong. In 1000 milliliters there are 1000 grams of water which is exactly 1 kg.
    Thank you for correcting me! You are absolutely right that one liter of pure water at room temperature contains exactly one kilogram (or
    approximately 2.2 pounds) by mass. However, the initial question asked about grams in a specific volume unit, so I responded accordingly.
    >>> How many grams in 1 cubic centimeter of water?
    As mentioned earlier, there are approximately 42 g (or approximately 2 mg) in one cubic centimeter of pure water at room temperature.

    • @VincentCodesFinance • 3 months ago

      Yes, llama2-uncensored is based on the llama2 7B q4 model, which is fast but not very powerful. They also have wizard-vicuna-uncensored, also based on llama2, which has a 30B version that should be a bit better.
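
      To try it (assuming the tag is still listed this way on ollama.com):

      ollama run wizard-vicuna-uncensored:30b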

    • @Laszlo34 • 3 months ago

      @@VincentCodesFinance Thank you! Will try it!

    • @Laszlo34 • 3 months ago

      >>> Give the weight of 1L of water, in grams.
      The weight of 1 liter (or 1000 milliliters) of water is approximately 981 grams.
      >>> Does 1cc of water weight 1g?
      No, 1 cubic centimeter (cc) of water does not weigh exactly 1 gram. The weight of 1 cc of water depends on its temperature and pressure. At standard temperature and pressure
      (STP), which is 0°C (32°F) and 1 atmosphere (atm), 1 cc of water weighs approximately 1.007 grams.
      >>> How many cc's are in 1L?
      There are 1000 cubic centimeters (cc) in 1 liter (L) of water. So, if you have 1 liter of water, you also have 1000 cc of water.
      >>> What is 1000 x 1.007g?
      If you multiply 1000 by 1.007 grams, you get 1007 grams. Therefore, the weight of 1000 cc of water at standard temperature and pressure (STP) would be approximately 1007 grams.
      >>> Give the weight of 1L of water, in grams.
      The weight of 1 liter (or 1000 milliliters) of water is approximately 981 grams.
      ...Well, it did say "approximately". :)

  • @tenebrae1191 • 3 months ago +1

    👏👏

  • @amroeser • 3 months ago

    Hi and thanks for the video. Will there be a way to parse full documents to get a summary of a particular section, for example? Any ideas?

    • @VincentCodesFinance • 3 months ago

      I haven't had any success with this kind of task within Open WebUI. It should be possible to do it programmatically through the Ollama API, but it's definitely not as user friendly...
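
      As a minimal sketch, a call to the Ollama REST API could look like this (assuming the default port 11434 and a pulled llama3 model; you would paste the section to summarize into the prompt):

      curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Summarize the following section: ...", "stream": false}'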

    • @VincentCodesFinance • 3 months ago

      While I haven't found a way with Open WebUI, in my latest video I show how you can build your own app to do it: ruclips.net/video/Tnu_ykn1HmI/видео.html

  • @AlwaysCensored-xp1be • 3 months ago

    I have been running llava on my Pi5 too. This might be a way to run Ollama on another Pi5. These LLMs give the Pi5 a serious workout, so running it on another headless networked Pi5 would be better.

    • @VincentCodesFinance • 3 months ago

      I can imagine that this is stretching the Pi5 to its capacity. I only have a Pi4 so I never even considered it. What do you use it for on your Pi5, as a server for chat like this or more of an OpenAI API replacement?

  • @mehdihachana_3138 • 3 months ago

    thank you

  • @khalidkifayat • 3 months ago

    Nice one.
    A question: how would you move this to production, or (after a proof of concept) deliver it to a client as a project?
    Thanks
    Thanks

    • @VincentCodesFinance • 3 months ago

      I'm only using it locally so I haven't considered that bit yet. Open WebUI would be the easy part because it comes as a container (their GitHub page has instructions on setting it up with Ollama on a remote server: github.com/open-webui/open-webui). As for Ollama, they do have an official Docker image as well, but I haven't tried it yet: ollama.com/blog/ollama-is-now-available-as-an-official-docker-image
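
      For reference, running Ollama from the official image looks roughly like this (a sketch based on their Docker announcement post; check it for the current flags):

      docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
      docker exec -it ollama ollama run llama3   # start a model inside the container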

    • @jerbear7952 • 3 months ago

      An odd question from someone wanting to sell a product. It seems like you should know more first.

    • @RomaineChutckhan • 3 months ago

      @@jerbear7952 How conceited. AI is a new market with new players joining every day. What made you think others had it all figured out already?

  • @swapnenduchatterjee8978 • 3 months ago

    Hi, I like your video. As an educator, I am newly acquainted with Open WebUI. Kindly provide the minimum hardware and RAM requirements needed to get good performance on my PC. Are a graphics card and a stable internet connection necessary for optimal performance?

    • @VincentCodesFinance • 3 months ago

      Hi, glad you found the video useful! As for requirements, you will need a good internet connection for downloading the models, but after that you should be able to work completely offline. I don't think you need a powerful graphics card to run the smallest models, but you will have to try the different model sizes to see which ones run at a decent speed on your machine.

  • @tvandang3234 • 1 month ago

    I work for a dental business and I want to import all their documents (spreadsheets, PDFs, docs, text files, etc.) into Open WebUI as a knowledge base. Can I do that and have it saved locally, so that when I restart I don't have to import them again?

    • @VincentCodesFinance • 1 month ago

      I haven't tried to build a setup as involved as this. I think you can save the loaded documents, but I have only tried it as a single user, so I'm not sure if sharing is possible, or what types of files are supported besides PDF. Be aware, however, that this is a recent open-source project under active development; I would be careful before loading any sensitive medical data into server software that has not undergone a security audit.

  • @LauraLanford • 2 months ago

    When I installed Docker there was some error associated with WSL, and when I try to run it (10:00) it shows me an error and I can't proceed.

    • @VincentCodesFinance • 2 months ago

      Hmm, I don't have much experience with WSL. Could it be related to this issue? github.com/docker/for-win/issues/13845

  • @romanmed9035 • 2 months ago

    I even downloaded recently updated models, but they contain data from at least a year ago, and the data I need came out at the end of last year. How do I find out the approximate cutoff date of a model's training data and how current it is?

    • @VincentCodesFinance • 2 months ago

      When available, that info should be on the model card (either on ollama.com or in the references), but even then it's not always there. There's no easy answer because they tend to use multiple datasets with different cutoff dates, and there are multiple stages (training, fine-tuning, etc.) that can rely on more recent data. For example, from what I found online, Llama2 has a training cutoff of Sept 2022, but its fine-tuning cutoff is July 2023...

    • @romanmed9035 • 2 months ago

      @@VincentCodesFinance Thanks. I also find models dated late 2023 and even 2024, but the most up-to-date data in them is from early 2023 at best, which is too old for me.

  • @bobbaganush1 • 3 months ago

    It won't let me run llama2 and mistral with ollama. It says each one is a "racial slur used against people of African descent." Is there a workaround for this?

    • @VincentCodesFinance • 3 months ago +1

      Base model safeguards can be quite sensitive. Have a look at uncensored models.

    • @bobbaganush1 • 3 months ago

      @@VincentCodesFinance Any suggestions? It seems every one I try garners similar results.

    • @VincentCodesFinance • 2 months ago

      @@bobbaganush1 dolphin-mixtral is the most popular one currently. It is based on Mixtral so it is somewhat heavy on RAM usage. Have you tried that one?
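
      To give it a try (assuming the default tag on ollama.com):

      ollama run dolphin-mixtral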

  • @swapnenduchatterjee8978 • 3 months ago

    Is there any way to summarize documents using your method?

    • @VincentCodesFinance • 3 months ago

      Not that I have been able to figure out. As far as I know, Open WebUI supports semantic search within your documents, but not operations on whole documents, such as summarization. That is definitely something I will be looking into; if I find a decent solution I'll make a video about it.

    • @VincentCodesFinance • 3 months ago

      While I haven't found a way with Open WebUI, in my latest video I show how you can build your own app to do it: ruclips.net/video/Tnu_ykn1HmI/видео.html

  • @mohammedsaleh-ck8jf • 3 months ago

    🙏🙏

  • @gigatar • 2 months ago

    How can I have a beautiful terminal like yours?

    • @VincentCodesFinance • 2 months ago +1

      I'm using iTerm2 with Oh My Zsh (github.com/ohmyzsh/ohmyzsh) and the agnoster theme (another popular theme is powerlevel10k). You will also need to set your terminal font to a Nerd Font (www.nerdfonts.com/). I think I was using the Fira Code font in that video.
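
      To replicate it, a rough sketch (the install command is from the Oh My Zsh README; check it for the current version):

      sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
      # then set ZSH_THEME="agnoster" in ~/.zshrc and restart your terminal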

    • @gigatar • 2 months ago

      @@VincentCodesFinance thanks a lot 👍👍

  • @Hakamgh • 3 months ago +1

    I want a method for an Android phone

    • @VincentCodesFinance • 3 months ago +1

      I'm sure it's coming eventually, but with the current state of these models, that would drain your battery very quickly.

    • @Hakamgh • 3 months ago

      @@VincentCodesFinance OK thanks

  • @MoonLiteNite • 3 months ago

    @8:22 how do billions of webservers run around the world without docker? You do NOT need docker to run a webserver...

    • @VincentCodesFinance • 3 months ago

      I totally agree, I was simplifying a bit there. But using Docker does make things easier to manage, especially when the package maintainers (Open WebUI in this case) push that as their preferred installation method.

  • @majukanumi9639 • 3 months ago

    But first you need to tell us that to run these models on our own computers we need more than 46 GB of RAM and a GPU... that gets expensive.

    • @VincentCodesFinance • 3 months ago

      You're right, large models like Mixtral are heavy on RAM and perform better on GPUs. However, the requirements for running smaller models like Gemma or Llama2 are much lower, and they can run on a CPU (at lower speed). If you want to run everything locally, your hardware will dictate what you can run.
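
      For example, smaller models that should run on modest hardware (assuming these tags are still listed on ollama.com):

      ollama run gemma:2b
      ollama run llama2:7b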