Using Ollama To Build a FULLY LOCAL "ChatGPT Clone"

  • Published: May 14, 2024
  • In this video, I show you how to use Ollama to build an entirely local, open-source version of ChatGPT from scratch. Plus, you can run many models simultaneously using Ollama, which opens up a world of possibilities.
    Enjoy :)
    Links:
    Code From Video - gist.github.com/mberman84/a12...
    Ollama - ollama.ai/
  • Science

Comments • 385

  • @xdasdaasdasd4787 6 months ago +13

    Ollama series! This was a great starting video❤ thank you for all your hard work

  • @MakilHeru 6 months ago +12

    This is awesome! I'd love to see more. I feel like this can become something pretty robust with enough time.

  • @rakly3473 6 months ago +13

    Every time I need something, you present a tool doing exactly that. Thanks!

  • @curvyshrine 6 months ago +22

    Thank you for this video; after trying many models, and failing, I finally succeeded at running a local GPT! 🤗

    • @virityrealtual3831 6 months ago +1

      How do you like the responses of your local GPT compared to the official one?

  • @DB-Barrelmaker 5 months ago +1

    This was done so perfectly! Every part swollen with meaning.

  • @user-kw3sp7lb5c 4 months ago

    Ollama is incredible! It runs LLMs fast. I also see that your channel covers AutoGen and agent building, which is just what I was looking for. I love your channel and your teaching manner. Thanks Matthew!

  • @snuffinperl8059 2 months ago +1

    You created an incredible video, precise, concise, and I couldn't have asked for more!

  • @free_thinker4958 6 months ago +3

    This is the type of straightforward high quality content ❤

  • @mossonthetree 2 months ago

    This is so cool! And the fact that they give you a REST endpoint running on a port on the machine is great.

  • @mashleyelliott4668 5 months ago

    Thanks! This concise video is exactly what I was looking for to help me take next steps with Ollama!

  • @aldoyh 6 months ago +14

    Thank you so much Matthew, this is so incredible!

  • @agntdrake 6 months ago +3

    Really great video! The easiest way to get history is to take the `context` returned in the response and pass it back as the `context` field in the next request.
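The `context` round trip described in this comment can be sketched roughly as follows. This is a minimal illustration, not the code from the video: it assumes Ollama is serving its generate endpoint at the default local address `http://localhost:11434/api/generate`, that a model named `mistral` has already been pulled, and that a non-streaming reply carries `response` and `context` fields. The helper names (`post_json`, `build_payload`, `ask`) are hypothetical.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def post_json(url, payload):
    """Send a JSON POST request and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))


def build_payload(prompt, context=None, model="mistral"):
    """Build the request body, attaching the previous context if there is one."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if context:
        # Token array returned by the previous call; it carries the history.
        payload["context"] = context
    return payload


def ask(prompt, context=None, send=post_json):
    """Return (answer_text, new_context) for a single turn.

    `send` is injectable so the function can be exercised without a server.
    """
    reply = send(OLLAMA_URL, build_payload(prompt, context))
    return reply["response"], reply.get("context")


# Example use against a running server (not executed here):
#   answer, ctx = ask("Who wrote Dune?")
#   follow_up, ctx = ask("When was it published?", context=ctx)
```

Compared with re-sending the whole transcript as text, this keeps each request small, at the cost of treating the conversation state as an opaque value owned by the server.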

  • @zef3k 6 months ago +4

    Wow, this makes it so extremely accessible. Your video also shows how accessible interacting with these ai's is in general as well. I haven't programmed much since I was younger, but have been wanting to, and this seems like a great jumping off point! Now I just need to wait until the Windows version comes out.

    • @luce985 4 months ago

      MADA SAKA

  • @avi7278 6 months ago +195

    I'm building my own personal AI assistant but every time I start something a week later something better drops. My god, this is impossible. I've got to think better about my abstractions to make some of this stuff more drop-in ready. That might be an interesting video (or series of videos) for you Matthew, if not likely a bit advanced for your audience.

    • @LeonardLay 6 months ago +33

      I'm in the same boat. The tech changes so quickly, my ideas become antiquated as soon as I get something working 😆

    • @matthew_berman 6 months ago +17

      The nice thing is that if you stick with the OpenAI API, that seems to be the standard

    • @LeonardLay 6 months ago +3

      @matthew_berman I have an Azure account and I'm trying to use it to act as a server for the different models rather than hosting them locally. I'm having so much trouble doing that because the models that are included with Azure aren't the ones I want to try out. Do you have any advice?

    • @DihelsonMendonca 6 months ago +3

      You're lucky. I still have to learn Python. But since ChatGPT is developing too fast, when I learn, my knowledge would be obsolete, because just now we can create a personal assistant using GPTs very easily, do you agree ? 🙏👍

    • @free_thinker4958 6 months ago +3

      @DihelsonMendonca me too: once I focus on something, I later find that something else exists with higher quality than the previous one hhhh

  • @dustincoker5233 6 months ago +1

    This is so cool! I'd love to see a deeper dive.

  • @xdasdaasdasd4787 6 months ago

    You are a godsend. Thank you
    I've been using it through WSL for Windows

  • @srikanthg_in 29 days ago

    Wow. That's the best 10 minutes I have spent today. Great learning.

  • @takione5991 6 months ago

    Great video! Simple, clear and concise. Thanks for that. An idea for a continuation (as a complete novice on AI) could be how to start a simple training on the model to keep improving on some topic we would like?

  • @bersace 6 months ago

    You are so passionate. And you are right to do so. Thanks !

  • @donaldparkerii 6 months ago

    Another great video, I was able to achieve the same in LM Studio running multiple models, on Mac, by spawning instances from the CLI and incrementing the port. Then in my autogen app passing different llm_config objects to the specific assistant agent.
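The port-incrementing setup described in this comment could look something like the sketch below. It only builds plain dictionaries and does not import AutoGen. Everything specific is an assumption rather than the commenter's actual code: the base port `1234`, the `/v1` OpenAI-compatible path, the placeholder API key, and the `config_list` / `model` / `base_url` / `api_key` key names (which match recent AutoGen releases as far as I know; older releases used a different key for the URL, so check the version you have installed). The function name `build_llm_configs` is hypothetical.

```python
def build_llm_configs(model_names, base_port=1234, host="localhost"):
    """Map each model name to its own llm_config-style dict.

    Models are assumed to be served one per port, starting at `base_port`
    and counting upward in the order given.
    """
    configs = {}
    for offset, name in enumerate(model_names):
        configs[name] = {
            "config_list": [
                {
                    "model": name,
                    # Assumed OpenAI-compatible endpoint of the local server.
                    "base_url": f"http://{host}:{base_port + offset}/v1",
                    # Local servers typically ignore the key; placeholder only.
                    "api_key": "not-needed",
                }
            ]
        }
    return configs


# Example: two local models, one per port.
#   configs = build_llm_configs(["mistral", "codellama"])
#   configs["codellama"] would then be passed as the llm_config of the coding agent.
```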

  • @AlGordon 6 months ago +4

    Nice video! You definitely picked up a new subscriber here. I’d be interested in seeing how to build out a RAG solution with Ollama, and also how to make it run in parallel for multiple concurrent requests.

  • @nickdnj 6 months ago +2

    Great video. Thank you! I would love to see a deep dive into using Ollama with AutoGen, having each agent use its own model.

  • @greeffer 5 months ago

    Great content bro, you're my new favorite youtuber!

  • @MrBravano 4 months ago

    Love your videos, much respect and appreciation for all the work you do. I do have one humble suggestion: if you could hide your image just enough to see what you have typed, for instance at 8:49, it would be great. I know that most YouTube instructors do this, not sure why, but please take that into consideration. Either way, thank you for all you bring.

  • @wurstelei1356 6 months ago +3

    Thanks for this nice video. I would like to see a video about MemGPT implementing the history function instead of just pasting everything in front of a new prompt.
    A good idea could be: PrivateGPT, loaded with Hugging Face model cards, receives the prompt with the task of picking the best model for it. Then the prompt is passed via Ollama to that model, with MemGPT on top of each model. That actually might be the most powerful local solution right now.

  • @michaelwallace4757 6 months ago

    Integrating Ollama and Canopy would be a great video. Having that local retrieval would have many use cases.

  • @MrAcarlo 5 months ago

    The video on Ollama is really beautiful. Among other things, I would also start doing benchmarks on the various text generation user interfaces. Ollama allows me, for example, to use my laptop with a small GTX 1060 and Dolphin at incredible speed. The same laptop struggles with Oobabooga. However, after some interactions, the model goes into "overload", as if the RAM is no longer enough. In short, this comment is an overly long thank-you for your excellent work, and a hope for more videos about Ollama and local models.

  • @gbengaomoyeni4 6 months ago

    @Matthew_berman: You are very brilliant! I have been watching Ollama videos but none of them taught how to use it with the API or structured it the way you did. Keep it coming bro. Thank you so much. God bless!

  • @fenix20075 5 months ago +5

    About PrivateGPT, I found the accuracy can be improved if the database is changed from DuckDB to Elasticsearch.

  • @PeterPain 6 months ago +1

    Absolutely the best video yet. Ollama looks amazing.
    Now show me what options there are for doing similar things in Android apps :)

  • @WaefreBeorn 6 months ago +8

    This model will allow us to make open source models fast. I love the simultaneous part; please make more tutorials on this once it hits Windows without WSL.

    • @AaronTurnerBlessed 6 months ago

      Agree... This Ollama really looks promising, Matthew!! Lightweight and simple. More plz!!

    • @chrismachabee3128 6 months ago

      I am on WSL now, join me. WSL is the Windows Subsystem for Linux. There is a guide at Microsoft Ignite titled "How to install Linux on Windows with WSL". So, you are on your own now. I have several computers requiring updating. Good luck.

    • @WaefreBeorn 6 months ago

      @chrismachabee3128 you are an AI generated comment. Please follow terms of service on YouTube for automated accounts, creator of this bot.

  • @chorton53 1 month ago

    This was a fantastic video! Cheers for that!

  • @jeanfrancoisponcet9537 6 months ago

    I did comment about it a few weeks ago on one of your videos! Indeed, very useful for AutoGen (but also for LangChain).

  • @prof969chaos 6 months ago +3

    Very interesting, would love to see how well it works with AutoGen or any of the other multi-agent libraries. Looks like you can import any GGUF as well.

  • @taeyangoh7305 6 months ago +16

    yes! it would be really interesting how autogen + Ollama goes !😍

    • @BibopGresta1 6 months ago +2

      I'm interested, too! I wonder if Autogen is obsolete now that OpenAI unleashed the kraken with the GPTs! What do you think?

    • @alextrebek5237 6 months ago

      @BibopGresta1 I think you have yourself a popular follow-up video, given the comments asking about AutoGen 😉

    • @Gatrehs 6 months ago

      @BibopGresta1 Unlikely, GPTs are more of a single custom agent instead of a set of agents working together.

  • @photorealm 1 month ago

    Awesome video. They have a Windows version now (3-30-24), and it installed and ran perfectly.

  • @avosc5316 3 days ago +1

    DUDE! This was an awesome tutorial!

  • @Jose-cd1eg 5 months ago

    Amazing job!!! Everyone wants more!!

  • @elierh442 6 months ago +59

    😮 Please create a video integrating Ollama with autogen!

    • @federicocacace1070 6 months ago +11

      and autogen's function calling with local models too!!

    • @LeonardLay 6 months ago +6

      This was my first thought. Please do this

    • @blackstonesoftware7074 6 months ago +6

      Yes!!! Do this with AutoGen!

    • @skullseason1 6 months ago +3

      Great idea dudes 🔥🔥🔥🔥🔥

    • @matthew_berman 6 months ago +17

      Easy enough! I’ll make a video for it.

  • @pedroverde1674 2 months ago

    Many thanks, it's really useful and really easy because you explain extremely well.

  • @the.flatlander 6 months ago +5

    This is just great and easy as well! Could you show us how to train these models with PDFs and Websites?

  • @LerrodSmalls 6 months ago +5

    This was so Dope! - I have been using Ollama for a while, testing multiple models, and because of my lack of coding expertise, I had no understanding that it could be coded this way. I would like to see if you can use Ollama, memGPT, and Autogen, all working together 100% locally to choose the best model for a problem or question, call the model and get the result, and then permanently remember what is important about the conversation... I Double Dare You. ;)

  • @chenle02 5 months ago

    So mind blowing~! Thanks Dude~!

  • @rogerbruce2896 6 months ago

    Another cool video! I hope that they come up with a Windows version soon :) Definitely want the deeper dive. ty

  • @jkbullitt8986 5 months ago

    Awesome work!!!

  • @GutenTagLP 5 months ago +4

    Great video, just a quick note: you actually do not need to send all the previous messages and responses as the prompt. The API response contains an array of numbers called the context; just send that in the data of the next request.

  • @gru8299 2 months ago

    Thank you very much! 🤝

  • @mbrochh82 6 months ago

    loved this, Matthew! Right to the point, super hands on. This looks like an awesome project!

  • @Artificialintelligenceo 6 months ago

    Great vid!

  • @yngeneer 6 months ago +1

    Super video! If you can make something that goes deeper into memory management, it would be lovely.

  • @tintin_teaches 6 months ago

    Please make more videos on these topics in detail.

  • @ubranch 5 months ago +8

    00:01 Building Open-Source ChatGPT using Ollama
    01:27 Ollama and Mistral enable running multiple models simultaneously with blazing fast speed.
    02:50 Running multiple models simultaneously with Open-Source ChatGPT is mind-blowing.
    04:14 Building Open-Source ChatGPT From Scratch
    05:40 Creating a new Python file called main.py to generate a completion.
    07:00 Adjusting the code to get the desired response and adding a Gradio front end.
    08:35 Built an open-source ChatGPT from scratch using Mistral
    09:56 The conversation history is appended to the prompt in order to generate a response.
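The history-in-the-prompt approach from the last timestamp can be sketched as below. This is an illustration of the general technique and not necessarily the code shown in the video: the `User:` / `Assistant:` labels, the function names, and the `generate` callable (standing in for whatever function sends a prompt to the local model and returns its text) are all assumptions.

```python
def build_prompt(history, user_message):
    """Join all earlier turns and the new message into one prompt string."""
    lines = list(history) + [f"User: {user_message}"]
    return "\n".join(lines)


def chat_turn(history, user_message, generate):
    """Run one turn and return (reply, updated_history).

    `generate` is any callable that takes a prompt string and returns the
    model's text, for example a thin wrapper around a local HTTP call.
    """
    prompt = build_prompt(history, user_message)
    reply = generate(prompt)
    # Store both sides of the exchange so the next prompt includes them.
    updated = list(history) + [f"User: {user_message}", f"Assistant: {reply}"]
    return reply, updated


# Example with a stand-in model that just reports the prompt length:
#   reply, history = chat_turn([], "Hello", lambda p: f"({len(p)} chars seen)")
```

Because the prompt grows with every turn, a long conversation will eventually exceed the model's context window; trimming or summarizing older turns is the usual remedy.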

  • @modolief 6 months ago

    Thanks for talking about fully local engines. Do you have a video with hardware recommendations for this?

  • @michaelbrown8289 6 months ago

    This is so over my head! But I'm following! Very cool!

  • @chrisBruner 6 months ago +1

    Wow! Jaw dropping video!

  • @EffortlessEthan 6 months ago

    I hope this works as well when they release it for Windows! Switching between models so fast like that is crazy!

  • @piyushlamsoge6007 6 months ago +1

    Hi Matthew,
    You are doing amazing work teaching everyone about the real power of AI with the support of LLMs.
    I have a question: what should we do if we want to build something that works with any kind of document, the way the model in this video works? Is that possible, and if we are able to build it, is there a way to deploy it in production as a website or application?
    If there is a way, please make a video on it.
    I'm looking forward to it.
    Thank you!!!!!

  • @NOTNOTJON 6 months ago

    And boom goes the dynamite.
    I'll bet integrating this with AutoGen isn't hard. Heck, you could just ask AutoGen to rewrite its own interaction settings to use the various models.
    The interesting bit here would be asking autogen or the main dispatch model to find the best answerable model based on the context of the prompt.
    As always, great vid!

  • @slavrgo 5 months ago +2

    Please make a guide on setting it up on the virtual machine, and creating API so we can use it in our apps (even with Make for example)

  • @finnews_ 1 month ago

    I am not a coder, but somehow I managed to build this. Million Thanks!!
    It's a bit slow, but good enough to showcase to friends.
    By any chance, can we host this live? If yes, then how? Kindly make a video on that!!!
    Million Thanks again😀🙏

  • @ujjwalchetan4907 6 months ago

    This video is awesome❤

  • @kumargupta7149 8 days ago

    Thanks, I found it. Great help.

  • @renierdelacruz4652 6 months ago

    Oh my god, what an amazing video.

  • @tanmayjuneja6128 6 months ago +1

    Hey Matthew!
    Great video. Please help me with this: would hosting fine-tuned open source models on SageMaker cost less than the GPT-4 API? Is there a comparison anywhere on any forum, Reddit, etc.? I want to fine-tune a model on my data, and I am thinking of going with GPT-3.5-turbo fine-tuning, but it's really expensive at scale. I want to know how fine-tuned open source models compare to these prices (assuming we get good efficiency at our desired task after fine-tuning).
    Would really appreciate any thoughts on this. Thanks a lot!

  • @scitechtalktv9742 6 months ago +31

    Building an AutoGen application using Ollama would be wonderful! Example: one of the agents is a coder, implemented by an LLM specialized in coding, etc.

    • @SushilSingh2005 6 months ago +4

      I was about to write this myself.

    • @27dhan 6 months ago +1

      haha me too!

    • @EduardsRuzga 5 months ago +1

      I started writing the same comment, and then saw yours :D

    • @MungeParty 5 months ago +3

      I'm an autogen application using ollama, I was going to write this comment too.

    • @EduardsRuzga 5 months ago

      @MungeParty Oh, nice to meet you! Why is an AutoGen Ollama app interested in this? :D

  • @jayfraxtea 6 months ago +1

    Boy, Matthew is so inspiring. Thank you for ruining my weekend plan. I'd be interested in the same matter as @padonker: how can we train with our own data?

  • @vadud3 6 months ago +3

    This is amazing. I live in the terminal and I do Python. Perfect!

  • @_Apep_ 3 months ago

    Congratulations, great video. I wonder if I could install a model similar to Claude 2 (obviously, if there's a similar one that I could install on Ollama) and train it with documents (DOC or PDF in Spanish) to create a webchat for questions and answers.

  • @user-hd7wd4nu1o 5 months ago +1

    Thanks!

  • @Piotr_Sikora 6 months ago +1

    It would be awesome to have a tutorial on how to create a fine-tuned model from, e.g., Mistral as GGUF running with Ollama :)

  • @renierdelacruz4652 6 months ago

    Like so many other subscribers, I think you could create a video integrating Ollama and AutoGen where the conversation is stored in a database, and another video creating an AI personal assistant.

  • @adnenmessaoudi9550 6 months ago

    Really awesome Matthew !!
    I have a request: Can you make a video for a free LLM that can interact with Big Data like AWS Redshift please?

  • @carrolte1 6 months ago +2

    i think the only thing it needs now is to be able to monitor a project folder so you can reference a set of documents. then I could ask it to help me with my specific project and not waste time and tokens feeding it code.

  • @alamjim6117 2 months ago

    Great! Thank you very much.

  • @dr.mikeybee 4 months ago

    Nice. Now I understand why chatbots only allow a few prompts before they start over. They fill up their context window. BTW, it would be great to add RAG with document and Google search. There's also a way to access Ollama from Siri. That would be ideal.

  • @hy3na-xyz 6 months ago

    Can't wait for the AutoGen expert video!!!

  • @marianosebastianb 6 months ago

    Hi! Thanks for the content, I always follow your videos. Can you show how to deploy Ollama on RunPod to have this multi-model setup running in the cloud?

  • @BetterThanTV888 6 months ago

    Thanks for making it approachable. How would this work with Docker? And a portable NVMe drive?

  • @ryutenchi 6 months ago +1

    Can you take a deep dive into using Modelfiles to make your own model for specialty tasks? Where can we find out things like token limits?

  • @Bill_v1 1 month ago

    The Orca2 language model got the killers question right. When you first ask the question, you may disagree with its answer, but it justifies itself and does correctly answer the question as asked.

  • @Pietro-Caroleo-29 6 months ago

    So excited last night that I forgot my manners. If it's possible, Mr Berman, I would really like to see models talking to each other via their dialogue windows, say by adding a conversation starter window to set the topic and seeing the path of their conversational logic. Please. (Teams of separate models processing a given task)

  • @mordordew5706 6 months ago +1

    Regarding the memory issue, can you integrate this with MemGPT? Could you please make a video for that?

  • @abdulazizalmass 6 months ago +1

    Thank you for the info. Kindly let us know: what are the specs of your PC? I get very slow responses on my MacBook Air with 8GB of memory and an M1 CPU.

  • @HyperUpscale 6 months ago

    Finally, a good video🥳

  • @Pietro-Caroleo-29 6 months ago

    Great show "Yes dive deeper" Link them working together bi-directional communication. How far can it go.

  • @Techonsapevole 6 months ago

    Wow, fantastic. Open source models and the ecosystem are more powerful every day.

  • @orkutmuratyilmaz 6 months ago +2

    Ollama FTW! ✌

  • @Equilibrier 5 months ago

    Hi, what are the minimal specs for some of the most popular models? Is there any model which can run on 4GB RAM and a slower 2-core CPU, like an i3?

  • @WesTheWizard 6 months ago +1

    Are the models that you can pull quantized or should we still get our models from TheBloke?

  • @Pithukuly 4 months ago

    Awesome, going to check text-to-SQL for my project; hope it gives a proper SQL query for a given schema. What do you think?

  • @jawadmansoor6064 6 months ago +1

    Is there an API endpoint that I can use as a replacement for OpenAI's API?

  • @crobinso2010 1 month ago +1

    Hi Matt, as someone who watches every video, I'm feeling overwhelmed and am wondering if you could do a "take a step back" episode every once in a while -- where you go over previous content from a broader perspective. For example, what is the difference between LM Studio, Ollama, Jan, AnythingLLM etc and where should someone start? Or go over the "gotchas" and frustrations in the comment sections to highlight those little errors and solutions commentators found but may have been missed by the casual viewer. It would be a review of old content, but with updated fixes, comparisons, and general perspective/advice. Thanks!

  • @avg_ape 5 months ago

    Hi Matthew. Can you make a video about the various LLM models and the specialization of each?

  • @ikjb8561 6 months ago

    Ollama is cool if you are looking to build a personal assistant on your own PC. If you try to hit a model with multiple requests, be prepared to wait in line.

  • @BrianHockenmaier 6 months ago

    Please make a video on the new vision support of open interpreter!

  • @gurudaki 6 days ago

    Hi! Excellent work! I tried to replicate it, but when visiting the URL the input prompt tab at the top right is missing...

  • @dweebification 6 months ago

    Great video! I'm trying to figure out where the model is downloaded. What do I search for? I'm on a mac and since my terminal is at the root (~) level I assume it should be there. But if I do ls -a in that folder, I don't see it. I'm using Mistral, and a search in the finder for Mistral doesn't show it either. Anyone know how to find it?

  • @renierdelacruz4652 6 months ago

    For Linux users: I had an issue running the script directly from VS Code, so I ran it in a terminal and it's working now. The command is "python main.py".

  • @user-jz8ts5ku2n 1 month ago

    how do I modify this code, especially the data that I'm training the model with, for building my own custom faq chatbot?
    Please let me know how.

  • @YuryGurevich 4 months ago

    Please continue development. Maybe include a local Redis cache on Docker and use it for conversation memory?