Custom LLM Fully Local AI Chat - Made Stupidly Simple with NVIDIA ChatRTX

  • Published: May 7, 2024
  • NVIDIA has recently updated ChatRTX, a free local LLM chat bot for people with NVIDIA graphics cards. The key features of ChatRTX are:
    1 - it's local, so safer for your personal documentation AND query history
    2 - it's free
    3 - it's configurable
    4 - it's easy
    5 - it runs multiple LLM models
    We look at how to use ChatRTX to search your own data, perfect for making your own local AI companion that runs entirely on your own machine.
    This is of course not the only option out there for running LLMs locally, so we briefly mention some of the alternatives available.
    Links
    gamefromscratch.com/nvidia-ch...
    -----------------------------------------------------------------------------------------------------------
    Support : / gamefromscratch
    GameDev News : gamefromscratch.com
    GameDev Tutorials : devga.me
    Discord : / discord
    Twitter : / gamefromscratch
    -----------------------------------------------------------------------------------------------------------
  • Science

Comments • 100

  • @gamefromscratch
    @gamefromscratch  1 month ago

    Links
    gamefromscratch.com/nvidia-chatrtx-easy-local-custom-llm-ai-chat/
    -----------------------------------------------------------------------------------------------------------
    *Support* : www.patreon.com/gamefromscratch
    *GameDev News* : gamefromscratch.com
    *GameDev Tutorials* : devga.me
    *Discord* : discord.com/invite/R7tUVbD
    *Twitter* : twitter.com/gamefromscratch
    -----------------------------------------------------------------------------------------------------------

    • @FujiLivz
      @FujiLivz 1 month ago

      Awesome coverage, hope to see more of the others as well, even if it's only surface-level coverage like this (most of us don't know shit about AI either, but some of these tools feel like they definitely fall into gamefromscratch territory; I don't know exactly why, but they feel like a good fit here!). Part of it for me is, I hear about a different LLM-related thing released every week, so it's REALLY hard to know what each thing "is" until someone shows you a video or demonstrates its usage meaningfully - THAT'S where a project like this is awesome: it's got the Nvidia stamp, a download-and-run style local deployment... there's a reason to use it without first having to do 20 hours of homework on whatever project you are wanting to try. I'll definitely be taking a look at some of the open options listed at the end of the vid, but I'd bet most of us viewers would love to see some of those demonstrated by you as well (if you can find the time lol). Maybe we need a "notanaiguy" channel where we can explore some of this surface-level stuff together, but honestly it felt right at home here! Thanks for the vids!

  • @Theraot
    @Theraot 1 month ago +27

    I have used LM Studio; it will run on lower-end hardware, but expect very bad performance. You can try using models that have been quantized, which will perform better but will be less precise (they can degenerate into random text). And I do not remember it having an easy way to reference files.
    About RAG, be aware that you want a model that has been trained on the general subject. That is: if a model is specialized in poetry, it probably won't do well with code even if you give it all the textbooks. Why? Because it is trying to rhyme; that is the pattern it learned.
    On the other hand, with the convenience of ChatRTX, you should be able to give it your project files - those that you would not dare upload to an online AI - and have it give you results based on that, which would be specific to what you are doing. And let that be another reason to put in comments and choose good variable names: the better the context you can give the AI, the better.
    Finally, do not forget: garbage in, garbage out.
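The precision loss from quantization described in that comment can be illustrated with a toy sketch. This is symmetric linear int8 quantization, a simplified stand-in for what real quantizers do:

```python
# Toy sketch of symmetric linear int8 quantization: weights stored in
# 8 bits use ~4x less memory than float32, but round-off error creeps in.
def quantize_int8(weights):
    # Map the largest-magnitude weight to +/-127 and round the rest.
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    return [q * scale for q in quantized]

weights = [0.1234, -0.9876, 0.5555]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
# Each restored value is close to, but generally not equal to, the
# original; accumulated across billions of weights, this rounding
# error is the quality loss the comment warns about.
```

Production quantizers (e.g. the GGUF formats used by local runners) use more elaborate per-block scaling, but the memory/precision trade-off is the same.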

  • @TheRealAfroRick
    @TheRealAfroRick 1 month ago +4

    RAG does not go to the web. With retrieval-augmented generation, the local data you provide is embedded (converted to numerical form) and stored in a vector database of some sort. Then, when you make a request to the chat bot, your query is embedded using the same method as the data you previously embedded. A semantic search is performed (generally with cosine distance over the vectors) and the relevant data is sent to the LLM in its context window so it can base its response on the content in your data. This is done specifically to prevent hallucinations by the LLM, since it has never seen the data in your documents.
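The retrieval step described there can be sketched in a few lines of Python. The bag-of-letters "embedding" below is an assumption for illustration only; real systems use a trained embedding model:

```python
# Minimal sketch of RAG retrieval: embed, rank by cosine similarity,
# return the top matches for insertion into the LLM's context window.
import math

def embed(text):
    # Toy embedding: counts of each letter a-z (illustrative only).
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=1):
    # Embed the query with the same method as the documents,
    # then rank the documents by similarity to the query.
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine_similarity(q, embed(d)), reverse=True)
    return ranked[:k]

docs = ["the renderer batches draw calls", "the physics step uses a fixed timestep"]
top = retrieve("how are draw calls batched", docs)
# The retrieved passages would then be pasted into the prompt so the
# model answers from your data rather than from memory alone.
```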

    • @Matlockization
      @Matlockization 20 days ago

      'Vector database' sounds like the cloud or 3rd-party data-mining operatives who are only too happy to pay for the privilege. People also have to understand that one of the AIs here is linked to Meta, which is owned by Mark Zuckerberg, who is famous for sharing people's data.

  • @micmacha
    @micmacha 1 month ago +8

    As a man who has countless useful epubs and pdfs, this looks very useful. I especially like that it will give you a list of its sources; not exactly a full citation yet but very usable. However, I'm not terribly keen on it being Windows only and it's asking for a hefty graphics card and a lot of disk space for something I can do by hand. I think this is good news and it shows that Nvidia is, if haphazardly, listening to the real concerns with LLMs.
    Otherwise it's becoming an extremely tired subject.

    • @MurphyArtPrints
      @MurphyArtPrints 1 month ago

      What's your primary source for said PDFs and files? I need to start building a collection with the way things are going.

    • @micmacha
      @micmacha 1 month ago +1

      @@MurphyArtPrints Oh, I've scanned a number of them, and many others are from independent epub sellers like Humble Bundle and a few (legal) torrents. I'm with you on proprietary ebook viewers; it may be more durable and portable than paper but you never know when someone's going to pull the plug.

  • @sergiofigueiredo1987
    @sergiofigueiredo1987 28 days ago

    AnythingLLM is probably one of the best RAG chat-with-your-documents programs. It's open source, the developers are dedicated, and it packs a TON of configuration options.

  • @D3bugMod3
    @D3bugMod3 1 month ago +4

    Yo,
    Will definitely spend some time playing with this. Was using Chat to help me write a story & lore bible. But Chat can only remember so much before you have to start a new conversation.
    Not to mention Chat's constant need to equivocate over nuanced or political ideas. I spent so much time getting it to see holes in its logic. No doubt this system will still have issues, but at least I won't have to keep starting over.
    Thanks as always

  • @tmanook
    @tmanook 1 month ago +1

    Interesting usage for AI. Seems like it could be handy. For me, I really want an easier time localizing my game. I still need to figure out the optimal way to do that.

  • @a.aspden
    @a.aspden 1 month ago +1

    You mentioned copilot. Does this work as well as copilot if you give it your code folder to train on?

  • @shotelco
    @shotelco 9 days ago

    Excellent! Thanks.

  • @jefreestyles
    @jefreestyles 1 month ago

    Thanks for showing this! Seems like one other downside is that you shouldn't have too many editors open or in use while using it. I wonder what would break first when local compute is maxed out.

  • @samwood3691
    @samwood3691 1 month ago

    The quote from HAL made me have to check this video. A local LLM is a cool idea. Regarding NVIDIA, they basically bailed on Linux, which really sucks, but hopefully that will not stop this from being made available on Linux soon.

  • @AscendantStoic
    @AscendantStoic 1 month ago

    LM Studio is great ... I use it quite often ... there is also Ollama, but as far as I know it doesn't have a UI; it's easy to use though.

  • @phizc
    @phizc 1 month ago +5

    Ollama is also an interesting option. It supports Linux, Windows, and Mac. AMD support is in preview on Linux and Windows. It sets up a server that can be accessed via an API or a simple CLI chat interface.
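As a sketch of that API access, the snippet below builds a request for Ollama's `/api/generate` endpoint on its default port 11434 (the endpoint, port, and payload fields follow Ollama's documented convention; the model name is just an example):

```python
# Sketch of talking to a local Ollama server over plain HTTP.
# Assumes the default host/port and the /api/generate endpoint.
import json
import urllib.request

def build_request(prompt, model="llama3", host="http://localhost:11434"):
    # "stream": False asks for one complete JSON reply instead of chunks.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Summarize RAG in one sentence.")
# With a server running: urllib.request.urlopen(req) returns a JSON body
# whose "response" field contains the model's answer.
```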

  • @mascot4950
    @mascot4950 1 month ago +2

    My experience is that these small models fall apart really quickly, especially when it comes to generalized questions. For programming they seem to do a bit better, but the difference is still quite noticeable if you ask small and large models the exact same question.
    The first "oh, hey, this actually feels pretty close to at least ChatGPT 3.5" moment for me was Llama 3 70B, clocking in at 42GB in size. I can only fit about half of that on my GPU, and with the rest running on CPU it's pretty slow. Like 2 tokens per second.

  • @bdeva029
    @bdeva029 1 month ago

    Nice video. This is good content

  • @AnnCatsanndra
    @AnnCatsanndra 1 month ago

    Easy to install and use, easy to train on my own data? Man this thing is gonna be killer for brainstorming and worldbuilding!

  • @strangeboltz
    @strangeboltz 1 month ago

    Awesome video! thank you for sharing

  • @SimeonRadivoev
    @SimeonRadivoev 1 month ago +4

    It's not training on your documentation/dataset, it's document retrieval. It literally just takes pieces of your documents and inserts them into the prompt.

  • @scribblingjoe
    @scribblingjoe 1 month ago +3

    This actually sounds pretty cool.

  • @Stealthy_Sloth
    @Stealthy_Sloth 1 month ago

    Llama 3 with Pinokio works great for this as well.

  • @rob679
    @rob679 1 month ago +6

    I use the Linux version of Ollama through WSL with OpenWebUI as a frontend. It already has RAG functionality, and everything is installed with basically 3 commands. Llama 3 8B works great, and I can hook it into VS Code through the Continue extension and have a personal local Copilot.

    • @13thxenos
      @13thxenos 1 month ago

      Came to comment something similar.
      Now if OpenWebUI gives the functionality to fine-tune models further...

    • @rewindcat7927
      @rewindcat7927 1 month ago

      Is there a good resource for a smooth-brain to get started on this track? Thanks!!! 🙏

    • @TrolleyTrampInc
      @TrolleyTrampInc 1 month ago

      @@rewindcat7927 NetworkChuck has recently done a video explaining everything. Set up Ollama and then simply install the Continue extension in VS Code.

    • @jimmiealencar7636
      @jimmiealencar7636 1 month ago

      Would it run well with a 6GB RTX?

    • @rob679
      @rob679 1 month ago

      @@jimmiealencar7636 Yes, but unless you run natively under Windows, you also need 16GB of system RAM. Llama 3 8B uses about 3.5GB of VRAM on my 3050. And if everything fails, you can always run it on CPU only, but it will be slow.

  • @youMEtubeUK
    @youMEtubeUK 1 month ago +1

    I have this, and while it runs nicely with my 4090, I still use online tools for PDFs and general research. Understandable if you want to keep files private, but with a Chrome extension I can use all the main AI platforms across multiple devices. I also found Gemini Pro 1.5 better for large 700-page PDFs.

  • @etherealregions2676
    @etherealregions2676 1 month ago

    This is very interesting 🤔

  • @quantumgamer6208
    @quantumgamer6208 1 month ago

    Does it work with PyCharm code, like Python and Lua, for game and game engine development?

  • @kurtisharen
    @kurtisharen 1 month ago

    How does it handle cross-referencing? What happens if you ask the math question, then ask how to calculate the same thing in Godot? It would need to know and understand the first question and how it applies to the second question instead of just looking up a direct answer in the documentation you give it.

  • @Matlockization
    @Matlockization 20 days ago

    That RAG or 'sanity checker' means that it's possible your data is being distributed to 3rd parties for analysis.

  • @djumeau
    @djumeau 1 month ago

    Does it read image-based PDFs? Or do you have to convert the PDFs into a readable format?

    • @rob679
      @rob679 1 month ago

      Most likely it doesn't. It doesn't state so anywhere, and some people commented on the NVIDIA page that it doesn't see the files.

  • @refractionpcsx2
    @refractionpcsx2 1 month ago +1

    Can confirm this does *not* install on a 2000 series RTX card. Tried on my 2080Ti and the installer goes nope.

  • @RoughEdgeBarb
    @RoughEdgeBarb 1 month ago +5

    This might be the first use-case of LLMs I'm interested in. Local is necessary to address the huge environmental cost of GenAI, and the ability to parse your own documentation is interesting.

  • @nightrain472
    @nightrain472 1 month ago +1

    I use GPT4All for local LLMs

  • @OriginRow
    @OriginRow 1 month ago +4

    How can I convert the Unreal Engine docs to PDF?
    🤔

    • @UltimatePerfection
      @UltimatePerfection 1 month ago

      Getleft or another website downloader, and then an HTML-to-PDF converter.

    • @OriginRow
      @OriginRow 1 month ago

      @@UltimatePerfection
      Recently they moved the docs to forums LMAO 🥵
      It's not working

  • @JaxonFXPryer
    @JaxonFXPryer 1 month ago

    Dang it... I have so much text in markdown format that is useless for this training data 😭

  • @MrHannatas
    @MrHannatas 1 month ago

    Need this with agents

  • @josemartins-game
    @josemartins-game 1 month ago

    Turn off the internet. What does it respond?

  • @FusionDeveloper
    @FusionDeveloper 1 month ago

    Neat idea, but the unnecessarily high system requirements make it prohibitive for most people.
    I can run Ollama with Llama 3 with lower system requirements and make my own GUI.

  • @0AThijs
    @0AThijs 1 month ago +1

    No API, no custom model loading, just a simple UI, no updater... (I already downloaded it three times to update it, each time ~30GB, yes, 30GB FOR MISTRAL!).

  • @Saviliana
    @Saviliana 1 month ago

    So Kobold but Nvidia?

  • @judasthepious1499
    @judasthepious1499 1 month ago

    AI : hallo user, what are you doing?
    please upgrade your nvidia graphics card.. or you can't continue using our AI service

  • @user-yi2mo9km2s
    @user-yi2mo9km2s 1 month ago

    ChatRTX's installer has lots of bugs, never fixed. My PC has Win11 24H2, 192GB DDR5, and a 4090 installed.

  • @JARFAST
    @JARFAST 1 month ago

    Does it support the Arabic language?

  • @gokudomatic
    @gokudomatic 1 month ago +1

    I have a feeling that this thing needs NVIDIA hardware that has RTX. My GTX 1060 won't run that.

    • @FromagioCristiano
      @FromagioCristiano 1 month ago +2

      At 1:20 there are the system requirements: GeForce RTX 30/40 Series, RTX Ampere (the ones like the RTX A2000 and RTX A4000) and the Ada Generation GPUs (but those are not for us mere peasants)

    • @hipflipped
      @hipflipped 1 month ago +3

      The 1060 is ancient.

    • @gokudomatic
      @gokudomatic 1 month ago +1

      @@hipflipped yes, it is. What's your point?

  • @dariusz.9119
    @dariusz.9119 1 month ago +7

    One thing to add is that ChatRTX requires Windows 11. 70% of the market is on Windows 10, so it's only for a limited number of users.

    • @MonsterJuiced
      @MonsterJuiced 1 month ago +3

      Thanks for that lmao, I'm on win10 because 11 breaks my dev software and kills my performance. Shame this is win 11 only

    • @sean7221
      @sean7221 1 month ago +1

      LMDE 6 is the future, Windows can go to hell

    • @habag1112
      @habag1112 1 month ago +6

      It runs fine on Win10 for me (using an RTX 3070)

    • @varughstan
      @varughstan 1 month ago +3

      I am running this on Windows 10. Working fine.

    • @TheSleepJunkie
      @TheSleepJunkie 18 days ago

      Get real. I haven't seen a single Windows 10 PC on the market. Not even the cheap ones competing with Chromebooks.

  • @kyryllvlasiuk
    @kyryllvlasiuk 1 month ago

    I've got a 2060 with 6 GB :(

  • @PurpleKnightmare
    @PurpleKnightmare 1 month ago

    OMG, this is way cooler than I thought.

  • @vi6ddarkking
    @vi6ddarkking 1 month ago +2

    Well, they've been stupidly simple for a couple of years now with the WebUIs like Oobabooga.
    So this isn't exactly anything impressive.

  • @nangld
    @nangld 1 month ago

    Too slow, given it is a 7B running on a GPU.

  • @JasonBrunner-SM
    @JasonBrunner-SM 1 month ago +1

    Any HELPFUL comments from those who are already experts on this topic about the better LLMs to use with this from the standpoint of game dev? Since Mike admitted this is not his area of expertise.

  • @aa-xn5hc
    @aa-xn5hc 1 month ago

    That is a bad app. For example, it cannot take into account the previous chat when answering a follow-up question.

  • @ionthedev
    @ionthedev 1 month ago

    Why do they hate Linux so much?

  • @ryanisanart
    @ryanisanart 1 month ago

    more ai stuff pls this is awesome

  • @24vencedores11
    @24vencedores11 1 month ago

    Nice! But you're too fast, man!

  • @pm1234
    @pm1234 1 month ago +14

    They're late to the party: no Llama 3, only Windows, only a basic chat interface. Open-source RAG tools are already here.

    • @r6scrubs126
      @r6scrubs126 1 month ago +4

      Did you even watch the first 30 seconds? It's an easier alternative to all the open-source build-it-yourself ones. I think that's great.

    • @pm1234
      @pm1234 1 month ago

      @@r6scrubs126 It would have been great (and still late) if it had all the things I mentioned in my comment. I watched it, THEN commented. The menu shows Llama 2 13B (@2:05), no Llama 3, it's only for window$ (@1:17), and the chat UI is basic (not even sure it does markdown tables). RAGs are getting common now. If you're happy because you don't know open-source tools, no problemo!

    • @gabrielesilinic
      @gabrielesilinic 1 month ago

      Llama 3 is not even open source by definition; Mistral is doing a better job.

    • @claxvii177th6
      @claxvii177th6 1 month ago +1

      Llama 3 isn't open source??

    • @claxvii177th6
      @claxvii177th6 1 month ago

      Seriously, I was using it for an entrepreneurial application

  • @user-ym6gt8zz4v
    @user-ym6gt8zz4v 1 month ago +2

    Train it on Unreal Engine 5

    • @gamefromscratch
      @gamefromscratch  1 month ago +4

      You can if you can get a text or PDF version of the documentation, or enough Unreal Engine PDF books. Really it's a matter of dumping as much documentation into your training model folder as you can source.

    • @jefreestyles
      @jefreestyles 1 month ago +1

      Can one add multiple file/folder locations? Or is it really just one folder that has to be the root? Can it use symbolic links or folder/file shortcuts?

  • @thesteammachine1282
    @thesteammachine1282 1 month ago +6

    Win 11 only? Lol, no..

  • @impheris
    @impheris 1 month ago +2

    I like some things about AI, but this is getting pretty boring now.

    • @ronilevarez901
      @ronilevarez901 1 month ago

      That's like telling a new parent that watching their child breathe must be boring.
      This is a new type of life developing in front of your eyes. This is history. I do find history boring, but seeing it happen every day is on a different level.