Llama 3 RAG: How to Create AI App using Ollama?

  • Published: 8 Jun 2024
  • 🚀 Join me as I dive into the world of AI with LLaMA 3! In this video, we'll explore how to create a powerful RAG (retrieval-augmented generation) app using LLaMA 3 to enhance your projects with intelligent data retrieval. 🧠💻
    Advanced Chunking Strategies: • Chunking Strategies in...
    🔍 What You Will Learn:
    Downloading and Setting Up LLaMA 3: Get started by installing the necessary libraries and downloading the LLaMA 3 model.
    Creating the RAG App: Step-by-step process of building the app, from loading data from a URL to saving it in a vector database.
    Designing a User Interface: Implement a UI where users can interact by asking questions to retrieve contextually relevant responses.
    Enhancing Performance with Nomic Embeddings: Upgrade your app by integrating specialised embedding models for improved accuracy.
    🔗 Components:
    Ollama to Download LLaMA 3
    Vector Databases: Chroma DB
    Gradio: An easy way to build custom UIs for your projects
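The load → split → embed → store pipeline described above can be sketched in miniature. The splitting step, for example, boils down to something like this fixed-size chunker with overlap (a simplified stand-in, not the video's actual code; real apps typically use a library splitter such as LangChain's RecursiveCharacterTextSplitter):

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Cut text into fixed-size chunks that overlap slightly, so context
    at a chunk boundary is not lost between neighbouring chunks."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# Stand-in for a page fetched from a URL:
doc = "word " * 300
chunks = split_text(doc)
```

Each chunk is then embedded and written to the vector database, and at query time the most similar chunks are retrieved and stuffed into the LLM prompt.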
    👍 Why Watch This Video?
    Gain hands-on experience in AI application development, from basic setups to advanced data handling techniques, all tailored to empower your software development and data science skills.
    🔗 Resources:
    Sponsor a Video: mer.vin/contact/
    Do a Demo of Your Product: mer.vin/contact/
    Patreon: / mervinpraison
    Ko-fi: ko-fi.com/mervinpraison
    Discord: / discord
    Twitter / X : / mervinpraison
    Code: mer.vin/2024/04/llama-3-rag-u...
    📌 Timestamps:
    0:00 - Introduction to LLaMA 3 and RAG App
    0:35 - Setup and Downloads
    1:10 - Building the RAG App Core Functionality
    3:00 - Embedding Generation and Storage
    4:05 - Creating and Integrating the User Interface
    5:25 - Final Testing and Demonstration
    Make sure to subscribe and hit the bell icon to get notified about our latest uploads! Smash the like button if you find this tutorial helpful and share it to help others in the tech community. 🌟
    #LLaMA3 #RAG #OLLaMA #AIApp #LLaMA3App #LLaMA3AIApp #LLaMA3RAG #RetrievalAugmentedGeneration #RetrievalAugmentedGenerationLangchain #RetrievalAugmentedGenerationLLaMA3 #RetrievalAugmented #LLaMARag #LamaRag #OLLaMARag #LLaMA3OLLaMA #LLaMA3OLLaMARag #OLLaMALLaMA3Rag

Comments • 58

  • @linkit88
    @linkit88 A month ago +10

    This is my first-ever comment, but I feel it is necessary. Excellent presentation without resorting to clickbait or time-wasting segments. Thanks 👍

  • @kirilkirchev285
    @kirilkirchev285 A month ago +1

    Amazing video in only 7 minutes!!! Straight to the point. Great!

  • @Slim
    @Slim A month ago

    I like how you break every line down. Subbed. Looking forward to new videos.

  • @m12652
    @m12652 A month ago +1

    Thanks very much 👍🏴󠁧󠁢󠁷󠁬󠁳󠁿

  • @user-uk9ls
    @user-uk9ls A month ago +2

    Gradio and Streamlit are both great UIs. I will try this.

  • @hrmanager6883
    @hrmanager6883 A month ago

    Excellent content, superb efforts, kudos bro

  • @DerNamenvolle
    @DerNamenvolle 16 days ago

    Absolute king, thanks for the great tutorial and code

  • @mahdihosseini1085
    @mahdihosseini1085 A month ago +3

    Great video again. I'd love to see how I can use Llama 3 with TensorRT as well; I believe it can be awesome!

  • @fab_spaceinvaders
    @fab_spaceinvaders A month ago

    excellent tutorial

  • @nexuslux
    @nexuslux A month ago

    So amazing! Now something like this setup but with custom tools to automate with agents 😂

  • @ilianos
    @ilianos A month ago +1

    You said you wanted to put a link for a video of yours about chunking into the description? I’m especially interested in advanced chunking strategies like semantic and agentic chunking!

  • @slayermm1122
    @slayermm1122 A month ago

    Very nice, thanks

  • @MrDenisMurphy
    @MrDenisMurphy A month ago

    Great Video! Why didn't you use agentic chunking?

  • @ShishirKumar07
    @ShishirKumar07 A month ago

    Great video!
    How do you compare LLMs to evaluate which ones are going to be better for your use case?

  • @zamanganji1262
    @zamanganji1262 A month ago

    Hi, Mervin. Thank you for your excellent presentation and tutorial. Could you please do the procedures in docker compose?

  • @Enkumnu
    @Enkumnu 13 days ago

    Thank you very much, it is working perfectly. A question concerning "Create Ollama embeddings and vector store": how do I save it, and how do I load it later to avoid repeating the embedding process (if I want a store for a specific URL)? It would save time. Thanks for your answer. BR.

  • @MirGlobalAcademy
    @MirGlobalAcademy A month ago +1

    What about uploading a document instead of a webpage for retrieval?

  • @silenthusky1222
    @silenthusky1222 A month ago

    Great video! I am confused by the prompt format in Llama 3 8B Instruct (I followed the Meta documentation), but it always generates errors, like repeating some words and symbols in the generation. Is there an example of prompt engineering? Thanks!

  • @user-fv1xe8xm4h
    @user-fv1xe8xm4h 25 days ago

    Vectorising the document(s) and generating the response based on the prompt takes time for me, whether using Chroma or FAISS, but yours went fast. Any workaround to ensure more efficiency and less runtime?

  • @vijayrameshkumar8522
    @vijayrameshkumar8522 A month ago

    Can you share with us the Llama 3 resource details and which compute instance you were using?

  • @MeinDeutschkurs
    @MeinDeutschkurs A month ago

    Great and exciting! That is very useful as well. The database is always created anew, isn't it? Is there a way to somehow save / collect the embeddings for reuse? Or are the embeddings exclusive to the current prompt?

    • @silenthusky1222
      @silenthusky1222 A month ago +1

      You can. I used Chroma DB to save it and load it for next time.
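For what it's worth, LangChain's Chroma wrapper accepts a `persist_directory` argument for exactly this, but the general idea of caching embeddings so the slow step runs only once can be sketched in plain Python (the cache file name and the embed function are hypothetical):

```python
import json
import pathlib

CACHE = pathlib.Path("embeddings_cache.json")  # hypothetical cache file

def embed_once(chunks, embed_fn):
    """Embed each chunk only on the first run; later runs load the
    cached vectors from disk instead of re-calling the model."""
    if CACHE.exists():
        return json.loads(CACHE.read_text())
    vectors = {chunk: embed_fn(chunk) for chunk in chunks}
    CACHE.write_text(json.dumps(vectors))
    return vectors
```

With Chroma itself the equivalent move is to pass a persist directory when building the store and point at the same directory when reloading it, so re-embedding is skipped for an unchanged URL.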

  • @mohamedkeddache4202
    @mohamedkeddache4202 A month ago

    Please tell me how to stream the response,
    to have better interaction with the user.
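One way, sketched here: Ollama's REST API (and its Python client's `stream=True` option) streams the answer as one JSON object per line, each carrying a small piece of text. A minimal parser for that format looks like this (the sample lines below are fabricated; a live call would read them from the HTTP response, and in Gradio you could `yield` the accumulating text so the UI updates incrementally):

```python
import json

def collect_stream(lines):
    """Assemble the incremental 'response' fields from Ollama's
    streaming output (one JSON object per line) into the full answer."""
    text = []
    for line in lines:
        chunk = json.loads(line)
        text.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(text)

# Simulated stream; a live call would read these lines over HTTP:
fake = ['{"response": "Hel", "done": false}',
        '{"response": "lo!", "done": true}']
```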

  • @journey7455
    @journey7455 A month ago

    Thanks for the video. Can you make a guide on how to build a UI for Llama 3 + RAG with data from a local file? Maybe by using PrivateGPT.

  • @GetzAI
    @GetzAI A month ago

    This needs to be updated for the 70B :)

  • @tomzhang9614
    @tomzhang9614 23 days ago

    Hi, does this work for a fine-tuned local Llama 3 model?

  • @KumR
    @KumR 7 days ago

    Do you have something which includes Streamlit?

  • @maizizhamdo
    @maizizhamdo A month ago

    Great tut, I tried it but it seems it doesn't work with JS/HTML.

  • @Yakibackk
    @Yakibackk A month ago

    Legend, does it support memory?

  • @stanTrX
    @stanTrX A month ago

    How do I do it for a local document instead of a URL?
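Several commenters ask this; in LangChain terms the usual answer is to swap the URL loader for a file loader (e.g. `TextLoader` or `PyPDFLoader`) and leave the rest of the pipeline unchanged. The core idea, stripped to the standard library:

```python
import pathlib

def load_local(path: str) -> str:
    """Read a local file's text; the downstream split -> embed -> store
    steps don't care whether the text came from a URL or from disk."""
    return pathlib.Path(path).read_text(encoding="utf-8")
```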

  • @m12652
    @m12652 A month ago +1

    ChromaDb won't install 😢

  • @haritsinhgohil6071
    @haritsinhgohil6071 A month ago

    Can you show a tutorial on how to deploy these types of programs?

  • @talehbalo9351
    @talehbalo9351 A month ago

    Can I ask a question? I want to know how to interact with a local chat in Excel using the Anthropic style, more precisely like in Claude for Sheets. I hope the question is clear: I want to receive answers in Excel through queries to the local Jan AI, interacting through Anthropic-style formulas in Excel. Maybe there are free solutions for such API integrations with Excel. I would be very grateful for your answer!

  • @JarppaGuru
    @JarppaGuru A month ago

    Same as Llama 1 and 2?

  • @Yakibackk
    @Yakibackk A month ago

    If my Llama 3 is on another server, how can I use it remotely?
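One common approach, sketched with only the standard library: Ollama serves an HTTP API on port 11434, so you point requests at the remote machine's address (the host below is hypothetical, and the remote `ollama serve` must be configured to listen on a non-localhost interface; LangChain's Ollama classes likewise take a `base_url` parameter). The actual network call is left commented out since it needs the server running:

```python
import json
import urllib.request

OLLAMA_HOST = "http://192.168.1.50:11434"  # hypothetical remote server address

def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint on a remote host."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    return urllib.request.Request(
        f"{OLLAMA_HOST}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

# with urllib.request.urlopen(build_generate_request("Hello")) as resp:  # needs the server up
#     print(json.loads(resp.read())["response"])
```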

  • @free_thinker4958
    @free_thinker4958 A month ago

    We would like you to build an AI assistant that learns from past experiences using CrewAI.

  • @envoy9b9
    @envoy9b9 A month ago

    Dumb question, but when you say "touch app.py", what code editor are you using? Because I'm not getting the same results. I'm very, very, very new at this.

    • @chhil
      @chhil A month ago +2

      touch is a Unix/Linux command, and ports are available for other OSs. It's run in a terminal/shell. It simply creates an empty file, or updates an existing file's timestamp (among other functionality, depending on the parameters passed to it).

    • @envoy9b9
      @envoy9b9 A month ago

      @@chhil thank you.

    • @envoy9b9
      @envoy9b9 A month ago

      @@chhil I'm using a Mac; how would I get it to work the same way so I can follow tutorials? Ty for your help
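For reference, `touch` is not tied to any code editor, and macOS ships it natively, so the exact same command works in the macOS Terminal:

```shell
touch app.py   # creates an empty file named app.py (or updates its timestamp if it exists)
ls -l app.py   # confirm the file is there
```

After that, you open `app.py` in whichever editor you like; the video's choice of editor doesn't matter.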

  • @arajeshkanna5551
    @arajeshkanna5551 A month ago

    While running the code, it stopped at this line for a long period of time: "embeddings = OllamaEmbeddings(model="llama3")". Can you please help with this?

    • @MervinPraison
      @MervinPraison  A month ago

      Try nomic-embed-text instead of llama3
      ollama.com/library/nomic-embed-text
      In that line

    • @gouravpatil7412
      @gouravpatil7412 A month ago +2

      vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)
      In this line I am getting this error, please help 👇
      raise ValueError(f"Expected IDs to be a non-empty list, got {ids}")
      ValueError: Expected IDs to be a non-empty list, got []

    • @UmutErhan
      @UmutErhan A month ago

      @@gouravpatil7412 did you solve this? I get the same error even with nomic-embed-text

    • @arajeshkanna5551
      @arajeshkanna5551 A month ago

      Wow! great response guys, thank you all

    • @gouravpatil7412
      @gouravpatil7412 A month ago

      @@UmutErhan Nope. I guess if we run it on Linux, the error will go away.
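A hedged guess at what's happening in this thread: Chroma raises "Expected IDs to be a non-empty list" when it is handed zero documents, which usually means the loader/splitter produced nothing (e.g. the URL fetch returned no extractable text), not that the OS or embedding model is at fault. A small guard makes the failure obvious earlier (the function name is mine, not from the video):

```python
def require_documents(splits: list) -> list:
    """Fail fast with a clear message if the splitter produced no chunks,
    instead of letting Chroma raise a cryptic 'Expected IDs' ValueError."""
    if not splits:
        raise RuntimeError(
            "No document chunks to embed - check that the loader actually "
            "fetched text before calling Chroma.from_documents()."
        )
    return splits
```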

  • @JarppaGuru
    @JarppaGuru A month ago

    5:01 you can still see llama3

  • @user-oj2ge8cb5z
    @user-oj2ge8cb5z A month ago

    Mervin, you are full of energy telling us about Llama 3, and that's great, but could you spare some of your precious time and tell us how to properly package a trained model into an exe file with the ability to upload it remotely) I probably want too much, sorry))) but maybe someone can tell us where to find the treasure chest) Thank you for your time.

    • @Jesusdparra
      @Jesusdparra A month ago +3

      You are talking about something that doesn't exist… you can't just "pack" a model into a .exe and upload it somewhere… it does not work that way

  • @JarppaGuru
    @JarppaGuru A month ago

    2:30 yet again, we already know the answer xD. AI is so cool. It doesn't know anything other than what it is programmed to do, but sure, it knows mom jokes. Stupid things like that should be removed.

  • @rude_people_die_young
    @rude_people_die_young A month ago

    I'm curious why you didn't use Ollama / OllamaWebUI to just upload and embed the documents? Would that save time, is it necessary to have the code to load/split/embed, if you can just drag and drop? What are the advantages to using this code?

    • @MervinPraison
      @MervinPraison  A month ago +2

      Using custom code means it's more extendable and easier to integrate with any application.
      It's more customisable.

    • @UmutErhan
      @UmutErhan A month ago

      What's the point of this question? You could also use Copilot in Bing for AI needs... The point here is to learn the concept and apply it to your own code design.

    • @rude_people_die_young
      @rude_people_die_young A month ago

      @@UmutErhan The point is that I can ask questions, so I'm not sure what point you're making. I'm specifically exploring why you'd upload in code when you can just drag and drop into the UI. Then you can focus on one key step: how to retrieve and provide the chunks in the prompt. Please don't try to edit people's point of view. I started with "I'm curious"... Aren't we all?