Private GPT4All : Chat with PDF with Local & Free LLM using GPT4All, LangChain & HuggingFace

  • Published: 22 Aug 2024

Comments • 37

  • @venelin_valkov
    @venelin_valkov  1 year ago +16

    Full text tutorial: www.mlexpert.io/prompt-engineering/private-gpt4all
    Get the Google Colab notebook: github.com/curiousily/Get-Things-Done-with-Prompt-Engineering-and-LangChain
    Prompt Engineering Guide: www.mlexpert.io/prompt-engineering
    Thank you for watching!
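    For reference, a minimal sketch of the pipeline the video builds (a local GPT4All model plus LangChain retrieval over a PDF). The file paths and the example question are assumptions, and the API shown is the pre-0.1 langchain package that was current at the time:

    # Minimal sketch: chat with a local PDF using GPT4All via LangChain (CPU only).
    from langchain.document_loaders import PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings import HuggingFaceEmbeddings
    from langchain.vectorstores import Chroma
    from langchain.llms import GPT4All
    from langchain.chains import RetrievalQA

    # Load the PDF and split it into chunks that fit the model's context window.
    docs = PyPDFLoader("report.pdf").load()
    chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

    # Embed the chunks locally and persist them in a Chroma vector store ("db").
    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    db = Chroma.from_documents(chunks, embeddings, persist_directory="db")

    # Wrap the locally downloaded GPT4All model file as a LangChain LLM.
    llm = GPT4All(model="./models/ggml-gpt4all-j-v1.3-groovy.bin", verbose=False)

    # Retrieval-augmented QA: the most relevant chunks are stuffed into the prompt.
    qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=db.as_retriever())
    print(qa.run("What dividend was paid in 2022?"))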

  • @RobMaurerMA
    @RobMaurerMA 1 year ago +1

    This is an excellent and comprehensive demonstration. Thank you for being realistic. You described the performance limitations of this experiment well, but the privacy still makes it attractive.

  • @atanasmatev9600
    @atanasmatev9600 1 year ago +1

    GPT4All is good, I tested it too, but it really is slow. According to Gartner, the coming year in technology will focus on Edge AI, which means these models should be optimized to run on embedded PCs... even on phones. But "we shall see," as they say :). Great job on the video. It's super.

  • @a.amaker4038
    @a.amaker4038 1 month ago

    Nice Wu-Tang shirt and great content. Thanks!

  • @DarkOceanShark
    @DarkOceanShark 1 year ago +5

    Can you please make a video on using LangChain to query a Pandas DataFrame with GPT4All?

  • @elekktrikk_home_video
    @elekktrikk_home_video 1 year ago +1

    It immediately asked for permission to read all sources of data (documents, desktop, etc.). There is a workaround, but that was chilling. (macOS GUI)

  • @Romson03
    @Romson03 1 year ago +2

    Woah. This is so cool! Thanks for the video!

  • @emanuelm2399
    @emanuelm2399 1 year ago +1

    Thank you so much! This is really awesome content!

  • @user-nn7sg3og3w
    @user-nn7sg3og3w 1 year ago +2

    Your solution is superb and I really enjoyed it, but I want to ask one thing: the query time in this code is too long. How can we reduce it?

    • @hillhitch
      @hillhitch 11 months ago +1

      You can use other models instead of the huge GPT4All models, e.g.:
      # Small local seq2seq model; pass the loaded model (not the path) to the pipeline
      model_path = "/LaMini-Flan-T5-248M"
      from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
      from langchain.llms import HuggingFacePipeline
      tokenizer = AutoTokenizer.from_pretrained(model_path)
      model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
      # Generation settings go straight to the pipeline, not into model_kwargs
      pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer,
                      max_length=512, do_sample=True, temperature=0.2)
      llm = HuggingFacePipeline(pipeline=pipe)
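      (LaMini-Flan-T5-248M is a roughly 248-million-parameter seq2seq model, so it answers much faster on a CPU than the multi-billion-parameter GPT4All checkpoints, at the cost of noticeably weaker answers; the resulting llm can be dropped into the same LangChain chain in place of the GPT4All wrapper.)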

  • @vandanasingh2249
    @vandanasingh2249 1 year ago +1

    I am trying to do this on a server and I am getting an issue:
    Using embedded DuckDB with persistence: data will be stored in: db
    Illegal instruction (core dumped). Can you help me out with this?

  • @rennan4403
    @rennan4403 1 year ago +1

    Thanks man! Keep it up!

  • @user-ve6zy2dj7i
    @user-ve6zy2dj7i 9 months ago

    I am trying this, but it doesn't give any output, only a connection timeout. Why? Could you help me with this?

  • @davidcurious4055
    @davidcurious4055 5 months ago

    Explained thoroughly; however, it quickly became outdated and the code cannot run properly now.

  • @okcelnaj
    @okcelnaj 1 year ago

    I got the following answer:
    "I do not have access to information regarding specific companies' financials or dividends paid in previous years due to data privacy concerns and restrictions imposed by laws such as GDPR (General Data Protection Regulation). However, you can find dividend amounts for a company's historical period on their website if available."
    LOL, what am I missing?

  • @mohamedzekkiriya4854
    @mohamedzekkiriya4854 1 year ago

    I am getting this error:
    llama_model_load: loading model from './models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
    llama_init_from_file: failed to load model. Please help me get this resolved.

  • @rnronie38
    @rnronie38 4 months ago

    Does it compromise my data?

  • @AwsmAnkit
    @AwsmAnkit 1 year ago

    Did you find any way to speed up the response?

  • @quarinteen1
    @quarinteen1 7 months ago

    It's still not 100% local.

  • @Vladislav-ox4uj
    @Vladislav-ox4uj 9 months ago

    Is it normal for a request to take ±10 minutes? I think that's a very long time. Are there other methods for this task? Can I run it on a GPU?
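    (On the GPU question: GPT4All's ggml backend is CPU-only, as another comment below notes, but the transformers-pipeline alternative posted in the replies elsewhere in this thread can be placed on a GPU. A minimal sketch, assuming a CUDA-enabled PyTorch install; the Hub id is an assumption:)

    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
    from langchain.llms import HuggingFacePipeline
    model_path = "MBZUAI/LaMini-Flan-T5-248M"  # assumed Hub id; a local path works too
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
    device = 0 if torch.cuda.is_available() else -1  # 0 = first GPU, -1 = CPU
    pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer,
                    device=device, max_length=512, do_sample=True, temperature=0.2)
    llm = HuggingFacePipeline(pipeline=pipe)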

  • @lionfederal5096
    @lionfederal5096 11 months ago

    How many files do you think this would be able to successfully query?

  • @aboudezoa
    @aboudezoa 1 year ago

    Can I use the new LLaMA 2 model to chat with PDFs? Or is this one better?

  • @Videodecumple
    @Videodecumple 1 year ago

    I'm a total beginner. Once all the packages are saved locally, can I use this offline, or do I need to be connected to the Internet?

  • @user-lf8gk4wo2y
    @user-lf8gk4wo2y 1 year ago

    Does it work in any language?

  • @georgekokkinakis7288
    @georgekokkinakis7288 1 year ago +1

    Your video was really inspiring for me, many, many thanks. I use a Google Colab environment with 35 GB of RAM. When I ask a question for the second time, it crashes due to RAM limits. Do you know why? I would also like your opinion on the following: I have a CSV file where each row holds a mathematical definition (generally, some text). What would be the best way to return the text that is most relevant to the user's question? Should I use GPT4All with LangChain, or is there a simpler way to do it? One problem I am facing is that I want it to work for the Greek language; so far I can't find an LLM that works with Greek, only GPT from OpenAI, but I want an open-source solution. I have been struggling with this for many years now. :( (See the retrieval sketch after this thread.)

    • @hillhitch
      @hillhitch 11 months ago +1

      You can use other models instead of the huge GPT4All models, e.g.:
      # Small local seq2seq model; pass the loaded model (not the path) to the pipeline
      model_path = "/LaMini-Flan-T5-248M"
      from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
      from langchain.llms import HuggingFacePipeline
      tokenizer = AutoTokenizer.from_pretrained(model_path)
      model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
      # Generation settings go straight to the pipeline, not into model_kwargs
      pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer,
                      max_length=512, do_sample=True, temperature=0.2)
      llm = HuggingFacePipeline(pipeline=pipe)

    • @georgekokkinakis7288
      @georgekokkinakis7288 11 months ago

      Thanks @hillhitch for your response, I will try it. Is there any chance you know of an open-source LLM that supports the Greek language? So far I have only found some fine-tuned GPT-2 models,
      "nikokons/gpt2-greek" and "lighteternal/gpt2-finetuned-greek", and tried to fine-tune them further for my task, but with no luck ☹ since my data is small.

    • @hillhitch
      @hillhitch 11 months ago +1

      @georgekokkinakis7288 Let me check for you.

    • @hillhitch
      @hillhitch 11 months ago

      Have you checked the ones on Hugging Face?

    • @georgekokkinakis7288
      @georgekokkinakis7288 11 months ago

      Yes, the ones I mentioned above are from Hugging Face.
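      On the CSV-of-definitions question above: ranking the rows does not require a Greek-capable LLM; a multilingual sentence-embedding model can do the retrieval on its own. A minimal sketch (the file name and the query are assumptions; paraphrase-multilingual-MiniLM-L12-v2 lists Greek among its supported languages):

      # Return the CSV rows most relevant to a (Greek) question using only embeddings.
      from langchain.document_loaders import CSVLoader
      from langchain.embeddings import HuggingFaceEmbeddings
      from langchain.vectorstores import Chroma
      rows = CSVLoader("definitions.csv").load()  # each CSV row becomes one document
      embeddings = HuggingFaceEmbeddings(
          model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
      db = Chroma.from_documents(rows, embeddings)
      for doc in db.similarity_search("Τι είναι παράγωγος;", k=3):  # "What is a derivative?"
          print(doc.page_content)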

  • @user-zl1pf2sy5s
    @user-zl1pf2sy5s 1 year ago

    How can I improve the response time?

    • @hillhitch
      @hillhitch 11 months ago +2

      You can use other models instead of the huge GPT4All models, e.g.:
      # Small local seq2seq model; pass the loaded model (not the path) to the pipeline
      model_path = "/LaMini-Flan-T5-248M"
      from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
      from langchain.llms import HuggingFacePipeline
      tokenizer = AutoTokenizer.from_pretrained(model_path)
      model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
      # Generation settings go straight to the pipeline, not into model_kwargs
      pipe = pipeline("text2text-generation", model=model, tokenizer=tokenizer,
                      max_length=512, do_sample=True, temperature=0.2)
      llm = HuggingFacePipeline(pipeline=pipe)

  • @shivamkumar-qp1jm
    @shivamkumar-qp1jm 1 year ago

    Does it run on the CPU?

  • @MarceloLimaXP
    @MarceloLimaXP 8 months ago

    GPT4All only works on CPU so far ;)

  • @relaxed.stories
    @relaxed.stories 1 year ago

    It's toooo slow

  • @bhaskartripathi
    @bhaskartripathi 1 year ago

    Excellent video, but these free models are useless.

  • @datasciencetoday7127
    @datasciencetoday7127 1 year ago +3

    Can you please make these videos?
    1. Fine-tuning the 4-bit quantized models, e.g. anon8231489123/vicuna-13b-GPTQ-4bit-128g, step by step
    2. The top 4 open-source embedding models, how to fine-tune them, and how to use them in LangChain
    3. Scaling with LangChain: how to have multiple sessions with an LLM, i.e. how to run a server with the LLM and serve multiple people concurrently, and what the system requirements for such a setup would be. I believe we will need Kubernetes for the scaling.