How to Build a Custom Knowledge ChatGPT Clone in 5 Minutes

  • Published: Sep 9, 2024

Comments • 302

  • @LiamOttley
    @LiamOttley  1 year ago +4

    Leave your questions below! 😎
    📚 My Free Skool Community: bit.ly/3uRIRB3
    🤝 Work With Me: www.morningside.ai/
    📈 My AI Agency Accelerator: bit.ly/3wxLubP

  • @rushikshah2824
    @rushikshah2824 1 year ago +62

    God bless the algorithm for showing this channel to me!

  • @adboost_AI
    @adboost_AI 1 year ago +38

    The AI beast dropping knowledge bombs again! Awesome video Liam, punchy, engaging and dripping with actionable content 👏🏼 Cutting edge stuff.

  • @birkopheim-3263
    @birkopheim-3263 1 year ago +3

    For anyone having trouble with the code: llama_index renamed "GPTSimpleVectorIndex" to "GPTVectorStoreIndex". Just replace the name and it should work, if that is the error you are getting.
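
A rename like the one described in this comment can be absorbed in one place instead of editing every call site. This is a minimal sketch, not part of llama_index: `resolve_first` is a hypothetical helper, and it is demonstrated against the standard library so it runs even when llama_index is not installed.

```python
import importlib


def resolve_first(module_name, candidates):
    """Return the first attribute in `candidates` that exists on the module.

    Hypothetical helper (not a llama_index API): it lets the same notebook
    run whether the installed release exposes the old or the new class name.
    """
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"none of {candidates} found in {module_name!r}")


# Demonstration against the standard library so the sketch runs anywhere:
# "OrderedDictionary" does not exist in collections, "OrderedDict" does.
ordered_cls = resolve_first("collections", ["OrderedDictionary", "OrderedDict"])
print(ordered_cls.__name__)
```

With llama_index installed, the equivalent call would be `resolve_first("llama_index", ["GPTVectorStoreIndex", "GPTSimpleVectorIndex"])`. Only these two names come from this thread; later releases may have renamed the class again, so treat the candidate list as an assumption.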

  • @Mich6961
    @Mich6961 5 months ago

    omg thank you! You've helped me not have to categorise my grocery shopping list into fruit, meat etc. manually.

  • @ahmaddada-dq7zn
    @ahmaddada-dq7zn 1 year ago +5

    Nice one Liam, always enjoying your content.

    • @LiamOttley
      @LiamOttley  1 year ago

      Much appreciated mate, glad I could help!

  • @AdamPaulTalks
    @AdamPaulTalks 1 year ago +2

    Yes an update to this with GPT 3.5 turbo (current model) would be incredible.

  • @sohamagarwal00
    @sohamagarwal00 1 year ago +3

    Really cool stuff! Much more efficient than traditional methods of custom training the model or making custom responses. Thanks a lot!

  • @samwilliams3929
    @samwilliams3929 1 year ago +3

    Cracking video ! Short, sharp and focused ! Information is spot on and really helpful. Thanks! Looking forward to seeing more good content.

  • @ashishrathore7783
    @ashishrathore7783 1 year ago +3

    The code isn't working anymore, got stuck with an error on GPTVectorStoreIndex. The libraries you have used have been modified.

  • @martinmadlmayr9947
    @martinmadlmayr9947 1 year ago +3

    Sorry, I am a novice in the area of AI, but I have a typical management question:
    What would it take to be able to use my own knowledge bot WITHOUT feeding the critical information to Meta, Google or OpenAI? Or: how can I ensure that my data is safe?
    By the way: great content, much appreciated.

    • @workinprogress2077
      @workinprogress2077 1 year ago +2

      I am looking for an answer to this too. I think the answer is that you need to run the chatbot locally or on a server you control. There are many different ways to do this, including using Portainer + Docker (this is what I was told by an experienced coder).

    • @bl8596
      @bl8596 1 year ago +1

      Yeah I would like a video specifically on this

    • @kavian4249
      @kavian4249 11 months ago

      Did you find any solutions?

    • @kavian4249
      @kavian4249 11 months ago

      @@workinprogress2077 Did you find any solutions?

    • @kavian4249
      @kavian4249 11 months ago

      @@bl8596 Did you find any solutions?

  • @ExploitInsight
    @ExploitInsight 1 year ago +1

    You deserve tons of subscribers.

  • @StudioTatsu
    @StudioTatsu 1 year ago

    Thank you, I've been looking for something like this for weeks

  • @Warclimb64
    @Warclimb64 1 year ago +1

    Thanks for making this!

  • @codershorts
    @codershorts 1 year ago +4

    Liam never disappoints :)

  • @spzen98
    @spzen98 1 year ago +1

    Hi, I have a problem with the OpenAI API rate limit when using a large data set. It happens when loading the GPTSimpleVectorIndex. For small data sets it's okay. Can you advise?
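
One common way to cope with rate limits during a large indexing job is to retry the failing call with exponential backoff. The sketch below is a generic pattern, not a llama_index or OpenAI API: `call_with_backoff`, `FakeRateLimitError`, and `flaky_index_build` are hypothetical names, and the flaky function stands in for the real index-building call.

```python
import random
import time


def call_with_backoff(func, max_retries=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and retry with growing delays."""
    for attempt in range(max_retries):
        try:
            return func()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the original error
            # Delay doubles each attempt; jitter spreads out simultaneous retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)


class FakeRateLimitError(Exception):
    """Stand-in for the provider's rate-limit exception (assumption)."""


attempts = {"count": 0}


def flaky_index_build():
    """Pretend index build that is rate-limited on its first two calls."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeRateLimitError("rate limit reached")
    return "index built"


delays = []  # record delays instead of really sleeping, to keep the demo fast
result = call_with_backoff(flaky_index_build,
                           retry_on=(FakeRateLimitError,),
                           sleep=delays.append)
print(result, "after", attempts["count"], "attempts")
```

In real use you would leave `sleep` at its default and pass the actual rate-limit exception class of your client library in `retry_on`. Splitting the documents into smaller batches is another option worth testing.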

  • @TheHeartShow
    @TheHeartShow 1 year ago +3

    Great vid as always!

  • @webdancer
    @webdancer 1 year ago +3

    Liam, thanks for sharing this information. This is quality stuff.

    • @LiamOttley
      @LiamOttley  1 year ago

      Glad you enjoyed it 🙏🏼

    • @EnlistedBootCamp
      @EnlistedBootCamp 1 year ago

      @@LiamOttley wish i knew 5% of how to get any idea started, ugh, thanks you are golden

  • @udaynj
    @udaynj 1 year ago +1

    Put speed for the video at 0.75 - Liam speaks really fast!

  • @nadinejammet7683
    @nadinejammet7683 1 year ago

    Thank you, i can already see how to use it in education.

  • @mamdouhalmheid9685
    @mamdouhalmheid9685 1 year ago +2

    I really like your content, thank you!

  • @bilalmsd07
    @bilalmsd07 1 year ago +3

    Great video as always. Keep up the good work. Wish you very best of luck for your channel. I hope it will rock in the near future.

  • @paulpaturle6957
    @paulpaturle6957 1 year ago +4

    Super interesting ! Thank you, Liam ! I was wondering how can we see the model used, and if we can control the temperature ?

  • @yashsrivastava677
    @yashsrivastava677 1 year ago

    The code doesn't work anymore because "from llama_index import GPTSimpleVectorIndex" throws an error now.

  • @winstonmisha
    @winstonmisha 1 year ago +2

    At:
    # Setup your LLM
    from llama_index import LLMPredictor, GPTSimpleVectorIndex, PromptHelper
    # define LLM
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.1, model_name="text-davinci-002"))
    # define prompt helper
    # set maximum input size
    max_input_size = 4096
    # set number of output tokens
    num_output = 256
    # set maximum chunk overlap
    max_chunk_overlap = 20
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
    custom_LLM_index = GPTSimpleVectorIndex(
    documents, llm_predictor=llm_predictor, prompt_helper=prompt_helper
    )
    I'm getting:
    NameError Traceback (most recent call last)
    Cell In[16], line 7
    3 from llama_index import LLMPredictor, GPTSimpleVectorIndex, PromptHelper
    6 # define LLM
    ----> 7 llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.1, model_name="text-davinci-002"))
    9 # define prompt helper
    10 # set maximum input size
    11 max_input_size = 4096
    NameError: name 'OpenAI' is not defined

    • @dannywlhung
      @dannywlhung 1 year ago

      I am having the same error message. 😢
      I have been trying to find solution to this without success...😭

    • @dannywlhung
      @dannywlhung 1 year ago

      @Mr_LiamOttley........... yes sir?

  • @ramp2011
    @ramp2011 1 year ago +1

    thank you for the video. What is the difference between using LlamaIndex vs Langchain? thank you

  • @yazanrisheh5127
    @yazanrisheh5127 1 year ago

    I keep getting this error despite creating a brand new OpenAI account. I'm trying to create a chatbot that can read from a pdf file. How do I fix this error: "You exceeded your current quota, please check your plan and billing details"
    There's no way I can exceed if I literally just created my account...

  • @jornreuvers1598
    @jornreuvers1598 1 year ago

    Awesome video! Going to mess around with all this as my first steps into AI programming... well sorta first steps anyway!

  • @AlbanBytyqi
    @AlbanBytyqi 1 year ago

    Thank you. It is a bit above my head.

  • @TheRealPlayer00
    @TheRealPlayer00 1 year ago +3

    I can't use the custom code for some reason. it says OpenAI is not defined here: llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.1, model_name="text-davinci-002")).
    I tried from openai import OpenAI but it is not working.
    Any suggestions anyone?

    • @JebliMohamed
      @JebliMohamed 1 year ago +3

      You need to add : from langchain import OpenAI

    • @TheRealPlayer00
      @TheRealPlayer00 1 year ago

      @@JebliMohamed good man

    • @LiamOttley
      @LiamOttley  1 year ago

      Thanks Jebli

    • @millerco2000
      @millerco2000 1 year ago +1

      I am still getting this error.
      NameError Traceback (most recent call last)
      Cell In[37], line 6
      2 from llama_index import LLMPredictor, GPTSimpleVectorIndex, PromptHelper
      5 # define LLM
      ----> 6 llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.1, model_name="text-davinci-002"))
      7 from langchain import OpenAI
      8 from openai import OpenAI
      NameError: name 'OpenAI' is not defined

    • @TheRealPlayer00
      @TheRealPlayer00 1 year ago

      @@millerco2000 from langchain import OpenAI this fixed it for me
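
The second traceback in this thread shows the import sitting on the line after the call that needs it, which would explain why the error persists even with the import added. The sketch below reproduces that ordering problem with a standard-library class standing in for `OpenAI`, so it runs without langchain installed; `run_cell` is a hypothetical helper for the demonstration only.

```python
def run_cell(source):
    """Execute a notebook-style cell in a fresh namespace and report the outcome."""
    namespace = {}
    try:
        exec(source, namespace)
    except NameError as err:
        return f"NameError: {err}"
    return "ok"


# OrderedDict stands in for OpenAI: the name is used before it is imported.
use_before_import = (
    "value = OrderedDict()\n"
    "from collections import OrderedDict\n"
)

# Same two statements with the import moved to the top of the cell.
import_before_use = (
    "from collections import OrderedDict\n"
    "value = OrderedDict()\n"
)

print(run_cell(use_before_import))
print(run_cell(import_before_use))
```

Applied to the code in this thread, that means placing `from langchain import OpenAI` above the `LLMPredictor(...)` line, and not importing a second, different `OpenAI` name afterwards.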

  • @greg_thomson
    @greg_thomson 1 year ago

    Amazing tutorial! subscribed

  • @tomtomatron8625
    @tomtomatron8625 1 year ago

    Great pacing and demo, thank you for the tutorial.

    • @muradbaghirli
      @muradbaghirli 1 year ago

      Hi, do I have to pay for open_ai key?

  • @romancandlefight1144
    @romancandlefight1144 1 year ago

    Great video
    Respect for sharing your files 🙏

  • @paul-thys
    @paul-thys 1 year ago +1

    It seems anyone will be able to do this soon. The value will be in the data. Can you use AI to gather the data to train it on?

  • @lonniesims868
    @lonniesims868 1 year ago +3

    very informative video! Could we get a video on langchain soon? 👀

    • @LiamOttley
      @LiamOttley  1 year ago +1

      Quite a big beast to tackle, hard to not make it too technical for most of my viewers :/

    • @lonniesims868
      @lonniesims868 1 year ago

      @@LiamOttley understandable, you have been one of the best teachers when it comes to AI and how to leverage it! If you're able to do a video on it in the future it would definitely help a lot. Until then I’ll be waiting for your next video!

  • @avikshitbanerjee1
    @avikshitbanerjee1 1 year ago

    Getting the error "chunk_overlap_ratio must be a float between 0. and 1.", any solution for this guys?
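
The error message suggests that the installed release expects a fraction where the code posted elsewhere in this thread passes an absolute token count (`max_chunk_overlap = 20`). A minimal sketch of the conversion follows; `overlap_ratio` is a hypothetical helper, and measuring the ratio against the 4096-token input size is an assumption, since the thread does not say what the library measures it against.

```python
def overlap_ratio(overlap_tokens, reference_tokens):
    """Convert an absolute token overlap into a fraction between 0 and 1.

    `reference_tokens` is whatever size the ratio is measured against;
    using the 4096-token input size below is an assumption.
    """
    if reference_tokens <= 0:
        raise ValueError("reference_tokens must be positive")
    ratio = overlap_tokens / reference_tokens
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("overlap must not be negative or exceed the reference size")
    return ratio


# Values taken from the settings posted in this thread.
print(overlap_ratio(20, 4096))
```

Any small fraction such as 0.1 would also pass the range check in the error message; which value gives the best answers is something to test on your own data.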

  • @Umuragewanjye
    @Umuragewanjye 1 year ago

    Thanks for sharing the skills.

  • @roberthuff3122
    @roberthuff3122 1 year ago

    Fantastic! Thank you.

  • @ColtonCampbell
    @ColtonCampbell 1 year ago +2

    FYI, "GPTSimpleVectorIndex" changed to "GPTVectorStoreIndex"

  • @z1mt0n1x2
    @z1mt0n1x2 1 year ago

    oh..... now imagine throwing in all the D&D PDF's into one simple bot :D

  • @oryxchannel
    @oryxchannel 1 year ago

    I edited this because I want to emphasize the importance of search (and how poor YouTube search is). I had a vein in the center of my forehead trying to do this on my own about three weeks ago... and I'm only seeing Liam's video now. Get your search alerts and notifications down during this revolution, and you just may be lucky enough to find the Liams of this world giving you a play-by-play breakdown of exactly what you want in your AI build.

    • @LiamOttley
      @LiamOttley  1 year ago

      Very kind words mate glad I could help ❤️🙏🏼

  • @michielsmissaert
    @michielsmissaert 1 year ago

    Wow, impressive video, thank you so much!

  • @smann43231816
    @smann43231816 1 year ago

    Thankyou, great video

  • @patrick.cheung
    @patrick.cheung 1 year ago

    Great Video. Thanks for sharing. 🎉

  • @ishaanme91
    @ishaanme91 1 year ago

    Extremely cool! Looking forward to more awesome content.

  • @chevvvv
    @chevvvv 1 year ago +3

    I would like to see a Javascript version of this

    • @LiamOttley
      @LiamOttley  1 year ago +1

      Not sure if there are javascript equivalents for libraries like Llamaindex

  • @SimonStJohn
    @SimonStJohn 1 year ago +1

    Hey Liam! Awesome thanks...can you do a follow-up on an addon to index a website like a blog? And have the output to include a link to the article used for the answer so users can click through to read more?

  • @adil.acoustic
    @adil.acoustic 1 year ago +2

    Amazing Liam bro..

  • @jsveilleux1655
    @jsveilleux1655 1 year ago

    Looks like it doesn't work. After some research, it looks like they renamed GPTSimpleVectorIndex to GPTVectorStoreIndex. What version of llama_index are you using?

  • @Limesh
    @Limesh 1 year ago

    Superb Video, I loved it. ❤❤❤

  • @mohdjibly6184
    @mohdjibly6184 1 year ago +1

    Awesome video...thanks bro

  • @livb4139
    @livb4139 1 year ago

    Isn't this limited by tokens from the OpenAI API? Like, if I want to load 500 documents it wouldn't work, right?
    If so, is there any way to run these models locally?

  • @Noboy504gaming
    @Noboy504gaming 5 months ago

    Great info!

  • @victorquinones9111
    @victorquinones9111 1 year ago +1

    Question: is the information that we index shared with an external server, losing its confidentiality, or does it just remain on the user's computer?

  • @michaelabdoofficial
    @michaelabdoofficial 1 year ago +1

    Fucking brilliant man. Keep up the mad hustle.

  • @chrisrios802
    @chrisrios802 7 months ago

    bro, do you know if any of them will let me create a chatgpt clone that users can input stories but they contain profanity, and the chatgpt able to reply with profanity, im using it for story scripts

  • @olamilekanajao6377
    @olamilekanajao6377 1 year ago

    Well done Liam

  • @lucasalvarezlacasa2098
    @lucasalvarezlacasa2098 1 year ago +1

    Great video! I have some questions:
    1) When we create an index, I understand that, based on the question, we somehow know which part of the files inside the index should be used to reply to it, and this is given to GPT as context information in the prompt. Is that the case?
    2) Is there a limit on how big this index can be?
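
The idea in question 1 (pick the most relevant pieces of the documents, then hand them to the model as context) can be illustrated with a toy retriever. This is not how llama_index is implemented: index libraries typically compare embedding vectors produced by a model, and plain word-count cosine similarity is swapped in here so the sketch runs offline. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question, chunks, k=1):
    """Return the k chunks whose wording is most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(chunks,
                    key=lambda c: cosine(q, Counter(tokenize(c))),
                    reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=1):
    """Place only the best-matching chunks in front of the question."""
    context = "\n".join(top_chunks(question, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


chunks = [
    "Apples and bananas are fruit.",
    "Chicken and beef are meat.",
    "Milk and cheese are dairy products.",
]
print(build_prompt("Is beef a meat?", chunks))
```

Only the selected chunk reaches the prompt, which is why the whole document set does not have to fit into a single request.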

  • @OnChainEpic
    @OnChainEpic 1 year ago +2

    Hey Liam, sorry, not sure what Jupiter is. Is that what you're running the code in? Assuming we can run this locally? Also, how would you integrate this into something you built: by referencing this new model, or what, within OpenAI? Would like some more details on the code you're using and the integration.

    • @LiamOttley
      @LiamOttley  1 year ago

      This is all Python code running in Jupyter notebook. Super easy install with the anaconda launcher. Deploying apps is a bit trickier so you’d probably want to play around with things on your own as I am then hire a developer once you’re happy with it to create a product out of it

  • @bombibombi7258
    @bombibombi7258 1 year ago

    Hello, how can I increase the response message length (maybe the token limit, I guess)? After I train on my data and ask a question, the answer is not complete, and the token count shown is never more than 2500, 600, etc.

  • @TheRealPlayer00
    @TheRealPlayer00 1 year ago +2

    Good guy Liam!

  • @mohamednihal8215
    @mohamednihal8215 1 year ago +1

    Wow! Can we use any open source llm model instead of using openai api key?

  • @kiranshiveshwar3108
    @kiranshiveshwar3108 1 year ago +1

    If I have a PDF of 300 pages, will it still work? I saw another video using LangChain and Pinecone (to store vector data for 300 pages).

    • @LiamOttley
      @LiamOttley  1 year ago

      Good question, I haven't seen anything on limits for these kinds of indexes so worth testing. Ask different questions about info on sample pages?

  • @user-zs4zb9qn1r
    @user-zs4zb9qn1r 1 year ago +1

    Thanks, but I can't see any code that 'prompt' variable is used for openai's api.
    Can u explain how can the chatbot remember previous chat?

    • @user-zs4zb9qn1r
      @user-zs4zb9qn1r 1 year ago +1

      I figured it out.
      You need to change it to query(prompt) instead of query(user_input).

    • @LiamOttley
      @LiamOttley  1 year ago

      🙏🏼
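
The fix reported in this thread (query with `prompt` rather than `user_input`) matters because the bot can only "remember" earlier turns if they are included in the text sent with each query. A minimal sketch of that pattern follows; `build_prompt` and `fake_query` are hypothetical, with `fake_query` standing in for the real `index.query(prompt)` call so the example runs without an API key.

```python
def build_prompt(history, user_input, max_turns=3):
    """Combine the most recent exchanges with the new question."""
    lines = []
    for question, answer in history[-max_turns:]:
        lines.append(f"User: {question}")
        lines.append(f"Bot: {answer}")
    lines.append(f"User: {user_input}")
    lines.append("Bot:")
    return "\n".join(lines)


def fake_query(prompt):
    """Stand-in for index.query(prompt); reports how many user turns it saw."""
    return f"(answer based on {prompt.count('User:')} user turns)"


history = []
for user_input in ["What is the article about?", "Who wrote it?"]:
    prompt = build_prompt(history, user_input)
    answer = fake_query(prompt)  # query with the full prompt, not just user_input
    history.append((user_input, answer))

print(prompt)
```

Capping the history with `max_turns` keeps the prompt from growing without bound, since every included turn uses up tokens.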

  • @SeanietheSpaceman
    @SeanietheSpaceman 1 year ago +1

    can we teach it to iteratively improve its own code?

  • @nemesis851_
    @nemesis851_ 3 months ago

    Does this training of the knowledge base, cut into the token limit?

  • @MrOmsa12
    @MrOmsa12 1 year ago

    Is the ChatGPT API key required for this to work? And if so, does it cost money to use llama?

  • @owen_silk
    @owen_silk 1 year ago +1

    keep making great videos

  • @gaysasuke5618
    @gaysasuke5618 1 year ago

    I'm so confused, did you skip a step when we got to the llama index section. Is there a video I need to watch before this, this doesn't make any sense to me.

  • @NickMart1985
    @NickMart1985 1 year ago

    Can something like this be used with sensitive business data? Is OpenAI consuming and storing all of this data as its being indexed?

  • @smudgepost
    @smudgepost 1 year ago

    Very good. Need a nice front end and link to a vector db like Pinecone

  • @rickyroffey
    @rickyroffey 1 year ago

    Is there a file size limit for a txt file? And if you upload a large file, does it take a long time to respond to the query?

  • @jeanpeuplu3862
    @jeanpeuplu3862 1 year ago

    Not sure if I got it wrong, but it means we have to pay for an openAI key, it's not free, is it?

  • @user-cs2xi1mo8z
    @user-cs2xi1mo8z 1 year ago

    does this method prevent chatGPT from using/learning from the indexed data?

  • @mohammedashfaaq3071
    @mohammedashfaaq3071 1 year ago

    Can anyone help me understand why I am getting a rate limit error while using GPTVectorStoreIndex(documents) with my OpenAI key, and suggest how to tackle it?

    • @ValoriteUK
      @ValoriteUK 1 year ago

      Same issue, here, did you solve it?

  • @darkknightgaming9016
    @darkknightgaming9016 1 year ago +2

    Great video! I was just wondering if it is possible to make it less expensive, because when I use big data bases it uses a lot of tokens.

    • @LiamOttley
      @LiamOttley  1 year ago +1

      GPT 3.5 Turbo is extremely cheap, hopefully they add support for it soon instead of davinci-003

  • @a999haa
    @a999haa 1 year ago

    Hey Liam! Great video mate 🙌
    Can I ask if this can generate responses in a particular json format if needed after indexing any document?
    Thanks again for the video!
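
One way to approach the JSON question is to ask for JSON in the query text and then validate whatever comes back, because a language model may still wrap the object in extra words. The sketch below uses only the standard library; `json_instruction` and `parse_json_reply` are hypothetical helpers, and `sample_reply` is a made-up reply used to exercise the parser, not real model output.

```python
import json


def json_instruction(question, keys):
    """Append a formatting request to the question (wording is an assumption)."""
    key_list = ", ".join(f'"{k}"' for k in keys)
    return (
        f"{question}\n"
        f"Respond with a single JSON object containing exactly these keys: "
        f"{key_list}. Do not add any text outside the JSON."
    )


def parse_json_reply(reply, keys):
    """Extract the outermost {...} span and check that all expected keys exist."""
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or any(k not in data for k in keys):
        return None
    return data


# Made-up reply with chatter around the JSON, to exercise the parser.
sample_reply = 'Sure! {"title": "Grocery list", "category": "food"}'
print(parse_json_reply(sample_reply, ["title", "category"]))
```

When `parse_json_reply` returns `None`, the caller can retry the query or fall back to the plain-text answer.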

  • @curtismrasmussen
    @curtismrasmussen 1 year ago

    Is this the best approach to export thousands of ChatGPT conversations and then be able to search that data (I'm a prolific writer)? Does it import and structure (or show you how) automatically? OR is there a better tool for doing that? Thanks in advance for any helpful answers. I'm NOT a programmer so that is an important consideration.

  • @troymcneil4015
    @troymcneil4015 1 year ago

    Do you know if there is a way to use multiple sources for the documents? For instance can you use text docs AND discord?

  • @proudindian3697
    @proudindian3697 1 year ago

    Thankyou so much..!!

  • @april11729_
    @april11729_ 1 year ago

    Wow!!!! Thank you so much!!!

  • @andrei0_058
    @andrei0_058 1 year ago

    Hey man I know this was months ago but the new llama index has different names for some of the stuff in your code, creating problems when running the code. Is it possible for an update on this?

  • @Someone-mn1sx
    @Someone-mn1sx 1 year ago +1

    My question is will doing this bypass the content filter of ChatGPT? Could I host GPT and use llama-index or something to do that? No, it's not for sexy time. It won't talk about a lot of things like stock trading or cybersecurity because it flags it as bad content and gives an excuse instead of responding.

    • @LiamOttley
      @LiamOttley  1 year ago

      Good question. I'd assume that because it's just using your API key and text-davinci-003 (or whatever you set it as), it would still hit the filter.

  • @jamesrruff
    @jamesrruff 1 year ago

    I've spent hours on this to no avail due to an error "TypeError: __init__() got an unexpected keyword argument 'llm_predictor'". While i've been able to bypass the error, it seems like there is still an issue because as I change the temperature and OpenAI model, the output is identical every-time, which shouldn't be the case. ***Note, the first index iteration is commented out intentionally for debugging.*** Any idea on how to fix this?
    The solution 'seemed to be' specifying variables:
    "# Create a new GPTSimpleVectorIndex with custom LLM predictor and prompt helper
    custom_LLM_index = GPTSimpleVectorIndex.from_documents(documents)
    custom_LLM_index.llm_predictor = llm_predictor
    custom_LLM_index.prompt_helper = prompt_helper"
    ### code being used:
    import os
    os.environ['OPENAI_API_KEY'] = "API KEY"
    from llama_index import SimpleDirectoryReader
    from langchain import OpenAI
    from llama_index import LLMPredictor, GPTSimpleVectorIndex, PromptHelper
    # Load your data into 'documents' a custom type by Llamaindex
    documents = SimpleDirectoryReader('./Documents/Jupytr Docs/llama_doc_index/data').load_data()
    # Create an index from your documents
    ##index = GPTSimpleVectorIndex.from_documents(documents)
    # Query your index
    ##response = index.query("what is the article about")
    ##print(response)
    # Define another LLM explicitly
    llm_predictor = LLMPredictor(llm=OpenAI(temperature=0.1, model_name="text-davinci-002"))
    # Define prompt configuration
    max_input_size = 4096
    num_output = 256
    max_chunk_overlap = 20
    prompt_helper = PromptHelper(max_input_size, num_output, max_chunk_overlap)
    # Create a new GPTSimpleVectorIndex with custom LLM predictor and prompt helper
    custom_LLM_index = GPTSimpleVectorIndex.from_documents(documents)
    custom_LLM_index.llm_predictor = llm_predictor
    custom_LLM_index.prompt_helper = prompt_helper
    # Query your custom index
    response = custom_LLM_index.query("what is the article about")
    print(response)

  • @Faisal1504
    @Faisal1504 10 months ago

    Very Interesting

  • @user-iv4gz5do2d
    @user-iv4gz5do2d 1 year ago

    How do I get past the word limit for an answer? Sometimes the answer feels cut off halfway. How can I change this? Thank you.

  • @jgilmourtechsmog
    @jgilmourtechsmog 1 year ago

    great stuff, looking to tie this out as a slack bot to answer questions from employees for various business facing items contained in our KB

    • @umairx25
      @umairx25 1 year ago

      Hello, if I use the Google docs loader, will the file be updated every time I update the Google doc?

  • @michaelstevenson2517
    @michaelstevenson2517 1 year ago

    GPTSimpleVectorIndex is not an import that llama_index has?

  • @0GRANATE0
    @0GRANATE0 1 year ago

    Do I get this right, that also with gpts API, if you build an App on your website; each time a new user is accessing the chat, you have to pass all the data again to the gpts API? so it costs everytime some tokens to "prepare" the chat bot for my new user, right?

  • @antfiv007
    @antfiv007 1 year ago +1

    Dear Liam, great stuff. Already subscribed and looking forward to learn from you. A question: how to increase the length of the output ? I am using it to document some code and it stops before completing the task entirely. many thanks

    • @taylormun
      @taylormun 1 year ago +1

      often the token limit is too low
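
The settings posted elsewhere in this thread (`max_input_size = 4096`, `num_output = 256`) cap answers at 256 tokens, which would explain truncated output. Since prompt and answer share the same window, raising the output size leaves less room for retrieved context. The arithmetic can be sketched as follows; `context_budget` is a hypothetical helper, and the model wrapper may enforce its own output cap, which is worth checking in your installed version.

```python
def context_budget(max_input_size, num_output, prompt_overhead=0):
    """Tokens left for retrieved context once the answer space is reserved."""
    budget = max_input_size - num_output - prompt_overhead
    if budget <= 0:
        raise ValueError("num_output leaves no room for context")
    return budget


# Values from the settings posted in this thread.
print(context_budget(4096, 256))   # 3840 tokens left for context
# A longer answer allowance shrinks the context budget accordingly.
print(context_budget(4096, 1024))  # 3072 tokens left for context
```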

  • @bimwerx
    @bimwerx 1 year ago

    Great content! How would you get around the response character limit using this example?

  • @adumont
    @adumont 1 year ago

    Is there any way I can save the index and reload it later? Can one add several types of things to the same index, like Wikipedia pages, PDF documents and maybe a calendar? Would they all go into the same index? Or how would it work? For the calendar, I assume it would need to be refreshed on a regular basis. How does one refresh part of the index (the calendar part)?

  • @aprilrobertson7450
    @aprilrobertson7450 1 year ago

    What do you recommend to build a construction engineering contractor estimating cooperation

  • @MomenRashad
    @MomenRashad 1 year ago

    Thanks

  • @noorameera26
    @noorameera26 1 year ago +1

    Hi Liam! I'm interested in building a chatbot for an internal website; however, I worry that this might cause information leakage. What's your opinion on this?

    • @LiamOttley
      @LiamOttley  1 year ago +1

      I’d say OpenAI is taking privacy pretty seriously. I wouldn’t be worried personally, people have built huge apps using their APIs already.

  • @joebanks4997
    @joebanks4997 1 year ago

    Very good. How do we train the bot to be context oriented? If I only want the bot to have knowledge of radio controlled cars for example. At the moment I can ask this bot about what's in its index, but I can also ask it about washing machines and it will answer.
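
A common way to keep a bot on topic is to wrap each question in an instruction that tells the model to answer only from the supplied context. The sketch below just builds such a prompt string; `scoped_prompt` and the refusal wording are hypothetical, and how the string is passed to the index depends on the library version. An instruction like this reduces off-topic answers but does not guarantee them away.

```python
REFUSAL = "I can only answer questions about the indexed documents."


def scoped_prompt(question, context):
    """Build a prompt that tells the model to stay within the given context."""
    return (
        "Answer the question using only the context below.\n"
        f'If the context does not contain the answer, reply exactly: "{REFUSAL}"\n\n'
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Off-topic question paired with context about radio-controlled cars.
print(scoped_prompt("How do I fix a washing machine?",
                    "RC cars use brushless motors."))
```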

  • @MarcelPreis-ee8kd
    @MarcelPreis-ee8kd 1 year ago

    I guess this information still goes through OpenAI's servers, right?
    Isn't that an issue for the customer's data privacy?

  • @iqbalhonnur4451
    @iqbalhonnur4451 1 year ago

    How is this different from using langchain?

  • @yakkalabour
    @yakkalabour 1 year ago

    Really cool video

  • @bgtubber
    @bgtubber 1 year ago

    Very nice! Does this thing run locally? Especially if there is no internet connection available.