This is amazing, you are a hidden gem in the NLP education space 🤩🤩🤩😍😍😍
Shamal - seriously man! I agree! James is a blessing to us!
Thanks both 🙏
Hi James, @4:07 is Pinecone responsible for converting the matched embeddings back to text?
Crystal clear, thank you! In the end, it seems to me that the result is not very different from what one would get from an extractive model, except that the generator then puts together an answer in English. Makes me wonder if it would be possible to take a generator instead (like BART or GPT-3.5) and fine-tune it to directly answer questions based on the corpus of text (a part of Wikipedia here), without going through the extraction phase first. Perhaps it would take renting a datacenter for a couple of weeks?
Hey James, been following you for 8 months now. Really enjoy your videos man! Thanks for everything you do.
that's awesome. Thanks for sticking around! 🙏
Bro. This is incredible! I need this man.
Questions:
- I can probably use Elasticsearch to get my contexts instead of Pinecone, right?
- If the generative model doesn't have a good answer, how do we retrieve the confidence score so we can trigger some other logic?
- We can easily fine-tune the model or replace it with a fine-tuned GPT model, right? Do you have resources for fine-tuning?
- Do you have better ideas for building a chatbot for a specific domain / corpus of Q&A tasks?
Thanks and I'm sorry for all the questions!
Hey Elijah, glad it helps! For your questions:
1. Yes, but you might need to use a sparse embedding model if using typical Elasticsearch, so you will miss out on the semantic search part. Alternatively, you can use their kNN / ANN service; from what I've heard it's decent but doesn't scale to very big datasets (if you're in the *few millions*, it shouldn't be a problem)
2. As far as I know there isn't a "confidence score" output by the generative model, unless it's possible to extract the confidence of the token predictions (I imagine this is possible, but I haven't tried). As an alternative, you could try calculating the semantic similarity using the retriever or some other QA retriever model and using that as a confidence score
3. Yes you can, I actually did something like this with GPT-3 here ruclips.net/video/coaaSxys5so/видео.html - I don't have anything on fine-tuning GPT-like models
4. I think the video linked above is the best I've ever seen for Q&A, other than ChatGPT. If you wanted to adapt it as a chatbot, you could initialize the input with something like "you are a chatbot", precede each user input with "user: " and each answer with "chatbot: ", and with each step in the conversation feed in the previous steps
I hope that helps!
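Following up on point 2 above, here's a minimal pure-Python sketch of one way to derive a pseudo-confidence from the generator's token predictions. It assumes you can extract per-token log-probabilities from your model (e.g. via `output_scores=True` in Hugging Face's `generate`); the function name and thresholds are illustrative, not an official API:

```python
import math

def sequence_confidence(token_logprobs):
    # Geometric mean of the per-token probabilities: a simple
    # pseudo-confidence score for a generated sequence
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))

# A generation whose tokens were all near-certain scores high...
high = sequence_confidence([math.log(0.9), math.log(0.95), math.log(0.85)])
# ...while one built from low-probability tokens scores low
low = sequence_confidence([math.log(0.2), math.log(0.3), math.log(0.25)])
```

You could then trigger fallback logic whenever the score dips below some tuned threshold, or use the retriever's similarity score in the same role, as suggested above.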
@@jamesbriggs Helps a lot! Thanks James! Do you have something on a Pinecone + Elasticsearch architecture? I have an app for which we've decided to use Elasticsearch, but we need semantic search capability and will need to scale to at least billions of records.
There's an upcoming article covering indexing from Elasticsearch to Pinecone - I'll share that when it's ready
If you're looking to merge sparse + dense (semantic) search, there is a hybrid search feature in private preview at Pinecone. Or you can use Elasticsearch as the sparse index and Pinecone as the dense index, query both, and then combine the result scores for each (shared) record to give the final records.
I think hybrid search with Elasticsearch + Pinecone is a good use-case, so I could look into doing something on it
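As a sketch of the "combine result scores" idea above, here's one simple way to merge the two rankings. It assumes both systems' scores have already been normalized to a comparable 0-1 range; the function and the alpha weighting are illustrative, not a Pinecone or Elasticsearch API:

```python
def combine_scores(sparse_scores, dense_scores, alpha=0.5):
    # sparse_scores: e.g. normalized BM25 scores from Elasticsearch
    # dense_scores:  e.g. similarity scores from Pinecone
    # alpha weights the dense (semantic) side; records missing from
    # one index simply contribute 0 from that side
    ids = set(sparse_scores) | set(dense_scores)
    combined = {
        i: alpha * dense_scores.get(i, 0.0)
           + (1 - alpha) * sparse_scores.get(i, 0.0)
        for i in ids
    }
    # highest combined score first
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

sparse = {"doc-1": 0.8, "doc-2": 0.3}
dense = {"doc-2": 0.9, "doc-3": 0.6}
ranking = combine_scores(sparse, dense)
```

A record that scores well in both indexes (like "doc-2" here) rises to the top, which is the intuition behind hybrid search.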
@@jamesbriggs Thanks again James! I'll look forward to that article. I think we can do dense vectors now in Elasticsearch with kNN search; you can bring in a sentence-transformer model from Hugging Face per their docs. Thanks again man! Keep bringing out bangers!
You are amazing, man. Please continue your advanced NLP tutorials, they help a lot.
I will :)
I've been going through lots of videos on RUclips regarding creating QA chatbots, but I've gotta admit your videos are really concise and helpful.
The model hallucinates if the answer is not in the context. Why can't it say that the answer is not in the context?
If you can, I would love to see you use one of the open-source LLMs, like Meta's OPT or BLOOM, to create chatbots (like ChatGPT). I think you're one of the best in the space, and would love to see how you do it
planning on doing some videos with LLMs, both open source and not - they'll be coming soon :)
I am searching for a video on answer generation using an HF model. Hopefully this is the one... I would like to train the model with a similar "SQuAD-styled dataset" as well as just a corpus of documents, then, when presented with a question, get an answer generated from the corpus (either exact, as a specific span, i.e. a quote, or a generated answer (a guess), perhaps based on probability or similarity?). Can the model be trained for two purposes, i.e. first to provide text generation, then fine-tuned as a QA model? Or vice versa? So that my language model can perform various functions in a single model instead of making many models for different purposes. (PS: I'm a .NET dev, not a Python one, hence (PyTorch methods only) + Hugging Face is still confusing.)
Also, is the GPT model a general, all-purpose model?
What if I want the model to return None (or a specific word) if there is no result? Can I do that?
Sir, please reply.
The error message you're encountering is related to resource quotas and limitations within the Pinecone service. Pinecone is a platform that allows you to build and deploy vector similarity search systems, and the error you're seeing indicates that you're trying to create an index that exceeds the allocated resource quota for your project.
This is the error. Since I can't afford Pinecone Pro, is there any way to fix it?
I have Rules/Acts which are separated into different columns: Rule Number, Rule Description, Subrules, etc. How do I train on them in NLP?
Wow, exactly what I needed for my thesis project! How do I get the final model ready for production with a Flask API?
I hope my question is not too dumb 😅
Hello @James Briggs, can this be done using GPT-2 instead of BART?
Just what I needed! AMAZING work man🙌🏻 lots of love ❣️
🙌
James, I enjoyed this video very much. You are the only one who explains current advances in NLP topics in depth.
I suppose you can replace the wiki data with your own documents. But is it enough to have only an ID and some context string, or do I need to supply more metadata?
BTW, it would be awesome if this example could be extended to something similar to ChatGPT. For example Q1 -> A1, but A1 is not correct, so try answer A2 (for example, by googling Q1 and feeding the results into the generator), etc... I searched in the past for a chatbot using RL (reinforcement learning) but did not find any, and I see that ChatGPT is using RL too...
Thanks a ton :) Yes, you can replace them with your own docs, and no, all you need is an ID and a context string. It can be useful to include some info on where the document comes from in the metadata, but it isn't necessary.
Yes, for sure. I'd love to see ChatGPT reference its sources, and for more niche subjects (like recent NLP papers) it would help ChatGPT answer questions accurately
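A quick sketch of that "ID + context string (+ optional metadata)" record format, with a stubbed embedding function standing in for the real retriever (the field names, helper, and 384-dim vector size are illustrative, not a required schema):

```python
# Records for the vector DB need only an ID, a vector, and the context
# string in the metadata; "source" is optional extra info on where the
# document came from
docs = [
    {"id": "doc-0", "text": "The Eiffel Tower is in Paris.", "source": "wiki"},
]

def to_records(docs, embed):
    records = []
    for d in docs:
        meta = {"context": d["text"], "source": d.get("source", "unknown")}
        records.append((d["id"], embed(d["text"]), meta))
    return records

# embed() would normally be a retriever, e.g. a sentence-transformers
# model's encode(); a zero-vector stub keeps the sketch self-contained
records = to_records(docs, embed=lambda text: [0.0] * 384)
# index.upsert(vectors=records)  # the actual Pinecone upsert call
```

At query time, the retriever returns the matched records and the answer contexts are read back out of the `context` metadata field.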
Very nice video. Can I use an Elasticsearch dense vector instead of Pinecone?
Yeah, but I wouldn't recommend it for larger datasets; it's slower, has fewer features, is less accurate, and is more expensive. But it's fine if you're working with smaller datasets.
Thanks a lot, such a great video. I am working on a French dataset and I want to know if there exists a French model trained on a French corpus and tested on a QA task
Thanks for the tutorial. Do you have code for fine-tuning this BART QA model?
What notebook provider do you use?
Any idea how I can fine-tune on my dataset to extract email signatures?
I like your videos, and I would love to see videos on training/fine-tuning generative text models.
Woah! Somebody who actually zooms into the text to make it visible. Your videos get 6 out of 5 stars for the attention to detail!
Amazing tutorial! In my use case, I would want to create a question answering system such that, for each document, I can query something, and if the answer exists in the document (a kind of text extraction model), the model gives it back to me. I think it doesn't really fit what you show in the tutorial. Do you have any recommendations in terms of models, or suggestions?
This video details what's in the Pinecone documentation, thanks for that. ChatGPT has the ability to do some statistical analysis as well as SWOT. Does this analytical capability exist in any particular pipeline, and can BERT do it? Thanks for your reply
How can my chatbot respond based on previously asked questions?
Good question! Should you pass the history in a text file as user (question), AI (answer), context (previous messages), and retrain a new model? I.e., fine-tune it with the new history?
Can we use GPT-2 for this? Please reply fast.
Really appreciate the hard work you do.
Don't we need to truncate the text, as the sentence transformer cannot look at very long text?
The wiki dataset (as far as I know) contains mostly short sentences, but for those that are longer, as we used sentence-transformers, anything over the max length of the model will be automatically truncated
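To illustrate what that truncation amounts to (a simplified stand-in only: real sentence-transformers models truncate at the subword-token level, and you should check `model.max_seq_length` for your model's actual limit):

```python
def truncate_to_max_length(text, max_tokens=256):
    # Crude whitespace "tokenization" for illustration; in reality the
    # model's tokenizer counts subword tokens, and the library applies
    # this cut-off internally before encoding, so anything past the
    # limit simply never reaches the model
    tokens = text.split()
    return " ".join(tokens[:max_tokens])

long_passage = "word " * 1000          # far beyond the model limit
truncated = truncate_to_max_length(long_passage)
```

The practical consequence is that very long passages are represented only by their first chunk, which is one reason corpora are often split into short snippets before indexing.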
Would it make sense to use ChatGPT for the text generation model?
It definitely would. It would probably give a similar result to the GPT-3 video, but I assume better (considering the improvements in ChatGPT)
This tutorial was a GEM 💎 Loved it.
Actually, for me, I have used Haystack to replicate the same. There I used the same LFQA technique with GenerativeQAPipeline, and I could observe that the answers the model gave were more "complete".
One more thing I would like to ask: is it possible to give structured data (tables) and unstructured data together as the context and let the model answer from both of the datasets? Here, again, we would like to see the "generative response" from the tables as well, instead of the "extractive" one.
Like, if I am feeding in data on the USA economy (past 5 years - CSV) + all news articles (for the past month - text), then asking a question like "How is the USA economy in the past 6 months?", then instead of "extracting" an answer like "+6% GDP", it should generate an answer from the data like "It is performing really well compared to last year at 4%..." etc.
Can we do that? How? Please guide.
Thank you.
Thanks for the valuable video. Is the BART model chosen from under the text generation task on Hugging Face, or somewhere else?
BART is one example we can use for text generation; in Hugging Face there are a few similar models, like T5 and GPT-2, that could also be used. If you have the resources, BLOOM could return better results too.
Outside of Hugging Face you can use OpenAI's GPT-3, I did another video on that here: ruclips.net/video/coaaSxys5so/видео.html
@@jamesbriggs Thanks for the replies. Actually, I have tried text summarization using T5, BART and Pegasus. I have learnt about abstractive QA systems from your video today. I have only one doubt: whether the generative answer will be good given the predefined context from the retriever.
If using large language models (LLMs) like GPT-3 or BLOOM the generated answers are very good, almost perfect most of the time. Using T5, Bart, and Pegasus you will still get good results but they will fail more frequently. Nonetheless, adding the context (as we do here) improves generated answers significantly.
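For reference, here's a sketch of how retrieved contexts get combined with the question before generation. The `question: ... context: <P> ...` layout assumed here follows the `vblagoje/bart_lfqa` convention used in the video; other generators expect different input formats, so check your model's card:

```python
def build_generator_input(question, contexts):
    # Join the retrieved passages with "<P>" separators and prepend the
    # question; the generator then writes an answer grounded in these
    # passages rather than from its parametric memory alone
    context_str = " ".join(f"<P> {c}" for c in contexts)
    return f"question: {question} context: {context_str}"

query = "When was the first electric power system built?"
contexts = [
    "In 1881 two electricians built the world's first power system at Godalming.",
    "It was powered by two waterwheels.",
]
model_input = build_generator_input(query, contexts)
```

This string would then be tokenized and passed to the seq2seq model's `generate` call; swapping the generator (BART, T5, GPT-style) mainly means swapping this input template.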
@@jamesbriggs Thanks for your kind replies.
Underrated channel, thanks for the videos
Thanks man
James, thanks for your awesome videos. In the future, it would be great if you used something other than pinecone (maybe something open-source, like milvus) for the vector database, just so we can see other options!
Thanks for watching! Sometime soon I'll likely cover Elasticsearch and Faiss some more. I work with Pinecone so that will remain the main tool, but may do some Milvus/Weaviate at some point :)
@@jamesbriggs awesome, thanks for the quick reply! Will look forward to it. Would you recommend any of those options to someone looking to maintain their own vector database?
I've only used Elasticsearch and Faiss; naturally, neither of those is fully packaged as a vector database (Elasticsearch is, but the ANN search is not ideal). For the others, I haven't used them extensively so I couldn't say; Milvus and Weaviate are on a pretty similar level as far as I know. I think Milvus might be a little behind Weaviate in the development of features like hybrid search, but I'm not sure
Hey James, thanks for sharing this video. Any information on how to solve this error: "MaxRetryError: HTTPSConnectionPool(host='controller.your_environment.pinecone.io', port=443): Max retries exceeded with url: /databases (Caused by NewConnectionError(': Failed to establish a new connection: [Errno -2] Name or service not known'))"
Most likely it is the environment in your pinecone.init call that is wrong. The default for new projects is now "us-east1-gcp", but you can check yours in the Pinecone console next to your API key
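For reference, the relevant call looks roughly like this (a sketch for the pre-v3 `pinecone-client` used in the video, with a placeholder key):

```python
import pinecone

# The environment string must match the one shown in the Pinecone
# console next to your API key; a mismatch produces exactly the
# "Name or service not known" connection error above. "us-east1-gcp"
# is only the default for new projects, so yours may differ.
pinecone.init(
    api_key="YOUR_API_KEY",     # placeholder, taken from the console
    environment="us-east1-gcp",
)
```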
Thank you. I have learnt a lot from your videos. Is there a link to this code in github?
Yep! Here it is: github.com/pinecone-io/examples/blob/master/search/question-answering/abstractive-question-answering.ipynb
Can you please explain how generative AI models work? Like GPT, abstractive summarization, pointer-generator networks, etc.
16:12
You are the best!
Please, you need to improve your teaching. It's hard to understand and interpret what you are saying; please try to explain things in a better manner.
Actually he is super good 😂... very slow-paced and detailed. His English is very clear!
Also there are subtitles available
For me, he forgets to export or show how to save the model after training!!! Every video!
Thanks for the video. I am not able to get this dataset, 'vblagoje/wikipedia_snippets_streamed'. Can you upload it to GitHub or Hugging Face?