If you want to learn advanced RAG techniques, check out my course here: prompt-s-site.thinkific.com/courses/rag
If you are looking for advising or consulting on AI/LLM, get in touch: calendly.com/engineerprompt/consulting-call
Great stuff! Thanks for sharing this! Glad to see new OSS models performing so well!
I wonder if tuning the prompts a bit for the 8B model, to force it to use only the tools, would help with hallucinations! But I guess that's on us, the viewers, to experiment. Thanks again. Great job!
I really like your videos! They are very interesting and up to date. They come in pretty handy in addition to my machine learning course.
Thanks, glad they are helpful.
After watching your video, I had an idea, but I don't know if it makes sense.
My idea is to have multiple tools. One tool answers user questions about the uploaded documents (the RAG process, e.g., load, chunk, embed, store in a vector DB, re-rank). Another tool answers questions that are not covered by the vector DB, using the LLM's trained knowledge. A final tool answers by searching the internet when neither the LLM nor the knowledge base has the information.
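One way to picture the three-tool idea is as a fallback chain: try the documents first, then the model's own knowledge, then the web. The sketch below is only an illustration under stated assumptions. All function names are hypothetical, the "RAG" tool is a plain keyword lookup standing in for a real vector search, and the tools are tried in a fixed order, whereas in a real function-calling setup the LLM itself would choose which tool to call.

```python
from typing import Callable, Optional

# Each tool returns an answer string, or None when it cannot answer.
Tool = Callable[[str], Optional[str]]


def make_document_tool(chunks: dict[str, str]) -> Tool:
    """Hypothetical stand-in for the RAG tool: keyword lookup, not a vector search."""
    def document_tool(question: str) -> Optional[str]:
        q = question.lower()
        for keyword, text in chunks.items():
            if keyword in q:
                return text
        return None
    return document_tool


def model_knowledge_tool(question: str) -> Optional[str]:
    """Hypothetical stand-in for answering from the LLM's trained knowledge."""
    known = {"capital of france": "Paris"}
    q = question.lower()
    for key, answer_text in known.items():
        if key in q:
            return answer_text
    return None


def web_search_tool(question: str) -> Optional[str]:
    """Hypothetical stand-in for a web search; returns a placeholder result."""
    return f"[web search result for: {question}]"


def answer(question: str, tools: list[tuple[str, Tool]]) -> tuple[str, str]:
    """Try each tool in order; return (tool_name, answer) from the first that answers."""
    for name, tool in tools:
        result = tool(question)
        if result is not None:
            return name, result
    return "none", "No tool could answer."


tools = [
    ("documents", make_document_tool({"refund": "Refunds are issued within 14 days."})),
    ("model", model_knowledge_tool),
    ("web", web_search_tool),
]

print(answer("What is the refund policy?", tools))
print(answer("What is the capital of France?", tools))
print(answer("Who won yesterday's match?", tools))
```

The fixed order puts the user's own documents first and the web last, which matches the priority described in the comment. Swapping the order, or letting the model pick the tool, is a design choice rather than a requirement.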
Great! Can I use this code structure with other models like OpenAI/Claude?
When are the quantised instruct versions of Llama 3.1 8B coming?
There were some issues with the quants in llama.cpp. Hopefully that will be resolved soon.
@engineerprompt Thanks! Have you tried good ol' Mistral 7B on function calling? In my experience, it is better at function calling than Llama 3 8B, although it definitely falters too...
It is also definitely better at JSON output.
Interesting. At least there are improvements with 3.1, so perhaps a finetune of the 8B for function calling is possible?
Yes, and it seems like prompting can also help. I have seen some work on it.
Question: Can you kindly share how I can add those highlights to my subtitles? We need something like that for our language class. Thanks
I use Descript for it.
@engineerprompt Thanks for the quick reply! I will look into that ASAP! Kudos for your awesome content! @MyCloudVIP Chicago
I have a problem. I am using Llama 3.1 tool calling. The problem is that it hallucinates values when I do not supply the required info. How can I make it ask me follow-up questions about the required parameters?
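One possible approach, sketched here rather than taken from the video, is to check the model's proposed arguments against the tool's `required` list before running the tool, and to turn any gap into a follow-up question. The `get_weather` schema and the helper names below are hypothetical. The check only catches parameters the model left out or left empty, so it works best together with a system prompt telling the model to omit values the user did not provide instead of guessing them.

```python
from typing import Any, Optional

# Hypothetical tool definition in the JSON-schema style commonly used for function calling.
WEATHER_TOOL: dict[str, Any] = {
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "city name"},
            "unit": {"type": "string", "description": "temperature unit"},
        },
        "required": ["city", "unit"],
    },
}


def missing_required(tool: dict[str, Any], arguments: dict[str, Any]) -> list[str]:
    """Return the required parameter names that are absent or empty in the arguments."""
    required = tool["parameters"].get("required", [])
    return [name for name in required if arguments.get(name) in (None, "")]


def follow_up_question(tool: dict[str, Any], arguments: dict[str, Any]) -> Optional[str]:
    """Build a follow-up question for missing parameters, or None if the call can proceed."""
    missing = missing_required(tool, arguments)
    if not missing:
        return None
    props = tool["parameters"]["properties"]
    described = [props[name].get("description", name) for name in missing]
    return "Before I call " + tool["name"] + ", please provide: " + ", ".join(described) + "."


# The model proposed a call with only the city filled in.
print(follow_up_question(WEATHER_TOOL, {"city": "Paris"}))
```

If `follow_up_question` returns a string, send it back to the user instead of executing the tool; if it returns `None`, the call has every required parameter and can go ahead.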
Can you make the notebooks shareable, please 🙏? Currently I have to request permission to access them.
Sorry for that. Now you should have access.
Imagine when LLMs can create their own tools to accomplish a task, rather than relying on a human-provided list of static tools.
Is it like agents? Function calling?
Yes