Build a Medical RAG App using BioMistral, Qdrant, and Llama.cpp
- Published: 21 Aug 2024
- In this tutorial, I guide you through the process of building a cutting-edge Medical Retrieval Augmented Generation (RAG) Application using a suite of powerful technologies tailored for the medical domain. I start by introducing BioMistral 7B, a new large language model specifically designed for medical applications, offering unparalleled accuracy and insight into complex medical queries.
Next, I delve into Qdrant, a self-hosted vector database that we run inside a Docker container. This robust tool serves as the backbone for managing and retrieving high-dimensional data vectors, such as those generated by our medical language model.
To enhance our model's understanding of medical texts, I utilize PubMed BERT embeddings, an embeddings model specifically crafted for the medical domain. This ensures our application can grasp the nuances of medical literature and queries, providing more precise and relevant answers.
A crucial component of our setup is Llama.cpp, a library that enables inference of large language models on CPU machines. Running a quantized model this way allows for efficient, cost-effective deployment without giving up much quality.
For orchestrating our application components, I introduce LangChain, an orchestration framework that seamlessly integrates our tools and services, ensuring smooth operation and scalability.
On the backend, I leverage FastAPI, a modern, fast (high-performance) web framework for building APIs with Python 3.7+. FastAPI provides the speed and ease of use needed to create a responsive and efficient backend for our medical RAG application.
Finally, for the web UI, I employ Bootstrap 5.3, the latest version of the world’s most popular front-end open-source toolkit. This enables us to create a sleek, intuitive, and mobile-responsive user interface that makes our medical RAG application accessible and easy to use.
Join me as I walk you through each step of the process, from setting up the environment to integrating these technologies into a cohesive and functional medical RAG application. Whether you're a developer interested in medical applications, a data scientist looking to expand your toolkit, or simply curious about the latest in Gen AI and machine learning, this tutorial has something for you.
Don't forget to like, comment, and subscribe for more tutorials like this one. Your support helps me create more content aimed at exploring the forefront of technology and its applications in the medical field. Let's dive in!
GitHub Code: github.com/AIA...
Qdrant Video: • Get Started with Qdran...
RAG Playlist: • RAG (Retrieval Augment...
Join this channel to get access to perks:
/ @aianytime
To further support the channel, you can contribute via the following methods:
Bitcoin Address: 32zhmo5T9jvu8gJDGW3LTuKBM1KPMHoCsW
UPI: sonu1000raw@ybl
#mistral #ai #llm
This is gold. Thanks bro, you are really fast. I saw your medical RAG app, then saw that BioMistral was released in the last few days and wondered whether it would be better suited to the RAG app, and within a day you come up with the video!
I love your content. I highly request that you start a course taking LLMs from beginner-friendly to advanced, covering all the important aspects. I don't care if it's paid or not, please do it.
Love the way the details are provided in your videos (I was just wondering why Qdrant was used and not FAISS, and then he answered my question without my having to check anywhere else).
Keep it up .. And thanks for making such informative and detailed videos. :)
Amazing video and so much to learn. You surface technologies that are hidden gems.
Thanks for writing ☺️
Excellent and up to date content as always. Thanks for the code examples. I'm working on something similar and BioMistral 7B looks promising.
Here in NZ, tens of thousands do not have access to a doctor, and this type of application should be funded and made available to those in need.
Glad it was helpful!
Amazing content for free.. thanks
Happy to help
Nice work.. Thanks
Explained in depth
Excellent work, thank you, maestro.
Thank you sir
Amazing Content.
youre awesome man!! keep it up. Hope to see you grow!
Thanks, you too!
You're a god, thx!
Amazing work. It seems to be working fine. I faced an issue where the retriever was not fetching the entire response.
Thanks... You should look at the chunk size and the top_k documents.
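To make the two knobs in that reply concrete: chunk_size controls how much text each stored document carries, and top_k controls how many chunks the retriever returns; truncated answers usually mean one of them is too small. A minimal pure-Python illustration of character chunking (LangChain's text splitters do the real work in the app):

```python
# Illustrative character-based chunker: fixed-size chunks with optional
# overlap, mimicking what a text splitter does before indexing into Qdrant.
def chunk_text(text: str, chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunk_size-character pieces, stepping by chunk_size - overlap."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# 250 chars, 100-char chunks, 20-char overlap -> starts at 0, 80, 160, 240.
chunks = chunk_text("a" * 250, chunk_size=100, overlap=20)
print(len(chunks))
```

Bigger chunks (or a larger k passed to the retriever, e.g. `search_kwargs={"k": 4}`) give the LLM more context at the cost of a longer prompt.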
Is it possible to add vision to it, where we can submit an X-ray or a blood report and it can analyse it and try to report some findings?
No, it is an instruct-only model. If required, you would have to add a bio vision model.
@@sharathkumard Can you suggest any model that is available on Hugging Face or open source?
@@sharathkumard How can I do this?
Great work.. could you make a video about Self-RAG or self-reflection RAG? Thank you in advance.
Self-RAG is overrated... but I will create one.
I tried building the same on my Mac. The thing is, which Python version you were using was unclear, the requirements.txt needed to be tweaked any number of times accordingly, and the dependencies in the venv were colliding with one another; it took me 55 minutes to get started. So excellent work in trying to shorten it, but to the viewers my request is: if it doesn't work on the first go with the code on your local machine, don't give up. The instructor is nice, but he has to think about YouTube, so he can't cover everything verbatim.
Hey Jatin, can you please help me out? I am also on Mac and the requirements.txt SUCKS.
@@naveenpoliasetty954 Eventually it didn't work out for me
I have a very generic question about evaluation of the RAG system. How can we evaluate the responses generated by the RAG system?
Do we always need internet when we use Qdrant? I am developing an offline chatbot; can we use the Qdrant vector DB in this case?
Great video, but why use the model in a RAG setup? If it is a well-trained model it should be able to generate answers without retrieval, and if not, why not use Llama 2 or Mistral Medium, which are more powerful?
You can insert patients' data through RAG, and then the model can use it for further analysis and diagnosis....
Liked the video but there were a lot of steps I had to complete to get it to work.
So much to learn, thanks. If I have 5 clients at the same time, can they all chat? Is there a PDF upload option?
Getting an error installing the llama_cpp_python package (using Python 3.11) on a Windows machine:
-------------------------------------------------------------------
ERROR: Failed building wheel for llama_cpp_python
Failed to build llama_cpp_python
ERROR: Could not build wheels for llama_cpp_python, which is required to install pyproject.toml-based projects
Getting a similar issue, but with onnx.
I had the same error, and it was solved when I installed gcc and g++. I suggest you follow a tutorial, because the installation is a bit long.
Localhost is not able to connect; can you advise on what is wrong?
How can we evaluate the responses generated by the RAG system?
Bro, can you please make a video on Ollama?
thank you
Soon 🔜
Can we run it on an 8 GB RAM system?
Difficult.....
Hey, are you Indian? Because you look familiar.
Well, the tutorial is OK, but I lost it after 15 minutes.. LOL. You need to know many "whys" and "hows" before touching this tutorial, actually..