- Videos: 16
- Views: 30,494
Eduardo Vasquez
Dominican Republic
Joined Dec 27, 2019
AI & Data Science Explained! I'm Eduardo Vasquez and I make complex topics like Large Language Models, Machine Learning, and Big Data accessible to everyone.
Free Cloud Battle Royale: GCP vs IBM - Which is Best for You?
In this video, I dive deep into the free trial accounts offered by Google Cloud Platform (GCP) and IBM Cloud to determine which one provides more value for your cloud computing needs.
🔍 Key Comparison Points:
- Amount of Credits and Duration of the Trial: Find out which platform gives you more credits and how long you can use them.
- Object Storage (Buckets):
  - Free Storage
  - Class A and Class B Requests
- Container Registry:
  - Storage for Docker Images
  - Pull Traffic
- Serverless Application:
  - Google Cloud Run vs. IBM Cloud Code Engine
- Example Scenario: Calculating the resources needed for a Machine Learning (ML) serving workflow on both platforms (see the cost sketch below).
Watch till the end to see which cloud provi...
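To make the example scenario above concrete, here is a minimal back-of-the-envelope cost sketch. It is not taken from the video; all rates are hypothetical placeholders, since free-tier limits and prices change over time.

```python
# Hedged back-of-the-envelope cost estimate for an ML serving workload.
# All rates below are hypothetical placeholders; check current pricing pages.
def monthly_cost(requests_per_month: int, gb_stored: float,
                 price_per_million_requests: float,
                 price_per_gb_month: float) -> float:
    """Estimate monthly cost from request volume and object storage."""
    request_cost = requests_per_month / 1_000_000 * price_per_million_requests
    storage_cost = gb_stored * price_per_gb_month
    return request_cost + storage_cost

# Example: 2M serving requests and 10 GB of model artifacts per month.
print(f"Estimated: ${monthly_cost(2_000_000, 10, 0.40, 0.02):.2f}/month")
```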
Views: 681
Videos
Soccer Rules PDF Interaction - RAG in LLM Use Case
Views: 738 • 5 months ago
Welcome to my video where I dive into the world of the RAG technology (tech) stack and its application in creating an interactive PDF with RAG. Learn how this innovative approach transforms how we interact with PDF documents, making them more dynamic and user-friendly. 📚 In this video, we’ll explore: - The fundamentals of the RAG tech stack. - How to create an interactive PDF with RAG....
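The exact stack is cut off in this truncated description, so here is a minimal RAG-over-a-PDF sketch using LangChain; the loaders, models, and the file name "rules.pdf" are assumptions, not necessarily what the video uses.

```python
# Minimal RAG-over-a-PDF sketch; "rules.pdf" and model choices are assumptions.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

docs = PyPDFLoader("rules.pdf").load()                    # load the PDF pages
chunks = RecursiveCharacterTextSplitter(chunk_size=1000).split_documents(docs)
store = FAISS.from_documents(chunks, OpenAIEmbeddings())  # embed and index
retriever = store.as_retriever()

# Ask the document a question and show the best-matching chunk.
print(retriever.invoke("When is a player offside?")[0].page_content)
```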
This AI Model Outperforms GPT-4o by 7% and is CHEAPER | NO LangChain | Python
Views: 755 • 5 months ago
This video shows step-by-step how to build a system of open-source LLMs that outperforms GPT-4o from OpenAI. Discover how a Mixture of Agents can significantly boost the accuracy of LLMs, surpassing even GPT-4o. 🔍 Key Highlights: - Mixture of Agents: Learn how combining different agents can create a powerful system that is 7% more accurate than GPT-4o. - Cost-Efficient Alternatives: Explore an alternative appr...
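A minimal sketch of the Mixture-of-Agents idea described here: several "proposer" models answer independently and an aggregator model synthesizes a final response. The model names and the Groq provider are assumptions, not confirmed by the truncated description.

```python
# Mixture-of-Agents sketch: N proposer models answer, one aggregator merges.
# Model names are assumptions; requires GROQ_API_KEY in the environment.
from groq import Groq

client = Groq()
PROPOSERS = ["llama3-8b-8192", "mixtral-8x7b-32768", "gemma-7b-it"]

def ask(model: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content

def mixture_of_agents(question: str) -> str:
    drafts = [ask(m, question) for m in PROPOSERS]  # independent answers
    merged = "\n---\n".join(drafts)
    return ask(PROPOSERS[0],                        # aggregator pass
               f"Synthesize the best answer from these drafts:\n{merged}\n"
               f"Question: {question}")

print(mixture_of_agents("Explain RAG in one paragraph."))
```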
ChatGPT and ANY LLM in your preferred IDE: VSCode & PyCharm | AI Coding Assistant New Copilot
Views: 875 • 5 months ago
In this tutorial, I’ll walk you through integrating various Large Language Models (LLMs) into popular IDEs like VSCode and PyCharm. Whether you’re using proprietary models like GPT-4o or open-source ones like Codestral via Ollama, I’ve got you covered! STOP paying for GitHub Copilot; this is a free AI coding assistant alternative that can be used offline. 🔍 What You’ll Learn: - Setting up pro...
Can Pandas Keep Up? Testing Pandas GPU & Polars for Data Analysis & Processing | cuDF Pandas Python
Views: 809 • 5 months ago
In this video, I tested three powerful data processing tools: Pandas, cuDF (Pandas with GPU acceleration), and Polars. I aim to determine which comes out on top regarding speed and memory efficiency. The video covers: 📚 - Reading Data: Which library can load data the fastest? - Filtering Data: How do they compare in filtering operations? - GroupBy Operations: Which one handles Groupby operation...
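A tiny timing harness in the spirit of this comparison, covering the read and GroupBy steps for pandas and Polars. The CSV file and column names are hypothetical placeholders; cuDF's drop-in pandas accelerator is typically enabled with `%load_ext cudf.pandas` in a notebook, which is omitted here.

```python
# Benchmark sketch comparing pandas and Polars on the same operations.
# "data.csv", "category", and "value" are hypothetical placeholders.
import time
import pandas as pd
import polars as pl

def timed(label, fn):
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result

pdf = timed("pandas read", lambda: pd.read_csv("data.csv"))
pldf = timed("polars read", lambda: pl.read_csv("data.csv"))

timed("pandas groupby",
      lambda: pdf.groupby("category")["value"].mean())
timed("polars groupby",
      lambda: pldf.group_by("category").agg(pl.col("value").mean()))
```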
Building My Own JARVIS! AI Voice Assistant with Whisper from OpenAI, Groq, gTTS | LangChain Python
Views: 3.6K • 5 months ago
In this video, I build a real-time AI voice assistant using Python! This project demonstrates how to integrate AI tools to create a responsive voice interaction system. 📚 What You'll Learn: Conversation Starter: Begin a conversation with the AI assistant. Audio to Text: Transcribe audio to text using an OpenAI model. Fast Inference: Generate responses quickly with Groq implementations. Lang...
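The pipeline described above (transcribe, generate, speak) can be sketched roughly as below. The file name and model choices are assumptions; it requires OPENAI_API_KEY and GROQ_API_KEY in the environment.

```python
# Voice-assistant pipeline sketch: audio -> text -> LLM -> speech.
# "question.wav" and the model names are assumptions.
from openai import OpenAI
from groq import Groq
from gtts import gTTS

openai_client, groq_client = OpenAI(), Groq()

# 1) Transcribe recorded audio with OpenAI Whisper.
with open("question.wav", "rb") as audio:
    text = openai_client.audio.transcriptions.create(
        model="whisper-1", file=audio).text

# 2) Generate a fast reply via Groq.
reply = groq_client.chat.completions.create(
    model="llama3-8b-8192",
    messages=[{"role": "user", "content": text}]).choices[0].message.content

# 3) Speak the reply with gTTS.
gTTS(reply).save("reply.mp3")
```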
Anime Lovers: Building a Content-Based Recommendation System using Python | Embeddings Qdrant AI
Views: 1.4K • 6 months ago
In this video, I demonstrate how to build a content-based recommender system that provides personalized recommendations based on the user's watch history, specifically what they have liked and disliked. ✨ Follow along as I guide you through: - Generating text embeddings - Inserting collections into Qdrant Cloud - Providing personalized recommendations based on user watch history, focusing on wh...
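Qdrant ships a recommendation API that matches the liked/disliked setup described here. A minimal sketch, assuming a collection named "anime" already populated with embedded points (the collection name and point IDs are hypothetical):

```python
# Content-based recommendation sketch using Qdrant's recommend API.
# Assumes an existing, populated "anime" collection; IDs are placeholders.
from qdrant_client import QdrantClient

# Use QdrantClient(url=..., api_key=...) for Qdrant Cloud as in the video.
client = QdrantClient("localhost", port=6333)

hits = client.recommend(
    collection_name="anime",
    positive=[101, 205],   # IDs of titles the user liked
    negative=[333],        # IDs of titles the user disliked
    limit=5,
)
for hit in hits:
    print(hit.id, hit.score)
```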
Advanced RAG with Self-Correction | LangGraph | No Hallucination | Agents | LangChain | GROQ | AI
Views: 7K • 6 months ago
In this video, I'll guide you through building an advanced Retrieval-Augmented Generation (RAG) application using LangGraph. You'll learn step by step how to create an adaptive and self-reflective RAG system, and how to effectively prevent hallucination in your language models. 🔍 Key Topics Covered: - Adaptive and Self-Reflective RAG: Learn how to design a RAG system that self-corrects to impro...
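The self-corrective loop can be wired up in LangGraph roughly as below. The node bodies are stubs standing in for the video's fuller retrieval and grading logic, so this is a skeleton of the graph shape rather than the actual implementation.

```python
# Skeleton of a self-corrective RAG graph in LangGraph; node bodies are stubs.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class RAGState(TypedDict):
    question: str
    documents: list
    answer: str

def retrieve(state):   return {"documents": ["...retrieved chunks..."]}
def web_search(state): return {"documents": ["...search results..."]}
def generate(state):   return {"answer": "...draft answer..."}

def grade(state) -> str:
    # Stub: route to web search when retrieved docs look irrelevant.
    return "generate" if state["documents"] else "web_search"

graph = StateGraph(RAGState)
graph.add_node("retrieve", retrieve)
graph.add_node("web_search", web_search)
graph.add_node("generate", generate)
graph.set_entry_point("retrieve")
graph.add_conditional_edges("retrieve", grade)  # self-correcting branch
graph.add_edge("web_search", "generate")
graph.add_edge("generate", END)

app = graph.compile()
print(app.invoke({"question": "Who won in 2024?", "documents": [], "answer": ""}))
```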
30x Faster LLM Fine-Tuning with Custom Data: Unsloth, ORPO & Llama3 on Google Colab | LLM | Python
Views: 1.7K • 6 months ago
In this video, I dive deep into the world of fine-tuning Large Language Models (LLMs) using the Odds Ratio Preference Optimization (ORPO) technique for the Llama3 8-billion-parameter model. ORPO takes the best of both worlds, merging the steps of Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) into a streamlined, efficient process. 🔍In this video, we cover: 🚀 Unsloth for Faste...
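In outline, the Unsloth + ORPO combination looks roughly like this; the model, dataset, and all hyperparameters here are assumptions rather than the video's exact settings.

```python
# Outline of ORPO fine-tuning with Unsloth; all settings are assumptions.
from unsloth import FastLanguageModel
from trl import ORPOConfig, ORPOTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # 4-bit quantized base model
    max_seq_length=2048,
    load_in_4bit=True,
)
# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model, r=16, target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# ORPO trains on preference pairs: prompt, chosen, rejected.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train[:1%]")

trainer = ORPOTrainer(
    model=model,
    args=ORPOConfig(output_dir="orpo-out", per_device_train_batch_size=2,
                    max_steps=50, beta=0.1),
    train_dataset=dataset,
    tokenizer=tokenizer,
)
trainer.train()
```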
Fast-track RAG: Chat with SQL Databases using Few-Shot Learning and Gemini | Streamlit | LangChain
Views: 3.6K • 6 months ago
In this video tutorial, I'll guide you through the development of a RAG application designed to chat with SQL databases, eliminating the need for consistently coding complex SQL queries. Employing few-shot learning, I instruct the Large Language Model (LLM) to adapt to specific database schemas through a limited set of examples. Gemini-Pro, a language model from Google, is utilized for this pur...
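A condensed sketch of the approach: feed the schema plus a few example queries to Gemini and run the SQL it writes. The database URI and the few-shot examples are placeholders; it requires a GOOGLE_API_KEY.

```python
# Few-shot text-to-SQL sketch with Gemini; DB and examples are placeholders.
from langchain_community.utilities import SQLDatabase
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI

db = SQLDatabase.from_uri("sqlite:///shop.db")   # hypothetical database
llm = ChatGoogleGenerativeAI(model="gemini-pro")

FEW_SHOTS = """Q: How many customers are there?
SQL: SELECT COUNT(*) FROM customers;"""

prompt = ChatPromptTemplate.from_template(
    "Schema:\n{schema}\n\nExamples:\n{examples}\n\n"
    "Write one SQL query (no explanation) for: {question}")

chain = prompt | llm
sql = chain.invoke({"schema": db.get_table_info(),
                    "examples": FEW_SHOTS,
                    "question": "Total sales last month?"}).content
print(sql, db.run(sql), sep="\n")
```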
Deploy Your LLM in Minutes! LangServe, LangSmith and Ollama Tutorial | AI | Python
Views: 1.2K • 7 months ago
In this tutorial, I'll walk you through the process of utilizing LangServe to transform Large Language Models (LLMs) into an API server. We'll delve into each step, covering how to generate credentials for LangServe, seamlessly integrate LangSmith, utilize Ollama for running models locally without GPU, and explore the LangServe playground. 🔍In this video, we cover: 🔑 LangServe Credential Creati...
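The core LangServe step is small; a minimal sketch follows, where the chain and model name are assumptions. `add_routes` is LangServe's own API for exposing a chain as REST endpoints plus a playground UI.

```python
# Minimal LangServe app: exposes a chain as an API with a /playground UI.
# Run with: python serve.py, then open http://localhost:8000/chat/playground
from fastapi import FastAPI
from langserve import add_routes
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.llms import Ollama

chain = (ChatPromptTemplate.from_template("Tell me about {topic}")
         | Ollama(model="llama2"))  # local model via Ollama, no GPU required

app = FastAPI(title="LLM API Server")
add_routes(app, chain, path="/chat")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```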
Say Goodbye to Manual Entry: Automate Image Data Extraction with Python & Gemini 1.5 | Invoice
Views: 945 • 7 months ago
In this tutorial, we dive into the exciting world of image extraction technology powered by Gemini 1.5, Google's multimodal LLM. Have you ever wondered how to efficiently extract vital information from images, such as invoices? Look no further! Join me as we walk through the process of building a powerful image extractor application from scratch. 🔍 What You'll Learn: -Setting up the Developm...
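The extraction step reduces to a single multimodal call; a sketch, where "invoice.png" and the prompt wording are assumptions and a GOOGLE_API_KEY is required:

```python
# Invoice-field extraction sketch with Gemini 1.5; "invoice.png" is a placeholder.
import os
import google.generativeai as genai
from PIL import Image

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

image = Image.open("invoice.png")
response = model.generate_content(
    [image, "Extract invoice number, date, and total as JSON."])
print(response.text)  # e.g. {"invoice_number": ..., "date": ..., "total": ...}
```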
Go Beyond Text! Process Images with Gemini 1.5 & Python | Multimodal LLM | Image recognition | API
Views: 979 • 7 months ago
In this tutorial, you'll learn how to harness the power of Gemini 1.5, a multimodal language model. Follow along as I guide you through the process of creating credentials, testing results, and exploring the capabilities of Gemini 1.5 for processing text and images. 🔍 What You'll Learn: - Setting Up Credentials: Step-by-step instructions on how to create an API Key to access Gemini 1.5. - Testi...
Integrate Google Search into your LLM | LangChain Agents | Web Search | Python | HuggingFace
Views: 2.8K • 7 months ago
In this tutorial, I'll guide you through the process of integrating an agent into your Large Language Model (LLM) using LangChain. We'll explore step-by-step how to seamlessly incorporate the power of a Google Search Engine into your LLM, enhancing its capabilities and providing richer responses. 🔍 What You'll Learn: - Setting up Google and Search Engine credentials - Integrating Google Search ...
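LangChain's Google Search wrapper keeps the tool part short. A sketch, assuming the GOOGLE_API_KEY and GOOGLE_CSE_ID credentials from the video's setup step are already in the environment:

```python
# Google-Search-as-a-tool sketch; needs GOOGLE_API_KEY and GOOGLE_CSE_ID env vars.
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain_core.tools import Tool

search = GoogleSearchAPIWrapper()
tool = Tool(
    name="google_search",
    description="Search Google for recent results.",
    func=search.run,
)
# The tool can be handed to a LangChain agent; here we just call it directly.
print(tool.run("latest LangChain release"))
```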
How to use Ollama in Python | No GPU required for any LLM | LangChain | LLaMa
Views: 963 • 8 months ago
Welcome to this comprehensive tutorial on Ollama! In this step-by-step guide, I'll walk you through how to use Ollama and everything you need to know to make the most out of it. From downloading and setting up the platform to exploring available models and seamlessly integrating Ollama with LangChain. Stay tuned as we put Ollama to the test with Llama2, boasting 7 billion parameters, all acco...
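For reference, calling Ollama from Python via LangChain is only a few lines; this sketch assumes `ollama pull llama2` has been run and the Ollama server is listening on its default port.

```python
# Minimal Ollama-via-LangChain sketch; assumes the llama2 model was pulled
# and the local Ollama server is running (no GPU required).
from langchain_community.llms import Ollama

llm = Ollama(model="llama2")
print(llm.invoke("Explain embeddings in one sentence."))
```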
Chat with Websites: LangChain and Gemini to Supercharge Websites Chats | Streamlit | LLM | Python
Views: 2.4K • 8 months ago
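A compact sketch of the chat-with-a-website pipeline from this video: scrape, index, retrieve, answer with Gemini. The URL and model names are assumptions, and a GOOGLE_API_KEY is required.

```python
# Chat-with-a-website sketch: scrape, index, retrieve, answer with Gemini.
# The URL and model names are assumptions.
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_google_genai import (ChatGoogleGenerativeAI,
                                    GoogleGenerativeAIEmbeddings)

docs = WebBaseLoader("https://example.com").load()        # scrape the page
chunks = RecursiveCharacterTextSplitter(chunk_size=1000).split_documents(docs)
store = FAISS.from_documents(
    chunks, GoogleGenerativeAIEmbeddings(model="models/embedding-001"))

question = "What is this site about?"
context = "\n".join(d.page_content
                    for d in store.as_retriever().invoke(question))
answer = ChatGoogleGenerativeAI(model="gemini-pro").invoke(
    f"Context:\n{context}\n\nQuestion: {question}")
print(answer.content)
```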
Can we switch Groq with a free local AI model from Ollama or something?
Excellent! There is Polars GPU now available via RAPIDS AI in its beta release; you could test that too and compare, that would be great!
THAT IS AMAZING! THX BRO!
Nice tutorial!
Is this also using Google Cloud? Because I got a notebook auth error.
Hi Eduardo, thanks for your video. What if the number of tokens of the responses exceeds the context limit? And using many LLMs instead of one will increase the environmental footprint a lot. Is this approach not too energy-consuming? And too expensive? And are the results worth the effort? And will users wait that long for the answers?
great video!! thanks for the explanation
Thank you, glad it was helpful!
Hi Eduardo, this is a really nice video, thank you. Do you think you could add a citation functionality, so that the user gets reassured about where the information was taken from? Thanks
Great video! But how do you break a loop after a few trials if the model gets stuck in an infinite loop during hallucination grading or answer-relevance grading?
Hi bro, I need your help to set mine up. Please contact me ASAP if possible.
Hi Eduardo, I loved the video. As a suggestion, you could add the ElevenLabs API to improve the chatbot's voice and make it more "human". I wonder whether that would raise the latency too much. Anyway, best regards and a very good video, thanks.
Thanks for the suggestion; I'll probably upload a video using that API and comparing the latency against other alternatives.
Great tutorial
Thank you!
Thank you!👏
You're welcome!
When creating all of these agents, creators should include token costs.
Finally an app to send to all of my non-Brazilian friends before the World Cup! Amazing video 😄
Haha, thank you. Now your non-Brazilian friends have no excuse to skip the games.
Where can we see the eval loss, sir? W&B doesn't show it either. It seems that specifying the evaluation dataset is redundant.
Where is the PyCharm extension from?
It's from Continue.
This is awesome! Very good alternative for GPT-4o! It’s incredible how easy you make it for us!
It really is. I'm glad you like it!
GREAT!
Thanks!
great, useful 😁
Glad it was helpful!
This helps me a lot, it's great.
Thanks!!
Amazing! Thanks
Thank you!
Excellent , thank you!
Glad you liked it!
Can you make an example using only local LLMs and local agents, so no API keys (and no costs) are needed? That would be amazing!
Yes, I'll have it in mind for the next video!
@eduardov01 Amazing!!
Please could you share the link here?
Awesome
Thank you!!
Thank you for this video. Very interesting.
Glad you enjoyed it!
Nice one. Question: what if all the docs are marked as irrelevant chunks by the model, do you need to query the vector DB again? I guess an improvement may be to include a HyDE model in between to improve the questions and keep trying to get different chunks from the DB?
It'll perform a web search to find the relevant information (the node that has the Agent). Yes, that could be an option too.
Very nice; the only challenge with this approach is the total cost of answering each query, and it could run forever in some cases till both LLMs agree or till you get the right relevant information from the search. I think if customers want a 100% guarantee and are not worried about latency, this will work really well.
Indeed, it'll depend on the use case that you have, because in some cases you wouldn't sacrifice the quality of the responses for the speed.
Surely this approach becomes more and more viable as the cost of newly released models keeps decreasing by 5x, 10x, etc., as we are currently seeing? So the cost of this multi-shot RAG approach with a new model 5x cheaper is still less expensive than a single shot of its more expensive predecessor?
Exactly!
Do any services already provide "Web Search" as a tool via a GUI at the moment? Because it seems only a matter of time before coding these tools will no longer be needed. Like weather forecast tools or similar.
The advantage of incorporating this agent into your pipeline is that it allows you to retrieve the latest information that LLMs may not have. For instance, ChatGPT (with GPT-4) uses Bing search to answer questions about recent events, as the LLM wasn't trained with that data.
Woah, that's nice! I don't like Copilot because of the lack of control... This changes everything.
Indeed, with this option you can have any proprietary/open-source model available for you all the time.
Awesome video. Thank you.
Glad you liked it!
Can you come up with an SQL agent chat with Llama3?
Yes, that's a valid approach.
Will it work if I have more tables in the database?
Yes, you can add as many tables as you like. The function that retrieves the schema will provide all the columns and tables as input to the LLM. You only need to add a few example SQL queries (few shots) for those tables so the LLM can understand how to JOIN them if necessary.
Amazing work, keep up the good work!
Thanks, I'm glad you liked it!
Very helpful! With the amount of data being handled, these comparisons help us make better decisions on how to structure our solutions! Thank you Eduardo!
Indeed, when we're dealing with large datasets it's very important to optimize our code in terms of speed and memory.
Hey, is this using GPT-4o in the workflow?
No, it's using Whisper from OpenAI.
Great video, very nice
Thank you very much!
langchain.chains LLMChain doesn't work anymore; I get the following error:
ValidationError: 2 validation errors for LLMChain
prompt: Can't instantiate abstract class BasePromptTemplate with abstract methods format, format_prompt (type=type_error)
llm: Can't instantiate abstract class BaseLanguageModel with abstract methods agenerate_prompt, apredict, apredict_messages, generate_prompt, invoke, predict, predict_messages (type=type_error)
Is there a solution?
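A likely fix, not from the video but a common migration: this error usually means the prompt or llm object isn't a concrete instance, and in recent LangChain versions the LCEL pipe syntax replaces LLMChain anyway. A minimal sketch, assuming a local Ollama model as the backend:

```python
# Hedged sketch of the LCEL replacement for the deprecated LLMChain.
# Assumes langchain-core/langchain-community are installed and an Ollama
# server with the "llama2" model is running locally (hypothetical setup).
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.llms import Ollama

prompt = ChatPromptTemplate.from_template("Answer briefly: {question}")
llm = Ollama(model="llama2")

# The pipe composes prompt -> llm, replacing LLMChain(prompt=..., llm=...).
chain = prompt | llm
print(chain.invoke({"question": "What is LangChain?"}))
```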
Explained very well. What is Unsloth? Heard of it for the first time.
Thanks. It's a library that optimizes (by manually deriving all the compute-heavy maths) the fine-tuning and inference process of some LLMs.
Thanks, good flow between RAG and web search, thanks!! :)
Thank you. I'm glad you found it interesting!
Very nice!! How can I speak with you? I want to talk about Python projects.
You can contact me on LinkedIn: www.linkedin.com/in/eduardo-vasquez-n/
Great video! But I have a question I hope you can answer and help me with. Why is it answering so slowly? Is that normal for the architecture, or is there another reason, and can we do something to fix that?
We have 5 LLMs generating answers, plus a retriever, plus a web search that is performed when the question is not in the vector store database, and we also store the web search results in the database; all these steps can take some time. To make it faster, you can use fewer LLMs and maybe skip the web search, depending on your use case.
Nice use case, Eduardo, keep it up!
Thank you, much appreciated!
Wonderful project
Thank you!
Thanks a lot ! Really Great!!
Thank you!! I'm glad you liked it!
The teacher explains really well.
Thank you so much!
Nice tutorial! Thank you! I will now watch and try your other videos.
I'm glad you liked it. Thank you for the support!
Please upload the next part about adding the few-shot examples to a vector DB; it would be really helpful :-)
Thank you for the comment! I'll be making this video soon.
How does the second model know the initial question if only the SQL response was provided?
That's a good remark. Currently, the second model makes an assumption about the initial question based solely on the SQL response provided. For a robust approach, the initial question needs to be added to the prompt of the chain_query function. By including both the initial question and the SQL response as input fields, the final answer will be more accurate.
I like the idea of putting the few-shot examples into a vector database. That would be a nice video to make.
I'll definitely consider making it. Stay tuned!