- Видео 26
- Просмотров 59 885
Enric Domingo - AI Engineering
Испания
Добавлен 11 июн 2008
AI, MLOps, Data, Bio, and Software Engineering
Code a Vision LLM Agent that plays GeoGuessr using your PC (GPT-4o, Claude 3.5, and Gemini 1.5)
How to code an AI bot that plays autonomously the GeoGuessr game using Multimodal Vision LLMs that take screenshots of the game with Python + LangChain and auto click into the mini-map region taking control of the mouse (no human interaction needed!).
Code: github.com/enricd/geoguessr_ai_bot
Blog: medium.com/@enricdomingo/coding-a-geoguessr-autonomous-ai-bot-with-vision-llms-gpt-4o-claude-3-5-and-gemini-1-5-908faf3bc3c7
GeoGuessr Game: geoguessr.com
Poker AI Bot Video: ruclips.net/video/xb88cPyeNe0/видео.html
In this tutorial, we walk through the process of developing an AI bot for GeoGuessr, leveraging the power of advanced Vision Large Language Models (LLMs) like GPT-4o, Claude 3.5 Sonnet, a...
Code: github.com/enricd/geoguessr_ai_bot
Blog: medium.com/@enricdomingo/coding-a-geoguessr-autonomous-ai-bot-with-vision-llms-gpt-4o-claude-3-5-and-gemini-1-5-908faf3bc3c7
GeoGuessr Game: geoguessr.com
Poker AI Bot Video: ruclips.net/video/xb88cPyeNe0/видео.html
In this tutorial, we walk through the process of developing an AI bot for GeoGuessr, leveraging the power of advanced Vision Large Language Models (LLMs) like GPT-4o, Claude 3.5 Sonnet, a...
Просмотров: 1 420
Видео
Deploy Python LLM Apps on Azure Web App (GPT-4o Azure OpenAI and SSO auth)
Просмотров 1,9 тыс.2 месяца назад
Learn how to build and deploy a Streamlit LangChain application on Azure Web App using App Service Plan, Azure OpenAI, SSO Entra ID Authentication, and GitHub Actions CI/CD. A high performance and secure LLM chatbot, capable of answering questions but also do RAG on custom documents, info and websites in a safe way. Code: github.com/enricd/rag_llm_app Blog: medium.com/@enricdomingo/deploy-a-str...
Program a RAG LLM Chat App with LangChain, Streamlit, OpenAI and Anthropic APIs
Просмотров 4,1 тыс.2 месяца назад
Learn how to build and deploy a RAG web application using Python, Streamlit and LangChain, so multiple users can chat with Documents, Websites and other custom data online. Code: github.com/enricd/rag_llm_app Blog: medium.com/@enricdomingo/program-a-rag-llm-chat-app-with-langchain-streamlit-o1-gtp-4o-and-claude-3-5-529f0f164a5e The RAG LLM Streamlit App: rag-llm-app.streamlit.app/ In this RAG L...
PyTorch Intro Tutorial 🔥 Code and Train a basic Neural Network (with free GPU!)
Просмотров 4583 месяца назад
Learn the PyTorch TOP 10 Basic Concepts and Build a Neural Network from scratch! Kaggle Notebook: www.kaggle.com/code/edomingo/intro-to-pytorch-in-10-code-cells Blog: medium.com/@enricdomingo/intro-to-pytorch-create-and-train-a-basic-neural-network-with-free-gpu-cd9cfc673cbd This programming tutorial video is a quick introduction to PyTorch! A Deep Learning framework, released by Facebook (Meta...
Claude 3.5 Sonnet API: Integrate The Best LLM into your App (Anthropic Python tutorial)
Просмотров 5 тыс.5 месяцев назад
In this video we will see the new model from Anthropic, Claude 3.5 Sonnet, which currently is the best LLM in the benchmarks. We will learn how to use it with code in Python, calling to the Claude API for free, and then we will integrate it to The OmniChat Python Streamlit web app together with the OpenAI and Google models, and how to push and deploy our chatbot app for free. Blog: medium.com/@...
Gemini 1.5 API: Chat with Videos, Images and Audios in your app (Gemini API Code tutorial)
Просмотров 2,6 тыс.5 месяцев назад
Tutorial exploring the basics of the new Google Gemini 1.5 Pro and Flash models, how they compare to the OpenAI GPT-4o and GPT-4 Turbo, how to get the Google API Key, how to send requests to the Gemini API, and chatting with all kinds of files (videos, images, audios, etc.). Finally, we implement the Gemini 1.5 API models to the OmniChat App, a custom Streamlit Python webapp chatbot that we sta...
Code your online Multimodal ChatGPT App in Python (GPT-4o API with Images and Audio)
Просмотров 6 тыс.6 месяцев назад
Video intro to GPT-4o, developing from scratch a chat web app with Streamlit, sending images, text and audio to the OpenAI API. Finally, you will upload the project into your GitHub and deploy this web app for free. Medium blog: medium.com/@enricdomingo/code-the-omnichat-app-integrating-gpt-4o-your-python-chatgpt-d399b90d178e GitHub: github.com/enricd/the_omnichat The OmniChat app: the-omnichat...
GPT-4 Turbo Agent Plays the 1vs1 Snake Game (Python webapp) - Your AI Portfolio #1
Просмотров 73311 месяцев назад
Blog: medium.com/@enricdomingo/code-the-llms-snake-arena-webapp-gpt-4-vs-turbo-your-ai-portfolio-1-3a29de85d983 Data and AI Portfolio Code: github.com/enricd/enricd_streamlit_portfolio The online Data and AI web Portfolio: enricd.streamlit.app LLMs Arena Code: github.com/enricd/st_llms_arena The LLMs Arena webapp: llms-arena.streamlit.app/ Donate: buymeacoffee.com/edomingodot In this video, we ...
Coding a Vision ChatGPT that plays Poker Autonomously! (GPT-4V Python tutorial)
Просмотров 10 тыс.Год назад
Explaining how to make GPT-4V play Poker for me (Autonomous Vision bot in Python) Blog: medium.com/@enricdomingo/making-gpt-4v-to-play-poker-for-me-automatic-vision-bot-in-python-69031c79e733 Donate: buymeacoffee.com/edomingodot NEW VIDEO - Updated video of how to build a similar bot but for GeoGuessr, with the new OpenAI GPT-4o, Anthropic Claude 3.5 and Google Gemini 1.5 Pro: ruclips.net/video...
Local ChatGPT on MacBook Air M2 - Running the best Open 13B LLM with llama.cpp from scratch
Просмотров 5 тыс.Год назад
Local ChatGPT on MacBook Air M2 - Running the best Open 13B LLM with llama.cpp from scratch
🧑💻 Code your web PORTFOLIO for Data and AI in 2024 in Python Streamlit 🐍 - (Deploy FOR FREE!! 🤑)
Просмотров 1,9 тыс.Год назад
🧑💻 Code your web PORTFOLIO for Data and AI in 2024 in Python Streamlit 🐍 - (Deploy FOR FREE!! 🤑)
Intro to Streamlit - Create your first data web-apps (only!) with Python 📊💻🐍
Просмотров 1 тыс.Год назад
Intro to Streamlit - Create your first data web-apps (only!) with Python 📊💻🐍
Python Functions, ASCII and f-strings - Advent of Python 🐍 Course - Day 3
Просмотров 115Год назад
Python Functions, ASCII and f-strings - Advent of Python 🐍 Course - Day 3
Python Dictionaries - Advent of Python 🐍 Course - Day 2
Просмотров 97Год назад
Python Dictionaries - Advent of Python 🐍 Course - Day 2
Python intro, setup and basics - Advent of Python 🐍 Course - Day 1
Просмотров 1992 года назад
Python intro, setup and basics - Advent of Python 🐍 Course - Day 1
Participate in the Advent of Code 2022 - Learning Python
Просмотров 5942 года назад
Participate in the Advent of Code 2022 - Learning Python
Electrónica avión RC (y drone multirotor) - Esquema básico
Просмотров 2,9 тыс.8 лет назад
Electrónica avión RC (y drone multirotor) - Esquema básico
Eachine racer 250 FPV with xiaomi yi - music: video games (HD 1080p)
Просмотров 6 тыс.9 лет назад
Eachine racer 250 FPV with xiaomi yi - music: video games (HD 1080p)
Tijoco Alto Team - Paper Air Challenge Q3
Просмотров 6019 лет назад
Tijoco Alto Team - Paper Air Challenge Q3
Thank you for the detailed explanation. do you have any content on how to perform load testing for the Streamlit chat application?
thanks! not really on load testing yet, it's quite a niche and advanced topic which mostly depends on the machine where the apps is running, even so, in the next video from that one I showed how to deploy that app on Azure and how to chose the size of cpu and ram of the instance, and how to check the current load on cpu and memory, but I didn't perform load testing.
sick
Thank You Very much for explaining very clearly. For me App worked locally, and I am using python 3.13 . To deploy to streamlit cloud i am facing few issues with version. do we need python version lesser than 3.11 to deploy to streamlit.
thanks! 3.10 to 3.12 will work for sure, I'm not sure with 3.13 as it was released 1-2 months ago and some libraries are not yet updated to work with it, I'm still using 3.11 for much of my work and apps.
@@enricd Thanks for the reply. Looking forward for more videos from you.
Argentinian? Just wandering 😅
Barcelona :)
Awsome
Is there any way to do it without giving my credit card to open AI? In the first try I got the "You exceeded your current quota" message
Hi! you can do it with Anthropic otherwise, although you will need to find some other embeddings model. You can also check my older video about building a chat with Google Gemini that also allows to upload docs. I will try to do a video on how to this locally using open source (free) LLMs as well at some point.
And do you know why if we use Ollma, why do we need to use OpenAI? I thought it was enough just with Ollama.
Got the Video Great Haaa ✅
Can you make with Google gemini models or other models this are paid models.
Hey! yes, I will try to do maybe a video doing RAG local with open source models for example whenever I have time for it. Even so, you can check a video in my channel about building a chatbot with Gemini where I already showed how to upload docs into it, which is not exactly the same as the RAG in here, but very similar :)
Really good work. Could you do AWS and GCP versions as well ?
Thanks! I will try to do at least the AWS video at some point
fantastic
getting error creating vector_db says "The onnxruntime python package is not installed." But its already installed. I'm using python3.11
Hi! What Operating System are you using? Try manually downgrading the onnxruntime python library one or a few versions and checking if this fixes it. For example, if you currently have the version 1.19.2 (you can check your current version with "pip show onnxruntime" or "pip freeze"), then try "pip install --upgrade onnxruntime==1.19.1", and if it still don't works, try 1.18.0 for example
👉You can check my latest video on how to create a similar AI Agent for the GeoGuessr game, with the best and latest vision LLMs from OpenAI, Anthropic and Google DeepMind: ruclips.net/video/OyDfr0xIhss/видео.html If you are interested in building a Poker Bot today, I recommend combining the knowledge from both videos for a much better result 💪
ruclips.net/video/9CS7j5I6aOc/видео.html
Superb video, Bro! Could you please create a video tutorial on how to deploy on Google Firebase or Google Cloud Platform?
Thanks! I will add it to the list of ideas :) but not sure if I will have time for it any time soon
Could you please make a video on how to upload a vector store database to the cloud for reuse in the future?
thanks for the idea! There are some different options out there for this, the easier ones are SAAS startups like Pinecone that hosts the cloud vector store for you, then you can also store it in any cloud storage option and also have a microservice or an api serving it, having the docs there. The thing is that there is not a single solution for this, it depends on your use case, how often the vector store data needs to be updated, how many users you will have, how much docs they need to query, what is your budget, etc.
@@enricd so, you should make video for the rest of this world, Sir! :)
Is there a reason why we're not using Google's Gemini?
It's possible to use it but it also has some other custom ways of doing it uploading docs to google cloud and also openai and anthropic/claude are preferred for many people. You can check my Google Gemini video in the channel where I showed how to upload docs into it and chat with them :)
You don't want to upload your stuff to the cloud man...not if you value your privacy.....I'm going to use this video but keep it local.
@@AaronBlox-h2t it depends.. I dont mind uploading some cooking recipes pdf to openAI or some public papers to chat with them, if I want a little bit more of security, then I would do what I did in the next video and build this on Azure, and finally if I really want privacy with my sensitive/private/secret data I would do it locally using open LLMs but the result will be slightly worse
You are amazing! Video is very well explained how to implement in workplace. Something I haven't seen many RUclipsrs mention. I appreciate this 🙏🏾. Does your LLM even RAG bro? 😂😂😅 . Yes it does with limited tokens cause I'm cheap.
😂😂 Thanks!!
Hey enric ! i tried to know about u r omnichat app and i created a google api key i given it to the app when i try to upload my multimedia content i t gives me : google.api_core.exceptions.FailedPrecondition: This app has encountered an error. The original error message is redacted to prevent data leaks. Full error details have been recorded in the logs (if you're on Streamlit Cloud, click on 'Manage app' in the lower right of your app). help me
Thanks you
Looks great. When loading a PDF I got "Error loading document sample.pdf: cryptography>=3.1 is required for AES algorithm".
Thanks! Try to "pip install cryptography" in the terminal with the venv active if you have created it, and also add it to the requirements.txt file if you want to upload the web app on streamlit cloud or somewhere else. Let me know if that fixes it :)
@@enricd It does, thank you! I'm exploring it and I'm really impressed so far. I will git it a try with 250 1.5 mb pdfs file eventually. Is it possible to configure it so if the requested info is not in the PDFs he simply mention it without using info from outside?
Oups, I get "ValueError: Batch size 664 exceeds maximum batch size 166" when trying to upload 8 PDFS of 1.5mb. Do you know what the limitation are?
About the error, it seems to be in the OpenAI Embeddings model, you would need either to upload them in small groups, or to change the loading function so it doesn't try to load them all at once to the vector store, but one by one. Not completely sure about it, but it's my guess.
I'm not really sure if I get the question, do you want it to answer only using info from the RAG PDFs (not it's own learnt knowledge), or do you mean to go to the internet or ask you whenever the info it needs for answering your question is not in the RAG PDFs?
What is your system specifications
do you mean my computer parts? for these demos you don't need much, the LLMs run on the cloud from the OpenAI API and the RAG with few files doesn't take much RAM or CPU, I have a pretty powerful desktop PC with Windows 11, but this can run in almost any device with Python 3.11 or similar. If you are in Mac or Linux, maybe few commands or Python libraries may be slightly different, but 99% of it would be the same
Any chance of an updated version seems like the API has changed a lot.
not sure, in few days I think I will upload a video doing something similar but in another game/app, and maybe in some weeks/months I will check the rules and regulations in the more popular poker online platforms about using bots and assistants, and if I see it clear I will try, but I think they don't like you to do this kind of stuff :/ and I also never play poker :)
Managed to get it working in the end and have even made a basic interface so you can select the table and turn indicator area. Currently experimenting with pre programmed pre-flop decisions to save on API costs. Been really enjoying this project so thanks for the inspiration.
@@superiorchaos awesome! thanks for sharing, there is so much room for improvement and building features on top of it. In addition, I'm pretty sure that LLMs aren't the most optimal solution or algo to play poker, it would be great to mix the more optimal strategies with LLMs somehow to extract the best of both worlds.
We are waiting continues update in videos and blog❤.
thanks @sitheekmohamedarsath ! it's in the oven :P
Great explanation ❤
Amazing Enric as always
Great work
Really helpful, can you please make a video to show how to upload PDFs for this? And to add maybe langchain/llama index for better PDF analysis?
Thanks! This week I plan to upload a video on how to do RAG with LangChain and GPT-4o and Claude 3.5, with docs like PDFs, .docx, .txt, markdown and websites content
@@enricd Sounds amazing, I really prefer your method of talking through and just about the right pace for someone like me ith limited coding knowledge. For this model could you just add PDFs to the gemini LLM or would you need langchain or llma index to incorportae PDF access for this omnichat?
Kindly Use Google Gemini Also if possible it'll be helpful @@enricd
buen video! consulta, haces consultorias de arquitecturas para startups?
muchas gracias Leonidas! escríbeme a contact.enricd@gmail.com y le echamos un ojo :)
Has anyone tried a poker bot in live play and had success. Is the AI better for tournament style play or for cash games?
Hi! I don't know much about poker beyond the basics, but you can probably tune this behavior in the prompt instructions, telling the bot for example to be more conservative, more risky or some more elaborated instructions to perform in a certain style as you desire. At first maybe won't work perfectly, but then you can iterate that prompt to improve it little by little until it works as you want. It would be nice if you try it and comment here how it goes :)
@@enricd it would definitely help out as a tool to give you the best technical analysis of each decision. I don't know if I would ever be comfortable letting it make the choices for you. Mainly in a cash game where everything you bet is your actual money. But there are many tournaments where you just buy-in for the cost of the entry fee which sometimes are for 1 dollar or less. With that style I would be willing to see how it can perform all on its own. Plus this video is 9 months old. The poker skills have likely greatly improved since you made this video. Unfortunately I'm not anywhere close to the skill level needed to program this on my own. Do you have a copy of the code to just use it based on manual inputs and skipping the screen capture parts? When I have asked ChatGPT about NLTH before it didn't seem that skilled in it enough to be able to play autonomously. Is there a certain prompt or version required in order for ChatGPT to provide the best decision making the most up to date model that works?
great !!! thank you for sharing Sir
Wondering, if there is only this way to pass the image in the API? I mean, one with passing the URL from the internet, and second is what you have shown. In the shown method we convert into the byte string and so on. Wondering if there is any other way to send the local image to the API. Please share, thanks.
Hi aayushsmarten, here you can fin the OpenAI API docs: platform.openai.com/docs/guides/vision . As far as I know, there are only these 2 options to send images to the API, the direct URL and the base64 encoding of it. :)
Can this be extended to chat with websites also. Please suggest
post Advanced Claude API course for developers...
Terrific tutorial, love it👍👍
Can you continue this where we can upload several PDFs for example to do RAG on them?
done :) ruclips.net/video/abMwFViFFhI/видео.html
Great video! can you add some tutorial where we can made the model's knowledge based on our own data?
done :) ruclips.net/video/abMwFViFFhI/видео.html
Very detailed, nice! My issue is with max tokens since I wanted to use the API not for chatbot style application but for processing documents and apparently the max is 4096 which isn't nearly enough, and splitting into chunks is screwing up the processed result, compared to what I get from manually doing it through Claude. So sad....
totally.. :/ even so you can try to refine the chain of getting one partial output into the next input with proper instructions and concatenating everything in the end, not easy to do but with enough rails and treaks could works I would say :)
@@enricd hmmm maybe... Would take much API coat just trying to get a good result haha and I think it's not worth it for me at the moment.
it's really good information video 👏♥
Hey man! Could you provide information on approximately how many prompts can be utilized with the $5 free credits? Additionally, is this offer available in all regions, or is it limited to specific areas?
I mean Multimodal queries
As a reference, an image of 1000x1000px would take 1400 tokens, so 1000 image would cost 4 dollars. You can get more info in here: docs.anthropic.com/en/docs/build-with-claude/vision
thank you so much for this video. I can tell claude will be important in the near future. Though I wanna ask, is there a way to make it send files and get back files? I'm currently working on a project using claude's api
the model itself from the api I would say it's not capable of storing files or keeping them, you have to manage such functionalities in a backend logic in any cloud or your own computer. Like doing a vector database for example and the RAG logic.
code ? for download !
no code for download ! :) sorry
Thanks, for your video!
Excellent video👍 if u deploy at streamlit is the url publicly available? How to deploy from you're own server?
Thanks! when you deploy your app into the Streamlit Community Cloud, the <your-sub-domain>.streamlit.app URL is available from anywhere in the internet, I have some other videos in my channel showing to do it. If you want to deploy it somewhere else you can do it either from your own server (which could be risky as you are opening it to the public internet) or you can also deploy it on AWS, Azure, GCP, Heroku and some other public clouds.
@@enricd thanks. for anthropic I get an error about billing "credit balance too low.." . I did create an account and succesfully claimed the $5. Is this not sufficient?
Great Video! Very simple and straight forward. Question: Have you thought about how you can show those responses with images from the model? Also, how about file upload (csv, pdf, etc)? will they all still be in the same format?
Thanks! GPT-4o doesn't produce images, it could generate the prompt for dalle-3 or any txt2img model to generate them. If you want to send it a document, you could use the langchain document loaders for it :)
Thank you. Clear and straight to the point.
Great video! Maybe the next time use langchain for interfacing for better flexibility when using other multi modal LLM. Now you have Gemini flash/pro and Claude 3.5 sonnet multi models beside openai 4o model.
totally! that's something I had in mind, but it would add some more complexity (although the interface and integration would be better) and also when dealing with Gemini 1.5 2 weeks ago, I found that langchain was not yet capable of dealing with upload files like videos, audios, docs and so, so it was still for the Gemini 1.0 capabilities but not for the 1.5. But yes, it would have some advantages and depending on the project, I would use it and also structure better the code with more abstractions and a better architecture. Here I wanted to show a basic and plain integration of it.
Nice Job, but unfortunatly i have an issue when i claim my 5$ ... the code I received didn't work and I can't receive a new one 😢
Hi Vincent, I think whenever you create an account with an email, you already have the $5, but I'm not sure. Try with a new email, and check if this is also available in your country as maybe it's only available in certain regions. I hope you will be able to fix it :)
i have the same problem. i verified my phone number and still couldn't claim it
@@maiseja9987 support.anthropic.com/en/articles/8994925-how-do-i-claim-my-free-credits this is what they say in their FAQs
@@enricdthat didn’t work for me and their support is unresponsive. My phone number is now registered, I cannot re register it, and no credit.
Perfect ,great work always
Good Work!!
Amazing well done !!!