- Videos: 49
- Views: 211,298
Fast and Simple Development
USA
Joined Jan 6, 2023
Fast and Simple Development presents software development skills and techniques for highly productive development, focused on APIs, microservices, Spring Boot, Java, MongoDB, Docker, React and AWS.
In this channel I share pro tips and techniques for Java, Spring Boot, microservices, Docker, AWS, MongoDB, SES, Twilio, Bit.ly and much more.
If you want to grow your software development skills super fast, you've come to the right place.
Overall, I'll guide you to become a better software developer. If you are interested in this channel, make sure to subscribe and click the notification button so you never miss one of my videos! And yes, it's FREE!
Stay with us:
Udemy classes: www.udemy.com/user/tomjay2
Website: www.thomasjayconsulting.com/
Twitter: startupdev
Mastering PodMan & Spring Boot 3 Like an Expert
Open containers are the new solution: no more paying for Docker. Podman from Red Hat is the solution that most Java developers are moving to.
Creating a Spring Boot 3 microservice and hosting it in a Podman container takes only a few minutes.
#docker #java #microservices #podman
Views: 478
Videos
ESP32S3: The Coolest Temp Monitoring System
99 views, 5 months ago
Learn how to create a temperature monitoring system with the XIAO ESP32S3 board for under $30 and post data to a Python REST API. Perfect for monitoring data centers and more! In only a few minutes, learn how to set up an ESP32S3 XIAO device and collect temperature and humidity data that you can send to a server. You will learn how to set up a server using Python Flask to receive the information....
Discover the Secrets of Spring AI 1.0, SpringBoot, Java, Ollama/Llama3, API Creation and RAG Basics
2.2K views, 7 months ago
Discover the secrets of Spring AI 1.0, Spring Boot, Java, Ollama/Llama3, API creation, and RAG basics in this informative video. Learn the fast and simple way to develop Java enterprise applications using Spring Boot and Spring AI 1.0, with the Ollama LLM running the Llama3 model locally. Learn to create an end-to-end API and the basic...
Spring Boot: The JSON Log Revolution
505 views, 7 months ago
Need JSON-formatted logs to deploy your application in your new Docker environment? I've got you covered: generate JSON logs in 5 minutes by following these simple instructions. Spring Boot 3 has some powerful library components; team it up with Lombok, Slf4j and Logback and the task is simple. Turn off the banner (spring.main.banner-mode=off), add the pom.xml dependency, create a logback.xml file in /resources ...
Transform Ollama AI with Llama3 LLM into a Speaking Genius
6K views, 7 months ago
Transforming an Ollama AI with Llama 3 into a Speaking Genius! Join me on a fascinating journey as we equip a llama with speech software using Python. Witness the llama answering questions with a professional voice and even connecting to a remote llama server. Subscribe for more tech adventures! By using the ElevenLabs API you can quickly create a Flask API system calling Ollama / Llama3 to process tex...
Unveiling RAG: Chat History Follow-Up Ollama + LangChain
2.6K views, 8 months ago
Learn how to update your RAG application with chat history for follow-up questions on your PDF documents! Follow along as we add chat history to your project step by step. Don't forget to subscribe for more videos like this! Follow Up Questions in your LangChain / Ollama / ChromaDB applications. With only a few lines of code you can now have follow up questions in your Retriever. Source code: g...
Understand Ollama and LangChain Chat History in 10 minutes
5K views, 8 months ago
Discover the magic of adding chat history to your Ollama application with LangChain. Follow along as we implement chat history and create a chat prompt template. Subscribe for more AI development tips and tricks! The number one question I get about Ollama with Llama3 and LangChain is how to get chat history working. It's so simple once you understand the basics. In only a few minutes I will ex...
Epic Google Authenticator hacks for Spring Boot and Java
865 views, 8 months ago
Learn how to implement Google Authenticator for your Spring Boot and Java applications with these epic hacks. Enhance your app's security today! Want to learn how to use the Google Authenticator app for 2FA with Spring Boot? Here are the secrets exposed. One simple thing saves hours and days of research and makes it all work. Here is the code - github.com/ThomasJay/Fast2FAGA #java #Springboot #se...
Llama3 Full Rag - API with Ollama, LangChain and ChromaDB with Flask API and PDF upload
54K views, 8 months ago
Get ready to dive into the world of RAG with Llama3! Learn how to set up an API using Ollama, LangChain, and ChromaDB, all while incorporating Flask and PDF uploads. Subscribe now for more exciting tutorials like this one! Complete Llama3 PDF RAG (Retrieval Augmented Generation) API application. Python Flask API, Ollama, Llama3, LangChain and ChromaDB with PDF integration. Complete system from...
Customize your Spring Boot Banner in 5 minutes
198 views, 9 months ago
Spring Boot has a banner message that is printed in the log output when it starts. You can easily customize this information. This normally goes to the stdout log, which usually gets processed by your log collection. Adding little things like this makes you feel like you own the code, and as you start understanding all these hidden gems your application starts to look more and more professional. #sprin...
Expert Guide: Installing Ollama LLM with GPU on AWS in Just 10 Mins
10K views, 10 months ago
Learn how to install Ollama LLM with GPU on AWS in just 10 minutes! Follow this expert guide to set up a powerful virtual private LLM server for fast and efficient deep learning. Unlock the full potential of your AI projects with Ollama and AWS. #ai #llm #gpu
Learn how to become an Expert at Travel Planning with Tommy Travel AI
162 views, 1 year ago
Discover the Ultimate AI Travel Planner and plan your next adventure with the ultimate AI travel planner, Tommy Travel AI! This innovative tool brings together ChatGPT AI integration and real-time information to help you create the perfect itinerary. Whether you're looking for activities, checking the weather, or booking flights and hotels, Tommy Travel AI has got you covered. Get ready for a s...
Supercharge your Spring Boot applications with GZIP encoding for Rest API and Web pages
475 views, 1 year ago
Spring Boot REST APIs are one of the most popular solutions for microservices today. Using Java is the standard in so many companies. Even with so many years of experience, there are still little things that make a huge difference in performance, and adding GZIP encoding for responses is one of those big wins. #Java #springboot #development
Integrate Bootstrap 5 with Spring Boot & Thymeleaf
6K views, 1 year ago
Embark on a journey to seamlessly integrate Bootstrap 5 with Spring Boot & Thymeleaf in your Java projects! In this comprehensive tutorial, I dive deep into the world of Spring Boot, crafting a practical application with Thymeleaf layouts and the stylish touches of Bootstrap 5. Whether you're just starting out or looking to refine your skills, this guide is tailored to foster your growth in sof...
🚀 Skyrocket Your App Performance with Big Query and Spring Boot! ✔📈
4K views, 1 year ago
Are you struggling with optimizing your app's performance? Look no further! In this informative YouTube video, we will show you how to utilize Big Query and Spring Boot to skyrocket your app's performance to new heights. Big Query is a powerful tool for processing and analyzing massive datasets, while Spring Boot provides a robust framework for building and deploying high-performance applicatio...
Master Node.js API in 2023: Expert Tips Revealed
1.4K views, 1 year ago
Unlock the Hidden Power of Spring Boot Retry
1K views, 1 year ago
Master Terraform Essentials for Java Developers
5K views, 1 year ago
Eliminate CORS Issues: Single Jar Solution for React and Router (6.4) in Spring Boot App
2.3K views, 1 year ago
Revolutionize Your Microservices Architecture: How Spring Boot's Property Secrets Transform Dev
442 views, 1 year ago
Fix CORS issues - React to Spring Boot with the React Proxy
2.8K views, 1 year ago
Send Awesome HTML Emails - AWS SES and Spring Boot 3 / JDK 20
4.5K views, 1 year ago
First Look - Spring Boot 3 and JDK 20 in 2023
759 views, 1 year ago
Dockerize any Spring Boot Microservice in minutes
3.1K views, 1 year ago
What you need to know about Spring Boot & Kafka
648 views, 1 year ago
Boost Your Development Skills: Discover the Art of Using Multiple Databases in Spring Boot
1.6K views, 1 year ago
The secret to Java Streams will change your life
1.5K views, 1 year ago
Encrypt your Properties with Jasypt and Spring Boot
9K views, 1 year ago
How to integrate it in flutter
NOICE! tysm for this :D
Bro ❤ great tutorial. Quick and easy
Heh choice of model makes a huge difference, I'm just getting set up, doing init testing, I asked the model I downloaded 'what day is after monday' and it thought about it for [no joke] 1 minute 45 seconds, then spent the next 3mins spitting out 400 words explaining how weeks work (yeah, running on a cpu)... I think I'll shop around for a more concise model ;)
I use Llama3.2 for most of my work. I run this on a machine with a GPU, answers are normally 1 - 4 seconds. There are hundreds of models and sizes to choose from with Ollama
@@fastandsimpledevelopment Thanks for the heads up, gives me confidence that I might have already found one to stick with. That being llama3.1-python (as I'll be using it as a .py assistant); it's quicker and doesn't make me read pages every time :)
Thankyou sir great explanation.🙏
What to do for springboot3 and spring 6 and Java 21 !? Is this supported there ?
I use the same exact code, no difference. I've moved to Terraform vault for my new projects.
Is this solution scalable? With many concurrent users?
No, it is not. One request is processed in one Python thread. You can scale this by running multiple instances of the Python Flask app behind a load balancer like NGINX, but that still does not make it fast. The bottleneck is Ollama: it can handle multiple requests but may have to reload the LLM each time if it is not the same LLM. If you were to use Llama3.2 every time, you would get a level of support for multiple requests, but in the end it is not as scalable as you would expect: if 1 request takes 10 seconds, 2 requests take 20 seconds, so it does not solve anything. You can always add another GPU, which Ollama supports, but then the first request takes 12 seconds, the second takes 18 seconds, etc. So again, no scalable solution. But if you have isolated machines that are fed from a load balancer like NGINX, and each machine has your Flask API and Ollama running with its own GPU (I use the NVIDIA 4090), then yes: the first request takes 10 seconds and the 2nd request takes 10 seconds, so timing is pretty consistent for multiple requests. You will quickly find that you may need 4 machines to create a production-grade system. This is what I have done for a large LLM project and it does work well. I set up the load balancer for round robin and then process each request as it comes in. If I need to support more requests, then I will add more servers. I did this on AWS and it cost me about $700 per server per month, but it did work. I now have my own servers, which cost about $2,500 each to build, as the LLM engine cluster. I connect this into the cloud using ngrok and it is very fast. As far as I know, there is no way to scale up vertically, as far as getting more LLM RAM or processor power, other than replacing your GPU board.
Adding boards in parallel gives more memory but does not affect the processing speed. Well, it is a bit slower from the overhead, but Ollama will put different sections of the LLM onto different cards, so the memory is scaled but not the processors. Each processor runs the segment of memory based on the LLM loaded in its own instance, so there is no performance increase.
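The scaling argument in this reply can be sketched as a toy model. This is not a benchmark: it assumes every request takes a fixed 10 seconds, that each machine works through its own queue one request at a time, and that the load balancer hands out requests in strict round-robin order. The function and server names are illustrative only.

```python
from itertools import cycle


def assign_round_robin(num_requests, servers):
    """Assign request indices to servers in round-robin order."""
    assignment = {server: [] for server in servers}
    rotation = cycle(servers)
    for request_id in range(num_requests):
        assignment[next(rotation)].append(request_id)
    return assignment


def completion_times(num_requests, num_servers, seconds_per_request=10):
    """Return the finish time of each request, assuming every server
    handles its own queue strictly one request at a time."""
    servers = [f"server-{i}" for i in range(num_servers)]
    assignment = assign_round_robin(num_requests, servers)
    finish = [0] * num_requests
    for queue in assignment.values():
        for position, request_id in enumerate(queue, start=1):
            finish[request_id] = position * seconds_per_request
    return finish


print(completion_times(2, 1))  # one machine: the second request waits
print(completion_times(2, 2))  # two isolated machines: both finish together
```

Under these assumptions, one machine finishes two requests at 10 and 20 seconds, while two isolated machines finish both at 10 seconds, which matches the numbers quoted in the reply.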
@@fastandsimpledevelopment Clear! Thanks for the reply! So from a performance perspective parallel is the way to go, makes sense. Follow up question, how do you keep the source of truth (the RAG and your docs) in sync?
@@ChigosGames I do two things for RAG. Initially I used PDF input and stored the vectors in ChromaDB. This can be a server, so all the instances use the same database, but the data is only as current as the last PDF upload. I have moved to a better solution where I do not source anything from PDF. I query a database, in my case MongoDB, then take that content (which should be the current truth) and feed it into the process as if it came from ChromaDB, so it is what Ollama uses to answer a question/prompt. This works very well, and a lot of the PDF/vector DB issues went away. I have a large set of data (think of airline tickets), so I have routes, times and destinations as well as passengers that purchase, and I need to answer questions like "What is a cheaper flight?" or "If I change a day, how much will it cost?", so sometimes it is not as simple as a PDF document with content.
@@fastandsimpledevelopment ok, I love MongoDB, since it is so JSON friendly (a bit too much sometimes), so with that you already structure the data and let the LLM enrich it. How do you 'steer' the LLM so it is not too creative with amending your (flight) data?
@@ChigosGames I create very specific data, for example "SFO 01/10/2024 10:00AM - JFK 01/10/2024 4:45 PM American Flight 1410 $445". This is then used in the LLM. I do have a filter and use JSON format output, which I then transform as needed.
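The approach described in this thread (query the database, render each record as a very specific line, pass those lines to the LLM as context) could look roughly like the sketch below. The record field names and the prompt wording are assumptions made for illustration; only the rendered line format comes from the reply above. The records are plain dictionaries, standing in for whatever a MongoDB query would return.

```python
def flight_to_line(flight):
    """Render one flight record as a compact, unambiguous text line."""
    return (
        f"{flight['origin']} {flight['depart']} - "
        f"{flight['destination']} {flight['arrive']} "
        f"{flight['airline']} Flight {flight['number']} ${flight['price']}"
    )


def build_prompt(question, flights):
    """Build an LLM prompt whose context is the current database content."""
    context = "\n".join(flight_to_line(flight) for flight in flights)
    return (
        "Answer using only the flights listed in the context. "
        "Reply in JSON.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical record, shaped to reproduce the example line from the thread.
sample_flight = {
    "origin": "SFO",
    "depart": "01/10/2024 10:00AM",
    "destination": "JFK",
    "arrive": "01/10/2024 4:45 PM",
    "airline": "American",
    "number": 1410,
    "price": 445,
}

print(build_prompt("What is a cheaper flight?", [sample_flight]))
```

Keeping each record on one tightly formatted line gives the model little room to reinterpret the data, which is the point the reply makes about steering the LLM.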
If you're on Windows remove uvloop from the requirements.txt. It will break your pip.
Thanks for the info!
Thanks! Was looking for this!
Thanks for Jasypt pronunciation - from South Korea -
Finally, guy speaks English.
Thanks! Your video and repository helps a lot!
Excellent video. Helped me immensely, Thank you for sharing.
I'm not sure if you mentioned in the video or not but you need to allow traffic to port 11434 on the AWS security group
Good catch. Thanks
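A quick way to confirm that the security group rule took effect is to test whether the port accepts TCP connections from your own machine. The sketch below uses only the Python standard library; the host name is a placeholder you would replace with your instance's public address, and 11434 is Ollama's default port.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


# Placeholder host: replace with the public address of your AWS instance.
# print(is_port_open("your-instance.example.com", 11434))
```

A False result does not say which layer blocked the connection: it could be the security group, a host firewall, or Ollama listening only on localhost.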
how can we paginate the result? The OFFSET and LIMIT can do that but it still charges for querying entire table. I've not found any good solution yet
Good question, I have not had that experience with charges. I normally query the data and move it into MongoDB in the format I need, and then work with that content, which removes all the charges after the initial query.
why the overhead with a huge java program just to use curl with ollama, when you can use curl directly to ollama.
Java is the most used language in enterprise software development, and integrating these applications with RAG and Ollama is the next step for private LLM processing. A simple curl to Ollama does nothing to integrate a vector database or integrate into a legacy / microservice system.
everyone knows how to do this online... but you need to make it so it can be used OFFLINE as well..
Stream insert is not available in the free tier. Does that mean it is not free at all, or can I use it once I enable billing info?
I think you need to have a paid account setup
So ollama detects and uses the GPU automatically?
Yes, if the OS has support and you have an AMD or NVIDIA GPU installed, the latest version does auto-detect. You can also set it to NOT use the GPU in the Ollama config files, but by default it does auto-detect.
@@fastandsimpledevelopment It detects only Nvidia GPU. I tried on AWS g4ad (AMD ) and g4dn.xlarge (Nvidia). Only the latter worked. This is FYI.
@@adityanjsg99 Thanks for your input. I have not tried anything other than NVIDIA GPUs. I've finally decided to get a few 4090 boards and see how they run. I'm trying to build an on-prem system since there is no affordable cloud solution. I'll externalize the LLM API via ngrok, not what I wanted :(
amazing tutorial thank you so muchhh
Definitely the best tutorial I've found on YouTube. I especially appreciated that you included the problems you found while implementing the code, like packages not yet installed, because when someone looks at a tutorial he usually sees that everything is always working fine, but that's not what really happens when doing it for the first time. Great job.
he wasted my 12.44 minute, twilio side starts 12:44 guys.
Thanks for your video. I have a question. If I have a very big PDF, will this embedding data take more tokens? And what is the max length of the PDF?
I normally use Ollama for RAG applications with ChromaDB. Ollama runs locally, or at least on your own servers, so there is never a cost for tokens. If you were to use Gemini or OpenAI, then yes, you have token costs. Depending on what database you're using for your vector store, there may be small costs to store the data. In general, when you retrieve data and use it in the LLM processing, there are token costs. I have used up to 100 MB PDF files with ChromaDB. The big thing to watch is the chunk size: you may find that you send 3 chunks to the LLM, so if each one is 1 MB then 3 chunks are 3 MB, which could be 600,000 tokens (3,000,000 / 5, at 5 bytes per token). That would be expensive. Again, using a local LLM and a local vector store resolves these costs real fast, plus it gives you the security of your data not being outside your company.
@@fastandsimpledevelopment So the size of pdf files only affect the cost of the vector db?
@@candyman3537 Correct, if you have a local vector db then there is no cost. The chunk size would have more effect on tokens, I normally have 3 chunks returned from the similarity results that are then sent to the LLM.
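The token arithmetic from this thread can be written out as a small helper. The 5-bytes-per-token figure and the default of 3 retrieved chunks are the rule of thumb quoted in the reply, not properties of any particular tokenizer, so treat the result as a rough estimate.

```python
def estimate_tokens(chunk_size_bytes, chunks_returned=3, bytes_per_token=5):
    """Roughly estimate how many tokens the retrieved chunks add to a prompt."""
    return (chunk_size_bytes * chunks_returned) // bytes_per_token


print(estimate_tokens(1_000_000))  # three 1 MB chunks
print(estimate_tokens(1_000))      # three 1 KB chunks
```

The estimate depends on chunk size and chunk count only, which is why the size of the source PDF matters for vector storage but not for the tokens sent per question.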
the code is blurred and too small, hard to read
Is it the same on windows ?
I'm not sure, I don't have a windows machine
@@fastandsimpledevelopment its okay thanks for the video I will try to do it on Windows to see if its possible.
@@Nokiathanos did you get this to work in windows?? I'm wanting to get ollama tts working and having trouble.
Where is the calling API?
Does the 'text to speech' require internet connection? or does it work offline?
It does require a connection as well as a paid account for elevenlabs
very helpful
Glad you enjoyed it
Thanks Sir Can we use the Twilio to send to multiple number the sms ?
I'm not sure, I have never done that. Take a look at the Twilio docs. Thanks
Hello Sir, I am getting the exception below: [404] Not Found - {\"error\":\"model \\\"llama3.1\\\" not found, try pulling it first\"} I have installed llama3.1 and configured it in the yml file like yours, but I am still getting the exception. I did not find a solution for this; can you please reproduce this issue and explain? Locally I am able to run the llama and get a response (via Command Prompt).
Try running this on the command line: ollama run llama3.1
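The 404 in this thread means the Ollama server that received the request does not list that model. One way to check, sketched below, is to ask the server which models it has through Ollama's GET /api/tags endpoint and compare names. The helper names are illustrative, the base URL assumes the default local port 11434, and the sketch assumes that a model name without a tag resolves to the ":latest" tag.

```python
import json
from urllib.request import urlopen


def normalize(model_name):
    """Add the ':latest' tag when the name has no explicit tag."""
    return model_name if ":" in model_name else f"{model_name}:latest"


def model_is_available(tags_payload, wanted):
    """Check a parsed /api/tags response for the wanted model name."""
    names = {model.get("name", "") for model in tags_payload.get("models", [])}
    return normalize(wanted) in names


def fetch_tags(base_url="http://localhost:11434"):
    """Fetch the list of locally available models from an Ollama server."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)


# Example against a running server (not executed here):
# print(model_is_available(fetch_tags(), "llama3.1"))
```

If the model shows up on the command line but not through the URL your application is configured with, the two are probably talking to different Ollama instances.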
Great content. I was able to clone the git repo and installed all the requirements. However, when I run app.py file, I'm getting an error. can somebody advise on what to do here? "{ImportError: cannot import name 'EVENT_TYPE_OPENED' from 'watchdog.events' (/opt/anaconda3/lib/python3.11/site-packages/watchdog/events.py)".
OMG!!! I freaking love you, I've been struggling with deployment on AWS with llama and you've made it crystal clear. I'll do anything to support ur channel. UR THE BEST!!!
Thanks for the comments
Hi bro, I tried this program and it shows an error that fastembed cannot be imported, but I have already installed the package, and the same error shows again and again.
Amazing video! Your explanation is super insightful and well-presented. I'm curious-do you have any thoughts or experience with using Ollama in a production environment? I'm not sure if Ollama can handle multiple requests at scale. If I were to implement something like this in production, would you recommend Ollama, or would alternatives like llama.cpp or vllm be better suited? Would love to hear your perspective on its scalability and performance. Thanks again for sharing such awesome content!
Learn how to use Podman by Red Hat as a free replacement for Docker
No more Docker payment, love it!
this was such a headache ugh. thank you so much. i was actually using next and not react, but this set me on the path to find a better solution (for next) rather than corsconfig-ing spring boot
Glad it helped
can I ask how?
Thank you. 👍
Great video, please post more (especially for spring).
ps: same for langgraph and autogen? or langgraph maybe same as in langchain?
what do you think of zed instead of vscode? my input: it's very lightweight, but because it's quite new, not as full-fledged as vscode yet, but it does have "ai coding / copilot" implemented and supporting ollama also 🥳
Brilliant! It's that simple only because you explained it simply :). Thank you!
Thanks, glad you enjoyed it!
Excellent explanation. Much appreciated. Greetings, my friend.
Thanks, hope it helps you.
The best tutorial
Glad you liked it, thanks!
thank you. you made my day.
Glad you enjoyed it
Awesome tutorial!
Thanks, glad you liked it!
Hello i have a problem, the variable context is not declared
Can you give me more information? I do not see a variable "context" in the code, "context" is returned in the lookup of data from the ChromaDB so if there is no context then I suspect there is no matching results from the search, maybe tell me the line number or share the code with me
@@fastandsimpledevelopment line 27 of your github repo. "NameError: name 'context' is not defined"
my mistake. the editor put a f""" automatically
@@federicocalo4776 That is part of the PromptTemplate so not a real variable, it is populated by the Retriever so line #70 should create this value in the retriever for you.
Make sure line 27 is wrapped in triple double quotes (a plain string, not an f-string) and that you are using Python 3.9 or greater.
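The NameError in this thread comes from the difference between a plain template string and an f-string. The sketch below shows the mechanism with plain str.format instead of LangChain, so the template text and function names are illustrative and not taken from the repository.

```python
# A plain triple-quoted string: {context} and {input} stay as placeholders
# until values are supplied later (in the thread, by the retriever chain).
TEMPLATE = """
Answer the question using only the context below.

Context: {context}
Question: {input}
"""


def render(template, **values):
    """Fill the placeholders of a plain template string."""
    return template.format(**values)


def build_with_f_string():
    # With an f prefix, Python evaluates {context} immediately as a variable.
    # No such variable exists here, so calling this raises NameError.
    return f"Context: {context}"


print(render(TEMPLATE, context="Flight 1410 costs $445", input="How much?"))
```

An editor that silently adds the f prefix turns the first form into the second, which is what happened in the comment above.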
i dont understand how you set up the api key
im new to coding 😅
WINDOWS USERS INSTALL PROBLEM: "uvloop" says it doesn't work on Windows. Remove it from requirements.txt; everything else should install and seems to work anyway (I haven't tested the RAG part, only that the API call works).
Thanks for the input
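For anyone who prefers not to edit the file by hand, the removal can be scripted. This is a minimal sketch that filters requirement lines by package name; the function names are illustrative, and it only handles simple "name" or "name==version" style lines, not every form pip accepts.

```python
import re


def requirement_name(line):
    """Return the lowercase package name of a requirements line, or None
    for blank lines, comments and pip options."""
    stripped = line.strip()
    if not stripped or stripped.startswith(("#", "-")):
        return None
    return re.split(r"[\s=<>!~;\[@]", stripped, maxsplit=1)[0].lower()


def drop_packages(lines, blocked=("uvloop",)):
    """Return the requirement lines without the blocked packages."""
    return [line for line in lines if requirement_name(line) not in blocked]


# Hypothetical file content; the versions are made up for the example.
example = ["flask==3.0.0", "uvloop==0.19.0", "# vector store", "langchain"]
print(drop_packages(example))
```

An alternative that keeps one file for every platform is a standard environment marker in requirements.txt, for example `uvloop; sys_platform != "win32"`, which tells pip to skip the package on Windows.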
I tried so hard to get langchain_community to work but it wouldn't. I installed it and it would show in my env but it wouldn't work. I used both 3.8 and 3.10, went through multiple docs, and even took the help of ChatGPT for troubleshooting... It wouldn't work though. I am not sure why it wouldn't import langchain_community at all.