Ollama's Newest Release and Model Breakdown
- Published: 21 Sep 2024
- Hey everyone, Matt Williams here! Join me as I dive into Ollama's newest release and uncover one incredible feature that has been highly anticipated for ages. In this video, we'll take a closer look at what's new
in version 0.3.11, including updates that make life easier for developers and non-developers alike.
🔑 *Key Features:*
- *Easy Model Unloading:* Now you can remove models from memory with a simple command! No more complex
scripts or APIs; just use `ollama stop` to free up valuable resources.
- *Faster Performance in Docker:* If you're using Ollama in Docker on Windows or Linux, get ready for
significant speed improvements, shaving off about 5 seconds of start time!
💡 *New Models:*
- *Solar Pro Preview:* A powerful new model slated for a full release in November.
- *Qwen 2.5:* More knowledgeable and versatile than its predecessor, with variants ranging from 0.5B to 72B
parameters.
- *Bespoke Minicheck:* Designed to verify claims with a simple yes or no answer.
- *Mistral Small:* Excellent for translation, summarization, and sentiment analysis despite its smaller
size.
- *Reader-LM:* Converts HTML to Markdown, offering new possibilities but with some caveats.
Here is that command I showed: `ollama ps | awk 'NR > 1 {print $1}' | cut -d':' -f1 | xargs -I {} ollama stop {}`
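The awk and cut stages of that pipeline can be sanity-checked without a running Ollama instance by feeding them sample `ollama ps` output (the column layout and model names below are illustrative, not real output):

```shell
# Simulated `ollama ps` output: a header row, then one row per loaded
# model, with NAME (model:tag) in the first column.
ps_output='NAME               ID              SIZE      UNTIL
llama3.1:latest    a1b2c3d4e5f6    4.9 GB    4 minutes from now
qwen2.5:7b         f6e5d4c3b2a1    4.7 GB    4 minutes from now'

# NR > 1 skips the header row; $1 keeps the NAME column;
# cut -d':' -f1 drops the tag, leaving just the model name.
names=$(printf '%s\n' "$ps_output" | awk 'NR > 1 {print $1}' | cut -d':' -f1)
printf '%s\n' "$names"
```

In the real pipeline, each of those names is then handed to `ollama stop` by xargs.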
🎬 *Stay Updated:*
Whether you're on Mac, Windows, Linux, or using Docker, I'll show you how to easily update Ollama and keep your installation up-to-date with the latest features.
💬 *Join the Conversation:*
What features are you still hoping to see in future releases? Share your thoughts in the comments below!
🔗 *Resources:*
- [Ollama Homepage](ollama.com)
- [GitHub Releases](github.com/oll...)
*About Me:*
I'm a former member of the original Ollama team and continue to share my insights on local AI solutions. Stay tuned for more updates, tutorials, and deep dives into the world of AI!
📺 *Subscribe for More Content!* 📺
---
*Tags:* #Ollama #AI #MachineLearning #NewRelease #Update #ModelUnloading #FasterPerformance #SolarPro
#Qwen2.5 #BespokeMinicheck #MistralSmall #ReaderLM
Be sure to sign up to my monthly newsletter at technovangelis...
You can find the Technovangelist discord at: / discord
The Ollama discord is at / discord
(They have a vanity URL because they pay at least $100 per month for Discord. Help bring more viewers to this channel and I can afford that too.)
Join this channel to get access to perks:
/ @technovangelist
Or if you prefer there is also a Patreon: / technovangelist
the long awkward pause at the end always gets me
Pixtral when?
i literally came to ask this 🤣
Probably not anytime soon. Multimodal models are not a priority for llama.cpp (which Ollama builds on) at the moment.
No pixtral? Sad day
@@TheDiverJim it's a different architecture so they're probably not supporting it anytime soon.
@@adamholter1884 Nope. Just wait.))
I am training the Pixtral model to further enhance interface generation. I downloaded it from a server that usually hosts these. I can say it is pretty good for OCR, for reading screenshots to correct errors, and for reading scanned books. Coupling this with Claude Dev or Engineer supercharges the whole process of building really complex apps... Thanks Mistral!
I'm new here, but this content is really informative and you're clearly knowledgeable, so I subscribed
Thanks, I'm upgrading Ollama now. :)
Very cool t-shirt.
Pixtral needed
Prompt caching and more support for vision models would be great...in any case nice update! Thanks
Prompt caching isn’t something that would make sense for ollama. That would be something implemented by whatever tool you or someone else builds that leverages ollama.
I have begun using Ollama to run an agent setup. I was wondering how to unload models, and I also tried to figure out which models were already loaded... and I found that whatever model I tried to prompt would reply. At first I figured it was maybe ignoring the model parameter and using whatever model was loaded... but I was pretty sure it should be possible to use more than one model.
I have been testing some more and found that whatever model I prompt simply gets loaded... and in the process it unloads the other model if a new one is being loaded.
It seems strange, so I am not even sure it's possible to load more than one model. I think the whitespace issue was partly the reason: I could not just copy and paste a model name, since all the model names seemed to have whitespace before or after them. So when I prompt a model I first fetch the full list of installed models and then do a "contains" check (I am writing my framework in C#) to see whether the requested name is contained in one of the listed names. Otherwise it would complain about the model not being loaded.
I also made use of Nvidia's SMI utility, which lets me check available VRAM on the GPU and thereby indirectly check how much VRAM the models have taken up. It seems quite certain that only one model is loaded at a time, or at least that was the case in the previous version of Ollama. Now it's time to test whether I can use ollama stop to unload specific models. But since I did not need to run them, I am guessing that might not be needed.
I have already added features for web search using either Google or DuckDuckGo. I am thinking it could be nice to use web scraping to look up new Ollama models on the Ollama website. I am hoping I can make my agent framework set up new agents as it goes, trying to create more features for itself, and then maybe check the Ollama site for models relevant to those new agents. It's... probably very optimistic that this would work.
My main focus now is getting vision models to return coordinates for specific things in an image. That would be very useful for an agent setup, I think, making it possible for the framework to play games, or use... well, Windows or Linux in general. I have found one vision model that might be promising because it at least gives consistent answers to questions about the same image when asked the same way. But it's not pixel coordinates, which I suspect is a problem due to tokens and such; it gives decimals. So I suspect these might be percentage x and y coordinates. That is what I am about to test. Otherwise I am hoping Pixtral will be able to do this.
Anyone have any experience with getting Vision models to give somewhat exact positions for UI elements and stuff like that in images?
ollama stop finally!!! 😅 i no longer have to keep restarting ollama 😅😂
I would be interested in using other inference engines with ollama but the documentation I've found doesn't have enough detail
Can you explain what you mean by this?
It is just missing a fine-tuning command with a large custom dataset.
I have a question, since you worked on the Ollama project:
why is Ollama not compatible with lower-end AMD GPUs like the RX 580 or RX 6600, while the LMStudio project is?
It works with AMD just fine. What is the question? Older AMD models are not supported because AMD chose not to support them in the newer drivers.
Good video
You can pass Ollama an environment variable in the Docker Compose file, OLLAMA_KEEP_ALIVE: 0, and it automatically unloads the model after generating the response
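For anyone not using Compose, a sketch of the same setting with plain `docker run` (the OLLAMA_KEEP_ALIVE variable is the one from the comment above; the volume and port are the usual ones from Ollama's Docker setup):

```shell
# OLLAMA_KEEP_ALIVE=0 tells the server to unload each model as soon as
# it finishes generating a response, freeing VRAM between requests.
docker run -d \
  -e OLLAMA_KEEP_ALIVE=0 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
```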
Thanks
Thanks so much for that
My issue is that I need the model loaded always, and I can't find an obvious way of keeping it in memory for longer than 4 minutes :)
Just set keep alive to -1
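Concretely, a sketch of that (the model name `llama3.1` is a placeholder): a negative `keep_alive` on a generate request tells the server to keep that model loaded indefinitely, and sending it with no prompt just preloads the model.

```shell
# Preload a model and pin it in memory: keep_alive -1 means "never unload".
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "keep_alive": -1
}'
```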
what do you think about Ollama or other ways of running LLMs such as 8b models on smartphones?
That's the main magic feature of Apple Intelligence. It finally makes this doable without sacrificing battery life. At least that's the promise; every other approach has been terrible.
@@technovangelist The only problem I have with that is that I can't run my "custom" model. And it's Apple, so I can't just write software and put it on my phone.
With Android I can easily make an APK, even for software still in development, and load it onto the phone.
Thank you for answering btw :)
You can totally write your own software for iOS. It wouldn’t be the strong platform it is if you couldn’t. You can run custom models today it’s just a terrible experience.
@@technovangelist I was under the impression that it needs to be approved for the App Store before I can load it onto an iPhone. Was I mistaken for the past 3 years? ^^'
If you are writing it for yourself then yup you are mistaken
I find traditional OCR methods lacking for many real-world scenarios. Running an image through many OCR engines and then trying to determine which results are appropriate is tiresome. This is something AI should be able to do as well as or better than the average human, if not now then very, very soon.
PIXTRAL?
It has only just been released, and it's a new architecture. If it can be supported, it might be a while.
no more `systemctl restart ollama`
yeah, I realized unloading the model was a problem