How to run a private Chroma Vector Database locally in 5 mins!
- Published: Aug 15, 2024
- Hey everyone,
I wanted to take some time to show how simple it is to get Chroma (trychroma.com), an open-source vector database, running locally on your machine so you can use it with AnythingLLM (github.com/Min...) or other popular services. The video also covers keeping your data intact between updates and securing the instance with API key authentication!
Watch the AWS or Render videos for how to take these same ideas and put them on the cloud so you can scale your vector database and bring your AI tools to life with the power of vector databases and RAG.
🎉JOIN THE MINTPLEX LABS DISCORD🎉
Our Discord is the place for AI Developers to talk about new tools and also discuss and get early access & deals to Mintplex Labs tools like AnythingLLM and VectorAdmin.
🔥 / discord 🔥
CHAPTERS:
0:00 Introduction to Chroma localhost
0:30 Current Chroma Version on Github
1:28 Pulling in the latest chroma docker image
2:17 Run ChromaDB on docker - no custom configuration
3:22 Uploading our first embedding
4:20 Problems and caveats to running with no persistence
5:25 Run ChromaDB WITH PERSISTENCE!
6:24 Define where storage should be saved
8:00 What ChromaDB Storage looks like
8:52 Reboot ChromaDB docker but use existing data!
10:12 Run ChromaDB on Docker with persistence AND private API keys
15:21 Conclusion on how to best run docker locally - thanks!
#aws #opensource #rag #retrieval #chromadb #ai #aitools #vectordatabase #vectordata #tutorial #tech #technology #techtips #techtutorial #startups
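For readers who want to check that a locally running, token-protected Chroma container is reachable, here is a minimal sketch using only the Python standard library. It is not taken from the video. The heartbeat path `/api/v1/heartbeat`, the `Authorization: Bearer` header style, and the token value are all assumptions; the path and the auth scheme depend on the Chroma version and on how the server was configured, and some setups may not require the token on this route at all.

```python
import json
import urllib.request


def build_heartbeat_request(
    base_url: str,
    token: str,
    path: str = "/api/v1/heartbeat",  # assumption: differs between Chroma versions
) -> urllib.request.Request:
    """Build (but do not send) a GET request that carries a bearer token."""
    url = base_url.rstrip("/") + path
    request = urllib.request.Request(url, method="GET")
    # Assumption: the server is configured to read the token from the
    # Authorization header. Other header names are possible.
    request.add_header("Authorization", f"Bearer {token}")
    return request


def check_heartbeat(request: urllib.request.Request, timeout: float = 5.0) -> dict:
    """Send the request and decode the JSON body. Raises on HTTP or network errors."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container listening on port 8000, something like `check_heartbeat(build_heartbeat_request("http://localhost:8000", "my-token"))` should either return a small JSON object or raise an error if the token or path is wrong for your setup.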
I was waiting on this before trying out AnythingLLM with clients. Thanks Tim!
Another approach to storing data locally:
If you clone the GitHub repo, you can cd into chroma and run `docker-compose up -d --build`, which will spin up an image, a container, and a volume. The volume persists the data even when the container and image are deleted.
This dude is the coolest underrated person imo
This was a fantastic presentation and walkthrough!
Very helpful video. Thanks!
Excellent video. It's very timely as I have a POC that needs vector search and Chroma in Docker will do nicely. Thank you very much.
Thank you, Tim. Nicely done!
This is great but for performance I'd take a look at using named volumes over host volumes. There are pros and cons to the different types of volumes used with Docker but allowing it to manage its own volumes makes it harder for you to screw it up.
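The named-volume suggestion above can be sketched as follows. This is a minimal illustration, not the video's method: the volume name `chroma-data` is hypothetical, and the image name and container path `/chromadb/chroma` are taken from the `docker run` command quoted elsewhere in these comments. The helper only builds the commands; it does not run Docker.

```python
import shlex
from typing import List


def named_volume_commands(
    volume: str = "chroma-data",            # hypothetical volume name
    image: str = "chromadb/chroma",         # image name as quoted in the comments
    container_path: str = "/chromadb/chroma",
    port: int = 8000,
) -> List[List[str]]:
    """Return the two Docker commands: create a named volume, then run with it."""
    create = ["docker", "volume", "create", volume]
    run = [
        "docker", "run",
        "-p", f"{port}:{port}",
        # A bare name (no leading slash) on the left side makes Docker
        # treat it as a named volume instead of a host path.
        "-v", f"{volume}:{container_path}",
        image,
    ]
    return [create, run]


def as_shell(commands: List[List[str]]) -> List[str]:
    """Render each command as a copy-pasteable shell line."""
    return [shlex.join(command) for command in commands]
```

Printing `as_shell(named_volume_commands())` gives the two lines to paste into a terminal. Docker then manages where the data lives, which sidesteps host-folder permission problems at the cost of the files being less convenient to browse.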
Thank you, Tim!
Awesome work brother
Awesome tutorial!
Thanks. Nicely explained.
How do I mount persistent storage in Cloud Run with the ChromaDB Docker image?
Great tutorial Tim! But I can't set up file sharing with a basic Docker subscription. Is there any alternative (Singularity? Podman?)
Very nice!
Hello, how can I store the embeddings without using AnythingLLM?
Thanks for video.
I'm trying to persist data with Azure file shares, but I'm not succeeding. Could you make a video or give me some tips?
Thank you for this wonderful explanation! I am able to run it locally successfully. How do we deploy this on a Kubernetes cluster using persistent volumes?
nicely explained!!!
This video's goated
Looks like ChromaDB changed how auth is done just a few weeks ago. Will this affect how it works with AnythingLLM? Cheers
I have a vector DB with embeddings and docs that I stored using the commands below:
from langchain.vectorstores import Chroma
db = Chroma.from_documents(docs, embeddings)
I persisted it to a folder in Google Colab. But in Colab, when the runtime ends, everything is lost. I want to keep my vectorized DB permanently so that I can retrieve data anytime I want. How do I do that?
You can store the created vector store in Google Drive and just mount that drive in the Colab runtime.
@aayushchaurasia4727 Hi, could you please help? I am trying to connect to ChromaDB like this:
vectordb = Chroma(persist_directory=persist_directory, embedding_function=embeddings), but when the runtime ends, everything is still lost. In a new session, vectordb.get() gives me only {'ids': [],
'embeddings': None,
'metadatas': [],
'documents': [],
'uris': None,
'data': None}
Hi, did you manage to solve the problem?
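The Google Drive suggestion in this thread can be sketched roughly as below. It is an illustration under assumptions, not a confirmed fix for the empty result shown above: it assumes Drive has already been mounted in the Colab session (commonly under `/content/drive/MyDrive`) and that the vector store was actually written to `persist_directory` before the runtime ended. The function names are hypothetical, and only the standard library is used.

```python
import shutil
from pathlib import Path


def backup_persist_dir(persist_directory: str, backup_root: str) -> Path:
    """Copy the Chroma persist folder into a durable location (e.g. mounted Drive)."""
    source = Path(persist_directory)
    if not source.is_dir():
        raise FileNotFoundError(f"No persist directory at {source}")
    target = Path(backup_root) / source.name
    # dirs_exist_ok lets repeated backups overwrite an earlier copy.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


def restore_persist_dir(backup_root: str, persist_directory: str) -> Path:
    """Copy the saved folder back into the fresh runtime before opening the store."""
    target = Path(persist_directory)
    source = Path(backup_root) / target.name
    if not source.is_dir():
        raise FileNotFoundError(f"No backup found at {source}")
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

In a new session the idea is to call `restore_persist_dir(...)` first and only then create `Chroma(persist_directory=..., embedding_function=...)`. Pointing `persist_directory` directly at the mounted Drive folder is another option, though it may be slower.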
How do I read the data inside a collection? From a UI or something? Is there an option?
Chroma does not have a UI. You can read data via the API or with a GUI tool such as github.com/Mintplex-Labs/vector-admin
How can we bundle this with software for a client?
If it can run Python code, then I don't see why not. Python is a requirement for Chroma.
Is there another way besides Docker?
You can run an EC2 instance and use a Python script with the Chroma library that opens a connection and is enabled on boot. Otherwise, no.
I tried running your command to create the local storage on my Mac and got an error that I needed an argument to run. I changed it to `docker run -p 8000:8000 -v /Users/jim/Desktop/chroma/:/chromadb/chroma chromadb/chroma` and it worked. Do you have an extra / in your command, or did I just type it wrong?
I think it may have been a typo. It is in a CloudFormation template and has been working since I made the video. It's usually that trailing slash before the colon that gets dropped. Regardless, glad it worked!
I think I spoke too soon. The error didn't happen, but the vector database is not being created in the folder. Still working on it. @TimCarambat
@jim02377 If you need to create it manually, then it's for sure a permission issue! Make sure the Docker user has permission to write to the storage folder.
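As a rough aid for the bind-mount troubleshooting in this thread, the sketch below checks a host folder and builds the same `docker run` command quoted above. The function names are hypothetical. One important caveat: `os.access` tests the user running the script, which is an assumption standing in for "the Docker user", so a clean result here does not guarantee the container can write to the folder.

```python
import os
from typing import List


def check_storage_dir(host_dir: str) -> List[str]:
    """Return a list of human-readable problems with the host storage folder."""
    path = os.path.expanduser(host_dir)
    if not os.path.exists(path):
        return [f"{path} does not exist"]
    if not os.path.isdir(path):
        return [f"{path} is not a directory"]
    problems = []
    # Assumption: this checks the current user, not the container's user.
    if not os.access(path, os.W_OK):
        problems.append(f"{path} is not writable by the current user")
    return problems


def bind_mount_command(
    host_dir: str,
    container_path: str = "/chromadb/chroma",
    image: str = "chromadb/chroma",
    port: int = 8000,
) -> List[str]:
    """Build the docker run command with a normalized host path."""
    # Drop any trailing slash so the host:container pair is unambiguous.
    host = os.path.expanduser(host_dir).rstrip("/")
    return [
        "docker", "run",
        "-p", f"{port}:{port}",
        "-v", f"{host}:{container_path}",
        image,
    ]
```

Running `check_storage_dir` on the folder before starting the container gives a quick first signal; if it reports nothing and the database files still do not appear, the container's own user and the Docker Desktop file-sharing settings are the next things to look at.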
thanks
I can see the hunter2 password
Yeah, that is for the demo. hunter2 is a joke password from the pre-meme internet. If you played RuneScape back in the 2000s, you'd find it funny.
Amazing, thanks