Deploy models with Hugging Face Inference Endpoints
- Published: Sep 23, 2024
- In this video, I show you how to deploy Transformer models straight from the Hugging Face hub to managed infrastructure on AWS, in just a few clicks. Starting from a model that I already trained for image classification, I first deploy an endpoint protected by Hugging Face token authentication. Then, I deploy a second endpoint in a private subnet, and I show you how to access it securely from your AWS account thanks to AWS PrivateLink.
⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos ⭐️⭐️⭐️
⭐️⭐️⭐️ Want to buy me a coffee? I can always use more :) www.buymeacoff... ⭐️⭐️⭐️
- Model: huggingface.co...
- Inference Endpoints: huggingface.co...
- Inference Endpoints documentation: huggingface.co...
- AWS PrivateLink documentation: docs.aws.amazo...
Code:
import requests, json, os

# ENDPOINT_URL is the URL of your Inference Endpoint
API_URL = ENDPOINT_URL
MY_API_TOKEN = os.getenv("MY_API_TOKEN")
headers = {"Authorization": "Bearer " + MY_API_TOKEN, "Content-Type": "image/jpeg"}

def query(filename):
    # Read the image file and POST its raw bytes to the endpoint
    with open(filename, "rb") as f:
        data = f.read()
    response = requests.post(API_URL, headers=headers, data=data)
    return json.loads(response.content.decode("utf-8"))

output = query("food.jpg")
This is the exact content I was looking for yesterday, you posted it today! Fantastic lol
Really hope I can get everything set up to put my idea into production at scale.
Glad it was helpful!
I need to recheck your previous video. That one covered deploying a training instance; this one deploys an inference instance. Always great to revisit and understand the different terms as a beginner.
Thanks Julien. Besides the ease of using Hugging Face endpoints, I learned how VPC endpoints work!
Cool :)
Thank you for hugging face. It makes deployment much easier.
Fantastic, great learning, thank you very much. So now I can use these endpoints from LangChain or LlamaIndex without worrying about deploying my model.
Exactly, and you're welcome :)
Ughh, I wish I had found this earlier. I set up my own VPS with both a front-end and a back-end server to provide access to a transformer model. Thanks, this should help.
Glad I could help!
Thanks Julien, great video!
Glad you liked it!
great lectures.
You're the best!!!
Thanks, I'll tell my wife
This is pure gold, thank you!
Thanks!
Great job sincerely!
Thanks!
That's amazing, Merci pour le partage
Glad you liked it.
I'd appreciate it if you could share how to deploy .ckpt or safetensors models on a VPS that I already own (Vultr or DigitalOcean).
Do we need AWS for model storage here, or can we use the Hugging Face Inference API endpoints directly? I want to use the jais13b-chat model @Julien Simon
Inference Endpoints lets you deploy any hub model on managed infrastructure running on AWS or Azure. Not sure what you mean by 'model storage' ?
Hey Julien,
Where can we find the training model video for food dataset?
Also, I am trying to deploy a model on Hugging Face Inference Endpoints, but it errors out saying I need a config.json file. I'm not sure how to create it. Any leads would be really helpful.
Thanks!
Hi, I think this is the right video: ruclips.net/video/uFxtl7QuUvo/видео.html
Yes, your model repository needs to have a config.json file, which is generated automatically when you save your trained model. See the docs at huggingface.co/docs/inference-endpoints/index
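If you want to verify this before deploying, here is a minimal sketch; `has_valid_config` is a hypothetical helper (not part of any library), and the keys it checks are the ones a Transformers config file normally records:

```python
import json
import os

def has_valid_config(repo_dir):
    """Check that a local model folder contains the config.json
    that Inference Endpoints requires before deployment."""
    path = os.path.join(repo_dir, "config.json")
    if not os.path.exists(path):
        return False
    with open(path) as f:
        config = json.load(f)
    # A Transformers config normally records the model type
    # and/or the architecture class used to load the model
    return "model_type" in config or "architectures" in config
```

If the check fails, saving the trained model again with `save_pretrained()` will regenerate the file in the output folder.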
I need to know how to communicate with running chat models from Python code. I'm struggling to find this information.
Check out the Inference Endpoints documentation. The format is simple JSON.
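As a sketch of that JSON format: a text-generation endpoint takes an "inputs" string plus optional generation settings under "parameters". The endpoint URL below is a placeholder, and `build_chat_payload` is a hypothetical helper:

```python
import json

# Placeholder: replace with your own Inference Endpoint URL
API_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"

def build_chat_payload(prompt, max_new_tokens=128):
    # "inputs" carries the prompt; "parameters" carries
    # optional generation settings such as max_new_tokens
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }

body = json.dumps(build_chat_payload("What is deep learning?"))
```

You would then POST `body` to `API_URL` with the same Bearer-token Authorization header used in the image-classification snippet above, but with "Content-Type: application/json".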
I'm paying for it, but it's really hard to change the tokens for models.
Where do I get my api token?
Create an account on the Hugging Face hub and go to settings.