Nice. The time-slicing piece was the cherry on top of the sundae.
Great video, thank you!
Thank you for the feedback! Any topics you'd like me to cover?
@@mathisve Nothing comes to mind; just enjoying your videos 🙂
Very helpful, thank you very much!
Thanks for the feedback! Are there any other topics you would like to see me cover?
@@mathisve Well, I'm very new to the whole k8s thing and got this video recommended :) Maybe more DevOps stuff?
Great video, it's original content that I haven't seen elsewhere on YouTube. Congratulations!
Is it possible to run Ollama across multiple nodes and multiple GPUs in a Kubernetes cluster? Is performance better running Ollama on 2 GPUs in 1 node, or on 2 nodes with 1 GPU per node?
Hi, this video and the series in general are great. As someone interested in using AWS and other cloud providers, I would like to know how much this exercise, as well as the others, actually cost. Could you please indicate the total cost of the activity so I can try to replicate it without any unexpected hidden costs? Thanks in advance.
Thanks! The instance I used (g4dn.2xlarge) costs about $0.75 an hour, meaning I probably spent around $2-3 making this video. For reference, running this instance for an entire month would cost about $550.
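The arithmetic behind those figures can be sketched roughly as follows. This is only a back-of-the-envelope estimate: the 730-hour month and the 3-hour session length are assumptions, the hourly rate is the approximate one quoted above, and storage or data-transfer charges are not included.

```python
# Rough on-demand cost estimate for a single GPU instance.
HOURLY_RATE_USD = 0.75   # approximate g4dn.2xlarge rate quoted in the reply
HOURS_PER_MONTH = 730    # assumption: average month, 365 * 24 / 12


def cost(hours: float, rate: float = HOURLY_RATE_USD) -> float:
    """Return the compute cost in USD for the given number of hours."""
    return round(hours * rate, 2)


print(cost(3))                # a short recording session of about 3 hours
print(cost(HOURS_PER_MONTH))  # leaving the instance running for a month
```

Actual prices vary by region and change over time, so it is worth checking the current rate before replicating the setup.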
What are the benefits of running Ollama with k8s in the cloud instead of an Ollama container in the cloud without k8s? Thanks very much for this video.
If you are running a single instance of Ollama, there aren't many benefits. If you need lots of Ollama instances (for a public API, text generation, etc.), using Kubernetes will help you simplify operations.
@ We are about 1000 people in the company. How would you approach this with Ollama? I want some way of tracking which user is using it and how many tokens they consume. Not the content, just how much each department is using it.
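One possible starting point for this kind of accounting, offered only as a sketch: Ollama's non-streaming `/api/generate` and `/api/chat` responses report token counters in the `prompt_eval_count` and `eval_count` fields, so a small proxy in front of Ollama could add those up per department without storing any content. The `record_usage` helper and the department names below are hypothetical, and how a user maps to a department (for example via an auth header at the proxy) is left as an assumption.

```python
from collections import defaultdict


def new_totals():
    """Create an empty per-department usage table."""
    return defaultdict(
        lambda: {"requests": 0, "prompt_tokens": 0, "output_tokens": 0}
    )


def record_usage(totals, department, response):
    """Add the token counters from one Ollama response to a department.

    `response` is the parsed JSON of a non-streaming Ollama reply.
    Only the counters are read; the prompt and the answer are ignored.
    """
    entry = totals[department]
    entry["requests"] += 1
    # A counter can be missing from a reply, so default to 0.
    entry["prompt_tokens"] += response.get("prompt_eval_count", 0)
    entry["output_tokens"] += response.get("eval_count", 0)
    return entry


# Hypothetical responses, reduced to the fields that matter here.
totals = new_totals()
record_usage(totals, "engineering", {"prompt_eval_count": 26, "eval_count": 290})
record_usage(totals, "engineering", {"prompt_eval_count": 12, "eval_count": 40})
record_usage(totals, "sales", {"eval_count": 100})

for department, usage in totals.items():
    print(department, usage)
```

In practice the totals would be written to a database or exported as metrics rather than kept in memory.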
Hey, another question: I found a video showing that Ollama, even in its simplest installation, can already be opened in several terminals and respond in each of them. Concurrency wasn't possible until a few months ago, but now it is. So I wonder why go through all the hassle of creating virtual GPUs to improve availability. I'm sure you know why; it's just what I'm struggling to understand right now. Hope you find time to answer again. Thanks very much 😊
ruclips.net/video/8r_8CZqt5yk/видео.htmlsi=PzyzG4KSBiM371e8