Kubernetes Tutorial: Why Do You Need StatefulSets in Kubernetes?
- Published: 9 Feb 2025
- 🚀Grab Your Black Friday Offers Now: kode.wiki/3CzuOnc
Before we talk about StatefulSets, we must first understand why we need them. Why can’t we just live with Deployments?
Let’s start from the very basics. So for a minute, let’s set aside everything that we have learned so far, such as Deployments, Kubernetes, Docker, containers, or virtual machines.
Let’s just start with a simple server. Our good old physical server. And we are tasked to deploy a database server. So we install and set up MySQL on the server and create a database. Our database is now operational. Other applications can now write data to our database.
To withstand failures, we are tasked with deploying a high-availability solution. So we deploy additional servers and install MySQL on those as well. We have a blank database on the new servers.
So how do we replicate the data from the original database to the new databases on the new servers?
There are different topologies available.
The most straightforward one is a single-master, multi-slave topology, where all writes come in to the master server and reads can be served by either the master or any of the slave servers.
So the master server should be set up first, before deploying the slaves.
Once the slaves are deployed, perform an initial clone of the database from the master server to the first slave. After the initial copy, enable continuous replication from the master to that slave so that the database on the slave node is always in sync with the database on the master.
Note that both of these slaves are configured with the address of the master host. When replication is initialized, you point the slaves to the master using the master’s hostname or address. That way the slaves know where the master is.
Let us now go back to the world of Kubernetes and containers and try to deploy this setup.
In the Kubernetes world, each of these instances, including the master and the slaves, is a POD that is part of a Deployment.
In step 1, we want the master to come up first and then the slaves. In the case of the slaves, we want slave 1 to come up first and perform the initial clone of data from the master; and in step 4, we want slave 2 to come up next and clone data from slave 1.
With Deployments, you can’t guarantee that order. With Deployments, all pods that are part of the deployment come up at the same time.
So the first step can’t be implemented with a Deployment.
As we have seen while working with Deployments, the pods come up with random names. So that won’t help us here. Even if we decide to designate one of these pods as the master, and use its name to configure the slaves, if that POD crashes and the Deployment creates a new pod in its place, it’s going to come up with a completely new pod name. And now the slaves are pointing to an address that does not exist. And because of all of this, the remaining steps can’t be executed.
And that’s where StatefulSets come into play. StatefulSets are similar to Deployments, in that they create PODs based on a template. But with some differences. With StatefulSets, pods are created in a sequential order. After the first pod is deployed, it must be in a running and ready state before the next pod is deployed.
So that helps us ensure the master is deployed first, then slave 1, and then slave 2. And that helps us with steps 1 and 4.
StatefulSets assign a unique ordinal index to each POD - a number starting from 0 for the first pod and incrementing by 1 for each subsequent pod.
Each pod gets a name derived from this index, combined with the StatefulSet name. So the first pod gets mysql-0, the second pod mysql-1 and the third mysql-2. So no more random names. You can rely on these names going forward. We can designate the pod with the name mysql-0 as the master, and any other pods as slaves. Pod mysql-2 knows that it has to perform an initial clone of data from the pod mysql-1. If you scale up by deploying another pod, mysql-3, then it would know that it can perform a clone from mysql-2.
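As a rough sketch of what such a StatefulSet could look like (the mysql name, the mysql:8.0 image and the replica count are illustrative assumptions, not the manifest from this course):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql        # headless Service governing the set (see below)
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0  # illustrative image
          ports:
            - containerPort: 3306

Applying this creates pods named mysql-0, mysql-1 and mysql-2, in that order; each must be Running and Ready before the next one is started.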
To enable continuous replication, you can now point the slaves to the master at mysql-0.
Even if the master fails and the pod is recreated, it would still come up with the same name. StatefulSets maintain a sticky identity for each of their pods. And these help with the remaining steps. The master is now always the master and available at the address mysql-0.
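That stable address relies on a headless Service, which the upcoming lectures cover in more detail. A minimal sketch, assuming the Service is named mysql to match the serviceName used in the StatefulSet sketch above:

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None     # headless: no load-balanced virtual IP
  selector:
    app: mysql
  ports:
    - port: 3306

With this in place, each pod gets a stable DNS entry of the form <pod-name>.<service-name>, so the slaves can always reach the master at mysql-0.mysql (fully qualified: mysql-0.mysql.default.svc.cluster.local), even after the pod is recreated.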
And that is why you need stateful sets. In the upcoming lectures, we will talk more about creating stateful sets, headless services, persistent volumes, and more.
Access the full course here: kodekloud.com/...
#KubernetesTutorial #Kubernetes
🚀Explore Our Top Courses & Special Offers: kode.wiki/40SkWyU
Unfortunately I have only one like button. Thanks for the crystal clear explanation.
Hi, we appreciate the kind comment! enjoy!
Excellently explained, Mumshad. You're a gem 💎
Thanks a ton!
nice explanation - but small request, please use night-mode for the video as most developers prefer night mode :)
Thanks for the tip
Beautifully explained
Thanks a lot!
Mumshad you are cool. Nice set of videos. Thanks for making it so simple and easy to understand.
Very good explanation... Thanks Mumshad...
Glad you enjoyed the video and good to know that it cleared your doubts. Thanks😊
Please subscribe to the channel and support us 😊
good😀 explanation !
Glad you liked it!
wonderful explanation. thank you
You are welcome!
Simply explained sir thank you.....
Greetings! Thank you for your kind words. Spread the word by liking, sharing and subscribing to our channel! Cheers :).
Thank you Mumshad !!
Welcome!
That's really cool and simple
Hi Mumshad,
Thanks for the great session. As you said, in a StatefulSet if any POD goes down, it comes up with the same name, i.e. mysql-0 in your example. How does this newly created pod sync the data? It might have lost all the data when it went down, and even if it uses persistent storage, it might have missed some new data during the period when it was down?
When the master goes down, there are no writes. So there won't be any need to sync data for the master node.
The pod will get the same data again because we will use a Persistent Volume
When the new pod is created, the initContainers will ensure that the data is synced over to the new pod. Data can be written on mysql-0.
Where are the queries stored at that point in time? @@haressedMom
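For context, the init-container answer above describes a common pattern (used, for example, in the Kubernetes documentation's MySQL StatefulSet example) where an init container works out its pod's ordinal and clones data from the previous pod before MySQL starts. A structural sketch only, with the actual copy step left as a placeholder; the names and image are illustrative assumptions:

# Inside the StatefulSet pod template (spec.template.spec):
initContainers:
  - name: clone-mysql
    image: mysql:8.0    # assumed image; real setups often use an xtrabackup image
    command:
      - bash
      - "-c"
      - |
        set -e
        # Work out this pod's ordinal from its hostname (e.g. mysql-2 -> 2).
        [[ $(hostname) =~ -([0-9]+)$ ]] || exit 1
        ordinal=${BASH_REMATCH[1]}
        # mysql-0 is the master, so there is nothing to clone.
        [[ $ordinal -eq 0 ]] && exit 0
        # Skip the clone if data already exists (pod restart reusing its persistent volume).
        [[ -d /var/lib/mysql/mysql ]] && exit 0
        # Otherwise copy the data set from the previous pod before mysql starts.
        # The actual copy mechanism (xtrabackup stream, mysqldump, etc.) is setup-specific.
        echo "clone from mysql-$(($ordinal - 1)).mysql goes here"

Once the init container finishes, the regular mysql container starts with the cloned data on the pod's persistent volume, and ongoing changes are then kept in sync via replication from the master.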
Very good, thanks
You are welcome!
Sir, it is understandable, but a practical demo would give me much better understanding.
What do you mean by frontier models
Good explanation. Question - how does K8s preserve statefulness if a stateful pod crashes and is replaced by another pod?
This is part of which course, CKA? Or which course in KK?
StatefulSets are indeed included in both the Certified Kubernetes Administrator (CKA) and Certified Kubernetes Application Developer (CKAD) courses.
Good explanation
Hi Mumshad.
Your explanation is awesome..
I have one doubt here on how pods get dynamically assigned IP addresses.
Which component in K8s allocates the IPs to PODs?
Kindly answer.
I think kube-proxy
@@aliakbarhemmati31 OK, thanks for the answer. I'm also looking for the answer to how pods get dynamically assigned IPs whenever they die.
Network plugin, sir. Calico is an example.
this was very helpful! Thank you.
Great video!
Great explanation, thank you mumshad
great video!
Hi Sir - I'd appreciate it if you could provide another simple example if possible. I think this is quite a complicated one.
Your StatefulSet has pods with different capabilities. The master (mysql-0) in your example is writable where the others aren't.
As you have differences, why not 2 charts: mysql-master with a replica count of 1 and mysql-replica with a replica count that can scale? For the replicas connecting to the master, use the service name instead of the hostname?
This is what I thought. We use a Service to communicate between two pods anyway, so why would we need the pod name?
Hello Sir, nice video. Please share any videos on Helm charts and getting started with Kubernetes clusters. Thanks...
Thanks master!
thanksssss
Need your help. I have created an "nfs server pod" instead of a local NFS installation, then created the PV. In the PersistentVolume, I have mentioned the NFS server's service name,
like:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs-server.default.svc.cluster.local
    path: "/"
Now I want to bind this PersistentVolume in the StatefulSet volumeClaimTemplates in GCP. How do we mention it?
I gave it like this in the StatefulSet:
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
The problem is, if I use the above example, a new volume gets created and it is not bound to my PV... I hope you understand what I am trying to say; else let me know. Thanks in advance
You should specify storageClassName in the PV definition and refer to it using storageClassName in the PVC.
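A minimal sketch of that suggestion, assuming a hypothetical class name nfs (the PV spec and sizes are taken from the comment above; the class name itself is an illustrative assumption):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs
spec:
  storageClassName: nfs        # assumed class name; must match the claim template below
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: nfs-server.default.svc.cluster.local
    path: "/"                  # adjust to your actual NFS export path
---
# In the StatefulSet:
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: nfs    # same class name, so the claim binds to the PV above
      accessModes: ["ReadWriteMany"]   # matches the access mode offered by the PV
      resources:
        requests:
          storage: 10Gi

Keep in mind that volumeClaimTemplates creates one claim per replica, so you would need a matching pre-created PV for each pod (or a dynamic NFS provisioner).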
thank you, i love you. May Allah bless India.
Why Master and Slaves? And not "primary pods" and "secondary pods"? Is this a cultural BIAS???
Is it ethical to do so?
Play at 1.25x speed.