Tuesday, 8 March 2022
How to track state in your Kubernetes Operator
State can get very messy in a distributed environment like Kubernetes.
In simple environments, state machines are perfectly adequate for defining and tracking state. You quickly understand whether your computation is misbehaving, and can report progress to users or other programs.
But in the context of an Operator, you can’t just pay attention to your own workload. You also need to be aware of external factors monitored by Kubernetes itself.
Here’s how to think about state when building your own Operator (based on lessons learnt from building our own).
Let’s get into it!
Spec is desired, status is actual
Let’s start with the basics. All Kubernetes objects define a `status` sub-object to manage state. This includes custom resources monitored by your own Operator’s custom controller.
`spec` is the specification of the object, where you declare the desired state of the resource. `status`, on the other hand, is where Kubernetes provides visibility into the actual state of the resource.
For example, our Artillery Operator monitors a custom resource named LoadTest. Our LoadTest has a `spec` field specifying the number of load test workers to run in parallel. Our custom controller processes the load test: it creates the required parallel workers (Pods) to match the spec’s desired state, thereby updating the actual state.
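To make the spec/status split concrete, here is a hypothetical LoadTest manifest. The field names are illustrative only, not the Artillery Operator’s exact schema:

```yaml
# Hypothetical LoadTest manifest -- field names are illustrative,
# not the Artillery Operator's exact CRD schema.
apiVersion: loadtest.artillery.io/v1alpha1
kind: LoadTest
metadata:
  name: basic-test
spec:
  count: 4                # desired state: four parallel workers
status:                   # actual state, written by the controller
  active: 4
  succeeded: 0
  failed: 0
```

The controller’s job is to keep driving `status` toward whatever `spec` declares.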
The key takeaway is that observing state is a continuous process, constantly reconciling so that desired state becomes actual state.
But we need to understand how our custom resource is progressing. So, what exactly should we track in the `status` sub-object?
State machines are rigid
The answer to the previous question is: do not use a single field with state-machine-like values to track state.
Kubernetes advocates for managing state using Conditions in our `status` sub-object. This gives us a less rigid, more ‘open-world’ perspective.
This makes sense. Processes in a distributed system interact in unforeseen ways creating unforeseen states. It’s hard to preemptively determine what these states will be.
Monitoring aggregate state
Let’s look at the example of a typical custom controller.
This controller creates and manages a few child objects. It is aware of state across all of these objects, and it constantly examines this ‘aggregate’ state to inform its own actual state.
We can define these aggregate states using a state machine. As an example, let's say we have the following states:
`initializing`, `running`, `completed`, `stopped`, `failed`
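A single-field state machine like this might be sketched in Go as follows. This is the approach the article goes on to argue against; the type and field names here are assumptions for illustration, not the operator’s real code:

```go
package main

import "fmt"

// LoadTestState is a single-value state machine -- the rigid
// approach this article argues against. Names are illustrative.
type LoadTestState string

const (
	StateInitializing LoadTestState = "initializing"
	StateRunning      LoadTestState = "running"
	StateCompleted    LoadTestState = "completed"
	StateStopped      LoadTestState = "stopped"
	StateFailed       LoadTestState = "failed"
)

// LoadTestStatus holds the controller's view of actual state
// as a single field.
type LoadTestStatus struct {
	CurrentState LoadTestState `json:"currentState,omitempty"`
}

func main() {
	status := LoadTestStatus{CurrentState: StateRunning}
	fmt.Println(status.CurrentState) // prints "running"
}
```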
We’ll create a field called `CurrentState` in our `status` object, and update it with the current state. Our `CurrentState` is now `running`.
... Out of the blue a Kubernetes node running our workload fails!!
This drops a child object into a state unforeseen by us. Our custom controller’s ‘aggregate’ state has now shifted. But we continue to report it as `running` to our users and downstream clients.
We need a fix! We update our custom controller with logic to better detect node failure for a child object, and we introduce a new `CurrentState` for this scenario.
Some downstream clients will find this new controller state useful. We now have to inform them to monitor for this new state. All is well in the world, for now.
... What!? Out of the blue, another unaccounted-for state happens! `CurrentState` can’t explain the issue. So, we update the controller logic, add an adequate state for it... etc., etc.
You can see how this state machine based approach quickly becomes rigid.
Conditions 👀 provide a richer set of information about a controller’s actual state over time. You can think of them as a controller’s personal changelog.
Conditions are present in all Kubernetes objects. Here’s an example from a Deployment. Have a read, I’ll wait... 👀
Say our typical custom controller, from the last section, had a Deployment child object that exhibited the following conditions:

```yaml
status:
  availableReplicas: 2
  conditions:
  - lastTransitionTime: 2016-10-04T12:25:39Z
    lastUpdateTime: 2016-10-04T12:25:39Z
    message: Replica set "nginx-deployment-4262182780" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
  - lastTransitionTime: 2016-10-04T12:25:42Z
    lastUpdateTime: 2016-10-04T12:25:42Z
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
```
It can use the following logic to figure out its own actual state:

- Go through the Deployment’s `.status.conditions`
- Find the latest condition, based on `lastTransitionTime`
- Weigh whether the other Deployment `.status.conditions` had any alarming statuses

Here, it determines the Deployment is still progressing and not yet complete. It can then add a condition with a ‘Progressing’-like type to its own Conditions list.
Creating your own Conditions
So, how do we create conditions for our own custom controller?
We’ve just seen an example of a Deployment’s Conditions, and specifically how open-ended they are. E.g. `type: Progressing` may mean either progressing or complete.
As a general rule of thumb:

- Keep a smaller set of condition types (`.status.conditions[..].type`) that explain general behaviour.
- Unforeseen states can be easily signalled using the `Unknown` status value.
- Failure for a condition type (`.status.conditions[..].type`) can be explained using the `.status.conditions[..].status` value with an informative `reason` and `message`.
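A controller following these rules needs a small helper to upsert conditions. The sketch below assumes a simplified Condition struct rather than the real `metav1.Condition` (upstream Kubernetes ships similar helpers, e.g. in `apimachinery`); names are illustrative:

```go
package main

import (
	"fmt"
	"time"
)

// Condition follows the common Kubernetes condition shape.
type Condition struct {
	Type               string
	Status             string // "True", "False" or "Unknown"
	Reason             string
	Message            string
	LastTransitionTime time.Time
}

// setCondition updates the condition with a matching type in place,
// or appends it if that type isn't present yet. LastTransitionTime
// only moves when the status actually changes, mirroring the
// behaviour of upstream Kubernetes condition helpers.
func setCondition(conds []Condition, next Condition) []Condition {
	for i, c := range conds {
		if c.Type == next.Type {
			if c.Status == next.Status {
				next.LastTransitionTime = c.LastTransitionTime
			}
			conds[i] = next
			return conds
		}
	}
	return append(conds, next)
}

func main() {
	var conds []Condition
	// New condition types start out as "Unknown"...
	conds = setCondition(conds, Condition{Type: "Progressing", Status: "Unknown",
		Reason: "LoadTestCreated", LastTransitionTime: time.Now()})
	// ...and transition once the controller observes real progress.
	conds = setCondition(conds, Condition{Type: "Progressing", Status: "True",
		Reason: "WorkersRunning", LastTransitionTime: time.Now()})
	fmt.Println(len(conds), conds[0].Status, conds[0].Reason)
}
```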
Let’s check out an example and see how these tips play out.
Conditions by example
If you recall, earlier we used the Artillery Operator to explore `status` sub-objects. Let’s get into the details of how it makes use of Conditions to manage state.
The Artillery Operator uses a LoadTest custom resource to spec out a load test run. The spec includes the number of workers to run in parallel. This information is used to generate a Kubernetes Job and Pods to run the actual load test workloads.
The Job is a key component here. Tracking our aggregate state relies on the progress of the Job.
We also observed that:

- There are three statuses we care about, based on a Job’s worker Pods: active, succeeded and failed.
- We should add extra fields to our LoadTest’s `status` sub-object to track them.

These fields give us a fine-grained view of how many workers are active, succeeded or failed.
```go
// The number of actively running LoadTest worker pods.
Active int32 `json:"active,omitempty"`

// The number of LoadTest worker pods which reached phase Succeeded.
Succeeded int32 `json:"succeeded,omitempty"`

// The number of LoadTest worker pods which reached phase Failed.
Failed int32 `json:"failed,omitempty"`
```
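As a sketch of how a controller might fill those counters from worker Pod phases (illustrative only, not the operator’s actual code; a real controller would list Pods via the Kubernetes API):

```go
package main

import "fmt"

// PodPhase mirrors the corev1.PodPhase string values.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// countWorkers folds worker Pod phases into the three status
// counters: active, succeeded and failed.
func countWorkers(phases []PodPhase) (active, succeeded, failed int32) {
	for _, p := range phases {
		switch p {
		case PodSucceeded:
			succeeded++
		case PodFailed:
			failed++
		default:
			// Pending and Running workers both count as active.
			active++
		}
	}
	return
}

func main() {
	a, s, f := countWorkers([]PodPhase{PodRunning, PodRunning, PodSucceeded, PodFailed})
	fmt.Println(a, s, f) // prints "2 1 1"
}
```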
Using our Conditions rule of thumb, two Condition types are required: one tracking progress and one tracking completion, each starting out with `Unknown` as a value.
These easily explain the progressing and completed states. But how do we explain that a load test has failed?
In our distributed load testing domain, a load test always completes and will only flag failed workers. Restarting a failed worker/Pod messes with the load test metrics, so we avoid it at all costs.
So, for the controller, the `Completed` condition set to `true` was more than enough. Failed workers were flagged in the `status` field (using a `failed` field to track the failed count).
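That decision might be sketched like this (a hypothetical helper, not the operator’s actual implementation; condition reasons are made up for illustration):

```go
package main

import "fmt"

// Condition follows the common Kubernetes condition shape.
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
	Reason string
}

// completedCondition marks the load test Completed once no workers
// remain active -- even if some failed. Failures surface through the
// status counters rather than a dedicated failure condition.
func completedCondition(active, failed int32) Condition {
	if active > 0 {
		return Condition{Type: "Completed", Status: "Unknown",
			Reason: "WorkersStillRunning"}
	}
	reason := "AllWorkersFinished"
	if failed > 0 {
		reason = "FinishedWithFailedWorkers"
	}
	return Condition{Type: "Completed", Status: "True", Reason: reason}
}

func main() {
	fmt.Println(completedCondition(0, 2))
}
```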
Other implementations may treat the failure condition differently (e.g. Deployment uses the Progressing Condition type with value false).
Our Conditions ensure users can track a Load Test’s observed status to infer progress, while helping clients monitor the conditions that matter to them.
Feel free to check out the full Conditions implementation. 👀
Conditions all the way down
The more you use Kubernetes objects to track any form of state, the more you deal with Conditions. They’re everywhere.
They help create a simple reactive system that is open to change. Hopefully, this article gives you a good understanding of how to get started.