Optimizing cost of a Java microservice on Kubernetes while preserving SLOs in production
In this guide, you optimize the cost (or resource footprint) of a Java microservice running on Kubernetes. The study tunes both pod resource settings (CPU and memory requests and limits) and JVM options (max heap size, garbage collection algorithm, etc.) at the same time, while also taking into account your application performance and reliability requirements (SLOs). This optimization happens in production, leveraging Akamas live optimization capabilities.
an Akamas instance
a Kubernetes cluster, with a Java-based deployment to be optimized
a supported telemetry data source configured to collect metrics from the target Kubernetes cluster (see here for the full list)
a way to apply the configuration changes recommended by Akamas to the target deployment. In this guide, Akamas interacts directly with the Kubernetes APIs via kubectl, so you need a service account with permission to update your deployment (see below for other integration options).
In this guide, we assume the following setup:
the Kubernetes deployment to be optimized is called adservice (in the boutique namespace)
in the deployment, there is a container named server, where the application JVM runs
Dynatrace is used as an observability tool
Let's set up the Akamas optimization for this use case.
For this optimization, you need the following components to model the adservice tech stack:
A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits (from the Kubernetes optimization pack)
A Java OpenJDK component, which contains JVM-level metrics like heap memory usage and parameters to be tuned like the garbage collector algorithm (from the Java OpenJDK optimization pack)
A Web Application component, which contains service-level metrics like throughput and response time of the microservice (from the Web application optimization pack)
Let's start by creating the system, which represents the Kubernetes deployment to be optimized. To create it, write a system.yaml manifest like this:
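```yaml
# A minimal sketch: name and description are free-form
name: adservice
description: The adservice deployment in the boutique namespace
```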
Then run:
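```bash
# assuming the standard Akamas CLI verbs (adapt if your setup differs)
akamas create system system.yaml
```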
Now create a component-container.yaml manifest like the following:
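Here is a sketch; the component name is arbitrary, while the componentType and the Dynatrace lookup properties must match your optimization pack and telemetry setup:

```yaml
name: container
description: The server container of the adservice deployment
componentType: Kubernetes Container
properties:
  dynatrace:                       # illustrative lookup properties, see your provider reference
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: adservice-*
```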
Notice that the component includes properties that specify how the Dynatrace telemetry provider will look up this container in the Kubernetes cluster (the same applies to the following components). These properties depend on the telemetry provider you are using.
Then run:
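```bash
# the trailing argument is the system the component belongs to
akamas create component component-container.yaml adservice
```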
Next, create a component-jvm.yaml manifest like the following:
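Again a sketch; the componentType must match your JDK version, and the lookup properties your telemetry provider:

```yaml
name: jvm
description: The JVM running in the adservice container
componentType: java-openjdk-11     # assumption: adjust to your JDK version
properties:
  dynatrace:                       # illustrative lookup properties
    type: PROCESS_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      basePodName: adservice-*
```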
Then run:
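```bash
akamas create component component-jvm.yaml adservice
```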
Now create a component-webapp.yaml manifest like the following:
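A sketch; the Dynatrace service ID below is a placeholder:

```yaml
name: webapp
description: The adservice microservice
componentType: Web Application
properties:
  dynatrace:
    id: SERVICE-XXXXXXXXXXXXXXXX   # placeholder Dynatrace service entity ID
```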
Then run:
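```bash
akamas create component component-webapp.yaml adservice
```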
To optimize a Kubernetes microservice in production, you need to create a workflow that defines how the new configuration recommended by Akamas will be deployed in production.
Let's explore the high-level tasks required in this scenario and the options you have to adapt them to your environment:
Let's now create a workflow.yaml manifest like the following:
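The following sketch assumes the Akamas FileConfigurator, Executor, and Sleep operators, with the template and deployment files living on the Akamas toolbox; hostnames, paths, and durations are assumptions to adapt:

```yaml
name: adservice-optimize
tasks:
  - name: configure
    operator: FileConfigurator     # renders the template with the recommended values
    arguments:
      source:
        hostname: toolbox          # assumption: files live on the Akamas toolbox
        path: adservice.yaml.templ
      target:
        hostname: toolbox
        path: adservice.yaml
  - name: apply
    operator: Executor
    arguments:
      command: kubectl apply -f adservice.yaml
      host:
        hostname: toolbox
  - name: verify
    operator: Executor
    arguments:
      command: kubectl rollout status --timeout=5m deployment/adservice -n boutique
      host:
        hostname: toolbox
  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800                # let the new configuration run before it is evaluated
```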
In the configure task, Akamas applies the container CPU and memory limits and the JVM options recommended by Akamas AI to the deployment file. To do that, copy your deployment manifest to a template file (here called adservice.yaml.templ) and substitute the current values with Akamas parameter placeholders as follows:
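A sketch of the template; placeholders follow the ${component.parameter} convention, and both the parameter names and the env-var-based JVM options injection are assumptions to adapt:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adservice
  namespace: boutique
spec:
  template:
    spec:
      containers:
        - name: server
          resources:
            requests:
              cpu: "${container.cpu_limit}"        # requests kept equal to limits
              memory: "${container.memory_limit}"
            limits:
              cpu: "${container.cpu_limit}"
              memory: "${container.memory_limit}"
          env:
            - name: JAVA_TOOL_OPTIONS              # assumption: the JVM reads options from this env var
              value: "-Xmx${jvm.jvm_maxHeapSize}m"
```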
Whenever a configuration recommended by Akamas is applied, the configure task creates the actual adservice.yaml deployment file by substituting the parameter placeholders with the values recommended by Akamas AI; the new deployment is then applied via kubectl apply.
To create the workflow, run:
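```bash
akamas create workflow workflow.yaml
```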
Create a telemetry instance based on your observability setup to collect your target Kubernetes deployment metrics.
Create a telemetry.yaml manifest like the following:
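A sketch for Dynatrace; the URL and token are placeholders, and the exact config keys are documented in the telemetry provider reference:

```yaml
provider: Dynatrace
config:
  url: https://your-tenant.live.dynatrace.com   # placeholder tenant URL
  token: YOUR_DYNATRACE_API_TOKEN               # placeholder API token
```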
Then run:
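```bash
# the trailing argument is the system the telemetry instance is attached to
akamas create telemetry-instance telemetry.yaml adservice
```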
It's time to create the Akamas study to achieve your optimization objectives.
Let's explore how the study is designed by going through the main concepts. The complete study manifest is available at the bottom.
You can now create a study.yaml manifest like the following:
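A condensed sketch: the goal formula, constraint thresholds, and parameter names are illustrative and must be adapted to your SLOs and optimization packs:

```yaml
name: Optimize adservice cost in production
system: adservice
workflow: adservice-optimize
goal:
  objective: minimize
  function:
    formula: container.cpu_limit + container.memory_limit   # illustrative cost proxy
  constraints:
    absolute:
      - webapp.requests_response_time <= 500   # illustrative response-time SLO (ms)
      - webapp.requests_error_rate <= 0.02     # illustrative error-rate SLO
parametersSelection:
  - name: container.cpu_limit
  - name: container.memory_limit
  - name: jvm.jvm_maxHeapSize
  - name: jvm.jvm_gcType
workloadsSelection:
  - name: webapp.requests_throughput
# approval mode, windowing, and safety settings omitted for brevity
```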
Then run:
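```bash
akamas create study study.yaml
```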
You can now follow the live optimization progress and explore the results using the Akamas UI.
To quickly set up this optimization, download the Akamas template manifests and update the values file to match your needs. Then, create your optimization using the Akamas scaffolding.
Optimizing cost of a Kubernetes microservice with HPA in production
In this guide, you optimize the cost (or resource footprint) of a Kubernetes deployment whose number of replicas is controlled by the HPA. The study tunes both pod resource settings (CPU and memory requests and limits) and HPA options (target CPU utilization) at the same time, while also taking into account your application performance and reliability requirements (SLOs). This optimization happens in production, leveraging Akamas live optimization capabilities.
an Akamas instance
a Kubernetes cluster, with a deployment to be optimized
a Horizontal Pod Autoscaler working on the desired deployment
a supported telemetry data source configured to collect metrics from the target Kubernetes cluster (see the reference for the full list)
a way to apply the configuration changes recommended by Akamas to the target deployment and HPA. In this guide, Akamas interacts directly with the Kubernetes APIs via kubectl, so you need a service account with permissions to update your deployment (see below for other integration options).
In this guide, we assume the following setup:
the Kubernetes deployment to be optimized is called frontend (in the hipster-shop namespace)
in the deployment, there is a container named server, where the app runs
the HPA is called frontend-hpa
both Dynatrace and Prometheus are used as observability tools
Let's set up the Akamas optimization for this use case.
For this optimization, you need the following components to model the frontend tech stack:
The Kubernetes Workload, Container, and Pod components, which contain metrics like the CPU used by the different objects and parameters to be tuned like CPU limits at the container level (from the Kubernetes optimization pack)
A Web Application component, which contains service-level metrics like throughput and response time of the microservice (from the Web Application optimization pack)
An HPA component, which contains HPA parameters like the target CPU utilization
Let's start by creating the system, which represents the Kubernetes deployment to be optimized. To create it, write a system.yaml manifest like this:
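```yaml
# A minimal sketch: name and description are free-form
name: frontend
description: The frontend deployment in the hipster-shop namespace
```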
Then run:
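```bash
akamas create system system.yaml
```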
Now create the three Kubernetes components. Create a workload.yaml manifest like the following:
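A sketch; the Dynatrace lookup properties are illustrative and must match your environment:

```yaml
name: workload
description: The frontend Deployment object
componentType: Kubernetes Workload
properties:
  dynatrace:                  # illustrative lookup properties
    type: CLOUD_APPLICATION
    kubernetes:
      namespace: hipster-shop
      name: frontend
```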
Then create a container.yaml manifest like the following:
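Again a sketch with illustrative lookup properties:

```yaml
name: container
description: The server container of the frontend pods
componentType: Kubernetes Container
properties:
  dynatrace:                  # illustrative lookup properties
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: hipster-shop
      containerName: server
      basePodName: frontend-*
```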
And a pod.yaml manifest like the following:
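```yaml
name: pod
description: The frontend pods
componentType: Kubernetes Pod
properties:
  dynatrace:                  # illustrative lookup properties
    type: CLOUD_APPLICATION_INSTANCE
    kubernetes:
      namespace: hipster-shop
      basePodName: frontend-*
```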
Now create the entities by running:
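```bash
akamas create component workload.yaml frontend
akamas create component container.yaml frontend
akamas create component pod.yaml frontend
```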
Now create an application.yaml manifest like the following:
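A sketch; the Dynatrace service ID is a placeholder:

```yaml
name: webapp
description: The frontend web application
componentType: Web Application
properties:
  dynatrace:
    id: SERVICE-XXXXXXXXXXXXXXXX   # placeholder Dynatrace service entity ID
```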
Notice that the component includes properties that specify how the Dynatrace telemetry provider will look up this component in the Kubernetes cluster.
These properties depend on the telemetry provider you are using. See the reference for the full list of supported providers and their configurations.
Then run:
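```bash
akamas create component application.yaml frontend
```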
Finally, create an hpa.yaml manifest like the following:
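A sketch; check the optimization pack for the exact component type name:

```yaml
name: hpa
description: The HPA controlling the frontend replicas
componentType: Kubernetes HPA   # assumption: adjust to the type name in your pack
```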
The HPA component does not provide any metrics, so we do not need to specify anything about the workload.
Then run:
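```bash
akamas create component hpa.yaml frontend
```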
To optimize a Kubernetes microservice in production, you need to create a workflow that defines how the new configuration recommended by Akamas will be deployed in production.
Let's explore the high-level tasks required in this scenario and the options you have to adapt it to your environment:
Let's now create a workflow.yaml manifest like the following:
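The following sketch extends the usual deploy workflow with a second FileConfigurator task that renders the HPA manifest; operators, hostnames, and paths are assumptions to adapt:

```yaml
name: frontend-optimize
tasks:
  - name: configure deployment
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox            # assumption: files live on the Akamas toolbox
        path: frontend.yaml.templ
      target:
        hostname: toolbox
        path: frontend.yaml
  - name: configure hpa
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox
        path: frontend-hpa.yaml.templ
      target:
        hostname: toolbox
        path: frontend-hpa.yaml
  - name: apply
    operator: Executor
    arguments:
      command: kubectl apply -f frontend.yaml -f frontend-hpa.yaml
      host:
        hostname: toolbox
  - name: verify
    operator: Executor
    arguments:
      command: kubectl rollout status --timeout=5m deployment/frontend -n hipster-shop
      host:
        hostname: toolbox
  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800
```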
Then run:
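```bash
akamas create workflow workflow.yaml
```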
To collect metrics from your target Kubernetes deployment, create a telemetry instance based on your observability setup.
Create a dynatrace.yaml manifest like the following:
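The URL and token are placeholders:

```yaml
provider: Dynatrace
config:
  url: https://your-tenant.live.dynatrace.com   # placeholder tenant URL
  token: YOUR_DYNATRACE_API_TOKEN               # placeholder API token
```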
Then run:
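```bash
akamas create telemetry-instance dynatrace.yaml frontend
```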
Create a prometheus.yaml manifest like the following:
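The address and port are placeholders for your Prometheus endpoint:

```yaml
provider: Prometheus
config:
  address: prometheus.monitoring.svc.cluster.local   # placeholder address
  port: 9090
```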
Then run:
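```bash
akamas create telemetry-instance prometheus.yaml frontend
```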
It's now time to create the Akamas study to achieve your optimization objectives.
Let's explore how the study is designed by going through the main concepts. The complete study manifest is available at the bottom.
You can now create a study.yaml manifest like the following:
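A condensed sketch: the cost formula, constraints, and parameter names (in particular the HPA target utilization) are illustrative and must be adapted:

```yaml
name: Optimize frontend cost with HPA
system: frontend
workflow: frontend-optimize
goal:
  objective: minimize
  function:
    formula: container.cpu_limit + container.memory_limit   # illustrative per-replica cost proxy
  constraints:
    absolute:
      - webapp.requests_response_time <= 500   # illustrative response-time SLO (ms)
      - webapp.requests_error_rate <= 0.02     # illustrative error-rate SLO
parametersSelection:
  - name: container.cpu_limit
  - name: container.memory_limit
  - name: hpa.target_cpu_utilization           # illustrative HPA parameter name
workloadsSelection:
  - name: webapp.requests_throughput
# approval mode, windowing, and safety settings omitted for brevity
```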
Then run:
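```bash
akamas create study study.yaml
```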
You can now follow the live optimization progress and explore the results using the Akamas UI.
Optimizing cost of a Kubernetes microservice while preserving SLOs in production
In this example, you will use Akamas live optimization to minimize the cost of a Kubernetes deployment while preserving application performance and reliability requirements.
In this example, you need:
an Akamas instance
a Kubernetes cluster, with a deployment to be optimized
the kubectl command installed in the Akamas instance, configured to access the target Kubernetes cluster and with privileges to get and update the deployment configuration
a supported telemetry data source (e.g. Prometheus or Dynatrace) configured to collect metrics from the target Kubernetes cluster
This example leverages the following optimization packs:
The system represents the Kubernetes deployment to be optimized (let's call it "frontend"). You can create a system.yaml manifest like this:
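```yaml
# A minimal sketch: name and description are free-form
name: frontend
description: The frontend Kubernetes deployment
```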
Create the new system resource:
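```bash
# assuming the standard Akamas CLI verbs
akamas create system system.yaml
```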
The system will then have two components:
A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits
A Web Application component, which contains service-level metrics like throughput and response time
In this example, we assume the deployment to be optimized is called frontend, with a container named server, and is located within the boutique namespace. We also assume that Dynatrace is used as a telemetry provider.
Create a component-container.yaml manifest like the following:
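A sketch with illustrative Dynatrace lookup properties:

```yaml
name: container
description: The server container of the frontend deployment
componentType: Kubernetes Container
properties:
  dynatrace:                  # illustrative lookup properties
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: frontend-*
```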
Then run:
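```bash
akamas create component component-container.yaml frontend
```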
Now create a component-webapp.yaml manifest like the following:
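A sketch; the service ID is a placeholder:

```yaml
name: webapp
description: The frontend web application
componentType: Web Application
properties:
  dynatrace:
    id: SERVICE-XXXXXXXXXXXXXXXX   # placeholder Dynatrace service entity ID
```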
Then run:
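```bash
akamas create component component-webapp.yaml frontend
```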
The workflow in this example is composed of the following main steps:
Update the Kubernetes deployment manifest with the parameters (CPU and memory limits) recommended by Akamas
Apply the new parameters (kubectl apply)
Wait for the rollout to complete
Sleep for 30 minutes (observation interval)
Create a workflow.yaml manifest like the following:
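A sketch mirroring the steps above (FileConfigurator to render the manifest, Executor for kubectl, Sleep for the observation interval); hostnames and paths are assumptions:

```yaml
name: frontend-optimize
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox          # assumption: files live on the Akamas toolbox
        path: frontend.yaml.templ
      target:
        hostname: toolbox
        path: frontend.yaml
  - name: apply
    operator: Executor
    arguments:
      command: kubectl apply -f frontend.yaml
      host:
        hostname: toolbox
  - name: verify
    operator: Executor
    arguments:
      command: kubectl rollout status --timeout=5m deployment/frontend -n boutique
      host:
        hostname: toolbox
  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800                # 30-minute observation interval
```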
Then run:
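```bash
akamas create workflow workflow.yaml
```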
Create the telemetry.yaml manifest like the following:
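Assuming Dynatrace; the URL and token are placeholders:

```yaml
provider: Dynatrace
config:
  url: https://your-tenant.live.dynatrace.com   # placeholder tenant URL
  token: YOUR_DYNATRACE_API_TOKEN               # placeholder API token
```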
Then run:
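```bash
akamas create telemetry-instance telemetry.yaml frontend
```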
In this live optimization:
the goal is to reduce the cost of the Kubernetes deployment. In this example, the cost is based on the amount of CPU and memory limits (assuming requests = limits).
the approval mode is set to manual; a new recommendation is generated daily
to avoid impacting application performance, constraints are specified on desired response times and error rates
to avoid impacting application reliability, constraints are specified on peak resource usage and out-of-memory kills
the parameters to be tuned are the container CPU and memory limits (we assume requests=limits in the deployment file)
Create a study.yaml manifest like the following:
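A condensed sketch of the study described above; metric names, thresholds, and domains are illustrative:

```yaml
name: Minimize frontend deployment cost
system: frontend
workflow: frontend-optimize
goal:
  objective: minimize
  function:
    formula: container.cpu_limit + container.memory_limit   # cost proxy, requests = limits
  constraints:
    absolute:
      - webapp.requests_response_time <= 500       # illustrative response-time SLO (ms)
      - webapp.requests_error_rate <= 0.02         # illustrative error-rate SLO
      - container.container_cpu_util_max <= 0.8    # illustrative peak-usage guardrails
      - container.container_memory_util_max <= 0.8
parametersSelection:
  - name: container.cpu_limit
  - name: container.memory_limit
workloadsSelection:
  - name: webapp.requests_throughput
# manual approval mode and daily recommendation settings omitted for brevity
```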
Then run:
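```bash
akamas create study study.yaml
```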
You can now follow the live optimization progress and explore the results using the Akamas UI for Live optimizations.
Note: the kubectl commands in the workflows above are executed from the toolbox, an Akamas utility that can be enabled in an Akamas installation on Kubernetes. Make sure that kubectl is configured correctly to connect to your Kubernetes cluster and can update your target deployment. See the reference documentation for more details.