Optimize cost of a Java microservice on Kubernetes while preserving SLOs in production
In this guide, you optimize the cost (or resource footprint) of a Java microservice running on Kubernetes. The study tunes both pod resource settings (CPU and memory requests and limits) and JVM options (max heap size, garbage collection algorithm, etc.) at the same time, while also taking into account your application performance and reliability requirements (SLOs). This optimization happens in production, leveraging Akamas live optimization capabilities.
Prerequisites
an Akamas instance
a Kubernetes cluster, with a Java-based deployment to be optimized
a supported telemetry data source configured to collect metrics from the target Kubernetes cluster (see here for the full list)
a way to apply the configuration changes recommended by Akamas to the target deployment. In this guide, Akamas interacts directly with the Kubernetes APIs via kubectl, so you need a service account with permission to update your deployment (see below for other integration options)
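As a sketch, a dedicated service account with just enough RBAC permissions to update the target deployment could look like the following. The names (akamas-optimizer, deployment-updater) are assumptions; adapt them and the namespace to your cluster:

```yaml
# Hypothetical service account and RBAC granting permission
# to update deployments in the target namespace only.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: akamas-optimizer
  namespace: boutique
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployment-updater
  namespace: boutique
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: akamas-optimizer-binding
  namespace: boutique
subjects:
  - kind: ServiceAccount
    name: akamas-optimizer
    namespace: boutique
roleRef:
  kind: Role
  name: deployment-updater
  apiGroup: rbac.authorization.k8s.io
```

Scoping the Role to a single namespace keeps the blast radius of the automation limited to the deployment being optimized.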
Optimization setup
In this guide, we assume the following setup:
the Kubernetes deployment to be optimized is called adservice (in the boutique namespace)
in the deployment, there is a container named server, where the application JVM runs
Dynatrace is used as an observability tool
Let's set up the Akamas optimization for this use case.
System
For this optimization, you need the following components to model the adservice tech stack:
A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits (from the Kubernetes optimization pack)
A Java OpenJDK component, which contains JVM-level metrics like heap memory usage and parameters to be tuned like the garbage collector algorithm (from the Java OpenJDK optimization pack)
A Web Application component, which contains service-level metrics like throughput and response time of the microservice (from the Web application optimization pack)
Let's start by creating the system, which represents the Kubernetes deployment to be optimized. To create it, write a system.yaml manifest like this:
name: adservice
description: The Adservice deployment
Then run:
akamas create system system.yaml
Now create a component-container.yaml manifest like the following:
name: server
description: Kubernetes container in the adservice deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: adservice-*
Then run:
akamas create component component-container.yaml adservice
Next, create a component-jvm.yaml manifest like the following:
name: jvm
description: JVM of the adservice deployment
componentType: java-openjdk-17
properties:
  dynatrace:
    type: PROCESS
    tags:
      akamas: adservice-jvm
Then run:
akamas create component component-jvm.yaml adservice
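The tags filter above assumes the JVM process carries the tag akamas: adservice-jvm in Dynatrace. One possible way to apply such a tag (an assumption; your Dynatrace setup may use auto-tagging rules instead) is the DT_TAGS environment variable on the container, which OneAgent picks up as a process tag:

```yaml
# Hypothetical snippet for the adservice container spec:
# OneAgent applies the key=value pair as a process tag in Dynatrace.
env:
  - name: DT_TAGS
    value: "akamas=adservice-jvm"
```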
Now create a component-webapp.yaml manifest like the following:
name: webapp
description: The HTTP service of the adservice deployment
componentType: Web Application
properties:
  dynatrace:
    type: SERVICE
    name: adservice
Then run:
akamas create component component-webapp.yaml adservice
Workflow
To optimize a Kubernetes microservice in production, you need a workflow that defines how the new configuration recommended by Akamas is deployed to production.
The workflow below covers the high-level tasks required in this scenario; each task can be adapted to your environment.
Create a workflow.yaml manifest like the following:
name: adservice
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
        path: adservice.yaml.templ
      target:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
        path: adservice.yaml
  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
      command: kubectl apply -f adservice.yaml
  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
      command: kubectl rollout status --timeout=5m deployment/adservice -n boutique
  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800
In the configure task, Akamas applies the container CPU and memory limits and the JVM options recommended by the Akamas AI to the deployment file. To do that, copy your deployment manifest to a template file (here called adservice.yaml.templ), and substitute the current values with Akamas parameter placeholders as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: adservice
spec:
  selector:
    matchLabels:
      app: adservice
  replicas: 1
  template:
    metadata:
      labels:
        app: adservice
    spec:
      containers:
        - name: server
          image: gcr.io/google-samples/microservices-demo/adservice:v0.3.8
          ports:
            - containerPort: 9555
          env:
            - name: PORT
              value: "9555"
            - name: JAVA_OPTS
              value: "${jvm.*}"
          resources:
            limits:
              cpu: ${server.cpu_limit}
              memory: ${server.memory_limit}
Whenever a configuration recommended by Akamas is applied, the configure task creates the actual adservice.yaml deployment file by substituting the parameter placeholders with the values recommended by the Akamas AI; the new deployment is then rolled out via kubectl apply.
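To illustrate what happens under the hood, the placeholder substitution performed by the configure task can be sketched with sed. This is a simplified stand-in, not the actual FileConfigurator implementation, and the recommended values 500m and 1024Mi are made up:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Minimal template with Akamas-style placeholders (hypothetical excerpt)
cat > adservice.yaml.templ <<'EOF'
resources:
  limits:
    cpu: ${server.cpu_limit}
    memory: ${server.memory_limit}
EOF

# Substitute the placeholders with made-up recommended values,
# mimicking what the configure task does with real Akamas output
sed -e 's/\${server.cpu_limit}/500m/' \
    -e 's/\${server.memory_limit}/1024Mi/' \
    adservice.yaml.templ > adservice.yaml

cat adservice.yaml
```

Running this leaves adservice.yaml with concrete resource values and no remaining placeholders, ready for kubectl apply.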
To create the workflow, run:
akamas create workflow workflow.yaml
Telemetry
Create a telemetry instance based on your observability setup to collect your target Kubernetes deployment metrics.
Create a telemetry.yaml manifest like the following:
provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>
Then run:
akamas create telemetry-instance telemetry.yaml adservice
Study
It's time to create the Akamas study to achieve your optimization objectives.
Let's explore how the study is designed by going through the main concepts. The complete study manifest is available at the bottom.
You can now create a study.yaml manifest like the following:
name: adservice - optimize costs tuning K8s and JVM
system: adservice
workflow: adservice
goal:
  name: Cost
  objective: minimize
  function:
    formula: ((server.container_cpu_limit)/1000)*29 + ((((server.container_memory_limit)/1024)/1024)/1024)*3
  constraints:
    absolute:
      - name: Application response time degradation
        formula: web_application.requests_response_time:max <= 5
      - name: Application error rate degradation
        formula: web_application.requests_error_rate:max <= 0.02
      - name: Container CPU saturation
        formula: server.container_cpu_util_max:p95 < 1
      - name: Container memory saturation
        formula: server.container_memory_util_max:max < 1
      - name: Container out-of-memory
        formula: server.container_restarts == 0
      - name: JVM heap saturation
        formula: jvm.jvm_gc_time:max < 0.05
windowing:
  type: trim
  trim: [2m, 0s]
  task: observe
parametersSelection:
  - name: server.cpu_request
    domain: [10, 181]
  - name: server.cpu_limit
    domain: [10, 181]
  - name: server.memory_request
    domain: [16, 2048]
  - name: jvm.jvm_maxHeapSize
    domain: [16, 1024]
  - name: jvm.jvm_gcType
parameterConstraints:
  - name: JVM off-heap safety buffer
    formula: jvm.jvm_maxHeapSize + 1000 < server.memory_limit
  - name: CPU limit at most 2x of requests
    formula: server.cpu_limit <= server.cpu_request * 2
workloadsSelection:
  - name: web_application.requests_throughput
numberOfTrials: 48
steps:
  - name: baseline
    type: baseline
    values:
      server.cpu_limit: 1000
      server.memory_limit: 2048
      jvm.jvm_maxHeapSize: 1024
      jvm.jvm_gcType: Serial
  - name: optimize
    type: optimize
    numberOfExperiments: 21
Then run:
akamas create study study.yaml
You can now follow the live optimization progress and explore the results using the Akamas UI.
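To make the cost goal concrete, here is the study's cost formula evaluated by hand on the baseline configuration, assuming a CPU limit of 1000 millicores and a memory limit of 2 GiB, with the unit prices of 29 per CPU and 3 per GB used in the formula:

```shell
# Evaluate the study's cost formula for the baseline configuration
awk 'BEGIN {
  cpu_millicores = 1000                  # baseline server.cpu_limit
  mem_bytes = 2 * 1024 * 1024 * 1024     # baseline memory limit, 2 GiB in bytes
  cost = (cpu_millicores / 1000) * 29 + (mem_bytes / 1024 / 1024 / 1024) * 3
  printf "monthly cost: %.2f\n", cost    # 1*29 + 2*3 = 35.00
}'
```

Any configuration that Akamas finds with a lower formula value, while keeping all the SLO constraints satisfied, represents a real cost saving over this baseline.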
Artifact templates
To quickly set up this optimization, download the Akamas template manifests and update the values file to match your needs. Then, create your optimization using the Akamas scaffolding.