Optimizing a Kubernetes application

Last updated 1 year ago

In this example, we’ll optimize Online Boutique, a demo e-commerce application built on microservices, by tuning the resources allocated to a selection of pods. This is a common use case where we want to minimize the cost of running an application without violating its SLOs.

Notice: all the required artifacts are published in this public repository.

Environment setup

The test environment includes the following instances:

  • Akamas: the instance running Akamas.

  • Cluster: an instance hosting a Minikube cluster.

You can configure the Minikube cluster using the scripts provided in the public repository by running the following command:

kubectl apply -f kubernetes-online-boutique/kube/prometheus.yaml

Telemetry Infrastructure setup

To gather metrics about the application we will use Prometheus. It will be automatically configured by applying the artifacts in the repository with the following command:

kubectl apply -f kubernetes-online-boutique/kube/

Application and Test tool

The targeted system is Online Boutique, a microservice-based demo application. In the same namespace, a deployment running the load generator will stress the boutique and forward the performance metrics to Prometheus.

To configure the application and the load generator on your (Minikube) cluster, apply the definitions provided in the public repository by running the following command:

kubectl apply -f kubernetes-online-boutique/kube/

Optimization setup

In this section, we will guide you through the steps required to set up the optimization on Akamas.

Notice: the artifacts to create the Akamas entities can be found in the public repository, under the akamas directory.

System

System Online Boutique

Here’s the definition of the system containing our components and telemetry-instances for this example:

name: Online Boutique
description: The Online Boutique by Google

To create the system run the following command:

akamas create system system.yaml

Component online_boutique

Here’s the definition of the component:

name: online_boutique
description: The Online Boutique application
componentType: Web Application
properties:
  prometheus:
    instance: .*
    job: .*
    namespace: akamas-demo
    container: server|redis

To create the component in the system run the following command:

akamas create component application.yaml 'Online Boutique'

Component frontend and productcatalogservice

Here’s their definition:

name: frontend
description: The frontend of the online boutique by Google
componentType: Kubernetes Container
properties:
  prometheus:
    job: .*
    instance: .*
    name: .*
    pod: ak-frontend.*
    container: server

name: productcatalogservice
description: The productcatalogservice of the online boutique by Google
componentType: Kubernetes Container
properties:
  prometheus:
    job: .*
    instance: .*
    name: .*
    pod: ak-productcatalogservice.*
    container: server

To create the components in the system, run the following commands:

akamas create component frontend.yaml 'Online Boutique'
akamas create component productcatalogservice.yaml 'Online Boutique'
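
The values under the prometheus property, such as pod: ak-frontend.*, are regular expressions. As a rough sketch, assuming they are matched against the whole label value (the way Prometheus regex matchers behave), the following helper shows which pod names a selector would pick up. The pod names used here are hypothetical.

```shell
#!/bin/bash
# Sketch only: emulate a fully anchored regex match between a selector
# taken from the component definition and a label value.
matches_selector() {
  # $1 = selector regex, $2 = label value (e.g. a pod or container name)
  printf '%s\n' "$2" | grep -Eq "^($1)\$"
}

# Hypothetical pod names, for illustration only.
for pod in ak-frontend-6c9f7d-x2k4p ak-productcatalogservice-5b8d-q9zzt; do
  if matches_selector 'ak-frontend.*' "$pod"; then
    echo "$pod: selected by the frontend component"
  else
    echo "$pod: not selected by the frontend component"
  fi
done
```

The same reasoning applies to container: server|redis in the online_boutique component, which selects containers named either server or redis.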

Workflow

The workflow is divided into the following steps:

  • Create the YAML artifacts with the updated resource limits for the tuned containers.

  • Apply the updated definitions to the cluster.

  • Wait for the rollout to complete.

  • Start the load generator.

  • Let the test run for a fixed amount of time.

  • Stop the test and reset the load generator.

The following is the definition of the workflow:

name: boutique
tasks:
  - name: Configure Online Boutique
    operator: FileConfigurator
    arguments:
      source:
        hostname: CLUSTER_INSTANCE_IP
        username: akamas
        password: akamas
        path: boutique.yaml.templ
      target:
        hostname: CLUSTER_INSTANCE_IP
        username: akamas
        password: akamas
        path: boutique.yaml

  - name: Apply new configuration to the Online Boutique
    operator: Executor
    arguments:
      host:
        hostname: CLUSTER_INSTANCE_IP
        username: akamas
        password: akamas
      command: kubectl apply -f boutique.yaml

  - name: Check Online Boutique is up
    operator: Executor
    arguments:
      retries: 0
      host:
        hostname: CLUSTER_INSTANCE_IP
        username: akamas
        password: akamas
      command: kubectl wait --for=condition=available deploy/ak-frontend deploy/ak-productcatalogservice --timeout=30s

  - name: Start Locust Test
    operator: Executor
    arguments:
      host:
        hostname: CLUSTER_INSTANCE_IP
        username: akamas
        password: akamas
      command: bash load-test.sh start

  - name: Test
    operator: Sleep
    arguments:
      seconds: 150

  - name: Stop Locust test
    operator: Executor
    arguments:
      host:
        hostname: CLUSTER_INSTANCE_IP
        username: akamas
        password: akamas
      command: bash load-test.sh stop

To better illustrate the process, here is a snippet of the template file used to update the resource limits for the frontend deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ak-frontend
  namespace: akamas-demo
spec:
  selector:
    matchLabels:
      app: ak-frontend
  template:
    metadata:
      labels:
        app: ak-frontend
    # other definitions...
    spec:
      containers:
        - name: server
          image: gcr.io/google-samples/microservices-demo/frontend:v0.2.2
          # other definitions...
          resources:
            requests:
              cpu: ${frontend.cpu_limit}
              memory: ${frontend.memory_limit}
            limits:
              cpu: ${frontend.cpu_limit}
              memory: ${frontend.memory_limit}
# other definitions...
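
To make the placeholder mechanism concrete, the sketch below mimics with sed the kind of substitution the FileConfigurator task performs on the template. It is not how Akamas implements the operator, and the rendered values (300m, 256Mi) are assumptions chosen for illustration.

```shell
#!/bin/bash
# Sketch only: replace the frontend placeholders in a template read from stdin.
render_template() {
  # $1 = value for ${frontend.cpu_limit}, $2 = value for ${frontend.memory_limit}
  sed -e 's/[$]{frontend[.]cpu_limit}/'"$1"'/g' \
      -e 's/[$]{frontend[.]memory_limit}/'"$2"'/g'
}

# A two-line excerpt of the resources section shown above.
printf 'cpu: ${frontend.cpu_limit}\nmemory: ${frontend.memory_limit}\n' \
  | render_template 300m 256Mi
```

Because the template uses the same placeholders for both requests and limits, each experiment changes the two together.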

The following are, respectively, the complete template file (boutique.yaml.templ) and the script used to start and stop the load generator (load-test.sh):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ak-frontend
  namespace: akamas-demo
spec:
  selector:
    matchLabels:
      app: ak-frontend
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: ak-frontend
    spec:
      serviceAccountName: default
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: akamas/node
                    operator: In
                    values:
                      - akamas
      containers:
        - name: server
          image: gcr.io/google-samples/microservices-demo/frontend:v0.2.2
          ports:
            - containerPort: 8080
          readinessProbe:
            initialDelaySeconds: 10
            httpGet:
              path: "/_healthz"
              port: 8080
              httpHeaders:
                - name: "Cookie"
                  value: "shop_session-id=x-readiness-probe"
          livenessProbe:
            initialDelaySeconds: 10
            httpGet:
              path: "/_healthz"
              port: 8080
              httpHeaders:
                - name: "Cookie"
                  value: "shop_session-id=x-liveness-probe"
          env:
            - name: PORT
              value: "8080"
            - name: PRODUCT_CATALOG_SERVICE_ADDR
              value: "ak-productcatalogservice:3550"
            - name: CURRENCY_SERVICE_ADDR
              value: "ak-currencyservice:7000"
            - name: CART_SERVICE_ADDR
              value: "ak-cartservice:7070"
            - name: RECOMMENDATION_SERVICE_ADDR
              value: "ak-recommendationservice:8080"
            - name: SHIPPING_SERVICE_ADDR
              value: "ak-shippingservice:50051"
            - name: CHECKOUT_SERVICE_ADDR
              value: "ak-checkoutservice:5050"
            - name: AD_SERVICE_ADDR
              value: "ak-adservice:9555"
            - name: ENV_PLATFORM
              value: "aws"
            - name: DISABLE_TRACING
              value: "1"
            - name: DISABLE_PROFILER
              value: "1"
          resources:
            requests:
              cpu: ${frontend.cpu_limit}
              memory: ${frontend.memory_limit}
            limits:
              cpu: ${frontend.cpu_limit}
              memory: ${frontend.memory_limit}

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ak-productcatalogservice
  namespace: akamas-demo
spec:
  selector:
    matchLabels:
      app: ak-productcatalogservice
  replicas: 1
  template:
    metadata:
      labels:
        app: ak-productcatalogservice
    spec:
      serviceAccountName: default
      terminationGracePeriodSeconds: 5
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 1
              preference:
                matchExpressions:
                  - key: akamas/node
                    operator: In
                    values:
                      - akamas
      containers:
        - name: server
          image: gcr.io/google-samples/microservices-demo/productcatalogservice:v0.2.2
          ports:
            - containerPort: 3550
          env:
            - name: PORT
              value: "3550"
            - name: DISABLE_STATS
              value: "1"
            - name: DISABLE_TRACING
              value: "1"
            - name: DISABLE_PROFILER
              value: "1"
          readinessProbe:
            exec:
              command: ["/bin/grpc_health_probe", "-addr=:3550"]
          livenessProbe:
            exec:
              command: ["/bin/grpc_health_probe", "-addr=:3550"]
          resources:
            requests:
              cpu: ${productcatalogservice.cpu_limit}
              memory: ${productcatalogservice.memory_limit}
            limits:
              cpu: ${productcatalogservice.cpu_limit}
              memory: ${productcatalogservice.memory_limit}
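
#!/bin/bash
# Start or stop the Locust load test through the load generator's web API.

ACTION=$1

# Resolve the URL of the load generator's web UI exposed by Minikube.
LOCUST_ENDPOINT="$(minikube service -n akamas-demo ak-loadgenerator | awk '/web-ui.*http/ {print $8}')"

case "$ACTION" in
  start)
    # Start a swarm of 100 users, spawning 3 users per second.
    curl -X POST -d 'user_count=100' -d 'spawn_rate=3' -d 'host=http://ak-frontend:80' "${LOCUST_ENDPOINT}/swarm"
    ;;
  stop)
    # Stop the running test, then reset the collected statistics.
    curl "${LOCUST_ENDPOINT}/stop"
    curl "${LOCUST_ENDPOINT}/stats/reset"
    ;;
  *)
    echo "Unrecognized option '${ACTION}'"
    exit 1
    ;;
esac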
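
The script locates the Locust endpoint by filtering the table printed by minikube service. As a small illustration of that awk filter, the sketch below applies it to a single sample row; the row layout and the URL are assumptions, so check them against the output of your own Minikube version.

```shell
#!/bin/bash
# Sketch only: the same awk filter used by the load-test script, applied to a
# hypothetical row of the `minikube service` output table.
extract_endpoint() {
  # Keep rows that mention the web-ui port and an http URL; print field 8.
  awk '/web-ui.*http/ {print $8}'
}

# Hypothetical row: with whitespace-separated fields, the URL is the 8th one.
sample='| akamas-demo | ak-loadgenerator | web-ui/8089 | http://192.168.49.2:30080 |'
printf '%s\n' "$sample" | extract_endpoint
```

If the extracted value is empty, the curl calls in the script have no host to contact, so it is worth checking this step first when a test does not start.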

Telemetry

With the definition of the telemetry instance shown below, we import the end-user performance metrics provided by the load-generator, along with a custom definition of "cost" given by a weighted sum of the CPU and memory allocated for the pods in the cluster:

provider: Prometheus
config:
  address: CLUSTER_IP
  port: PROM_PORT
metrics:
  - metric: users
    datasourceMetric: "locust_users"
  - metric: transactions_throughput
    datasourceMetric: 'rate(locust_requests_num_requests{name="Aggregated"}[30s]) - rate(locust_requests_num_failures{name="Aggregated"}[30s])'
  - metric: transactions_error_throughput
    datasourceMetric: 'rate(locust_requests_num_failures{name="Aggregated"}[30s])'
  - metric: transactions_error_rate
    datasourceMetric: "locust_requests_fail_ratio"
  - metric: transactions_response_time
    datasourceMetric: 'locust_requests_avg_response_time{name="Aggregated"}'
  - metric: transactions_response_time_p50
    datasourceMetric: 'locust_requests_current_response_time_percentile_50'
  - metric: transactions_response_time_p95
    datasourceMetric: 'locust_requests_current_response_time_percentile_95'

  - metric: cost
    datasourceMetric: 'sum(kube_pod_container_resource_requests{resource="cpu" %FILTERS%})*29 + sum(kube_pod_container_resource_requests{resource="memory" %FILTERS%})/1024/1024/1024*3.2'
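
As a worked example of the cost formula, the sketch below evaluates it by hand for a set of containers: CPU requests in cores are weighted by 29 and memory requests in GiB by 3.2. It assumes the parameters are expressed in millicores and MiB, and it only counts the containers passed as arguments, whereas the real query sums every container selected by %FILTERS%.

```shell
#!/bin/bash
# Sketch only: cost = cores * 29 + GiB * 3.2, summed over the given containers.
# Arguments are pairs: <cpu_millicores> <memory_MiB> [<cpu_millicores> <memory_MiB> ...]
estimate_cost() {
  local cpu_m=0 mem_mi=0
  while [ "$#" -ge 2 ]; do
    cpu_m=$((cpu_m + $1))
    mem_mi=$((mem_mi + $2))
    shift 2
  done
  # Convert millicores to cores and MiB to GiB before applying the weights.
  awk -v c="$cpu_m" -v m="$mem_mi" 'BEGIN { printf "%.2f\n", c / 1000 * 29 + m / 1024 * 3.2 }'
}

# Two containers at 300 millicores and 256 MiB each:
# 0.6 cores * 29 + 0.5 GiB * 3.2 = 17.4 + 1.6
estimate_cost 300 256 300 256   # prints 19.00
```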

To create the telemetry instance execute the following command:

akamas create telemetry-instance prometheus.yml 'Online Boutique'

Study

With this study, we want to minimize the "cost" of running the application, which, according to the definition given in the previous section, means reducing the resources allocated to the tuned pods in the cluster. At the same time, we want the application to stay within the expected SLO, which we enforce by defining constraints on the response time and error rate recorded by the load generator.
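
The two constraints can be read as a simple predicate on the measured metrics. The following sketch uses the thresholds from the study definition below (500 for the response time, 0.02 for the error rate); the measured values passed to it are hypothetical, and the actual evaluation is performed by Akamas, not by this script.

```shell
#!/bin/bash
# Sketch only: succeed when a measurement satisfies both SLO constraints.
meets_slo() {
  # $1 = transactions_response_time, $2 = transactions_error_rate
  awk -v rt="$1" -v er="$2" 'BEGIN { exit !(rt <= 500 && er <= 0.02) }'
}

# Hypothetical measurement for one experiment.
if meets_slo 320 0.01; then
  echo "feasible"
else
  echo "violates SLO"
fi
```

A configuration with a lower cost is only an improvement if it also passes this kind of check.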

name: Minimize Kubernetes Online Boutique cost while matching SLOs
system: Online Boutique
workflow: boutique

goal:
  objective: minimize
  constraints:
    absolute:
      - name: response_time
        formula: online_boutique.transactions_response_time <= 500
      - name: error_rate
        formula: online_boutique.transactions_error_rate <= 0.02
  function:
    formula: online_boutique.cost

windowing:
  type: trim
  trim: [1m, 30s]
  task: Test

metricsSelection:
  - online_boutique.cost
  - online_boutique.transactions_throughput
  - online_boutique.transactions_error_rate
  - online_boutique.transactions_response_time
  - online_boutique.transactions_response_time_p95
  - online_boutique.users
  - frontend.container_cpu_used
  - frontend.container_cpu_util
  - frontend.container_cpu_limit
  - frontend.container_cpu_throttle_time
  - frontend.container_memory_used
  - frontend.container_memory_util
  - frontend.container_memory_limit
  - productcatalogservice.container_cpu_used
  - productcatalogservice.container_cpu_util
  - productcatalogservice.container_cpu_limit
  - productcatalogservice.container_cpu_throttle_time
  - productcatalogservice.container_memory_used
  - productcatalogservice.container_memory_util
  - productcatalogservice.container_memory_limit

parametersSelection:
  - name: frontend.cpu_limit
    domain: [100, 300]
  - name: frontend.memory_limit
    domain: [64, 512]
  - name: productcatalogservice.cpu_limit
    domain: [100, 500]
  - name: productcatalogservice.memory_limit
    domain: [64, 512]

steps:
  - name: baseline
    type: baseline
    values:
      frontend.cpu_limit: 300
      frontend.memory_limit: 256
      productcatalogservice.cpu_limit: 300
      productcatalogservice.memory_limit: 256

  - name: optimize
    type: optimize
    numberOfExperiments: 50
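
As a reading aid for the windowing section, the sketch below computes how much of the Test task is left for measurement. It assumes that trim: [1m, 30s] discards the first minute and the last 30 seconds of the task named in the policy, and that the task lasts the 150 seconds configured for the Sleep operator in the workflow.

```shell
#!/bin/bash
# Sketch only: seconds of data left after trimming the start and end of a task.
to_seconds() {
  # Accepts durations such as 1m, 30s, or a bare number of seconds.
  case "$1" in
    *m) echo $(( ${1%m} * 60 )) ;;
    *s) echo "${1%s}" ;;
    *)  echo "$1" ;;
  esac
}

window_seconds() {
  # $1 = task duration in seconds, $2 = trim at start, $3 = trim at end
  echo $(( $1 - $(to_seconds "$2") - $(to_seconds "$3") ))
}

window_seconds 150 1m 30s   # prints 60
```

Under these assumptions, each experiment is scored on roughly one minute of steady-state load, so a longer Sleep task gives the study more data per experiment.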

To create and run the study execute the following commands:

akamas create study study.yaml
akamas start study 'Minimize Kubernetes Online Boutique cost while matching SLOs'

If you have not installed the Kubernetes optimization pack yet, take a look at the Kubernetes optimization pack page to proceed with the installation.

We’ll use a component of type Web Application to represent the Online Boutique application at a high level. To identify the related Prometheus metrics, the component requires the prometheus property for the telemetry service, detailed in the Telemetry section.

The public repository contains the definition of all the services that compose Online Boutique. In this guide, for the sake of simplicity, we’ll only tune the resources of the containers in the frontend and the product-catalog pods, defined as components of type Kubernetes Container.

If you have not installed the Prometheus telemetry provider yet, take a look at the Prometheus provider page to proceed with the installation.
