
Introduction

A quick introduction to Akamas

Akamas is the AI-powered optimization platform designed to maximize service quality and cost efficiency without compromising on application performance. Akamas supports both production environments, under live and dynamic workloads, and test/pre-production environments, against any what-if scenario and workload.

Thanks to Akamas, performance engineers, DevOps, CloudOps, FinOps, and SRE teams can keep complex applications, such as Kubernetes microservices applications, optimized to avoid unnecessary costs and performance risks.

Akamas Optimization platform

The Akamas optimization platform leverages patented AI techniques that can autonomously identify optimal full-stack configurations driven by any custom-defined goals and constraints (SLOs), without human intervention, agents, or code or byte-code changes.

Akamas optimal configurations can be applied i) under human approval (human-in-the-loop mode), ii) automatically, as a continuous optimization step in a CI/CD pipeline (in-the-pipe), or iii) autonomously by Akamas (autopilot).

Akamas coverage

Akamas can optimize any system with respect to any set of parameters chosen from the application, middleware, database, cloud, and any other underlying layers.

Akamas provides dozens of out-of-the-box Optimization Packs available for key technologies such as JVM, Go, Kubernetes, Docker, Oracle, MongoDB, ElasticSearch, PostgreSQL, Spark, AWS EC2 and Lambda, and more. Each Optimization Pack provides parameters, relationships, and metrics that accelerate the optimization process setup and support company-wide best practices. Custom Optimization Packs can be easily created without any coding.

The following figure is illustrative of Akamas coverage for both managed technologies and integrated components of the ecosystem.

Akamas integrations

Akamas can integrate with any ecosystem thanks to out-of-the-box and custom integrations with the following components:

  • telemetry & monitoring tools and other sources of KPIs and cost data, such as Dynatrace, Prometheus, CloudWatch, and CSV files

  • configuration management tools, repositories and interfaces to apply configurations, such as Ansible, OpenShift, and Git

  • value stream delivery tools to support a continuous optimization process, such as Jenkins, Dynatrace Cloud Automation, and GitLab

  • load testing tools to generate simulated workloads in test/pre-production, such as LoadRunner, NeoLoad, and JMeter

Akamas has been designed around Infrastructure-as-Code (IaC) and DevOps principles. Thanks to a comprehensive set of APIs and integration mechanisms, it is possible to extend the Akamas optimization platform to manage any system and integrate with any ecosystem.

Use Cases

Akamas optimization platform supports a variety of use cases, including:

  • Improve Service Quality: optimize application performance (e.g. maximize throughput, minimize response time and job execution time) and stability (lower fluctuations and peaks);

  • Increase Business Agility: identify resource bottlenecks in early stages of the delivery cycle, avoid delays due to manual remediations - release higher quality services and reduce production incidents;

  • Increase Service Resilience: improve service resilience under higher workloads (e.g. expected business growth) or failure scenarios identified by chaos engineering practices - improve SRE practice;

  • Reduce IT Cost / Cloud Bill: reduce on-premise infrastructure cost and cloud bills due to resource over-provisioning - improve cost efficiency of Kubernetes microservices applications;

  • Optimize Cloud Migration: safely migrate on-premise applications to cloud environments for optimal cost efficiency and evaluate options to migrate to managed services (e.g. AWS Fargate);

  • Improve Operational Efficiency: save engineering time spent on manual tuning tasks and enable Performance Engineering teams to do more in less time (and with less external consulting).

Home

Getting started with Akamas

  • provides a very first introduction to AI-powered optimization

  • covers Akamas licensing, deployment, and security topics

  • describes Akamas maintenance and support services

This guide provides the preliminary knowledge required to purchase, implement, and use Akamas.

User personas: All roles

Installing Akamas

  • describes the Akamas architecture

  • provides the hardware, software, and network prerequisites

  • describes the steps to install an Akamas Server and CLI

This guide provides the knowledge required to install and manage an Akamas installation.

User personas: Akamas Admin

Using Akamas

  • describes the Akamas optimization process and methodology

  • provides guidelines for optimizing some specific technologies

  • provides examples of optimization studies

This guide provides the methodology to define an optimization process and the knowledge to leverage Akamas.

User personas: Analyst / Practitioner teams

Integrating Akamas

  • describes how to integrate Akamas with the telemetry providers and configuration management tools

  • describes how to integrate Akamas with load testing tools

  • describes how to integrate Akamas with CI/CD tools

This guide provides the knowledge required to integrate Akamas with the ecosystem.

User personas: Akamas Admin, DevOps team

Akamas Reference

  • provides a glossary of Akamas key concepts with references to construct templates and commands

  • provides a reference to Akamas construct templates

  • provides a reference to Akamas command-line commands

  • describes Akamas optimization packs and telemetry providers

User personas: Akamas Admin, DevOps team, Analyst / Practitioner teams

Knowledge Base

  • describes how to set up a test environment for experimenting with Akamas

  • describes how to apply the Akamas approach to the optimization of some real-world cases

  • provides examples of Akamas templates and commands for the real-world cases

User personas: Analyst / Practitioner teams

Insights for Kubernetes

What is Insights

Insights is a new Akamas capability that helps SREs, platform engineers, developers, and FinOps teams uncover hidden cost inefficiencies and reliability risks in their Kubernetes clusters and applications.

Insights provides actionable recommendations to optimize your Kubernetes environment quickly and easily, without requiring setup effort or specialized skills.

Why Insights

Achieving reliable and cost-efficient Kubernetes clusters and applications is easier said than done. The untold reality is that most Kubernetes clusters are massively over-provisioned, and at the same time, applications suffer reliability issues.

Insights analyzes your entire Kubernetes environment and provides:

  • Clear visibility into optimization opportunities across all clusters.

  • Estimated impact of the optimization, e.g. achievable savings.

  • Prioritized, safe recommendations for both infrastructure and application configurations.

Why Insights is different

  • No agents required: no setup time and no security checks required.

  • Full-stack optimization approach: while current Kubernetes optimization tools just consider pod CPU and memory resources, Insights goes deeper inside the pod and optimizes the application runtime, such as the JVM for Java applications or V8 for Node.js applications. This is unique in the industry.

  • No effort required: Insights identifies optimization opportunities and provides recommendations with no effort required and no deep Kubernetes or application runtime skills needed.

  • Designed with safety in mind: recommendations are full-stack and consider the application running within the pod. This avoids reliability risks such as out-of-memory errors or CPU throttling, hence recommendations are trusted by development teams.

  • Best practices built-in: provides not only recommendations but also best practices your teams can use to avoid reliability incidents and run highly efficient Kubernetes environments.

How Insights works

  1. Connect Insights with your Kubernetes observability solution: Insights collects metrics from your existing observability tools. See the FAQ for the list of supported tools.

  2. Insights gathers the metrics history of your Kubernetes clusters: see below for more details about which data is collected.

  3. Insights analyzes the collected data using its full-stack, application-aware recommendation engines and knowledge base, identifying opportunities to optimize efficiency and reliability. Recommendations are generated considering clusters, workloads, and application runtimes like the JVM.

  4. Insights shows the identified cost-saving opportunities and reliability issues, plus recommendations to improve Kubernetes efficiency and reliability.

Example screenshot

Insights summary dashboard showing optimization opportunities across all clusters, and a recommendation to optimize the pod resources and JVM memory for a Java application.

Integration requirements

Insights collects data leveraging the observability tool you are already using to monitor your Kubernetes environment. No agent needs to be installed on your clusters.

Account credentials

Insights simply needs a read-only account to connect and extract data from your observability tool.

Type of collected data

Insights collects technical metrics and configuration information only (see below for details). No PII is collected.

Metrics collected

Insights analyzes and provides recommendations to optimize the full Kubernetes stack.

To do so, it requires access to the following metrics:

Kubernetes cluster

Metrics and configuration information related to the cluster, nodes, and cluster autoscalers.

Examples: cluster CPU/memory requests, limits, and usage; node CPU/memory requests, limits, and usage.

Kubernetes workloads

Metrics and configuration information related to workloads, pods & containers, HPA, namespaces, and resource quotas.

Examples: pod CPU/memory requests, limits, and usage; HPA replica count; namespace CPU/memory requests, limits, and usage.

Application runtime

Metrics and configuration information related to the runtime powering the application: the Java virtual machine (JVM), and Node.js V8 (planned).

Examples: JVM heap size and usage; JVM garbage collection; JVM configuration.
Not all metrics are mandatory!

We recommend feeding Insights with all the mentioned layers for best results. However, not all layers are mandatory. In particular, application runtime metrics are used by Insights to optimize your applications for maximum reliability and efficiency; if application runtime metrics are not available in your observability tool, Insights will still provide technology-agnostic recommendations.

Getting started

Insights is in beta and will be generally available (GA) soon. Try it out and give us your feedback!

All of this comes easy, with no skills or effort required for setup, as there are no agents to be installed. For more information, read our launch blog. Request your access here.

Frequently Asked Questions

Do I need to install anything in my cluster?
No — Akamas Insights is agentless. It leverages metrics already collected by your Kubernetes observability tool.

Which observability tools are supported?
Observability tools currently supported are:

  • Dynatrace SaaS

  • Datadog

  • Grafana Cloud (planned)

We're adding support for more solutions; please reach out to us if your solution is not listed here.

What is the deployment model?
Insights is a SaaS-based solution.

Will this modify workloads?
No — Insights is read-only and does not modify your workloads. You can inspect the recommendations and apply them manually. Support for automation is planned.

Can I use Insights with multiple clusters?
Yes — Insights supports multi-cluster views and analysis.

Getting started

This guide introduces Akamas and covers various fundamental topics such as licensing and deployment models, security topics, and maintenance & support services.

Deployment

Akamas is an on-premise product running on a dedicated machine within the customer environment:

  • on a virtual or physical machine in your data center

  • on a virtual machine running on a cloud, managed by any cloud provider (e.g. AWS EC2)

  • on your own laptop

It is recommended to read this guide before moving to other guides on how to install, integrate, and use Akamas. The Glossary section of the Reference guide can help in reviewing Akamas key concepts.

Akamas also provides a Free Trial option, which can be requested here.

Customer Support Services

Akamas Customer Support Services are delivered by Akamas support engineers, also called Support Agents, who will work remotely with Customer to provide a temporary remedy for the incident and, ultimately, a permanent resolution. Akamas Support Agents automatically escalate issues to the appropriate technical group within Akamas and notify Customers of any relevant progress. Akamas provides Customers with the ability to escalate issues when appropriate.

Please notice that Customer Support services are not to be considered as alternatives to product documentation and training, or to professional and consulting services, so adequate knowledge of Akamas products is assumed when interacting with Akamas Customer Support. Thus, during the resolution of a reported issue Support Agents may redirect Customer to training or professional services (that are not part of the scope of this service).

Support levels for Customer Support Services

Akamas Customer Support Services provides different standard levels of support. Please verify the level of support specified in the contract in place with your Company.

Severity levels

The following describes the different severity levels for Customer Support.

S1 (Blocking): production Customer system is severely impacted. Notice: this severity level only applies to production environments.
Impact: catastrophic business impact, e.g. complete loss of a core business process where work cannot reasonably continue (e.g. all final users are unable to access the Customer application).

S2 (Critical): one major Akamas functionality is unavailable.
Impact: significant loss or degradation of the Akamas services (e.g. Akamas is down or Akamas is not generating recommendations).

S3 (Severe): limitation in accessing one major Akamas functionality.
Impact: moderate business impact and moderate loss or degradation of services, but work can reasonably continue in an impaired manner (e.g. only some specific functions are not working properly).

S4 (Informational): any other request.
Impact: minimum business impact; the product is substantially functioning with minor or no impediments of services.

Support conditions

The contract in place with the Customer specifies the level of support provided by Akamas Agents, according to at least the following items:

  • Maximum number of support seats: this is the maximum number of named users within the Customer organization who can request Akamas Customer Support.

  • Language(s): these are the languages that can be used for interacting with Akamas Support Agents - the default is English.

  • Channel(s): these are the different communication channels that can be used to interact with Akamas Agents - these may include one or more options among web ticketing, email, phone, and Slack channel.

  • Max Initial Response Time: this refers to the time interval between the time a request is opened by Customer to Customer Support and the time a Support Agent responds with a first notification (acknowledgment).

  • Severity: this is the level of severity associated with a reported issue, which initially corresponds to the severity level originally indicated by the Customer. Notice that the severity level may change, for example as new information becomes available or if Support Agents and Customer agree to re-evaluate it. Please notice that the severity level may be downgraded by Support Agents if Customer is not able to provide adequate resources or responses to enable Akamas to continue with its resolution efforts.

  • Initial Remedy: this refers to any operation aimed at addressing a reported issue by restoring a minimal level of operations, even if it may cause some performance degradation of the Customer service or operations. A workaround is to be considered a valid Initial Remedy.

Please notice that Support Agents may decline a service request either in case Customer does not have a valid Maintenance & Support subscription or in case the above-mentioned conditions or other conditions stated in the contract in place are not met. In any case, the Customer is expected to provide all the information required by Support Agents in order for Customer Support to serve service requests.

Support levels for software versions

Different levels of support are provided for software versions of Akamas products, starting from their general availability (GA) date and depending on the release of subsequent software versions.

Version Numbering

Akamas adopts a three-place numbering scheme MA.MI.SP to designate released versions of its Software, where:

  • MA is the Major Version

  • MI is the Minor Version

  • SP is the Service Pack or Patch number
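
For example, in a hypothetical version 3.2.1, the Major Version is 3, the Minor Version is 2, and the Service Pack number is 1.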

Support levels

The following describes the three levels of support for a software version.

Full Support

Akamas provides full support for one previous (either major or minor) version in addition to the latest available GA version.

For Software versions in Full Support level: Akamas Support Agents provide service packs, patches, hotfixes, or workarounds to make the Software operate in substantial conformity with its then-current operating documentation.

Limited Support

Following the Full Support period, Akamas provides Limited Support for an additional 12 months.

For Software versions in Limited Support level:

  • No new enhancements will be made to a version in "Limited Support"; Akamas Support Agents will direct Customers to existing fixes, patches, or workarounds applicable to the reported case, if any;

  • Akamas Support Agents will provide hot fixes for problems of high technical impact or business exposure for customers;

  • Based on Customer input, Akamas Support Agents will determine the degree of impact and exposure and the consequent activities;

  • Akamas Support Agents will direct Customers to upgrade to a more current version of the Software.

No Support

Following the Limited Support period, Akamas provides no support for any Software version.

For Software versions in No Support level: No new maintenance releases, enhancements, patches, or hot fixes will be made available. Akamas Support Agents will direct Customers to upgrade to a more current version of the Software.

End-of-Life (EOL)

At any time, Akamas reserves the right to "end of life" (EOL) a software product and to terminate any Maintenance & Support Services for such product, provided that Licensor has notified the Licensee at least 12 months prior to the above-mentioned termination.

During the period between the "end of life" notification and the actual termination of Maintenance & Support Services, support is provided as follows:

  • No new enhancements will be introduced.

  • No enhancements will be made to support new or updated versions of the platform on which the product runs or which it integrates.

  • New hotfixes for problems of high technical impact or business exposure for customers may still be developed. Based on customer input, Akamas Support Agents will determine the degree of impact and exposure and the consequent activities.

  • Reasonable efforts will be made to inform the Customer of any fixes, service packs, patches, or workarounds applicable to the reported case, if any.


Cloud Hosting

Refer to your Cloud Provider website for information about cloud hosting options and related cost information.

AWS EC2

For AWS EC2 costs, visit the EC2 Pricing page and use the AWS Pricing Calculator to estimate the cost for your architecture.

Free Trial

Akamas offers a Free Trial option to quickly understand Akamas concepts and capabilities and experience the power of its AI-based optimization platform.

You can join the Akamas Free Trial quickly:

  1. Fill out this form on the Akamas website;

  2. Receive credentials to access your dedicated Akamas server (a cloud instance on AWS EC2) - optionally you can also download & install the Akamas CLI and learn how to fully automate the optimization process;

  3. Explore already executed optimization studies or create & run new studies to optimize a microservice app at both the JVM runtime and Kubernetes level - here you can take advantage of Akamas Quick Guides.

What you will get:

  • Understand the Akamas methodology

  • See Akamas AI-powered optimization in action

  • Learn to use Akamas by following the how-to guides

  • Familiarize yourself with Akamas UI and CLI

  • Experience the benefits Akamas can deliver to your organization

Enjoy!

Maintenance & Support (M&S) Services

This page is intended as a first introduction to Akamas Maintenance & Support (M&S) Services.

Please refer to the specific contract in place with your Company.

Akamas M&S Services include:

  • access to Software versions released as major and minor versions, service packs, patches, and hotfixes, according to Support levels for software versions;

  • assistance from Akamas Customer Support for inquiries about the Akamas product and issues encountered while using Akamas products, where there is a reasonable expectation that issues are caused by Akamas products, according to Support levels for Customer Support Services.

Akamas M&S Services do not include any installation and upgrade services, creation of any custom optimization packs, telemetry providers, or workflow operators, or implementation of any custom features and integrations that are not provided out-of-the-box by the Akamas products.

Support levels with Akamas

Based on the Support levels for software versions, the following describes the level of support of the Akamas versions after the version 3.2 GA date (May 1st, 2023):

  • 3.2: Full Support (notice: this will change once the following major version is released)

  • 3.1: Full Support (notice: this will change once the following major version is released)

  • 3.0: Full Support (notice: this will change once the following major version is released)

  • 2.x: Limited Support until 12 months after the 3.0 GA date, that is September 13th, 2023 (see Support levels for software versions)

  • 1.x: No Support

Licensing

Software Licenses

Akamas software licensing model is subscription-based (typically on a yearly basis). For more information on Akamas' cost model and software licensing costs, please contact info@akamas.io.

Maintenance & Support Services

Akamas software licenses include Maintenance & Support Services, which also include access to Customer Support Services.

Other billable services

Akamas also provides optional professional services for deployment, training, and integration activities. For more information about Akamas professional services, please contact info@akamas.io.

Prerequisites

Before installing the Akamas Server please make sure to review all the following requirements:

  • Hardware requirements

  • Software requirements

  • Network requirements

Install Akamas dependencies

This page will guide you through the installation of the software components that are required to get the Akamas Server installed on a machine. Please read the Akamas dependencies section for a detailed list of these software components for each specific OS.

While some links to official documentation and installation resources are provided here, please make sure to refer to your internal system engineering department to ensure that your company deployment processes and best practices are correctly matched.

Dependencies Setup

As a preliminary step before installing any dependency, it is strongly suggested to create a user named akamas on the machine hosting the Akamas Server.

Docker

Follow the reference documentation to install Docker on your system: https://docs.docker.com/engine/install

Docker Compose

Docker Compose is already installed since Docker 23+. To install it on previous versions of Docker, follow this installation guide: https://docs.docker.com/compose/install/

AWS CLI

To install the AWS CLI v2, follow the official guide: https://docs.aws.amazon.com/cli/latest/userguide

Non-root user

To run Docker with a non-root user, such as the akamas user, you should add it to the docker group. You can follow the guide at https://docs.docker.com/engine/install/linux-postinstall/
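
A minimal sketch of this step (assuming a sudo-capable account and the dedicated akamas user suggested above):

# Add the akamas user to the docker group so it can use the Docker socket
sudo usermod -aG docker akamas

# Verify: after logging in again as akamas, this should succeed without sudo
docker ps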

Verify dependencies

As a quick check to verify that all dependencies have been correctly installed, you can run the following commands

  • Docker:

    docker run hello-world

For offline installations, you can check docker with docker ps command

  • Docker Compose:

    docker compose --version

Docker versions older than 23 must use the docker-compose command instead of docker compose.

  • AWS CLI:

    aws --version


Software Requirements

Operating System

The following are the supported operating systems and their versions:

  • Ubuntu Linux: 20.04+

  • CentOS: 8.6+

  • RedHat Enterprise Linux: 8.6+

On RHEL systems Akamas containers might need to be run in privileged mode depending on how Docker was installed on the system.

Software packages

The following are the required Software Packages (also referred to as Akamas dependencies), together with their minimum versions:

  • Docker (version 24+): Akamas is deployed as a set of containerized services running on Docker. During its operation, Akamas launches different containers, so access to the Docker socket with enough permissions to run containers is required.

  • Docker Compose (version 2.7.0+): Akamas containerized services are managed via Docker Compose. Docker Compose is usually already shipped with Docker starting from version 23.

  • AWS CLI (version 2.0.0+): Akamas container images are published in a private Amazon Elastic Container Registry (ECR) and are automatically downloaded during the online installation procedure. AWS CLI is required only during the installation phase if the server has internet access, and can be skipped during an offline installation.

Akamas user

To install and run Akamas it is recommended to create a dedicated user (usually "akamas"). The Akamas user is not required to be in the sudoers list but can be added to the docker (dockeroot) group so it can run docker and docker-compose commands.

Make sure that the Akamas user has the read, write, and execute permissions on /tmp. If your environment does not allow writing to the whole /tmp folder, please create a folder /tmp/build and assign read and write permission to the Akamas user on that folder.
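
A minimal sketch of this setup (assuming a sudo-capable account; adapt it to your company's user management practices):

# Create the dedicated user
sudo useradd -m akamas

# If /tmp is not fully writable, prepare a dedicated build folder
sudo mkdir -p /tmp/build
sudo chown akamas:akamas /tmp/build
sudo chmod u+rwx /tmp/build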

Read more about how to set up Akamas dependencies.

Architecture

Akamas is based on a microservices architecture where each service is deployed as a container and communicates with other services via REST APIs. Akamas can be deployed on a dedicated machine (Akamas Server) or on a Kubernetes cluster.

The following figure represents the high-level Akamas architecture.

Interact with Akamas

Users can interact with Akamas via the Graphical User Interface (GUI), the Command-Line Interface (CLI), or the Application Programming Interface (API).

Both the GUI and CLI leverage HTTP/S APIs which pass through an API gateway (based on Kong), which also takes care of authenticating users by interacting with Akamas access management and routing requests to the different services.

The Akamas CLI can be invoked on either the Akamas Server itself or on a different machine (e.g. a laptop or another server) where the Akamas CLI has been installed.
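
For example, on a laptop where the CLI is installed, you can point it at a remote Akamas Server (the akamas init config prompts are shown in the HTTPS setup section later in this guide):

# Configure the CLI to target a remote Akamas Server
akamas init config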

Repositories

Akamas data is securely stored in different databases:

  • time series data gathered from telemetry providers are stored in Elasticsearch;

  • application logs are also stored in Elasticsearch;

  • data related to systems, studies, workflows, and other user-provided data are stored in a Postgres database.

Notice: Postgres, Elasticsearch, and any other service included within Akamas are provided as part of the Akamas installation package.

Services

Core Services

The following Spring-based microservices represent Akamas core services:

  • System Service: holds information about metrics, parameters, and systems that are being optimized

  • Campaign Service: holds information about optimization studies, including configurations and experiments

  • Metrics Service: stores raw performance metrics (in Elasticsearch)

  • Analyzer Service: automates the analysis of load tests and provides related functionalities such as smart windowing

  • Telemetry Service: takes care of integrating different data sources by supporting multiple Telemetry Providers

  • Optimizer Service: combines different optimization engines to generate optimized configurations using ML techniques

  • Orchestrator Service: manages the execution of user-defined workflows to drive load tests

  • User Service: takes care of user management activities such as user creation or password changes

  • License Service: takes care of license management activities, optimization pack management, and study export.

Ancillary Services

Akamas also provides advanced management features like logging, self-monitoring, licensing, user management, and more.

Offline installation mode

Akamas is deployed as a set of containerized services running on Docker and managed via Docker Compose. In the offline installation mode, the latest version of the Akamas Docker Compose file and all the images required by Docker cannot be downloaded from the AWS ECR repository.

Get Akamas Docker artifacts

Get in contact with Akamas Customer Services to get the latest versions of the Akamas artifacts uploaded to a location of your choice on the dedicated Akamas Server.

Akamas installation artifacts will include:

  • images.tar.gz: a tarball containing Akamas main images.

  • docker-compose.yml: docker-compose file for Akamas.

  • akamas: the binary file of the Akamas CLI that will be used to verify the installation.

Import Docker images

A preliminary step in the offline installation mode is to import the shipped Docker images by running the following commands in the same directory where the tar files have been stored:

cd <your bundle files location>
docker image load -i images.tar.gz

Mind that this import procedure could take some time!
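
To confirm the import succeeded, you can list the loaded images (a quick sketch; it assumes the Akamas image names contain "akamas", as in the ECR repository paths shown elsewhere in this guide):

docker image ls | grep akamas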

Configure Akamas environment variables

To configure Akamas, you should set the following environment variables:

  • AKAMAS_CUSTOMER: the customer name matching the one referenced in the Akamas license.

  • AKAMAS_BASE_URL: the endpoint in the Akamas APIs that will be used to interact with the CLI, typically https://<akamas server DNS address>

To avoid losing your environment variables for future upgrades, it is suggested to keep them in the .env file, stored in the same directory as the docker-compose.yml:

.env
# Required variables
AKAMAS_CUSTOMER=<your name or your organization name>
AKAMAS_BASE_URL=https://<akamas server DNS address>

# Optional variables
## Database password. Use DEFAULT_DATABASE_PASSWORD to set a custom password for all databases
DEFAULT_DATABASE_PASSWORD=
## A custom password per each service can be set using the variables below, otherwise, the default is used. For example, for Kong's database, the password is `akamas_kong`.
KONG_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_kong}
AIRFLOW_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_airflow}
KEYCLOAK_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_keycloak}
ANALYZER_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_analyzer}
CAMPAIGN_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_campaign}
LICENSE_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_license}
OPTIMIZER_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_optimizer}
ORCHESTRATOR_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_orchestrator}
SYSTEM_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_system}
TELEMETRY_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_telemetry}
# Docker volumes prefix
COMPOSE_PROJECT_NAME=${COMPOSE_PROJECT_NAME:-akamas}

Run installation

To start Akamas you can now simply navigate into the akamas folder and run a docker-compose command:

cd <your docker-compose file location>
docker compose up -d

You may get the following error:

Error saving credentials: error storing credentials - err: exit status 1, out: Cannot autolaunch D-Bus without X11 $DISPLAY

This is a documented Docker bug (see this link) that can be solved by installing the "pass" package:

  • Ubuntu

sudo apt-get install -y pass

  • RHEL

yum install pass

Docker compose installation

This section describes how to install Akamas on Docker.

Preliminary steps

Before installing Akamas, please follow these steps:

  1. Review hardware, software, and network prerequisites

  2. Install all Akamas dependencies

Please make sure to read the Getting Started section before installing Akamas.

Installation steps

Please follow these steps to install the Akamas Server:

  1. Install the Akamas Server

  2. Install the Akamas CLI

  3. Verify the Akamas Server

  4. Install an Akamas license

Please also read the sections on how to troubleshoot the installation and how to manage the Akamas Server. Finally, read the relevant sections of Integrating Akamas to integrate Akamas into your specific ecosystem.

Changing UI Ports

By default, Akamas uses the following ports for its UI:

  • 80 (HTTP)

  • 443 (HTTPS)

Depending on the configuration of your environment, you may want to change the default settings: to do so, you'll have to update the Akamas docker-compose file.

Inside the docker-compose.yml file, scroll down until you come across the akamas-ui service. There you will find a specification as follows:

  akamas-ui:
    ports:
      - "443:443"
      - "80:80"

Update the YAML file by remapping the UI ports to the desired ports of the host:

  akamas-ui:
    ports:
      - "<YOUR_HTTPS_PORT_OF_CHOICE>:443"
      - "<YOUR_HTTP_PORT_OF_CHOICE>:80"

In case you are running Akamas with host networking, you can also bind different ports in the container itself. To do so, you can expand the docker-compose service by adding a couple of environment variables like this:

  akamas-ui:
    environment:
      - HTTP_PORT=<HTTP_CONTAINER_PORT>
      - HTTPS_PORT=<HTTPS_CONTAINER_PORT>
    ports:
      - "<YOUR_HTTPS_PORT_OF_CHOICE>:<HTTPS_CONTAINER_PORT>"
      - "<YOUR_HTTP_PORT_OF_CHOICE>:<HTTP_CONTAINER_PORT>"

Finally, apply the new configuration after updating the AKAMAS_BASE_URL environment variable to match the new protocol or port:

docker compose up -d

Setup HTTPS configuration

Akamas APIs and UI use plain HTTP when they are first installed. To enable the use of HTTPS you will need to:

  1. Ask your security team to provide you with a valid certificate for your server. The certificate usually consists of two files with ".key" and ".pem" extensions. You will need to provide the Akamas server DNS name.

  2. Create a folder named "certs" in the same directory as Akamas' docker-compose file.

  3. Copy the ".key" and ".pem" files into the created "certs" folder and rename them to "akamas.key" and "akamas.pem" respectively. Ensure the files belong to the same user and group you use to run Akamas.

  4. Restart two Akamas services by running the following commands:

cd <Akamas docker-compose file folder>
docker compose restart akamas-ui kong

After the containers' reboot is complete you will be able to access the UI over HTTPS from your browser:

https://<akamas server name here>

If you previously set up the AKAMAS_BASE_URL variable with http (e.g. http://my.domain), you should update it to use https (e.g. https://my.domain) and then issue:

docker compose up -d

Setup CLI to use HTTPS

Now that your Akamas server is configured to use HTTPS, you can update the Akamas CLI configuration to use the secure protocol.

If you have not installed the Akamas CLI, follow the CLI installation guide. If you already have the CLI available, you can run the following command:

akamas init config

You will be prompted to enter some input, please value it as follows:

Api address [http://localhost:8000]: https://<akamas server dns address>:443/akapi
Workspace [default]: default
Verify SSL: [True]: True

You can test the connection by running:

akamas status

It should return 'OK', meaning Akamas has been properly configured to work over HTTPS.
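
As an additional, optional check (a generic sketch, not an Akamas-specific command), you can verify that the server is presenting the certificate you installed:

openssl s_client -connect <akamas server name here>:443 </dev/null 2>/dev/null | openssl x509 -noout -subject -dates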

Hardware Requirements

Running in your data center

The following are the minimal hardware requirements for the virtual or physical machine used to install the Akamas Server in your data center:

  • CPU: 4 cores @ 2 GHz

  • Memory: 16 GB

  • Disk Space: 70 GB

Running on AWS EC2

As shown in the following diagram, you can create the Akamas instance in the same AWS region, Virtual Private Cloud (VPC), and private subnet as your own already existing EC2 machines and by creating/configuring a new security group that allows communication between your application instances and Akamas instance. The inbound/outbound rules of this security group must be configured as explained in the Networking Requirements section of this page.

It is recommended to use an m6a.xlarge instance with at least 70 GB of disk of type GP2 or GP3, and to select the latest LTS version of Ubuntu.
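
As a purely illustrative sketch of creating such an instance with the AWS CLI (all values in angle brackets are placeholders you must replace; your organization may use the console, Terraform, or other tooling instead):

# Launch an m6a.xlarge Ubuntu instance with a 70 GB gp3 root volume
aws ec2 run-instances \
  --image-id <ubuntu-lts-ami-id> \
  --instance-type m6a.xlarge \
  --key-name <your-key-pair> \
  --security-group-ids <your-security-group-id> \
  --subnet-id <your-private-subnet-id> \
  --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"VolumeSize":70,"VolumeType":"gp3"}}]'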

Supported AWS Regions

Akamas can be run in any EC2 region.

AWS Service Limits

To run Akamas on an AWS instance you need to create a new virtual machine based on one of the supported operating systems. You can refer to the AWS documentation for step-by-step instructions on creating the instance.

You can find the latest supported version for your preferred region here.

Before installing Akamas on an AWS instance, please make sure to meet your AWS service limits (please refer to the official AWS documentation here).

Security

Akamas takes security seriously and provides enterprise-grade software where customer data is kept safe at all times. This page describes some of the most important security aspects of Akamas software and information related to processes and tools used by the Akamas company (Akamas S.p.A) to develop its software products.

Information managed by Akamas

Akamas manages the following types of information:

  • System configuration and performance metrics: technical data related to optimized systems. Examples of such data include the number of CPUs available in a virtual machine or the memory usage of a Java application server;

  • User accounts: accounts assigned to users to securely access the Akamas platform. For each user account, Akamas currently requires an account name and a password. Akamas does not collect any other personal identifying information;

  • Service Credentials: credentials used by Akamas to automate manual tasks and to integrate with external tools. In particular, Akamas leverages the following types of interaction:

    • Integration with monitoring and orchestration tools, e.g., collecting IT performance metrics and system configuration. As a best practice, Akamas recommends using dedicated service accounts with minimal read-only privileges.

    • Integration with the target systems to apply changes to configuration parameters. As a best practice, Akamas recommends using dedicated service accounts with minimal privileges to read/write identified parameters.

GDPR Compliance

Akamas is a fully GDPR-compliant product.

Akamas is a company owned by the Moviri Group. The Moviri Group and all its companies are fully compliant with GDPR. Moviri Group Data Privacy Policy and Data Breach Incident Response Plan which apply to all the owned companies can be requested from Akamas Customer Support.

Security certifications

Akamas is an on-premises product and does not transmit any data outside the customer network. Considering the kind of data that is managed within Akamas (see the section "Information managed by Akamas"), specific security certifications like PCI or HIPAA are not required as the platform does not manage payment or health-related information.

Data encryption

Akamas takes the need for security seriously and understands the importance of encrypting data to keep it safe at rest and in-flight.

In-Flight encryption

All the communications between Akamas UI and CLI and the back-end services are encrypted via HTTPS. The customer can configure Akamas to use customer-provided SSL certificates in all communications.

Communications between Akamas services and other integrated tools within the customer network rely on the security configuration requirements of the integrated tool (e.g.: HTTPS calls to interact with REST services).

At-Rest encryption

Akamas is an on-premises product and runs on dedicated virtual machines within the customer environment. At-rest encryption can be achieved following customer policies and best practices, for example, leveraging operating system-level techniques.

Akamas also provides an application-level encryption layer aimed at extending the scope of at-rest encryption. With this increased level of security, sensitive data managed by Akamas (e.g. passwords, tokens, or keys required to interact with external systems) are safely stored in Akamas databases using industry-standard AES 256-bit encryption.

Encryption option for Akamas on EC2

In the case of Akamas hosted on an AWS machine, you may optionally create an EC2 instance with an encrypted EBS volume before installing the OS and Akamas, to achieve a higher level of security.

Password management

Password Security

Passwords are securely stored using a one-way hash algorithm.

Password complexity

Akamas comes with a default password policy with the following requirements:

  • has a minimum length of 8 characters.

  • contains at least 1 uppercase and 1 lowercase character.

  • contains at least 1 special character.

  • is different from the username.

  • must be different from the last password set.

Customers can modify this policy by providing a custom one that matches their internal security policies.

Password rotation

Akamas enforces no out-of-the-box password rotation mechanism. Customers can specify custom password expiration policies.

Credential storage

  • When running on a Linux installation with KDE's KWallet enabled or GNOME's Keyring enabled, the credentials will be stored in the default wallet/keyring.

  • When running on Windows, the credentials will be stored in Windows Credential Locker.

  • When running on a macOS, the credential will be stored in Keychain.

  • When running on a Linux headless installation, the credentials will be stored in CLEAR TEXT in a file in the current Akamas configuration folder.

Resources visibility model

Akamas provides fine granularity control over resources managed within the platform. In particular, Akamas features two kinds of resources:

  • Workspace resources: entities bound to one of the isolated virtual environments (named workspaces) that can only be accessed in reading or writing mode by users to whom the administrators explicitly granted the required privileges. Such resources typically include sensitive data (e.g.: passwords, API tokens). Examples of such resources include the system to be optimized, the set of configurations, optimization studies, etc.

  • Shared resources: entities that can be installed and updated by administrators and are available to all Akamas users. Such resources only contain technology-related information (e.g.: the set of performance metrics for a Java application server). Examples of such resources include Optimization Packs, which are libraries of technology components that Akamas can optimize, such as a Java application server.

Akamas Logs

Akamas logs traffic from the UI and APIs. Application-level logs include user access via APIs and UI, and any action taken by Akamas on integrated systems.

Akamas' logs are retained on the dedicated virtual machine within the customer environment, by default, for 7 days. The retention period can be configured according to customer policies. Logs can be accessed either via UI or via log dump within the retention period. Additionally, logs have a format that can be easily integrated with external systems like log engines and SIEM to support forensic analysis.

Code scanning policy

Akamas is developed according to security best practices and the code is scanned regularly (at least daily).

The Akamas development process leverages modern continuous integration approaches and the development pipeline includes SonarQube, a leading security scanning product that includes comprehensive support for established security standards including CWE, SANS, and OWASP. Code scanning is automatically triggered in case of a new build, a release, and every night.

Vulnerability scanning and patch management policy

Akamas features modern micro-service architecture and is delivered as a set of docker containers whose images are hosted on a private Elastic Container Registry (ECR) repository on the AWS cloud. Akamas leverages the vulnerability scanning capabilities of AWS ECR to identify vulnerabilities within the product container images. AWS ECR uses the Common Vulnerabilities and Exposures (CVEs) database from the open-source Clair project.

If a vulnerability is detected, Akamas will perform a security assessment of the security risk in terms of the impact of the vulnerability, and evaluate the necessary steps (e.g.: dependency updates) required to fix the vulnerability within a timeline related to the outcome of the security assessment.

After the assessment, the vulnerability can be fixed by either recommending the upgrade to a new product version or delivering a patch or a hotfix for the current version.

Troubleshoot Docker installation issues

This section describes some of the most common issues found during the Akamas installation.

Issues when installing Docker

CentOS 7 and RHEL 7

Notice: these distros feature a known issue: the default Docker execution group is named dockerroot instead of docker. To make Docker work, edit (or create) /etc/docker/daemon.json to include the following fragment:

{
  "group": "dockerroot"
}

After editing or creating the file, please restart Docker and then check the group permission of the Docker socket (/var/run/docker.sock), which should show dockerroot as a group:
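
For example:

sudo systemctl restart docker
ls -l /var/run/docker.sock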

srw-rw----. 1 root dockerroot 0 Jul  4 09:57 /var/run/docker.sock

Then, add the newly created akamas user to the dockerroot group so that it can run docker containers:

sudo usermod -aG dockerroot <user_name>

and check the akamas user has been correctly added to dockerroot group by running:

lid -g dockerroot
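
If the lid utility is not available on your system, an equivalent check is:

getent group dockerroot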

Issues when running AWS CLI

In case of issues in logging in through AWS CLI, when executing the following command:

aws ecr get-login-password --region us-east-2

Please check that:

  • Environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION are correctly set

  • AWS CLI version is 2.0+

We recommend using the official AWS CLI installation guide for a smoother experience.

Issue when starting Akamas services

Akamas failed to start some services

Please notice that the very first time Akamas is started, up to 30 minutes might be required to initialize the environment.

In case the issue persists you can run the following command to identify which service is not able to start up correctly

akamas status -d

License service unable to access docker socket

In some systems, the Docker socket, usually located at /var/run/docker.sock, cannot be accessed within a container. Akamas signals this behavior by reporting an Access Denied error in the license service logs.

To overcome this limitation edit the docker-compose.yaml file adding the line privileged: true to the following services:

  • License

  • Optimizer

  • Telemetry

  • Airflow

The following is a sample configuration where this change is applied to the license service:

license:
  image: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/license_service:2.3.0
  container_name: license
  privileged: true

Finally, you can issue the following command to apply these changes

docker compose up -d

Missing Akamas Customer variable

When installing Akamas it's mandatory to provide the AKAMAS_CUSTOMER variable, as illustrated in the installation guide. This variable must match the one provided by Akamas representatives when issuing a license. If the variable is not properly exported, license installation will fail with an error message indicating that the name of the customer installation does not match the one provided in the license.

You can easily inspect which value of this variable has been used when starting Akamas by running the following command on the Akamas server:

docker inspect license | grep AKAMAS_CUSTOMER

If you find out that the value is not the one you expect, you can update the .env file and then start the license service again by running:

docker compose up -d license

Once Akamas is up and running you can re-install your license.

Other issues

For any other issues please contact Akamas Customer Support Services.

Online installation mode

Akamas is deployed as a set of containerized services running on Docker and managed via Docker Compose. In the online installation mode, the latest version of the Akamas Docker Compose file and all the images required by Docker can be downloaded from the AWS ECR repository.

Get Akamas Docker artifacts

It is suggested first to create a directory akamas in the home directory of your user, and then run the following command to get the latest compose file:

cd ~
mkdir akamas
cd akamas
curl -O https://s3.us-east-2.amazonaws.com/akamas/compose/3.6.2/docker-compose.yml

Configure Akamas environment variables

To log into AWS ECR and pull the most recent Akamas container images, you need to set the AWS authentication variables to the appropriate values provided by Akamas Customer Support Services. To configure Akamas, you should set the following environment variables:

  • AKAMAS_CUSTOMER: the customer name matching the one referenced in the Akamas license.

  • AWS_ACCESS_KEY_ID: the access key for pulling the Akamas images

  • AWS_SECRET_ACCESS_KEY: the secret access key for pulling the Akamas images

  • AWS_DEFAULT_REGION: unless specified otherwise by the support team, keep the value us-east-2

  • AKAMAS_BASE_URL: the endpoint in the Akamas APIs that will be used to interact with the CLI, typically https://<akamas server DNS address>

To avoid losing your environment variables for future upgrades, it is suggested to keep them in the .env file. Create the following .env file in the same folder where the docker-compose.yml is stored, replacing the parameters in the brackets <>:

# Required variables
AKAMAS_CUSTOMER=<your name or your organization name>
AWS_ACCESS_KEY_ID=<your access key id>
AWS_SECRET_ACCESS_KEY=<your secret access key>
AKAMAS_BASE_URL=https://<akamas server DNS address>
AWS_DEFAULT_REGION=us-east-2

# Optional variables
# Database passwords
DEFAULT_DATABASE_PASSWORD=
KEYCLOAK_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_keycloak}
ANALYZER_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_analyzer}
CAMPAIGN_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_campaign}
LICENSE_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_license}
OPTIMIZER_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_optimizer}
ORCHESTRATOR_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_orchestrator}
SYSTEM_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_system}
TELEMETRY_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_telemetry}
# Docker volumes prefix
COMPOSE_PROJECT_NAME=${COMPOSE_PROJECT_NAME:-akamas}

Start Akamas

To log into AWS ECR and pull the most recent Akamas container images you also need to set the AWS authentication variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) with the values provided by Akamas Customer Support Services. You can leverage the .env file previously created with the following command:

source ./.env
aws ecr get-login-password --region us-east-2 | docker login -u AWS --password-stdin https://485790562880.dkr.ecr.us-east-2.amazonaws.com

You can then start installing the Akamas server by running the following command:

docker compose up -d
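
You can then follow the startup (a minimal sketch; remember that the very first startup may take a while, as noted in the troubleshooting section):

# List the Akamas services and their state
docker compose ps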

Software Requirements

This page describes the requirements that should be fulfilled by the user when installing or managing an Akamas installation on Kubernetes. The software below is usually installed on the user's workstation or laptop.

Kubectl

Kubectl must be installed and configured to interact with the desired cluster. Refer to the official kubectl documentation to set up the client.

To interact with the Kubernetes APIs, you will need kubectl, preferably with a version matching the cluster. To check both the client and cluster versions, run the following:

kubectl version --short

Helm

Installing Akamas requires Helm 3.0 or higher. To check the version, run the following:

helm version --short

Privileged access

Akamas uses Elasticsearch to store logs and time series. When running Akamas on Kubernetes, Elasticsearch is installed automatically using the official Elasticsearch Helm chart. This chart requires running an init container with privileged access to set up a configuration on the Elasticsearch pod host. If running such a container is not permitted in your environment, you can add the following snippet to the akamas.yaml file when installing Akamas to disable this feature:

# Disable the ES privileged initialization container.
elasticsearch:
  sysctlInitContainer:
    enabled: false

In case the Akamas Server is behind a proxy server, please also read how to setup Akamas behind a Proxy.

Install the Akamas Server

Akamas is deployed as a set of containerized services running on Docker and managed via Docker Compose. The latest version of the Akamas Docker Compose file and all the images required by Docker can be downloaded from the AWS ECR repository.

Two installation modes are available:

  • online installation mode, in case the Akamas Server has access to the Internet (installation behind a proxy server is also supported);

  • offline installation mode, in case the Akamas Server does not have access to the Internet.

Install Akamas

Akamas is deployed on your Kubernetes cluster through a Helm chart, and all the required images can be downloaded from the AWS ECR repository.

Two installation modes are available:

  • online installation, in case the Kubernetes cluster can access the Internet;

  • offline installation, in case the Kubernetes cluster does not have access to the Internet or you need to use a private image registry.

Kubernetes installation

This section describes how to install Akamas on a Kubernetes cluster.

Preliminary steps

Before installing Akamas, please follow these steps:

  • Review the Akamas high-level architecture

  • Review the cluster requirements

  • Install the software requirements

Installation steps

Please follow these steps to install the Akamas application:

  • Install the application

  • Install the CLI

  • Verify the installation

  • Install the license

Please also read the section on how to manage Akamas. Finally, read the relevant sections of Integrating Akamas to integrate Akamas into your specific ecosystem.

Online installation behind a Proxy server

This section describes how to setup an Akamas Server behind a proxy server and to allow Docker to connect to the Akamas repository on AWS ECR.

Configure Docker daemon

First, create the /etc/systemd/system/docker.service.d directory if it does not already exist. Then create or update the /etc/systemd/system/docker.service.d/http-proxy.conf file with the variables listed below, taking care to replace <PROXY> with the address and port (and credentials if needed) of your target proxy server:

[Service]
Environment="HTTP_PROXY=<PROXY>"
Environment="HTTPS_PROXY=<PROXY>"

Once configured, flush the changes and restart Docker with the following commands:

sudo systemctl daemon-reload
sudo systemctl restart docker
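
You can verify that the Docker daemon picked up the proxy settings with the following standard systemd command:

sudo systemctl show --property=Environment docker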

Configure the Akamas containers

To allow the Akamas services to connect to addresses outside your intranet, the Docker instance needs to be configured to forward the proxy configuration to the Akamas containers.

Update the ~/.docker/config.json file adding the following field to the JSON, taking care to replace <PROXY> with the address (and credentials if needed) of your target proxy server:

{
  # ...
  "proxies": {
    "default": {
      "httpProxy": "<PROXY>",
      "httpsProxy": "<PROXY>",
      "ftpProxy": "<PROXY>",
      "noProxy": "localhost,127.0.0.1,/var/run/docker.sock,database,optimizer,campaign,analyzer,telemetry,log,elasticsearch,metrics,system,license,store,orchestrator,airflow-db,airflow-webserver,kong-database,kong,user-service,keycloak,logstash,kibana,akamas-ui,grafana,prometheus,node-exporter,cadvisor,konga,benchmark"
    }
  }
}
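
To verify that Docker forwards the proxy variables to new containers, you can optionally inspect the environment of a throwaway container (the busybox image is used here purely as an example):

docker run --rm busybox env | grep -i proxy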

Run Akamas

Set the following variables to configure your working environment, taking care to replace <PROXY> with the address (and credentials if needed) of your target proxy server:

export HTTP_PROXY='<PROXY>'
export HTTPS_PROXY='<PROXY>'

Once configured, you can log into the ECR repository through the AWS CLI and start the Akamas services manually.
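
For reference, the sequence is the same one shown for the online installation, run from the shell where the proxy variables above are exported:

source ./.env
aws ecr get-login-password --region us-east-2 | docker login -u AWS --password-stdin https://485790562880.dkr.ecr.us-east-2.amazonaws.com
docker compose up -d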

Offline Installation - Private registry

Akamas on Kubernetes is provided as a set of templates packaged in a chart archive managed by Helm. Before starting the installation, make sure the requirements are met.

Configure the registry

If your cluster is in an air-gapped network or cannot reach the Akamas image repository, you need to copy the required images to your private registry.

The procedure described here leverages your local environment to upload the images: to interact with both the Akamas repository and your private registry, Docker must be installed and configured.

Transfer the Docker images

The offline installation requires you to pull the images and migrate them to your private registry. In the following command, replace the chart version to download the related list of images:

curl -sO  http://helm.akamas.io/images/1.6.3/image-list

Once the import is complete, you must re-tag and upload the images. Run the following snippet, replacing <REGISTRY_URL> with the actual URL of the private registry:

NEW_REGISTRY="<REGISTRY_URL>"

while read IMAGE; do
    REGISTRY=$(echo "$IMAGE" | cut -d '/' -f 1)
    REPOSITORY=$(echo "$IMAGE" | cut -d ':' -f 1 | cut -d "/" -f2-)
    TAG=$(echo "$IMAGE" | cut -d ':' -f 2)

    NEW_IMAGE="$NEW_REGISTRY/$REPOSITORY:$TAG"
    echo "Migrating $IMAGE to $NEW_IMAGE"

    docker pull "$IMAGE"
    docker tag "$IMAGE" "$NEW_IMAGE"
    docker push "$NEW_IMAGE"
done <image-list

This process could last several minutes. Once the upload is complete, you can proceed with the next steps.

Create the configuration file

To proceed with the installation, you must create a Helm Values file, called akamas.yaml in this guide, containing the mandatory configuration values required to customize your application. The following template contains the minimal set required to install Akamas:

akamas.yaml
# Akamas customer name. Must match the value in the license (required)
akamasCustomer: <CUSTOMER_NAME>

# Akamas administrator password. If not set a random password will be generated
akamasAdminPassword: <ADMIN_PASSWORD>

# The URL that will be used to access Akamas, for example 'http://akamas.kube.example.com' (required)
akamasBaseUrl: <INSTANCE_HOSTNAME>

global:
  imageRegistry: <REGISTRY_URL>

elasticsearch:
  image: <REGISTRY_URL>/akamas/elastic/elasticsearch

kibana:
  image: <REGISTRY_URL>/akamas/elastic/kibana

airflow:
  images:
    airflow:
      repository: <REGISTRY_URL>/akamas/airflow_service
      tag: 2.8.0
    pgbouncer:
      repository: <REGISTRY_URL>/akamas/airflow_service
      tag: ~
    pgbouncerExporter:
      repository: <REGISTRY_URL>/akamas/airflow_service
      tag: ~
  webserver:
    extraInitContainers:
      - name: wait-logstash
        image: <REGISTRY_URL>/akamas/utils:0.1.7
        command:
          - "sh"
          - "-c"
          - "until ./wait-for-it.sh -h logstash -p 9600 -t 120 -e _node/pipelines -j '.pipelines|length' -r 10 ; do echo Waiting for Logstash; sleep 10; done; echo Connected"
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 10m
            memory: 50Mi
  scheduler:
    podAnnotations:
      k8s.akamas.com/imageName: <REGISTRY_URL>/akamas/airflow_service
    env:
      - name: CONTAINER_NAME
        value: airflow
      - name: SERVICE
        value: airflow
      - name: LOGTYPE
        value: airflow
      - name: IMAGE_NAME
        value: <REGISTRY_URL>/akamas/airflow_service
      - name: AIRFLOW_CONN_HTTP_SYSTEM
        value: "http://:@system:8080"
      - name: AIRFLOW_CONN_HTTP_CAMPAIGN
        value: "http://:@campaign:8080"
      - name: AIRFLOW_CONN_HTTP_ORCHESTRATOR
        value: "http://:@orchestrator:8080"
      - name: KEYCLOAK_ENDPOINT
        value: "http://keycloak:8080"

    extraInitContainers:
      - name: wait-logstash
        image: <REGISTRY_URL>/akamas/utils:0.1.7
        command:
          - "sh"
          - "-c"
          - "until ./wait-for-it.sh -h logstash -p 9600 -t 120 -e _node/pipelines -j '.pipelines|length' -r 10 ; do echo Waiting for Logstash; sleep 10; done; echo Connected"
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 10m
            memory: 50Mi

Replace in the file the following placeholders:

  • CUSTOMER_NAME: customer name provided with the Akamas license

  • ADMIN_PASSWORD: initial administrator password

  • INSTANCE_HOSTNAME: the URL that will be used to expose the Akamas installation, for example https://akamas.k8s.example.com when using an Ingress, or http://localhost:9000 when using port-forwarding. Refer to Accessing Akamas for the list of the supported access methods and a reference for any additional configuration required.

  • REGISTRY_URL: the URL for the private registry used in the transfer process above

Configure the authentication

This section describes how to configure the authentication to your private registry. If your registry does not require any authentication, skip directly to the installation section.

To authenticate to your private registry, you must manually create the Secret required to pull the images. If the registry uses basic authentication, you can create the credentials in the namespace by running the following command:

kubectl create secret docker-registry registry-token \
  --namespace akamas \
  --docker-server=<REGISTRY_URL> \
  --docker-username=<USER> \
  --docker-password=<PASSWORD>

Otherwise, you can leverage any credential already configured on your machine by running the following command:

kubectl create secret docker-registry registry-token \
  --namespace akamas \
  --from-file=.dockerconfigjson=<PATH/TO/.docker/config.json>

Define Size

Akamas can be installed in three sizes: Small, Medium, and Large, as explained in the cluster requirements section. By default, the chart installs the Small size. If you want to install a specific size, add the following snippet to your values.yaml file.

Medium

# Medium
airflow:
  config:
    core:
      parallelism: 102
  scheduler:
    resources:
      limits:
        cpu: 2500m
        memory: 21000Mi
      requests:
        cpu: 1000m
        memory: 21000Mi

Large

# Large
airflow:
  config:
    core:
      parallelism: 202
  scheduler:
    resources:
      limits:
        cpu: 2500m
        memory: 28000Mi
      requests:
        cpu: 1000m
        memory: 28000Mi
telemetry:
  parallelism: 50

Start the installation

If the host you are using to install Akamas can reach helm.akamas.io, you can follow the instructions in the online installation guide. Otherwise, follow the instructions below to download the chart content locally.

From a machine that can reach the endpoint, run the following command to download the chart:

helm pull --repo http://helm.akamas.io/charts --version '1.6.3' akamas

The command downloads the latest chart version as an archive named akamas-<version>.tgz. The file can be transferred to the machine where the installation will be run. Replace akamas/akamas with the downloaded package in the following commands.

If you wish to see and override the values that Helm will use to install Akamas, you may execute the following command:

helm show values akamas-<version>.tgz

Now, with the configuration file you just created (and the new variables you added to override the defaults), you can start the installation with the following command:

helm upgrade --install \
  --create-namespace --namespace akamas \
  -f akamas.yaml \
  akamas akamas-<version>.tgz

This command will create the Akamas resources within the specified namespace. You can define a different namespace by changing the argument --namespace <your-namespace>.

An example output of a successful installation is the following:

Release "akamas" does not exist. Installing it now.
NAME: akamas
LAST DEPLOYED: Thu Sep 21 10:39:01 2023
NAMESPACE: akamas
STATUS: deployed
REVISION: 1
NOTES:
Akamas has been installed

To get the initial password use the following command:

kubectl get secret akamas-admin-credentials -o go-template='{{ .data.password | base64decode }}'

Check the installation

To monitor the application startup, run the command kubectl get pods. After a few minutes, the expected output should be similar to the following:

NAME                           READY   STATUS    RESTARTS   AGE
airflow-6ffbbf46d8-dqf8m       3/3     Running   0          5m
analyzer-67cf968b48-jhxvd      1/1     Running   0          5m
campaign-666c5db96-xvl2z       1/1     Running   0          5m
database-0                     1/1     Running   0          5m
elasticsearch-master-0         1/1     Running   0          5m
keycloak-66f748d54-7l6wb       1/1     Running   0          5m
kibana-6d86b8cbf5-6nz9v        1/1     Running   0          5m
kong-7d6fdd97cf-c2xc9          1/1     Running   0          5m
license-54ff5cc5d8-tr64l       1/1     Running   0          5m
log-5974b5c86b-4q7lj           1/1     Running   0          5m
logstash-8697dd69f8-9bkts      1/1     Running   0          5m
metrics-577fb6bf8d-j7cl2       1/1     Running   0          5m
optimizer-5b7576c6bb-96w8n     1/1     Running   0          5m
orchestrator-95c57fd45-lh4m6   1/1     Running   0          5m
store-5489dd65f4-lsk62         1/1     Running   0          5m
system-5877d4c89b-h8s6v        1/1     Running   0          5m
telemetry-8cf448bf4-x68tr      1/1     Running   0          5m
ui-7f7f4c4f44-55lv5            1/1     Running   0          5m
users-966f8f78-wv4zj           1/1     Running   0          5m

At this point, you should be able to access the Akamas UI using the endpoint specified in the akamasBaseUrl, and interact through the Akamas CLI with the path /api.

If you haven't already, you can update your configuration file to use a different type of service to expose Akamas' endpoints. To do so, pick from the Accessing Akamas section the configuration snippet for the service type of your choice, add it to the akamas.yaml file, update the akamasBaseUrl value, and re-run the installation command to update your Helm release.

Installing telemetry providers

During online installation, a set of out-of-the-box telemetry providers is automatically installed. For offline installations, this step has to be executed manually. To install the telemetry providers required for your environment, proceed to the Integrating Telemetry Providers section.

Prerequisites

Before installing Akamas, please make sure to review all the following requirements:

  • Cluster requirements

  • Software requirements

Online Installation

Akamas on Kubernetes is provided as a set of templates packaged in a chart archive managed by Helm. Before starting the installation, make sure the requirements are met.

Create the configuration file

To proceed with the installation, you need to create a Helm Values file, called akamas.yaml in this guide, containing the mandatory configuration values required to customize your application. The following template contains the minimal set required to install Akamas:

akamas.yaml
# AWS credentials to fetch ECR images (required)
awsAccessKeyId: <AWS_ACCESS_KEY_ID>
awsSecretAccessKey: <AWS_SECRET_ACCESS_KEY>

# Akamas customer name. Must match the value in the license (required)
akamasCustomer: <CUSTOMER_NAME>

# Akamas administrator password. If not set a random password will be generated
akamasAdminPassword: <ADMIN_PASSWORD>

# The URL that will be used to access Akamas, for example 'http://akamas.kube.example.com' (required)
akamasBaseUrl: <INSTANCE_HOSTNAME>

You can also download the template file by running the following snippet:

curl -so akamas.yaml  http://helm.akamas.io/templates/1.6.3/akamas.yaml.template

Replace in the file the following placeholders:

  • AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY: the AWS credentials for pulling the Akamas images

  • CUSTOMER_NAME: customer name provided with the Akamas license

  • ADMIN_PASSWORD: initial administrator password

  • INSTANCE_HOSTNAME: the URL that will be used to expose the Akamas installation, for example https://akamas.k8s.example.com when using an Ingress, or http://localhost:9000 when using port-forwarding. Refer to Accessing Akamas for the list of the supported access methods and a reference for any additional configuration required.

Define Size

Akamas can be installed in three sizes: Small, Medium, and Large, as explained in the cluster requirements section. By default, the chart installs the Small size. If you want to install a specific size, add the following snippet to your values.yaml file.

Medium

# Medium
airflow:
  config:
    core:
      parallelism: 102
  scheduler:
    resources:
      limits:
        cpu: 2500m
        memory: 21000Mi
      requests:
        cpu: 1000m
        memory: 21000Mi

Large

# Large
airflow:
  config:
    core:
      parallelism: 202
  scheduler:
    resources:
      limits:
        cpu: 2500m
        memory: 28000Mi
      requests:
        cpu: 1000m
        memory: 28000Mi
telemetry:
  parallelism: 50

Start the installation

With the configuration file you just created (and the new variables you added to override the defaults), you can start the installation with the following command:

helm upgrade --install \
  --create-namespace --namespace akamas \
  --repo http://helm.akamas.io/charts \
  --version '1.6.3' \
  -f akamas.yaml \
  akamas akamas

This command will create the Akamas resources within the specified namespace. You can define a different namespace by changing the argument --namespace <your-namespace>.

An example output of a successful installation is the following:

Release "akamas" does not exist. Installing it now.
NAME: akamas
LAST DEPLOYED: Thu Sep 21 10:39:01 2023
NAMESPACE: akamas
STATUS: deployed
REVISION: 1
NOTES:
Akamas has been installed

To get the initial password use the following command:

kubectl get secret akamas-admin-credentials -o go-template='{{ .data.password | base64decode }}'

Check the installation

To monitor the application startup, run the command kubectl get pods. After a few minutes, the expected output should be similar to the following:

NAME                           READY   STATUS    RESTARTS   AGE
airflow-6ffbbf46d8-dqf8m       3/3     Running   0          5m
analyzer-67cf968b48-jhxvd      1/1     Running   0          5m
campaign-666c5db96-xvl2z       1/1     Running   0          5m
database-0                     1/1     Running   0          5m
elasticsearch-master-0         1/1     Running   0          5m
keycloak-66f748d54-7l6wb       1/1     Running   0          5m
kibana-6d86b8cbf5-6nz9v        1/1     Running   0          5m
kong-7d6fdd97cf-c2xc9          1/1     Running   0          5m
license-54ff5cc5d8-tr64l       1/1     Running   0          5m
log-5974b5c86b-4q7lj           1/1     Running   0          5m
logstash-8697dd69f8-9bkts      1/1     Running   0          5m
metrics-577fb6bf8d-j7cl2       1/1     Running   0          5m
optimizer-5b7576c6bb-96w8n     1/1     Running   0          5m
orchestrator-95c57fd45-lh4m6   1/1     Running   0          5m
store-5489dd65f4-lsk62         1/1     Running   0          5m
system-5877d4c89b-h8s6v        1/1     Running   0          5m
telemetry-8cf448bf4-x68tr      1/1     Running   0          5m
ui-7f7f4c4f44-55lv5            1/1     Running   0          5m
users-966f8f78-wv4zj           1/1     Running   0          5m

At this point, you should be able to access the Akamas UI using the endpoint specified in the akamasBaseUrl, and interact through the Akamas CLI with the path /api.

If you haven't already, you can update your configuration file to use a different type of service to expose Akamas' endpoints. To do so, pick from the Accessing Akamas section the configuration snippet for the service type of your choice, add it to the akamas.yaml file, update the akamasBaseUrl value, and re-run the installation command to update your Helm release.

Cluster Requirements

Kubernetes version

Running Akamas requires a cluster running Kubernetes version 1.24 or higher.

Resources requirements

Akamas can be deployed in three different sizes depending on the number of concurrent optimization studies that will be executed. If you are unsure about which size is appropriate for your environment we suggest you start with the small one and upgrade to bigger ones as you expand the optimization activity to more applications.

The tables below report the required resources both for requests and limits that should be available in the cluster to use Akamas.

The resources specified on this page have been defined by considering a dedicated namespace that runs only Akamas components. If your cluster has additional tools (e.g., a service mesh or a monitoring agent) that inject containers into the Akamas pods, we suggest either disabling them or increasing the sizing to account for their overhead. Also, if you plan to deploy other software inside the Akamas namespace and resource quotas are enabled, you should increase the size considering the resources required by that specific software.

Small

The small tier is suited for environments that need to support up to 3 concurrent optimization studies.

Resource      Requests    Limits
CPU           4 Cores     15 Cores
Memory        28 GB       28 GB
Disk Space    70 GB       70 GB

Medium

The medium tier is suited for environments that need to support up to 50 concurrent optimization studies.

Resource      Requests    Limits
CPU           8 Cores     20 Cores
Memory        50 GB       50 GB
Disk Space    100 GB      100 GB

Large

The large tier is suited for environments that need to support up to 100 concurrent optimization studies. If you plan to run more concurrent studies, please contact Akamas support to plan your installation.

Resource      Requests    Limits
CPU           10 Cores    25 Cores
Memory        60 GB       60 GB
Disk Space    150 GB      150 GB

Storage requirements

The cluster must define a Storage Class so that the application installation can leverage Persistent Volume Claims to dynamically provision the volumes required to persist data.
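
You can list the Storage Classes defined in your cluster with the following command; the default one, if any, is flagged as (default) in the output:

kubectl get storageclass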

Permissions

Cluster-level permissions are not required to install and run Akamas. The following is the minimal set of namespaced rules:

- apiGroups: ["", "apps", "policy", "batch", "networking.k8s.io", "events.k8s.io", "rbac.authorization.k8s.io"]
  resources:
    - configmaps
    - cronjobs
    - deployments
    - events
    - ingresses
    - jobs
    - persistentvolumeclaims
    - poddisruptionbudgets
    - pods
    - pods/log
    - rolebindings
    - roles
    - secrets
    - serviceaccounts
    - services
    - statefulsets
  verbs: ["get", "list", "create", "delete", "patch", "update", "watch"]

Networking

Networking requirements depend on how users interact with Akamas. Services can be exposed via Ingress or using kubectl as a proxy. Refer to Accessing Akamas for a more detailed description of the available options.

For more information on this topic, refer to Kubernetes' official documentation.

Accessing Akamas

To interact with your Akamas instance, you need the UI and API Gateway to be accessible from outside the cluster.

Kubernetes offers different options to expose a service outside of the cluster. The following is a list of the supported ones, with examples of how to configure them to work in your chart release:

  • Port Forwarding

  • Ingress

While changing the access mode of your Akamas installation, you must also update the value of the akamasBaseUrl option of the Helm Values file to match the new endpoint used.

Port Forwarding

By default, Akamas uses Cluster IPs for its services, allowing communication only inside the cluster. Still, you can leverage kubectl's port-forward to create a private connection and expose any internal service on your local machine.

This solution is suggested to perform quick tests without exposing the application or in scenarios where cluster access to the public is not allowed.

Set akamasBaseUrl to http://localhost:9000 in your Helm Values file, and install or update your Akamas deployment using the Helm command. Once the rollout is complete, open a tunnel to the UI with the following command:

kubectl port-forward service/ui 9000:http

As long as the port-forwarding is running, you will be able to interact with the UI through the tunnel; you can also interact through the Akamas CLI by configuring the URL http://localhost:9000/akapi. Refer to the official Kubernetes documentation for more details about port-forwarding.

Ingress

An Ingress is a Kubernetes object that provides service access, load balancing, and SSL termination to Kubernetes services.

To expose the Akamas UI through an Ingress, configure the Helm Values file by configuring akamasBaseUrl with the host of the Ingress (e.g.: https://akamas.kube.example.com), and by adding the snippet below:

ingress:
  enabled: true
  tls:
    - secretName: "<SECRET_NAME>"  # Secret containing the certificate and key data
  annotations: {}  # Optional

Here is a description of the fields:

  • enabled: set to true to enable the Ingress

  • tls: configure secretName with the name of the Secret containing the TLS certificate for the hostname configured in akamasBaseUrl. This secret must be created manually before applying the configuration (see TLS Secrets in the Kubernetes documentation, and the example after this list) or managed by a certificate issuer configured in the namespace.

  • annotations: optional, provide any additional annotation required in your deployment. If your cluster leverages any certificate issuer (such as cert-manager), you can add here the annotations required to interact with the issuer.

Re-run the install command to update the configuration. Once the rollout is complete, you will be able to access the UI using the URL specified in akamasBaseUrl and interact with the CLI using ${akamasBaseUrl}/api.

Refer to the official Kubernetes documentation for more details on Ingresses.
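
As an example, a TLS Secret suitable for the Ingress can be created from an existing certificate and key pair as follows (the secret name and file paths are placeholders):

kubectl create secret tls akamas-tls \
  --namespace akamas \
  --cert=path/to/tls.crt \
  --key=path/to/tls.key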

Selecting Cluster Nodes

You can use Kubernetes Node Selector to specify a set of nodes of the cluster on which Akamas containers will be scheduled.

To do so, you should first look for a label common to all those nodes or create a new one. You can read more about labels in Kubernetes in the official documentation.

Once you have defined a label (say nodeRole: akamas), you can edit the values.yaml file defined in the Installing Akamas section adding the following properties:

# Node selector for core akamas services
nodeSelector:
  nodeRole: akamas

# Node selector for elasticsearch database
elasticsearch:
  nodeSelector:
    nodeRole: akamas

# Node selector for postgresql database
postgresql:
  primary:
    nodeSelector:
      nodeRole: akamas

# Node selector for airflow
airflow:
  nodeSelector:
    nodeRole: akamas

You can then re-apply the chart using the helm upgrade command.
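
For instance, to assign the nodeRole: akamas label used above to a specific node (the node name is a placeholder), run:

kubectl label nodes <node-name> nodeRole=akamas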

Network requirements

This section lists all the connectivity settings required to operate and manage Akamas

Internet access

Internet access is required for Akamas online installation and updated procedures and allows retrieving the most updated Akamas container images from the Akamas private Amazon Elastic Container Registry (ECR).

If internet access is not available for policies or security reasons, Akamas installation and updates can be executed offline.

Internet access from the Akamas server is not mandatory but it’s strongly recommended.

Ports

The following table provides a list of the ports on the Akamas server that have to be reachable by Akamas administrators and users to properly operate the system.

Source               Destination     Port         Reason
Akamas admin         Akamas server   22           ssh
Akamas admin/user    Akamas server   80, 443      Akamas web UI access
Akamas admin/user    Akamas server   8000, 8443   Akamas API access

In the specific case of AWS instances and customer instances sharing the same VPC/subnet inside AWS, you should:

  • open all of the ports listed in the table above for all inbound addresses (0.0.0.0/32) on your AWS security group

  • open outbound rules to all traffic, and then attach this AWS security group (which must reside inside a private subnet) to the Akamas machine and all customer application AWS machines
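
As an illustrative sketch only (the security group ID and CIDR below are placeholders), an inbound rule for the web UI port could be added with the AWS CLI as follows:

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 443 \
  --cidr 10.0.0.0/16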

Setup the CLI

Linux

To get Akamas CLI installed on Linux, run the following commands:

curl -o akamas_cli https://s3.us-east-2.amazonaws.com/akamas/cli/$(curl -s https://s3.us-east-2.amazonaws.com/akamas/cli/stable.txt)/linux_64/akamas
sudo mv akamas_cli /usr/local/bin/akamas
chmod 755 /usr/local/bin/akamas

You can now run the Akamas CLI by running the akamas command.

In some installations, the /usr/local/bin folder is not present in the PATH environment variable. This prevents you from using akamas without specifying the complete file location. To fix this issue you can add an entry to the PATH system environment variable or move the executable to another folder in your PATH.

Auto-completion

To enable auto-completion on Linux systems with a bash shell (requires bash 4.4+), run the following commands:

curl -O https://s3.us-east-2.amazonaws.com/akamas/cli/$(curl -s https://s3.us-east-2.amazonaws.com/akamas/cli/stable.txt)/linux_64/akamas_autocomplete.sh
mkdir -p ~/.akamas
mv akamas_autocomplete.sh ~/.akamas
echo '. ~/.akamas/akamas_autocomplete.sh' >> ~/.bashrc
source ~/.bashrc

Windows

To install the Akamas CLI on Windows run the following command from Powershell:

Invoke-WebRequest "https://s3.us-east-2.amazonaws.com/akamas/cli/$($(Invoke-WebRequest https://s3.us-east-2.amazonaws.com/akamas/cli/stable.txt | Select-Object -Expand Content) -replace '\n', '')/win_64/akamas.exe" -OutFile akamas.exe

You can now run the Akamas CLI by running .\akamas in the same folder.

To invoke the akamas CLI from any folder, create an akamas folder (such as C:\Program Files\akamas) and move the akamas.exe file there. Then, add an entry to the PATH system environment variable with the value C:\Program Files\akamas. Now, you can invoke the CLI from any folder by simply running the akamas command.

Verify the CLI

You can verify that the CLI was installed correctly by running this command:

akamas version

which should show an output similar to this one

Akamas CLI: 2.9.10
Akamas platform: 3.6.2

At any time, you can see available commands and options with:

akamas --help

For the full list of Akamas commands, please refer to the CLI reference section.

Installing on OpenShift

Running Akamas on OpenShift requires some Helm configurations to be applied.

The installation is provided as a set of templates packaged in a chart archive managed by Helm. Custom values are applied to ensure Akamas complies with the default restricted-v2 security context constraints.

OpenShift requirements

OpenShift version 4.x.

Before proceeding with the installation, make sure you meet the Kubernetes requirements.

Installation

The installation can be done offline or online, as described in the section Install Akamas. Choose the one that better suits your cluster access policies.

The following snippet must be added to the akamas.yaml to install Akamas on OpenShift.

akamas.yaml
airflow:
  uid: null
  gid: null

postgresql:
  primary:
    containerSecurityContext:
      enabled: false

    podSecurityContext:
      enabled: false

  shmVolume:
    enabled: false

kibana:
  podSecurityContext:
    fsGroup: null

  securityContext:
    runAsUser: null

elasticsearch:
  sysctlInitContainer:
    enabled: false

  securityContext:
    runAsUser: null

  podSecurityContext:
    fsGroup: null
    runAsUser: null

Access Akamas - Ingress to route

Besides the methods described in Accessing Akamas, you can use the OpenShift default ingress controller to create the required routes. Add the following snippet to the akamas.yaml file:

akamas.yaml
ingress:
  enabled: true
  
  annotations:
    route.openshift.io/termination: edge
    haproxy.router.openshift.io/timeout: 1200s

  className: ""

  tls:
    - {}

Once the Helm command is invoked, ensure the routes have been created by running:

oc get routes

The output must list the Akamas routes with different paths.

Toolbox

The toolbox optional component requires privileged access to run on OpenShift; the toolbox uses a dedicated service account, named toolbox by default. You can grant privileged access by issuing the following command.

#This command assumes the akamas namespace is named "akamas" 
# and the service account default name "toolbox" is used
oc adm policy add-scc-to-user privileged system:serviceaccount:akamas:toolbox


Install the CLI

This section describes how to install the Akamas CLI on a workstation.

The Akamas CLI allows users to invoke commands against the Akamas dedicated machine (Akamas Server). The Akamas CLI can also be installed on a different system than the Akamas Server.

Prerequisites

Linux and Windows operating systems are supported for installing Akamas CLI.

Installation steps

The Akamas CLI can be installed and configured in two simple steps:

  • Setup the CLI

  • Initialize the CLI

Refer to the Change CLI config section to modify the CLI ports the Akamas Server is listening to. The Use a proxy server section provides instructions on how to interact with Akamas via a proxy server.

Useful commands

You may find helpful some of the commands listed in the sections below.

Read database passwords

By default, access to each service database is assigned to a user with a randomly generated password. For example, to read the campaign service database password, execute the following command:

kubectl get secret database-user-credentials -o go-template='{{ .data.campaign | base64decode }}'

The username for the campaign service can be found in the configuration file under each service section. To read the username for the campaign service set during the installation, launch the following command:

helm get values akamas --all --output json | jq '.campaign.database.user'

You can connect to the campaign_service database with the user and password above.

If you want to show all the passwords, execute this command:

kubectl get secret database-user-credentials -o go-template='{{range $k,$v := .data}} {{printf "%s: %s\n" $k ( $v |base64decode ) }}{{end}}'
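
With these credentials, you can open a SQL session against the database. A hypothetical sketch, assuming the PostgreSQL pod is named database-0 (as in the pod listing shown in the installation section) and that the psql client is available in the container:

kubectl exec -it database-0 -- psql -U <campaign-user> -d campaign_service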

Change CLI configuration

API Address

The CLI, as well as the UI, interacts with the Akamas server via APIs. The apiAddress configuration contains the information required to communicate with the server; it can be easily created and updated with a configuration wizard. This page describes the main options of the Akamas CLI and how to modify them. If your Akamas instance is installed with Kubernetes, ensure the UI service is configured correctly.

Docker

The Akamas Server provides different listeners to interact with APIs:

  • an HTTP listener on port 80 under the path /akapi

  • an HTTP listener on port 8000

  • an HTTPS listener on port 443 under the path /akapi

  • an HTTPS listener on port 8443

Depending on your networking setup you can either use the listeners on ports 80 and 443 which are also used for the UI or directly interact with the API gateway on ports 8000 and 8443. If you are unsure about your network setup we suggest you start with the HTTPS listener on port 443.

For improved security, it is recommended to configure CLI communications with the Akamas Server over HTTPS. Notice that you need to have a valid certificate installed on your Akamas server (at least a self-signed one) to enable HTTPS communication between CLI and the Akamas Server.

Changing CLI protocol

The CLI can be configured either directly via the CLI itself or via the YAML configuration file akamasconf.

Using the CLI

Issue the following command to change the configuration of the Akamas CLI:

akamas init config

and then follow the wizard to provide the required CLI configuration:

  • enable HTTPS communications:

Api address [http://localhost:8000]: https://<akamas server dns name>:443/akapi
Workspace [default]: Workspace1
Login method (local, oauth2) [local]: local
Verify SSL: [True]: True
Is external certificate CA required? [y/N]: N
  • enable HTTP communications:

Api address [http://localhost:8000]: http://<akamas server DNS name>:80
Workspace [default]: Workspace1
Login method (local, oauth2) [local]: local

Please notice that by default the Akamas CLI expects a valid SSL certificate. If you are using a self-signed or otherwise invalid certificate, you can set the Verify SSL variable to False. This mimics the behavior of accepting an invalid HTTPS certificate in your favorite browser.

Using the akamasconf file

Create a file and name it akamasconf to be located at the following locations:

  • Linux: ~/.akamas/akamasconf

  • Windows: C:\Users\<username>\.akamas (where C: is the drive where the OS is installed)

The file location can be customized by setting an $AKAMASCONF environment variable.

Here is an example akamasconf file provided as a sample:

apiAddress: http[s]://<akamas server dns name>:80[443]/akapi
verifySsl: [true|false]
workspace: default
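
For instance, a filled-in akamasconf for an HTTPS endpoint could look as follows (the hostname is a placeholder):

apiAddress: https://akamas.example.com:443/akapi
verifySsl: true
workspace: default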


Use a proxy server

The Akamas CLI supports interacting with the API server through an HTTP/HTTPS proxy server.

To enable access via an HTTP proxy, set the environment variable HTTP_PROXY. In the following snippet, replace proxy_ip and proxy_port with the desired values.

export HTTP_PROXY="http://<proxy_ip>:<proxy_port>"

Then, run the akamas command to verify access.

akamas status debug

Access through an HTTPS proxy can be set by using the environment variable HTTPS_PROXY instead of HTTP_PROXY.

Initialize the CLI

The CLI is used to interact with an akamas server. To initialize the configuration of the Akamas CLI you can run the command:

akamas init config

and follow the wizard to provide the required information such as the server IP.

Here is a summary of the configuration wizard options.

Api address [http://localhost:8000]: https://<akamas-hostname>:<ui-port>/akapi
Workspace [default]: default
Verify SSL: [True]: True
Is external certificate CA required? [y/N]: N

After this step, the Akamas CLI can be used to login to the Akamas server, by issuing the following command:

akamas login

and providing the credentials as requested.

Managing Akamas

This section covers different topics related to how to manage the Akamas Server:

  • Akamas logs

  • Audit logs

  • Install upgrades and patches

  • Monitor the Akamas Server

  • Backup & Recovery of the Akamas Server

Verify the installation

Run the following command to verify the correct startup and initialization of Akamas:

akamas status

When all services have been started, this command will return an "OK" message. Please notice that it might take a few minutes for Akamas to start all services.

To check the UI is also properly working, please access the following URL:

http://<akamas server name here>

You will see the Akamas login form. This configuration can be changed at any time (see how to change the CLI config).

Please notice that it is impossible to log into Akamas before a license has been installed. Read here how to Install an Akamas license.

Upgrade Akamas

The following sections describe the procedure to upgrade your Akamas instance.

If you plan to upgrade your Akamas instance, please verify the upgrade path with the Akamas support team. To ensure rollback in case of upgrade failure, it is suggested to back up your studies (see section Backup & Recovery of the Akamas Server).

Install the license

Running Akamas' studies requires a valid license.

To install a license get in touch with Akamas Customer Service to receive:

  • the Akamas license file

  • your "customer name" to configure in the variable AKAMAS_CUSTOMER for Docker installations or akamasCustomer for Kubernetes installations

  • the URL to configure in the AKAMAS_BASE_URL variable for Docker installations

  • login credentials

Once you have this information, you can issue the following commands:

cd <your bundle files location>
akamas install license <license file you have been provided>

To get the administrator's initial password for Kubernetes installations, run the following command:

kubectl get secret -n <NAMESPACE> akamas-admin-credentials -o go-template='{{.data.password | base64decode}}'

Installing the toolbox

Akamas offers, as an additional container, a toolbox that contains the Akamas CLI executable, along with some other useful command-line tools such as kubectl, Helm, vim, the Docker CLI, jq, yq, git, gzip, zip, OpenSSH, ping, cURL, and wget. It can be executed alongside the Akamas services: in the same network for Docker Compose installations, or in the akamas namespace for Kubernetes installations.

This toolbox aims to:

  • allow users to interact with Akamas without the need to install the Akamas CLI on their systems

  • provide Akamas' workflows with an environment where to run workflow-related scripts and persist artifacts when no other options (e.g. a dedicated host) are available

Docker compose installation

By setting the following options in the .env file, you can configure your toolbox by enabling SSH password authentication (only key-based authentication will be available otherwise) and by setting a login password:

.env
ALLOW_PASSWORD=true
CUSTOM_PASSWORD=yourPassword

To start the toolbox container just issue the following command:

docker compose --profile toolbox up -d

If you want to keep the toolbox running also after a complete restart you can also add the following line to your .env file: COMPOSE_PROFILES=toolbox

Accessing the toolbox on Docker

To access the toolbox on docker you can issue the following command:

docker exec -it toolbox bash

You will be provided with a shell inside the toolbox where you can interact with Akamas. Please read the work folder section below for more information on how to persist scripts and data on the toolbox upon restart and upgrades.

Kubernetes installation

Follow the usual guide for installing Akamas on Kubernetes, but make sure to add the following variables (toolbox support is disabled by default) to your akamas.yaml file or to the file values-files/my-values.yaml (which can be created if missing):

toolbox:
  enabled: true
  sshPassword:
    # enable SSH password authentication. If 'false', only key-based access
    # will be allowed
    enabled: false
    # configure the password for the toolbox user. If not provided, an
    # autogenerated password will be used
    value: <my-custom-password>
  # Optionally you can also specify custom resource limits
  resources:
    limits:
      cpu: 300m
      memory: 300Mi
    requests:
      cpu: 100m
      memory: 300Mi

Then, you can launch the usual helm upgrade --install ... command to run the pod, as described in the Start the installation (online) or Start the installation (offline) sections.

Service Account

By default, the toolbox uses a dedicated service account to allow for more granularity and control over permissions.

The service account will be created automatically upon first installation. If you need to use an existing service account you can specify its configuration in the values file using the following snippet.

toolbox:
  # ...
  serviceAccount:
    create: true      # Automatically create the SA if it does not already exist
    name: toolbox     # Name of the SA to use for the toolbox

You can verify which credentials the kubectl CLI is using by running the kubectl auth whoami command from within the toolbox.

Accessing the toolbox on Kubernetes

When it's deployed to Kubernetes, you may access this toolbox in two ways:

  • via kubectl

  • via SSH command

Kubectl access

Accessing is as simple as:

kubectl exec -it deployment/toolbox -- bash

SSH access

For this type of access, you need to retrieve the SSH login password (if enabled) or key. To fetch them, run the following commands:

# Get the password
kubectl exec deployment/toolbox -- cat /home/akamas/password
# Get the key
kubectl exec deployment/toolbox -- cat /home/akamas/.ssh/id_rsa

With this info, you can leverage the toolbox to run commands in your workflows, like in the following example:

name: hello-workflow
tasks:
  - name: Say Hello
    operator: Executor
    arguments:
      command: echo 'Hello Akamas'
      host:
        hostname: toolbox
        username: akamas
        password: d48020ab71be6a07

You can also access the toolbox by port-forwarding from your local machine (on port 2222 in our example). Run the following kubectl command:

kubectl port-forward service/toolbox 2222:22

On another terminal, run:

ssh akamas@localhost -p 2222

and answer yes to the question, then insert the akamas password to successfully SSH access the toolbox (see example below):

$ ssh akamas@localhost -p 2222
The authenticity of host '[localhost]:2222 ([127.0.0.1]:2222)' can't be established.
ED25519 key fingerprint is SHA256:34GXnmRz1YjWr2TTpUpJmRoHYck0NzeAxni2L857Exs.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[localhost]:2222' (ED25519) to the list of known hosts.
akamas@localhost's password:
Welcome to Ubuntu 20.04.6 LTS (GNU/Linux 5.10.178-162.673.amzn2.x86_64 x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

This system has been minimized by removing packages and content that are
not required on a system that users do not log into.

To restore this content, you can run the 'unminimize' command.

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

akamas@toolbox-6dd8b7f898-8xwzf:~$

Work directory

If you need to store Akamas artifacts, scripts, or any other file that requires persistence, you can use the /work directory, which persists across restarts. This is the default working directory at login time.

A typical Kubernetes scenario is Akamas running inside one namespace and a customer application running inside another. In such a scenario, you will probably need to create an Akamas workflow (running from the akamas namespace) that applies a new configuration to the customer application (running in the customer namespace); Akamas then collects new metrics for a period of time and calculates a new configuration based on the score of the previous one.

What follows is a typical workflow example (sketched after the list below) that:

  • uses a FileConfigurator to create a new helm file that applies the new configuration computed by Akamas on a single service named adservice. FileConfigurator recreates a new adservice.yaml file by using the template adservice.yaml.templ. Just make sure that adservice.yaml.templ contains namespace: boutique (the customer namespace, in our example)

  • uses an Executor that launches kubectl apply with the new helm file adservice.yaml you just saved to apply the new configuration

  • uses another Executor to wait for the new configuration to be rolled out by launching kubectl rollout status

  • waits for half an hour to observe the changes in metrics
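
A minimal sketch of such a workflow is shown below. It reuses the toolbox host credentials from the SSH example above; the FileConfigurator source/target arguments and the Sleep operator shape are assumptions and may differ in your Akamas version:

name: adservice-workflow
tasks:
  # Render adservice.yaml from the template, substituting the
  # parameters of the configuration computed by Akamas
  - name: Configure adservice
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox
        username: akamas
        password: d48020ab71be6a07
        path: /work/adservice.yaml.templ
      target:
        hostname: toolbox
        username: akamas
        password: d48020ab71be6a07
        path: /work/adservice.yaml
  # Apply the new configuration to the customer namespace
  - name: Apply configuration
    operator: Executor
    arguments:
      command: kubectl apply -f /work/adservice.yaml
      host:
        hostname: toolbox
        username: akamas
        password: d48020ab71be6a07
  # Wait until the new pods are rolled out
  - name: Wait rollout
    operator: Executor
    arguments:
      command: kubectl rollout status deployment/adservice -n boutique
      host:
        hostname: toolbox
        username: akamas
        password: d48020ab71be6a07
  # Observe the system under the new configuration for half an hour
  - name: Observe metrics
    operator: Sleep
    arguments:
      seconds: 1800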

By default, SSH access to the toolbox is limited to a subset of internal services. In the Helm values file, you can configure toolbox.ingress with additional ingress rules.

Manage anonymous data collection

Akamas might collect anonymized usage information on running optimizations. Collection and tracking are disabled by default and can be manually enabled.

Docker installation

External tracking is managed through the following environment variables:

  • AKAMAS_TRACKER_URL: the target URL for all tracking info.

  • AKAMAS_TRACKING_OPT_OUT: when set to 1, disables anonymous data collection.

Tracking for a running instance can be enabled by editing the AKAMAS_TRACKING_OPT_OUT variable in the docker-compose.yaml file.

To enable tracking set the variable to the following value:

AKAMAS_TRACKING_OPT_OUT=0

Then issue the command:

docker compose up -d

Kubernetes installation

External tracking is managed through the field trackingOptOut in the Values file. To enable tracking set trackingOptOut to 0 as in the following example and upgrade the installation:

awsAccessKeyId: "YOUR_ACCESSKEY_ID"
awsSecretAccessKey: "YOUR_SECRET_ACCESS_KEY"

trackingOptOut: 0

Audit logs

Akamas audit logs

Akamas stores all its logs into an internal Elasticsearch instance: some of these logs are reported to the user in the GUI in order to ease the monitoring of workflow executions, while other logs are only accessible via CLI and are mostly used to provide more context and information to support requests.

Audit access can be performed by using the CLI in order to extract logs related to UI or API access. For instance, to extract audit logs from the last hour use the following commands:

  • UI Logs

akamas logs --no-pagination -S kong -f -1h
  • API Logs

akamas logs --no-pagination -S kong -f -1h

Notice: to visualize the system logs unrelated to the execution of workflows bound to workspaces, you need an account with administrative privileges.

Storing audit logs into files

Akamas can be configured to store access logs into files to ease the integration with external logging systems. Enabling this feature ensures that, when the user interacts with the UI or the API, Akamas reports detailed access logs both in the internal database and in a file in a dedicated log folder. To ease log rolling and management, every day Akamas creates a new file named according to the pattern access-%{+YYYY-MM-dd}.log.

Docker version

To enable this feature you should:

  1. Create a logs folder next to the Akamas docker-compose.yml file

  2. Edit the docker-compose.yml file by modifying the line FILE_LOG: "false" to FILE_LOG: "true"

  3. If Akamas is already running issue the following command

docker compose up -d logstash

otherwise, start Akamas first.

Kubernetes version

To enable this feature you should go to your Akamas chart folder, edit your values file (typically values-files/my-values.yaml), and add the following section (if a logstash: section is already present, add the new values to it):

logstash:
  enabled: true
  fileLogging:
    enabled: true

then perform installation or update as usual with:

make install

in this specific case, the logs will be stored in a dedicated volume attached to the logstash pod, under the folder /akamas/logs/.

To list them you can use the command:

kubectl exec deploy/logstash -- ls /akamas/logs/

To read a logfile you can use the command (replace LOGFILENAME.log with the actual name):

kubectl exec deploy/logstash -- cat /akamas/logs/LOGFILENAME.log

To copy them to your local machine you can use:

# for this specific command, you cannot use the deployment name 
# but you need the actual pod name
kubectl cp logstash-NNNNNNN-NNNN:/akamas/logs/ .

Akamas logs

Akamas allows dumping log entries from a specific service, workspace, workflow, study, trial, and experiment, for a specific timeframe and at different log levels.

Akamas CLI for logs

Akamas logs can be dumped via the following CLI command:

akamas log

This command provides many filters which can be retrieved with the following command:

akamas log --help

which should return

Usage: akamas log [OPTIONS] [MESSAGE]

  Show Akamas logs

Options:
  -d, --debug                     Show extended error messages if present.
  --page-size INTEGER             Number of log lines to be retrieved NOTE:
                                  This argument is mutually exclusive with
                                  arguments: [dump, no_pagination].
  --no-pagination                 Disable pagination and print all logs NOTE:
                                  This argument is mutually exclusive with
                                  arguments: [dump, page_size].
  --dump                          Print the logs without pagination and
                                  formatting NOTE: This argument is mutually
                                  exclusive with arguments: [page_size,
                                  no_pagination].
  -f, --from [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S|%Y-%m-%dT%H:%M:%S.%f|%Y-%m-%d %H:%M:%S.%f|[-]nw|[-]nd|[-]nh|[-]nm|[-]ns]
                                  The start timestamp of the logs
  -t, --to [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S|%Y-%m-%dT%H:%M:%S.%f|%Y-%m-%d %H:%M:%S.%f|[-]nw|[-]nd|[-]nh|[-]nm|[-]ns]
                                  The end timestamp of the logs
  -s, --study TEXT                UUID or name of the Study
  -e, --exp INTEGER               Number of the experiment
  --trial INTEGER                 Number of the trial
  -y, --system TEXT               UUID or name of the System
  -W, --workflow TEXT             UUID or name of the Workflow
  -l, --log-level TEXT            Log level
  -S, --service TEXT              Akamas service
  --without-metadata              Hide metadata
  --sorting [ASC|DESC]            Sorting order of the timestamps
  -w, --workspace TEXT           UUID or name of the Workspace to visualize.
                                  When empty, system logs will be returned
                                  instead
  --help                          Show this message and exit.

For example, to get the list of the most recent Akamas errors:

akamas log -l ERROR

which should return something similar to:

       timestamp                         system                  provider    service                                                                                   message
==============================================================================================================================================================================================================================================================
2022-05-02T15:51:26.88    -                                      -          airflow     Task failed with exception
2022-05-02T15:51:26.899   -                                      -          airflow     Failed to execute job 2 for task Akamas_LogCurator_Task
2022-05-02T15:56:29.195   -                                      -          airflow     Task failed with exception
2022-05-02T15:56:29.215   -                                      -          airflow     Failed to execute job 3 for task Akamas_LogCurator_Task
2022-05-02T16:01:55.587   -                                      -          license     2022-05-02 16:01:47.426 ERROR 1 --- [           main] c.a.m.utils.rest.RestHandlers            :  has failed with returning a response:
                                                                                        {"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following metrics: 'spark.spark_application_duration' were not found
                                                                                        in any of the components of the system 'analytics_cluster'","path":null}
2022-05-02T16:01:55.587   -                                      -          license     2022-05-02 16:01:47.434 ERROR 1 --- [           main] c.a.m.MigrationApplication               : Unable to complete operation. Mode: RESTORE. Cause: A request to a
                                                                                        downstream service CampaignService has failed: 400 : [{"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following
                                                                                        metrics: 'spark.spark_application_duration' were not found in any of the components of the system 'analytics_cluster'","path":null}]
2022-05-02T16:01:55.678   -                                      -          license     2022-05-02 16:01:47.434 ERROR 1 --- [           main] c.a.m.MigrationApplication               : Unable to complete operation. Mode: RESTORE. Cause: A request to a
                                                                                        downstream service CampaignService has failed: 400 : [{"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following
                                                                                        metrics: 'spark.spark_application_duration' were not found in any of the components of the system 'analytics_cluster'","path":null}]
2022-05-02T16:01:55.678   -                                      -          license     2022-05-02 16:01:47.426 ERROR 1 --- [           main] c.a.m.utils.rest.RestHandlers            :  has failed with returning a response:
                                                                                        {"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following metrics: 'spark.spark_application_duration' were not found
                                                                                        in any of the components of the system 'analytics_cluster'","path":null}
2022-05-02T16:12:10.261   -                                      -          license     2022-05-02 16:05:53.209 ERROR 1 --- [           main] c.a.m.services.CampaignService           : de9f5ff9-418e-4e25-ae2c-12fc8e72cafc
2022-05-02T16:32:07.216   -                                      -          license     2022-05-02 16:31:37.330 ERROR 1 --- [           main] c.a.m.services.CampaignService           : 06c4b858-8353-429c-bacd-0cc56cc44634
2022-05-02T16:38:18.522   -                                      -          campaign    Internal Server Error: Object of class [com.akamas.campaign_service.entities.campaign.experiment.Experiment] with identifier
                                                                                        [ExperimentIdentifier(workspace=ac8481d3-d031-4b6a-8ae9-c7b366f027e8, study=de9f5ff9-418e-4e25-ae2c-12fc8e72cafc, id=2)]: optimistic locking failed; nested exception
                                                                                        is org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) :
                                                                                        [com.akamas.campaign_service.entities.campaign.experiment.Experiment#ExperimentIdentifier(workspace=ac8481d3-d031-4b6a-8ae9-c7b366f027e8,
                                                                                        study=de9f5ff9-418e-4e25-ae2c-12fc8e72cafc, id=2)]

Viewing platform logs

By default, the Akamas CLI only shows logs of the current workspace. In order to see platform logs for events such as the installation of optimization packs or telemetry providers, you can specify the -ws option with an empty workspace name, such as:

akamas logs -ws '' -S license

Kubernetes

Online

Start by updating the local chart repository:

helm repo update akamas

Start online upgrade

Ensure your kubectl configuration points to the namespace where Akamas is installed or specify it with the --namespace parameter. To start the upgrade to the latest version:

helm upgrade akamas akamas/akamas

Offline

helm repo update akamas
helm pull akamas/akamas

Useful commands

Listing Akamas chart versions

The available Akamas chart versions can be listed by running the following command:

helm search repo akamas/akamas --versions

It is always suggested to install and upgrade to the latest chart version. The App Version field refers to the Akamas version. To ease the release process, multiple chart versions may refer to the same App Version.

Retrieving the Values file

In case you do not have access to the Values file used during the last installation/upgrade, you can still get it by running:

helm get values akamas -o yaml > akamas-values.yaml

This command is only needed if you have to change some of the parameters during the upgrade; otherwise, the old Values file is kept by Helm.
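
For instance, here is a minimal sketch of reusing the retrieved Values file on the next upgrade (the file name is the one produced by the command above):

helm upgrade akamas akamas/akamas -f akamas-values.yaml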

If you plan to upgrade your Akamas instance, please verify the upgrade path with the Akamas support team. To ensure rollback in case of upgrade failure, it is suggested to back up your studies (see the section Backup & Recover of the Akamas Server).

The following guide uses the same chart repository and helm release names. Before starting the upgrade, you may find it helpful to look at the section Useful commands.

You can specify an older chart version using the --version parameter. Refer to Listing Akamas chart versions for discovering the published chart versions.
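
For example, a sketch of pinning the upgrade to a specific chart version (the version number below is purely illustrative):

helm upgrade akamas akamas/akamas --version 1.2.3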

If you need to specify a different Values file from the latest installation, start from the last one used. If you do not have it stored, it can be retrieved as specified in Retrieving the Values file.

Before starting the upgrade, check Configure the registry to add the new docker images.

If you cannot reach helm.akamas.io from the machine where the installation will be run, run the following commands from another client (see the installation guide for a full explanation).

Then, you can start the upgrade in the same way as for the Online version. If you are using the downloaded chart package, transfer the package and replace akamas/akamas with the downloaded tgz archive.
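
For example, a sketch of upgrading from a transferred package (the archive name is hypothetical and depends on the pulled chart version):

helm upgrade akamas ./akamas-1.2.3.tgz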


Docker compose

Docker compose Configuration

If you plan to upgrade your Akamas instance, please verify the upgrade path with the Akamas support team. To ensure rollback in case of upgrade failure, it is suggested to back up your studies (see the section Backup & Recover of the Akamas Server).

To start with the upgrade, on the Akamas server navigate to the folder where the docker-compose.yml and .env files are stored (see the section Get Akamas Docker artifacts). Now you can download the latest version of the compose file:

mv docker-compose.yml docker-compose.yml.bak
curl -O https://s3.us-east-2.amazonaws.com/akamas/compose/3.6.2/docker-compose.yml

You can also point to a specific version. As an example, to download the artifact for version 3.5.0:

curl -O https://s3.us-east-2.amazonaws.com/akamas/compose/3.5.0/docker-compose.yml

If the old docker-compose file had been customized and those changes are still needed in the newer Akamas version, make sure to migrate them from docker-compose.yml.bak to the new docker-compose.yml. Also ensure your .env file is up to date with the required variables, by comparing your version with the one described in Configure Akamas environment variables.

Then log in to AWS with the following command:

aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin 485790562880.dkr.ecr.us-east-2.amazonaws.com

If the login succeeds, then you can start the upgrade by running:

docker compose up -d

Wait for a few minutes, then check that the Akamas services are running with the command:

akamas status -d

The expected output should be like the following (repeat the command after a minute or two if the last line is not "OK" as expected):

Checking Akamas services on http://localhost:8000
service       status
=========================
analyzer      UP
campaign      UP
metrics       UP
optimizer     UP
orchestrator  UP
system        UP
telemetry     UP
license       UP
log           UP
users         UP
OK

Monitor Akamas status

Checking Akamas services

To check the status of the Akamas services, run akamas status -d to identify which service is not able to start up correctly.

Here is an example of output:

Checking Akamas services on http://localhost:8000
service       status
=========================
analyzer      UP
campaign      UP
metrics       UP
optimizer     UP
orchestrator  UP
system        UP
telemetry     UP
license       UP
log           UP
users         UP
OK

Backup & Recover of the Akamas Server

Akamas server backup

The process of backing up an Akamas server can be divided into two parts: system backup and user data backup. Backups can be performed in any way you see fit: they are just regular files, so you can use any backup tool.

System backup

System services are hosted on the AWS ECR repository, so the only thing that fully defines a working Akamas application is the docker-compose.yml file. Performing a backup of the Akamas application is as simple as copying this single file to your backup location. You may schedule a script that performs this copy weekly or at any frequency you see fit.
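
As a sketch, assuming the compose file lives in /opt/akamas and /backup is your backup location (both hypothetical paths), a weekly crontab entry could look like the following:

# Copy the compose file to the backup folder every Sunday at 02:00
0 2 * * 0 cp /opt/akamas/docker-compose.yml /backup/docker-compose.yml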

User data backup

You may list all existing Akamas studies via the Akamas CLI command:

akamas list study

Then you can export all existing studies one by one via the CLI command

akamas export study <UUID>

where UUID is the UUID of a single study. This command exports the study into a single archive file (tar.gz). These archive files can be backed up to your favorite backup folder.
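
A minimal sketch that automates this loop is shown below. It assumes that the first column printed by akamas list study is the study UUID (verify the output format of your CLI version) and that each export is written as a tar.gz archive in the current directory:

#!/usr/bin/env bash
# Export every Akamas study into a backup folder.
set -euo pipefail
BACKUP_DIR=/backup/akamas-studies   # hypothetical backup location
mkdir -p "$BACKUP_DIR"
cd "$BACKUP_DIR"
for uuid in $(akamas list study | awk 'NR>1 {print $1}'); do
  akamas export study "$uuid"
done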

Akamas server recovery

Akamas server recovery involves restoring the system backup, restarting the Akamas services, and then re-importing the studies.

System Restore

To restore the system, recover the original docker-compose.yml and launch the command

docker compose up -d

from the folder where you placed this YAML file, and then wait for the system to come up by checking it with the command

akamas status -d

User data restore

All studies can be re-imported one by one with the CLI command (referring to the correct pathname of the archive):

akamas import study archive.tgz
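
A sketch to re-import every archived study from a backup folder (the path is hypothetical):

for archive in /backup/akamas-studies/*.tar.gz; do
  akamas import study "$archive"
done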

Configure an external identity provider

To configure an external identity provider, start by accessing the Keycloak administration console. Refer to Accessing Keycloak admin console for detailed instructions.

Within the Akamas realm, navigate to the Identity Providers section.

The configuration steps will vary based on the provider you are integrating with. Select the appropriate guide below:

  • Azure Active Directory

  • Google

If you need to limit the number of user session logins for this provider, refer to Limit users sessions.

Accessing Keycloak admin console

The Keycloak administration console is exposed on the /auth page of your installation; for example, https://app.akamas.io/auth.

Now log into the Administration Console using the admin user. The password for this user can be retrieved in different ways, depending on the installation method:

  • Kubernetes. A custom password can be specified during the installation by providing a value for keycloak.adminPassword in the helm chart. If this value was left unspecified, you can retrieve the auto-generated password with the following command:

kubectl get secret keycloak-admin-credentials \
  -o go-template='{{ .data.KEYCLOAK_ADMIN_PASSWORD | base64decode }}'

Note that you might need to provide the namespace in which Akamas has been installed using the flag -n namespace.
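
For example, assuming Akamas was installed in a namespace named akamas:

kubectl -n akamas get secret keycloak-admin-credentials \
  -o go-template='{{ .data.KEYCLOAK_ADMIN_PASSWORD | base64decode }}'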

  • Docker.

    A custom password can be specified during the installation by providing a value for the variable KEYCLOAK_ADMIN_PASSWORD in the environment or the docker-compose file. If you didn't specify the value during the installation, you can retrieve the auto-generated password with the following command:

docker exec -it keycloak cat /config/keycloak_admin | cut -d '|' -f2

Akamas realm

Once logged in, select the akamas realm from the dropdown menu and navigate to the Identity providers section.

Limit users sessions

As a security measure, Akamas lets you enforce a limit on the number of concurrent sessions per user. By default, this is set to terminate the oldest sessions and keep only a restricted number alive. If you wish to change this behavior or the limit, you can do so by configuring the Akamas realm in Keycloak. The Local users section explains how to properly configure users stored in Keycloak, while the Identity provider users page explains how to apply the same limit for users managed by an Identity Provider.

Azure Active Directory

This guide provides a step-by-step walkthrough to configure Azure Active Directory (AD) as an external identity provider for Akamas users.

Ensure you have an Azure account with the Application.ReadWrite.All permission to create app registrations in your Azure AD tenant.

Configure the App registration

To integrate Akamas with your Azure AD, you’ll need a dedicated App registration in your Azure organization. You can either use an existing registration or create a new one.

  • Creating a New Registration: Follow the instructions below.

  • Using an Existing Registration: Skip to Get the client configuration.

Multiple Akamas instances can share a single app registration, meaning any AD user added to the registration can access all associated Akamas instances. To manage access with finer granularity, create a unique app registration for each Akamas instance.

Creating a new App registration

  • In your Azure portal, navigate to App registrations and select New registration.

  • Provide:

    • A name for the application.

    • The account type that best suits your use case.

  • Complete the process by clicking Register.

Get the client configuration

On the Overview page of your app registration, make note of the following values:

  • Application (client) ID

  • OpenID Connect metadata document (found in the "Endpoints" side panel)

Then, in the Certificates & secrets section, create a new Client secret and note its value. With these values ready, proceed to configure the provider in the Keycloak console.

Create the Identity provider in Keycloak

In the Keycloak admin console, access the Identity Providers section within the Akamas realm (see the Configure an external identity provider page for more details).

  • Select OpenID Connect v1.0 to start creating the new provider.

  • Provide:

    • Alias (e.g., "microsoft") and optional Display name (e.g., "Microsoft") for the login page.

  • In the OpenID Connect settings section, populate the following fields:

    • Discovery endpoint: Enter the URL of the OpenID Connect metadata document. A green box indicates successful validation.

    • Client ID: Enter the Application (client) ID.

    • Client Secret: Enter the generated client secret.

Click Add to complete the configuration. Copy the Redirect URI from the details page of the new provider.

Complete the app registration in Azure

Return to the Azure portal and open the app registration. In the Authentication section, add the Web platform (if not already present).

Add the Redirect URI from the Keycloak console to the list of redirect URIs.

Akamas is now configured to delegate user login to Azure AD.

If the hostname of the Akamas installation changes, update the Redirect URI in the app registration to avoid login errors such as:

The redirect URI 'https://...' specified in the request does not match the redirect URIs configured for the application '...'.

Configure the default Akamas roles

To automatically assign default roles to users, set up mappers in Keycloak so users can access the default workspace with read and write permissions upon first login.

In Keycloak, go to the provider's details page and navigate to Mappers:

Add the following configurations:

User role

  • Name: User role

  • Mapper type: Hardcoded role

  • Role: USER

Default Workspace Read

  • Name: Default Workspace Read

  • Mapper type: Hardcoded role

  • Role: WS_ac8481d3-d031-4b6a-8ae9-c7b366f027e8_R

Default Workspace Write

  • Name: Default Workspace Write

  • Mapper type: Hardcoded role

  • Role: WS_ac8481d3-d031-4b6a-8ae9-c7b366f027e8_W

Test the integration

Visit the Akamas installation's login page to verify that the new authentication method is displayed and working as expected.


Google

This guide provides a step-by-step walkthrough to configure Google as an external identity provider for Akamas users.

You will need a Google account with the privileges required to create app registrations.

Configure the App registration

To integrate Akamas with your Google Workspace, create a project with a dedicated OAuth client in the Google Developer Console.

  • Log in to your Google Developer Console.

  • Go to the API & Services section and navigate to Credentials.

Configure the Consent Screen

If a warning prompts you to configure the consent screen, you’ll need to create an app for user consent.

  • Click on the provided button to launch the Consent Screen Wizard.

  • Follow the wizard to configure the consent screen according to your company's policies. For more details, refer to Configure the OAuth consent screen in the official documentation.

  • Once the consent screen configuration is complete, return to the Credentials page.

Create the OAuth client

  • On the Credentials page, select Create Credentials and choose OAuth Client ID.

  • Configure the client as follows:

    • Application Type: Choose "Web application."

    • Name: Enter a name for the new client.

    • Authorized redirect URIs: Leave this blank for now; you’ll configure it in a later step.

After clicking Create, a confirmation popup will display the Client ID and Client Secret. Make note of these values.

Create the Identity provider

In the Keycloak admin console, go to the Identity Providers section within the Akamas realm (see Configure an external identity provider for more details).

  • Select Google as the provider type.

  • Fill in the following fields using the values from the OAuth client:

    • Client ID: Enter the Client ID from the Google Developer Console.

    • Client Secret: Enter the Client Secret.

Copy the Redirect URI generated by Keycloak and click Add to save the configuration.

Complete the app registration

Return to the Credentials page in the Google Developer Console. Open the newly created OAuth client, and in the Authorized Redirect URIs section, add the Redirect URI copied from Keycloak.

If the hostname of the Akamas installation changes, update the Redirect URI in the app registration to avoid login errors such as:

The redirect URI 'https://...' specified in the request does not match the redirect URIs configured for the application '...'.

Configure the default Akamas roles

To automatically assign default roles to users, set up mappers in Keycloak so users can access the default workspace with read and write permissions upon first login.

In Keycloak, go to the provider's details page and navigate to Mappers:

Add the following configurations:

User role

  • Name: User role

  • Mapper type: Hardcoded role

  • Role: USER

Default Workspace Read

  • Name: Default Workspace Read

  • Mapper type: Hardcoded role

  • Role: WS_ac8481d3-d031-4b6a-8ae9-c7b366f027e8_R

Default Workspace Write

  • Name: Default Workspace Write

  • Mapper type: Hardcoded role

  • Role: WS_ac8481d3-d031-4b6a-8ae9-c7b366f027e8_W

Test the integration

Visit the Akamas installation's login page to verify that the new authentication method is displayed and working as expected.


Identity provider users

If you have configured one or more Identity Providers, you can also limit the number of concurrent user sessions. First, access the Keycloak admin console with the instructions provided on the page Accessing Keycloak admin console.

Click on the "create flow" button, provide a name, and then select the flow type "Basic Flow" and click on create.

Now click on "add execution"

A dialog pops up with a list of possible actions; filter the results with the limit keyword.

Select "User session count limiter" and click on "Add".

Set this new step as "Required" from the drop-down, then click on the cog icon to edit its properties.

Give it a meaningful alias and type in the maximum concurrent session value you desire. Select the behavior "Deny new session" from the drop-down list. Type in a valid message in the textbox "Optional custom error message" and click on "Save".

Now go to the identity provider page and click on the Identity provider you want to limit.

Scroll down to the bottom, click on the "Post login flow" dropdown, and select the new step you just created then click on the "Save" button.

Local users

First, access the Keycloak admin console with the instructions provided on the page Accessing Keycloak admin console.

On the Authentication page, select the "browser" flow and scroll to the "User session count limiter" entry.

On the row "User session count limiter", click on the cog icon. From here you can choose the maximum number of concurrent sessions for each user, and the behavior when the maximum number is reached. Select "Deny new session" to deny new accesses. If previous sessions are not properly terminated, you may need to delete them from the Keycloak console under the Users section.

System

Creating a system is the first step in optimizing your application.

A system is a representation of your application. It might be a complete representation of different layers, a single microservice, a batch job, or any IT system that you want to optimize.

A system can be used to fully model an application and then run multiple optimization initiatives or contain just the elements that are used for a specific optimization study.

The system is identified by a name, which in our example is "Online Boutique", and can be extended with a description to make it easily recognizable.

Components

The core elements of a system are the components. A component represents the fundamental element of an IT system, often composed of various layers or entities. It serves as a black-box definition of an entity involved in optimization, eliminating the need for intricate details in modeling.

A component comprises the following properties:

  • Name: A distinct identifier within the context of the system.

  • Description: A clarification of the component's purpose or function.

  • Component type: An identification of the underlying technology or technology stack of the component.

  • Properties: A set of additional properties that hold information about the component's configuration or telemetry (e.g. the IP used to reach an API or the username to connect to a server via SSH).

Akamas allows users to model their IT systems without the need to focus on technological aspects by providing several out-of-the-box component types to support system and component modeling.

Component types are platform entities (i.e.: shared among all the users) that contain key information about specific technologies such as parameters that can be tuned and key metrics.

Akamas includes off-the-shelf component types for the most popular technologies such as Containers, Linux Hosts, AWS EC2 instances, Web Applications, Spark, and runtimes such as JVM, Node, and Go.

Creating the Online Boutique system

Recalling our example of the Online Boutique application, we decided, for the moment, to model just the elements that are included in the optimization initiative. We have also decided not to model the entire Kubernetes cluster as we are not interested in optimizing and monitoring it at this stage.

We have mapped the JVM and the Pod to the respective component types and mapped the Kubernetes service to the Web Application component type. You can read more about these component types in their documentation reference.

To model our system we used the component types coming from these optimization packs:

  • Open JDK

  • Web Application

  • Kubernetes

The following picture shows our choice of components starting from the architectural diagram.

Creating the system with the CLI

To create this system in Akamas you can use the following YAML file.

name: Online Boutique
description: The Online Boutique e-commerce application

Create the file system.yaml and run the following command.

akamas create system system.yaml

Now you can start adding components. The following three YAML files represent the three components of our Online Boutique system.

APIs component specification
name: Apis
description: The APIs exposed to users
componentType: Web Application
properties:
  dynatrace:
    tags:
      Application: Ad-Service
Ad Service component specification
name: Adservice
description: The adservice of the online boutique by Google
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: akamas-demo
      containerName: server
      basePodName: ak-adservice-*
JVM component specification
name: AdserviceJVM
description: The JVM of the adservice 
componentType: java-openjdk-11
properties:
  dynatrace:
    tags:
      JVM: Ad-Service

Create the files and run the following command for each file.

akamas create component <file-name> "Online Boutique"

Note that, since components are bound to a specific system, we also need to provide the name of the system Online Boutique, created a few moments ago, as an argument to the creation command.

Component types are shipped within Optimization Packs and can be easily installed and updated as support for new technologies is released.


Using

This section describes the main steps to optimize an application.

To optimize a new application on Akamas you have to follow four steps shown in the following picture and described in the next sections by means of a simple example.

As depicted in the picture above, to optimize a new application you should:

  • Create a system that models the key parts of your application (e.g. containers, runtimes, APIs) that will be involved in the optimization initiative.

  • Set up the integration with a monitoring tool via telemetry providers so that Akamas can gather metrics about the performance of your application.

  • Create a workflow that allows Akamas to configure your application (e.g. write a configuration file, relaunch a process).

  • Define the optimization study according to your goal and SLOs so that Akamas knows what you want to achieve.

These steps relate to how Akamas integrates with your environment and apply to both offline and live optimization studies.

Example Application

In the following sections, we will use a simple yet representative web application to illustrate how to onboard a new application on Akamas. The application is called Online Boutique. It is a microservices application composed of 11 microservices that allow users to browse items, add them to the cart, and purchase them in an online store.

Suppose that we are about to deploy a major upgrade to one of the microservices, the Ad Service, that handles the advertisement logic, and we want to reduce the costs of running this service while meeting our SLO on the response time given an increasing number of users.

As shown in the diagram below, our service is built in Java, deployed as a pod in a Kubernetes cluster, and exposes an API using a service. The whole platform is monitored with Dynatrace.

You can now proceed to the first step, creating the system to model this application.

If your technology stack or optimization need does not fit this example, take a look at the Optimization Guides section, where you can find many optimization scenarios for different use cases.


Collecting support information

This documentation aims to guide users through common troubleshooting steps and how to retrieve essential support information to diagnose and resolve issues effectively.

When encountering issues with Akamas, gathering detailed support information is crucial for diagnosing and solving problems. This information includes platform logs and data from the Java Flight Recorder (JFR), which provide insights into the system's operations and the nature of any encountered issues.

Retrieving Platform Logs

Platform logs in Akamas offer a comprehensive view of all system activities, errors, and operational messages. These logs are essential for a deep dive into the specifics of any encountered issues. To retrieve platform logs, you can issue the following command from the Akamas CLI.

Note that the --from argument allows you to specify a timeframe for the log extraction. If you know the issue occurred in a specific time frame, you can limit the extraction to that period.

akamas logs --dump --from -3d > log.out

The logs will be written to a file named log.out which can be shared with Akamas support agents for further investigations.

Accessing Flight Recorder

Akamas natively integrates Java Flight Recorder, a powerful tool for monitoring and recording the behavior of the Java runtime used to execute core Akamas services. Depending on the installation method (Docker or Kubernetes) accessing the JFR data requires different steps.

Docker

When running Akamas on Docker, JFR data is stored in a dedicated volume on the host. The volume is named perf. Each service writes its performance data in a dedicated subfolder of that volume.

Use the following command on the Akamas host to extract the data of a specific service:

docker cp license:/perf/<service> ./perf

The command will copy all required files to a local folder named perf, which can be shared with the support team.

To extract the data for all services, issue the following command:

docker cp license:/perf ./perf

Kubernetes

When running in a Kubernetes cluster, each service writes its performance data in a dedicated volume backed by a persistent volume claim to make it resilient to pod restarts.

To extract the data of a specific service follow these steps:

  1. Identify the name of the pod running the service with the command kubectl get pods | grep <service>

  2. Copy the content of the /perf folder inside the main container of the pod to a local directory with the following command

kubectl cp <pod-name>:/perf ./perf

Here is an example of a complete extraction for the service named campaign

$: kubectl get pods  | grep campaign
campaign-867674f9b5-5sppf            1/1     Running   2 (6h48m ago)   6h55m
$: kubectl cp campaign-867674f9b5-5sppf:/perf ./perf

This data can help Akamas support teams or your internal IT department to pinpoint the root cause of problems and identify appropriate solutions.

Telemetry

After modeling the system and its components, the following step is to set up the telemetry. Telemetry is essential to provide Akamas with enough data to evaluate a configuration both in terms of goal (e.g. reducing the cost) and constraints (e.g. meeting SLOs).

To instruct Akamas about the location of the data sources and how to access them, you can create a telemetry instance for your system. A telemetry instance comprises the following properties:

  • Name: An optional unique name within the system to quickly identify it.

  • Provider: The name of the telemetry provider that will be used to gather metrics.

  • Config: Additional configuration options that depend on the provider (e.g. a URL to reach the observability tool or the location of a CSV file to import); refer to each provider reference for more information.

A system can include multiple telemetry instances from different providers (e.g. in case you need to extract some information from Dynatrace and others from a CSV file).

Components and Telemetry

Each telemetry provider supports a unique set of properties that depend on the specific data source; these allow Akamas to map each component to one or more entities in the observability tool and extract the right metrics for that particular technology.

Creating a telemetry instance for the Online Boutique

As we introduced at the beginning of this section, we chose Dynatrace to monitor our application. To instruct Akamas to gather metrics from this data source you just need to create the following file, in which we specified the URL and the token required to authenticate to our Dynatrace instance.

provider: Dynatrace
name: Staging Environment
config:
  url: https://mydyn87510.live.dynatrace.com/
  token: dt0c01.JQG73....

Save it to a file named, as an example, instance.yaml and then issue the command:

akamas create telemetry-instance instance.yaml "Online Boutique"

As described in the section above, telemetry instances are coupled to a specific system. For this reason, we had to provide the name of the system Online Boutique as an argument to the create command.

For a complete definition of the properties available for the Dynatrace provider, as well as other providers, you can take a look at the telemetry reference documentation section.

If Dynatrace is not your observability platform of choice, take a look at the telemetry provider section, where you can find many other telemetry providers for different observability tools and common integration strategies like CSV files.

Here is how the telemetry instance looks in the UI.

Mapping Components

Akamas needs to be informed that the component named Adservice used in the system maps to a specific entity in Dynatrace that represents the container running in the Kubernetes cluster.

Recalling the definition of the Adservice component in the system, we see that it contains a set of properties starting with the dynatrace keyword:

name: Adservice
description: The adservice of the online boutique by Google
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: akamas-demo
      containerName: server
      basePodName: ak-adservice-*

These properties are used by the Dynatrace telemetry provider to map the component to the correct entity and import metrics such as CPU usage and throttling that can be used to gather information about the performance of such components.


Akamas can gather metrics from many data sources, from industry-standard observability platforms (e.g. Prometheus or Dynatrace) to simple CSV files. This is done via telemetry providers, which contain all the logic and information required to correctly extract the metrics and map them to the components of your system. You can take a look at the available telemetry providers in the documentation reference.

Telemetry instances alone do not provide information on which metrics should be extracted from the data source and to which component they map. As briefly introduced in the system section, this is the job of the component properties.


Study

Now that Akamas knows about your application, how to configure it, and how to monitor it, the final step is to define your optimization study.

The study defines the objective of the optimization activity. It contains information about what we want to achieve (e.g. reduce costs, improve latency..), the parameters that can be optimized, and any SLO that should not be breached by the optimized configuration.

Studies are divided into two main categories:

  • Offline Studies are, generally, executed in test environments where the workload of the application is generated using a load-testing tool. You can read more in the Offline Study section.

  • Live Studies are, usually, executed in production environments. You can read more in the Live Study section.

The setup of both study types is similar, as both are constituted by the following core elements:

  • Name: A unique identifier that can be used to identify different studies.

  • System: The name of the system that we want to optimize.

  • Workflow: The name of the workflow that will be used to configure the application.

  • Goal: The objective of the optimization (e.g. minimize cost, maximize throughput, reduce latency).

  • Parameter Selection: A list of parameters that will be tuned in the optimization (e.g. container memory and CPU limits, EC2 instance family..).

  • Steps: The flow of the optimization study (e.g. assessing the baseline performance, optimizing the system, restoring the configuration).

The system and the workflow, already introduced in the previous sections, are referenced in the study definition to provide Akamas with information on how to apply the parameters (through the workflow) and retrieve the metrics (through the telemetry instances in the system) that are used to calculate the goal.

Besides the goal, parameter selection, and steps, the study can be enriched with other, optional, elements that can be used to better tailor it to your specific needs. These include, as an example, automated windowing and parameter constraints. You can find more information on these optional elements in the specific subsections, or read the entire study definition in the reference documentation section.

Goal

The goal defines the objective of our optimization. Specifying a goal is as simple as defining the metric we want to optimize and the direction of the optimization such as maximizing throughput or minimizing cost. If you want to optimize more complex scenarios or lack a single metric that represents your objective you can also specify a formula and define a goal such as minimizing memory and CPU utilization.

Metrics are identified within a study with the notation component.metric_name, where component is the name of a component of the system linked to the study and metric_name is the name of a metric. As an example, the CPU utilization of a container might be identified by MyContainer.cpu_util.

Another important, although optional, element of the goal is the definition of constraints on other metrics of the system: in many cases optimizing a system involves finding a tradeoff between multiple aspects, and goal constraints can be used to map SLOs and inform Akamas about other aspects of our system that we want to safeguard during the optimization (e.g. reducing the amount of CPU assigned to a container might reduce the cost of running the system but increase its response time). Constraints can be used to specify, as an example, an upper limit to the response time or the memory utilization of the system. You can find more information on how to specify constraints in the reference documentation section.
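
As an illustrative sketch of this notation (component and metric names are hypothetical), a goal with an absolute constraint could look like the following study fragment, which follows the same format as the full example later in this section:

goal:
  objective: minimize
  function:
    formula: MyContainer.cpu_used
  constraints:
    absolute:
      - name: response_time
        formula: MyWebApp.requests_response_time <= 200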

Parameter Selection

The parameter selection contains the list of parameters that are subject to the optimization process. These might include several components and layers, as in the following example.

Similarly to metrics, parameters are identified with the notation component.parameter_name.

Optionally, you can also specify a range of values that can be assigned to the parameter. This is very useful when you want to evaluate a specific optimization area or want to add some context to the optimization (e.g. avoid setting a memory greater than 8GB because it's not available on the system).
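
As an illustrative sketch (component and parameter names are hypothetical), a bounded parameter can be declared as follows, using the same format as the full study example later in this section:

parametersSelection:
  - name: MyContainer.cpu_limit
    domain: [250, 2000]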

The parameter selection can include any component and parameter of the system. During the optimization process, Akamas will provide values for those parameters and apply them to the system using the workflow provided in the study definition.

Steps

If the goal describes where we are heading, steps describe the road to get there. Usually, when optimizing an application we want to assess its performance before the tuning activity to evaluate the benefits; this initial assessment is called the Baseline. Then, we want to run the optimization process for a definite number of iterations, this is called an Optimization step. Many other use cases can be achieved by providing additional steps to the study. Some of these include:

  • Re-using knowledge gathered by other optimization studies

  • Applying the baseline configuration to the test environment after the optimization has ended

  • Evaluating a specific configuration suggested by the user

You can find more information on the steps in the reference documentation section.

Optimizing the Online Boutique

Recalling our application example introduced in the previous sections, our optimization objective is to reduce the costs of running the Ad service while reaching our SLO on the response time.

As shown in the image below, you can use the study creation wizard in the UI to specify all the required information.

If you prefer to define it via YAML you can use the following file.

name: Reduce Costs
system: Online Boutique
workflow: Configure and Test Boutique

goal:
  objective: minimize
  function:
    formula: Adservice.cost
  constraints:
    absolute:
      - name: response_time
        formula: Apis.requests_response_time <= 20

parametersSelection:
  - name: Adservice.cpu_limit
    domain: [150, 1000]
  - name: Adservice.memory_limit
    domain: [64, 2048]
  - name: AdserviceJVM.jvm_maxRAMPercentage
  - name: AdserviceJVM.jvm_gcType

steps:
  - name: baseline
    type: baseline
    values:
      Adservice.cpu_limit: 500
      Adservice.memory_limit: 1024
      AdserviceJVM.jvm_maxRAMPercentage: 25

  - name: optimize
    type: optimize
    numberOfExperiments: 30

Save it to a file named, as an example, study.yaml and then issue the command

akamas create study study.yaml

This study's definition contains three main parts.

The goal

In this section, we instruct Akamas that we want to minimize the cost of the Adservice, and we have added a constraint to the optimization. In particular, we added a constraint on the value of the metric requests_response_time of the Apis component to be lower than 20ms. This is an absolute constraint, as it is defined on the actual value of the metric and can easily map an SLO. You can also express constraints like "do not make the response time increase more than 10%" by using relative constraints. You can find more info on the supported constraint types in the reference documentation section.

The parameters selection

In this section, we defined which parameters Akamas can change to achieve its goal. We decided to include parameters both from the JVM and the container layers to let Akamas tune all of them accordingly. We also specified a custom domain for a couple of parameters, to allow Akamas to explore only values within those ranges. Note that this is an optional step, as Akamas already knows the range of possible values of many parameters. You can find more info on available parameters and guidelines for choosing them in different use cases in the optimization guides section.

The steps

This final section instructs Akamas to first assess the performance and costs of the current configuration, which we will refer to as the baseline, then run 30 experiments by changing the parameters to optimize the goal.

You can now start your optimization study and wait for Akamas to find the best configuration!


Workflow

The third step in optimizing a new application is to create a workflow to instruct Akamas on the actions required to apply a configuration to the target application.

A workflow defines the actions that must be executed to evaluate the performance of a given configuration. These actions usually depend on the application architecture, technology stack, and deployment practices which might vary between environments and organizations (e.g. Deploying a microservice application in a staging environment on Kubernetes and performing a load test might be very different than applying an update to a monolith running in production).

If you are using GitOps practices and deployment pipelines, you are probably already familiar with most of the elements used in Akamas workflows. Workflows can also trigger existing pipelines and re-use all the automation already in place.

Workflows are not tightly coupled to a study and can be re-used across studies and systems, so you can change the optimization scope and target without the need to re-create a specific workflow.

Akamas provides several general-purpose and specialized workflow operators that allow users to perform common actions, such as running a command on a Linux instance via SSH, as well as integrate enterprise tools such as LoadRunner to run performance tests or Spark to launch Big Data analysis. More information and usage examples are on the Workflow Operators reference page.

Creating the workflow for Online Boutique

The workflow that we will create to allow Akamas to evaluate the configurations comprises the following actions:

  1. Create a deployment file from a template

  2. Apply the file via kubectl command

  3. Wait for the deployment to be ready

  4. Start the load test via locust APIs

Even if the integrations of this workflow are specific to the technology used by our test application (e.g. using the kubectl CLI to deploy the application), the general structure of the workflow could fit most of the applications subject to offline optimization in a test environment.

The structure of the workflow heavily depends on deployment practices and the kind of optimization. In our example, we are dealing with a microservice application deployed in a test environment, which is tested by injecting some load using Locust, a popular open-source performance testing tool.

Here is the YAML definition of the workflow described above.

name: Configure and Test Online Boutique
tasks:
  # 1 - Create a deployment file from a template
  - name: Configure Online Boutique
    operator: FileConfigurator
    arguments:
      source:
        hostname: mgmserver
        username: akamas
        password: ******
        path: /work/boutique/boutique.yaml.templ
      target:
        hostname: mgmserver
        username: akamas
        password: *******
        path: /work/boutique/boutique.yaml
 
  # 2 - Apply the file via the kubectl command
  - name: Apply new configuration to the Online Boutique
    operator: Executor
    arguments:
      host:
        hostname: mgmserver
        username: akamas
        password: *******
      command: kubectl apply -f /work/boutique/boutique.yaml
  
  # 3 - Wait for the deployment to be ready
  - name: Check Online Boutique is up
    operator: Executor
    arguments:
      retries: 0
      host:
        hostname: mgmserver
        username: akamas
        password: *******
      command: kubectl rollout status --timeout=3m deployment ak-adservice 
  
  # 4 - Start the load test via locust APIs
  - name: Start Locust Test
    operator: Executor
    arguments:
      host:
        hostname: mgmserver
        username: akamas
        password: *******
      command: bash /work/boutique/run-test.sh

Save it to a file named, as an example, workflow.yaml and then issue the creation command:

akamas create workflow workflow.yaml

In this workflow, we used two operators: the FileConfigurator operator, which creates a configuration file starting from a template by inserting the configuration values decided by Akamas, and the Executor operator, which runs a command on a remote instance (named mgmserver in this case) via SSH.

Here is what the workflow looks like in the UI:

You can find more workflow examples for different use cases in the Optimization Guides section and references to technology-specific operators (e.g. LoadRunner, Spark) on the Workflow Operators reference page.

Offline Study

Offline optimization studies are optimization studies where the workload is simulated by leveraging a load-testing tool.

Offline optimization studies are typically used to optimize systems in pre-production environments, with respect to planned and what-if scenarios that cannot be directly run in production. Scenarios include new application releases, planned technology changes (e.g. new JVM or DB), cloud migration or new provider, expected workload growth, and resilience under failure scenarios (from chaos engineering).

The following figure represents the iterative process associated with offline optimizations:

The following 5 phases can be identified for each iteration (also known as experiment):

  1. Recommend Conf: Akamas AI engine identifies the configuration for the next iteration until a termination condition for the study is met (e.g. number of experiments).

  2. Apply configuration: Akamas applies the parameter configuration (one or more parameters) to the target system by leveraging a set of workflow operators.

  3. Apply workload: Akamas triggers a workload on the target system, also by leveraging a set of workflow operators.

  4. Collect KPIs: Akamas collects the metrics related to the target system - only those metrics that are specified by each telemetry instance defined in the system.

  5. Score vs goal: Akamas scores the applied parameter configuration against the defined goal and constraints - the score is the value of the goal function.

Thanks to its patented AI (reinforcement learning) algorithms, Akamas can find the optimal configuration without having to explore all the possible configurations.

Trials

For each experiment, Akamas allows multiple trials to be executed. A trial is a repetition of the same experiment to reduce the impact of noise on the result of an experiment.

Environments can be noisy for several reasons such as:

  • External conditions (e.g. background jobs, "noisy neighbors" in the cloud)

  • Measurement errors (e.g. monitoring tools not always 100% accurate)

This approach is consistent with scientific and engineering practices, where the strategy to minimize the impact of noise is to repeat the same experiment multiple times.

Steps

An offline optimization study can include multiple steps.

Typically there are at least two steps:

  • Baseline step: a single experiment that is run by applying the already deployed configuration before the Akamas optimization is applied - the results of this experiment are used as a reference (baseline) for assessing the optimization, and as such this is a mandatory step for each study

  • Optimize step: a defined number of experiments used to identify the optimal configuration by leveraging Akamas AI.

Other steps are:

  • Bootstrap step: imported experiments from other optimization studies

  • Preset step: a single experiment with a defined configuration

The steps to be executed can be specified when defining an offline optimization study.

Commands

An offline optimization study is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

The Akamas UI shows offline optimization studies in a specific top-level menu.

The details and results of an offline optimization study are displayed when drilling down (there are multiple tabs and sections).


Analyzing results of live optimization studies

Even for live optimization studies, it is a good practice to analyze how the optimization is being executed with respect to the defined goal & constraints, and workloads.

This analysis may provide useful insights about the system being optimized (e.g. understanding of the system dynamics) and about the optimization study itself (e.g. how to adjust optimizer options or change constraints). Since this is more challenging for an environment that is being optimized live, a common practice is to adopt a recommendation mode before possibly switching to a fully autonomous mode.

The Akamas UI displays the results of a live optimization study in the following areas:

  • The Metrics section (see the following figures) displays the behavior of the metrics as configurations are recommended and applied (possibly after being reviewed and approved by users); this area supports the analysis of how the optimizer is driven by the configured safety and exploration factors.

  • The All Configurations section provides the list of all the recommended configurations, possibly as modified by the user, as well as the details of each applied configuration (see the following figures).

  • In the case of recommendation mode, the Pending configuration section (see the following figure) shows the configuration that is being recommended, allowing users to review it (see the EDIT toggle) and approve it.

Windowing

A critical aspect, when evaluating the performance of an application, is to make sure that the data we use is accurate. It's quite common for IT systems to experience transient periods of instability; these might occur in many situations, such as filling up caches, runtime compilation activities, horizontal scaling, and much more.

A common practice, in performance engineering, is to exclude from the analysis the initial and final part of a performance test to consider only the time when the system is in full operation. Akamas can automatically identify a subset of the whole data to evaluate scores and constraints.

Looking at the example below, from the Online Boutique application, we see that the response time has an initial spike to about 7ms and then stabilizes below 1ms; also the CPU utilization shows a similar pattern.

This is quite common, as an example, for Java-based systems as, in the first minutes of operations activities like heap resizing and just-in-time compilation take place. In this case, Akamas considered in the evaluation of the experiment only the gray area effectively avoiding the impact of the initial spike.

This behavior can be configured in the study by specifying a section called windowing. Two windowing policies allow you to properly configure Akamas in different scenarios.

The simplest policy is called trim and allows users to specify how much time should be excluded from the evaluation at the start and at the end of the experiment. It is also possible to apply the trim policy to a specific task of the workflow. This policy can be easily used when, for example, the time required to deploy the application might change. You can read more on this policy in the reference documentation section.

In other contexts, discarding the initial warmup period is not enough. For these scenarios, Akamas supports a more advanced policy, called stability. This policy is also particularly useful for stress tests, where our objective is to make the system sustain as much load as possible before becoming unstable, as it allows users to express constraints on the stability of the system. You can read more on this policy in the reference documentation section.
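
As a minimal sketch of a trim policy (the exact schema is detailed in the reference documentation section; the durations below are purely illustrative):

windowing:
  type: trim
  trim: [2m, 1m]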

The windowing section in the study definition is optional and the default policy considers all the available data to evaluate the performance of the experiment.

Live Study

In cases where a testing environment is not available, or it is hard to build representative load tests, Akamas can directly optimize production environments by running a Live Optimization study. Production environments differ from test environments in many ways. Here are the main aspects that affect how Akamas can optimize the system in such a scenario and that define live optimization studies:

  • Safety, in terms of application stability and performance, is critical in production environments where SLOs might be in place.

  • The approval process is usually different between production and lower-level environments. In many cases, a configuration change in a production environment must be manually approved by the SRE or Application team and follow a custom deployment scenario.

  • The workload on the application in a production environment is usually not controlled; it might change with the time of day, due to special events or external factors.

These are the main factors that make live optimization studies differ from offline optimizations.

The following figure represents the iterative process associated with live optimizations:

The following 5 phases can be identified for each iteration:

  1. Recommend Conf: Akamas provides a recommendation for parameter configuration based on the observed behavior and leveraging the Akamas AI.

  2. Human Approval: the recommendation is inspected, possibly revisited, and approved by users before being applied to the system. This step is optional and can be automated.

  3. Apply Conf: Akamas applies the recommended configuration by leveraging the defined workflow.

  4. Collect KPIs: Akamas collects the metrics of the system required to observe its behavior under the current parameter configuration by leveraging the associated telemetry provider - here Akamas is also observing and categorizing the different workload contexts that are used to recommend configurations that are appropriate for each specific workload context.

  5. Score vs Goal: Akamas scores the applied parameter configuration under the specific workload context against the defined goal and constraints.

Overall the core process is very similar to the one of offline optimization studies. The main difference is the (optional) presence of a manual configuration review and approval step.

Safety

Even if the process is similar, the way recommended configurations are generated is quite different as it's subject to some safety policies such as:

  • The exploration factor defines the maximum magnitude of the change of a parameter from one configuration to the next (e.g. reducing the CPU limit of a container by at most 10%). As changes are smaller in magnitude their effect on the system is also smaller, this leads to safer optimizations as the optimization can better track changes in the core metrics. As a side effect, it might take more time for a live optimization to fully optimize a configuration when compared to an offline study.

  • The safety factor defines how tight the constraints defined in the study are. As the configuration changes, some metrics might approach a limit imposed by constraints. As an example, if we set a response time threshold of 300ms, Akamas will keep track of how the response time changes due to the configuration changes and react to keep the constraint fulfilled. The safety factor influences how quickly Akamas reacts to approaching constraints.

You can read more on safety policies in the related documentation section.

Workload

A key aspect of live optimization studies is the fact that the incoming workload of the application is not generated by a test script but by real users. This means that, after deploying a new configuration, the incoming workload might be different from the one used to evaluate the previous configuration. Nevertheless, the Akamas AI algorithm is capable of taking into account the differences in the incoming workload and fairly evaluating different configurations even if applied in different scenarios. As an example, the traffic of web applications exposed to the general public is usually different between workdays and weekends, or working hours and nights.

To instruct Akamas to take into account changes that are not controlled by the deployment process, you just need to specify the workloadsSelection parameter in the optimization study. You can read more on this parameter in the reference workload selection page.

The workload selection should contain a list of metrics that are independent of the configuration and represent external factors that affect the performance of the configuration in terms of goals or constraints. Most of the time the application throughput is a good metric to use as a workload metric.

When one or more workload metrics are specified Akamas will take into account the differences in the workload and build clusters of similar workloads to identify repetitive working conditions for the application. It will then use this information to contextualize the evaluation of each configuration and provide a recommended configuration that fulfills the defined constraints on all the workload conditions seen by the optimization process.
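
Based on the study definition format shown earlier, a minimal sketch of this parameter could look like the following (the component name webapp and the metric are illustrative):

workloadsSelection:
  - name: webapp.requests_throughput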

User Interface

Live optimizations are separated from offline optimization studies and are available in the second entry on the left menu.

Live optimizations usually run for a longer period compared to offline optimizations, and their effect on the goal and the constraints is more gradual. For this reason, Akamas offers a specific UI that allows users to evaluate the progress of live optimizations and compare the many different applied configurations by looking at the evolution of core metrics.

resource management commands.
Metrics section of a live optimization study
From the metrics cahrt displaying configurations (toggle on) to a specific configuration
The list of configurations applied ovet time in the All Configuration section
A specific configuration from the All Configuration section
Pending configutation

The simplest policy is called trim and allows users to specify how much time should be excluded from the evaluation from the start and the end of the experiment. It is also possible to apply the trim policy to a specific task of the workflow. This policy can be easily used when, for example, the time required to deploy the application might change. You can read more on this policy in the .

In other contexts, discarding the initial warmup period is not enough. For these scenarios, Akamas supports a more advanced policy, called stability. This policy is also particularly useful for stress tests where our objective is to make the system sustain as much load as possible before becoming unstable as it allows users to express constraints on the stability of the system. You can read more on this policy in the

Collect KPIs: Akamas collects the metrics of the system required to observe its behavior under the current parameter configuration by leveraging the associated telemetry provider. Here Akamas also observes and categorizes the different workload contexts, which are used to recommend configurations appropriate for each specific workload context.

Score vs Goal: Akamas scores the applied parameter configuration under the specific workload context against the defined goal and constraints.

Apply Conf: Akamas applies the recommended configuration by leveraging the defined workflow.

You can read more on safety policies in the related documentation section.

You can read more on this parameter in the workload selection page of the reference documentation.


Optimization Guides

What do you want to do with Akamas?

Optimize resources and costs, while preserving application performance and reliability

Optimize application performance and reliability, while avoiding resource and cost wastage

Optimize application costs and resource efficiency

Kubernetes microservices

Cloud instances

Spark applications

Application Runtimes

Kubernetes microservices

Offline optimizations

Live optimizations

Optimizing cost of a Kubernetes microservice while preserving SLOs with performance tests

Optimizing cost of a Java microservice on Kubernetes while preserving SLOs with performance tests

Optimizing cost of a Kubernetes microservice while preserving SLOs in production

Optimizing cost of a Java microservice on Kubernetes while preserving SLOs in production

Optimizing cost of a Kubernetes microservice with HPA in production

Optimize cost of a Kubernetes microservice while preserving SLOs in production

In this example, you will use Akamas live optimization to minimize the cost of a Kubernetes deployment, while preserving application performance and reliability requirements.

Prerequisites

In this example, you need:

  • an Akamas instance

  • a Kubernetes cluster, with a deployment to be optimized

  • the kubectl command installed in the Akamas instance, configured to access the target Kubernetes cluster and with privileges to get and update the deployment configurations

  • a supported telemetry data source (e.g. Prometheus or Dynatrace) configured to collect metrics from the target Kubernetes cluster

Optimization setup

Optimization packs

This example leverages the following optimization packs:

  • Kubernetes

  • Web Application

System

The system represents the Kubernetes deployment to be optimized (let's call it "frontend"). You can create a system.yaml manifest like this:

name: frontend
description: Kubernetes frontend deployment

Create the new system resource:

akamas create system system.yaml

The system will then have two components:

  • A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits

  • A Web Application component, which contains service-level metrics like throughput and response time

In this example, we assume the deployment to be optimized is called frontend, with a container named server, and is located within the boutique namespace. We also assume that Dynatrace is used as a telemetry provider.

Kubernetes component

Create a component-container.yaml manifest like the following:

name: container
description: Kubernetes container, part of the frontend deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: frontend-*

Then run:

akamas create component component-container.yaml frontend

Now create a component-webapp.yaml manifest like the following:

name: webapp
description: The service related to the frontend deployment
componentType: Web Application
properties:
  dynatrace:
    id: <TELEMETRY_DYNATRACE_WEBAPP_ID>

Then run:

akamas create component component-webapp.yaml frontend

Workflow

The workflow in this example is composed of four main steps:

  1. Update the Kubernetes deployment manifest with the parameters (CPU and memory limits) recommended by Akamas

  2. Apply the new parameters (kubectl apply)

  3. Wait for the rollout to complete

  4. Sleep for 30 minutes (observation interval)

Create a workflow.yaml manifest like the following:

name: frontend
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml.templ
      target:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml

  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl apply -f frontend.yaml

  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl rollout status --timeout=5m deployment/frontend -n boutique;

  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800

Then run:

akamas create workflow workflow.yaml
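
For reference, here is a minimal, illustrative sketch of what the relevant fragment of the frontend.yaml.templ template could look like, assuming requests are kept equal to limits as in this example; the ${container.*} placeholders are replaced by Akamas with the recommended values:

      containers:
        - name: server
          resources:
            requests:
              cpu: ${container.cpu_limit}
              memory: ${container.memory_limit}
            limits:
              cpu: ${container.cpu_limit}
              memory: ${container.memory_limit}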

Telemetry

Create the telemetry.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>
  pushEvents: false

Then run:

akamas create telemetry-instance telemetry.yaml frontend

Study

In this live optimization:

  • the goal is to reduce the cost of the Kubernetes deployment. In this example, the cost is based on the amount of CPU and memory limits (assuming requests = limits).

  • the approval mode is set to manual, and a new recommendation is generated daily

  • to avoid impacting application performance, constraints are specified on desired response times and error rates

  • to avoid impacting application reliability, constraints are specified on peak resource usage and out-of-memory kills

  • the parameters to be tuned are the container CPU and memory limits (we assume requests=limits in the deployment file)
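
For instance, assuming the CPU limit metric is expressed in millicores and the memory limit in bytes, the baseline configuration below (CPU limit of 1000 millicores and memory limit of 1536 MiB, i.e. 1.5 GiB) yields a goal value of (1000/1000) * 3 + 1.5 = 4.5; cheaper configurations score lower.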

Create a study.yaml manifest like the following:

name: frontend
system: frontend
workflow: frontend
requireApproval: true

goal:
  objective: minimize
  function:
    formula: (((container.container_cpu_limit/1000) * 3) + (container.container_memory_limit/(1024*1024*1024)))
  constraints:
    absolute:
      - name: Response Time
        formula: webapp.requests_response_time <= 300
      - name: Error Rate
        formula: webapp.service_error_rate:max <= 0.05
      - name: Container CPU saturation
        formula: container.container_cpu_util:p95 < 0.8
      - name: Container memory saturation
        formula: container.container_memory_util:max < 0.7
      - name: Container out-of-memory kills
        formula: container.container_oom_kills_count == 0

parametersSelection:
  - name: container.cpu_limit
    domain: [300, 1000]
  - name: container.memory_limit
    domain: [800, 1536]

windowing:
  type: trim
  trim: [5m, 0m]
  task: observe

workloadsSelection:
  - name: webapp.requests_throughput

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 48
    values:
      container.cpu_limit: 1000
      container.memory_limit: 1536

  - name: optimize
    type: optimize
    numberOfTrials: 48
    numberOfExperiments: 100
    numberOfInitExperiments: 0
    maxFailedExperiments: 50

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI for Live optimizations.


Optimize cost of a Kubernetes deployment subject to Horizontal Pod Autoscaler

In this guide, you optimize the cost (or resource footprint) of a Kubernetes deployment where the number of replicas is controlled by the HPA. The study tunes both pod resource settings (CPU and memory requests and limits) and HPA options (target CPU utilization) at the same time, while also taking into account your application performance and reliability requirements (SLOs). This optimization happens in production, leveraging Akamas live optimization capabilities.

Prerequisites

  • an Akamas instance

  • a Kubernetes cluster, with a deployment to be optimized

  • a Horizontal Pod Autoscaler working on the desired deployment

  • a way to apply configuration changes recommended by Akamas to the target deployment and HPA. In this guide, Akamas interacts directly with the Kubernetes APIs via kubectl. You need a service account with permissions to update your deployment (see below for other integration options).

  • a supported telemetry data source configured to collect metrics from the target Kubernetes cluster (see the telemetry providers reference for the full list)

Optimization setup

In this guide, we assume the following setup:

  • the Kubernetes deployment to be optimized is called frontend (in the hipster-shop namespace)

  • in the deployment, there is a container named server, where the app runs

  • the HPA is called frontend-hpa

  • both Dynatrace and Prometheus are used as observability tools

Let's set up the Akamas optimization for this use case.

System

For this optimization, you need the following components to model the frontend tech stack:

  • The Kubernetes Workload, Container and Pod components, containing metrics like CPU used for the different objects and parameters to be tuned like CPU limits at the container levels (from the Kubernetes optimization pack)

  • An HPA component, which contains HPA parameters like the target CPU utilization

  • A Web Application component, which contains service-level metrics like throughput and response time of the microservice (from the Web Application optimization pack)

Let's start by creating the system, which represents the Kubernetes deployment to be optimized. To create it, write a system.yaml manifest like this:

name: frontend-2
description: The frontend Kubernetes deployment

Then run:

akamas create system system.yaml

Now create the three Kubernetes components. Create a workload.yaml manifest like the following:

name: workload_frontend
description: The frontend Kubernetes workload
componentType: Kubernetes Workload
properties:
  prometheus:
    namespace: hipster-shop
    deployment: frontend

Then create a container.yaml manifest like the following:

name: server
description: The server Kubernetes container
componentType: Kubernetes Container
properties:
  prometheus:
    namespace: hipster-shop
    pod: frontend.*
    container: server

And a pod.yaml manifest like the following:

name: pod_frontend
description: The frontend Kubernetes pod
componentType: Kubernetes Pod
properties:
  prometheus:
    namespace: hipster-shop
    pod: frontend.*

Now create the entities by running:

akamas create component workload.yaml frontend-2
akamas create component container.yaml frontend-2
akamas create component pod.yaml frontend-2

Now create an application.yaml manifest like the following:

name: webapp
description: The web application of frontend deployment
componentType: Web Application
properties:
  dynatrace:
    id: SERVICE-80258F7AA97F2E4D
  prometheus:
    namespace: hipster-shop-2
    pod: frontend.*
    container: server

Notice the component includes properties that specify how the Dynatrace and Prometheus telemetry providers will look up this service in the Kubernetes cluster.

These properties depend on the telemetry provider you are using. See the telemetry provider reference for the full list of supported providers and their respective configurations.

Then run:

akamas create component application.yaml frontend-2

Finally, create an hpa.yaml manifest like the following:

name: frontend_hpa
description: The HPA for the frontend
componentType: HPA

The HPA component does not provide any metric, so we do not need to specify any telemetry-related properties.

Then run:

akamas create component hpa.yaml frontend-2

Workflow

To optimize a Kubernetes microservice in production, you need to create a workflow that defines how the new configuration recommended by Akamas will be deployed in production.

Let's explore the high-level tasks required in this scenario and the options you have to adapt it to your environment:

1) Update the Kubernetes deployment and HPA configurations

The first step is to update the Kubernetes deployment and HPA with the new configuration. This can be done in several ways depending on your environment and processes:

  • A simple option is to let Akamas directly update the Kubernetes entities leveraging the Kubernetes APIs via kubectl commands.

  • Another option is to follow an Infrastructure-as-code approach, where the configuration change is managed via pull requests to a Git repository, leveraging your pipelines to deploy the change in production.

In this guide, we take the first option and use the kubectl patch and kubectl apply commands to configure the new deployment and the HPA.

These commands are executed from the toolbox, an Akamas utility that can be enabled in an Akamas installation on Kubernetes. Make sure that kubectl is configured correctly to connect to your Kubernetes cluster and can update your target deployment. See the reference documentation for more details.
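
As an illustrative sketch (your actual manifest may differ), the HPA template rendered by the workflow below could look like the following; the min/max replica values are assumptions, while the placeholder is the Akamas parameter tuned in this study:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: frontend
  minReplicas: 1    # assumption, keep your current values
  maxReplicas: 10   # assumption, keep your current values
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: ${frontend_hpa.metrics_resource_target_averageUtilization}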

2) Wait for the new deployment to be rolled out in production

In a live optimization, Akamas needs to understand when the new deployment rollout is complete and whether it was completed successfully or not. This is key information for Akamas AI to observe and optimize your applications safely.

This task can be done in several ways depending on how you manage changes, as discussed in the previous task:

  • A simple option is to use the kubectl rollout command to wait for the deployment rollout completion. This is the approach used in this guide.

  • Another option is to follow an Infrastructure-as-code approach, where a change is managed via pull requests to a Git repository, leveraging your pipelines to deploy in production. In this situation, the deployment process is executed externally and is not controlled by Akamas. Hence, the workflow task will periodically poll the Kubernetes deployment to recognize when the new deployment has landed in production.

3) Wait for the appropriate time to start the experiment

When dealing with the HPA, it is important that Akamas always observes the same timeframe.

If the configuration change requires too much time (e.g., because it requires a manual step), the Akamas experiments will observe different workload patterns (e.g., night-time traffic instead of day-time traffic). This would make the analysis quite complex, especially for humans.

Although Akamas handles different workload patterns, it's always better to run each experiment on the same time slot, so that each configuration is evaluated against a similar workload pattern.

In this example, we assume that we want to evaluate a new configuration every hour, hence we insert a workflow step that waits for the start of the next hour. How long this wait should be typically depends on the configuration process of your application.

4) Observe how the application behaves with the new configuration

In a live optimization, Akamas simply needs to wait for a given observation interval, while the application works in production with the new configuration. Telemetry metrics will be collected during this observation period and will be analyzed by Akamas AI to recommend the next configuration.

Since we decided to evaluate a configuration every hour, we use a 55-minute observation interval, leaving 5 minutes for the configuration process.

Let's now create a workflow.yaml manifest like the following:

name: frontend-11-delayedApproval-hpa-1hour-system2
tasks:
  - name: configure frontend
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox
        username: akamas
        key: /home/stefano/tmp_ak_key
        path: /work/examples/hipstershop-hpa/hipstershop-2/ak-frontend.sh.templ
      target:
        hostname: toolbox
        username: akamas
        key: /home/stefano/tmp_ak_key
        path: /work/ak-frontend-2.sh

  - name: apply frontend
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        key: /home/stefano/tmp_ak_key
      command: sh /work/ak-frontend-2.sh hipster-shop-2 frontend

  - name: verify frontend
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        key: /home/stefano/tmp_ak_key
      command: kubectl rollout status --timeout=5m deployment/frontend -n hipster-shop-2;

  - name: configure hpa
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox
        username: akamas
        key: /home/stefano/tmp_ak_key
        path: /work/examples/hipstershop-hpa/hipstershop-2/frontend-hpa-v2.yaml.templ
      target:
        hostname: toolbox
        username: akamas
        key: /home/stefano/tmp_ak_key
        path: /work/frontend-hpa-v2-2.yaml

  - name: apply hpa
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        key: /home/stefano/tmp_ak_key
      command: kubectl apply -f /work/frontend-hpa-v2-2.yaml -n hipster-shop-2

  - name: check if we are in time or wait for start of next hour
    operator: Executor
    arguments:
      host:
        hostname: toolbox
        username: akamas
        key: /home/stefano/tmp_ak_key
      command: if [ $(date +%M) -lt 55 ]; then sleep $((60*(60 - $(date +%M)))); else sleep 0; fi

  - name: observe 55 minutes
    operator: Sleep
    arguments:
      seconds: 3300

Then run:

akamas create workflow workflow.yaml

Telemetry

To collect metrics of your target Kubernetes deployment, you create a telemetry instance based on your observability setup.

Create a dynatrace.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>
  pushEvents: false

Then run:

akamas create telemetry-instance dynatrace.yaml frontend-2

Create a prometheus.yaml manifest like the following:

provider: Prometheus
config:
  address: prom-kube-prometheus-stack-prometheus.monitoring
  port: 9090
  duration: 60
  logLevel: DETAILED
metrics:
  - metric: cost
    datasourceMetric: 'sum(kube_pod_container_resource_requests{resource="cpu" %FILTERS%})*29 + sum(kube_pod_container_resource_requests{resource="memory" %FILTERS%})/1024/1024/1024*3.2'

Then run:

akamas create telemetry-instance prometheus.yaml frontend-2

Study

It's now time to create the Akamas study to achieve your optimization objectives.

Let's explore how the study is designed by going through the main concepts. The complete study manifest is available at the bottom.

Goal

Your overall objective is to reduce the cost (or resource footprint) of a Kubernetes deployment. To do that, you need to define the goal, which is a metric (or combination of metrics) representing the deployment cost to be minimized.

There are different approaches to measuring the cost of Kubernetes deployments:

  • A simple approach is to consider that Kubernetes allocates infrastructure resources based on pod resource requests (CPU and memory). Hence, the cost of a deployment can be derived from the deployment aggregate CPU and memory requests. In this guide, we use this approach and define the study goal as the sum of CPU and memory requests of the container to be optimized.

  • Alternatively, the cost of a Kubernetes deployment can also be collected from external data sources that provide actual cost metrics like OpenCost. In this case, the study goal can be defined by leveraging the cost metric. See here for more information on how to integrate cost metrics.

Notice that weighting factors can be used in the goal formula to specify the importance of CPU vs memory resources. For example, the cloud price of 1 CPU is about 9 times that of 1 GB of RAM. You can customize those weights based on your requirements so that Akamas knows how to truly reach the most cost-efficient configuration in your specific context.
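
As a concrete reference, the Prometheus cost metric defined in the telemetry instance above weighs CPU requests by a factor of 29 and memory requests (in GB) by a factor of 3.2; these are illustrative price weights that you should replace with your actual infrastructure prices.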

Constraints

When optimizing for cost reduction (or resource footprint), it's key not to impact application response time or introduce risks of availability and reliability issues. To ensure this, you can define your performance and reliability requirements (SLOs) as metric constraints.

In this study:

  • to ensure application performance, constraints are specified on application response times and error rate

  • to ensure application reliability, constraints are specified on container peak CPU and memory utilization, and container out-of-memory kills

Parameters

To achieve cost-efficient and reliable microservices, Kubernetes container resources and HPA scaling options must be configured optimally and tuned jointly, as they are heavily interconnected.

To do that, the study includes the following parameters:

  • Kubernetes container: CPU and memory requests and limits

  • HPA target CPU utilization

The study also includes parameter constraints to ensure that recommended configurations are safe and comply with best practices. In particular:

  • CPU limits must be at most 2x CPU requests, to avoid excessive over-commitment of CPU limits in the cluster.

Notice that the parameters and constraints can change depending on your policies. For example, it is a best practice to set memory requests == limits to avoid pod eviction, hence we are only tuning the memory limit in the study and set the request to the same value in the deployment file.

Workload

Akamas live optimization considers the application's workload to recommend new configurations that are optimal for the goal (e.g. reduce cost) while meeting all metric constraints (e.g., latency and error rates).

For Kubernetes microservices, the workload is typically the throughput (requests/sec) of the microservice API endpoints. This is the approach used in this guide.

Approval mode

In this live optimization, the manual approval is set to false, meaning that as soon as a new configuration gets generated, the workflow will be executed without any human involvement.

You can set it to true so that Akamas will ask for user approval when a new configuration gets generated. Once you approve it, the workflow will be executed, and the new configuration will be deployed to production according to the integration strategy you have defined above.

You can now create a study.yaml manifest like the following:

name: ak-frontend - live - system 2
system: frontend-2
workflow: frontend-11-delayedApproval-hpa-1hour-system2

goal:
  name: Cost
  objective: minimize
  function:
    formula: web_application.cost
  constraints:
    absolute:
      - name: Application response time degradation
        formula: web_application.requests_response_time_p50:p90 <= 60
      - name: Application error rate degradation
        formula: web_application.requests_error_rate:p90 <= 0.02
      - name: Container CPU saturation
        formula: server.container_cpu_util_max:p90 < 0.8
      - name: Container memory saturation
        formula: server.container_memory_used:max / server.container_memory_limit < 0.7

windowing:
  type: trim
  trim: [1m, 1m]
  task: observe 55 minutes

parametersSelection:
  - name: server.cpu_request
    domain: [10, 500]
  - name: server.cpu_limit
    domain: [10, 500]
  - name: server.memory_limit
    domain: [16, 640]
  - name: frontend_hpa.metrics_resource_target_averageUtilization
    domain: [10, 90]

parameterConstraints:
  - name: CPU request less or equal to limits
    formula: server.cpu_request <= server.cpu_limit
  - name: CPU limit within a given factor of request
    formula: server.cpu_limit <= server.cpu_request * 2

workloadsSelection:
  - name: web_application.requests_throughput:max
  - name: web_application.requests_throughput

numberOfTrials: 1
steps:
  - name: baseline
    type: baseline
    numberOfTrials: 3
    values:
      server.cpu_request: 200
      server.cpu_limit: 400
      server.memory_limit: 128
      frontend_hpa.metrics_resource_target_averageUtilization: 60
    renderParameters: [frontend_hpa.metrics_resource_target_averageUtilization]

  - name: optimize
    type: optimize
    numberOfExperiments: 300

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI.


Optimize cost of a Java microservice on Kubernetes while preserving SLOs in production

In this guide, you optimize the cost (or resource footprint) of a Java microservice running on Kubernetes. The study tunes both pod resource settings (CPU and memory requests and limits) and JVM options (max heap size, garbage collection algorithm, etc.) at the same time, while also taking into account your application performance and reliability requirements (SLOs). This optimization happens in production, leveraging Akamas live optimization capabilities.

Prerequisites

  • an Akamas instance

  • a Kubernetes cluster, with a Java-based deployment to be optimized

  • a way to apply configuration changes recommended by Akamas to the target deployment. In this guide, Akamas interacts directly with the Kubernetes APIs via kubectl. You need a service account with permissions to update your deployment (see below for other integration options)

  • a supported telemetry data source configured to collect metrics from the target Kubernetes cluster (see the telemetry providers reference for the full list)

Optimization setup

In this guide, we assume the following setup:

  • the Kubernetes deployment to be optimized is called adservice (in the boutique namespace)

  • in the deployment, there is a container named server, where the application JVM runs

  • Dynatrace is used as an observability tool

Let's set up the Akamas optimization for this use case.

System

For this optimization, you need the following components to model the adservice tech stack:

  • A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits (from the Kubernetes optimization pack)

  • A Java OpenJDK component, which contains JVM-level metrics like heap memory usage and parameters to be tuned like the garbage collector algorithm (from the Java OpenJDK optimization pack)

  • A Web Application component, which contains service-level metrics like throughput and response time of the microservice (from the Web Application optimization pack)

Let's start by creating the system, that represents the Kubernetes deployment to be optimized. To create it, write a system.yaml manifest like this:

name: adservice
description: The Adservice deployment

Then run:

akamas create system system.yaml

Now create a component-container.yaml manifest like the following:

name: server
description: Kubernetes container in the adservice deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: adservice-*

Notice the component includes properties that specify how Dynatrace telemetry will look up this container in the Kubernetes cluster (the same will happen for the following components).

These properties are dependent upon the telemetry provider you are using.

Then run:

akamas create component component-container.yaml adservice

Next, create a component-jvm.yaml manifest like the following:

name: jvm
description: JVM of the adservice deployment
componentType: java-openjdk-17
properties:
  dynatrace:
    type: PROCESS
    tags:
      akamas: adservice-jvm

Then run:

akamas create component component-jvm.yaml adservice

Now create a component-webapp.yaml manifest like the following:

name: webapp
description: The HTTP service of the adservice deployment
componentType: Web Application
properties:
  dynatrace:
    type: SERVICE
    name: adservice

Then run:

akamas create component component-webapp.yaml adservice

Workflow

To optimize a Kubernetes microservice in production, you need to create a workflow that defines how to deploy in production the new configuration recommended by Akamas.

Let's explore the high-level tasks required in this scenario and the options you have to adapt it to your environment:

1) Update the Kubernetes deployment configuration

The first step is to update the Kubernetes deployment with the new configuration. This can be done in several ways depending on your environment and processes:

  • A simple option is to let Akamas directly update the deployment leveraging the Kubernetes APIs via kubectl commands

  • Another option is to follow an Infrastructure-as-code approach, where the configuration change is managed via pull requests to a Git repository, leveraging your pipelines to deploy the change in production

In this guide, we take the first option and use the kubectl apply command to configure the new deployment. These commands are executed from the toolbox, an Akamas utility that can be enabled in an Akamas installation on Kubernetes. Make sure that kubectl is configured correctly to connect to your Kubernetes cluster and can update your target deployment. See the reference documentation for more details.

2) Wait for the new deployment to be rolled out in production

In a live optimization, Akamas needs to understand when the new deployment rollout is complete and whether it was completed successfully or not. This is key information for Akamas AI to observe and optimize your applications safely.

This task can be done in several ways depending on how you manage changes, as discussed in the previous task:

  • A simple option is to use the kubectl rollout command to wait for the deployment rollout completion. This is the approach used in this guide

  • Another option is to follow an Infrastructure-as-code approach, where a change is managed via pull requests to a Git repository, leveraging your pipelines to deploy in production. In this situation, the deployment process is executed externally and is not controlled by Akamas. Hence, the workflow task will periodically poll the Kubernetes deployment to recognize when the new deployment has landed in production

3) Observe how the application behaves with the new configuration

In a live optimization, Akamas simply needs to wait for a given observation interval, while the application works in production with the new configuration. Telemetry metrics will be collected during this observation period and will be analyzed by Akamas AI to recommend the next configuration.

A 30-minute observation interval is recommended for most situations.

Let's now create a workflow.yaml manifest like the following:

name: adservice
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
        path: adservice.yaml.templ
      target:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
        path: adservice.yaml

  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
      command: kubectl apply -f adservice.yaml

  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
      command: kubectl rollout status --timeout=5m deployment/adservice -n boutique

  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800

In the configure task, Akamas will apply the container CPU/memory limits and JVM options recommended by Akamas AI to the deployment file. To do that, copy your deployment manifest to a template file (here called adservice.yaml.templ), and substitute the current values with Akamas parameter placeholders as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: adservice
spec:
  selector:
    matchLabels:
      app: adservice
  replicas: 1
  template:
    metadata:
      labels:
        app: adservice
    spec:
      containers:
        - name: server
          image: gcr.io/google-samples/microservices-demo/adservice:v0.3.8
          ports:
            - containerPort: 9555
          env:
            - name: PORT
              value: "9555"
            - name: JAVA_OPTS
              value: "${jvm.*}"
          resources:
            requests:
              cpu: ${server.cpu_request}
            limits:
              cpu: ${server.cpu_limit}
              memory: ${server.memory_limit}

Whenever Akamas recommended configuration is applied, the configure task will create the actual adservice.yaml deployment file with the parameter placeholders substituted with values recommended by Akamas AI, and then the new deployment will be applied via kubectl apply.
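
For example, if Akamas recommends cpu_limit = 500 and memory_limit = 1024 together with a set of JVM options, the rendered adservice.yaml will contain those concrete values in place of ${server.cpu_limit} and ${server.memory_limit}, and the JAVA_OPTS environment variable will receive the whole string of recommended JVM options in place of ${jvm.*}.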

To create the workflow, run:

akamas create workflow workflow.yaml

Telemetry

Create a telemetry instance based on your observability setup to collect your target Kubernetes deployment metrics.

Create a telemetry.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>

Then run:

akamas create telemetry-instance telemetry.yaml adservice

Study

It's time to create the Akamas study to achieve your optimization objectives.

Let's explore how the study is designed by going through the main concepts. The complete study manifest is available at the bottom.

Goal

Your overall objective is to reduce the cost (or resource footprint) of a Kubernetes deployment. To do that, you need to define the goal, which is a metric (or combination of metrics) representing the deployment cost to be minimized.

There are different approaches to measuring the cost of Kubernetes deployments:

  • A simple approach is to consider that Kubernetes allocates infrastructure resources based on pod resource requests (CPU and memory). Hence, the cost of a deployment can be derived from the deployment aggregate CPU and memory requests. In this guide, we use this approach and define the study goal as the sum of CPU and memory requests of the container to be optimized

  • Alternatively, the cost of a Kubernetes deployment can also be collected from external data sources that provide actual cost metrics like OpenCost. In this case, the study goal can be defined by leveraging the cost metric

Notice that weighting factors can be used in the goal formula to specify the importance of CPU vs memory resources. For example, the cloud price of 1 CPU is about 9 times that of 1 GB of RAM. You can customize those weights based on your requirements so that Akamas knows how to truly reach the most cost-efficient configuration in your specific context.
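
In the study below, for instance, the CPU term of the goal formula is weighted by a factor of 29 (per core) and the memory term by a factor of 3 (per GB), which is roughly that 9:1 ratio; adjust these weights to reflect your own prices.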

Constraints

When optimizing for cost reduction (or resource footprint), it's key not to impact application response time or introduce risks of availability and reliability issues. To ensure this, you can define your performance and reliability requirements (SLOs) as metric constraints.

In this study:

  • to ensure application performance, constraints are specified on application response times and error rate

  • to ensure application reliability, constraints are specified on:

    • container peak CPU and memory utilization, and container out-of-memory kills

    • JVM garbage collection time %, to prevent out-of-memory in the JVM heap memory

Parameters

To achieve cost-efficient and reliable Java-based microservices, Kubernetes container resources and JVM runtime options must be configured optimally and tuned jointly, as they are heavily interconnected.

To do that, the study includes the following parameters:

  • Kubernetes container: CPU and memory requests and limits

  • JVM: heap size and garbage collection (GC) algorithms

The study also includes parameter constraints to ensure that recommended configurations are safe and comply with best practices. In particular:

  • Kubernetes container memory limit must be higher than JVM heap size, plus a buffer to account for JVM off-heap memory usage

  • CPU limits must be at most 2x CPU requests, to avoid excessive over-commitment of CPU limits in the cluster

Notice that the parameters and constraints can change depending on your policies. For example, it is a best practice to set memory requests == limits to avoid pod eviction: in this study, only the memory limit is tuned, and the deployment template sets no explicit memory request so that it defaults to the limit.

Workload

Akamas live optimization considers the application's workload to recommend new configurations that are optimal for the goal (e.g. reduce cost) while meeting all metric constraints (e.g., latency and error rates).

For Kubernetes microservices, the workload is typically the throughput (requests/sec) of the microservice API endpoints. This is the approach used in this guide.

Approval mode and recommendation frequency

In this live optimization, the manual approval is set to required, meaning that Akamas will ask for user approval when a new configuration gets generated. Once you approve it, the workflow will be executed, and the new configuration will be deployed to production according to the integration strategy you have defined above.

You can set it to false to enable fully autonomous optimization: in this case, as soon as a new configuration gets generated, the workflow will be executed without any human involvement.

The recommendation frequency can be chosen by leveraging the numberOfTrials parameter. As the workflow duration is set to 30 minutes, in order to have a new configuration generated daily, set the number of trials to 48.

You can now create a study.yaml manifest like the following:

name: adservice - optimize costs tuning K8s and JVM
system: adservice
workflow: adservice

goal:
  name: Cost
  objective: minimize
  function:
    formula: ((server.container_cpu_limit)/1000)*29 + ((((server.container_memory_limit)/1024)/1024)/1024)*3
  constraints:
    absolute:
      - name: Application response time degradation
        formula: web_application.requests_response_time:max <= 5
      - name: Application error rate degradation
        formula: web_application.requests_error_rate:max <= 0.02
      - name: Container CPU saturation
        formula: server.container_cpu_util_max:p95 < 1
      - name: Container memory saturation
        formula: server.container_memory_util_max:max < 1
      - name: Container out-of-memory
        formula: server.container_restarts == 0
      - name: JVM heap saturation
        formula: jvm.jvm_gc_time:max < 0.05

windowing:
  type: trim
  trim: [2m, 0s]
  task: observe

parametersSelection:
  - name: server.cpu_request
    domain: [10, 181]
  - name: server.cpu_limit
    domain: [10, 181]
  - name: server.memory_limit
    domain: [16, 2048]
  - name: jvm.jvm_maxHeapSize
    domain: [16, 1024]
  - name: jvm.jvm_gcType

parameterConstraints:
  - name: JVM off-heap safety buffer
    formula: jvm.jvm_maxHeapSize + 1000 < server.memory_limit
  - name: CPU limit at most 2x of requests
    formula: server.cpu_limit <= server.cpu_request * 2

workloadsSelection:
  - name: web_application.requests_throughput

numberOfTrials: 48
steps:
  - name: baseline
    type: baseline
    values:
      server.cpu_limit: 1000
      server.memory_limit: 2048
      jvm.jvm_maxHeapSize: 1024
      jvm.jvm_gcType: Serial

  - name: optimize
    type: optimize
    numberOfExperiments: 21

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI.

Artifact templates

To quickly set up this optimization, download the Akamas template manifests (akamas-templates-optimize-costs-k8s-jvm-live.tgz) and update the values file to match your needs. Then, create your optimization using the Akamas scaffolding.


Optimizing performance of a Node.js application with V8 runtime tuning leveraging performance tests

Optimizing performance of a Java application with JVM tuning leveraging performance tests


Optimizing cost of a Node.js application with performance tests

COMING SOON! Please reach out to us at support@akamas.io if interested.

Optimizing cost of a Golang application with performance tests

COMING SOON! Please reach out to us at support@akamas.io if interested.


Parameters and constraints

One of the key elements that define an optimization study is the set of parameters being optimized. We have already seen in the study section how to define the set of optimized parameters; here we dig deeper into this topic.

Akamas supports four types of parameters:

  • Integer parameters are those that can only assume an integer value (e.g. the number of cores on a VM instance).

  • Real parameters can assume real values (e.g. 0.2) and are mostly used when dealing with percentages.

  • Categorical parameters map those elements that do not have a strict ordering such as GC types (e.g. Parallel, G1, Serial) or booleans.

  • Ordinal parameters are similar to categorical ones as they also support a set of literal values, but they are also ordered. An example is the VM instance size (e.g. small, medium, large, xlarge).
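
As a minimal sketch of how the different types appear in a study's parametersSelection (component and parameter names are taken from examples elsewhere in this documentation):

parametersSelection:
  - name: container.cpu_limit            # integer parameter with a restricted domain
    domain: [300, 1000]
  - name: jvm.jvm_gcType                 # categorical parameter (unordered values)
    categories: [Serial, Parallel, G1]
  - name: ec2.aws_ec2_instance_size      # ordinal parameter (ordered values)
    categories: [large, xlarge, 2xlarge]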

Most of the time you should not bother with defining parameters, as this information is already defined in the Optimization Packs. You can read more on parameters and how they are managed in the reference documentation section.

When creating new optimization studies you should first select a set of parameters to include in the optimization process. The set might depend on many factors such as:

  • The potential impact of a parameter on the defined goal (e.g. if my goal is to reduce the cost of running an application it might be a good idea to include parameters related to resource usage).

  • The layers selected for the optimization. Optimizing multiple layers at the same time might bring more benefits as the configurations of both layers are aligned.

  • Akamas' ability to change those parameters (e.g. if the deployment process does not support setting some parameters because, for example, they are managed by an external group, you should avoid adding them).

Domains

Besides defining the set of parameters, users can also select the domain for the optimization and add a set of constraints.

Optimization packs already include information on the possible values for each parameter, but in some situations it is necessary to shrink the domain. As an example, the parameter that defines the amount of CPU that a container can use (cpu_limit) might vary a lot depending on the underlying cluster and the application. If the cluster that hosts the application only contains nodes with up to 10 CPUs, it is worth limiting the domain of this parameter to that value, to avoid failures when deploying the application and to speed up the optimization process. If you forget to set this domain restriction, Akamas will learn it by itself, but only after trying to deploy a container with a higher CPU limit and finding out that it is not possible.
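
For example, on a cluster whose largest nodes have 10 CPUs, the domain could be capped as follows (values in millicores):

parametersSelection:
  - name: container.cpu_limit
    domain: [100, 10000]   # cap at 10 CPUs to match the largest cluster node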

Constraints

In many situations, parameters have dependencies on each other. As an example, suppose you want to optimize at the same time the size of a container and the Java runtime that executes the application inside it. Both layers have parameters that affect how much memory can be used: for the container layer this parameter is called memory_limit, while for the JVM it is called jvm_heap_size. Configurations with a jvm_heap_size value higher than the memory_limit might lead to out-of-memory errors.

You can define this relationship by specifying a constraint as in the example below:

parameterConstraints:
  - name: Heap should be lower than the container memory limit
    formula: container.memory_limit > jvm.jvm_heap_size + 50

This constraint instructs Akamas to avoid generating configurations that bring the jvm_heap_size parameter too close to the memory_limit, keeping a gap of at least 50 MB. Constraints usually depend on the set of parameters chosen for the optimization: you can find more information about common constraints for the supported technologies in the documentation of the related optimization pack or in the optimization guides.

Optimizing a sample Java OpenJDK application

In this example study we'll tune the parameters of PageRank, one of the benchmarks available in the Renaissance suite, with the goal of minimizing its memory usage. Application monitoring is provided by Prometheus, leveraging a JMX exporter.

Environment setup

The test environment includes the following instances:

  • Akamas: instance running Akamas

  • PageRank: instance running the PageRank benchmark and the Prometheus monitoring service

Telemetry Infrastructure setup

To gather metrics about PageRank we will use Prometheus and a JMX exporter. Here's the scrape job to add to the Prometheus configuration to extract the metrics from the exporter:

- job_name: jmx-exporter
  static_configs:
    - targets: ['pagerank.akamas.io:5556']
      labels:
        instance: jvm

Application and Test tool

To run and monitor the benchmark we'll require the following on the PageRank instance:

  • The Renaissance jar

  • The JMX exporter agent, plus a configuration file to expose the required classes

Here’s the snippet of code to configure the instance as required for this guide:

mkdir renaissance; cd renaissance
wget -O renaissance.jar https://github.com/renaissance-benchmarks/renaissance/releases/download/v0.10.0/renaissance-gpl-0.10.0.jar
wget -O jmx_exporter.jar https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.14.0/jmx_prometheus_javaagent-0.14.0.jar
echo -e '---\nwhitelistObjectNames: ["java.lang:*"]' > conf.yaml

Optimization setup

In this section, we will guide you through the steps required to set up the optimization on Akamas.

If you have not installed the optimization pack yet, take a look at the Java OpenJDK optimization pack page to proceed with the installation.

System

System pagerank

Here's the definition of the system (pagerank.yaml) we will use to group our components and telemetry instances for this example:

name: pagerank
description: A system to tune the pagerank benchmark

To create the system run the following command:

akamas create system pagerank.yaml

Component jvm

We'll use a component of type Java OpenJDK 11 to represent the JVM underlying the PageRank benchmark. To identify the JMX-related metrics in Prometheus, the configuration requires the prometheus property for the telemetry service, detailed later in this guide.

Here’s the definition of the component:

name: jvm
componentType: java-openjdk-11
properties:
  prometheus:
    instance: jvm
    job: jmx-exporter

To create the component in the system run the following command:

akamas create component jvm.yaml pagerank

Workflow

The workflow used for this study consists of two main stages:

  • generate the configuration file containing the tested Java parameters

  • run the benchmark using the previously written parameters

Here’s the definition of the workflow:

name: run-pagerank
tasks:
  - name: Configure parameters
    operator: FileConfigurator
    arguments:
      source:
        hostname: pagerank.akamas.io
        username: ubuntu
        path: /home/ubuntu/renaissance/java_opts.template
        key: key
      target:
        hostname: pagerank.akamas.io
        username: ubuntu
        path: /home/ubuntu/renaissance/java_opts
        key: key

  - name: Run benchmark
    operator: Executor
    arguments:
      command: "cd renaissance; java -javaagent:./jmx_exporter.jar=5556:conf.yaml $(cat java_opts) -jar renaissance.jar -r 2 page-rank"
      host:
        hostname: pagerank.akamas.io
        username: ubuntu
        key: key

Where the configuration template java_opts.template is defined as follows:

 ${jvm.jvm_gcType} ${jvm.jvm_maxHeapSize} ${jvm.jvm_newSize} ${jvm.jvm_survivorRatio} ${jvm.jvm_maxTenuringThreshold}

To create the workflow run the following command:

akamas create workflow workflow.yaml

Telemetry

The following is the definition of the telemetry instance that fetches metrics from the Prometheus service:

provider: Prometheus
config:
  address: pagerank.akamas.io
  port: 9090

To create the telemetry instance in the system run the following command:

akamas create telemetry-instance prometheus.yaml pagerank

This telemetry instance will be able to bind the fetched metrics to the related jvm component thanks to the prometheus property we previously added to its definition.

Study

The goal of this study is to find a JVM configuration that minimizes the peak memory used by the benchmark.

The optimized parameters are the maximum heap size, the garbage collector used and several other parameters managing the new and old heap areas. We also specify a constraint stating that the GC regions can’t exceed the total heap available, to avoid experimenting with parameter configurations that can’t start in the first place.

Here’s the definition of the study:

name: Optimize PageRank
description: Tweaking the JVM parameters to optimize the page-rank benchmark.
system: pagerank
workflow: run-pagerank

goal:
  objective: minimize
  function:
    formula: memory_used
    variables:
      memory_used:
        metric: jvm.jvm_memory_used

parametersSelection:
  - name: jvm.jvm_gcType
  - name: jvm.jvm_maxHeapSize
    domain: [1250, 2000]
  - name: jvm.jvm_newSize
    domain: [350, 2000]
  - name: jvm.jvm_survivorRatio
  - name: jvm.jvm_maxTenuringThreshold

parameterConstraints:
  - name: Max heap must always be greater than new size
    formula: jvm.jvm_maxHeapSize > jvm.jvm_newSize

steps:
  - name: baseline
    type: baseline
    values:
      jvm.jvm_gcType: G1
      jvm.jvm_maxHeapSize: 2000

  - name: optimize
    type: optimize
    numberOfExperiments: 30

To create and run the study execute the following commands:

akamas create study study.yaml
akamas start study 'Optimize PageRank'


Optimizing cost of a .NET application with performance tests

COMING SOON! Please reach out to us at support@akamas.io if interested.

Optimizing a sample application running on AWS

In this example, you will go through the optimization of a Spark application running on AWS instances. We’ll be using a PageRank implementation included in Renaissance, an industry-standard Java benchmarking suite, tuning both Java and AWS parameters to improve the performance of our application.

Environment setup

For this example, you’re expected to use two dedicated machines:

  • an Akamas instance

  • a Linux-based AWS EC2 instance

The Linux-based instance will run the application benchmark, so it requires the latest OpenJDK 11 release:

sudo apt install openjdk-11-jre

Telemetry Infrastructure setup

For this study you're going to require the following telemetry providers:

  • Prometheus, to collect the JVM metrics exposed by the JMX exporter

  • CSV, to parse the benchmark results

Application and Test tool

Since the application consists of a jar file only, the setup is rather straightforward; just download the binaries in the ~/renaissance/ folder:

mkdir ~/renaissance
cd ~/renaissance
wget -O renaissance.jar https://github.com/renaissance-benchmarks/renaissance/releases/download/v0.10.0/renaissance-gpl-0.10.0.jar
wget -O jmx_exporter.jar https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.14.0/jmx_prometheus_javaagent-0.14.0.jar

In the same folder upload the template file launch_benchmark.sh.templ, containing the script that executes the benchmark using the provided parameters and parses the results:

#!/bin/bash
cd ~/renaissance
# Run the benchmark with the parameters rendered by Akamas, exposing the JVM
# metrics to Prometheus on port 9110 through the JMX exporter agent
java -XX:MaxRAMPercentage=60 ${jvm.*} -javaagent:./jmx_exporter.jar=9110:config.yml -jar renaissance.jar -r 50 --csv renaissance.csv page-rank

# Aggregate the per-repetition durations and re-emit the results in a format
# consumable by the CSV telemetry provider (adding timestamp and component columns)
total_time=$(awk -F"," '{total_time+=$2}END{print total_time}' ./renaissance.csv)
first_line=$(head -n 1 renaissance.csv)
end_time=$(tail -n 1 renaissance.csv | cut -d',' -f3)
start_time=$(sed '2q;d' renaissance.csv | cut -d',' -f4)
echo $first_line,"TS,COMPONENT" > renaissance-parsed.csv
ts=$(date -d @$(($start_time/1000)) "+%Y-%m-%d %H:%M:%S")

echo -e "page-rank,$total_time,$end_time,$start_time,$ts,pagerank" >> renaissance-parsed.csv

Optimization setup

In this section, we will guide you through the steps required to set up the optimization on Akamas.

Optimization packs

This example requires the installation of the following optimization packs:

  • Java OpenJDK

  • AWS

System

Our system could be named renaissance after its application, so you'll have a system.yaml file like this:

name: renaissance
description: The system for the Renaissance benchmark application

Then create the new system resource:

akamas create system system.yaml

The renaissance system will then have three components:

  • A benchmark component

  • A Java component

  • An EC2 component, i.e. the underlying instance

Java component

Create a component-jvm.yaml file like the following:

name: jvm
description: The JVM running the benchmark
componentType: java-openjdk-11
properties:
    prometheus:
      job: jmx
      instance: jmx_instance

Then type:

akamas create component component-jvm.yaml renaissance

Benchmark component

Since there is no optimization pack associated with this component, you have to create some extra resources.

  • A metrics.yaml file for a new metric tracking execution time:

metrics:
  - name: elapsed
    unit: nanoseconds
    description: The duration of the benchmark execution
  • A component-type benchmark.yaml:

name: benchmark
description: A component type for the Renaissance Java benchmarking suite
metrics:
  - name: elapsed
parameters: []
  • The component pagerank.yaml:

name: pagerank
description: The pagerank application included in Renaissance benchmarks
componentType: benchmark

Create your new resources, by typing in your terminal the following commands:

akamas create metrics metrics.yaml
akamas create component-type benchmark.yaml
akamas create component pagerank.yaml renaissance

EC2 component

Create a component-ec2.yaml file like the following:

name: ec2
description: The ec2 instance the benchmark runs on
componentType: ec2
properties:
  hostname: renaissance.akamas.io
  sshPort: 22
  instance: ec2_instance
  username:  ubuntu
  key: # SSH KEY
  ec2:
    region: us-east-2 # This is just a reference

Then create its resource by typing in your terminal:

akamas create component component-ec2.yaml renaissance

Workflow

The workflow in this example is composed of three main steps:

  1. Update the instance type

  2. Run the application benchmark

  3. Stop the instance

In detail:

  1. Update the instance size

    1. Generate the playbook file from the template

    2. Update the instance using the playbook

    3. Wait for the instance to be available

  2. Run the application benchmark

    1. Configure the benchmark Java launch script

    2. Execute the launch script

    3. Parse PageRank output to make it consumable by the CSV telemetry instance

  3. Stop the instance

    1. Configure the playbook to stop an instance with a specific instance id

    2. Run the playbook to stop the instance

The following is the template of the Ansible playbook:

# Change instance type, requires AWS CLI

- name: Resize the instance
  hosts: localhost
  gather_facts: no
  connection: local
  tasks:
  - name: save instance info
    ec2_instance_info:
      filters:
        "tag:Name": <your-instance-name>
    register: ec2
  - name: Stop the instance
    ec2:
      region: <your-aws-region>
      state: stopped
      instance_ids:
        - "{{ ec2.instances[0].instance_id }}"
      instance_type: "{{ ec2.instances[0].instance_type }}"
      wait: True
  - name: Change the instances ec2 type
    shell: >
       aws ec2 modify-instance-attribute --instance-id "{{ ec2.instances[0].instance_id }}"
       --instance-type "${ec2.aws_ec2_instance_type}.${ec2.aws_ec2_instance_size}"
    delegate_to: localhost
  - name: restart the instance
    ec2:
      region: <your-aws-region>
      state: running
      instance_ids:
        - "{{ ec2.instances[0].instance_id }}"
      wait: True
    register: ec2
  - name: wait for SSH to come up
    wait_for:
      host: "{{ item.public_dns_name }}"
      port: 22
      delay: 60
      timeout: 320
      state: started
    with_items: "{{ ec2.instances }}"

The following is the workflow configuration file:

name: Pagerank AWS optimization
tasks:

  # Creating the EC2 instance
  - name: Configure provisioning
    operator: FileConfigurator
    arguments:
      sourcePath: /home/ubuntu/ansible/resize.yaml.templ
      targetPath: /home/ubuntu/ansible/resize.yaml
      host:
        hostname: bastion.akamas.io
        username: ubuntu
        key: # SSH KEY

  - name: Execute Provisioning
    operator: Executor
    arguments:
      command: ansible-playbook /home/ubuntu/ansible/resize.yaml
      host:
        hostname: bastion.akamas.io
        username: ubuntu
        key: # SSH KEY

  # Waiting for the instance to come up and set up its DNS
  - name: Pause
    operator: Sleep
    arguments:
      seconds: 120

  # Running the benchmark
  - name: Configure Benchmark
    operator: FileConfigurator
    arguments:
        source:
            hostname: renaissance.akamas.io
            username: ubuntu
            path: /home/ubuntu/renaissance/launch_benchmark.sh.templ
            key: # SSH KEY
        target:
            hostname: renaissance.akamas.io
            username: ubuntu
            path: /home/ubuntu/renaissance/launch_benchmark.sh
            key: # SSH KEY

  - name: Launch Benchmark
    operator: Executor
    arguments:
      command: bash /home/ubuntu/renaissance/launch_benchmark.sh
      host:
        hostname: renaissance.akamas.io
        username: ubuntu
        key: # SSH KEY

Create the workflow resource by typing in your terminal:

akamas create workflow workflow.yaml
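
The Launch Benchmark task runs the script generated from launch_benchmark.sh.templ. A minimal sketch of such a template, assuming the heap parameters are expressed in MB and that the JMX exporter agent discussed in the Telemetry section is attached (file names, flags, and the exact token rendering of each JVM parameter are illustrative, not part of the original example):

#!/bin/bash
# FileConfigurator replaces the ${jvm.*} tokens with the values under test
java -Xms${jvm.jvm_minHeapSize}m -Xmx${jvm.jvm_maxHeapSize}m \
  -javaagent:/home/ubuntu/renaissance/jmx_prometheus_javaagent.jar=9110:/home/ubuntu/renaissance/config.yml \
  -jar /home/ubuntu/renaissance/renaissance.jar page-rank > /home/ubuntu/renaissance/renaissance.out

The stop-instance step described above follows the same FileConfigurator/Executor pattern used for the resize playbook; a minimal sketch, assuming a stop.yaml.templ playbook exists on the bastion host (names are hypothetical):

  - name: Configure Stop
    operator: FileConfigurator
    arguments:
      sourcePath: /home/ubuntu/ansible/stop.yaml.templ
      targetPath: /home/ubuntu/ansible/stop.yaml
      host:
        hostname: bastion.akamas.io
        username: ubuntu
        key: # SSH KEY

  - name: Stop Instance
    operator: Executor
    arguments:
      command: ansible-playbook /home/ubuntu/ansible/stop.yaml
      host:
        hostname: bastion.akamas.io
        username: ubuntu
        key: # SSH KEY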

Telemetry

This example uses the following telemetry providers:

  • the CSV Provider to parse the results of the benchmark

  • the Prometheus provider to monitor the instance

  • the AWS Telemetry provider to extract the instance price

If you have not installed the Prometheus telemetry provider or the CSV telemetry provider yet, take a look at the Prometheus provider and CSV Provider pages to proceed with the installation.

Prometheus

Prometheus allows us to gather JVM execution metrics through the JMX exporter: download the Java agent required to gather metrics from here, then update the two following files:

  • The prometheus.yml file, located in your Prometheus folder:

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: prometheus
    static_configs:
    - targets: ['localhost:9090']

  - job_name: jmx
    static_configs:
    - targets: ["localhost:9110"]
    relabel_configs:
    - source_labels: ["__address__"]
      regex: "(.*):.*"
      target_label: instance
      replacement: jmx_instance

The config.yml file you have to create in the ~/renaissance folder:

startDelaySeconds: 0
username:
password:
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
# using the property below we are telling the exporter to export only relevant Java metrics
whitelistObjectNames:
  - "java.lang:*"
  - "jvm:*"

You may find further info on exporting Java metrics to Prometheus here.
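
Once the benchmark JVM is running with the JMX exporter agent attached (see the launch script template above), you can quickly check that metrics are exposed on the configured port; for example:

curl -s http://localhost:9110/metrics | head

This should print the first few JVM metric samples in the Prometheus exposition format.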

Now you can create a prometheus-instance.yaml file:

provider: Prometheus
config:
  address: renaissance.akamas.io
  port: 9090

Then you can create the telemetry instance:

akamas create telemetry-instance prometheus-instance.yaml renaissance

CSV - Telemetry instance

Create a telemetry-csv.yaml file to read the benchmark output:

provider: CSV
config:
  protocol: scp
  address: renaissance.akamas.io
  username: ubuntu
  authType: key
  auth: # SSH KEY
  remoteFilePattern: /home/ubuntu/renaissance/renaissance-parsed.csv
  csvFormat: horizontal
  componentColumn: COMPONENT
  timestampColumn: TS
  timestampFormat: yyyy-MM-dd HH:mm:ss

metrics:
  - metric: elapsed
    datasourceMetric: nanos
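
For reference, a renaissance-parsed.csv compatible with this configuration could look like the following (values are illustrative, and the component name assumes the benchmark component referenced by the study is named benchmark):

TS,                   COMPONENT,  nanos
2021-06-01 10:00:00,  benchmark,  1850000000
2021-06-01 10:01:00,  benchmark,  1795000000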

Then create the resource by typing in your terminal:

akamas create telemetry-instance telemetry-csv.yaml renaissance

Study

Here we provide a reference study for AWS. As anticipated, the goal of this study is to optimize a sample Java application: the PageRank benchmark from the renaissance benchmark suite by Oracle. You may find further info about the suite and its benchmarks in the official doc.

Our goal is rather simple: minimize the product of the benchmark execution time and the instance price, that is, find the most cost-effective instance for our application.
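
As a worked example (prices are illustrative, not actual AWS rates): if a c5.2xlarge priced at 0.34 $/hour runs the benchmark in 120 seconds while an m5.large priced at 0.096 $/hour takes 300 seconds, the respective scores are 120 × 0.34 = 40.8 and 300 × 0.096 = 28.8, so the slower but cheaper instance would be preferred by the goal.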

Create a study.yaml file with the following content:

name: aws
description: Tweaking aws and the JVM to optimize the page-rank application.
system: renaissance

goal:
  objective: minimize
  function:
    formula: benchmark.elapsed * aws.aws_ec2_price

workflow: workflow-aws

parametersSelection:
  - name: aws.aws_ec2_instance_type
    categories: [c5,c5d,c5a,m5,m5d,m5a,r5,r5d,r5a]
  - name: aws.aws_ec2_instance_size
    categories: [large,xlarge,2xlarge,4xlarge]
  - name: jvm.jvm_gcType
  - name: jvm.jvm_newSize
  - name: jvm.jvm_maxHeapSize
  - name: jvm.jvm_minHeapSize
  - name: jvm.jvm_survivorRatio
  - name: jvm.jvm_maxTenuringThreshold

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 2
    values:
     aws.aws_ec2_instance_type: c5
     aws.aws_ec2_instance_size: 2xlarge
     jvm.jvm_gcType: G1
  - name: optimize
    type: optimize
    numberOfExperiments: 60

Then create the corresponding Akamas resource and start the study:

akamas create study study.yaml
akamas start study aws

Optimize application performance and reliability

Kubernetes microservices

Live optimizations

Optimizing a Spark application

In this example study we’ll tune the parameters of SparkPi, one of the example applications provided by most of the Apache Spark distributions, to minimize its execution time. Application monitoring is provided by the Spark History Server APIs.

Environment setup

The test environment includes the following instances:

  • Akamas: instance running Akamas

  • Spark cluster: composed of instances with 16 vCPUs and 64 GB of memory, where the Spark binaries are installed under /usr/lib/spark. In particular, the roles are:

    • 1x master instance: the Spark node running the resource manager and Spark History Server (host: sparkmaster.akamas.io)

    • 2x worker instances: the other instances in the cluster

Telemetry Infrastructure setup

To gather metrics about the application we will leverage the Spark History Server. If it is not already running, start it on the master instance with the following command:

/usr/lib/spark/sbin/start-history-server.sh

Application and Test tools

To make sure the tested application is available on your cluster and runs correctly, execute the following commands:

file /usr/lib/spark/examples/jars/spark-examples.jar
spark-submit \
  --master yarn --deploy-mode client \
  --class 'org.apache.spark.examples.SparkPi' \
  /usr/lib/spark/examples/jars/spark-examples.jar 100

Optimization setup

In this section, we will guide you through the steps required to set up on Akamas the optimization of the Spark application execution.

System

System spark

Here’s the definition of the system we will use to group our components and telemetry instances for this example:

name: spark
description: A system to tune the Spark Pi example application

To create the system run the following command:

akamas create system system.yaml

Component sparkPi

We’ll use a component of type Spark Application 2.3.0 to represent the application running on the Apache Spark framework 2.3. In the snippet shown below, we specify:

  • the field properties required by Akamas to connect via SSH to the cluster master instance

  • the parameters required by spark-submit to execute the application

  • the sparkApplication flag required by the telemetry instance to associate the metrics from the History Server to this component

name: sparkPi
description: The SparkPi example application
componentType: Spark Application 2.3.0

properties:
  hostname: sparkmaster.akamas.io
  username: hadoop
  key: ssh_key

  master: yarn
  deployMode: client
  className: org.apache.spark.examples.SparkPi
  file: /usr/lib/spark/examples/jars/spark-examples.jar
  args: [ 1000 ]

  sparkApplication: 'true'

To create the component in the system run the following command:

akamas create component sparkPi.yaml spark

Workflow

The workflow used for this study contains only a single stage, where the operator submits the application along with the Spark parameters under test.

Here’s the definition of the workflow:

name: Run SparkPi
tasks:
- name: run application
  operator: SSHSparkSubmit
  arguments:
    component: sparkPi
    retries: 0

To create the workflow run the following command:

akamas create workflow workflow.yaml

Telemetry

If you have not installed the Spark History Server telemetry provider yet, take a look at the Spark History Server Provider page to proceed with the installation.

Here’s the definition of the telemetry instance, specifying the History Server endpoint:

provider: SparkHistoryServer
config:
  address: sparkmaster.akamas.io
  port: 18080

  importLevel: job

To create the telemetry instance in the system run the following command:

akamas create telemetry-instance telemetry.yaml spark

This telemetry instance will be able to bind the fetched metrics to the related sparkPi component thanks to the sparkApplication attribute we previously added in its definition.

Study

The goal of this study is to find a Spark configuration that minimizes the execution time for the example application.

To achieve this goal we’ll operate on the number of executor processes available to run the application job, and on the memory and CPUs allocated for both the driver and the executors. The domains are configured so that a single driver/executor process does not exceed the size of the underlying instance, and the constraints ensure that the application as a whole does not require more resources than the ones available in the cluster, also taking into account that some resources must be reserved for other services such as the cluster manager.

Note that this study uses two constraints on the total amount of resources used by the Spark application. This example refers to a cluster of three nodes with 16 cores and 64 GB of memory each; with at least one core per instance reserved for the system, at most 15 cores per node, i.e. 15*3 = 45 cores overall, can be allocated to the application.

Here’s the definition of the study:

name: Speedup SparkPi execution
system: spark
workflow: Run SparkPi

goal:
  objective: minimize
  function:
    formula: sparkPi.spark_application_duration

parametersSelection:
- name: sparkPi.driverCores
  domain: [1, 10]
- name: sparkPi.driverMemory
  domain: [32, 2048]
- name: sparkPi.executorCores
  domain: [1, 15]
- name: sparkPi.executorMemory
  domain: [32, 2048]
- name: sparkPi.numExecutors
  domain: [1, 45]

parameterConstraints:
- name: cap_total_allocated_cpus
  formula: (sparkPi.driverCores + sparkPi.executorCores*sparkPi.numExecutors) <= 15*3

- name: cap_total_allocated_memory
  formula: (sparkPi.driverMemory + sparkPi.executorMemory*sparkPi.numExecutors) <= 60*3

steps:
- name: baseline
  type: baseline

- name: tune
  type: optimize
  numberOfExperiments: 200
  maxFailedExperiments: 200

To create and run the study execute the following commands:

akamas create study study.yaml
akamas start study 'Speedup SparkPi execution'

Optimizing cost of a Kubernetes microservice while preserving SLOs in production

In this example, you will use Akamas live optimization to minimize the cost of a Kubernetes deployment, while preserving application performance and reliability requirements.

Prerequisites

In this example, you need:

  • an Akamas instance

  • a Kubernetes cluster, with a deployment to be optimized

  • the kubectl command installed in the Akamas instance, configured to access the target Kubernetes and with privileges to get and update the deployment configurations

  • a supported telemetry data source (e.g. Prometheus or Dynatrace) configured to collect metrics from the target Kubernetes cluster

Optimization setup

Optimization packs

This example leverages the following optimization packs:

  • Kubernetes

  • Web application

System

The system represents the Kubernetes deployment to be optimized (let's call it "frontend"). You can create a system.yaml manifest like this:

name: frontend
description: Kubernetes frontend deployment

Create the new system resource:

akamas create system system.yaml

The system will then have two components:

  • A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits

  • A Web Application component, which contains service-level metrics like throughput and response time

In this example, we assume the deployment to be optimized is called frontend, with a container named server, and is located within the boutique namespace. We also assume that Dynatrace is used as a telemetry provider.
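
You can quickly verify these assumptions from the Akamas instance with kubectl (names here match this example; adjust them to your environment):

kubectl -n boutique get deployment frontend -o jsonpath='{.spec.template.spec.containers[*].name}'

The output should include the server container.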

Kubernetes component

Create a component-container.yaml manifest like the following:

name: container
description: Kubernetes container, part of the frontend deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: frontend-*

Then run:

akamas create component component-container.yaml frontend

Now create a component-webapp.yaml manifest like the following:

name: webapp
description: The service related to the frontend deployment
componentType: Web Application
properties:
  dynatrace:
    id: <TELEMETRY_DYNATRACE_WEBAPP_ID>

Then run:

akamas create component component-webapp.yaml frontend

Workflow

The workflow in this example is composed of the following main steps:

  1. Update the Kubernetes deployment manifest with the parameters (CPU and memory limits) recommended by Akamas

  2. Apply the new parameters (kubectl apply)

  3. Wait for the rollout to complete

  4. Sleep for 30 minutes (observation interval)

Create a workflow.yaml manifest like the following:

name: frontend
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml.templ
      target:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml

  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl apply -f frontend.yaml

  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl rollout status --timeout=5m deployment/frontend -n boutique;

  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800

Then run:

akamas create workflow workflow.yaml
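
The configure task expects a frontend.yaml.templ deployment template in which the FileConfigurator substitutes the tuned parameters. A minimal sketch of the relevant fragment, assuming requests = limits as stated below and that cpu_limit and memory_limit are expressed in millicores and MiB (the surrounding deployment fields are omitted):

        resources:
          requests:
            cpu: ${container.cpu_limit}m
            memory: ${container.memory_limit}Mi
          limits:
            cpu: ${container.cpu_limit}m
            memory: ${container.memory_limit}Mi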

Telemetry

Create the telemetry.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>
  pushEvents: false

Then run:

akamas create telemetry-instance telemetry.yaml frontend

Study

In this live optimization:

  • the goal is to reduce the cost of the Kubernetes deployment. In this example, the cost is based on the amount of CPU and memory limits (assuming requests = limits); see the worked example after this list.

  • the approval mode is set to manual; a new recommendation is generated daily

  • to avoid impacting application performance, constraints are specified on desired response times and error rates

  • to avoid impacting application reliability, constraints are specified on peak resource usage and out-of-memory kills

  • the parameters to be tuned are the container CPU and memory limits (we assume requests=limits in the deployment file)
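
As a worked example of the goal formula (the factor 3 encodes this study's assumed price ratio between one CPU core and one GB of memory, with container_cpu_limit reported in millicores): at the baseline values of cpu_limit = 1000 and memory_limit = 1536, i.e. 1 core and 1.5 GB, the cost evaluates to (1000/1000)*3 + 1.5 = 4.5 cost units, which the optimization then tries to drive down.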

Create a study.yaml manifest like the following:

name: frontend
system: frontend
workflow: frontend
requireApproval: true

goal:
  objective: minimize
  function:
    formula: (((container.container_cpu_limit/1000) * 3) + (container.container_memory_limit/(1024*1024*1024)))
  constraints:
    absolute:
      - name: Response Time
        formula: webapp.requests_response_time <= 300
      - name: Error Rate
        formula: webapp.service_error_rate:max <= 0.05
      - name: Container CPU saturation
        formula: container.container_cpu_util:p95 < 0.8
      - name: Container memory saturation
        formula: container.container_memory_util:max < 0.7
      - name: Container out-of-memory kills
        formula: container.container_oom_kills_count == 0

parametersSelection:
  - name: container.cpu_limit
    domain: [300, 1000]
  - name: container.memory_limit
    domain: [800, 1536]

windowing:
  type: trim
  trim: [5m, 0m]
  task: observe

workloadsSelection:
  - name: webapp.requests_throughput

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 48
    values:
      container.cpu_limit: 1000
      container.memory_limit: 1536

  - name: optimize
    type: optimize
    numberOfTrials: 48
    numberOfExperiments: 100
    numberOfInitExperiments: 0
    maxFailedExperiments: 50

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI for Live optimizations.

Optimizing cost of a Java microservice on Kubernetes while preserving SLOs in production

In this example, you will use Akamas live optimization to minimize the cost of a Kubernetes deployment, while preserving application performance and reliability requirements.

Prerequisites

In this example, you need:

  • an Akamas instance

  • a Kubernetes cluster, with a deployment to be optimized

  • the kubectl command installed in the Akamas instance, configured to access the target Kubernetes and with privileges to get and update the deployment configurations

  • a supported telemetry data source (e.g. Prometheus or Dynatrace) configured to collect metrics from the target Kubernetes cluster

Optimization setup

Optimization packs

This example leverages the following optimization packs:

  • Kubernetes

  • Web application

System

The system represents the Kubernetes deployment to be optimized (let's call it "frontend"). You can create a system.yaml manifest like this:

name: frontend
description: Kubernetes frontend deployment

Create the new system resource:

akamas create system system.yaml

The system will then have two components:

  • A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits

  • A Web Application component, which contains service-level metrics like throughput and response time

In this example, we assume the deployment to be optimized is called frontend, with a container named server, and is located within the boutique namespace. We also assume that Dynatrace is used as a telemetry provider.

Kubernetes component

Create a component-container.yaml manifest like the following:

name: container
description: Kubernetes container, part of the frontend deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: frontend-*

Then run:

akamas create component component-container.yaml frontend

Now create a component-webapp.yaml manifest like the following:

name: webapp
description: The service related to the frontend deployment
componentType: Web Application
properties:
  dynatrace:
    id: <TELEMETRY_DYNATRACE_WEBAPP_ID>

Then run:

akamas create component component-webapp.yaml frontend

Workflow

The workflow in this example is composed of the following main steps:

  1. Update the Kubernetes deployment manifest with the parameters (CPU and memory limits) recommended by Akamas

  2. Apply the new parameters (kubectl apply)

  3. Wait for the rollout to complete

  4. Sleep for 30 minutes (observation interval)

Create a workflow.yaml manifest like the following:

name: frontend
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml.templ
      target:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml

  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl apply -f frontend.yaml

  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl rollout status --timeout=5m deployment/frontend -n boutique;

  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800

Then run:

akamas create workflow workflow.yaml

Telemetry

Create the telemetry.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>
  pushEvents: false

Then run:

akamas create telemetry-instance telemetry.yaml frontend

Study

In this live optimization:

  • the goal is to reduce the cost of the Kubernetes deployment. In this example, the cost is based on the amount of CPU and memory limits (assuming requests = limits).

  • the approval mode is set to manual; a new recommendation is generated daily

  • to avoid impacting application performance, constraints are specified on desired response times and error rates

  • to avoid impacting application reliability, constraints are specified on peak resource usage and out-of-memory kills

  • the parameters to be tuned are the container CPU and memory limits (we assume requests=limits in the deployment file)
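
Since this page covers a Java microservice, the same study is typically extended to tune JVM parameters alongside the container limits. A minimal sketch of additional parametersSelection entries, assuming the Java OpenJDK optimization pack and a jvm component have been added to the system (component name and domains are illustrative, not part of this example):

  - name: jvm.jvm_maxHeapSize
    domain: [512, 1024]
  - name: jvm.jvm_gcType

Keeping the maximum heap size below the container memory limit helps avoid out-of-memory kills.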

Create a study.yaml manifest like the following:

name: frontend
system: frontend
workflow: frontend
requireApproval: true

goal:
  objective: minimize
  function:
    formula: (((container.container_cpu_limit/1000) * 3) + (container.container_memory_limit/(1024*1024*1024)))
  constraints:
    absolute:
      - name: Response Time
        formula: webapp.requests_response_time <= 300
      - name: Error Rate
        formula: webapp.service_error_rate:max <= 0.05
      - name: Container CPU saturation
        formula: container.container_cpu_util:p95 < 0.8
      - name: Container memory saturation
        formula: container.container_memory_util:max < 0.7
      - name: Container out-of-memory kills
        formula: container.container_oom_kills_count == 0

parametersSelection:
  - name: container.cpu_limit
    domain: [300, 1000]
  - name: container.memory_limit
    domain: [800, 1536]

windowing:
  type: trim
  trim: [5m, 0m]
  task: observe

workloadsSelection:
  - name: webapp.requests_throughput

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 48
    values:
      container.cpu_limit: 1000
      container.memory_limit: 1536

  - name: optimize
    type: optimize
    numberOfTrials: 48
    numberOfExperiments: 100
    numberOfInitExperiments: 0
    maxFailedExperiments: 50

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI for Live optimizations.


CSV provider

The CSV provider collects metrics from CSV files and makes them available to Akamas. It offers a very versatile way to integrate custom data sources.

The Install CSV provider page describes how to get this Telemetry Provider installed. Once installed, this provider is shared with all users of your Akamas installation and can be used to monitor many different systems, by configuring appropriate telemetry provider instances as described in the Create CSV telemetry instances page.

Prerequisites

This section provides the minimum requirements that you should match before using the CSV File telemetry provider.

Network requirements

The following requirements should be met to enable the provider to gather CSV files from remote hosts:

  • Port 22 (or a custom one) should be open from the Akamas installation to the host where the files reside.

  • The host where the files reside should support SCP or SFTP protocols.

Permissions

  • Read access to the CSV files target of the integration

Akamas supported version

  • Versions < 2.0.0 are compatible with Akamas up to version 1.8.0

  • Versions >= 2.0.0 are compatible with Akamas from version 1.9.0

Supported component types

The CSV File provider is generic and allows integration with any data source, therefore it does not come with support for a specific component type.

Setup the data source

To operate properly, the CSV file provider expects the presence of four fields in each processed CSV file:

  • A timestamp field used to identify the point in time a certain sample refers to.

  • A component field used to identify the Akamas entity.

  • A metric field used to identify the name of the metric.

  • A value field used to store the actual value of the metric.

These fields can have custom names in the CSV file; you can specify the actual names in the provider configuration.

Integrating

Akamas provides the following areas of integration with your ecosystem, which may apply or not depending on whether you are running live optimization studies or offline optimization studies:

  • Telemetry Providers: tools providing time series for metrics of interest for the system to be optimized (see also Telemetry Providers) - this integration applies to both offline and live optimization studies;

  • Configuration Management tools providing the ability to set tunable parameters for the system to be optimized - this integration applies to both offline and live optimization studies;

  • Value Stream Delivery tools to implement a continuous optimization process as part of a CI/CD pipeline - this integration applies to both offline and live optimization studies;

  • Load Testing tools used to reproduce a synthetic workload on the system to be optimized; notice that these tools may also act as Telemetry Providers (e.g. for end-user metrics) - this integration only applies to offline optimization studies.

These integrations may require some setup on both the tool and the Akamas side and may also involve defining workflows and making use of workflow operators.

Install CSV provider

To install the CSV File provider, create a YAML file (called provider.yml in this example) with the specification of the provider:

# CSV File Telemetry Provider
name: CSV File
description: Telemetry Provider that enables to import of metrics from a remote CSV file
dockerImage: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/telemetry-providers/csv-file-provider:3.2.0

Then you can install the provider with the Akamas CLI:

akamas install telemetry-provider provider.yml

Integrating Telemetry Providers

Akamas supports the integration with virtually any telemetry and observability tool.

Supported Telemetry Providers

The following are the supported Telemetry Providers, which are created automatically at installation time:

  • CSV provider - collects metrics from CSV files

  • Dynatrace - collects metrics from Dynatrace

  • Prometheus - collects metrics from Prometheus

  • Spark History Server - collects metrics from Spark History Server

  • NeoloadWeb - collects metrics from Tricentis Neoload Web

  • Load Runner Professional - collects metrics from MicroFocus Load Runner Professional

  • Load Runner Enterprise - collects metrics from MicroFocus Load Runner Enterprise

  • AWS - collects price metrics for Amazon Elastic Compute Cloud (EC2) from Amazon’s own APIs

Notice that Telemetry Providers are shared across all the workspaces within the same Akamas installation, and only users with administrative privileges can manage them.

Create CSV telemetry instances

To create an instance of the CSV provider, build a YAML file (instance.yml in this example) with the definition of the instance:

# CSV Telemetry Provider Instance
provider: CSV File
config:
  address: host1.example.com
  authType: password
  username: akamas
  auth: akamas
  remoteFilePattern: /monitoring/result-*.csv
  componentColumn: COMPONENT
  timestampColumn: TS
  timestampFormat: YYYY-MM-dd'T'HH:mm:ss
metrics:
  - metric: cpu_util
    datasourceMetric: user%

Then you can create the instance for the system using the Akamas CLI:

akamas create telemetry-instance instance.yml system

timestampFormat format

Notice that the week-year format YYYY is compliant with the ISO-8601 specification, but you should replace it with the year-of-era format yyyy if you are specifying a timestampFormat different from the ISO one. For example:

  • Correct: yyyy-MM-dd HH:mm:ss

  • Wrong: YYYY-MM-dd HH:mm:ss

You can find detailed information on timestamp patterns in the Patterns for Formatting and Parsing section of the DateTimeFormatter (Java Platform SE 8) page.

Configuration options

When you create an instance of the CSV provider, you should specify some configuration information to allow the provider to correctly extract and process metrics from your CSV files.

You can specify configuration information within the config part of the YAML of the instance definition.

Required properties

  • address - a URL or IP identifying the address of the host where CSV files reside

  • username - the username used when connecting to the host

  • authType - the type of authentication to use when connecting to the file host; either password or key

  • auth - the authentication credential; either a password or a key according to authType. When using keys, the value can be either the key itself or the path of the key file to import (see the sketch after this list)

  • remoteFilePattern - a list of remote files to be imported
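
For key-based authentication, the key can be provided inline as a YAML multi-line string, keeping the new lines. A minimal sketch of the relevant config fragment (host name and key body are placeholders):

config:
  address: host1.example.com
  username: akamas
  authType: key
  auth: |
    -----BEGIN RSA PRIVATE KEY-----
    ...
    -----END RSA PRIVATE KEY-----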

Optional properties

  • protocol - the protocol to use to retrieve files; either scp or sftp. Default is scp

  • fieldSeparator - the character used as a field separator in the CSV files. Default is ,

  • componentColumn - the header of the column containing the name of the component. Default is COMPONENT

  • timestampColumn - the header of the column containing the timestamp. Default is TS

  • timestampFormat - the format of the timestamp (e.g. yyyy-MM-dd HH:mm:ss zzz). Default is YYYY-MM-ddTHH:mm:ss

You should also specify the mapping between the metrics available in your CSV files and those provided by Akamas. This can be done in the metrics section of the telemetry instance configuration. To map a custom metric you should specify at least the following properties:

  • metric - the name of a metric in Akamas

  • datasourceMetric - the header of a column that contains the metric in the CSV file

The provider ignores any column not present as datasourceMetric in this section.

The sample configuration reported in this section would import the metric cpu_util from CSV files formatted as in the example below:

TS,                   COMPONENT,  user%
2020-04-17T09:46:30,  host,       20
2020-04-17T09:46:35,  host,       23
2020-04-17T09:46:40,  host,       32
2020-04-17T09:46:45,  host,       21

Telemetry instance reference

The following represents the complete configuration reference for the telemetry provider instance:

provider: CSV File             # this is an instance of the CSV provider
config:
  address: host1.example.com   # the address of the host with the CSV files
  port: 22                     # the port used to connect
  authType: password           # the authentication method
  username: akamas             # the username used to connect
  auth: akamas                 # the authentication credential
  protocol: scp                # the protocol used to retrieve the file
  fieldSeparator: ","          # the character used as field separator in the CSV files
  remoteFilePattern: /monitoring/result-*.csv    # the path of the CSV files to import
  componentColumn: COMPONENT                     # the header of the column with component names
  timestampColumn: TS                            # the header of the column with the time stamp
  timestampFormat: YYYY-MM-ddTHH:mm:ss           # the format of the timestamp
metrics:
  - metric: cpu_util                             # the name of the Akamas metric
    datasourceMetric: user%                      # the header of the column with the original metric
    staticLabels:
      mode: user                                 # (optional) additional labels to add to the metric

The following is the configuration reference for the config section:

  • address (String, required) - The address of the machine where the CSV file resides. Must be a valid URL or IP.

  • port (Integer, optional, default: 22) - The port to connect to in order to retrieve the file. Must be between 1 and 65536.

  • username (String, required) - The username to use in order to connect to the remote machine.

  • protocol (String, optional, default: scp) - The protocol used to connect to the remote machine: either scp or sftp.

  • authType (String, required) - Specifies which method is used to authenticate against the remote machine: password uses the value of the parameter auth as a password, while key uses it as a private key (supported formats are RSA and DSA).

  • auth (String, required) - A password or an RSA/DSA key (as a YAML multi-line string, keeping new lines).

  • remoteFilePattern (String, required) - The path of the remote file(s) to be analyzed; must be a valid Linux path and can contain GLOB expressions.

  • componentColumn (String, required, default: COMPONENT) - The CSV column containing the name of the component. The column must exist in the CSV file and its values must match (case sensitive) the name of a component specified in the System.

  • timestampColumn (String, optional, default: TS) - The CSV column containing the timestamps of the samples. The column must exist in the CSV file.

  • timestampFormat (String, optional, default: YYYY-MM-ddTHH:mm:ss) - The format of the timestamps. Must be specified using Java syntax.

  • fieldSeparator (String, optional, default: ,) - The field separator of the CSV; either , or ;.

The following is the configuration reference for the metrics section:

  • metric (String, required) - The name of the metric in Akamas. Must be an existing Akamas metric.

  • datasourceMetric (String, required) - The name (header) of the column that contains the specific metric. Must be an existing column in the CSV file.

  • scale (Decimal number, optional) - The scale factor to apply when importing the metric.

  • staticLabels (List of key-value pairs, optional) - A list of key-value pairs that will be attached to the specific metric sample.

Use cases

Here you can find common use cases addressed by this provider.

Linux SAR

In this use case, you are going to import some metrics coming from SAR, a popular UNIX tool to monitor system resources. SAR can export CSV files in the following format:

hostname, interval,     timestamp,                  %user,  %system,  %memory
machine1, 600,          2018-08-07 06:45:01 UTC,    30.01,  20.77,    96.21
machine1, 600,          2018-08-07 06:55:01 UTC,    40.07,  13.00,    84.55
machine1, 600,          2018-08-07 07:05:01 UTC,    5.00,   90.55,    89.23

Note that the metrics are percentages (between 1 and 100), while Akamas accepts percentages as values between 0 and 1; therefore each metric in this configuration has a scale factor of 0.01.

You can import the two CPU metrics and the memory metric from a SAR log using the following telemetry instance configuration:

provider: CSV File
config:
  remoteFilePattern: /csv/sar.csv
  address: 127.0.0.1
  port: 22
  username: user123
  auth: password123
  authType: password
  protocol: scp
  componentColumn: hostname
  timestampColumn: timestamp
  timestampFormat: yyyy-MM-dd HH:mm:ss zzz
metrics:
  - metric: cpu_util
    datasourceMetric: %user
    scale: 0.01
    staticLabels:
      mode: user
  - metric: cpu_util
    datasourceMetric: %system
    scale: 0.01
    staticLabels:
      mode: system
  - metric: mem_util
    scale: 0.01
    datasourceMetric: %memory

Using the configured instance, the CSV File provider will perform the following operations to import the metrics:

  1. Retrieve the file "/csv/sar.csv" from the server "127.0.0.1" using the SCP protocol authenticating with the provided password.

  2. Use the column hostname to look up components by name.

  3. Use the column timestamp to find the timestamps of the samples (that are expected to be in the format specified by timestampFormat).

  4. Collect the metrics (two with the same name, but different labels, and one with a different name):

    • cpu_util: read from the column %user, attaching to its samples the label "mode" with value "user".

    • cpu_util: read from the column %system, attaching to its samples the label "mode" with value "system".

    • mem_util: read from the column %memory.
