
3.4.0


Introduction

A quick introduction to Akamas

Akamas is the AI-powered optimization platform designed to maximize service quality and cost efficiency without compromising on application performance. Akamas supports optimization both in production environments, under live and dynamic workloads, and in test/pre-production environments, against any what-if scenario and workload.

Thanks to Akamas, performance engineers, DevOps, CloudOps, FinOps and SRE teams can keep complex applications, such as Kubernetes microservices applications, optimized to avoid any unnecessary cost and any performance risks.

Akamas Optimization platform

The Akamas optimization platform leverages patented AI techniques that can autonomously identify optimal full-stack configurations driven by any custom-defined goals and constraints (SLOs), without any human intervention, any agents, and any code or byte-code changes.

Akamas optimal configurations can be applied either i) under human approval (human-in-the-loop mode) or ii) automatically, as a continuous optimization step in a CI/CD pipeline (in-the-pipe) or iii) autonomously by Akamas (autopilot).

Akamas coverage

Akamas can optimize any system with respect to any set of parameters chosen from the application, middleware, database, cloud, and any other underlying layers.

Akamas provides dozens of out-of-the-box Optimization Packs available for key technologies such as JVM, Go, Kubernetes, Docker, Oracle, MongoDB, ElasticSearch, PostgreSQL, Spark, AWS EC2 and Lambda, and more. Each Optimization Pack provides parameters, relationships, and metrics to accelerate the optimization process setup and support company-wide best practices. Custom Optimization Packs can be easily created without any coding.

The following figure is illustrative of Akamas coverage for both managed technologies and integrated components of the ecosystem.

Akamas integrations

Akamas can integrate with any ecosystem thanks to out-of-the-box and custom integrations with the following components:

  • telemetry & monitoring tools and other sources of KPIs and cost data, such as Dynatrace, Prometheus, CloudWatch, and CSV files

  • configuration management tools, repositories and interfaces to apply configurations, such as Ansible, Openshift, and Git

  • value stream delivery tools to support a continuous optimization process, such as Jenkins, Dynatrace Cloud Automation, and GitLab

  • load testing tools to generate simulated workloads in test/pre-production, such as LoadRunner, NeoLoad, and JMeter

Akamas has been designed around Infrastructure-as-Code (IaC) and DevOps principles. Thanks to a comprehensive set of APIs and integration mechanisms, it is possible to extend the Akamas optimization platform to manage any system and integrate with any ecosystem.

Use Cases

Akamas optimization platform supports a variety of use cases, including:

  • Improve Service Quality: optimize application performance (e.g. maximize throughput, minimize response time and job execution time) and stability (lower fluctuations and peaks);

  • Increase Business Agility: identify resource bottlenecks in early stages of the delivery cycle, avoid delays due to manual remediations - release higher quality services and reduce production incidents;

  • Increase Service Resilience: improve service resilience under higher workloads (e.g. expected business growth) or failure scenarios identified by chaos engineering practices - improve SRE practice;

  • Reduce IT Cost / Cloud Bill: reduce on-premise infrastructure cost and cloud bills due to resource over-provisioning - improve cost efficiency of Kubernetes microservices applications;

  • Optimize Cloud Migration: safely migrate on-premise applications to cloud environments for optimal cost efficiency and evaluate options to migrate to managed services (e.g. AWS Fargate);

  • Improve Operational Efficiency: save engineering time spent on manual tuning tasks and enable Performance Engineering teams to do more in less time (and with less external consulting).

Getting started

This guide introduces Akamas and covers various fundamental topics such as licensing and deployment models, security topics, and maintenance & support services.

It is recommended to read this guide before moving to other guides on how to install, integrate, and use Akamas. The Glossary section of the Reference guide can help in reviewing Akamas key concepts.

Licensing

Software Licenses

Akamas software licensing model is subscription-based (typically on a yearly basis). For more information on Akamas' cost model and software licensing costs, please contact info@akamas.io.

Maintenance & Support Services

Akamas software licenses include Maintenance & Support Services, which also include access to Customer Support Services.

Other billable services

Akamas also provides optional professional services for deployment, training, and integration activities. For more information about Akamas professional services, please contact info@akamas.io.

Security

Akamas takes security seriously and provides enterprise-grade software where customer data is kept safe at all times. This page describes some of the most important security aspects of Akamas software and information related to processes and tools used by the Akamas company (Akamas S.p.A) to develop its software products.

Information managed by Akamas

Akamas manages the following types of information:

  • System configuration and performance metrics: technical data related to optimized systems. Examples of such data include the number of CPUs available in a virtual machine or the memory usage of a Java application server;

  • User accounts: accounts assigned to users to securely access the Akamas platform. For each user account, Akamas currently requires an account name and a password. Akamas does not collect any other personal identifying information;

  • Service Credentials: credentials used by Akamas to automate manual tasks and to integrate with external tools. In particular, Akamas leverages the following types of interaction:

    • Integration with monitoring and orchestration tools, e.g., collecting IT performance metrics and system configuration. As a best practice, Akamas recommends using dedicated service accounts with minimal read-only privileges.

    • Integration with the target systems to apply changes to configuration parameters. As a best practice, Akamas recommends using dedicated service accounts with minimal privileges to read/write identified parameters.

GDPR Compliance

Akamas is a fully GDPR-compliant product.

Akamas is a company owned by the Moviri Group. The Moviri Group and all its companies are fully compliant with GDPR. Moviri Group Data Privacy Policy and Data Breach Incident Response Plan which apply to all the owned companies can be requested from Akamas Customer Support.

Security certifications

Akamas is an on-premises product and does not transmit any data outside the customer network. Considering the kind of data that is managed within Akamas (see section "Information managed by Akamas"), specific security certifications like PCI or HIPAA are not required as the platform does not manage payment or health-related information.

Data encryption

Akamas takes the need for security seriously and understands the importance of encrypting data to keep it safe at rest and in-flight.

In-Flight encryption

All the communications between Akamas UI and CLI and the back-end services are encrypted via HTTPS. The customer can configure Akamas to use customer-provided SSL certificates in all communications.

Communications between Akamas services and other integrated tools within the customer network rely on the security configuration requirements of the integrated tool (e.g.: HTTPS calls to interact with REST services).

At-Rest encryption

Akamas is an on-premises product and runs on dedicated virtual machines within the customer environment. At-rest encryption can be achieved following customer policies and best practices, for example, leveraging operating system-level techniques.

Akamas also provides an application-level encryption layer aimed at extending the scope of at-rest encryption. With this increased level of security, sensitive data managed by Akamas (e.g. passwords, tokens, or keys required to interact with external systems) are safely stored in Akamas databases using industry-standard AES 256-bit encryption.

Encryption option for Akamas on EC2

When Akamas is hosted on an AWS machine, you may optionally create an EC2 instance with an encrypted EBS volume before installing the OS and Akamas, to achieve a higher level of security.

Password management

Password Security

Passwords are securely stored using a one-way hash algorithm.
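
The documentation does not name the hash algorithm Akamas uses. As a minimal sketch of what one-way password storage generally looks like, the following Python example uses PBKDF2-HMAC-SHA256 from the standard library. The function names, salt size, and iteration count are illustrative assumptions, not Akamas internals.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor, chosen for illustration only.
ITERATIONS = 200_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for every new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the candidate password and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Because the derivation cannot be reversed, a login check re-computes the digest from the submitted password and the stored salt rather than decrypting anything.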

Password complexity

Akamas comes with a default password policy with the following requirements:

  • has a minimum length of 8 characters.

  • contains at least 1 uppercase and 1 lowercase character.

  • contains at least 1 special character.

  • is different from the username.

  • must be different from the last password set.

Customers can modify this policy by providing a custom one that matches their internal security policies.
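
The default policy above can be sketched as a small validator. This is an illustration under stated assumptions, not Akamas code: the function name is hypothetical, "special character" is assumed to mean ASCII punctuation, and the last password is compared in plain text only to keep the example short (a real system that stores hashes would compare hashes instead).

```python
import string
from typing import List, Optional


def password_violations(password: str, username: str,
                        last_password: Optional[str] = None) -> List[str]:
    """Return the list of default-policy rules that the password breaks."""
    violations = []
    if len(password) < 8:
        violations.append("shorter than 8 characters")
    if not any(c.isupper() for c in password):
        violations.append("no uppercase character")
    if not any(c.islower() for c in password):
        violations.append("no lowercase character")
    # Assumption: "special" means any ASCII punctuation character.
    if not any(c in string.punctuation for c in password):
        violations.append("no special character")
    if password == username:
        violations.append("same as username")
    if last_password is not None and password == last_password:
        violations.append("same as last password")
    return violations
```

An empty list means the password satisfies every rule; for example, `password_violations("Abcdef1!", "alice")` returns `[]`.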

Password rotation

Akamas enforces no password rotation mechanism.

Credential storage

  • When running on a Linux installation with KDE's KWallet enabled or GNOME's Keyring enabled, the credentials will be stored in the default wallet/keyring.

  • When running on Windows, the credentials will be stored in Windows Credential Locker.

  • When running on macOS, the credentials will be stored in Keychain.

  • When running on a Linux headless installation, the credentials will be stored in CLEAR TEXT in a file in the current Akamas configuration folder.

Resources visibility model

Akamas provides fine granularity control over resources managed within the platform. In particular, Akamas features two kinds of resources:

  • Workspace resources: entities bound to one of the isolated virtual environments (named workspaces) that can only be accessed in reading or writing mode by users to whom the administrators explicitly granted the required privileges. Such resources typically include sensitive data (e.g.: passwords, API tokens). Examples of such resources include the system to be optimized, the set of configurations, optimization studies, etc.

  • Shared resources: entities that can be installed and updated by administrators and are available to all Akamas users. Such resources only contain technology-related information (e.g.: the set of performance metrics for a Java application server). Examples of such resources include Optimization Packs, which are libraries of technology components that Akamas can optimize, such as a Java application server.

Akamas Logs

Akamas logs traffic from UI and APIs. Application level logs include user access via APIs and UI and any action taken by Akamas on integrated systems.

Akamas' logs are retained on the dedicated virtual machine within the customer environment, by default, for 7 days. The retention period can be configured according to customer policies. Logs can be accessed either via UI or via log dump within the retention period. Additionally, logs have a format that can be easily integrated with external systems like log engines and SIEM to support forensic analysis.

Code scanning policy

Akamas is developed according to security best practices and the code is scanned regularly (at least daily).

The Akamas development process leverages modern continuous integration approaches and the development pipeline includes SonarQube, a leading security scanning product that includes comprehensive support for established security standards including CWE, SANS, and OWASP. Code scanning is automatically triggered in case of a new build, a release, and every night.

Vulnerability scanning and patch management policy

Akamas features modern micro-service architecture and is delivered as a set of docker containers whose images are hosted on a private Elastic Container Registry (ECR) repository on the AWS cloud. Akamas leverages the vulnerability scanning capabilities of AWS ECR to identify vulnerabilities within the product container images. AWS ECR uses the Common Vulnerabilities and Exposures (CVEs) database from the open-source Clair project.

If a vulnerability is detected, Akamas will assess the security risk in terms of the impact of the vulnerability and evaluate the steps (e.g. dependency updates) required to fix it, within a timeline that depends on the outcome of the assessment.

After the assessment, the vulnerability can be fixed by either recommending the upgrade to a new product version or delivering a patch or a hotfix for the current version.

Deployment

Akamas is an on-premise product running on a dedicated machine within the customer environment:

  • on a virtual or physical machine in your data center

  • on a virtual machine running in the cloud, managed by any cloud provider (e.g. AWS EC2)

  • on your own laptop

Akamas also provides a Free Trial option, which can be requested here.

Akamas high-level architecture
Akamas optimizes any system and integrates with any ecosystem
Akamas key use cases

Customer Support Services

Akamas Customer Support Services are delivered by Akamas support engineers, also called Support Agents, who will work remotely with Customer to provide a temporary remedy for the incident and, ultimately, a permanent resolution. Akamas Support Agents automatically escalate issues to the appropriate technical group within Akamas and notify Customers of any relevant progress. Akamas provides Customers with the ability to escalate issues when appropriate.

Please note that Customer Support services are not to be considered as alternatives to product documentation and training, or to professional and consulting services, so adequate knowledge of Akamas products is assumed when interacting with Akamas Customer Support. Thus, during the resolution of a reported issue, Support Agents may redirect Customer to training or professional services (which are not part of the scope of this service).

Free Trial

Akamas offers a Free Trial option to quickly understand Akamas concepts and capabilities and experience the power of its AI-based optimization platform.

You can join Akamas Free Trial quickly:

  1. Fill out this form on the Akamas website;

  2. Receive credentials to access your dedicated Akamas server (a cloud instance on AWS EC2) - optionally you can also download & install the Akamas CLI and learn how to fully automate the optimization process;

  3. Explore already executed optimization studies or create & run new studies to optimize a microservice app at both the JVM runtime and Kubernetes level - here you can take advantage of Akamas Quick Guides.

What you will get:

  • Understand the Akamas methodology

  • See Akamas AI-powered optimization in action

  • Learn to use Akamas by following the how-to guides

  • Familiarize yourself with Akamas UI and CLI

  • Experience the benefits Akamas can deliver to your organization

Enjoy!

Home

Getting started with Akamas

  • provides a very first introduction to AI-powered optimization

  • covers Akamas licensing, deployment, security topics

  • describes Akamas maintenance and support services.

This guide provides some preliminary knowledge required to purchase, implement and use Akamas.

User personas: All roles

Installing Akamas

  • describes the Akamas architecture

  • provides the hardware, software and network prerequisites

  • describes the steps to install an Akamas Server and CLI

This guide provides the knowledge required to install and manage an Akamas installation.

User personas: Akamas Admin

Using Akamas

  • describes the Akamas optimization process and methodology

  • provides guidelines for optimizing some specific technologies

  • provides examples of optimization studies

This guide provides the methodology to define an optimization process and the knowledge to leverage Akamas.

User personas: Analyst / Practitioner teams

Integrating Akamas

  • describes how to integrate Akamas with the telemetry providers and configuration management tools

  • describes how to integrate Akamas with load testing tools

  • describes how to integrate Akamas with CI/CD tools

This guide provides the knowledge required to integrate Akamas with the ecosystem.

User personas: Akamas Admin, DevOps team

Akamas Reference

  • provides a glossary of Akamas key concepts with references to construct templates and commands

  • provides a reference to Akamas construct templates

  • provides a reference to Akamas command-line commands

  • describes Akamas optimization packs and telemetry providers

User personas: Akamas Admin, DevOps team, Analyst / Practitioner teams

Knowledge Base

  • describes how to set up a test environment for experimenting with Akamas

  • describes how to apply the Akamas approach to the optimization of some real-world cases

  • provides examples of Akamas templates and commands for the real-world cases

User personas: Analyst / Practitioner teams

Maintenance & Support (M&S) Services

This page is intended as a first introduction to Akamas Maintenance & Support (M&S) Services.

Please refer to the specific contract in place with your Company.

Akamas M&S Services include:

  • access to Software versions released as major and minor versions, service packs, patches, and hotfixes according to Support levels for software versions.

  • assistance from Akamas Customer Support for inquiries about the Akamas product and issues encountered while using Akamas products where there is a reasonable expectation that issues are caused by Akamas products, according to Support levels for Customer Support Services.

Akamas M&S Services do not include any installation and upgrade services, creation of any custom optimization packs, telemetry providers, or workflow operators, or implementation of any custom features and integrations that are not provided out-of-the-box by the Akamas products.

Cloud Hosting

Refer to your Cloud Provider website for information about cloud hosting options and related cost information.

AWS EC2

For AWS EC2 costs visit the EC2 Pricing page and use the AWS Pricing Calculator to estimate the cost for your architecture.

Architecture

Akamas is based on a microservices architecture where each service is deployed as a container and communicates with other services via REST APIs. Akamas can be deployed on a dedicated machine (Akamas Server) or on a Kubernetes cluster.

The following figure represents the high-level Akamas architecture.

Interact with Akamas

Users can interact with Akamas via the Graphical User Interface (GUI), the Command-Line Interface (CLI), or the Application Programming Interface (API).

Both the GUI and CLI leverage HTTP/S APIs which pass through an API gateway (based on Kong), which also takes care of authenticating users by interacting with Akamas access management and routing requests to the different services.

The Akamas CLI can be invoked on either the Akamas Server itself or on a different machine (e.g. a laptop or another server) where the Akamas CLI has been installed.

Repositories

Akamas data is securely stored in different databases:

  • time series data gathered from telemetry providers are stored in Elasticsearch;

  • application logs are also stored in Elasticsearch;

  • data related to systems, studies, workflows, and other user-provided data are stored in a Postgres database.

Note: Postgres, Elasticsearch, and any other service included within Akamas are provided by Akamas as part of the Akamas installation package.

Services

Core Services

The following Spring-based microservices represent Akamas core services:

  • System Service: holds information about metrics, parameters, and systems that are being optimized

  • Campaign Service: holds information about optimization studies, including configurations and experiments

  • Metrics Service: stores raw performance metrics (in Elasticsearch)

  • Analyzer Service: automates the analysis of load tests and provides related functionalities such as smart windowing

  • Telemetry Service: takes care of integrating different data sources by supporting multiple Telemetry Providers

  • Optimizer Service: combines different optimization engines to generate optimized configurations using ML techniques

  • Orchestrator Service: manages the execution of user-defined workflows to drive load tests

  • User Service: takes care of user management activities such as user creation or password changes

  • License Service: takes care of license management activities, optimization pack, and study export.

Ancillary Services

Akamas also provides advanced management features like logging, self-monitoring, licensing, user management, and more.

Support levels for software versions

Different levels of support are provided for each software version of Akamas products, starting from its general availability (GA) date and depending on the release of subsequent software versions.

Version Numbering

Akamas adopts a three-place numbering scheme MA.MI.SP to designate released versions of its Software, where:

  • MA is the Major Version

  • MI is the Minor Version

  • SP is the Service Pack or Patch number
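
As a minimal sketch of how the MA.MI.SP scheme can be handled programmatically, the following Python example parses a version string into its three components so that versions compare numerically rather than as text. The `Version` type and `parse_version` function are hypothetical helpers, not part of Akamas.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A released version in the MA.MI.SP scheme."""
    major: int         # MA
    minor: int         # MI
    service_pack: int  # SP


def parse_version(text: str) -> Version:
    """Parse a string such as '3.4.0' into a comparable Version tuple."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MA.MI.SP, got {text!r}")
    return Version(*(int(part) for part in parts))
```

Because `Version` is a tuple of integers, `parse_version("3.10.0") > parse_version("3.2.1")` holds, whereas a plain string comparison would order the two the other way round.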

Support levels

The following table describes the three levels of support for a software version.

Support level
Description

Full Support

Akamas provides full support for one previous (either major or minor) version in addition to the latest available GA version.

For Software versions in Full Support level: Akamas Support Agents provide service packs, patches, hotfixes, or workarounds to make the Software operate in substantial conformity with its then-current operating documentation.

Limited Support

Following the Full Support period, Akamas provides Limited Support for an additional 12 months.

For Software versions in Limited Support level:

  • No new enhancements will be made to a version in "Limited Support";

  • Akamas Support Agents will direct Customers to existing fixes, patches, or workarounds applicable to the reported case, if any;

  • Akamas Support Agents will provide hot fixes for problems of high technical impact or business exposure for customers;

  • Based on Customer input, Akamas Support Agents will determine the degree of impact and exposure and the consequent activities;

  • Akamas Support Agents will direct Customers to upgrade to a more current version of the Software.

No Support

Following the Limited Support period, Akamas provides no support for any Software version.

For Software versions in No Support level: No new maintenance releases, enhancements, patches, or hot fixes will be made available. Akamas Support Agents will direct Customers to upgrade to a more current version of the Software.
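
The 12-month Limited Support window is a simple calendar computation. As an illustrative sketch, the Python helper below adds a number of months to a date; `add_months` is a hypothetical name, and clamping to the last day of a shorter month is an assumption the source does not address. The sample date is inferred from the statement elsewhere in this guide that 2.x Limited Support ends on 2023 September, 13th, 12 months after the 3.0 GA date.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    # Assumption: clamp to the last day when the target month is shorter.
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# 3.0 GA date inferred from the 2.x end-of-Limited-Support date.
limited_support_end = add_months(date(2022, 9, 13), 12)
```

Here `limited_support_end` equals `date(2023, 9, 13)`, matching the date quoted for 2.x.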

End-of-Life (EOL)

At any time, Akamas reserves the right to "end of life" (EOL) a software product and to terminate any Maintenance & Support Services for such product, provided that Licensor has notified the Licensee at least 12 months prior to the above-mentioned termination.

During the period between the "end of life" notification and the actual termination of Maintenance & Support Services, support is provided as follows:

  • No new enhancements will be introduced.

  • No enhancements will be made to support new or updated versions of the platform on which the product runs or with which it integrates.

  • New hotfixes for problems of high technical impact or business exposure for customers may still be developed. Based on customer input, Akamas Support Agents will determine the degree of impact and exposure and the consequent activities.

  • Reasonable efforts will be made to inform the Customer of any fixes, service packs, patches, or workarounds applicable to the reported case, if any.

Support levels for Customer Support Services

Akamas Customer Support Services provides different standard levels of support. Please verify the level of support specified in the contract in place with your Company.

Severity levels

The following table describes the different severity levels for Customer Support.

Severity level
Description
Impact

S1

Blocking: production Customer system is severely impacted.

Notice: this severity level only applies to production environments

Catastrophic business impact, e.g. complete loss of a core business process such that work cannot reasonably continue (e.g. all final users are unable to access the Customer application)

S2

Critical: one major Akamas functionality is unavailable

Significant loss or degradation of the Akamas services (e.g. Akamas is down or Akamas is not generating recommendations)

S3

Severe: limitation in accessing one major Akamas functionality

Moderate business impact and moderate loss or degradation of services, but work can reasonably continue in an impaired manner (e.g. only some specific functions are not working properly)

S4

Informational: Any other request

Minimum business impact.

Substantially functioning with minor or no impediments of services.

Support conditions

The contract in place with the Customer specifies the level of support provided by Akamas Agents, according to at least the following items:

  • Maximum number of support seats: this is the maximum number of named users within the Customer organization who can request Akamas Customer Support.

  • Language(s): these are the languages that can be used for interacting with Akamas Support Agents - the default is English.

  • Channel(s): these are the different communication channels that can be used to interact with Akamas Agents - these may include one or more options among web ticketing, email, phone, and Slack channel.

  • Max Initial Response Time: this refers to the time interval between the time a request is opened by Customer with Customer Support and the time a Support Agent responds with a first notification (acknowledgment).

  • Severity: this is the level of severity associated with a reported issue, which initially corresponds to the severity level originally indicated by the Customer. Note that the severity level may change, for example as new information becomes available or if Support Agents and Customer agree to re-evaluate it. In particular, the severity level may be downgraded by Support Agents if Customer is not able to provide adequate resources or responses to enable Akamas to continue with its resolution efforts.

  • Initial Remedy: this refers to any operation aimed at addressing a reported issue by restoring a minimal level of operations, even if it may cause some performance degradation of the Customer service or operations. A workaround is to be considered a valid Initial Remedy.

Please note that Support Agents may refuse to serve a service request if Customer does not have a valid Maintenance & Support subscription or if the above-mentioned conditions or other conditions stated in the contract in place are not met. In any case, the Customer is expected to provide all the information required by Support Agents in order to serve service requests.

Prerequisites

Before installing the Akamas Server please make sure to review all the following requirements:

Hardware Requirements

Running in your data center

The following table provides the minimal hardware requirements for the virtual or physical machine used to install the Akamas server in your data center.

Resource     Requirement
CPU          4 cores @ 2 GHz
Memory       16 GB
Disk Space   70 GB
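
As a minimal sketch, the minimal requirements above can be checked on a Linux host before installing. The helper names are hypothetical; the sketch assumes "Disk Space" refers to the total size of the target filesystem, and it does not check CPU clock speed. Note also that a machine sold with 16 GB of memory may report slightly less to the operating system, so the threshold may need some tolerance in practice.

```python
import os
import shutil
from typing import List, Tuple

# Minimal requirements taken from the table above.
MIN_CPU_CORES = 4
MIN_MEMORY_GB = 16
MIN_DISK_GB = 70


def check_hardware(cpu_cores: int, memory_gb: float, disk_gb: float) -> List[str]:
    """Return a description of every requirement that is not met."""
    problems = []
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU: {cpu_cores} cores, need {MIN_CPU_CORES}")
    if memory_gb < MIN_MEMORY_GB:
        problems.append(f"Memory: {memory_gb:.1f} GB, need {MIN_MEMORY_GB}")
    if disk_gb < MIN_DISK_GB:
        problems.append(f"Disk: {disk_gb:.1f} GB, need {MIN_DISK_GB}")
    return problems


def detect_hardware(path: str = "/") -> Tuple[int, float, float]:
    """Read core count, physical memory, and disk size (Linux only)."""
    cores = os.cpu_count() or 0
    memory = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024 ** 3
    disk = shutil.disk_usage(path).total / 1024 ** 3
    return cores, memory, disk
```

Calling `check_hardware(*detect_hardware())` on the target machine returns an empty list when all three minimums are satisfied.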

Running on AWS EC2

As shown in the following diagram, you can create the Akamas instance in the same AWS region, Virtual Private Cloud (VPC), and private subnet as your existing EC2 machines, and create or configure a new security group that allows communication between your application instances and the Akamas instance. The inbound/outbound rules of this security group must be configured as explained in the Networking Requirements section of this page.

It is recommended to use an m6a.xlarge instance with at least 70 GB of disk space of type GP2 or GP3 and to select the latest LTS version of Ubuntu.

Supported AWS Regions

Akamas can be run in any EC2 region.

AWS Service Limits

Software Requirements

Operating System

The following table provides a list of the supported operating systems and their versions.

Operating System          Version
Ubuntu Linux              18.04+
CentOS                    7.6+
RedHat Enterprise Linux   7.6+

On RHEL systems Akamas containers might need to be run in privileged mode depending on how Docker was installed on the system.

Software packages

The following table provides a list of the required Software Packages (also referred to as Akamas dependencies) together with their versions.

Software Package

Notes

Docker

Akamas is deployed as a set of containerized services running on Docker. During its operation, Akamas launches different containers, so access to the Docker socket with sufficient permissions to run containers is required.

Docker Compose

Akamas containerized services are managed via Docker Compose. Docker Compose is usually shipped with Docker starting from version 23.

AWS CLI

Akamas container images are published in a private Amazon Elastic Container Registry (ECR) and are automatically downloaded during the online installation procedure.

AWS CLI is required only during the installation phase if the server has internet access and can be skipped during an offline installation.

The exact version of these prerequisites is listed in the following table:

Software Package   Ubuntu      CentOS      RHEL
Docker             20.10.10+   20.10.10+   20.10.10+
Docker Compose     2.7.0+      2.7.0+      2.7.0+
AWS CLI            2.0.0+      2.0.0+      2.0.0+
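
The minimum versions above can be verified against the version banner each tool prints. The following Python sketch is an illustration, not an Akamas utility: the helper names are hypothetical, and it assumes the first x.y.z number in the banner is the tool version, which holds for the sample banners shown in the comments.

```python
import re
from typing import Dict, Tuple

# Minimum versions from the table above (identical for Ubuntu, CentOS, RHEL).
MINIMUMS: Dict[str, Tuple[int, int, int]] = {
    "Docker": (20, 10, 10),
    "Docker Compose": (2, 7, 0),
    "AWS CLI": (2, 0, 0),
}


def extract_version(banner: str) -> Tuple[int, int, int]:
    """Return the first x.y.z number found in a version banner."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError(f"no version found in {banner!r}")
    major, minor, patch = (int(group) for group in match.groups())
    return major, minor, patch


def meets_minimum(package: str, banner: str) -> bool:
    """Compare numerically, so that 20.10.10 ranks above 20.10.9."""
    return extract_version(banner) >= MINIMUMS[package]


# Sample banners, e.g. as printed by `docker --version`:
#   "Docker version 20.10.10, build b485636"
#   "Docker Compose version v2.7.0"
#   "aws-cli/1.18.69 Python/3.8.10 Linux/5.4.0 botocore/1.16.19"
```

In practice the banners would come from running `docker --version`, `docker compose version`, and `aws --version` on the target server.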

Akamas user

To install and run Akamas it is recommended to create a dedicated user (usually "akamas"). The Akamas user is not required to be in the sudoers list but can be added to the docker (dockerroot) group so it can run docker and docker-compose commands.

Make sure that the Akamas user has the read, write, and execute permissions on /tmp. If your environment does not allow writing to the whole /tmp folder, please create a folder /tmp/build and assign read and write permission to the Akamas user on that folder.
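
A minimal sketch of this permission check follows, to be run as the Akamas user. The function names are hypothetical; the paths /tmp and /tmp/build come from the paragraph above. Note that `os.access` evaluates permissions for the user running the script and reports every path as accessible when run as root.

```python
import os


def has_rwx(path: str) -> bool:
    """True if the current user can read, write, and traverse the directory."""
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK | os.X_OK)


def has_rw(path: str) -> bool:
    """True if the current user can read and write the directory."""
    return os.path.isdir(path) and os.access(path, os.R_OK | os.W_OK)


def tmp_is_usable(tmp: str = "/tmp", build_dir: str = "/tmp/build") -> bool:
    """Either all of /tmp is usable, or the fallback /tmp/build folder is."""
    return has_rwx(tmp) or has_rw(build_dir)
```

If `tmp_is_usable()` returns `False`, the fallback folder still has to be created and granted to the Akamas user as described above.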

To run Akamas on an AWS Instance you need to create a new virtual machine based on one of the supported operating systems. You can refer to the AWS documentation for step-by-step instructions on creating the instance.

You can find the latest version supported for your preferred region here.

Before installing Akamas on an AWS Instance please make sure to meet your AWS service limits (please refer to the official AWS documentation here).

Read more about how to set up Akamas dependencies.

Support levels with Akamas

Based on the Support levels for software versions, the following table describes the level of support of the Akamas versions after the version 3.2 GA date (May 1st, 2023).

Version   Support Level
3.2       Full Support (Notice: this will change once the following major version is released)
3.1       Full Support (Notice: this will change once the following major version is released)
3.0       Full Support (Notice: this will change once the following major version is released)
2.x       Limited Support until 12 months after 3.0 GA date, that is September 13th, 2023 (see Support Levels with Akamas 3.0)
1.x       No Support

Docker compose installation

This section describes how to install Akamas on Docker.

Please make sure to read the Getting Started section before installing Akamas.

Preliminary steps

Before installing Akamas, please follow these steps:

  1. Review hardware, software, and network prerequisites

  2. Install all Akamas dependencies

Installation steps

Please follow these steps to install the Akamas Server:

  1. Install the Akamas Server

  2. Install the Akamas CLI

  3. Verify the Akamas Server

  4. Install an Akamas license

Please also read the section on how to troubleshoot the installation and how to manage the Akamas Server. Finally, read the relevant sections of Integrating Akamas to integrate Akamas into your specific ecosystem.

Install Akamas dependencies

This page will guide you through the installation of software components that are required to get the Akamas Server installed on a machine. Please read the Akamas dependencies section for a detailed list of these software components for each specific OS.

While some links to official documentation and installation resources are provided here, please make sure to refer to your internal system engineering department to ensure that your company deployment processes and best practices are correctly matched.

Dependencies Setup

As a preliminary step before installing any dependency, it is strongly suggested to create a user named akamas on the machine hosting the Akamas Server.

Docker

Follow the reference documentation to install docker on your system.

Docker installation guide: https://docs.docker.com/engine/install

Docker compose is already installed since Docker 23+. To install it on previous versions of Docker follow this installation guide: https://docs.docker.com/compose/install/

AWS CLI v2: https://docs.aws.amazon.com/cli/latest/userguide

To run docker with a non-root user, such as the akamas user, you should add it to the docker group. You can follow the guide at: https://docs.docker.com/engine/install/linux-postinstall/

Verify dependencies

As a quick check to verify that all dependencies have been correctly installed, you can run the following commands:

  • Docker:

    docker run hello-world

For offline installations, you can check Docker with the docker ps command.

  • Docker compose:

    docker compose --version

Docker versions older than 23 must use the docker-compose command instead of docker compose.

  • AWS CLI:

    aws --version

Online installation mode

Akamas is deployed as a set of containerized services running on Docker and managed via Docker Compose. In the online installation mode, the latest version of the Akamas Docker Compose file and all the images required by Docker can be downloaded from the AWS ECR repository.

Get Akamas Docker artifacts

It is suggested first to create a directory akamas in the home directory of your user, and then run the following command to get the latest compose file:

cd ~
mkdir akamas
cd akamas
curl -O https://s3.us-east-2.amazonaws.com/akamas/compose/3.4.0/docker-compose.yml

Configure Akamas environment variables

To configure Akamas, you should set the following environment variables:

  • AKAMAS_CUSTOMER: the customer name matching the one referenced in the Akamas license.

  • AWS_ACCESS_KEY_ID: the access key for pulling the Akamas images

  • AWS_SECRET_ACCESS_KEY: the secret access key for pulling the Akamas images

  • AWS_DEFAULT_REGION: Unless specified by the support team keep the value to us-east-2

  • AKAMAS_BASE_URL: the endpoint in the Akamas APIs that will be used to interact with the CLI, typically https://<akamas server DNS address>

To avoid losing your environment variables for future upgrades, it is suggested to keep them in the .env file. Create this file in the same folder where the docker-compose.yml is stored with the following content, replacing the parameters in angle brackets <>:

# Required variables
AKAMAS_CUSTOMER=<your name or your organization name>
AWS_ACCESS_KEY_ID=<your access key id>
AWS_SECRET_ACCESS_KEY=<your secret access key>
AKAMAS_BASE_URL=https://<akamas server DNS address>
AWS_DEFAULT_REGION=us-east-2

# Optional variables
# Database passwords
DEFAULT_DATABASE_PASSWORD=
KEYCLOAK_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_keycloak}
ANALYZER_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_analyzer}
CAMPAIGN_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_campaign}
LICENSE_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_license}
OPTIMIZER_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_optimizer}
ORCHESTRATOR_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_orchestrator}
SYSTEM_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_system}
TELEMETRY_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_telemetry}
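Before starting Akamas, it may help to check that the required variables listed above are actually filled in. The following is a minimal Bash sketch; the missing_env_vars function is hypothetical (not an Akamas tool) and only inspects the file, treating empty values and values that still start with < as missing.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each required variable that is missing, empty,
# or still set to a <placeholder> in the given env file.
missing_env_vars() {
  local env_file="$1" name
  for name in AKAMAS_CUSTOMER AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
              AKAMAS_BASE_URL AWS_DEFAULT_REGION; do
    # A variable counts as set when "NAME=" is followed by a character
    # that is neither whitespace nor the start of a <placeholder>.
    grep -Eq "^${name}=[^<[:space:]]" "$env_file" || printf '%s\n' "$name"
  done
}
```

Running missing_env_vars .env from the folder containing docker-compose.yml prints nothing when all five required variables have a value.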

Start Akamas

To log into AWS ECR and pull the most recent Akamas container images, you also need to set the AWS authentication variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION) with the values provided by Akamas Customer Support Services. You can load them from the .env file previously created and log in with the following commands:

# Export the variables defined in .env so that the aws command can read them
set -a; source ./.env; set +a
aws ecr get-login-password --region us-east-2 | docker login -u AWS --password-stdin https://485790562880.dkr.ecr.us-east-2.amazonaws.com

You can then start the Akamas server by running the following command:

docker compose up -d

Install the Akamas Server

Akamas is deployed as a set of containerized services running on Docker and managed via Docker Compose. The latest version of the Akamas Docker Compose file and all the images required by Docker can be downloaded from the AWS ECR repository.

Two installation modes are available:

  • online installation mode, in case the Akamas Server has access to the Internet - installation behind a proxy server is also supported.

  • offline installation mode, in case the Akamas Server does not have access to the Internet.

In case the Akamas Server is behind a proxy server please also read how to setup Akamas behind a Proxy.

Network requirements

This section lists all the connectivity settings required to operate and manage Akamas.

Internet access

Internet access is required for the Akamas online installation and update procedures, as it allows retrieving the most recent Akamas container images from the Akamas private Amazon Elastic Container Registry (ECR).

If internet access is not available for policy or security reasons, Akamas installation and updates can be executed offline.

Internet access from the Akamas server is not mandatory but it’s strongly recommended.

Ports

The following table provides a list of the ports on the Akamas server that have to be reachable by Akamas administrators and users to properly operate the system.

Source              Destination     Port         Reason
Akamas admin        Akamas server   22           ssh
Akamas admin/user   Akamas server   80, 443      Akamas web UI access
Akamas admin/user   Akamas server   8000, 8443   Akamas API access

In the specific case of AWS instance and customer instances sharing the same VPC/Subnet inside AWS, you should:

  • open all of the ports listed in the table above for all inbound addresses (0.0.0.0/0) on your AWS security group

  • open outbound rules to all traffic and then attach this AWS security group (which must reside inside a private subnet) to the Akamas machine and all customer application AWS machines

Offline installation mode

Akamas is deployed as a set of containerized services running on Docker and managed via Docker Compose. In the offline installation mode, the latest version of the Akamas Docker Compose file and all the images required by Docker cannot be downloaded from the AWS ECR repository, so they must be obtained from Akamas and copied to the server.

Get Akamas Docker artifacts

Get in contact with Akamas Customer Services to get the latest versions of the Akamas artifacts uploaded to a location of your choice on the dedicated Akamas Server.

Akamas installation artifacts will include:

  • images.tar.gz: a tarball containing Akamas main images.

  • docker-compose.yml: docker-compose file for Akamas.

  • akamas: the binary file of the Akamas CLI that will be used to verify the installation.

Import Docker images

A preliminary step in the offline installation mode is to import the shipped Docker images by running the following commands in the same directory where the tar files have been stored:

cd <your bundle files location>
docker image load -i images.tar.gz

Mind that this import procedure could take some time!

Configure Akamas environment variables

To configure Akamas, you should set the following environment variables:

  • AKAMAS_CUSTOMER: the customer name matching the one referenced in the Akamas license.

  • AKAMAS_BASE_URL: the endpoint in the Akamas APIs that will be used to interact with the CLI, typically https://<akamas server DNS address>

To avoid losing your environment variables for future upgrades, it is suggested to keep them in the .env file, stored in the same directory as the docker-compose.yml:

# Required variables
AKAMAS_CUSTOMER=<your name or your organization name>
AKAMAS_BASE_URL=https://<akamas server DNS address>

# Optional variables
## Database password. Use DEFAULT_DATABASE_PASSWORD to set a custom password for all databases
DEFAULT_DATABASE_PASSWORD=
## A custom password per each service can be set using the variables below, otherwise, the default is used. For example, for the kong database, the password is `akamas_kong`.
KONG_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_kong}
AIRFLOW_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_airflow}
KEYCLOAK_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_keycloak}
ANALYZER_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_analyzer}
CAMPAIGN_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_campaign}
LICENSE_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_license}
OPTIMIZER_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_optimizer}
ORCHESTRATOR_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_orchestrator}
SYSTEM_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_system}
TELEMETRY_DATABASE_PASSWORD=${DEFAULT_DATABASE_PASSWORD:-akamas_telemetry}

Run installation

To start Akamas you can now simply navigate into the akamas folder and run a docker compose command:

cd <your docker-compose file location>
docker compose up -d

You may get the following error:

Error saving credentials: error storing credentials - err: exit status 1, out: Cannot autolaunch D-Bus without X11 $DISPLAY

This is a documented docker bug (see this link) that can be solved by installing the "pass" package:

  • Ubuntu

    sudo apt-get install -y pass

  • RHEL

    yum install pass

Online installation behind a Proxy server

This section describes how to set up an Akamas Server behind a proxy server and how to allow Docker to connect to the Akamas repository on AWS ECR.

Configure Docker daemon

First, create the /etc/systemd/system/docker.service.d directory if it does not already exist. Then create or update the /etc/systemd/system/docker.service.d/http-proxy.conf file with the variables listed below, taking care to replace <PROXY> with the address and port (and credentials if needed) of your target proxy server:

[Service]
Environment="HTTP_PROXY=<PROXY>"
Environment="HTTPS_PROXY=<PROXY>"

Once configured, reload the systemd configuration and restart Docker with the following commands:

sudo systemctl daemon-reload
sudo systemctl restart docker

For more details, refer to the official documentation page: Control Docker with systemd.

Configure the Akamas containers

To allow the Akamas services to connect to addresses outside your intranet, the Docker instance needs to be configured to forward the proxy configuration to the Akamas containers.

Update the ~/.docker/config.json file adding the following field to the JSON, taking care to replace <PROXY> with the address (and credentials if needed) of your target proxy server:

{
  # ...
  "proxies": {
    "default": {
      "httpProxy": "<PROXY>",
      "httpsProxy": "<PROXY>",
      "ftpProxy": "<PROXY>",
      "noProxy": "localhost,127.0.0.1,/var/run/docker.sock,database,optimizer,campaign,analyzer,telemetry,log,elasticsearch,metrics,system,license,store,orchestrator,airflow-db,airflow-webserver,kong-database,kong,user-service,keycloak,logstash,kibana,akamas-ui,grafana,prometheus,node-exporter,cadvisor,konga,benchmark"
    }
  }
}

For more details, refer to the official documentation page: Configure Docker to use a proxy server.

Run Akamas

Set the following variables to configure your working environment, taking care to replace <PROXY> with the address (and credentials if needed) of your target proxy server:

export HTTP_PROXY='<PROXY>'
export HTTPS_PROXY='<PROXY>'

Once configured, you can log into the ECR repository through the AWS CLI and start the Akamas services manually.

Setup HTTPS configuration

Akamas APIs and UI use plain HTTP when they are first installed. To enable the use of HTTPS you will need to:

  1. Ask your security team to provide you with a valid certificate for your server. The certificate usually consists of two files with ".key" and ".pem" extensions. You will need to provide the Akamas server DNS name.

  2. Create a folder named "certs" in the same directory as Akamas' docker-compose file.

  3. Copy the ".key" and ".pem" files in the created "certs" folder and rename them to "akamas.key" and "akamas.pem" respectively. Ensure the files belong to the same user and group you use to run Akamas.

  4. Restart two Akamas services by running the following commands:

    cd <Akamas docker-compose file folder>
    docker-compose restart akamas-ui kong

After the containers' reboot is complete you will be able to access the UI over HTTPS from your browser:

https://<akamas server name here>
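The certificate steps above (creating the "certs" folder, then copying and renaming the two files) could be scripted roughly as follows. This is a sketch under the assumption that it runs as the same user that runs Akamas, so the copied files get the expected owner; install_akamas_certs is an illustrative name, not an Akamas command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a key/certificate pair into the "certs" folder
# next to the docker-compose file, using the file names Akamas expects.
#   $1 = path to the ".key" file
#   $2 = path to the ".pem" file
#   $3 = folder containing the Akamas docker-compose file
install_akamas_certs() {
  local key_file="$1" pem_file="$2" compose_dir="$3"
  mkdir -p "$compose_dir/certs" &&
    cp "$key_file" "$compose_dir/certs/akamas.key" &&
    cp "$pem_file" "$compose_dir/certs/akamas.pem"
}
```

After the function succeeds, the restart command from step 4 still has to be run for the new certificate to be picked up.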

Setup CLI to use HTTPS

Now that your Akamas server is configured to use HTTPS you can update the Akamas CLI configuration to use the secure protocol.

If you have not installed the Akamas CLI, follow the CLI installation guide. If you already have the CLI available, you can run the following command:

akamas init config

You will be prompted for some input; enter the values as follows:

Api address [http://localhost:8000]: https://<akamas server dns address>:443/akapi
Workspace [default]: default
Verify SSL: [True]: True

You can test the connection by running:

akamas status

It should return 'OK', meaning Akamas has been properly configured to work over HTTPS.

Prerequisites

Before installing Akamas, please make sure to review all the following requirements:

  • Cluster requirements

  • Software requirements

Troubleshoot Docker installation issues

This section describes some of the most common issues found during the Akamas installation.

Issues when installing Docker

CentOS 7 and RHEL 7

Notice: these distros feature a known issue, since the Docker default execution group is named dockerroot instead of docker. To make Docker work, edit (or create) /etc/docker/daemon.json to include the following fragment:

{
  "group": "dockerroot"
}

After editing or creating the file, please restart Docker and then check the group permission of the Docker socket (/var/run/docker.sock), which should show dockerroot as a group:

srw-rw----. 1 root dockerroot 0 Jul  4 09:57 /var/run/docker.sock

Then, add the newly created akamas user to the dockerroot group so that it can run docker containers:

sudo usermod -aG dockerroot <user_name>

and check that the akamas user has been correctly added to the dockerroot group by running:

lid -g dockerroot

Issues when running AWS CLI

In case of issues in logging in through AWS CLI, when executing the following command:

aws ecr get-login-password --region us-east-2

Please check that:

  • Environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION are correctly set

  • AWS CLI version is 2.0+

We recommend using the official AWS CLI installation guide for a smoother experience.

Issues when starting Akamas services

Akamas failed to start some services

Please notice that the very first time Akamas is started, up to 30 minutes might be required to initialize the environment.

In case the issue persists you can run the following command to identify which service is not able to start up correctly:

akamas status -d

License service unable to access docker socket

In some systems, the Docker socket, usually located in /var/run/docker.sock, cannot be accessed from within a container. Akamas signals this condition by reporting an Access Denied error in the license service logs.

To overcome this limitation edit the docker-compose.yml file adding the line privileged: true to the following services:

  • License

  • Optimizer

  • Telemetry

  • Airflow

The following is a sample configuration where this change is applied to the license service:

license:
  image: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/license_service:2.3.0
  container_name: license
  privileged: true

Finally, you can issue the following command to apply these changes:

docker compose up -d

Missing Akamas Customer variable

When installing Akamas it’s mandatory to provide the AKAMAS_CUSTOMER variable as illustrated in the installation guide. This variable must match the one provided by Akamas representatives when issuing a license. If the variable is not properly exported, license installation will fail with an error message indicating that the name of the customer installation does not match the one provided in the license.

You can easily inspect which value of this variable has been used when starting Akamas by running the following command on the Akamas server:

docker inspect license | grep AKAMAS_CUSTOMER

If you find out that the value is not the one you expect, you can update the .env file and then start the license service again by running:

docker compose up -d license

Once Akamas is up and running you can re-install your license.

Other issues

For any other issues please contact Akamas Customer Support Services.

Cluster Requirements

Kubernetes version

Running Akamas requires a cluster running Kubernetes version 1.24 or higher.
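A minimum-version requirement such as this one can be checked in a script. The following is a small Bash sketch, assuming GNU coreutils is available for sort -V; version_ge is an illustrative helper name, and how the cluster version string is obtained (for example from the output of kubectl version) is left to the reader.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal
# to version $2. Relies on `sort -V` (version sort, a GNU coreutils extension).
version_ge() {
  local actual="$1" minimum="$2"
  # If the minimum sorts first (or both are equal), the requirement is met.
  [ "$(printf '%s\n%s\n' "$minimum" "$actual" | sort -V | head -n 1)" = "$minimum" ]
}
```

For instance, version_ge 1.27.3 1.24 succeeds, while version_ge 1.23.9 1.24 fails.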

Resources requirements

Akamas can be deployed in three different sizes depending on the number of concurrent optimization studies that will be executed. If you are unsure about which size is appropriate for your environment we suggest you start with the small one and upgrade to bigger ones as you expand the optimization activity to more applications.

The tables below report the required resources both for requests and limits that should be available in the cluster to use Akamas.

Small

The small tier is suited for environments that need to support up to 10 concurrent optimization studies.

Resource     Requests   Limits
CPU          8 Cores    16 Cores
Memory       30 GB      30 GB
Disk Space   70 GB      70 GB

Storage requirements

The cluster must provide the definition of a Storage Class so that the application installation can leverage Persistent Volume Claims to dynamically provision the volumes required to persist data.

Permissions

To install and run Akamas, cluster-level permissions are not required. The following is the minimal set of namespaced rules:

- apiGroups: ["", "apps", "policy", "batch", "networking.k8s.io", "events.k8s.io", "rbac.authorization.k8s.io"]
  resources:
    - configmaps
    - cronjobs
    - deployments
    - events
    - ingresses
    - jobs
    - persistentvolumeclaims
    - poddisruptionbudgets
    - pods
    - pods/log
    - rolebindings
    - roles
    - secrets
    - serviceaccounts
    - services
    - statefulsets
  verbs: ["get", "list", "create", "delete", "patch", "update", "watch"]

Networking

For more information on this topic refer to Kubernetes' official documentation.

Networking requirements depend on how users interact with Akamas. Services can be exposed via Ingress or using kubectl as a proxy. Refer to Accessing Akamas for a more detailed description of the available options.

Kubernetes installation

This section describes how to install Akamas on a Kubernetes cluster.

Preliminary steps

Before installing Akamas, please follow these steps:

  • Review the cluster requirements

  • Install the software requirements

Installation steps

Please follow these steps to install the Akamas application:

  • Install the application

  • Install the CLI

  • Verify the installation

  • Install the license

Please also read the section on how to manage Akamas. Finally, read the relevant sections of Integrating Akamas to integrate Akamas into your specific ecosystem.

Changing UI Ports

By default, Akamas uses the following ports for its UI:

  • 80 (HTTP)

  • 443 (HTTPS)

Depending on the configuration of your environment, you may want to change the default settings: to do so, you’ll have to update the Akamas docker-compose file.

Inside the docker-compose.yml file, scroll down until you come across the akamas-ui service. There you will find a specification as follows:

  akamas-ui:
    ports:
      - "443:443"
      - "80:80"

Update the YAML file by remapping the UI ports to the desired ports of the host:

  akamas-ui:
    ports:
      - "<YOUR_HTTPS_PORT_OF_CHOICE>:443"
      - "<YOUR_HTTP_PORT_OF_CHOICE>:80"

In case you are running Akamas with host networking, you can bind different ports in the container itself. To do so you can extend the docker-compose service by adding a couple of environment variables like this:

  akamas-ui:
    environment:
      - HTTP_PORT=<HTTP_CONTAINER_PORT>
      - HTTPS_PORT=<HTTPS_CONTAINER_PORT>
    ports:
      - "<YOUR_HTTPS_PORT_OF_CHOICE>:<HTTPS_CONTAINER_PORT>"
      - "<YOUR_HTTP_PORT_OF_CHOICE>:<HTTP_CONTAINER_PORT>"

Finally, apply the new configuration after updating the AKAMAS_BASE_URL environment variable to match the new protocol or port.

Online Installation

Before starting the installation, make sure the requirements are met.

Akamas on Kubernetes is provided as a set of templates packaged in a chart archive managed by Helm.

Create the configuration file

To proceed with the installation, you need to create a Helm Values file, called akamas.yaml in this guide, containing the mandatory configuration values required to customize your application. The following template contains the minimal set required to install Akamas:

# AWS credentials to fetch ECR images (required)
awsAccessKeyId: <AWS_ACCESS_KEY_ID>
awsSecretAccessKey: <AWS_SECRET_ACCESS_KEY>

# Akamas customer name. Must match the value in the license (required)
akamasCustomer: <CUSTOMER_NAME>

# Akamas administrator password. If not set a random password will be generated
akamasAdminPassword: <ADMIN_PASSWORD>

# The URL that will be used to access Akamas, for example 'http://akamas.kube.example.com' (required)
akamasBaseUrl: <INSTANCE_HOSTNAME>

You can also download the template file by running the following snippet:

curl -so akamas.yaml  http://helm.akamas.io/templates/1.4.1/akamas.yaml.template

Replace the following placeholders in the file:

  • AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY: the AWS credentials for pulling the Akamas images

  • CUSTOMER_NAME: customer name provided with the Akamas license

  • ADMIN_PASSWORD: initial administrator password

  • INSTANCE_HOSTNAME: the URL that will be used to expose the Akamas installation, for example https://akamas.k8s.example.com when using an Ingress, or http://localhost:9000 when using port-forwarding. Refer to Accessing Akamas for the list of the supported access methods and a reference for any additional configuration required.

Start the installation

With the configuration file you just created (and the new variables you added to override the defaults), you can start the installation with the following command:

helm upgrade --install \
  --create-namespace --namespace akamas \
  --repo http://helm.akamas.io/charts \
  --version '1.4.1' \
  -f akamas.yaml \
  akamas akamas

This command will create the Akamas resources within the specified namespace. You can define a different namespace by changing the argument --namespace <your-namespace>.

An example output of a successful installation is the following:

Release "akamas" does not exist. Installing it now.
NAME: akamas
LAST DEPLOYED: Thu Sep 21 10:39:01 2023
NAMESPACE: akamas
STATUS: deployed
REVISION: 1
NOTES:
Akamas has been installed

NOTES:
Akamas has been installed

To get the initial password use the following command:

kubectl get secret akamas-admin-credentials -o go-template='{{ .data.password | base64decode }}'

Check the installation

To monitor the application startup, run the command kubectl get pods. After a few minutes, the expected output should be similar to the following:

NAME                           READY   STATUS    RESTARTS   AGE
airflow-6ffbbf46d8-dqf8m       3/3     Running   0          5m
analyzer-67cf968b48-jhxvd      1/1     Running   0          5m
campaign-666c5db96-xvl2z       1/1     Running   0          5m
database-0                     1/1     Running   0          5m
elasticsearch-master-0         1/1     Running   0          5m
keycloak-66f748d54-7l6wb       1/1     Running   0          5m
kibana-6d86b8cbf5-6nz9v        1/1     Running   0          5m
kong-7d6fdd97cf-c2xc9          1/1     Running   0          5m
license-54ff5cc5d8-tr64l       1/1     Running   0          5m
log-5974b5c86b-4q7lj           1/1     Running   0          5m
logstash-8697dd69f8-9bkts      1/1     Running   0          5m
metrics-577fb6bf8d-j7cl2       1/1     Running   0          5m
optimizer-5b7576c6bb-96w8n     1/1     Running   0          5m
orchestrator-95c57fd45-lh4m6   1/1     Running   0          5m
store-5489dd65f4-lsk62         1/1     Running   0          5m
system-5877d4c89b-h8s6v        1/1     Running   0          5m
telemetry-8cf448bf4-x68tr      1/1     Running   0          5m
ui-7f7f4c4f44-55lv5            1/1     Running   0          5m
users-966f8f78-wv4zj           1/1     Running   0          5m

At this point, you should be able to access the Akamas UI using the endpoint specified in the akamasBaseUrl, and interact through the Akamas CLI with the path /api.

If you haven't already, you can update your configuration file to use a different type of service to expose Akamas' endpoints. To do so, pick from the Accessing Akamas section the configuration snippet for the service type of your choice, add it to the akamas.yaml file, update the akamasBaseUrl value, and re-run the installation command to update your Helm release.

Install Akamas

Two installation modes are available:

Offline Installation - Private registry

Configure the registry

If your cluster is in an air-gapped network or is unable to reach the Akamas image repository, you need to copy the required images to your private registry.

The procedure described here uses your local environment to upload the images. Docker must therefore be installed and configured locally so that it can interact with both the Akamas registry and your private registry.

Transfer the Docker images

The offline installation requires you to pull the images and migrate them to your private registry. In the following command replace the chart version to download the related list of images:

curl -sO  http://helm.akamas.io/images/1.4.1/image-list

Once the list has been downloaded, you must pull, re-tag, and upload the images. Run the following snippet, replacing <REGISTRY_URL> with the actual URL of the private registry:

NEW_REGISTRY="<REGISTRY_URL>"

while read -r IMAGE; do
    REGISTRY=$(echo "$IMAGE" | cut -d '/' -f 1)
    REPOSITORY=$(echo "$IMAGE" | cut -d ':' -f 1 | cut -d "/" -f2-)
    TAG=$(echo "$IMAGE" | cut -d ':' -f 2)

    NEW_IMAGE="$NEW_REGISTRY/$REPOSITORY:$TAG"
    echo "Migrating $IMAGE to $NEW_IMAGE"

    docker pull "$IMAGE"
    docker tag "$IMAGE" "$NEW_IMAGE"
    docker push "$NEW_IMAGE"
done <image-list

This process could last several minutes, once the upload is complete, you can proceed with the next steps.
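As a variation on the re-tagging logic in the loop above, the new image name can also be derived with shell parameter expansion instead of cut. This is only a sketch: retarget_image is a hypothetical helper, and it assumes every entry in image-list starts with a registry host followed by a slash. Unlike splitting on ':', it keeps working if the source registry reference includes a port.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a source image reference to the private registry.
#   $1 = source image, e.g. <registry host>/<repository>:<tag>
#   $2 = private registry URL
retarget_image() {
  local image="$1" new_registry="$2"
  # Drop everything up to and including the first '/', i.e. the registry host,
  # keeping "<repository>:<tag>" untouched.
  local path="${image#*/}"
  printf '%s/%s\n' "$new_registry" "$path"
}
```

Inside the loop, NEW_IMAGE could then be set with NEW_IMAGE=$(retarget_image "$IMAGE" "$NEW_REGISTRY").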

Create the configuration file

To proceed with the installation, you must create a Helm Values file, called akamas.yaml in this guide, containing the mandatory configuration values required to customize your application. The following template contains the minimal set required to install Akamas:

akamas.yaml
# Akamas customer name. Must match the value in the license (required)
akamasCustomer: <CUSTOMER_NAME>

# Akamas administrator password. If not set a random password will be generated
akamasAdminPassword: <ADMIN_PASSWORD>

# The URL that will be used to access Akamas, for example 'http://akamas.kube.example.com' (required)
akamasBaseUrl: <INSTANCE_HOSTNAME>

# The URL of your private registry
global:
  imageRegistry: <REGISTRY_URL>

elasticsearch:
  image: <REGISTRY_URL>/akamas/elastic/elasticsearch

Replace the following placeholders in the file:

  • CUSTOMER_NAME: customer name provided with the Akamas license

  • ADMIN_PASSWORD: initial administrator password

  • REGISTRY_URL: the URL for the private registry used in the transfer process above
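If the values file needs to be produced repeatedly, for example for several environments, the placeholder substitution could be scripted. The sketch below is an assumption-laden illustration: render_private_registry_values is a hypothetical function that prints the template shown above with three placeholders filled in, and it omits akamasAdminPassword so that a random password is generated, as noted in the template comments.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a values file for a private-registry installation.
#   $1 = customer name, $2 = Akamas base URL, $3 = private registry URL
render_private_registry_values() {
  local customer="$1" base_url="$2" registry="$3"
  # Values are double-quoted so that characters such as ':' stay valid YAML.
  cat <<EOF
akamasCustomer: "${customer}"
akamasBaseUrl: "${base_url}"
global:
  imageRegistry: "${registry}"
elasticsearch:
  image: "${registry}/akamas/elastic/elasticsearch"
EOF
}
```

The output can be redirected to akamas.yaml, for example: render_private_registry_values "<CUSTOMER_NAME>" "<INSTANCE_HOSTNAME>" "<REGISTRY_URL>" > akamas.yaml.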

Configure the authentication

To authenticate to your private registry, you must manually create the Secret required to pull the images. If the registry uses basic authentication, you can create the credentials in the namespace by running the following command:

kubectl create secret docker-registry registry-token \
  --namespace akamas \
  --docker-server=<REGISTRY_URL> \
  --docker-username=<USER> \
  --docker-password=<PASSWORD>

Otherwise, you can leverage any credential already configured on your machine by running the following command:

kubectl create secret docker-registry registry-token \
  --namespace akamas \
  --from-file=.dockerconfigjson=<PATH/TO/.docker/config.json>

Start the installation

From a machine that can reach the endpoint, run the following command to download the chart:

helm pull --repo http://helm.akamas.io/charts --version '1.4.1' akamas

The command downloads the specified chart version as an archive named akamas-<version>.tgz, which can be transferred to the machine where the installation will be run. Use the downloaded package in place of the chart name in the following commands.

If you wish to see and override the values that Helm will use to install Akamas, you may execute the following command.

helm show values akamas-<version>.tgz

Now, with the configuration file you just created (and the new variables you added to override the defaults), you can start the installation with the following command:

helm upgrade --install \
  --create-namespace --namespace akamas \
  -f akamas.yaml \
  akamas akamas-<version>.tgz

This command will create the Akamas resources within the specified namespace. You can define a different namespace by changing the argument --namespace <your-namespace>

An example output of a successful installation is the following:

Release "akamas" does not exist. Installing it now.
NAME: akamas
LAST DEPLOYED: Thu Sep 21 10:39:01 2023
NAMESPACE: akamas
STATUS: deployed
REVISION: 1
NOTES:
Akamas has been installed

NOTES:
Akamas has been installed

To get the initial password use the following command:

kubectl get secret akamas-admin-credentials -o go-template='{{ .data.password | base64decode }}'

Check the installation

To monitor the application startup, run the command kubectl get pods. After a few minutes, the expected output should be similar to the following:

NAME                           READY   STATUS    RESTARTS   AGE
airflow-6ffbbf46d8-dqf8m       3/3     Running   0          5m
analyzer-67cf968b48-jhxvd      1/1     Running   0          5m
campaign-666c5db96-xvl2z       1/1     Running   0          5m
database-0                     1/1     Running   0          5m
elasticsearch-master-0         1/1     Running   0          5m
keycloak-66f748d54-7l6wb       1/1     Running   0          5m
kibana-6d86b8cbf5-6nz9v        1/1     Running   0          5m
kong-7d6fdd97cf-c2xc9          1/1     Running   0          5m
license-54ff5cc5d8-tr64l       1/1     Running   0          5m
log-5974b5c86b-4q7lj           1/1     Running   0          5m
logstash-8697dd69f8-9bkts      1/1     Running   0          5m
metrics-577fb6bf8d-j7cl2       1/1     Running   0          5m
optimizer-5b7576c6bb-96w8n     1/1     Running   0          5m
orchestrator-95c57fd45-lh4m6   1/1     Running   0          5m
store-5489dd65f4-lsk62         1/1     Running   0          5m
system-5877d4c89b-h8s6v        1/1     Running   0          5m
telemetry-8cf448bf4-x68tr      1/1     Running   0          5m
ui-7f7f4c4f44-55lv5            1/1     Running   0          5m
users-966f8f78-wv4zj           1/1     Running   0          5m

At this point, you should be able to access the Akamas UI using the endpoint specified in the akamasBaseUrl, and interact through the Akamas CLI with the path /api.
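
When many pods are listed, it can help to filter the output for pods that are not ready yet. The following is a minimal sketch, not part of the Akamas tooling: not_ready_pods is a hypothetical helper name, and it assumes the default column layout of kubectl get pods shown in the sample output.

```shell
# Print the names of pods that are not fully ready.
# Reads `kubectl get pods` output from stdin (header line included).
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column, e.g. 3/3
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Sample input adapted from the expected output, with one pod still starting.
SAMPLE='NAME                           READY   STATUS    RESTARTS   AGE
airflow-6ffbbf46d8-dqf8m       3/3     Running   0          5m
kibana-6d86b8cbf5-6nz9v        0/1     Running   0          5m'

printf '%s\n' "$SAMPLE" | not_ready_pods
```

Against a live cluster, the same helper could be fed with kubectl get pods --namespace akamas | not_ready_pods. An empty result suggests that every listed pod is running and ready.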

If you haven't already, you can update your configuration file to use a different type of service to expose Akamas' endpoints. To do so, pick from the Accessing Akamas page the configuration snippet for the service type of your choice, add it to the akamas.yaml file, update the akamasBaseUrl value, and re-run the installation command to update your Helm release.

Akamas is deployed on your Kubernetes cluster through a Helm chart, and all the required images can be downloaded from the AWS ECR repository.

  • online installation, in case the Kubernetes cluster can access the Internet.

  • offline installation, in case the Kubernetes cluster does not have access to the Internet or you need to use a private image registry.

Before starting the installation, make sure the requirements are met.

Akamas on Kubernetes is provided as a set of templates packaged in a chart archive managed by Helm.

  • INSTANCE_HOSTNAME: the URL that will be used to expose the Akamas installation, for example https://akamas.k8s.example.com when using an Ingress, or http://localhost:9000 when using port-forwarding. Refer to Accessing Akamas for the list of the supported access methods and a reference for any additional configuration required.

This section describes how to configure the authentication to your private registry. If your registry does not require any authentication, skip directly to the installation section.

Installing on OpenShift

Running Akamas on OpenShift requires some Helm configurations to be applied.

The installation is provided as a set of templates packaged in a chart archive managed by Helm. Custom values are applied to ensure Akamas complies with the default restricted-v2 security context constraints.

OpenShift requirements

To proceed with the installation, OpenShift version 4.x is required. Make sure you also meet the Kubernetes requirements.

Installation

The installation can be done offline or online, as described in the section Install Akamas. Choose the one that better suits your cluster access policies.

The following snippet must be added to the akamas.yaml file to install Akamas on OpenShift.

akamas.yaml
airflow:
  uid: null
  gid: null

postgresql:
  primary:
    containerSecurityContext:
      enabled: false

    podSecurityContext:
      enabled: false

  shmVolume:
    enabled: false

kibana:
  podSecurityContext:
    fsGroup: null

  securityContext:
    runAsUser: null

elasticsearch:
  sysctlInitContainer:
    enabled: false

  securityContext:
    runAsUser: null

  podSecurityContext:
    fsGroup: null
    runAsUser: null

Access Akamas - Ingress to route

Besides the methods described in Accessing Akamas, you can use the OpenShift default ingress controller to create the required routes. Add the following snippet to the akamas.yaml file.

akamas.yaml
ingress:
  enabled: true

  annotations:
    route.openshift.io/termination: edge
    haproxy.router.openshift.io/timeout: 1200s

  className: ""

  tls:
    - {}

Once the Helm command is invoked, ensure the routes have been created by running:

oc get routes

The output must list the Akamas routes with different paths.

Toolbox

The optional Akamas Toolbox component requires privileged access to run on OpenShift.

Accessing Akamas

To interact with your Akamas instance, you need the UI and API Gateway to be accessible from outside the cluster.

Kubernetes offers different options to expose a service outside of the cluster. The following is a list of the supported ones, with examples of how to configure them to work in your chart release:

  • Port Forwarding

  • Ingress

While changing the access mode of your Akamas installation, you must also update the value of the akamasBaseUrl option of the Helm Values file to match the new endpoint used.

Port Forwarding

By default, Akamas uses Cluster IPs for its services, allowing communication only inside the cluster. Still, you can leverage kubectl's port-forward to create a private connection and expose any internal service on your local machine.

This solution is suggested to perform quick tests without exposing the application or in scenarios where cluster access to the public is not allowed.

Set akamasBaseUrl to http://localhost:9000 in your Helm Values file, and install or update your Akamas deployment using the Helm command. Once the rollout is complete, open a tunnel to the UI with the following command:

kubectl port-forward service/ui 9000:http

As long as the port-forwarding is running, you will be able to interact with the UI through the tunnel; you can also interact through the Akamas CLI by configuring the URL http://localhost:9000/akapi.

Refer to the official Kubernetes documentation for more details about port-forwarding.

Ingress

An Ingress is a Kubernetes object that provides service access, load balancing, and SSL termination to Kubernetes services.

To expose the Akamas UI through an Ingress, set akamasBaseUrl in the Helm Values file to the host of the Ingress (e.g. https://akamas.kube.example.com), and add the snippet below:

ingress:
  enabled: true
  tls:
    - secretName: "<SECRET_NAME>"  # secret containing the certificate and key data
  annotations: {}  # optional

Here is a description of the fields:

  • enabled: set to true to enable the Ingress

  • tls: configure secretName with the name of the Secret containing the TLS certificate for the hostname configured in akamasBaseUrl. This secret must be created manually before applying the configuration (see TLS Secrets on the Kubernetes documentation) or managed by a certificate issuer configured in the namespace.

  • annotations: optional, provide any additional annotation required in your deployment. If your cluster leverages any certificate issuer (such as cert-manager), you can add here the annotations required to interact with the issuer.

Re-run the install command to update the configuration. Once the rollout is complete, you will be able to access the UI using the URL specified in akamasBaseUrl and interact with the CLI using ${akamasBaseUrl}/api.

Refer to the official Kubernetes documentation for more details on Ingresses.

Software Requirements

This page describes the requirements that should be fulfilled by the user when installing or managing an Akamas installation on Kubernetes. The software below is usually installed on the user's workstation or laptop.

Kubectl

Kubectl must be installed and configured to interact with the desired cluster. Refer to the official kubectl documentation to set up the client.

To interact with the Kubernetes APIs, you will need kubectl, preferably with a version matching the cluster. To check both the client and cluster versions, run the following:

kubectl version --short

Helm

Installing Akamas requires Helm 3.0 or higher. To check the version, run the following:

helm version --short

Privileged access

Akamas uses Elasticsearch to store logs and time series. When running Akamas on Kubernetes, Elasticsearch is installed automatically using the official Elasticsearch Helm chart. This chart requires running an init container with privileged access to set up a configuration on the Elasticsearch pod host. If running such a container is not permitted in your environment, you can add the following snippet to the akamas.yaml file when installing Akamas to disable this feature.

# Disable ES privileged initialization container.
elasticsearch:
  sysctlInitContainer:
    enabled: false

Setup the CLI

Linux

To get Akamas CLI installed on Linux, run the following commands:

curl -o akamas_cli https://s3.us-east-2.amazonaws.com/akamas/cli/$(curl -s https://s3.us-east-2.amazonaws.com/akamas/cli/stable.txt)/linux_64/akamas
sudo mv akamas_cli /usr/local/bin/akamas
chmod 755 /usr/local/bin/akamas

You can now run the Akamas CLI by running the akamas command.

In some installations, the /usr/local/bin folder is not present in the PATH environment variable. This prevents you from using akamas without specifying the complete file location. To fix this issue you can add an entry to the PATH system environment variable or move the executable to another folder in your PATH.
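
As a minimal sketch of the PATH fix, the snippet below checks whether /usr/local/bin is already listed and appends it only when it is missing. The path_contains helper is a hypothetical name introduced here for illustration.

```shell
# Return success if the given directory is already listed in PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Append /usr/local/bin for the current shell session only if it is missing.
if ! path_contains /usr/local/bin; then
  export PATH="$PATH:/usr/local/bin"
fi
```

The change only lasts for the current shell session. To make it permanent, the export line can be added to a shell startup file such as ~/.bashrc.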

Auto-completion

To enable auto-completion on Linux systems with a bash shell (requires bash 4.4+), run the following commands:

curl -O https://s3.us-east-2.amazonaws.com/akamas/cli/$(curl -s https://s3.us-east-2.amazonaws.com/akamas/cli/stable.txt)/linux_64/akamas_autocomplete.sh
mkdir -p ~/.akamas
mv akamas_autocomplete.sh ~/.akamas
echo '. ~/.akamas/akamas_autocomplete.sh' >> ~/.bashrc
source ~/.bashrc

Windows

To install the Akamas CLI on Windows, run the following command from PowerShell:

Invoke-WebRequest "https://s3.us-east-2.amazonaws.com/akamas/cli/$($(Invoke-WebRequest https://s3.us-east-2.amazonaws.com/akamas/cli/stable.txt | Select-Object -Expand Content) -replace '\n', '')/win_64/akamas.exe" -OutFile akamas.exe

You can now run the Akamas CLI by running .\akamas in the same folder.

To invoke the Akamas CLI from any folder, create an akamas folder (such as C:\Program Files\akamas), and move the akamas.exe file there. Then, add an entry to the PATH system environment variable with the value C:\Program Files\akamas. Now, you can invoke the CLI from any folder by simply running the akamas command.

The Akamas CLI can be accessed by simply running the akamas command.

Verify the CLI

You can verify that the CLI was installed correctly by running this command:

akamas version

which should show an output similar to this one:

Akamas CLI: 2.9.0
Akamas platform: 3.4.0

At any time, you can see available commands and options with:

akamas --help

For the full list of Akamas commands, refer to the section CLI reference.

Useful commands

You may find some of the commands listed in the sections below helpful.

Read database passwords

By default, access to each service database is assigned to a user with a randomly generated password. For example, to read the campaign service database password, execute the following command:

kubectl get secret database-user-credentials -o go-template='{{ .data.campaign | base64decode }}'

The username for the campaign service can be found in the configuration file under each service section. To read the username for the campaign service set during the installation, launch the following command:

helm get values akamas --all --output json | jq '.campaign.database.user'

You can connect to the campaign_service database with the user and password above.

If you want to show all the passwords, execute this command:

kubectl get secret database-user-credentials -o go-template='{{range $k,$v := .data}} {{printf "%s: %s\n" $k ( $v |base64decode ) }}{{end}}'

Initialize the CLI

The CLI is used to interact with an Akamas server. To initialize the configuration of the Akamas CLI, run the command:

akamas init config

and follow the wizard to provide the required information such as the server IP.

Here is a summary of the configuration wizard options.

Api address [http://localhost:8000]: https://<akamas-hostname>:<ui-port>/akapi
Workspace [default]: default
Verify SSL: [True]: True
Is external certificate CA required? [y/N]: N

After this step, the Akamas CLI can be used to login to the Akamas server, by issuing the following command:

akamas login

and providing the credentials as requested.

This configuration can be changed at any time (see how to change the CLI config).

Logging into Akamas requires a valid license. If you have not installed your license yet, refer to the page Install the Akamas license.

Install the CLI

This section describes how to install an Akamas workstation.

The Akamas CLI allows users to invoke commands against the Akamas dedicated machine (Akamas Server). The Akamas CLI can also be installed on a different system than the Akamas Server.

Prerequisites

Linux and Windows operating systems are supported for installing Akamas CLI.

Installation steps

The Akamas CLI can be installed and configured in two simple steps:

  • Setup the CLI

  • Initialize the CLI

Refer to the section Change CLI config to modify the CLI ports the Akamas Server is listening to. The section Use a proxy server provides instructions on how to interact with Akamas via a proxy server.

Change CLI configuration

API Address

The CLI, as well as the UI, interacts with the Akamas server via APIs. The apiAddress configuration contains the information required to communicate with the server.

Docker

The Akamas Server provides different listeners to interact with APIs:

  • an HTTP listener on port 80 under the path /akapi

  • an HTTP listener on port 8000

  • an HTTPS listener on port 443 under the path /akapi

  • an HTTPS listener on port 8443

Depending on your networking setup you can either use the listeners on ports 80 and 443 which are also used for the UI or directly interact with the API gateway on ports 8000 and 8443. If you are unsure about your network setup we suggest you start with the HTTPS listener on port 443.
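
The mapping between listeners and API addresses can be sketched as a small lookup. This is an illustration of the list above rather than an Akamas utility: api_address is a hypothetical helper name and akamas.example.com is a made-up host.

```shell
# Build the API address for one of the four listeners described above.
# Usage: api_address <host> <port>
api_address() {
  case "$2" in
    80)   printf 'http://%s:80/akapi\n' "$1" ;;
    8000) printf 'http://%s:8000\n' "$1" ;;
    443)  printf 'https://%s:443/akapi\n' "$1" ;;
    8443) printf 'https://%s:8443\n' "$1" ;;
    *)    printf 'unsupported port: %s\n' "$2" >&2; return 1 ;;
  esac
}

# akamas.example.com is a hypothetical host name.
api_address akamas.example.com 443
```

The resulting value is what you would enter as the Api address in the configuration wizard or as apiAddress in the akamasconf file.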

For improved security, it is recommended to configure CLI communications with the Akamas Server over HTTPS. Notice that you need to have a valid certificate installed on your Akamas server (at least a self-signed one) to enable HTTPS communication between CLI and the Akamas Server.

Changing CLI protocol

The CLI can be configured either directly via the CLI itself or via the YAML configuration file akamasconf.

Using the CLI

Issue the following command to change the configuration of the Akamas CLI:

akamas init config

and then follow the wizard to provide the required CLI configuration:

  • enable HTTPS communications:

Api address [http://localhost:8000]: https://<akamas server dns name>:443/akapi
Workspace [default]: Workspace1
Login method (local, oauth2) [local]: local
Verify SSL: [True]: True
Is external certificate CA required? [y/N]: N
  • enable HTTP communications:

Api address [http://localhost:8000]: http://<akamas server DNS name>:80/akapi
Workspace [default]: Workspace1
Login method (local, oauth2) [local]: local

Note that, by default, the Akamas CLI expects a valid SSL certificate. If you are using a self-signed or otherwise invalid certificate, you can set the Verify SSL variable to false. This mimics the behavior of accepting an invalid HTTPS certificate in your browser.

Using the akamasconf file

Create a file named akamasconf in the following location, depending on your operating system:

  • Linux: ~/.akamas/akamasconf

  • Windows: C:\Users\<username>\.akamas (where C: is the drive where the OS is installed)

The file location can be customized by setting an $AKAMASCONF environment variable.

Here is a sample akamasconf file:

apiAddress: http[s]://<akamas server dns name>:80[443]/akapi
verifySsl: [true|false]
workspace: default
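
As a concrete instance of the template above, the following sketch writes a configuration for the HTTPS listener on port 443. It assumes the AKAMASCONF variable is honored as described in this section and writes to a scratch directory so that an existing ~/.akamas/akamasconf is left untouched. The host name is a made-up placeholder.

```shell
# Hypothetical server name, used only for illustration.
AKAMAS_HOST="akamas.example.com"

# Point the CLI configuration to a scratch file instead of ~/.akamas/akamasconf.
AKAMASCONF="$(mktemp -d)/akamasconf"
export AKAMASCONF

# Write a concrete configuration for the HTTPS listener on port 443.
cat > "$AKAMASCONF" <<EOF
apiAddress: https://${AKAMAS_HOST}:443/akapi
verifySsl: true
workspace: default
EOF

cat "$AKAMASCONF"
```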

The CLI configuration contains the information required to communicate with the Akamas server. It can be easily created and updated with a configuration wizard. This page describes the main options of the Akamas CLI and how to modify them. If your Akamas instance is installed with Kubernetes, ensure the UI service is configured correctly.

Verify the installation

Run the following command to verify the correct startup and initialization of Akamas:

akamas status

When all services have been started this command will return an "OK" message. Please notice that it might take a few minutes for Akamas to start all services.

To check that the UI is also working properly, access the following URL:

http://<akamas server name here>

You will see the Akamas login form:

Note that it is not possible to log into Akamas before a license has been installed. Read here how to Install an Akamas license.

Use a proxy server

The Akamas CLI supports interacting with the API server through an HTTP/HTTPS proxy server.

To enable access via an HTTP proxy, set the environment variable HTTP_PROXY. In the following snippet, replace proxy_ip and proxy_port with the desired values.

export HTTP_PROXY="http://<proxy_ip>:<proxy_port>"

Then, run the akamas command to verify access.

akamas status debug

Access through an HTTPS proxy can be set by using the environment variable HTTPS_PROXY instead of HTTP_PROXY.
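
A minimal sketch of the HTTPS variant follows. The proxy address and port are made-up placeholders, and the http:// scheme of the proxy URL is an assumption that depends on how your proxy is exposed.

```shell
# Hypothetical proxy coordinates; replace with your own.
PROXY_IP="10.0.0.5"
PROXY_PORT="3128"

# Route HTTPS traffic through the proxy for the current shell session.
export HTTPS_PROXY="http://${PROXY_IP}:${PROXY_PORT}"

printf 'HTTPS_PROXY=%s\n' "$HTTPS_PROXY"
```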

Installing the toolbox

Akamas offers, as an additional container, a toolbox that contains the Akamas CLI executable, along with other useful command-line tools such as kubectl, Helm, vim, docker cli, jq, yq, git, gzip, zip, OpenSSH, ping, cURL, and wget. It can run alongside the Akamas services: in the same network for docker-compose installations, or in the akamas namespace for Kubernetes installations.

This toolbox aims to:

  • allow users to interact with Akamas without the need to install Akamas CLI on their systems

Docker compose installation

By setting the following options in the .env file, you can configure your toolbox by enabling SSH password authentication (only key-based authentication will be available otherwise) and by setting a login password:

.env
ALLOW_PASSWORD=true
CUSTOM_PASSWORD=yourPassword

To start the toolbox container just issue the following command:

docker compose --profile toolbox up -d

If you want the toolbox to keep running after a complete restart, add the following line to your .env file: COMPOSE_PROFILES=toolbox

Accessing the toolbox on Docker

To access the toolbox on Docker, issue the following command:

docker exec -it toolbox bash

You will be provided with a shell inside the toolbox where you can interact with Akamas. Read the Work directory section below for more information on how to persist scripts and data on the toolbox across restarts and upgrades.

Kubernetes installation

Follow the usual guide for installing Akamas on Kubernetes, adding the following variables to your akamas.yaml file or to the file values-files/my-values.yaml (which can be created if missing). Make sure to override toolbox.enabled, whose default value is false:

toolbox:
  enabled: true
  sshPassword:
    # enable SSH password authentication. If 'false', only key-based access
    # will be allowed
    enabled: false
    # configure the password for the toolbox user. If not provided, an
    # autogenerated password will be used
    override:

Then, you can launch the usual helm upgrade --install ... command to run the pod, as described in the Start the installation (online) or Start the installation (offline) sections.

Accessing the toolbox on Kubernetes

When it's deployed to Kubernetes, you may access this toolbox in two ways:

  • via kubectl

  • via SSH command

Kubectl access

Accessing is as simple as:

kubectl exec -it deployment/toolbox -- bash

SSH access

For this type of access, you need to retrieve the SSH login password (if enabled) or key. To fetch them, run the following commands:

# Get the password
kubectl exec deployment/toolbox -- cat /home/akamas/password
# Get the key
kubectl exec deployment/toolbox -- cat /home/akamas/.ssh/id_rsa

With this info, you can leverage the toolbox to run commands in your workflows, like in the following example:

name: hello-workflow
tasks:
  - name: Say Hello
    operator: Executor
    arguments:
      command: echo 'Hello Akamas'
      host:
        hostname: toolbox
        username: akamas
        password: d48020ab71be6a07

You can also access the toolbox by port-forwarding from your local machine (on port 2222 in our example). Run the following kubectl command:

kubectl port-forward service/toolbox 2222:22

On another terminal, run:

ssh akamas@localhost -p 2222

Answer yes to the host authenticity question, then enter the akamas password to access the toolbox over SSH (see the example below):

$ ssh akamas@localhost -p 2222
The authenticity of host '[localhost]:2222 ([127.0.0.1]:2222)' can't be established.
ED25519 key fingerprint is SHA256:34GXnmRz1YjWr2TTpUpJmRoHYck0NzeAxni2L857Exs.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '[localhost]:2222' (ED25519) to the list of known hosts.
akamas@localhost's password:
Welcome to Ubuntu 20.04.6 LTS (GNU/Linux 5.10.178-162.673.amzn2.x86_64 x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

This system has been minimized by removing packages and content that are
not required on a system that users do not log into.

To restore this content, you can run the 'unminimize' command.

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

akamas@toolbox-6dd8b7f898-8xwzf:~$
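
For key-based access, the private key has to be stored locally with restrictive permissions before ssh will accept it. The following is a minimal sketch: save_private_key is a hypothetical helper name, the demo uses placeholder content instead of a real key, and it assumes the key printed by the retrieval command above is the one accepted by the toolbox SSH server.

```shell
# Save a private key read from stdin into a file only the owner can access.
# Usage: <command printing the key> | save_private_key <target file>
save_private_key() {
  ( umask 077 && cat > "$1" ) && chmod 600 "$1"
}

# Demo with placeholder content; in practice, pipe in the output of
#   kubectl exec deployment/toolbox -- cat /home/akamas/.ssh/id_rsa
KEY_FILE="$(mktemp -d)/toolbox_id_rsa"
printf 'placeholder-key-material\n' | save_private_key "$KEY_FILE"

# With the port-forward still running, the login would then be:
#   ssh -i "$KEY_FILE" -p 2222 akamas@localhost
ls -l "$KEY_FILE"
```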

Work directory

A typical Kubernetes scenario has Akamas running in one namespace and the customer application running in another. In such a scenario, you will probably need an Akamas workflow (running in the akamas namespace) that applies a new configuration to the customer application (running in the customer namespace). Akamas then collects new metrics for a period of time and computes a new configuration based on the score of the previous one.

What follows is a typical workflow example that:

  • uses a FileConfigurator to create a new helm file that applies the new configuration computed by Akamas on a single service named adservice. The FileConfigurator recreates the adservice.yaml file from the template adservice.yaml.templ. Just make sure that adservice.yaml.templ contains namespace: boutique (the customer namespace, in our example)

  • uses an Executor that launches kubectl apply with the new helm file adservice.yaml you just saved to apply the new configuration

  • uses another Executor to wait for the new configuration to be actually rolled out by launching kubectl rollout status

  • waits for half an hour to observe the changes in metrics

If you need to store Akamas artifacts, scripts, or any other file that requires persistence, you can use the /work directory, which persists across restarts. This is the default working directory at login time.

The toolbox also aims to provide Akamas' workflows with an environment in which to run scripts and persist artifacts when no other options (e.g. a dedicated host) are available.

By default, SSH access to the toolbox is limited to a subset of internal services. In the Helm values file, you can configure toolbox.ingress with additional ingress rules.

Install the license

Running Akamas' studies requires a valid license.

To install a license get in touch with Akamas Customer Service to receive:

  • the Akamas license file

  • your "customer name" to configure in the variable AKAMAS_CUSTOMER for Docker installations or akamasCustomer for Kubernetes installations

  • the URL to configure in the AKAMAS_BASE_URL variable for Docker installations

  • login credentials

Once you have this information, you can issue the following commands:

cd <your bundle files location>
akamas install license <license file you have been provided>

To get the administrator's initial password for Kubernetes installations, run the following command:

kubectl get secret -n <NAMESPACE> akamas-admin-credentials -o go-template='{{.data.password | base64decode}}'

Manage anonymous data collection

Akamas might collect anonymized usage information on running optimizations. Collection and tracking are disabled by default and can be manually enabled.

Docker installation

External tracking is managed through the following environment variables:

  • AKAMAS_TRACKER_URL: the target URL for all tracking info.

  • AKAMAS_TRACKING_OPT_OUT: when set to 1, disables anonymous data collection.

Tracking for a running instance can be enabled by editing the AKAMAS_TRACKING_OPT_OUT variable in the docker-compose.yaml file.

To enable tracking set the variable to the following value:

AKAMAS_TRACKING_OPT_OUT=0

Then issue the command:

docker compose up -d

Kubernetes installation

External tracking is managed through the field trackingOptOut in the Values file. To enable tracking set trackingOptOut to 0 as in the following example and upgrade the installation:

awsAccessKeyId: "YOUR_ACCESSKEY_ID"
awsSecretAccessKey: "YOUR_SECRET_ACCESS_KEY"

trackingOptOut: 0

Configure an external identity provider

To configure an external identity provider, access the Keycloak administration console exposed on the /auth page of your installation; for example, https://app.akamas.io/auth.

Now log into the Administration Console using the admin user. The password for such a user can be retrieved in different ways, depending on the installation method:

  • Kubernetes. A custom password can be specified during the installation by providing a value keycloak.adminPassword in the helm chart. If this value was left unspecified, you can retrieve the auto-generated password with the following command:

kubectl get secret keycloak-admin-credentials \
  -o go-template='{{ .data.KEYCLOAK_ADMIN_PASSWORD | base64decode }}'

Note that you might need to provide the namespace in which Akamas has been installed using the flag -n <namespace>.

  • Docker. A custom password can be specified during the installation by providing a value for the variable KEYCLOAK_ADMIN_PASSWORD in the environment or in the docker-compose file. If you didn't specify a value during the installation, you can retrieve the auto-generated password with the following command:

docker exec -it keycloak cat /config/keycloak_admin | cut -d '|' -f1

Once logged in, select the akamas realm from the dropdown menu and navigate to the Identity providers section.

From here on, the steps required to proceed with the configuration vary depending on the provider you are integrating with. Select yours from the guides below:

Azure Active Directory
Google

Azure Active Directory

This page provides a walkthrough to configure Azure Active Directory as an external identity provider for Akamas users.

You will need an Azure account with the Application.ReadWrite.All permission, required to create app registrations in your Azure AD tenant.

Configure the App registration

Multiple Akamas instances can share the same app registration. This implies that any AD user added to a registration could access all the associated Akamas instances.

If you need to manage access with finer granularity, create a dedicated registration for each Akamas installation.

Create a new App registration

  • a name for the application

  • the account type that best suits your use case

Complete the creation by clicking on "Register".

Get the client configuration

On the "Overview" page of the application, note the following values:

  • Application (client) ID

  • OpenID Connect metadata document (found in the "Endpoints" side panel)

Furthermore, in the "Certificates & secrets" section, create a new Client secret and note its value.

With these values, we can now complete the provider configuration in the Keycloak console.

Create the Identity provider

Here, specify an alias for the client ("microsoft" in our case) and optionally the display name used in the login page ("Microsoft").

In the "OpenID Connect settings" section, configure the following fields using the values from the app registration in the Azure portal:

  • "Discovery endpoint": populate with the URL of the "OpenID Connect metadata document". This box should become green upon successful validation.

  • "Client ID": populate with the "Application (client) ID" from the app's overview page

  • "Client Secret": populate with the value of the generated secret

Complete the configuration by clicking "Add". You will land on the detail page of the new provider: here, copy the value of the redirect URI.

Complete the app registration

Back in the app registration in the Azure portal, navigate to the "Authentication" section. Add the "Web" platform (if not already present).

Finally, add to the list of redirect URIs the one from the previous step.

You have now configured Akamas to delegate your users' login to Azure.

When changing the hostname of the Akamas installation, you need to update the redirect URI configured in the app registration. Skipping this step will block any login attempt with the following error:

The redirect URI 'https://...' specified in the request does not match the redirect URIS configured for the application '...'.

Configure the default Akamas roles

The final setup step is to instruct Akamas to associate the default roles with the users automatically. This way, users will be added to the default workspace with read and write permissions the first time they log in.

On the Keycloak console, on the provider's details page, navigate to "Mappers":

Now, add the following configurations.

User role

  • Name: User role

  • Mapper type: Hardcoded role

  • Role: USER

Default Workspace Read

  • Name: Default Workspace Read

  • Mapper type: Hardcoded role

  • Role: WS_ac8481d3-d031-4b6a-8ae9-c7b366f027e8_R

Default Workspace Write

  • Name: Default Workspace Write

  • Mapper type: Hardcoded role

  • Role: WS_ac8481d3-d031-4b6a-8ae9-c7b366f027e8_W

Test the integration

Visit the installation's login page to check that the new authentication method is displayed and works correctly.

Google

This page provides a walkthrough to configure Google as an external identity provider for Akamas users.

You will need a Google account with the privileges required to create app registrations.

Configure the App registration

Configure the Consent Screen

If the "Credentials" page displays a warning box reminding you to configure the consent screen, you first need to create an app. Click the enclosed button to start the wizard.

Create the OAuth client

Click the "Create Credentials" link on top, and select "OAuth Client ID".

Configure the client as follows:

  • "Application Type": select "Web application"

  • "Name": populate with the name of the new client

  • "Authorized redirect URIs": leave it blank, as you will fill it in a later step

Once you click "Create" the console will show you a confirmation pop-up containing the client's configuration. Note the Client ID and the Client Secret.

Create the Identity provider

Configure the following fields using the values from the OAuth client you just created:

  • "Client ID": fill in the id of the client

  • "Client Secret": fill in the secret of the client

To complete the configuration, note the "Redirect URI" value and click "Add".

Complete the app registration

If you change the hostname of the Akamas installation, you need to update the redirect URI configured in the OAuth client (or add the new one) for the integration to keep working.

Configure the default Akamas roles

The final setup step is to instruct Akamas to associate the default roles with the users automatically. This way, users will be added to the default workspace with read and write permissions the first time they log in.

On the Keycloak console, on the provider's details page, navigate to "Mappers":

Now, add the following configurations.

User role

  • Name: User role

  • Mapper type: Hardcoded role

  • Role: USER

Default Workspace Read

  • Name: Default Workspace Read

  • Mapper type: Hardcoded role

  • Role: WS_ac8481d3-d031-4b6a-8ae9-c7b366f027e8_R

Default Workspace Write

  • Name: Default Workspace Write

  • Mapper type: Hardcoded role

  • Role: WS_ac8481d3-d031-4b6a-8ae9-c7b366f027e8_W

Test the integration

Visit the installation's login page to check that the new authentication method is displayed and works correctly.

To integrate Akamas with your Active Directory, you first need a dedicated App registration in your Azure Organization. If you want to use an existing registration, skip to Get the client configuration; to create a new one, refer to the following sub-section.

To create a new registration, navigate to "App registrations" in your Azure portal, select "New registration", and specify the following:

Access the Identity providers section for the "akamas" realm in the Keycloak administration console, as described on the page Configure an external identity provider, and select "OpenID Connect v1.0" to start creating the new provider.

To integrate Akamas with your Google Workspace, you first need a project with a dedicated OAuth client. Log in to your Google Developer Console, and navigate to the "Credentials" page of "API & Services".

Follow the wizard to configure the consent screen according to your company's policies. For more details, refer to Configure the OAuth consent screen on the official documentation.

Once the configuration is complete, return to the "Credentials" page.

Access the Identity providers section for the "akamas" realm in the Keycloak administration console, as described on the page Configure an external identity provider, and select "Google" to start creating the new provider.

Back in the Google Developer Console, on the "Credentials" page, open the newly created client and add the URI from the previous step to the list of "Authorized redirect URIs".

Audit logs

Akamas audit logs

Akamas stores all its logs into an internal Elasticsearch instance: some of these logs are reported to the user in the GUI in order to ease the monitoring of workflow executions, while other logs are only accessible via CLI and are mostly used to provide more context and information to support requests.

Audit access can be performed by using the CLI in order to extract logs related to UI or API access. For instance, to extract audit logs from the last hour use the following commands:

  • UI Logs

akamas logs --no-pagination -S kong -f -1h

  • API Logs

akamas logs --no-pagination -S kong -f -1h

Notice: to visualize the system logs unrelated to the execution of workflows bound to workspaces, you need an account with administrative privileges.

Storing audit logs into files

To ease the integration with external logging systems, Akamas can be configured to store access logs into files. To enable this feature you should:

  1. Create a logs folder next to the Akamas docker-compose.yml file

  2. Edit the docker-compose.yml file by modifying the line FILE_LOG: "false" to FILE_LOG: "true"

  3. If Akamas is already running, issue the following command

docker compose up -d logstash

otherwise, start Akamas first.

When a user interacts with the UI or the API, Akamas writes detailed access logs both to the internal database and to a file in the logs folder. To ease log rotation and management, Akamas creates a new file every day, named according to the pattern access-%{+YYYY-MM-dd}.log.
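Since the file name follows a daily pattern, an external log shipper or a manual check needs to compute the name of the file for a given day. The following is a minimal sketch: the function name access_log_path is hypothetical, and it assumes that the date in the file name is rendered in UTC, which should be verified on your installation.

```bash
# Print the path of the access log file for a given day.
# Usage: access_log_path LOGS_DIR [YYYY-MM-DD]
# Assumption: when no day is given, the current UTC date is used.
access_log_path() {
  alp_dir="$1"
  alp_day="${2:-$(date -u +%Y-%m-%d)}"
  printf '%s/access-%s.log\n' "$alp_dir" "$alp_day"
}
```

For instance, tail -f "$(access_log_path ./logs)" would follow the current day's file, if it exists.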

Akamas logs

Akamas allows dumping log entries from a specific service, workspace, workflow, study, trial, and experiment, for a specific timeframe and at different log levels.

Akamas CLI for logs

Akamas logs can be dumped via the following CLI command:

akamas log

This command provides many filters which can be retrieved with the following command:

akamas log --help

which should return

Usage: akamas log [OPTIONS] [MESSAGE]

  Show Akamas logs

Options:
  -d, --debug                     Show extended error messages if present.
  --page-size INTEGER             Number of log lines to be retrieved NOTE:
                                  This argument is mutually exclusive with
                                  arguments: [dump, no_pagination].
  --no-pagination                 Disable pagination and print all logs NOTE:
                                  This argument is mutually exclusive with
                                  arguments: [dump, page_size].
  --dump                          Print the logs without pagination and
                                  formatting NOTE: This argument is mutually
                                  exclusive with arguments: [page_size,
                                  no_pagination].
  -f, --from [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S|%Y-%m-%dT%H:%M:%S.%f|%Y-%m-%d %H:%M:%S.%f|[-]nw|[-]nd|[-]nh|[-]nm|[-]ns]
                                  The start timestamp of the logs
  -t, --to [%Y-%m-%d|%Y-%m-%dT%H:%M:%S|%Y-%m-%d %H:%M:%S|%Y-%m-%dT%H:%M:%S.%f|%Y-%m-%d %H:%M:%S.%f|[-]nw|[-]nd|[-]nh|[-]nm|[-]ns]
                                  The end timestamp of the logs
  -s, --study TEXT                UUID or name of the Study
  -e, --exp INTEGER               Number of the experiment
  --trial INTEGER                 Number of the trial
  -y, --system TEXT               UUID or name of the System
  -W, --workflow TEXT             UUID or name of the Workflow
  -l, --log-level TEXT            Log level
  -S, --service TEXT              Akamas service
  --without-metadata              Hide metadata
  --sorting [ASC|DESC]            Sorting order of the timestamps
  -ws, --workspace TEXT           UUID or name of the Workspace to visualize.
                                  When empty, system logs will be returned
                                  instead
  --help                          Show this message and exit.
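The filters listed above can be combined in a single invocation. As a minimal sketch, the hypothetical helper below wraps a call that dumps the ERROR entries of one service over a recent time window; it only uses options shown in the help output, and the service name and window are placeholders to adapt.

```bash
# Dump the ERROR entries of one Akamas service over a recent time window.
# Usage: recent_errors SERVICE [WINDOW]   (WINDOW defaults to -1h, the last hour)
recent_errors() {
  re_service="$1"
  re_window="${2:--1h}"
  # --dump prints without pagination and formatting, which suits redirection to a file
  akamas log --dump -S "$re_service" -l ERROR -f "$re_window"
}
```

For instance, recent_errors campaign -1d > campaign-errors.txt would collect the errors of the campaign service over the last day.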

For example, to get the list of the most recent Akamas errors:

akamas log -l ERROR

which should return something similar to:

       timestamp                         system                  provider    service                                                                                   message
==============================================================================================================================================================================================================================================================
2022-05-02T15:51:26.88    -                                      -          airflow     Task failed with exception
2022-05-02T15:51:26.899   -                                      -          airflow     Failed to execute job 2 for task Akamas_LogCurator_Task
2022-05-02T15:56:29.195   -                                      -          airflow     Task failed with exception
2022-05-02T15:56:29.215   -                                      -          airflow     Failed to execute job 3 for task Akamas_LogCurator_Task
2022-05-02T16:01:55.587   -                                      -          license     2022-05-02 16:01:47.426 ERROR 1 --- [           main] c.a.m.utils.rest.RestHandlers            :  has failed with returning a response:
                                                                                        {"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following metrics: 'spark.spark_application_duration' were not found
                                                                                        in any of the components of the system 'analytics_cluster'","path":null}
2022-05-02T16:01:55.587   -                                      -          license     2022-05-02 16:01:47.434 ERROR 1 --- [           main] c.a.m.MigrationApplication               : Unable to complete operation. Mode: RESTORE. Cause: A request to a
                                                                                        downstream service CampaignService has failed: 400 : [{"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following
                                                                                        metrics: 'spark.spark_application_duration' were not found in any of the components of the system 'analytics_cluster'","path":null}]
2022-05-02T16:01:55.678   -                                      -          license     2022-05-02 16:01:47.434 ERROR 1 --- [           main] c.a.m.MigrationApplication               : Unable to complete operation. Mode: RESTORE. Cause: A request to a
                                                                                        downstream service CampaignService has failed: 400 : [{"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following
                                                                                        metrics: 'spark.spark_application_duration' were not found in any of the components of the system 'analytics_cluster'","path":null}]
2022-05-02T16:01:55.678   -                                      -          license     2022-05-02 16:01:47.426 ERROR 1 --- [           main] c.a.m.utils.rest.RestHandlers            :  has failed with returning a response:
                                                                                        {"httpStatus":400,"timestamp":"2022-05-02T16:01:47.413638","error":"Bad Request","message":"The following metrics: 'spark.spark_application_duration' were not found
                                                                                        in any of the components of the system 'analytics_cluster'","path":null}
2022-05-02T16:12:10.261   -                                      -          license     2022-05-02 16:05:53.209 ERROR 1 --- [           main] c.a.m.services.CampaignService           : de9f5ff9-418e-4e25-ae2c-12fc8e72cafc
2022-05-02T16:32:07.216   -                                      -          license     2022-05-02 16:31:37.330 ERROR 1 --- [           main] c.a.m.services.CampaignService           : 06c4b858-8353-429c-bacd-0cc56cc44634
2022-05-02T16:38:18.522   -                                      -          campaign    Internal Server Error: Object of class [com.akamas.campaign_service.entities.campaign.experiment.Experiment] with identifier
                                                                                        [ExperimentIdentifier(workspace=ac8481d3-d031-4b6a-8ae9-c7b366f027e8, study=de9f5ff9-418e-4e25-ae2c-12fc8e72cafc, id=2)]: optimistic locking failed; nested exception
                                                                                        is org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) :
                                                                                        [com.akamas.campaign_service.entities.campaign.experiment.Experiment#ExperimentIdentifier(workspace=ac8481d3-d031-4b6a-8ae9-c7b366f027e8,
                                                                                        study=de9f5ff9-418e-4e25-ae2c-12fc8e72cafc, id=2)]

Managing

This section covers the following topics on how to manage the Akamas Server:

  • Akamas logs

  • Audit logs

  • Install upgrades and patches

  • Monitor the Akamas Server

  • Backup & Recovery of the Akamas Server

Upgrade Akamas

The following sections describe the procedure to upgrade your Akamas instance.

Docker compose

Docker compose Configuration

mv docker-compose.yml docker-compose.yml.bak
curl -O https://s3.us-east-2.amazonaws.com/akamas/compose/3.4.0/docker-compose.yml

You can also point to a specific version. For example, to download the artifact for version 3.2.2:

curl -O https://s3.us-east-2.amazonaws.com/akamas/compose/3.2.2/docker-compose.yml

If you customized the old docker-compose.yml and those changes are still needed in the newer Akamas version, make sure to migrate them from docker-compose.yml.bak to the new docker-compose.yml.
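To spot the customizations that need migrating, you can compare the two files. The following is a minimal sketch with a hypothetical helper name; note that the output mixes your own customizations with the changes introduced by the new Akamas version, so it still has to be reviewed by hand.

```bash
# Show the differences between the previous and the new compose file.
# Usage: compose_changes [OLD_FILE] [NEW_FILE]
compose_changes() {
  cc_old="${1:-docker-compose.yml.bak}"
  cc_new="${2:-docker-compose.yml}"
  # diff exits with status 1 when the files differ, which is not an error here
  diff -u "$cc_old" "$cc_new" || [ "$?" -eq 1 ]
}
```

Run without arguments from the folder that contains both files, it compares docker-compose.yml.bak with docker-compose.yml.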

Then log in to AWS with the following command:

aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin 485790562880.dkr.ecr.us-east-2.amazonaws.com

If the login succeeds, then you can start the upgrade by running:

docker compose up -d

Wait a few minutes, then check that the Akamas services are up by running the command:

akamas status -d

The expected output should be like the following (repeat the command after a minute or two if the last line is not "OK" as expected):

Checking Akamas services on http://localhost:8000
service       status
=========================
analyzer      UP
campaign      UP
metrics       UP
optimizer     UP
orchestrator  UP
system        UP
telemetry     UP
license       UP
log           UP
users         UP
OK
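Repeating the check by hand can be automated with a small polling loop. The following is a minimal sketch: the function name and its defaults (10 attempts, 60 seconds apart) are arbitrary choices, and it relies only on the last line of the output being "OK", as in the example above.

```bash
# Poll `akamas status -d` until the last line of its output is "OK".
# Usage: wait_for_akamas [MAX_ATTEMPTS] [SLEEP_SECONDS]
wait_for_akamas() {
  wfa_max="${1:-10}"
  wfa_sleep="${2:-60}"
  wfa_try=1
  while [ "$wfa_try" -le "$wfa_max" ]; do
    # Keep only the last line of the report, stripped of any whitespace
    wfa_last="$(akamas status -d 2>/dev/null | tail -n 1 | tr -d '[:space:]')"
    if [ "$wfa_last" = "OK" ]; then
      return 0
    fi
    wfa_try=$((wfa_try + 1))
    sleep "$wfa_sleep"
  done
  echo "Akamas did not report OK after $wfa_max attempts" >&2
  return 1
}
```

For instance, wait_for_akamas 15 30 && echo "upgrade completed" would wait up to about seven minutes before giving up.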

Kubernetes

Online

Start by updating the local chart repository:

helm repo update akamas

Start online upgrade

Ensure your kubectl configuration points to the namespace where Akamas is installed or specify it with the --namespace parameter. To start the upgrade to the latest version:

helm upgrade akamas akamas/akamas

Offline

helm repo update akamas
helm pull akamas/akamas

Useful commands

Listing Akamas chart versions

Akamas' versions can be listed by running the following command:

helm search repo akamas/akamas --versions

It is always suggested to install or upgrade to the latest chart version. The App Version field refers to the Akamas version. To ease the release process, multiple chart versions may refer to the same App Version.

Retrieving the Values file

In case you do not have access to the Values file used during the last installation/upgrade, you can still get it by running:

helm get values akamas -o yaml > akamas-values.yaml

Such a command is useful only if you need to change some of the parameters during the upgrade, otherwise the old Values file is kept by Helm.
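The parameters mentioned in this page (a Values file, a chart version, a namespace) can be combined in one upgrade command. The following is a minimal sketch with a hypothetical wrapper name; it assumes the release and chart names used in this guide (akamas and akamas/akamas) and uses only the standard helm upgrade flags -f, --version and --namespace.

```bash
# Upgrade the "akamas" release with an explicit Values file, optionally
# pinning the chart version and the namespace.
# Usage: upgrade_akamas VALUES_FILE [CHART_VERSION] [NAMESPACE]
upgrade_akamas() {
  ua_values="$1"
  ua_version="${2:-}"
  ua_namespace="${3:-}"
  # Build the helm argument list step by step
  set -- upgrade akamas akamas/akamas -f "$ua_values"
  if [ -n "$ua_version" ]; then
    set -- "$@" --version "$ua_version"
  fi
  if [ -n "$ua_namespace" ]; then
    set -- "$@" --namespace "$ua_namespace"
  fi
  helm "$@"
}
```

For instance, upgrade_akamas akamas-values.yaml applies the edited Values file retrieved with the command above while upgrading to the latest chart version.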

If you plan to upgrade your Akamas instance, please verify the upgrade path with the Akamas support team. To ensure rollback in case of upgrade failure, it is suggested to back up your studies (see section User data backup).

To start the upgrade, on the Akamas server navigate to the folder where the docker-compose.yml and .env files are stored (see section Get Akamas Docker artifacts). Now you can download the compose file of the latest version, using the mv and curl commands listed under Docker compose Configuration above.

Ensure your .env file is up to date with the required variables, by comparing your version with the one at Configure Akamas environment variables.

The following guide uses the same chart repository and helm release names. Before starting the upgrade, you may find it helpful to look at the section Useful Commands.

You can specify an older chart version using the --version parameter. Refer to Listing Akamas chart versions for discovering the published chart versions.

If you need to specify a different Values file from the latest installation, start from the last one used. If you do not have it stored, it can be retrieved as specified in Retrieving the Values file.

Before starting the upgrade, check Configure the registry to add new docker images.

If you cannot reach helm.akamas.io from the machine where the installation will be run, run the commands listed under Offline above from another client (see the installation guide for a full explanation).

Then, you can start the upgrade in the same way as for the Online version. If you are using the downloaded chart package, transfer the package and replace akamas/akamas with the downloaded tgz archive.


Backup & Recovery of the Akamas Server

Akamas server backup

Backing up an Akamas server consists of two parts: system backup and user data backup. The backup itself can be performed in any way you see fit: everything is stored in regular files, so you can use any backup tool.

System backup

System services are hosted on an AWS ECR repository, so the only thing that fully defines a working Akamas application is the docker-compose.yml file. Backing up the Akamas application is as simple as copying this single file to your backup location. You may schedule a script that performs this copy weekly, or at any frequency you see fit.
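A script suitable for scheduling can be very small. The following is a minimal sketch: the function name, the backup folder layout and the timestamped file name are all illustrative choices, not an Akamas convention.

```bash
# Copy the compose file into BACKUP_DIR, adding a timestamp to its name,
# and print the path of the copy.
# Usage: backup_compose COMPOSE_FILE BACKUP_DIR
backup_compose() {
  bc_src="$1"
  bc_dir="$2"
  bc_dest="$bc_dir/docker-compose-$(date +%Y%m%d-%H%M%S).yml"
  mkdir -p "$bc_dir" && cp -p "$bc_src" "$bc_dest" && printf '%s\n' "$bc_dest"
}
```

For instance, a weekly scheduled job could call backup_compose ./docker-compose.yml /backup/akamas, where both paths are placeholders for your own locations.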

User data backup

You may list all existing Akamas studies via the Akamas CLI command:

akamas list study

Then you can export all existing studies one by one via the CLI command

akamas export study <UUID>

where UUID is the UUID of a single study. This command exports the study into a single archive file (tar.gz). These archive files can then be copied to your backup location.
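Exporting studies one by one lends itself to a loop. The following is a minimal sketch with hypothetical helper names. The extract_uuids helper simply picks every UUID-looking token from its input: whether the output of akamas list study contains only study UUIDs is an assumption to verify before relying on it.

```bash
# Extract every UUID-looking token from standard input, one per line, deduplicated.
# Assumption: all UUIDs found in the input identify studies.
extract_uuids() {
  grep -Eo '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' | sort -u
}

# Export each study whose UUID is passed as an argument; stop at the first failure.
# Usage: export_studies UUID...
export_studies() {
  for es_uuid in "$@"; do
    akamas export study "$es_uuid" || return 1
  done
}
```

For instance, export_studies $(akamas list study | extract_uuids) would export every study found in the listing, under the assumption stated above.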

Akamas server recovery

Akamas server recovery involves recovering the system backup, restarting the Akamas service then re-importing the studies.

System Restore

To restore the system you must recover the original docker-compose.yml then launch the command

docker compose up &

from the folder where you placed this YAML file and then wait for the system to come up, by checking it with the command

akamas status -d

User data restore

All studies can be re-imported one by one with the following CLI command (replace archive.tgz with the correct path of each archive):

akamas import study archive.tgz
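
When many archives have to be restored, the import can be wrapped in a loop. The following is a minimal sketch: the function name is hypothetical, and it assumes that all the exported archives sit in one folder with a .tgz or .tar.gz extension.

```bash
# Import every study archive (*.tgz or *.tar.gz) found in ARCHIVE_DIR,
# stopping at the first failure.
# Usage: import_studies ARCHIVE_DIR
import_studies() {
  is_dir="$1"
  for is_archive in "$is_dir"/*.tgz "$is_dir"/*.tar.gz; do
    # Skip the literal pattern left behind when a glob matches nothing
    [ -e "$is_archive" ] || continue
    akamas import study "$is_archive" || return 1
  done
}
```

For instance, import_studies /backup/akamas/studies would re-import all archives stored in that placeholder folder.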

Using

This section describes the main steps to optimize an application.

To optimize a new application on Akamas, you have to follow the four steps shown in the following picture and described in the next sections by means of a simple example.

As depicted in the picture above, to optimize a new application you should:

  • Create a system that models the key parts of your application (e.g. containers, runtimes, APIs) that will be involved in the optimization initiative.

  • Set up the integration with a monitoring tool via telemetry providers so that Akamas can gather metrics about the performance of your application.

  • Create a workflow that allows Akamas to configure your application (e.g. write a configuration file, relaunch a process).

  • Define the optimization study according to your goal and SLOs so that Akamas knows what you want to achieve.

These steps relate to how Akamas integrates with your environment, described in this section, and apply to both offline and live optimization studies.

Example Application

In the following sections, we will use a simple yet representative web application to illustrate how to onboard a new application on Akamas. The application is called Online Boutique. It is a microservices application composed of 11 microservices that allow users to browse items, add them to the cart, and purchase them in an online store.

Suppose that we are about to deploy a major upgrade to one of the microservices, the Ad Service, that handles the advertisement logic, and we want to reduce the costs of running this service while meeting our SLO on the response time given an increasing number of users.

As shown in the diagram below, our service is built in Java, deployed as a pod in a Kubernetes cluster, and exposes an API using a service. The whole platform is monitored with Dynatrace.

You can now proceed to the first step, creating the system to model this application.

If your technology stack or optimization need does not fit this example, take a look at the Optimization Guide section, where you can find many optimization scenarios for different use cases.

Monitor the Akamas Server

External tools

You can use any monitoring tool to check the availability of the Akamas instance.

Checking Akamas services

To check the status of the Akamas services, run akamas status -d and identify which service, if any, is not able to start up correctly.

Here is an example of output:

Checking Akamas services on http://localhost:8000
 service	 status
=========================
analyzer       	UP
campaign       	UP
metrics        	UP
optimizer      	UP
orchestrator   	UP
system         	UP
telemetry      	UP
license        	UP
log            	UP
users          	UP
OK
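An external monitoring tool usually needs a machine-readable answer rather than the full table. The following is a minimal sketch with a hypothetical helper name: it assumes the two-column layout shown above and treats any status other than "UP" as a problem, since the exact wording used for a failing service is not shown here.

```bash
# Read the output of `akamas status -d` from standard input and print the
# name of every service whose status column is not "UP".
down_services() {
  # Service rows have exactly two fields; the header row starts with "service"
  awk 'NF == 2 && $1 != "service" && $2 != "UP" { print $1 }'
}
```

For instance, akamas status -d | down_services prints nothing when all services are up, so a non-empty result can be used to raise an alert.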

Workflow

The third step in optimizing a new application is to create a workflow to instruct Akamas on the actions required to apply a configuration to the target application.

A workflow defines the actions that must be executed to evaluate the performance of a given configuration. These actions usually depend on the application architecture, technology stack, and deployment practices, which might vary between environments and organizations (e.g. deploying a microservice application in a staging environment on Kubernetes and performing a load test might be very different from applying an update to a monolith running in production).

If you are using GitOps practices and deployment pipelines, you are probably already familiar with most of the elements used in Akamas workflows. Workflows can also trigger existing pipelines and re-use all the automation already in place.

Workflows are not tightly coupled to a study and can be re-used across studies and systems so you can change the optimization scope and target without the need to re-create a specific workflow.

Creating the workflow for Online Boutique

The workflow that we will create to allow Akamas to evaluate the configurations comprises the following actions:

  1. Create a deployment file from a template

  2. Apply the file via kubectl command

  3. Wait for the deployment to be ready

  4. Start the load test via locust APIs

Even if the integrations of this workflow are specific to the technology used by our test application (e.g. using kubectl CLI to deploy the application), the general structure of the workflow could fit most of the applications subject to offline optimization in a test environment.
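Before wiring commands into a workflow, it can help to run them by hand on the target host and check that they behave as expected. The following is a minimal sketch of steps 2 and 3 of the list above: the function name is hypothetical, while the kubectl invocations, the manifest path and the deployment name in the usage note are the ones used in this example.

```bash
# Apply a manifest, then wait for the rollout of a deployment to complete.
# Usage: apply_and_wait MANIFEST DEPLOYMENT [TIMEOUT]   (TIMEOUT defaults to 3m)
apply_and_wait() {
  aw_manifest="$1"
  aw_deployment="$2"
  aw_timeout="${3:-3m}"
  # The rollout check runs only if the manifest was applied successfully
  kubectl apply -f "$aw_manifest" &&
    kubectl rollout status --timeout="$aw_timeout" deployment "$aw_deployment"
}
```

For instance, apply_and_wait /work/boutique/boutique.yaml ak-adservice returns a non-zero status if the deployment is not ready within three minutes, which is the condition that would make the corresponding workflow task fail.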

Here is the YAML definition of the workflow described above.

Save it to a file named, as an example, workflow.yaml and then issue the creation command:

Here is what the workflow looks like in the UI:

Steps to setup a new optimization
Online Boutique Ad microservice deployment
User Role map

Akamas provides several general-purpose and specialized workflow operators that allow users to perform common actions, such as running a command on a Linux instance via SSH, as well as to integrate enterprise tools such as LoadRunner to run performance tests or Spark to launch Big Data analysis. More information and usage examples are on the Workflow Operators reference page.

The structure of the workflow heavily depends on deployment practices and the kind of optimization. In our example, we are dealing with a microservice application deployed in a test environment, which is tested by injecting some load using Locust, a popular open-source performance testing tool.

You can find more workflow examples for different use cases on the Optimization Guides section and references to technology-specific operators (e.g. LoadRunner, Spark) on the Workflow Operators reference page.

In this workflow, we used two operators: the FileConfigurator operator, which creates a configuration file starting from a template by inserting the configuration values decided by Akamas, and the Executor operator, which runs a command on a remote instance (named mgmserver in this case) via SSH.

name: Configure and Test Online Boutique
tasks:
  # 1 - Create a deployment file from a template
  - name: Configure Online Boutique
    operator: FileConfigurator
    arguments:
      source:
        hostname: mgmserver
        username: akamas
        password: ******
        path: /work/boutique/boutique.yaml.templ
      target:
        hostname: mgmserver
        username: akamas
        password: *******
        path: /work/boutique/boutique.yaml
 
  # 2 - Apply the file via the kubectl command
  - name: Apply new configuration to the Online Boutique
    operator: Executor
    arguments:
      host:
        hostname: mgmserver
        username: akamas
        password: *******
      command: kubectl apply -f /work/boutique/boutique.yaml
  
  # 3 - Wait for the deployment to be ready
  - name: Check Online Boutique is up
    operator: Executor
    arguments:
      retries: 0
      host:
        hostname: mgmserver
        username: akamas
        password: *******
      command: kubectl rollout status --timeout=3m deployment ak-adservice 
  
  # 4 - Start the load test via locust APIs
  - name: Start Locust Test
    operator: Executor
    arguments:
      host:
        hostname: mgmserver
        username: akamas
        password: *******
      command: bash /work/boutique/run-test.sh

akamas create workflow workflow.yaml

Offline Study

Offline optimization studies are typically used to optimize systems in pre-production environments, with respect to planned and what-if scenarios that cannot be directly run in production. Scenarios include new application releases, planned technology changes (e.g. new JVM or DB), cloud migration or new provider, expected workload growth, and resilience under failure scenarios (from chaos engineering).

The following figure represents the iterative process associated with offline optimizations:

The following 5 phases can be identified for each iteration (also known as experiment):

  1. Recommend Conf: Akamas AI engine identifies the configuration for the next iteration until a termination condition for the study is met (e.g. number of experiments).

Thanks to its patented AI (reinforcement learning) algorithms, Akamas can find the optimal configuration without having to explore all the possible configurations.

Trials

For each experiment, Akamas allows multiple trials to be executed. A trial is a repetition of the same experiment to reduce the impact of noise on the result of an experiment.

Environments can be noisy for several reasons such as:

  • External conditions (e.g. background jobs, "noisy neighbors" in the cloud)

  • Measurement errors (e.g. monitoring tools not always 100% accurate)

This approach is consistent with scientific and engineering practices, where the strategy to minimize the impact of noise is to repeat the same experiment multiple times.
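To see why repetition helps, consider combining the scores of several trials of the same experiment into a single value. The following is a minimal sketch that uses a plain arithmetic mean: this is an illustrative assumption, not a statement of how Akamas aggregates trials, and the function name is hypothetical.

```bash
# Print the arithmetic mean of the trial scores passed as arguments,
# rounded to two decimals.
# Usage: mean_score SCORE...
mean_score() {
  [ "$#" -gt 0 ] || return 1
  printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}
```

For instance, with three trials measuring a response time of 102, 98 and 100 milliseconds, mean_score 102 98 100 prints 100.00, a value less affected by noise than any single measurement.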

Steps

An offline optimization study can include multiple steps.

Typically there are at least two steps:

  • Baseline step: a single experiment that is run by applying the already deployed configuration before the Akamas optimization is applied - the results of this experiment are used as a reference (baseline) for assessing the optimization and as such is a mandatory step for each study

  • Optimize step: a defined number of experiments used to identify the optimal configuration by leveraging Akamas AI.

Other steps are:

  • Bootstrap step: imported experiments from other optimization studies

  • Preset step: a single experiment with a defined configuration

The steps to be executed can be specified when defining an offline optimization study.

Commands

User Interface

The Akamas UI shows offline optimization studies in a specific top-level menu.

The details and results of an offline optimization study are displayed when drilling down (there are multiple tabs and sections).

Telemetry

After modeling the system and its components, the following step is to set up the telemetry. Telemetry is essential to provide Akamas with enough data to evaluate a configuration both in terms of goal (e.g. reducing the cost) and constraints (e.g. meeting SLOs).

To instruct Akamas about the location of the data sources and how to access them, you can create a telemetry instance for your system. A telemetry instance comprises the following properties:

  • Name: An optional unique name within the system to quickly identify it.

  • Provider: The name of the telemetry provider that will be used to gather metrics.

  • Config: Additional configuration options that depend on the provider (e.g. a URL to reach the observability tool or the location of a CSV file to import); refer to each provider's reference for more information.

A system can include multiple telemetry instances from different providers (e.g. in case you need to extract some information from Dynatrace and others from a CSV file).

Components and Telemetry

Each telemetry provider supports a unique set of properties that depends on the specific data source which allows Akamas to map each component to one or more entities in the observability tool and extract the right metrics for that particular technology.

Creating a telemetry instance for the Online Boutique

In this file, we specified the URL and the token required to authenticate to our Dynatrace instance.

Save it to a file named, as an example, instance.yaml and then issue the command.

As described in the section above, telemetry instances are coupled to a specific system. For this reason, we had to provide the name of the system Online Boutique as an argument to the create command.

Here is how the telemetry instance looks in the UI.

Mapping Components

Akamas needs to be informed that the component named Adservice used in the system maps to a specific entity in Dynatrace that represents the container running in the Kubernetes cluster.

Recalling the definition of the Adservice component in the system we see that it contains a set of properties starting with the dynatrace keyword. These properties are used by the Dynatrace telemetry provider to map the component to the correct entity and import metrics such as CPU usage and throttling that can be used to gather information about the performance of such components.

Offline optimization studies are optimization studies where the workload is simulated by leveraging a load-testing tool.

  2. Apply configuration: Akamas applies the parameter configuration (one or more parameters) to the target system by leveraging a set of workflow operators

  3. Apply workload: Akamas triggers a workload on the target system by also leveraging a set of workflow operators

  4. Collect KPIs: Akamas collects the metrics related to the target system - only those metrics that are specified by each telemetry instance defined in the system

  5. Score vs goal: Akamas scores the applied parameter configuration against the defined goal and constraints - the score is the value of the goal function

An offline optimization study is a resource that can be managed via CLI using the resource management commands.

Akamas can gather metrics from many data sources, from industry-standard observability platforms (e.g. Prometheus or Dynatrace) to simple CSV files. This is done via telemetry providers that contain all the logic and information required to correctly extract the metrics and map them to the components of your system. You can take a look at the available telemetry providers in the documentation reference.

Telemetry instances alone do not provide information on which metrics should be extracted from the data source and to which component they map. As briefly introduced in the system section, this is the job of the component properties.

As we introduced at the beginning of this section, we chose to use Dynatrace to monitor our application. To instruct Akamas to gather metrics from this data source, you just need to create the following file.

For a complete definition of the properties available for the Dynatrace provider, as well as other providers, you can take a look at the telemetry reference documentation section.

If Dynatrace is not your observability platform of choice, take a look at the telemetry provider section, where you can find many other telemetry providers for different observability tools and common integration strategies like CSV files.

optimization studies
parameters
workflow operators
workflow operators
metrics
telemetry instance
goal and constraints
provider: Dynatrace
name: Staging Environment
config:
  url: https://mydyn87510.live.dynatrace.com/
  token: dt0c01.JQG73....  
akamas create telemetry-instance instance.yaml "Online Boutique"
name: Adservice
description: The adservice of the online boutique by Google
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: akamas-demo
      containerName: server
      basePodName: ak-adservice-*
documentation reference
system section
Dynatrace
telemetry reference
telemetry provider
resource management commands.

Analyzing results of live optimization studies

Even for live optimization studies, it is a good practice to analyze how the optimization is being executed with respect to the defined goal & constraints, and workloads.

This analysis may provide useful insights about the system being optimized (e.g. understanding of the system dynamics) and about the optimization study itself (e.g. how to adjust optimizer options or change constraints). Since this is more challenging for an environment that is being optimized live, a common practice is to adopt a recommendation mode before possibly switching to a fully autonomous mode.

The Akamas UI displays the results of a live optimization study in the following areas:

  • The Metrics section (see the following figures) displays the behavior of the metrics as configurations are recommended and applied (possibly after being reviewed and approved by users); this area supports the analysis of how the optimizer is driven by the configured safety and exploration factors.

  • The All Configurations section provides the list of all the recommended configurations, possibly as modified by the user, as well as the details of each applied configuration (see the following figures).

  • In the case of a recommendation mode, the Pending configuration section (see the following figure) shows the configuration that is being recommended, so that users can review it (see the EDIT toggle) and approve it.

Metrics section of a live optimization study
From the metrics chart displaying configurations (toggle on) to a specific configuration
The list of configurations applied over time in the All Configurations section
A specific configuration from the All Configurations section
Pending configuration

Windowing

A critical aspect of evaluating the performance of an application is making sure that the data we use is accurate. It's quite common for IT systems to experience transient periods of instability; these might occur in many situations, such as filling up caches, runtime compilation activities, horizontal scaling, and more.

A common practice, in performance engineering, is to exclude from the analysis the initial and final part of a performance test to consider only the time when the system is in full operation. Akamas can automatically identify a subset of the whole data to evaluate scores and constraints.

Looking at the example below, from the Online Boutique application, we see that the response time has an initial spike to about 7ms and then stabilizes below 1ms; also the CPU utilization shows a similar pattern.

This is quite common, for example, in Java-based systems, where activities like heap resizing and just-in-time compilation take place in the first minutes of operation. In this case, Akamas evaluated the experiment using only the gray area, effectively avoiding the impact of the initial spike.

This behavior can be configured in the study by specifying a section called windowing. Two windowing policies allow you to properly configure Akamas in different scenarios.

The windowing section in the study definition is optional and the default policy considers all the available data to evaluate the performance of the experiment.
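As a rough illustration of the idea of excluding the initial and final part of a test, the following Python sketch trims a fixed amount of time from both ends of a series of samples before averaging. It is not how Akamas implements windowing; the function name, the sample data, and the trim amounts are all hypothetical.

```python
from datetime import timedelta


def trim_window(samples, trim_start, trim_end):
    """Keep only the samples that fall inside the trimmed window.

    samples: list of (seconds_from_start, value) tuples, sorted by time.
    trim_start / trim_end: timedelta to discard at the start and at the end.
    """
    if not samples:
        return []
    lower = samples[0][0] + trim_start.total_seconds()
    upper = samples[-1][0] - trim_end.total_seconds()
    return [(t, v) for t, v in samples if lower <= t <= upper]


# Hypothetical response-time samples (ms), one per minute, with a warm-up spike.
samples = [(0, 7.0), (60, 4.2), (120, 1.1), (180, 0.9), (240, 0.8), (300, 0.9)]

# Discard the first 2 minutes and the last minute, then average what is left.
window = trim_window(samples, timedelta(minutes=2), timedelta(minutes=1))
mean_rt = sum(v for _, v in window) / len(window)
```

With the data above, only the samples taken between minute 2 and minute 4 contribute to the average, so the warm-up spike no longer affects the result.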

Study

Now that Akamas knows about your application, how to configure it, and how to monitor it, the final step is to define your optimization study.

The study defines the objective of the optimization activity. It contains information about what we want to achieve (e.g. reduce costs, improve latency), the parameters that can be optimized, and any SLO that should not be breached by the optimized configuration.

Studies are divided into two main categories:

The setup of both types of study is similar, as both consist of the following core elements:

  • Name: A unique identifier that can be used to identify different studies.

  • System: The name of the system that we want to optimize.

  • Workflow: The name of the workflow that will be used to configure the application.

  • Goal: The objective of the optimization (e.g. minimize cost, maximize throughput, reduce latency).

  • Parameter Selection: A list of parameters that will be tuned in the optimization (e.g. container memory and CPU limits, EC2 instance family).

  • Steps: The flow of the optimization study (e.g. assessing the baseline performance, optimizing the system, restoring the configuration).

Goal

The goal defines the objective of our optimization. Specifying a goal is as simple as defining the metric we want to optimize and the direction of the optimization such as maximizing throughput or minimizing cost. If you want to optimize more complex scenarios or lack a single metric that represents your objective you can also specify a formula and define a goal such as minimizing memory and CPU utilization.

Metrics are identified within a study with the notation component.metric_name, where component is the name of a component of the system linked to the study and metric_name is the name of a metric. As an example, the CPU utilization of a container might be identified by MyContainer.cpu_util.
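To make the notation concrete, here is a minimal Python sketch that splits a component.metric_name reference and looks up its value. The helper names and the nested dictionary of values are hypothetical and only serve to illustrate the naming scheme; they are not part of Akamas.

```python
def split_reference(reference):
    """Split 'component.metric_name' into (component, metric_name)."""
    component, sep, name = reference.partition(".")
    if not sep or not component or not name:
        raise ValueError(f"expected 'component.metric_name', got {reference!r}")
    return component, name


def lookup(reference, values):
    """Return the value for a reference from a {component: {metric: value}} dict."""
    component, name = split_reference(reference)
    return values[component][name]


# Hypothetical collected values for one component.
values = {"MyContainer": {"cpu_util": 0.42}}
cpu = lookup("MyContainer.cpu_util", values)
```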

Parameter Selection

The parameter selection contains the list of parameters that are subject to the optimization process. These might include several components and layers, as in the following example.

Similarly to metrics, parameters are identified with the notation component.parameter_name.

Optionally, you can also specify a range of values that can be assigned to the parameter. This is very useful when you want to evaluate a specific optimization area or want to add some context to the optimization (e.g. avoid setting a memory greater than 8GB because it's not available on the system).

The parameter selection can include any component and parameter of the system. During the optimization process, Akamas will provide values for those parameters and apply them to the system using the workflow provided in the study definition.

Steps

If the goal describes where we are heading, steps describe the road to get there. Usually, when optimizing an application, we want to assess its performance before the tuning activity to evaluate the benefits; this initial assessment is called the Baseline. Then, we want to run the optimization process for a definite number of iterations; this is called an Optimization step. Many other use cases can be achieved by providing additional steps to the study. Some of these include:

  • Re-using knowledge gathered by other optimization studies

  • Applying the baseline configuration to the test environment after the optimization has ended

  • Evaluating a specific configuration suggested by the user

Optimizing the Online Boutique

As shown in the image below, you can use the study creation wizard in the UI to specify all the required information.

If you prefer to define it via YAML you can use the following file.

name: Reduce Costs
system: Online Boutique
workflow: Configure and Test Boutique

goal:
  objective: minimize
  function:
    formula: Adservice.cost
  constraints:
    absolute:
      - name: response_time
        formula: Apis.requests_response_time <= 20

parametersSelection:
  - name: Adservice.cpu_limit
    domain: [150, 1000]
  - name: Adservice.memory_limit
    domain: [64, 2048]
  - name: AdserviceJVM.jvm_maxRAMPercentage
  - name: AdserviceJVM.jvm_gcType

steps:
  - name: baseline
    type: baseline
    values:
      Adservice.cpu_limit: 500
      Adservice.memory_limit: 1024
      AdserviceJVM.jvm_maxRAMPercentage: 25

  - name: optimize
    type: optimize
    numberOfExperiments: 30

Save it to a file named, as an example, study.yaml and then issue the command

akamas create study study.yaml

This study's definition contains three main parts.

The goal

The parameters selection

The steps

This final section instructs Akamas to first assess the performance and costs of the current configuration, which we will refer to as the baseline, then run 30 experiments by changing the parameters to optimize the goal.

You can now start your optimization study and wait for Akamas to find the best configuration!
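The response-time constraint in the study above is expressed on the actual value of the metric. The following Python sketch contrasts that kind of check with one expressed relative to the baseline (for instance, "do not increase the response time by more than 10%"). The function names and the sample numbers are hypothetical, and this is only an illustration of the two ideas, not of how Akamas evaluates constraints.

```python
def absolute_ok(value, limit):
    """Absolute constraint: the metric itself must stay within the limit."""
    return value <= limit


def relative_ok(value, baseline, max_increase):
    """Relative constraint: the metric may grow by at most max_increase
    (e.g. 0.10 for 10%) with respect to the baseline value."""
    return value <= baseline * (1 + max_increase)


# Hypothetical response times in ms for the baseline and a candidate configuration.
baseline_rt = 12.0
candidate_rt = 14.0

meets_slo = absolute_ok(candidate_rt, 20)                     # 14 <= 20
within_10_pct = relative_ok(candidate_rt, baseline_rt, 0.10)  # 14 <= 13.2
```

In this example the candidate satisfies the absolute limit of 20 ms but violates the 10% relative limit, which shows why the two kinds of constraint can lead to different decisions.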

Parameters and constraints

Akamas supports four types of parameters:

  • Integer parameters are those that can only assume an integer value (e.g. the number of cores on a VM instance).

  • Real parameters can assume real values (e.g. 0.2) and are mostly used when dealing with percentages.

  • Categorical parameters map those elements that do not have a strict ordering such as GC types (e.g. Parallel, G1, Serial) or booleans.

  • Ordinal parameters are similar to categorical ones, as they also support a set of literal values, but those values are ordered. An example is VM instance size (e.g. small, medium, large, xlarge).
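The four parameter types above can be modeled, as a minimal sketch, with a few Python dataclasses. The class names, field names, and example parameters are hypothetical and are not the Akamas data model; they only show how the types differ in which values they accept.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IntegerParam:
    name: str
    low: int
    high: int

    def accepts(self, value) -> bool:
        # bool is a subclass of int in Python, so exclude it explicitly.
        return (isinstance(value, int) and not isinstance(value, bool)
                and self.low <= value <= self.high)


@dataclass(frozen=True)
class RealParam:
    name: str
    low: float
    high: float

    def accepts(self, value) -> bool:
        return (isinstance(value, (int, float)) and not isinstance(value, bool)
                and self.low <= value <= self.high)


@dataclass(frozen=True)
class CategoricalParam:
    name: str
    categories: Tuple[str, ...]  # no ordering is implied

    def accepts(self, value) -> bool:
        return value in self.categories


@dataclass(frozen=True)
class OrdinalParam:
    name: str
    levels: Tuple[str, ...]  # ordered from smallest to largest

    def accepts(self, value) -> bool:
        return value in self.levels

    def rank(self, value) -> int:
        # Position in the ordering, usable to compare two levels.
        return self.levels.index(value)


cores = IntegerParam("cores", 1, 64)
ratio = RealParam("ratio", 0.0, 1.0)
gc_type = CategoricalParam("gc_type", ("Parallel", "G1", "Serial"))
size = OrdinalParam("instance_size", ("small", "medium", "large", "xlarge"))
```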

Most of the time you do not need to define parameters yourself, as this information is already provided by the Optimization Packs.

When creating new optimization studies you should first select a set of parameters to include in the optimization process. The set might depend on many factors such as:

  • The potential impact of a parameter on the defined goal (e.g. if my goal is to reduce the cost of running an application it might be a good idea to include parameters related to resource usage).

  • The layers selected for the optimization. Optimizing multiple layers at the same time might bring more benefits as the configurations of both layers are aligned.

  • The ability of Akamas to change those parameters (e.g. if my deployment process does not support the definition of some parameters because, as an example, they are managed by an external group, I should avoid adding them).

Domains

Besides defining the set of parameters users can also select the domain for the optimization and add a set of constraints.

Optimization packs already include information on the possible values for a parameter, but in some situations it is necessary to shrink this range. As an example, the parameter that defines the amount of CPU that a container can use (cpu_limit) might vary a lot depending on the underlying cluster and the application. If the cluster that hosts the application only contains nodes with up to 10 CPUs, it might be worth limiting the domain of this parameter for the optimization study to that value, to avoid failures when deploying the application and to speed up the optimization process. If you forget to set this domain restriction, Akamas will learn it by itself, but it will need to try to deploy a container with a higher CPU limit to find out that this is not possible.

Constraints

In many situations, parameters depend on each other. As an example, suppose you want to optimize at the same time the size of a container and the Java runtime that executes the application inside of it. Both layers have parameters that affect how much memory can be used: for the container layer this parameter is called memory_limit, and for the JVM it is called jvm_heap_size. Configurations that have a jvm_heap_size value higher than the memory_limit might lead to out-of-memory errors.

You can define this relationship by specifying a constraint as in the example below:

parameterConstraints:
  - name: Heap should be lower than the container memory limit
    formula: container.memory_limit > jvm.jvm_heap_size + 50

This constraint instructs Akamas to avoid generating configurations that bring the jvm_heap_size parameter too close to the memory_limit, leaving a gap of more than 50 MB between the two.
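The effect of such a constraint can be sketched in Python as a filter over candidate configurations. This only mirrors the formula shown above on hypothetical candidate values; it is not how Akamas generates or prunes configurations internally.

```python
def satisfies_heap_constraint(config, gap=50):
    """Mirror of: container.memory_limit > jvm.jvm_heap_size + 50."""
    return config["container.memory_limit"] > config["jvm.jvm_heap_size"] + gap


# Hypothetical candidate configurations (same unit for both parameters).
candidates = [
    {"container.memory_limit": 1024, "jvm.jvm_heap_size": 512},   # ample gap
    {"container.memory_limit": 1024, "jvm.jvm_heap_size": 1000},  # gap too small
    {"container.memory_limit": 512, "jvm.jvm_heap_size": 462},    # gap is exactly 50
]

valid = [c for c in candidates if satisfies_heap_constraint(c)]
```

Because the formula uses a strict inequality, the third candidate, whose gap is exactly 50, is rejected along with the second one.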

Optimization Guides

What do you want to do with Akamas?

The simplest policy is called trim and allows users to specify how much time should be excluded from the evaluation from the start and the end of the experiment. It is also possible to apply the trim policy to a specific task of the workflow. This policy can be easily used when, for example, the time required to deploy the application might change. You can read more on this policy in the reference documentation section.

In other contexts, discarding the initial warmup period is not enough. For these scenarios, Akamas supports a more advanced policy, called stability. This policy is also particularly useful for stress tests where our objective is to make the system sustain as much load as possible before becoming unstable, as it allows users to express constraints on the stability of the system. You can read more on this policy in the reference documentation section.

Offline Studies are, generally, executed in test environments where the workload of the application is generated using a load-testing tool. You can read more here.

Live Studies are, usually, executed in production environments. You can read more here.

The system and the workflow, already introduced in the previous sections, are referenced in the study definition to provide Akamas with information on how to apply the parameters (through the workflow) and retrieve the metrics (through the telemetry instances in the system) that are used to calculate the goal.

Another important, although optional, element of the goal is the definition of constraints on other metrics of the system: in many cases optimizing a system involves finding a tradeoff between multiple aspects, and goal constraints can be used to map SLOs and inform Akamas about other aspects of our system that we want to safeguard during the optimization (e.g. reducing the amount of CPU assigned to a container might reduce the cost of running the system but increase its response time). Constraints can be used to specify, as an example, an upper limit to the response time or the memory utilization of the system. You can find more information on how to specify constraints in the reference documentation section.

You can find more information on the steps in the reference documentation section.

Besides the goal, parameter selection, and steps, the study can be enriched with other, optional, elements that can be used to better tailor it to your specific needs. These include, for example, automated windowing and parameter constraints. You can find more information on these optional elements in the specific subsections or read the entire study definition in the reference documentation section.

Recalling our application example introduced in this section, our optimization objective is to reduce the costs of running the Ad service while reaching our SLO on the response time.

In this section, we instruct Akamas that we want to minimize the cost of the Adservice, and we have added a constraint to the optimization. In particular, we added a constraint on the value of the metric requests_response_time of the Apis component, which must not exceed 20 ms. This is an absolute constraint, as it's defined on the actual value of the metric and can easily map an SLO. You can also express constraints like "do not make the response time increase more than 10%" by using relative constraints. You can find more info on the supported constraint types in the reference documentation section.

In this section, we defined which parameters Akamas can change to achieve its goal. We decided to include parameters both from the JVM and the container layers to let Akamas tune all of them accordingly. We also specified a custom domain, for a couple of parameters, to allow Akamas to explore only values within those ranges. Note that this is an optional step, as Akamas already knows about the range of possible values of many parameters. You can find more info on available parameters and guidelines to choose them in different use cases in the optimization guides section.

One of the key elements that define an optimization study is the parameter set. We have already seen in the study section how to define the set of optimized parameters; here we dig deeper into this topic.

You can read more on parameters and how they are managed in the reference documentation section.

Constraints usually depend on the set of parameters chosen for the optimization. You can find more information about common constraints for the supported technologies in the documentation of the related optimization pack or the optimization guides.

Optimize resources and costs, while preserving application performance and reliability

Optimize application performance and reliability, while avoiding resource and cost wastage

Kubernetes microservices

Offline optimizations

Live optimizations

Optimizing cost of a Kubernetes microservice while preserving SLOs with performance tests

Optimizing cost of a Java microservice on Kubernetes while preserving SLOs with performance tests

Optimizing cost of a Kubernetes microservice while preserving SLOs in production

Optimizing cost of a Java microservice on Kubernetes while preserving SLOs in production

Optimizing cost of a Node.js application with performance tests

COMING SOON! Please reach out to us at support@akamas.io if interested.

Optimizing cost of a Golang application with performance tests

COMING SOON! Please reach out to us at support@akamas.io if interested.

Optimizing cost of a .NET application with performance tests

COMING SOON! Please reach out to us at support@akamas.io if interested.

Application runtime

Offline optimizations

Optimizing cost of a Kubernetes microservice while preserving SLOs in production

In this example, you will use Akamas live optimization to minimize the cost of a Kubernetes deployment, while preserving application performance and reliability requirements.

Prerequisites

In this example, you need:

  • an Akamas instance

  • a Kubernetes cluster, with a deployment to be optimized

  • the kubectl command installed in the Akamas instance, configured to access the target Kubernetes cluster and with privileges to get and update the deployment configurations

  • a supported telemetry data source (e.g. Prometheus or Dynatrace) configured to collect metrics from the target Kubernetes cluster

Optimization setup

Optimization packs

This example leverages the following optimization packs:

System

The system represents the Kubernetes deployment to be optimized (let's call it "frontend"). You can create a system.yaml manifest like this:

name: frontend
description: Kubernetes frontend deployment

Create the new system resource:

akamas create system system.yaml

The system will then have two components:

  • A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits

  • A Web Application component, which contains service-level metrics like throughput and response time

In this example, we assume the deployment to be optimized is called frontend, with a container named server, and is located within the boutique namespace. We also assume that Dynatrace is used as a telemetry provider.

Kubernetes component

Create a component-container.yaml manifest like the following:

name: container
description: Kubernetes container, part of the frontend deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: frontend-*

Then run:

akamas create component component-container.yaml frontend

Now create a component-webapp.yaml manifest like the following:

name: webapp
description: The service related to the frontend deployment
componentType: Web Application
properties:
  dynatrace:
    id: <TELEMETRY_DYNATRACE_WEBAPP_ID>

Then run:

akamas create component component-webapp.yaml frontend

Workflow

The workflow in this example is composed of four main steps:

  1. Update the Kubernetes deployment manifest with the Akamas recommended deployment parameters (CPU and memory limits)

  2. Apply the new parameters (kubectl apply)

  3. Wait for the rollout to complete

  4. Sleep for 30 minutes (observation interval)

Create a workflow.yaml manifest like the following:

name: frontend
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml.templ
      target:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml

  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl apply -f frontend.yaml

  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl rollout status --timeout=5m deployment/frontend -n boutique;

  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800

Then run:

akamas create workflow workflow.yaml

Telemetry

Create a telemetry.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>
  pushEvents: false

Then run:

akamas create telemetry-instance telemetry.yaml frontend

Study

In this live optimization:

  • the goal is to reduce the cost of the Kubernetes deployment. In this example, the cost is based on the amount of CPU and memory limits (assuming requests = limits).

  • the approval mode is set to manual, and a new recommendation is generated daily

  • to avoid impacting application performance, constraints are specified on desired response times and error rates

  • to avoid impacting application reliability, constraints are specified on peak resource usage and out-of-memory kills

  • the parameters to be tuned are the container CPU and memory limits (we assume requests=limits in the deployment file)

Create a study.yaml manifest like the following:

name: frontend
system: frontend
workflow: frontend
requireApproval: true

goal:
  objective: minimize
  function:
    formula: (((container.container_cpu_limit/1000) * 3) + (container.container_memory_limit/(1024*1024*1024)))
  constraints:
    absolute:
      - name: Response Time
        formula: webapp.requests_response_time <= 300
      - name: Error Rate
        formula: webapp.service_error_rate:max <= 0.05
      - name: Container CPU saturation
        formula: container.container_cpu_util:p95 < 0.8
      - name: Container memory saturation
        formula: container.container_memory_util:max < 0.7
      - name: Container out-of-memory kills
        formula: container.container_oom_kills_count == 0

parametersSelection:
  - name: container.cpu_limit
    domain: [300, 1000]
  - name: container.memory_limit
    domain: [800, 1536]

windowing:
  type: trim
  trim: [5m, 0m]
  task: observe

workloadsSelection:
  - name: webapp.requests_throughput

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 48
    values:
      container.cpu_limit: 1000
      container.memory_limit: 1536

  - name: optimize
    type: optimize
    numberOfTrials: 48
    numberOfExperiments: 100
    numberOfInitExperiments: 0
    maxFailedExperiments: 50

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI for Live optimizations.
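To get a feel for the goal formula used in this study, the following Python sketch reproduces it for two configurations. It assumes, based on the divisors in the formula, that the CPU limit is expressed in millicores and the memory limit in bytes; the function name and the example values are hypothetical.

```python
GIB = 1024 * 1024 * 1024


def deployment_cost(cpu_limit_millicores, memory_limit_bytes, cpu_weight=3.0):
    """Mirror of the study goal: (cpu_limit / 1000) * 3 + memory_limit / 1024^3.

    Assumes CPU in millicores and memory in bytes, so that the result is
    (weighted cores) + (GiB of memory).
    """
    return (cpu_limit_millicores / 1000) * cpu_weight + memory_limit_bytes / GIB


MIB = 1024 * 1024

# Configuration matching the baseline step: 1000 millicores, 1536 MiB.
baseline_cost = deployment_cost(1000, 1536 * MIB)

# A smaller configuration at the lower end of the study domains: 300 millicores, 800 MiB.
candidate_cost = deployment_cost(300, 800 * MIB)
```

Under these assumptions, the baseline evaluates to 4.5 and the smaller configuration to roughly 1.68, which is the kind of reduction the optimizer looks for as long as all the constraints remain satisfied.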

Optimizing performance of a Node.js application with V8 runtime tuning leveraging performance tests

Optimizing performance of a Java application with JVM tuning leveraging performance tests

Kubernetes
Web application

Optimize cost of a Java microservice on Kubernetes while preserving SLOs in production

In this guide, you optimize the cost (or resource footprint) of a Java microservice running on Kubernetes. The study tunes both pod resource settings (CPU and memory requests and limits) and JVM options (max heap size, garbage collection algorithm, etc.) at the same time, while also taking into account your application performance and reliability requirements (SLOs). This optimization happens in production, leveraging Akamas live optimization capabilities.

Prerequisites

  • an Akamas instance

  • a Kubernetes cluster, with a Java-based deployment to be optimized

  • a way to apply configuration changes recommended by Akamas to the target deployment. In this guide, Akamas interacts directly with the Kubernetes APIs via kubectl. You need a service account with permissions to update your deployment (see below for other integration options)

Optimization setup

In this guide, we assume the following setup:

  • the Kubernetes deployment to be optimized is called adservice (in the boutique namespace)

  • in the deployment, there is a container named server, where the application JVM runs

  • Dynatrace is used as an observability tool

Let's set up the Akamas optimization for this use case.

System

For this optimization, you need the following components to model the adservice tech stack:

Let's start by creating the system, which represents the Kubernetes deployment to be optimized. To create it, write a system.yaml manifest like this:

name: adservice
description: The Adservice deployment

Then run:

akamas create system system.yaml

Now create a component-container.yaml manifest like the following:

name: server
description: Kubernetes container in the adservice deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: adservice-*

Notice the component includes properties that specify how Dynatrace telemetry will look up this container in the Kubernetes cluster (the same will happen for the following components).

These properties are dependent upon the telemetry provider you are using.

Then run:

akamas create component component-container.yaml adservice

Next, create a component-jvm.yaml manifest like the following:

name: jvm
description: JVM of the adservice deployment
componentType: java-openjdk-17
properties:
  dynatrace:
    type: PROCESS
    tags:
      akamas: adservice-jvm

Then run:

akamas create component component-jvm.yaml adservice

Now create a component-webapp.yaml manifest like the following:

name: webapp
description: The HTTP service of the adservice deployment
componentType: Web Application
properties:
  dynatrace:
    type: SERVICE
    name: adservice

Then run:

akamas create component component-webapp.yaml adservice

Workflow

To optimize a Kubernetes microservice in production, you need to create a workflow that defines how the new configuration recommended by Akamas will be deployed in production.

Let's explore the high-level tasks required in this scenario and the options you have to adapt it to your environment:

1) Update the Kubernetes deployment configuration

The first step is to update the Kubernetes deployment with the new configuration. This can be done in several ways depending on your environment and processes:

  • A simple option is to let Akamas directly update the deployment leveraging the Kubernetes APIs via kubectl commands

  • Another option is to follow an Infrastructure-as-code approach, where the configuration change is managed via pull requests to a Git repository, leveraging your pipelines to deploy the change in production

2) Wait for the new deployment to be rolled out in production

In a live optimization, Akamas needs to understand when the new deployment rollout is complete and whether it was completed successfully or not. This is key information for Akamas AI to observe and optimize your applications safely.

This task can be done in several ways depending on how you manage changes, as discussed in the previous task:

  • A simple option is to use the kubectl rollout command to wait for the deployment rollout completion. This is the approach used in this guide

  • Another option is to follow an Infrastructure-as-code approach, where a change is managed via pull requests to a Git repository, leveraging your pipelines to deploy in production. In this situation, the deployment process is executed externally and is not controlled by Akamas. Hence, the workflow task will periodically poll the Kubernetes deployment to recognize when the new deployment has landed in production
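For the second option, the periodic polling can be sketched in Python as a loop with a timeout. The check itself is passed in as a function, because how you detect that the new deployment has landed depends on your environment; the function names, intervals, and the fake check below are hypothetical and only illustrate the control flow.

```python
import time


def wait_for_rollout(is_rolled_out, timeout_s=300.0, interval_s=10.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll is_rolled_out() until it returns True or the timeout expires.

    Returns True if the rollout was observed, False on timeout.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_rolled_out():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Fake check standing in for a real query against the cluster:
# it reports success on the third poll.
polls = {"count": 0}


def fake_check():
    polls["count"] += 1
    return polls["count"] >= 3


landed = wait_for_rollout(fake_check, timeout_s=5, interval_s=0,
                          sleep=lambda seconds: None)
```

Returning False on timeout, rather than waiting indefinitely, lets the calling task report a failed rollout, which matters because Akamas needs to know whether the deployment completed successfully.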

3) Observe how the application behaves with the new configuration

In a live optimization, Akamas simply needs to wait for a given observation interval, while the application works in production with the new configuration. Telemetry metrics will be collected during this observation period and will be analyzed by Akamas AI to recommend the next configuration.

A 30-minute observation interval is recommended for most situations.

Let's now create a workflow.yaml manifest like the following:

name: adservice
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
        path: adservice.yaml.templ
      target:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
        path: adservice.yaml

  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
      command: kubectl apply -f adservice.yaml

  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: toolbox
        username: akamas
        password: <your-toolbox-password>
      command: kubectl rollout status --timeout=5m deployment/adservice -n boutique;

  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800

Then run:

akamas create workflow workflow.yaml

Telemetry

To collect metrics of your target Kubernetes deployment, you create a telemetry instance based on your observability setup.

Create a telemetry.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>

Then run:

akamas create telemetry-instance telemetry.yaml adservice

Study

It's now time to create the Akamas study to achieve your optimization objectives.

Let's explore how the study is designed by going through the main concepts. The complete study manifest is available at the bottom.

Goal

Your overall objective is to reduce the cost (or resource footprint) of a Kubernetes deployment. To do that, you need to define the goal, which is a metric (or combination of metrics) representing the deployment cost to be minimized.

There are different approaches to measuring the cost of Kubernetes deployments:

  • A simple approach is to consider that Kubernetes allocates infrastructure resources based on pod resource requests (CPU and memory). Hence, the cost of a deployment can be derived from the deployment aggregate CPU and memory requests. In this guide, we use this approach and define the study goal as the sum of CPU and memory requests of the container to be optimized

  • Alternatively, the cost of a Kubernetes deployment can also be collected from external data sources that provide actual cost metrics like OpenCost. In this case, the study goal can be defined by leveraging the cost metric

Notice that weighting factors can be used in the goal formula to specify the importance of CPU vs memory resources. For example, the cloud price of 1 CPU is about 9 times that of 1 GB of RAM. You can customize those weights based on your requirements so that Akamas knows how to truly reach the most cost-efficient configuration in your specific context.
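As a rough illustration of how the weights shape the goal, the sketch below reproduces the arithmetic of the goal formula used in the study manifest of this guide (weights 29 and 3). The function name is hypothetical, and the units (CPU in millicores, memory in bytes) are an assumption inferred from the divisors in that formula.

```python
# Weights taken from the goal formula in the study manifest of this guide.
CPU_WEIGHT = 29  # weight applied to each CPU core
MEM_WEIGHT = 3   # weight applied to each GiB of memory


def deployment_cost(cpu_millicores: float, memory_bytes: float) -> float:
    """Weighted cost: CPU expressed in cores plus memory expressed in GiB."""
    cpu_cores = cpu_millicores / 1000          # assumption: metric is in millicores
    memory_gib = memory_bytes / 1024 / 1024 / 1024  # assumption: metric is in bytes
    return cpu_cores * CPU_WEIGHT + memory_gib * MEM_WEIGHT


# A container with 1 CPU and 2 GiB of memory
baseline_cost = deployment_cost(1000, 2 * 1024**3)
# The same container after halving the CPU
smaller_cpu_cost = deployment_cost(500, 2 * 1024**3)
```

With these weights, halving the CPU of this container lowers the cost far more than halving its memory would, which is the behavior the weighting is meant to encode.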

Constraints

When optimizing for cost reduction (or resource footprint), it's key not to impact application response time or introduce risks of availability and reliability issues. To ensure this, you can define your performance and reliability requirements (SLOs) as metric constraints.

In this study:

  • to ensure application performance, constraints are specified on application response times and error rate

  • to ensure application reliability, constraints are specified on:

    • container peak CPU and memory utilization, and container out-of-memory kills

    • JVM garbage collection time %, to prevent out-of-memory in the JVM heap memory

Parameters

To achieve cost-efficient and reliable Java-based microservices, Kubernetes container resources and JVM runtime options must be configured optimally and tuned jointly, as they are heavily interconnected.

To do that, the study includes the following parameters:

  • Kubernetes container: CPU and memory requests and limits

  • JVM: heap size and garbage collection (GC) algorithms

The study also includes parameter constraints to ensure that recommended configurations are safe and comply with best practices. In particular:

  • Kubernetes container memory limit must be higher than JVM heap size, plus a buffer to account for JVM off-heap memory usage

  • CPU limits must be at most 2x CPU requests, to avoid excessive over-commitment of CPU limits in the cluster
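To make the two parameter constraints concrete, here is a minimal sketch that checks a candidate configuration against them. It only illustrates the rules and is not how Akamas evaluates constraints internally. The function name and argument names are hypothetical, while the 1000 buffer and the 2x ratio come from the study manifest of this guide.

```python
HEAP_BUFFER = 1000       # off-heap safety buffer used in the study manifest
MAX_CPU_LIMIT_RATIO = 2  # CPU limit may be at most 2x the CPU request


def violated_constraints(cpu_request, cpu_limit, memory_limit, max_heap):
    """Return the names of the parameter constraints a configuration violates."""
    violations = []
    # Memory limit must exceed the JVM heap plus the off-heap buffer
    if not (max_heap + HEAP_BUFFER < memory_limit):
        violations.append("JVM off-heap safety buffer")
    # CPU limit must not over-commit the request by more than the allowed ratio
    if not (cpu_limit <= cpu_request * MAX_CPU_LIMIT_RATIO):
        violations.append("CPU limit at most 2x of requests")
    return violations
```

For example, `violated_constraints(500, 1000, 2048, 1024)` returns an empty list, while a 1024 heap inside a 1536 memory limit is rejected because it leaves less than the required buffer.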

Notice that the parameters and constraints can change depending on your policies. For example, it is a best practice to set memory requests == limits to avoid pod eviction. In this case, you only include memory requests in the study and set limits to the same value in the deployment file.

Workload

Akamas live optimization considers the application's workload to recommend new configurations that are optimal for the goal (e.g. reduce cost) while meeting all metric constraints (e.g., latency and error rates).

For Kubernetes microservices, the workload is typically the throughput (requests/sec) of the microservice API endpoints. This is the approach used in this guide.

Approval mode and recommendation frequency

In this live optimization, the manual approval is set to required, meaning that Akamas will ask for user approval when a new configuration gets generated. Once you approve it, the workflow will be executed, and the new configuration will be deployed to production according to the integration strategy you have defined above.

You can set it to false to enable fully autonomous optimization: in this case, as soon as a new configuration gets generated, the workflow will be executed without any human involvement.

The recommendation frequency can be chosen by leveraging the numberOfTrials parameter. As the workflow duration is set to 30 minutes, in order to have a new configuration generated daily, set the number of trials to 48.
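The arithmetic behind this value can be sketched as follows, assuming each trial corresponds to one run of the workflow (one 30-minute observation interval). The function name is hypothetical.

```python
from datetime import timedelta


def trials_per_recommendation(recommendation_period: timedelta,
                              workflow_duration: timedelta) -> int:
    """Number of workflow runs that fit in one recommendation period."""
    return recommendation_period // workflow_duration


daily = trials_per_recommendation(timedelta(days=1), timedelta(minutes=30))
twice_a_day = trials_per_recommendation(timedelta(hours=12), timedelta(minutes=30))
```

Under the same assumption, a recommendation every 12 hours would correspond to 24 trials.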

You can now create a study.yaml manifest like the following:

name: adservice - optimize costs tuning K8s and JVM
system: adservice
workflow: adservice

goal:
  name: Cost
  objective: minimize
  function:
    formula: ((server.container_cpu_limit)/1000)*29 + ((((server.container_memory_limit)/1024)/1024)/1024)*3
  constraints:
    absolute: 
      - name: Application response time degradation
        formula: web_application.requests_response_time:max <= 5
      - name: Application error rate degradation
        formula: web_application.requests_error_rate:max <= 0.02
      - name: Container CPU saturation
        formula: server.container_cpu_util_max:p95 < 1
      - name: Container memory saturation
        formula: server.container_memory_util_max:max < 1
      - name: Container out-of-memory
        formula: server.container_restarts == 0
      - name: JVM heap saturation
        formula: jvm.jvm_gc_time:max < 0.05

windowing:
  type: trim
  trim: [2m, 0s]
  task: observe

parametersSelection:
  - name: server.cpu_request
    domain: [10, 181]
  - name: server.cpu_limit
    domain: [10, 181]
  - name: server.memory_request
    domain: [16, 2048]
  - name: jvm.jvm_maxHeapSize
    domain: [16, 1024]
  - name: jvm.jvm_gcType

parameterConstraints:
  - name: JVM off-heap safety buffer
    formula: jvm.jvm_maxHeapSize + 1000 < server.memory_limit
  - name: CPU limit at most 2x of requests
    formula: server.cpu_limit <= server.cpu_request * 2

workloadsSelection:
  - name: web_application.requests_throughput

numberOfTrials: 48
steps:
  - name: baseline
    type: baseline
    values:
      server.cpu_limit: 1000
      server.memory_limit: 2048
      jvm.jvm_maxHeapSize: 1024
      jvm.jvm_gcType: Serial

  - name: optimize
    type: optimize
    numberOfExperiments: 21

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI.

Artifact templates

To quickly set up this optimization, download the Akamas template manifests and update the values file to match your needs. Then, create your optimization using the Akamas scaffolding.

akamas-templates-optimize-costs-k8s-jvm-live.tgz

A supported telemetry data source configured to collect metrics from the target Kubernetes cluster (see the documentation for the full list)

A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits (from the Kubernetes optimization pack)

A Java OpenJDK component, which contains JVM-level metrics like heap memory usage and parameters to be tuned like the garbage collector algorithm (from the Java OpenJDK optimization pack)

A Web Application component, which contains service-level metrics like throughput and response time of the microservice (from the Web application optimization pack)

In this guide, we take the first option and use the kubectl patch command to configure the new deployment. These commands are executed from the toolbox, an Akamas utility that can be enabled in an Akamas installation on Kubernetes. Make sure that kubectl is configured correctly to connect to your Kubernetes cluster and can update your target deployment. See the documentation for more details.

Optimizing a sample Java OpenJDK application

Environment setup

The test environment includes the following instances:

  • Akamas: instance running Akamas

  • PageRank: instance running the PageRank benchmark and the Prometheus monitoring service

Telemetry Infrastructure setup

To gather metrics about PageRank we will use Prometheus and a JMX exporter. Here’s the scrape job to add to the Prometheus configuration to extract the metrics from the exporter:

- job_name: jmx-exporter
  static_configs:
    - targets: ['pagerank.akamas.io:5556']
      labels:
        instance: jvm

Application and Test tool

To run and monitor the benchmark, the PageRank instance needs the Renaissance jar and the JMX exporter agent, plus a configuration file for the exporter.

Here’s the snippet of code to configure the instance as required for this guide:

mkdir renaissance; cd renaissance
wget -O renaissance.jar https://github.com/renaissance-benchmarks/renaissance/releases/download/v0.10.0/renaissance-gpl-0.10.0.jar
wget -O jmx_exporter.jar https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.14.0/jmx_prometheus_javaagent-0.14.0.jar
echo -e '---\nwhitelistObjectNames: ["java.lang:*"]' > conf.yaml

Optimization setup

In this section, we will guide you through the steps required to set up the optimization on Akamas.

System

System pagerank

Here’s the definition of the system we will use to group our components and telemetry instances for this example:

name: pagerank
description: A system to tune the pagerank benchmark

To create the system run the following command:

akamas create system pagerank.yaml

Component jvm

Here’s the definition of the component:

name: jvm
componentType: openjdk-11
properties:
  prometheus:
    instance: jvm
    job: jmx-exporter

To create the component in the system run the following command:

akamas create component jvm.yaml pagerank

Workflow

The workflow used for this study consists of two main stages:

  • generate the configuration file containing the tested Java parameters

  • run the execution using previously written parameters

Here’s the definition of the workflow:

name: run-pagerank
tasks:
  - name: Configure parameters
    operator: FileConfigurator
    arguments:
      source:
        hostname: pagerank.akamas.io
        username: ubuntu
        path: /home/ubuntu/renaissance/java_opts.template
        key: key
      target:
        hostname: pagerank.akamas.io
        username: ubuntu
        path: /home/ubuntu/renaissance/java_opts
        key: key

  - name: Run benchmark
    operator: Executor
    arguments:
      command: "cd renaissance; java -javaagent:./jmx_exporter.jar=5556:conf.yaml $(cat java_opts) -jar renaissance.jar -r 2 page-rank"
      host:
        hostname: pagerank.akamas.io
        username: ubuntu
        key: key

Where the configuration template java_opts.template is defined as follows:

 ${jvm.jvm_gcType} ${jvm.jvm_maxHeapSize} ${jvm.jvm_newSize} ${jvm.jvm_survivorRatio} ${jvm.jvm_maxTenuringThreshold}
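To make the role of the template concrete, the sketch below mimics the placeholder substitution in Python. It is not the FileConfigurator implementation: the `render` helper is hypothetical, and the option strings are only plausible examples of what each parameter might expand to for a given configuration.

```python
import re

TEMPLATE = (
    "${jvm.jvm_gcType} ${jvm.jvm_maxHeapSize} ${jvm.jvm_newSize} "
    "${jvm.jvm_survivorRatio} ${jvm.jvm_maxTenuringThreshold}"
)

# Hypothetical rendered values for one configuration (assumption, for illustration)
VALUES = {
    "jvm.jvm_gcType": "-XX:+UseG1GC",
    "jvm.jvm_maxHeapSize": "-Xmx2000m",
    "jvm.jvm_newSize": "-XX:NewSize=350m",
    "jvm.jvm_survivorRatio": "-XX:SurvivorRatio=8",
    "jvm.jvm_maxTenuringThreshold": "-XX:MaxTenuringThreshold=15",
}


def render(template: str, values: dict) -> str:
    """Replace every ${component.parameter} placeholder with its value."""
    return re.sub(r"\$\{([^}]+)\}", lambda match: values[match.group(1)], template)


java_opts = render(TEMPLATE, VALUES)
```

The resulting string is what the benchmark command reads back through `$(cat java_opts)`.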

To create the workflow run the following command:

akamas create workflow workflow.yaml

Telemetry

The following is the definition of the telemetry instance that fetches metrics from the Prometheus service:

provider: Prometheus
config:
  address: pagerank.akamas.io
  port: 9090

To create the telemetry instance in the system run the following command:

akamas create telemetry-instance prometheus.yaml pagerank

This telemetry instance will be able to bind the fetched metrics to the related jvm component thanks to the prometheus attribute we previously added in its definition.

Study

The goal of this study is to find a JVM configuration that minimizes the peak memory used by the benchmark.

The optimized parameters are the maximum heap size, the garbage collector used and several other parameters managing the new and old heap areas. We also specify a constraint stating that the GC regions can’t exceed the total heap available, to avoid experimenting with parameter configurations that can’t start in the first place.

Here’s the definition of the study:

name: Optimize PageRank
description: Tweaking the JVM parameters to optimize the page-rank benchmark.
system: pagerank
workflow: run-pagerank

goal:
  objective: minimize
  function:
    formula: memory_used
    variables:
      memory_used:
        metric: jvm.jvm_memory_used

parametersSelection:
  - name: jvm.jvm_gcType
  - name: jvm.jvm_maxHeapSize
    domain: [1250, 2000]
  - name: jvm.jvm_newSize
    domain: [350, 2000]
  - name: jvm.jvm_survivorRatio
  - name: jvm.jvm_maxTenuringThreshold

parameterConstraints:
  - name: Max heap must always be greater than new size
    formula: jvm.jvm_maxHeapSize > jvm.jvm_newSize

steps:
  - name: baseline
    type: baseline
    values:
      jvm.jvm_gcType: G1
      jvm.jvm_maxHeapSize: 2000

  - name: optimize
    type: optimize
    numberOfExperiments: 30

To create and run the study execute the following commands:

akamas create study study.yaml
akamas start study 'Optimize PageRank'

In this example study we’ll tune the parameters of PageRank, one of the benchmarks available in the Renaissance suite, with the goal of minimizing its memory usage. Application monitoring is provided by Prometheus, leveraging a JMX exporter.

  • The Renaissance jar

  • The JMX exporter agent, plus a configuration file to expose the required classes

If you have not installed the Java OpenJDK optimization pack yet, take a look at the Java OpenJDK optimization pack page to proceed with the installation.

We’ll use a component of type Java OpenJDK 11 to represent the JVM underlying the PageRank benchmark. To identify the JMX-related metrics in Prometheus the configuration requires the prometheus property for the telemetry service, detailed later in this guide.

Live Study

In cases where a testing environment is not available or it is hard to build representative load tests, Akamas can directly optimize production environments by running a Live Optimization study. Production environments differ from test environments in many ways. The main aspects that affect how Akamas can optimize the system in such a scenario, and that define live optimization studies, are the following:

  • Safety, in terms of application stability and performance, is critical in production environments where SLOs might be in place.

  • The approval process is usually different between production and lower-level environments. In many cases, a configuration change in a production environment must be manually approved by the SRE or Application team and follow a custom deployment scenario.

  • The workload on the application in a production environment is usually not controlled: it might change with the time of day, or due to special events or external factors.

These are the main factors that make live optimization studies differ from offline optimizations.

The following figure represents the iterative process associated with live optimizations:

The following 5 phases can be identified for each iteration:

  1. Recommend Conf: Akamas provides a recommendation for parameter configuration based on the observed behavior and leveraging the Akamas AI

  2. Human Approval: the recommendation is inspected, possibly revisited, and approved by users before being applied to the system. This step is optional and can be automated.

Overall the core process is very similar to the one of offline optimization studies. The main difference is the (optional) presence of a manual configuration review and approval step.

Safety

Even if the process is similar, the way recommended configurations are generated is quite different as it's subject to some safety policies such as:

  • The exploration factor defines the maximum magnitude of the change of a parameter from one configuration to the next (e.g. reducing the CPU limit of a container by at most 10%). As changes are smaller in magnitude, their effect on the system is also smaller. This leads to safer optimizations, as the optimization can better track changes in the core metrics. As a side effect, it might take more time for a live optimization to fully optimize a configuration when compared to an offline study.

  • The safety factor defines how tight the constraints defined in the study are. As the configuration changes, some metrics might approach a limit imposed by constraints. As an example, if we set a response time threshold of 300 ms, Akamas will keep track of how the response time changes due to the configuration changes and react to keep the constraint fulfilled. The safety factor influences how quickly Akamas reacts to approaching constraints.
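The effect of the exploration factor can be pictured with a small sketch. This is not the Akamas algorithm: it only shows the idea of bounding each step relative to the current value. The function name is hypothetical, and treating the factor as a symmetric relative bound is an assumption.

```python
def clamp_to_exploration_factor(current: float, proposed: float,
                                exploration_factor: float) -> float:
    """Limit a proposed parameter value to a relative step around the current one."""
    max_step = abs(current) * exploration_factor
    lower, upper = current - max_step, current + max_step
    return max(lower, min(proposed, upper))


# With a 10% factor, a jump from 1000 to 700 millicores is cut to a 10% reduction
next_cpu_limit = clamp_to_exploration_factor(1000, 700, 0.10)
```

Reaching 700 from 1000 would then take several iterations, which is why a live optimization can take longer to converge than an offline study.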

Workload

A key aspect of live optimization studies is the fact that the incoming workload of the application is not generated by a test script but by real users. This means that, after deploying a new configuration, the incoming workload might be different from the one used to evaluate the previous configuration. Nevertheless, the Akamas AI algorithm is capable of taking into account the differences in the incoming workload and fairly evaluating different configurations even if applied in different scenarios. As an example, the traffic of web applications exposed to the general public is usually different between workdays and weekends or working hours and nights.

To instruct Akamas to take into account changes that are not controlled by the deployment process you just need to specify the workloadsSelection parameter in the optimization study.

The workload selection should contain a list of metrics that are independent of the configuration and represent external factors that affect the performance of the configuration in terms of goals or constraints. Most of the time the application throughput is a good metric to use as a workload metric.

When one or more workload metrics are specified Akamas will take into account the differences in the workload and build clusters of similar workloads to identify repetitive working conditions for the application. It will then use this information to contextualize the evaluation of each configuration and provide a recommended configuration that fulfills the defined constraints on all the workload conditions seen by the optimization process.
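The idea of evaluating a configuration per workload context can be sketched as follows. The clustering Akamas performs is not described here, so this example swaps in plain equal-width binning of the throughput as a much simpler stand-in. All names and sample values are hypothetical.

```python
def group_by_workload(samples, n_contexts=3):
    """Group (throughput, response_time) samples into equal-width throughput bins."""
    throughputs = [throughput for throughput, _ in samples]
    low, high = min(throughputs), max(throughputs)
    # Fall back to a width of 1.0 when all samples share the same throughput
    width = (high - low) / n_contexts or 1.0
    contexts = {}
    for throughput, response_time in samples:
        index = min(int((throughput - low) / width), n_contexts - 1)
        contexts.setdefault(index, []).append(response_time)
    return contexts


def meets_constraint_everywhere(contexts, max_response_time):
    """A configuration is acceptable only if every workload context meets the limit."""
    return all(max(times) <= max_response_time for times in contexts.values())


# Hypothetical (requests/sec, response time in ms) observations
SAMPLES = [(10, 120), (12, 130), (50, 180), (55, 190), (95, 260), (100, 310)]
contexts = group_by_workload(SAMPLES)
```

In this sample the configuration looks fine under low and medium load but breaks a 300 ms limit in the high-load context, so it would not qualify as fulfilling the constraint on all observed workload conditions.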

User Interface

Live optimizations are separated from offline optimization studies and are available in the second entry on the left menu.

Live optimizations usually run for a longer period than offline optimizations, and their effect on the goal and the constraints is more gradual. For this reason, Akamas offers a specific UI that allows users to evaluate the progress of live optimizations and compare the different configurations applied by looking at the evolution of core metrics.

  • Collect KPIs: Akamas collects the metrics of the system required to observe its behavior under the current parameter configuration by leveraging the associated telemetry provider. Here Akamas also observes and categorizes the different workload contexts that are used to recommend configurations that are appropriate for each specific workload context.

  • Score vs Goal: Akamas scores the applied parameter configuration under the specific workload context against the defined goal and constraints.

  • Apply Conf: Akamas applies the recommended configuration by leveraging the defined workflow.

You can read more on safety policies in the related documentation section.

You can read more on the workloadsSelection parameter on the workload selection reference page.

Optimizing cost of a Kubernetes microservice while preserving SLOs in production

In this example, you will use Akamas live optimization to minimize the cost of a Kubernetes deployment, while preserving application performance and reliability requirements.

Prerequisites

In this example, you need:

  • an Akamas instance

  • a Kubernetes cluster, with a deployment to be optimized

  • the kubectl command installed in the Akamas instance, configured to access the target Kubernetes cluster and with privileges to get and update the deployment configurations

  • a supported telemetry data source (e.g. Prometheus or Dynatrace) configured to collect metrics from the target Kubernetes cluster

Optimization setup

Optimization packs

This example leverages the following optimization packs:

  • Kubernetes

  • Web application

System

The system represents the Kubernetes deployment to be optimized (let's call it "frontend"). You can create a system.yaml manifest like this:

name: frontend
description: Kubernetes frontend deployment

Create the new system resource:

akamas create system system.yaml

The system will then have two components:

  • A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits

  • A Web Application component, which contains service-level metrics like throughput and response time

In this example, we assume the deployment to be optimized is called frontend, with a container named server, and is located within the boutique namespace. We also assume that Dynatrace is used as a telemetry provider.

Kubernetes component

Create a component-container.yaml manifest like the following:

name: container
description: Kubernetes container, part of the frontend deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: frontend-*

Then run:

akamas create component component-container.yaml frontend

Now create a component-webapp.yaml manifest like the following:

name: webapp
description: The service related to the frontend deployment
componentType: Web Application
properties:
  dynatrace:
    id: <TELEMETRY_DYNATRACE_WEBAPP_ID>

Then run:

akamas create component component-webapp.yaml frontend

Workflow

The workflow in this example is composed of four main steps:

  1. Update the Kubernetes deployment manifest with the Akamas recommended deployment parameters (CPU and memory limits)

  2. Apply the new parameters (kubectl apply)

  3. Wait for the rollout to complete

  4. Sleep for 30 minutes (observation interval)

Create a workflow.yaml manifest like the following:

name: frontend
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml.templ
      target:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml

  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl apply -f frontend.yaml

  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl rollout status --timeout=5m deployment/frontend -n boutique;

  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800

Then run:

akamas create workflow workflow.yaml

Telemetry

Create a telemetry.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>
  pushEvents: false

Then run:

akamas create telemetry-instance telemetry.yaml frontend

Study

In this live optimization:

  • the goal is to reduce the cost of the Kubernetes deployment. In this example, the cost is based on the amount of CPU and memory limits (assuming requests = limits).

  • the approval mode is set to manual, and a new recommendation is generated daily

  • to avoid impacting application performance, constraints are specified on desired response times and error rates

  • to avoid impacting application reliability, constraints are specified on peak resource usage and out-of-memory kills

  • the parameters to be tuned are the container CPU and memory limits (we assume requests=limits in the deployment file)
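As an illustration of how the metric constraints act as guardrails, the sketch below checks a set of observed values against the thresholds used in the study manifest of this example. It is not the Akamas evaluation logic. The dictionary keys are hypothetical names, and reading the response time threshold as milliseconds is an assumption.

```python
import operator

# (constraint name, observed-value key, comparison, threshold) from the study manifest
CONSTRAINTS = [
    ("Response Time", "response_time", operator.le, 300),
    ("Error Rate", "error_rate_max", operator.le, 0.05),
    ("Container CPU saturation", "cpu_util_p95", operator.lt, 0.8),
    ("Container memory saturation", "memory_util_max", operator.lt, 0.7),
    ("Container out-of-memory kills", "oom_kills", operator.eq, 0),
]


def failed_constraints(observed: dict) -> list:
    """Return the names of the constraints that the observed values do not satisfy."""
    return [
        name
        for name, key, compare, threshold in CONSTRAINTS
        if not compare(observed[key], threshold)
    ]


# Hypothetical observation window: everything is fine except memory utilization
observed = {
    "response_time": 240,
    "error_rate_max": 0.01,
    "cpu_util_p95": 0.65,
    "memory_util_max": 0.75,
    "oom_kills": 0,
}
failures = failed_constraints(observed)
```

A configuration that saves cost but yields a non-empty list here would be treated as violating the SLOs rather than as an improvement.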

Create a study.yaml manifest like the following:

name: frontend
system: frontend
workflow: frontend
requireApproval: true

goal:
  objective: minimize
  function:
    formula: (((container.container_cpu_limit/1000) * 3) + (container.container_memory_limit/(1024*1024*1024)))
  constraints:
    absolute:
      - name: Response Time
        formula: webapp.requests_response_time <= 300
      - name: Error Rate
        formula: webapp.service_error_rate:max <= 0.05
      - name: Container CPU saturation
        formula: container.container_cpu_util:p95 < 0.8
      - name: Container memory saturation
        formula: container.container_memory_util:max < 0.7
      - name: Container out-of-memory kills
        formula: container.container_oom_kills_count == 0

parametersSelection:
  - name: container.cpu_limit
    domain: [300, 1000]
  - name: container.memory_limit
    domain: [800, 1536]

windowing:
  type: trim
  trim: [5m, 0m]
  task: observe

workloadsSelection:
  - name: webapp.requests_throughput

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 48
    values:
      container.cpu_limit: 1000
      container.memory_limit: 1536

  - name: optimize
    type: optimize
    numberOfTrials: 48
    numberOfExperiments: 100
    numberOfInitExperiments: 0
    maxFailedExperiments: 50

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI for Live optimizations.


System

Creating a system is the first step in optimizing your application.

A system, in Akamas, is a representation of your application. It might be a complete representation of different layers, a single microservice, a batch job, or any IT system that you want to optimize.

A system can be used to fully model an application and then run multiple optimization initiatives or contain just the elements that are used for a specific optimization study.

The system is identified by a name, which in our example is "Online Boutique", and can be extended with a description to make it easily recognizable.

Components

The core elements of a system are the components. A component represents the fundamental element of an IT system, often composed of various layers or entities. It serves as a black-box definition of an entity involved in optimization, eliminating the need for intricate details in modeling.

A component comprises the following properties:

  • Name: A distinct identifier within the context of the system.

  • Description: A clarification of the component's purpose or function.

  • Component type: An identification of the underlying technology or technology stack of the component.

  • Properties: A set of additional properties that hold information about the component's configuration or telemetry (e.g. the IP used to reach an API or the username to connect to a server via SSH).

Akamas allows users to model their IT systems without the need to focus on technological aspects by providing several out-of-the-box component types to support system and component modeling.

Component types are platform entities (i.e.: shared among all the users) that contain key information about specific technologies such as parameters that can be tuned and key metrics.

Akamas includes off-the-shelf component types for the most popular technologies such as Containers, Linux Hosts, AWS EC2 instances, Web Applications, Spark, and runtimes such as JVM, Node, and Go.

Creating the Online Boutique system

Recalling our example of the Online Boutique application, we decided, for the moment, to model just the elements that are included in the optimization initiative. We have also decided not to model the entire Kubernetes cluster as we are not interested in optimizing and monitoring it at this stage.

We have mapped the JVM and the Pod to the respective component types and mapped the Kubernetes service to the Web Application component type. You can read more about these component types in their documentation reference.

To model our system we used the component types coming from these optimization packs:

  • Open JDK

  • Web Application

  • Kubernetes

The following picture shows our choice of components starting from the architectural diagram.

Creating the system with the CLI

To create this system in Akamas you can use the following YAML file.

name: Online Boutique
description: The Online Boutique e-commerce application

Create the file system.yaml and run the following command.

akamas create system system.yaml

Now you can start adding components. The following three YAML files represent the three components of our Online Boutique system.

APIs component specification

name: Apis
description: The APIs exposed to users
componentType: Web Application
properties:
  dynatrace:
    tags:
      Application: Ad-Service

Ad Service component specification

name: Adservice
description: The adservice of the online boutique by Google
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: akamas-demo
      containerName: server
      basePodName: ak-adservice-*

JVM component specification

name: AdserviceJVM
description: The JVM of the adservice
componentType: java-openjdk-11
properties:
  dynatrace:
    tags:
      JVM: Ad-Service

Create the files and run the following command for each file.

akamas create component <file-name> "Online Boutique"

Note that, since components are bound to a specific system, the creation command also takes as an argument the name of the system, Online Boutique, that we created a few moments ago.

Component types are shipped within Optimization Packs and can be easily installed and updated as support for new technologies is released.

From an architectural diagram to the Akamas system

Optimize application performance and reliability

Kubernetes microservices

Applications running on cloud instances

Spark applications

Kubernetes microservices

Offline optimizations

Live optimizations

Optimizing cost of a Kubernetes microservice while preserving SLOs with performance tests

Optimizing cost of a Java microservice on Kubernetes while preserving SLOs with performance tests

Optimizing cost of a Kubernetes microservice while preserving SLOs in production

Optimizing cost of a Java microservice on Kubernetes while preserving SLOs in production

Integrating

Akamas provides the following areas of integration with your ecosystem, which may or may not apply depending on whether you are running live optimization studies or offline optimization studies:

  • Telemetry Providers tools providing time series for metrics of interest for the system to be optimized (see also Telemetry Providers) - this integration applies to both offline and live optimization studies;

  • Configuration Management tools providing the ability to set tunable parameters for the system to be optimized - this integration applies to both offline and live optimization studies;

  • Value Stream Delivery tools to implement a continuous optimization process as part of a CI/CD pipeline - this integration applies to both offline and live optimization studies;

  • Load Testing tools used to reproduce a synthetic workload on the system to be optimized; notice that these tools may also act as Telemetry Providers (e.g. for end-user metrics) - this integration only applies to offline optimization studies.

These integrations may require some setup on both the tool and the Akamas side and may also involve defining workflows and making use of workflow operators.

Create CSV telemetry instances

To create an instance of the CSV provider, build a YAML file (instance.yml in this example) with the definition of the instance:

# CSV Telemetry Provider Instance
provider: CSV File
config:
  address: host1.example.com
  authType: password
  username: akamas
  auth: akamas
  remoteFilePattern: /monitoring/result-*.csv
  componentColumn: COMPONENT
  timestampColumn: TS
  timestampFormat: YYYY-MM-dd'T'HH:mm:ss
metrics:
  - metric: cpu_util
    datasourceMetric: user%

Then you can create the instance for the system using the Akamas CLI:

akamas create telemetry-instance instance.yml system

timestampFormat format

Notice that the week-year format YYYY is compliant with the ISO-8601 specification, but you should replace it with the year-of-era format yyyy if you are specifying a timestampFormat different from the ISO one. For example:

  • Correct: yyyy-MM-dd HH:mm:ss

  • Wrong: YYYY-MM-dd HH:mm:ss
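The reason the two pattern letters are not interchangeable is that a week-based year and a calendar year disagree around the turn of the year. The sketch below shows this with Python's ISO week-based year as an analogue. It assumes the provider follows Java-style pattern letters, as the yyyy/YYYY distinction suggests. Java's week-year also depends on the locale, so the exact dates affected may differ from the ones shown here.

```python
from datetime import date


def calendar_year(day: date) -> int:
    """The year a reader expects to see in a timestamp (analogue of yyyy)."""
    return day.year


def iso_week_year(day: date) -> int:
    """The year that owns the ISO week containing the day (analogue of YYYY)."""
    return day.isocalendar()[0]


# 30 December 2019 is a Monday that belongs to ISO week 1 of 2020
turn_of_year = date(2019, 12, 30)
mismatch = (calendar_year(turn_of_year), iso_week_year(turn_of_year))
```

For most of the year both functions agree, which is why a week-year pattern can go unnoticed until timestamps near 1 January are parsed with the wrong year.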

Configuration options

When you create an instance of the CSV provider, you should specify some configuration information to allow the provider to correctly extract and process metrics from your CSV files.

You can specify configuration information within the config part of the YAML of the instance definition.

Required properties

  • address - a URL or IP identifying the address of the host where CSV files reside

  • username - the username used when connecting to the host

  • authType - the type of authentication to use when connecting to the file host; either password or key

  • auth - the authentication credential; either a password or a key according to authType. When using keys, the value can either be the value of the key or the path of the file to import from

  • remoteFilePattern - a list of remote files to be imported

Optional properties

  • protocol - the protocol to use to retrieve files; either scp or sftp. Default is scp

  • fieldSeparator - the character used as a field separator in the CSV files. Default is ,

  • componentColumn - the header of the column containing the name of the component. Default is COMPONENT

  • timestampColumn - the header of the column containing the timestamp. Default is TS

  • timestampFormat - the format of the timestamp (e.g. yyyy-MM-dd HH:mm:ss zzz). Default is YYYY-MM-ddTHH:mm:ss

You should also specify the mapping between the metrics available in your CSV files and those provided by Akamas. This can be done in the metrics section of the telemetry instance configuration. To map a custom metric you should specify at least the following properties:

  • metric - the name of a metric in Akamas

  • datasourceMetric - the header of a column that contains the metric in the CSV file

The provider ignores any column not present as datasourceMetric in this section.

The sample configuration reported in this section would import the metric cpu_util from CSV files formatted as in the example below:

TS,                   COMPONENT,  user%
2020-04-17T09:46:30,  host,       20
2020-04-17T09:46:35,  host,       23
2020-04-17T09:46:40,  host,       32
2020-04-17T09:46:45,  host,       21
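As an illustration of the mapping described above, the following minimal sketch parses the sample file and produces one cpu_util sample per row. It is not the provider's implementation: the function read_samples and the tuple layout are assumptions made for this example, and the timestamp pattern is the Python equivalent of the default format.

```python
import csv
import io
from datetime import datetime

SAMPLE_CSV = """TS,                   COMPONENT,  user%
2020-04-17T09:46:30,  host,       20
2020-04-17T09:46:35,  host,       23
2020-04-17T09:46:40,  host,       32
2020-04-17T09:46:45,  host,       21
"""

# Mirrors the metrics section: datasourceMetric (CSV header) -> Akamas metric.
METRIC_MAPPING = {"user%": "cpu_util"}


def read_samples(text, mapping, component_column="COMPONENT", timestamp_column="TS"):
    """Return (timestamp, component, metric, value) tuples for the mapped columns."""
    reader = csv.reader(io.StringIO(text))
    header = [name.strip() for name in next(reader)]
    samples = []
    for row in reader:
        if not row:
            continue  # skip empty lines
        record = dict(zip(header, (cell.strip() for cell in row)))
        timestamp = datetime.strptime(record[timestamp_column], "%Y-%m-%dT%H:%M:%S")
        # Columns that are not listed in the mapping are ignored.
        for column, metric in mapping.items():
            samples.append((timestamp, record[component_column], metric, float(record[column])))
    return samples


if __name__ == "__main__":
    for sample in read_samples(SAMPLE_CSV, METRIC_MAPPING):
        print(sample)
```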

Telemetry instance reference

The following represents the complete configuration reference for the telemetry provider instance.

provider: CSV File             # this is an instance of the CSV provider
config:
  address: host1.example.com   # the address of the host with the CSV files
  port: 22                     # the port used to connect
  authType: password           # the authentication method
  username: akamas             # the username used to connect
  auth: akamas                 # the authentication credential
  protocol: scp                # the protocol used to retrieve the file
  fieldSeparator: ","          # the character used as field separator in the CSV files
  remoteFilePattern: /monitoring/result-*.csv    # the path of the CSV files to import
  componentColumn: COMPONENT                     # the header of the column with component names
  timestampColumn: TS                            # the header of the column with the time stamp
  timestampFormat: YYYY-MM-ddTHH:mm:ss           # the format of the timestamp
metrics:
  - metric: cpu_util                             # the name of the Akamas metric
    datasourceMetric: user%                      # the header of the column with the original metric
    staticLabels:
      mode: user                                 # (optional) additional labels to add to the metric

The following table reports the configuration reference for the config section:

| Field | Type | Description | Default Value | Restrictions | Required |
|---|---|---|---|---|---|
| address | String | The address of the machine where the CSV file resides | | A valid URL or IP | Yes |
| port | Number (integer) | The port to connect to, in order to retrieve the file | 22 | 1 ≤ port ≤ 65535 | No |
| username | String | The username to use in order to connect to the remote machine | | | Yes |
| protocol | String | The protocol used to connect to the remote machine: SCP or SFTP | scp | scp, sftp | No |
| authType | String | Specifies which method is used to authenticate against the remote machine. password: use the value of the parameter auth as a password. key: use the value of the parameter auth as a private key (supported formats are RSA and DSA) | | password, key | Yes |
| auth | String | A password or an RSA/DSA key (as YAML multi-line string, keeping new lines) | | | Yes |
| remoteFilePattern | String | The path of the remote file(s) to be analyzed | | A list of valid Linux paths | Yes |
| componentColumn | String | The CSV column containing the name of the component. The column's values must match (case sensitive) the name of a component specified in the System | COMPONENT | The column must exist in the CSV file | Yes |
| timestampColumn | String | The CSV column containing the timestamps of the samples | TS | The column must exist in the CSV file | No |
| timestampFormat | String | The format of the timestamps | YYYY-MM-ddTHH:mm:ss | Must be specified using Java syntax | No |
| fieldSeparator | String | The field separator of the CSV | , | , or ; | No |

The following table reports the configuration reference for the metrics section:

| Field | Type | Description | Restrictions | Required |
|---|---|---|---|---|
| metric | String | The name of the metric in Akamas | An existing Akamas metric | Yes |
| datasourceMetric | String | The name (header) of the column that contains the specific metric | An existing column in the CSV file | Yes |
| scale | Decimal number | The scale factor to apply when importing the metric | | |
| staticLabels | List of key-value pairs | A list of key-value pairs that will be attached to the specific metric sample | | No |

Use cases

Here you can find common use cases addressed by this provider.

Linux SAR

hostname, interval, timestamp,               %user, %system, %memory
machine1, 600,      2018-08-07 06:45:01 UTC, 30.01, 20.77,   96.21
machine1, 600,      2018-08-07 06:55:01 UTC, 40.07, 13.00,   84.55
machine1, 600,      2018-08-07 07:05:01 UTC, 5.00,  90.55,   89.23

Note that the metrics are percentages (between 0 and 100), while Akamas accepts percentages as values between 0 and 1; therefore each metric in this configuration has a scale factor of 0.01.

You can import the two CPU metrics and the memory metric from a SAR log using the following telemetry instance configuration.

provider: CSV File
config:
  remoteFilePattern: /csv/sar.csv
  address: 127.0.0.1
  port: 22
  username: user123
  auth: password123
  authType: password
  protocol: scp
  componentColumn: hostname
  timestampColumn: timestamp
  timestampFormat: yyyy-MM-dd HH:mm:ss zzz
metrics:
  - metric: cpu_util
    datasourceMetric: "%user"      # quoted: a plain YAML scalar cannot start with %
    scale: 0.01
    staticLabels:
      mode: user
  - metric: cpu_util
    datasourceMetric: "%system"
    scale: 0.01
    staticLabels:
      mode: system
  - metric: mem_util
    datasourceMetric: "%memory"
    scale: 0.01

Using the configured instance, the CSV File provider will perform the following operations to import the metrics:

  1. Retrieve the file "/csv/sar.csv" from the server "127.0.0.1" using the SCP protocol authenticating with the provided password.

  2. Use the column hostname to look up components by name.

  3. Use the column timestamp to find the timestamps of the samples (that are expected to be in the format specified by timestampFormat).

  4. Collect the metrics (two with the same name, but different labels, and one with a different name):

    • cpu_util: read from the column %user, with the label "mode" set to "user" attached to its samples.

    • cpu_util: read from the column %system, with the label "mode" set to "system" attached to its samples.

    • mem_util: read from the column %memory.
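The collection step can be sketched as follows. This is only an illustration of scaling and static labels, not the provider's code: the function to_akamas_samples and the dictionary layout are hypothetical, and the scale of 0.01 assumes that the goal is to turn 0-100 percentages into 0-1 fractions.

```python
# One row of the SAR export, keyed by column header.
SAR_ROW = {
    "hostname": "machine1",
    "timestamp": "2018-08-07 06:45:01 UTC",
    "%user": "30.01",
    "%system": "20.77",
    "%memory": "96.21",
}

# Mirrors the metrics section of the telemetry instance (scale is an assumption).
METRIC_SPECS = [
    {"metric": "cpu_util", "datasourceMetric": "%user", "scale": 0.01,
     "staticLabels": {"mode": "user"}},
    {"metric": "cpu_util", "datasourceMetric": "%system", "scale": 0.01,
     "staticLabels": {"mode": "system"}},
    {"metric": "mem_util", "datasourceMetric": "%memory", "scale": 0.01},
]


def to_akamas_samples(row, metric_specs):
    """Scale each mapped column and attach its static labels."""
    samples = []
    for spec in metric_specs:
        raw_value = float(row[spec["datasourceMetric"]])
        samples.append({
            "metric": spec["metric"],
            "labels": dict(spec.get("staticLabels", {})),
            "value": raw_value * spec.get("scale", 1.0),
        })
    return samples


if __name__ == "__main__":
    for sample in to_akamas_samples(SAR_ROW, METRIC_SPECS):
        print(sample)
```

The two cpu_util samples share the metric name and are told apart only by the mode label, which is what allows both columns to be imported side by side.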

Integrating Telemetry Providers

Akamas supports the integration with virtually any telemetry and observability tool.

Supported Telemetry Providers

The following table describes the supported Telemetry Providers, which are created automatically at installation time.

Notice that Telemetry Providers are shared across all the workspaces within the same Akamas installation, and only users with administrative privileges can manage them.

Install Dynatrace provider

Install the Telemetry Provider

Skip this part if the Telemetry Provider is already installed.

To install the Dynatrace provider, create a YAML file (called provider.yml in this example) with the definition of the provider:

Then you can install the provider using the Akamas CLI:

Optimizing a Spark application

In this example study we’ll tune the parameters of SparkPi, one of the example applications provided by most of the Apache Spark distributions, to minimize its execution time. Application monitoring is provided by the Spark History Server APIs.

Environment setup

The test environment includes the following instances:

  • Akamas: instance running Akamas

  • Spark cluster: composed of instances with 16 vCPUs and 64 GB of memory, where the Spark binaries are installed under /usr/lib/spark. In particular, the roles are:

    • 1x master instance: the Spark node running the resource manager and Spark History Server (host: sparkmaster.akamas.io)

    • 2x worker instances: the other instances in the cluster

Telemetry Infrastructure setup

To gather metrics about the application we will leverage the Spark History Server. If it is not already running, start it on the master instance with the following command:

Application and Test tools

To make sure the tested application is available on your cluster and runs correctly, execute the following commands:

Optimization setup

In this section, we will guide you through the steps required to set up the optimization of the Spark application execution on Akamas.

System

System spark

Here’s the definition of the system we will use to group our components and telemetry instances for this example:

To create the system run the following command:

Component sparkPi

In the snippet shown below, we specify:

  • the field properties required by Akamas to connect via SSH to the cluster master instance

  • the parameters required by spark-submit to execute the application

  • the sparkApplication flag required by the telemetry instance to associate the metrics from the History Server to this component

To create the component in the system run the following command:

Workflow

The workflow used for this study contains only a single stage, where the operator submits the application along with the Spark parameters under test.

Here’s the definition of the workflow:

To create the workflow run the following command:

Telemetry

Here’s the definition of the telemetry instance, specifying the History Server endpoint:

To create the telemetry instance in the system run the following command:

This telemetry instance will be able to bind the fetched metrics to the related sparkPi component thanks to the sparkApplication attribute we previously added in its definition.

Study

The goal of this study is to find a Spark configuration that minimizes the execution time for the example application.

To achieve this goal we’ll operate on the number of executor processes available to run the application job, and the memory and CPUs allocated for both driver and executors. The domains are configured so that the single driver/executor process does not exceed the size of the underlying instance, and the constraints make it so that the application overall does not require more resources than the ones available in the cluster, also taking into account that some resources must be reserved for other services such as the cluster manager.

Note that this study uses two constraints on the total number of resources to be used by the spark application. This example refers to a cluster of three nodes with 16 cores and 64 GB of memory each, and at least one core per instance should be reserved for the system.
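To make the two constraints concrete, here is a minimal sketch that checks whether a candidate configuration fits the cluster described above. The function name satisfies_constraints is hypothetical, and the memory figures assume that both sides of the memory constraint are expressed in the same unit (GB here, with 60 usable GB per node).

```python
NODES = 3
USABLE_CORES_PER_NODE = 15      # 16 cores minus one reserved for the system
USABLE_MEMORY_PER_NODE = 60     # assumption: GB left after reserving some memory


def satisfies_constraints(driver_cores, executor_cores, num_executors,
                          driver_memory, executor_memory):
    """Return True when the configuration fits within the cluster resources."""
    total_cpus = driver_cores + executor_cores * num_executors
    total_memory = driver_memory + executor_memory * num_executors
    return (total_cpus <= USABLE_CORES_PER_NODE * NODES
            and total_memory <= USABLE_MEMORY_PER_NODE * NODES)


if __name__ == "__main__":
    # 1 + 4*11 = 45 cores and 4 + 16*11 = 180 GB: exactly at the limits.
    print(satisfies_constraints(1, 4, 11, 4, 16))
    # One more driver core exceeds the CPU cap.
    print(satisfies_constraints(2, 4, 11, 4, 16))
```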

Here’s the definition of the study:

To create and run the study execute the following commands:

You can find detailed information on timestamp patterns in the Patterns for Formatting and Parsing section on the DateTimeFormatter (Java Platform SE 8) page.

protocol: the protocol used to connect to the remote machine, either SCP or SFTP.

remoteFilePattern: the path of the remote file(s) to be analyzed. The path can contains expressio

timestampFormat: must be specified using Java syntax.

In this use case, you are going to import some metrics coming from SAR, a popular UNIX tool to monitor system resources. SAR can export CSV files in the following format.

We’ll use a component of type Spark Application 2.3.0 to represent the application running on the Apache Spark framework 2.3.

If you have not installed the Spark History Server telemetry provider yet, take a look at the Spark History Server Provider page to proceed with the installation.
Dynatrace provider definition (provider.yml):

# Dynatrace Telemetry Provider
name: Dynatrace
description: Telemetry Provider that enables the import of metrics from Dynatrace installations
dockerImage: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/telemetry-providers/dynatrace-provider:3.4.0

Install the Dynatrace provider:

akamas install telemetry-provider provider.yml

Start the Spark History Server:

/usr/lib/spark/sbin/start-history-server.sh

Check that the example application is available and runs correctly:

file /usr/lib/spark/examples/jars/spark-examples.jar
spark-submit \
  --master yarn --deploy-mode client \
  --class 'org.apache.spark.examples.SparkPi' \
  /usr/lib/spark/examples/jars/spark-examples.jar 100

System definition (system.yaml):

name: spark
description: A system to tune the Spark Pi example application

akamas create system system.yaml

Component definition (sparkPi.yaml):

name: sparkPi
description: The Spark Application used to calculate KPIs for ContentWise Analytics
componentType: Spark Application 2.3.0

properties:
  hostname: sparkmaster.akamas.io
  username: hadoop
  key: ssh_key

  master: yarn
  deployMode: client
  className: org.apache.spark.examples.SparkPi
  file: /usr/lib/spark/examples/jars/spark-examples.jar
  args: [ 1000 ]

  sparkApplication: 'true'

akamas create component sparkPi.yaml spark

Workflow definition (workflow.yaml):

name: Run SparkPi
tasks:
- name: run application
  operator: SSHSparkSubmit
  arguments:
    component: sparkPi
    retries: 0

akamas create workflow workflow.yaml

Telemetry instance definition (telemetry.yaml):

provider: SparkHistoryServer
config:
  address: sparkmaster.akamas.io
  port: 18080

  importLevel: job

akamas create telemetry-instance telemetry.yaml spark

Study definition (study.yaml):

name: Speedup SparkPi execution
system: spark
workflow: Run SparkPi

goal:
  objective: minimize
  function:
    formula: sparkPi.spark_application_duration

parametersSelection:
- name: sparkPi.driverCores
  domain: [1, 10]
- name: sparkPi.driverMemory
  domain: [32, 2048]
- name: sparkPi.executorCores
  domain: [1, 15]
- name: sparkPi.executorMemory
  domain: [32, 2048]
- name: sparkPi.numExecutors
  domain: [1, 45]

parameterConstraints:
- name: cap_total_allocated_cpus
  formula: (sparkPi.driverCores + sparkPi.executorCores*sparkPi.numExecutors) <= 15*3

- name: cap_total_allocated_memory
  formula: (sparkPi.driverMemory + sparkPi.executorMemory*sparkPi.numExecutors) <= 60*3

steps:
- name: baseline
  type: baseline

- name: tune
  type: optimize
  numberOfExperiments: 200
  maxFailedExperiments: 200

akamas create study study.yaml
akamas start study 'Speedup SparkPi execution'

Install CSV provider

To install the CSV File provider, create a YAML file (called provider.yml in this example) with the specification of the provider:

# CSV File Telemetry Provider
name: CSV File
description: Telemetry Provider that enables the import of metrics from a remote CSV file
dockerImage: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/telemetry-providers/csv-file-provider:3.2.0

Then you can install the provider with the Akamas CLI:

akamas install telemetry-provider provider.yml

Optimize application costs and resource efficiency

CSV provider

The CSV provider collects metrics from CSV files and makes them available to Akamas. It offers a very versatile way to integrate custom data sources.

Prerequisites

This section provides the minimum requirements that you should match before using the CSV File telemetry provider.

Network requirements

The following requirements should be met to enable the provider to gather CSV files from remote hosts:

  • Port 22 (or a custom one) should be open from Akamas installation to the host where the files reside.

  • The host where the files reside should support SCP or SFTP protocols.

Permissions

  • Read access to the CSV files target of the integration

Akamas supported version

  • Versions < 2.0.0 are compatible with Akamas until version 1.8.0

  • Versions >= 2.0.0 are compatible with Akamas from version 1.9.0

Supported component types

The CSV File provider is generic and allows integration with any data source, therefore it does not come with support for a specific component type.

Set up the data source

To operate properly, the CSV file provider expects the presence of four fields in each processed CSV file:

  • A timestamp field used to identify the point in time a certain sample refers to.

  • A component field used to identify the Akamas entity.

  • A metric field used to identify the name of the metric.

  • A value field used to store the actual value of the metric.

These fields can have custom names in the CSV file; you can specify them in the provider configuration.

The Install CSV provider page describes how to get this Telemetry Provider installed. Once installed, this provider is shared with all users of your Akamas installation and can be used to monitor many different systems, by configuring appropriate telemetry provider instances as described in the Create a CSV provider instance page.

Supported Telemetry Providers:

| Telemetry Provider | Description |
|---|---|
| CSV provider | collects metrics from CSV files |
| Dynatrace | collects metrics from Dynatrace |
| Prometheus | collects metrics from Prometheus |
| Spark History Server | collects metrics from Spark History Server |
| NeoloadWeb | collects metrics from Tricentis Neoload Web |
| Load Runner Professional | collects metrics from MicroFocus Load Runner Professional |
| Load Runner Enterprise | collects metrics from MicroFocus Load Runner Enterprise |
| AWS | collects price metrics for Amazon Elastic Compute Cloud (ec2) from Amazon’s own APIs |


Dynatrace provider

The Dynatrace provider collects metrics from Dynatrace and makes them available to Akamas.

This provider includes support for several technologies. In any case, custom queries can be defined to gather the desired metrics.

Supported versions

Dynatrace SaaS/Managed version 1.187 or later

Supported component types:

  • Kubernetes and Docker

  • Web Application

  • Ubuntu-16.04, Rhel-7.6

  • java-openjdk-8, java-openjdk-11, java-openjdk-17

  • java-ibm-j9vm-6, java-ibm-j9vm-8, java-eclipse-openj9-11

Prerequisites

This section provides the minimum requirements that you should meet before using the Dynatrace provider.

  • Dynatrace SaaS/Managed version 1.187 or later

  • A valid Dynatrace license

  • Dynatrace OneAgent installed on the servers where the Dynatrace entities to be monitored are running

  • Connectivity between Akamas and the Dynatrace server on port 443

Dynatrace Token

The Dynatrace provider needs a Dynatrace API token with the following privileges:

  • metrics.read (Read metrics)

  • entities.read (Read entities and tags)

  • DataExport (Access problem and event feed, metrics, and topology)

  • ReadSyntheticData (Read synthetic monitors, locations, and nodes)

  • DataImport (Data ingest, e.g.: metrics and events). This permission is used to inform Dynatrace about configuration changes.

Component configuration

To tell Akamas which Dynatrace entities (e.g. Workloads, Services, Process Groups) metrics should be collected from, you can set some specific properties on components.

Different strategies can be used to map Dynatrace entities to Akamas components:

  • By id

  • By name

  • By tags

  • By Kubernetes properties

By id

You can map a component to a Dynatrace entity by leveraging the unique id of the entity, which you should put under the id property in the component. This strategy is best used for long-lived instances whose ID does not change during the optimization such as Hosts, Process Groups, or Services.

Here is an example of how to set up host monitoring via id:

name: My Host
properties:
 dynatrace:
  id: HOST-12345YUAB1

You can find the id of a Dynatrace entity by looking at the URL of the Dynatrace dashboard for that entity. Note that the "host" key is valid only for Linux components; for other components (e.g. the JVM) you must drill down into the host entities to get the PROCESS_GROUP_INSTANCE or PROCESS_GROUP id.

By name

You can map a component to a Dynatrace entity by leveraging the entity’s display name. This strategy is similar to mapping by id but provides a friendlier way to identify the mapped entity. Beware that if multiple entities in your Dynatrace installation share the same name, they will all be mapped to the same component. The Dynatrace display name should be put under the name property in the component definition:

name: MyComponent
properties:
 dynatrace:
  name: host-1

By tags

You can map a component to a Dynatrace entity by leveraging the Dynatrace tags that match the entity, which you should put under the tags property in the component definition.

If multiple tags are specified, instances matching any of the specified tags will be selected.

This sample configuration maps to the component all Dynatrace entities with tag environment: test or [AWS]dynatrace-monitored: true:

name: MyComponent
properties:
 dynatrace:
  tags:
     environment: test
     "[AWS]dynatrace-monitored": true

Dynatrace supports both key-value and key-only tags. Key-only tags can be specified as key-value tags with an empty value, as in the following example:

name: MyComponent
properties:
 dynatrace:
  tags:
     myKeyOnlyTag: ""
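The "any of the specified tags" rule can be sketched as follows. This is an illustration only, not the provider's matching code: the function matches_any_tag is hypothetical, and entity tags are assumed to be available as a plain key-value dictionary with key-only tags stored with an empty value.

```python
def matches_any_tag(entity_tags, wanted_tags):
    """Return True when the entity carries at least one of the wanted tags."""
    return any(entity_tags.get(key) == value for key, value in wanted_tags.items())


# Tags configured on the component, as in the examples above.
WANTED = {"environment": "test", "[AWS]dynatrace-monitored": "true"}

if __name__ == "__main__":
    # Only one of the two tags is present: the entity is still selected.
    print(matches_any_tag({"environment": "test"}, WANTED))
    # A different value for the same key does not match.
    print(matches_any_tag({"environment": "prod"}, WANTED))
    # Key-only tags are represented with an empty value.
    print(matches_any_tag({"myKeyOnlyTag": ""}, {"myKeyOnlyTag": ""}))
```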

By Kubernetes properties

By leveraging dedicated properties, you can map a component to a Dynatrace entity referring to a Kubernetes cluster (e.g., a Pod or a Container).

Container

To properly identify the set of containers to be mapped, you can specify the following properties. Any container matching all the properties will be mapped to the component.

| Akamas property | Dynatrace property | Location |
|---|---|---|
| namespace | Kubernetes namespace | Container dashboard |
| containerName | Kubernetes container name | Container dashboard |
| basePodName | Kubernetes base pod name | Container dashboard |

You can retrieve all the information needed to set up the properties at the top of the Dynatrace container dashboard.

The following example shows how to map a component to a container running in Kubernetes:

dynatrace:
  type: CONTAINER_GROUP_INSTANCE
  kubernetes:
    namespace: boutique
    containerName: server
    basePodName: ak-frontend-*

Pod

To properly identify the set of pods to be mapped, you can specify the following properties. Any pod matching all the properties will be mapped to the component.

| Akamas property | Dynatrace property | Location |
|---|---|---|
| state | State | Pod dashboard |
| namespace | Namespace | Pod dashboard |
| workload | Workload | Pod dashboard |

If you need to narrow your pod selection further, you can also specify a set of tags as described in the By tags section. Note that tags for Kubernetes resources are called Labels in the Dynatrace dashboard.

Labels are specified as key-value pairs in the Akamas configuration. In Dynatrace’s dashboard, key and value are separated by a colon (:).

Example

The following example shows how to map a component to a pod running in Kubernetes:

dynatrace:
  type: CLOUD_APPLICATION_INSTANCE
  namePrefix: ak-frontend-
  kubernetes:
    labels:
      workload: ak-frontend
      product: hipstershop

Container, Pod, or Workload?

Note that when you are mapping components to Kubernetes entities, the property type is required to tell Akamas which type of entity you want to map. Dynatrace maps Kubernetes entities to the following types:

| Kubernetes type | Dynatrace type |
|---|---|
| Docker container | CONTAINER_GROUP_INSTANCE |
| Pod | CLOUD_APPLICATION_INSTANCE |
| Workload | CLOUD_APPLICATION |
| Namespace | CLOUD_APPLICATION_NAMESPACE |
| Cluster | KUBERNETES_CLUSTER |

Improve component mapping with type

You can improve the matching of components with Dynatrace by adding a type property in the component definition. This property helps the provider match only those Dynatrace entities of the given type.

name: MyComponent
properties:
 dynatrace:
  type: SERVICE     # the type helps the mapping by tags by filtering down entities that are only services
  tags:
     environment: test
     "[AWS]dynatrace-monitored": true

The type of an entity can be retrieved from the URL of the entity’s dashboard.

Available entity types can be retrieved, from your Dynatrace instance, with the following command:

curl 'https://<Your Dynatrace host>/api/v2/entityTypes/?pageSize=500' \
  --header 'Authorization: Api-Token <API-TOKEN>'
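For scripted use, the same request can be prepared in Python. This sketch only builds the request object and does not send it; the function name build_entity_types_request and the placeholder host and token are assumptions, while the endpoint path and header format are taken from the command above.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_entity_types_request(host, api_token, page_size=500):
    """Build (without sending) the request that lists Dynatrace entity types."""
    query = urlencode({"pageSize": page_size})
    url = f"https://{host}/api/v2/entityTypes/?{query}"
    return Request(url, headers={"Authorization": f"Api-Token {api_token}"})


if __name__ == "__main__":
    # Placeholder values: replace them with your Dynatrace host and API token.
    request = build_entity_types_request("dynatrace.example.invalid", "MY_TOKEN")
    print(request.full_url)
    print(request.get_header("Authorization"))
```

The request can then be sent with urllib.request.urlopen from a machine that can reach the Dynatrace host.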

Mapping multiple entities in one component

In some circumstances, you might want to map multiple Dynatrace entities (e.g. a set of hosts) to the same Akamas component and import aggregated metrics.

This can be easily done by using tags. If Akamas detects that multiple entities have been mapped to the same component, it will try to aggregate metrics; some metrics, however, cannot be automatically aggregated.

To force aggregation on all available metrics you can add the mergeable: true property to the component under the dynatrace element.

name: MyComponent
properties:
 dynatrace:
  mergeable: true
  tags:
     environment: test
     "[AWS]dynatrace-monitored": true
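Conceptually, aggregating a metric over several entities means collapsing the samples that share a timestamp into a single value. The sketch below uses an average, in line with the avg default reported in the metrics reference; the function merge_entities and the sample layout are hypothetical, and the way Akamas actually aligns timestamps is not covered here.

```python
from collections import defaultdict
from statistics import mean


def merge_entities(samples):
    """Average values of samples sharing a timestamp.

    samples: iterable of (timestamp, entity_id, value) tuples.
    Returns a dict mapping each timestamp to the averaged value.
    """
    values_by_timestamp = defaultdict(list)
    for timestamp, _entity_id, value in samples:
        values_by_timestamp[timestamp].append(value)
    return {ts: mean(values) for ts, values in sorted(values_by_timestamp.items())}


if __name__ == "__main__":
    samples = [
        (0, "HOST-A", 10.0),
        (0, "HOST-B", 30.0),
        (60, "HOST-A", 20.0),
    ]
    print(merge_entities(samples))
```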

Create Dynatrace telemetry instances

The installed provider is shared with all users of your Akamas installation and can monitor many different systems, by configuring appropriate telemetry provider instances.

To create an instance of the Dynatrace provider, build a YAML file (instance.yml in this example) with the definition of the instance:

# Dynatrace Telemetry Provider Instance
provider: Dynatrace
config:
  url: https://wuy711522.live.dynatrace.com
  token: XbERgThisIsAnExampleToken

Then you can create the instance for the system using the Akamas CLI:

akamas create telemetry-instance instance.yml system

Configuration options

When you create an instance of the Dynatrace provider, you should specify some configuration information to allow the provider to correctly extract and process metrics from Dynatrace.

You can specify configuration information within the config part of the YAML of the instance definition.

Required properties

Collect additional metrics

You can collect additional metrics with the Dynatrace provider by using the metrics field:

config:
  url: https://wuy71982.live.dynatrace.com
  token: XbERgkKeLgVfDI2SDwI0h
metrics:
- metric: "akamas_metric"                     # extra akamas metrics to monitor
  datasourceMetric: builtin:host:new_metric   # query to execute to extract the metric
  labels:
  - "method"      # the "method" label will be retained within akamas

Configure a proxy for Dynatrace

If Akamas cannot reach your Dynatrace installation directly, you can configure an HTTP proxy by using the proxy field:

config:
  url: https://wuy71982.live.dynatrace.com
  token: XbERgkKeLgVfDI2SDwI0h
  proxy:
    address: https://dynaproxy  # the URL of the HTTP proxy
    port: 9999                  # the port the proxy listens to

Telemetry instance reference

This section reports the complete reference for the definition of a telemetry instance.

provider: Dynatrace  # this is an instance of the Dynatrace provider
config:
  url: https://wuy71982.live.dynatrace.com
  token: XbERgkKeLgVfDI2SDwI0h
  proxy:
    address: https://dynaproxy # the URL of the HTTP proxy
    port: 9999            # the port the proxy listens to
    username: myusername  # http basic auth username if necessary
    password: mypassword  # http basic auth password if necessary
  tags:
    Environment: Test       # dynatrace tags to be matched for every component

metrics:
- metric: "cpu_usage"  # this is the name of the metric within Akamas
  # The Dynatrace metric name
  datasourceMetric: "builtin:host.cpu.usage"
  extras:
    mergeEntities: true  # instruct the telemetry to aggregate the metric over multiple entities
  aggregation: avg  # The aggregation to perform if the mergeEntities property is set to true

This table shows the reference for the config section within the definition of the Dynatrace provider instance:

| Field | Type | Value restrictions | Required | Default Value | Description |
|---|---|---|---|---|---|
| url | String | It should be a valid URL | Yes | | |
| token | String | | Yes | | |
| proxy | Object | See Proxy options reference | No | | The specification of the HTTP proxy to use to communicate with Dynatrace. |
| pushEvents | String | true, false | No | true | If set to true the provider will inform Dynatrace of the configuration change event, which will be visible in the Dynatrace UI. |
| tags | Object | | No | | A set of global tags to match Dynatrace entities. The provider uses these tags to apply a default filtering of Dynatrace entities for every component. |

Proxy options reference

This table reports the reference for the config → proxy section within the definition of the Dynatrace provider instance:

| Field | Type | Value restrictions | Required | Default value | Description |
|---|---|---|---|---|---|
| address | String | It should be a valid URL | Yes | | The URL of the HTTP proxy to use to communicate with the Dynatrace installation API |
| port | Number (integer) | 1 ≤ port ≤ 65535 | Yes | | The port at which the HTTP proxy listens for connections |
| username | String | | No | | The username to use when authenticating against the HTTP proxy, if necessary |
| password | String | | No | | The password to use when authenticating against the HTTP proxy, if necessary |

Metrics options reference

This table reports the reference for the metrics section within the definition of the Dynatrace provider instance. The section contains a collection of objects with the following properties:

| Field | Type | Value restrictions | Required | Default value | Description |
|---|---|---|---|---|---|
| metric | String | It must be an Akamas metric | Yes | | The name of an Akamas metric that should map to the new metric you want to gather |
| datasourceMetric | String | A valid Dynatrace metric | Yes | | The Dynatrace query to use to extract the metric |
| labels | Array of strings | - | No | | The list of Dynatrace labels that should be retained when gathering the metric |
| staticLabels | Key-Value | - | No | | Static labels that will be attached to metric samples |
| aggregation | String | | No | avg | The aggregation to perform if the mergeEntities property under the extras section is set to true |
| extras | Object | Only the parameter mergeEntities can be defined, set to either true or false | No | | Section for additional properties |

Use cases

This section reports common use cases addressed by this provider.

Collect system metrics

Check the Linux optimization pack for a list of all the system metrics available in Akamas.

As a second step, choose a strategy to map your Linux component (MyLinuxComponent) to the corresponding Dynatrace entity.

Let’s assume you want to map your Dynatrace entity by id. You can find the id in the URL bar of a Dynatrace dashboard of the entity:

Grab the id and add it to the Linux component definition:

name: MyLinuxComponent
description: this is a Linux component
properties:
  dynatrace:
    id: HOST-A987D45512ABCEEE

You can leverage the name of the entity as well:

name: MyLinuxComponent
description: this is a Linux component
properties:
  dynatrace:
    name: Host1

As a third and final step, once the component is all set, you can create an instance of the Dynatrace provider and then build your first studies:

provider: Dynatrace
config:
  url: https://my_dyna_installation_url
  token: MY_DYNA_TOKEN

Optimizing a sample application running on AWS

In this example, you will go through the optimization of a Spark application running on AWS instances. We’ll be using a PageRank implementation included in Renaissance, an industry-standard Java benchmarking suite, tuning both Java and AWS parameters to improve the performance of our application.

Environment setup

For this example, you’re expected to use two dedicated machines:

  • an Akamas instance

  • a Linux-based AWS EC2 instance

The Linux-based instance will run the application benchmark, so it requires the latest OpenJDK 11 release.

Telemetry Infrastructure setup

For this study you’re going to require the following telemetry providers:

Application and Test tool

Since the application consists of a jar file only, the setup is rather straightforward; just download the binary in the ~/renaissance/ folder:

In the same folder upload the template file launch.benchmark.sh.temp, containing the script that executes the benchmark using the provided parameters and parses the results:

Optimization setup

In this section, we will guide you through the steps required to set up the optimization on Akamas.

Optimization packs

This example requires the installation of the following optimization packs:

  • AWS

  • Java OpenJDK

System

Our system could be named renaissance after its application, so you’ll have a system.yaml file like this:

Then create the new system resource:

The renaissance system will then have three components:

  • A benchmark component

  • A Java component

  • An EC2 component, i.e. the underlying instance

Java component

Create a component-jvm.yaml file like the following:

Then type:

Benchmark component

Since there is no optimization pack associated with this component, you have to create some extra resources.

  • A metrics.yaml file for a new metric tracking execution time:

  • A component-type benchmark.yaml:

  • The component pagerank.yaml:

Create your new resources, by typing in your terminal the following commands:

EC2 component

Create a component-ec2.yaml file like the following:

Then create its resource by typing in your terminal:

Workflow

The workflow in this example is composed of three main steps:

  1. Update the instance type

  2. Run the application benchmark

  3. Stop the instance

In detail:

  1. Update the instance size

    1. Generate the playbook file from the template

    2. Update the instance using the playbook

    3. Wait for the instance to be available

  2. Run the application benchmark

    1. Configure the benchmark Java launch script

    2. Execute the launch script

    3. Parse PageRank output to make it consumable by the CSV telemetry instance

  3. Stop the instance

    1. Configure the playbook to stop an instance with a specific instance id

    2. Run the playbook to stop the instance
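
The template substitution performed in the "generate" and "configure" sub-steps can be pictured with a short Python sketch. It is only an illustration under simplifying assumptions: the render helper and the config dictionary are hypothetical, tokens are assumed to have the form ${component.parameter}, and the actual FileConfigurator operator may support more (for example wildcard tokens).

```python
import re

# Matches tokens such as ${instance.aws_ec2_instance_type} (assumed token syntax)
TOKEN = re.compile(r"\$\{([A-Za-z0-9_]+)\.([A-Za-z0-9_]+)\}")


def render(template: str, config: dict) -> str:
    """Replace every ${component.parameter} token with the value chosen for the experiment."""

    def substitute(match):
        component, parameter = match.group(1), match.group(2)
        return str(config[component][parameter])

    return TOKEN.sub(substitute, template)


# Hypothetical configuration under test, keyed by component name
config = {"instance": {"aws_ec2_instance_type": "c5", "aws_ec2_instance_size": "2xlarge"}}
template = '--instance-type "${instance.aws_ec2_instance_type}.${instance.aws_ec2_instance_size}"'

print(render(template, config))
```

In this sketch the component name used in a token has to match a key of config, which mirrors the requirement that tokens in a template refer to components defined in the system.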

The following is the template of the Ansible playbook:

The following is the workflow configuration file:

Telemetry

Prometheus

  • The prometheus.yml file, located in your Prometheus folder:

  • The config.yml file you have to create in the ~/renaissance folder:

Now you can create a prometheus-instance.yaml file:

Then you can install the telemetry instance:

CSV - Telemetry instance

Create a telemetry-csv.yaml file to read the benchmark output:

Then create the resource by typing in your terminal:

Study

Here we provide a reference study for AWS. As anticipated, the goal of this study is to optimize a sample Java application: the PageRank benchmark included in the Renaissance benchmark suite by Oracle.

Our goal is rather simple: minimize the product of the benchmark execution time and the instance price, that is, find the most cost-effective instance for our application.
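
To make the goal concrete, here is a small Python sketch of how such a score ranks two configurations. The execution times and hourly prices below are made-up numbers used only for illustration, and cost_score is a hypothetical helper, not part of Akamas.

```python
def cost_score(elapsed_ns: float, hourly_price: float) -> float:
    """Score to minimize: benchmark execution time multiplied by the instance price."""
    return elapsed_ns * hourly_price


# Hypothetical measurements: (elapsed time in nanoseconds, hourly price in USD)
candidates = {
    "c5.2xlarge": (120_000_000_000, 0.34),
    "c5.4xlarge": (70_000_000_000, 0.68),
}

scores = {name: cost_score(*values) for name, values in candidates.items()}
best = min(scores, key=scores.get)
print(best)
```

With these illustrative numbers the larger instance is faster, but not fast enough to offset its price, so the smaller one gets the lower score.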

Create a study.yaml file with the following content:

Then create the corresponding Akamas resource and start the study:

Refer to Dynatrace provider metrics mapping to see how component-type metrics are extracted by this provider.

A Dynatrace API token with the privileges described here.

To generate an API Token for your Dynatrace installation you can follow these steps.

url - URL of the Dynatrace installation API (see https://www.dynatrace.com/support/help/extend-dynatrace/dynatrace-api/ to retrieve the URL of your installation)

token - A Dynatrace API Token with the proper permissions

The URL of the Dynatrace installation API (see the official reference)

The Dynatrace API Token the provider should use to interact with Dynatrace. The token should have sufficient permissions.

see Dynatrace metric aggregations

As a first step to start extracting metrics from Dynatrace, generate your API token and make sure it has the right permissions.

The Akamas instance needs to provision and manipulate instances, so it must be enabled to do so by setting AWS Policies, integrating with orchestration tools (such as Ansible), and an inventory linked to your AWS EC2 environment.

  • CSV Provider to parse the results of the benchmark

  • Prometheus provider to monitor the instance

  • AWS Telemetry provider to extract instance price

The renaissance suite provides the benchmark we’re going to optimize.

You may find further info about the suite and its benchmarks in the official doc.

To manage the instance we are going to integrate a very simple Ansible playbook in our workflow: the FileConfigurator operator will replace the parameters in the template file in order to generate the code run by the Executor operator, as explained in the Ansible page.

If you have not installed the Prometheus telemetry provider or the CSV telemetry provider yet, take a look at the telemetry provider pages Prometheus provider and CSV Provider to proceed with the installation.

Prometheus allows us to gather JVM execution metrics through the JMX exporter: download the Java agent required to gather metrics from here, then update the two following files:

You may find further info on exporting Java metrics to Prometheus here.

sudo apt install openjdk-11-jre
mkdir ~/renaissance
cd ~/renaissance
wget -O renaissance.jar https://github.com/renaissance-benchmarks/renaissance/releases/download/v0.10.0/renaissance-gpl-0.10.0.jar
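#!/bin/bash
# Run the benchmark; the ${jvm.*} token is filled in from the template with the JVM parameters under test
java -XX:MaxRAMPercentage=60 ${jvm.*} -jar renaissance.jar -r 50 --csv renaissance.csv page-rank

# Sum the duration column over all repetitions (the header row counts as 0)
total_time=$(awk -F"," '{total_time+=$2}END{print total_time}' ./renaissance.csv)
first_line=$(head -n 1 renaissance.csv)
end_time=$(tail -n 1 renaissance.csv | cut -d',' -f3)
start_time=$(sed '2q;d' renaissance.csv | cut -d',' -f4)
# Write the header, extended with the timestamp and component columns read by the CSV provider
echo "$first_line,TS,COMPONENT" > renaissance-parsed.csv
# Convert the start time from milliseconds to a formatted timestamp
ts=$(date -d "@$((start_time/1000))" "+%Y-%m-%d %H:%M:%S")

echo "page-rank,$total_time,$end_time,$start_time,$ts,pagerank" >> renaissance-parsed.csv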
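
For readers who find the shell pipeline above hard to follow, the same aggregation can be sketched in Python. This is an illustrative equivalent, not a replacement for the script: the summarize function and the sample header names are assumptions, and the timestamp is rendered in UTC here, whereas the date command uses the local time zone of the machine.

```python
import csv
import io
from datetime import datetime, timezone


def summarize(csv_text: str, component: str = "pagerank") -> str:
    """Collapse the per-repetition rows into one row, adding TS and COMPONENT columns."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    total_time = sum(int(row[1]) for row in data)   # sum of the second column
    end_time = data[-1][2]                          # third column of the last row
    start_ms = int(data[0][3])                      # fourth column of the first data row
    ts = datetime.fromtimestamp(start_ms // 1000, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    out_header = ",".join(header + ["TS", "COMPONENT"])
    out_row = ",".join(["page-rank", str(total_time), end_time, str(start_ms), ts, component])
    return out_header + "\n" + out_row + "\n"


# Hypothetical benchmark output with two repetitions (column names are assumed)
sample = (
    "benchmark,nanos,uptime_ns,vm_start_unix_ms\n"
    "page-rank,100,500,1600000000000\n"
    "page-rank,200,900,1600000000000\n"
)
print(summarize(sample))
```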
name: jvm
description: The JVM running the benchmark
componentType: java-openjdk-11
properties:
    prometheus:
      job: jmx
      instance: jmx_instance
akamas create component component-jvm.yaml renaissance
metrics:
  - name: elapsed
    unit: nanoseconds
    description: The duration of the benchmark execution
name: benchmark
description: A component type for the Renaissance Java benchmarking suite
metrics:
  - name: elapsed
parameters: []
name: pagerank
description: The pagerank application included in Renaissance benchmarks
componentType: benchmark
akamas create metrics metrics.yaml
akamas create component-type benchmark.yaml
akamas create component pagerank.yaml renaissance
name: instance
description: The ec2 instance the benchmark runs on
componentType: ec2
properties:
  hostname: renaissance.akamas.io
  sshPort: 22
  instance: ec2_instance
  username:  ubuntu
  key: # SSH KEY
  ec2:
    region: us-east-2 # This is just a reference
akamas create component component-ec2.yaml renaissance
# Change instance type, requires AWS CLI

- name: Resize the instance
  hosts: localhost
  gather_facts: no
  connection: local
  tasks:
  - name: save instance info
    ec2_instance_info:
      filters:
        "tag:Name": <your-instance-name>
    register: ec2
  - name: Stop the instance
    ec2:
      region: <your-aws-region>
      state: stopped
      instance_ids:
        - "{{ ec2.instances[0].instance_id }}"
      instance_type: "{{ ec2.instances[0].instance_type }}"
      wait: True
  - name: Change the instances ec2 type
    shell: >
       aws ec2 modify-instance-attribute --instance-id "{{ ec2.instances[0].instance_id }}"
       --instance-type "${instance.aws_ec2_instance_type}.${instance.aws_ec2_instance_size}"
    delegate_to: localhost
  - name: restart the instance
    ec2:
      region: <your-aws-region>
      state: running
      instance_ids:
        - "{{ ec2.instances[0].instance_id }}"
      wait: True
    register: ec2
  - name: wait for SSH to come up
    wait_for:
      host: "{{ item.public_dns_name }}"
      port: 22
      delay: 60
      timeout: 320
      state: started
    with_items: "{{ ec2.instances }}"
name: Pagerank AWS optimization
tasks:

  # Creating the EC2 instance
  - name: Configure provisioning
    operator: FileConfigurator
    arguments:
      sourcePath: /home/ubuntu/ansible/resize.yaml.templ
      targetPath: /home/ubuntu/ansible/resize.yaml
      host:
        hostname: bastion.akamas.io
        username: ubuntu
        key: # SSH KEY

  - name: Execute Provisioning
    operator: Executor
    arguments:
      command: ansible-playbook /home/ubuntu/ansible/resize.yaml
      host:
        hostname: bastion.akamas.io
        username: ubuntu
        key: # SSH KEY

  # Waiting for the instance to come up and set up its DNS
  - name: Pause
    operator: Sleep
    arguments:
      seconds: 120

  # Running the benchmark
  - name: Configure Benchmark
    operator: FileConfigurator
    arguments:
        source:
            hostname: renaissance.akamas.io
            username: ubuntu
            path: /home/ubuntu/renaissance/launch_benchmark.sh.templ
            key: # SSH KEY
        target:
            hostname: renaissance.akamas.io
            username: ubuntu
            path: /home/ubuntu/renaissance/launch_benchmark.sh
            key: # SSH KEY

  - name: Launch Benchmark
    operator: Executor
    arguments:
      command: bash /home/ubuntu/renaissance/launch_benchmark.sh
      host:
        hostname: renaissance.akamas.io
        username: ubuntu
        key: # SSH KEY
Create the workflow resource by typing in your terminal:
# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: prometheus
    static_configs:
    - targets: ['localhost:9090']

  - job_name: jmx
    static_configs:
    - targets: ["localhost:9110"]
    relabel_configs:
    - source_labels: ["__address__"]
      regex: "(.*):.*"
      target_label: instance
      replacement: jmx_instance
startDelaySeconds: 0
username:
password:
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
# using the property below we are telling the exporter to export only relevant Java metrics
whitelistObjectNames:
  - "java.lang:*"
  - "jvm:*"
provider: Prometheus
config:
  address: renaissance.akamas.io
  port: 9090
akamas create telemetry-instance prometheus-instance.yaml renaissance
provider: CSV
config:
  protocol: scp
  address: renaissance.akamas.io
  username: ubuntu
  authType: key
  auth: # SSH KEY
  remoteFilePattern: /home/ubuntu/renaissance/renaissance-parsed.csv
  csvFormat: horizontal
  componentColumn: COMPONENT
  timestampColumn: TS
  timestampFormat: yyyy-MM-dd HH:mm:ss

metrics:
  - metric: elapsed
    datasourceMetric: nanos
akamas create telemetry-instance telemetry-csv.yaml renaissance
name: aws
description: Tweaking aws and the JVM to optimize the page-rank application.
system: renaissance

goal:
  objective: minimize
  function:
    formula: pagerank.elapsed * instance.aws_ec2_price

workflow: Pagerank AWS optimization

parametersSelection:
  - name: instance.aws_ec2_instance_type
    categories: [c5,c5d,c5a,m5,m5d,m5a,r5,r5d,r5a]
  - name: instance.aws_ec2_instance_size
    categories: [large,xlarge,2xlarge,4xlarge]
  - name: jvm.jvm_gcType
  - name: jvm.jvm_newSize
  - name: jvm.jvm_maxHeapSize
  - name: jvm.jvm_minHeapSize
  - name: jvm.jvm_survivorRatio
  - name: jvm.jvm_maxTenuringThreshold

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 2
    values:
     instance.aws_ec2_instance_type: c5
     instance.aws_ec2_instance_size: 2xlarge
     jvm.jvm_gcType: G1
  - name: optimize
    type: optimize
    numberOfExperiments: 60
akamas create study study.yaml
akamas start study aws

Import Key Requests

By default, only requests at the service level are imported by the telemetry provider.

To import specific key requests you can follow these steps.

Currently only average response time, throughput, and error rate metrics are available for key requests.

Component Creation

Create a new component of type Web Application for each key request you want to import. This allows tracking response time, throughput, and error rates separately.

You can use the following yaml file as an example and customize it to suit your needs.

name: KeyRequestA
description: The key request A for my application
componentType: Web Application
properties:
  dynatrace:
    type: SERVICE_METHOD
    id: SERVICE_METHOD-D4BCC949D5DD656A

To instruct Akamas to import a specific key request, you just need to change the id field of the yaml above to the one that matches your key request on Dynatrace.

To obtain that ID, open the analysis page for the request as in the example below, take note of the URL of the page, and look for the SERVICE_METHOD keyword. The id is the one starting with SERVICE_METHOD and ending before the character %14.

Considering the example below, the id is SERVICE_METHOD-D4BCC949D5DD656A.
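
If you need to do this for many key requests, the lookup can be scripted. The following Python sketch is an illustration only: the extract_service_method_id helper and the sample URL are hypothetical, and the pattern assumes that ids always look like the one above (SERVICE_METHOD- followed by 16 uppercase hexadecimal characters).

```python
import re
from typing import Optional

# Assumed id shape, inferred from the single example in this page
SERVICE_METHOD_ID = re.compile(r"SERVICE_METHOD-[0-9A-F]{16}")


def extract_service_method_id(url: str) -> Optional[str]:
    """Return the first key request id found in a Dynatrace page URL, or None."""
    match = SERVICE_METHOD_ID.search(url)
    return match.group(0) if match else None


# Hypothetical URL, shaped only to illustrate where the id sits
url = "https://my-account.dynatrace.com/#servicemethod;id=SERVICE_METHOD-D4BCC949D5DD656A%14gtf=l_2_HOURS"
print(extract_service_method_id(url))
```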

Telemetry instance setup

Create a telemetry instance for your system using the yaml specified below as an example and modify it to provide your Dynatrace account and credentials. This will instruct Akamas to use key request metrics instead of service metrics.

provider: Dynatrace
config:
 url: https://<my-account>.dynatrace.com/
 token: <my-token>
metrics:
  - metric: requests_response_time
    datasourceMetric: builtin:service.keyRequest.response.time
    scale: 0.001    
    defaultValue: 0.0
    staticLabels:
      provider: dynatrace
  - metric: requests_throughput
    datasourceMetric: builtin:service.keyRequest.errors.server.successCount
    scale: 0.0166666666666666666666666666666
    defaultValue: 0.0
    staticLabels:
      provider: dynatrace        
  - metric: requests_error_rate
    datasourceMetric: builtin:service.keyRequest.errors.server.rate
    scale: 0.01
    defaultValue: 0.0
    staticLabels:
      provider: dynatrace    
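
The scale values in the example above can be read as unit conversions applied to the raw Dynatrace values. The Python sketch below works through the arithmetic; it assumes that scale is a plain multiplier and that the raw values are microseconds, requests per minute, and percentages respectively, which is an interpretation of the factors rather than something stated on this page.

```python
import math

# Multipliers copied from the telemetry instance example above
SCALES = {
    "requests_response_time": 0.001,   # assumed: microseconds -> milliseconds
    "requests_throughput": 1 / 60,     # assumed: requests per minute -> requests per second
    "requests_error_rate": 0.01,       # assumed: percentage -> fraction
}


def to_akamas_value(metric: str, raw_value: float) -> float:
    """Apply the configured scale to a raw datasource value."""
    return raw_value * SCALES[metric]


# Hypothetical raw samples
print(to_akamas_value("requests_response_time", 250_000))  # about 250 (ms)
print(to_akamas_value("requests_throughput", 120))         # about 2 (requests per second)
print(to_akamas_value("requests_error_rate", 5))           # about 0.05
```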

Prometheus provider

The Prometheus provider collects metrics from a Prometheus instance and makes them available to Akamas.

This provider includes support for several technologies (Prometheus exporters). In any case, custom queries can be defined to gather the desired metrics.

Prerequisites

This section provides the minimum requirements that you should meet before using the Prometheus provider.

Supported Prometheus versions:

Akamas supports Prometheus starting from version 2.26.

If you also use the prometheus-operator, version 0.47 or greater is required. This version is bundled with the kube-prometheus-stack since version 15.

Connectivity between the Akamas server and the Prometheus server is also required. By default, Prometheus is run on port 9090.

Supported Prometheus exporters

Supported Akamas component types

  • Kubernetes (Pod, Container, Workload, Namespace)

  • Web Application

  • Java (java-ibm-j9vm-6, java-ibm-j9vm-8, java-eclipse-openj9-11, java-openjdk-8, java-openjdk-11, java-openjdk-17)

  • Linux (Ubuntu-16.04, Rhel-7.6)

Component configuration

Akamas reasons in terms of a system to be optimized and in terms of parameters and metrics of components of that system. To understand which metrics collected from Prometheus should be mapped to a component, the Prometheus provider looks up some properties in the components of a system, grouped under the prometheus property. These properties depend on the exporter and the component type.

Nested under this property you can also include any additional field your use case may require to filter the imported metrics further. These fields will be appended in queries to the list of label matches in the form field_name=~'field_value', and can specify either exact values or patterns.

It is important that you add instance and, optionally, the job properties to the components of a system so that the Prometheus provider can gather metrics from them:

# Specification for a component, whose metrics should be collected by the Prometheus Provider
name: jvm1  # name of the component
description: jvm1 for payment services  # description of the component
properties:
  prometheus:
    instance: service0001  # instance of the component: where the component is located relative to Prometheus
    job: jmx               # job of the component: which prom exporter is gathering metrics from the component

Prometheus configuration

The Prometheus provider does not usually require a specific configuration of the Prometheus instance it uses.

When gathering metrics for hosts it's usually convenient to set the value of the instance label so that it matches the value of the instance property in a component; in this way, the Prometheus provider knows which system component each data point refers to.

Here’s an example configuration for Prometheus that sets the instance label:

# Custom global config
global:
  scrape_interval:     5s   # Set the scrape interval to every 5 seconds. The default is every 1 minute.
  evaluation_interval: 5s   # Evaluate rules every 5 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# A scrape configuration containing exactly one endpoint to scrape:
scrape_configs:
# Node Exporter
- job_name: 'node'
  static_configs:
  - targets: ["localhost:9100"]
  relabel_configs:
  - source_labels: ["__address__"]
    regex: "(.*):.*"
    target_label: instance
    replacement: value_of_instance_property_in_the_component_the_data_points_should_refer_to
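
To see what the relabel rule above does to a scraped target, here is a minimal Python simulation. It is a sketch under stated assumptions: relabel is a hypothetical helper that only handles constant replacements (no $1 capture references), and it relies on the fact that Prometheus anchors relabel regexes at both ends and joins source labels with ";".

```python
import re


def relabel(labels: dict, source_labels: list, regex: str, target_label: str,
            replacement: str, separator: str = ";") -> dict:
    """Apply a 'replace' relabel rule with a constant replacement value."""
    source = separator.join(labels.get(name, "") for name in source_labels)
    result = dict(labels)
    # The regex must match the whole source value, as in Prometheus
    if re.fullmatch(regex, source):
        result[target_label] = replacement
    return result


target = {"__address__": "localhost:9100", "job": "node"}
print(relabel(target, ["__address__"], "(.*):.*", "instance", "linux_component_name"))
```

Every target whose address has the form host:port ends up with the same instance label, which is what lets the provider match the data points to the component carrying that instance value.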

Install Prometheus provider

To install the Prometheus provider, create a YAML file (provider.yml in this example) with the definition of the provider:

Then you can install the provider using the Akamas CLI:

The installed provider is shared with all users of your Akamas installation and can monitor many different systems, by configuring appropriate telemetry provider instances.

Dynatrace Analysis view of a key request
Service Method ID

  • Node exporter (Linux system metrics)

  • JMX exporter (Java metrics)

  • cAdvisor (Docker container metrics)

  • CloudWatch exporter (AWS resources metrics)

  • Jmeter (Web application metrics)

The Prometheus provider includes queries for most of the monitoring use cases these exporters cover. If you need to specify custom queries or make use of exporters not currently supported you can specify them as described in creating Prometheus telemetry instances.

Refer to Prometheus provider metrics mapping to see how component-type metrics are extracted by this provider.

Notice: you should configure your Prometheus instances so that the Prometheus provider can leverage the instance property of components, as described in the section here above.

Setup datasource
name: Prometheus
description: Telemetry Provider that enables the import of metrics from Prometheus
dockerImage: 485790562880.dkr.ecr.us-east-2.amazonaws.com/akamas/telemetry-providers/prometheus-provider:3.5.0
akamas install telemetry-provider provider.yml
Prometheus telemetry instances

Create Prometheus telemetry instances

To create an instance of the Prometheus provider, edit a YAML file (instance.yml in this example) with the definition of the instance:

# Prometheus Telemetry Provider Instance
provider: Prometheus

config:
  address: host1  # URL or IP of the Prometheus instance from which to extract metrics
  port: 9090      # Port of the Prometheus instance from which to extract metrics

Then you can create the instance for the system using the Akamas CLI:

akamas create telemetry-instance instance.yml system

Configuration options

When you create an instance of the Prometheus provider, you should specify some configuration information to allow the provider to extract and process metrics from Prometheus correctly.

You can specify configuration information within the config part of the YAML of the instance definition.

Required properties

  • address, a URL or IP identifying the address of the host where Prometheus is installed

  • port, the port exposed by Prometheus

Optional properties

  • user, the username for the Prometheus service

  • password, the user password for the Prometheus service

  • job, a string to specify the scraping job name. The default is ".*" for all scraping jobs

  • logLevel, set this to "DETAILED" for some extra logs when searching for metrics (default value is "INFO")

  • headers, to specify additional custom headers (e.g.: headers: {key: value})

  • namespace, a string to specify the namespace

  • duration, integer to determine the duration in seconds for data collection (use a number between 1 and 3600)

  • enableHttps, boolean to enable HTTPS in Prometheus (since 3.2.6)

  • ignoreCertificates, boolean to ignore SSL certificates

  • disableConnectionCheck, boolean to disable initial connection check to Prometheus
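
Before creating an instance, it can be handy to sanity-check the config section against the constraints listed above. The Python sketch below is a hypothetical pre-flight check (validate_config is not an Akamas function) covering only the two required properties and the documented range of duration.

```python
def validate_config(config: dict) -> list:
    """Return a list of problems found in a telemetry instance config (empty if none)."""
    errors = []
    # address and port are the only required properties
    for key in ("address", "port"):
        if key not in config:
            errors.append(f"missing required property: {key}")
    # duration is optional, but must be an integer between 1 and 3600 when present
    duration = config.get("duration")
    if duration is not None and not (isinstance(duration, int) and 1 <= duration <= 3600):
        errors.append("duration must be an integer between 1 and 3600")
    return errors


print(validate_config({"address": "host1", "port": 9090, "duration": 30}))
print(validate_config({"address": "host1", "duration": 7200}))
```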

Custom queries

The Prometheus provider allows defining additional queries to populate custom metrics or redefine the default ones according to your use case. You can configure additional metrics using the metrics field as shown in the configuration below:

config:
  address: host1
  port: 9090

metrics:
  - metric: cust_metric   # extra akamas metric to monitor
    datasourceMetric: 'http_requests_total{environment=~"staging|testing|development", method!="GET"}' # query to execute to extract the metric
    labels:
    - method   # The "method" label will be retained within akamas

In this example, the telemetry instance will populate cust_metric with the results of the query specified in datasourceMetric, maintaining the value of the labels listed under labels.

Akamas placeholders

Akamas pre-processes the queries before running them, replacing special-purpose placeholders with the fields provided in the components. For example, given the following component definition:

name: jvm1
description: jvm1 for payment services
properties:
  prometheus:
    instance: service01
    job: jmx

the query sum(jvm_memory_used_bytes{instance=~"$INSTANCE$", job=~"$JOB$"}) will be expanded for this component into sum(jvm_memory_used_bytes{instance=~"service01", job=~"jmx"}). This provides greater flexibility through the templatization of the queries, allowing the same query to select the correct data sources for different components.

The following is the list of available placeholders:

Placeholder: $INSTANCE$, $JOB$
Usage example: node_load1{instance=~"$INSTANCE$", job=~"$JOB$"}
Component definition example: see Example below
Expanded query: node_load1{instance=~"frontend", job=~"node"}
Description: These placeholders are replaced respectively with the instance and job fields configured in the component’s prometheus configuration.

Placeholder: %FILTERS%
Usage example: container_memory_usage_bytes{job=~"$JOB$" %FILTERS%}
Component definition example: see Example below
Expanded query: container_memory_usage_bytes{job=~"advisor", name=~"db-.*"}
Description: This placeholder is replaced with a list containing any additional filter in the component’s definition (other than instance and job), where each field is expanded as field_name=~"field_value". This is useful to define additional label matches in the query without the need to hardcode them.

Placeholder: $DURATION$
Usage example: rate(http_client_requests_seconds_count[$DURATION$])
Expanded query: rate(http_client_requests_seconds_count[30s])

Placeholder: $NAMESPACE$, $POD$, $CONTAINER$
Usage example: 1e3 * avg(kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})
Component definition example: see Collect Kubernetes metrics
Expanded query: 1e3 * avg(kube_pod_container_resource_limits{resource="cpu", namespace=~"boutique", pod=~"adservice.*", container=~"server"})
Description: These placeholders are used within Kubernetes environments.

Example

prometheus:
  instance: frontend
  job: node
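
The expansion rules listed above can be reproduced with a few lines of Python. This is a sketch of the documented behavior, not the provider's actual code: the expand function is hypothetical, falling back to ".*" for a missing field is an assumption, and treating namespace, pod, and container as non-filter fields is inferred from the expanded Kubernetes query.

```python
import re

# Component fields consumed by dedicated placeholders (never emitted as extra filters)
PLACEHOLDER_FIELDS = ("instance", "job", "namespace", "pod", "container")


def expand(query: str, properties: dict, duration_seconds: int = 30) -> str:
    """Expand Akamas placeholders using the component's prometheus properties."""
    for field in PLACEHOLDER_FIELDS:
        # Assumption: a missing field matches everything
        query = query.replace(f"${field.upper()}$", str(properties.get(field, ".*")))
    query = query.replace("$DURATION$", f"{duration_seconds}s")
    # Remaining properties become additional label matchers
    filters = "".join(
        f', {name}=~"{value}"'
        for name, value in properties.items()
        if name not in PLACEHOLDER_FIELDS
    )
    return re.sub(r"\s*%FILTERS%", lambda _match: filters, query)


print(expand('node_load1{instance=~"$INSTANCE$", job=~"$JOB$"}',
             {"instance": "frontend", "job": "node"}))
print(expand('container_memory_usage_bytes{job=~"$JOB$" %FILTERS%}',
             {"job": "advisor", "name": "db-.*"}))
```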

Use cases

This section reports common use cases addressed by this provider.

Collect Kubernetes metrics

To gather Kubernetes metrics, the following exporters are required:

  • kube-state-metrics

  • cadvisor

As an example, you can define a component with type Kubernetes Container in this way:

name: adservice
description: The adservice of the online boutique by Google
componentType: Kubernetes Container
properties:
  prometheus:
    namespace: boutique
    pod: adservice.*
    container: server

Collect Java metrics

java -javaagent:the_downloaded_jmx_exporter_jar.jar=9100:config.yaml -jar yourJar.jar

The command will expose the Java metrics of yourJar.jar on localhost on port 9100, where they can be scraped by Prometheus.

config.yaml is the configuration file of the exporter. It is suggested to use this configuration for an optimal experience with the Prometheus provider:

startDelaySeconds: 0
username:
password:
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
# using the property below we are telling the exporter to export only relevant Java metrics
whitelistObjectNames:
- "java.lang:*"
- "jvm:*"

As a next step, add a new scraping target in the configuration of the Prometheus used by the provider:

...
scrape_configs:
# JMX Exporter
- job_name: "jmx"
  static_configs:
  - targets: ["jmx_exporter_host:9100"]

You can then create a YAML file with the definition of a telemetry instance (prom_instance.yml) of the Prometheus provider:

provider: Prometheus
config:
  address: prometheus_host
  port: 9090

And you can create the telemetry instance using the Akamas CLI:

akamas create telemetry-instance prom_instance.yml

Finally, to bind the extracted metrics to the related component, you should add the following field to the properties of the component’s definition:

prometheus:
  job: jmx

Collect system metrics

systemctl start node_exporter

Here’s the manifest of the node_exporter service:

[Unit]
Description=Node Exporter

[Service]
ExecStart=/path/to/node_exporter/executable

[Install]
WantedBy=default.target

The service will expose system metrics on localhost on port 9100, where they can be scraped by Prometheus.

As a final step, add a new scraping target in the configuration of the Prometheus used by the provider:

scrape_configs:
# Node Exporter
- job_name: "node"
  static_configs:
  - targets: ["node_exporter_host:9100"]
  relabel_configs:
  - source_labels: ["__address__"]
    regex: "(.*):.*"
    # here we put as "instance", the name of the component the metrics refer to
    target_label: "instance"
    replacement: "linux_component_name"

You can then create a YAML file with the definition of a telemetry instance (prom_instance.yml) of the Prometheus provider:

provider: Prometheus
config:
  address: prometheus_host
  port: 9090

And you can create the telemetry instance using the Akamas CLI:

akamas create telemetry-instance prom_instance.yml

Finally, to bind the extracted metrics to the related component, you should add the following field to the properties of the component’s definition:

prometheus:
  instance: linux_component_name
  job: node
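
Once everything is in place, you may want to check that Prometheus actually returns data for the labels set on the component. The Python sketch below only builds the URL of an instant query against the standard Prometheus HTTP API (/api/v1/query); the instant_query_url helper is hypothetical, and the host name and label values are the placeholders used in this page.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def instant_query_url(address: str, port: int, promql: str, https: bool = False) -> str:
    """Build the URL of a Prometheus instant query for the given PromQL expression."""
    scheme = "https" if https else "http"
    return f"{scheme}://{address}:{port}/api/v1/query?{urlencode({'query': promql})}"


promql = 'node_load1{instance=~"linux_component_name", job=~"node"}'
url = instant_query_url("prometheus_host", 9090, promql)
print(url)

# Decoding the query string gives back the original expression
print(parse_qs(urlsplit(url).query)["query"][0])
```

Opening the printed URL with any HTTP client should return a JSON document; an empty result list usually means the instance or job labels do not match what Prometheus scraped.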

Please refer to Querying basics | Prometheus for a complete reference of PromQL.

If not set in the component properties, the $DURATION$ placeholder is replaced with the duration field configured in the telemetry instance. You should use it with range vectors instead of hardcoding a fixed value.

Check the Java OpenJDK page for a list of all the Java metrics available in Akamas.

You can leverage the Prometheus provider to collect Java metrics by using the JMX Exporter. The JMX Exporter is a collector of Java metrics for Prometheus that can be run as an agent for any Java application. Once downloaded, you execute it alongside a Java application with this command:

Check the Linux page for a list of all the system metrics available in Akamas.

You can leverage the Prometheus provider to collect system metrics (Linux) by using the Node exporter. The Node exporter is a collector of system metrics for Prometheus that can be run as a standalone executable or a service within a Linux machine to be monitored. Once downloaded, schedule it as a service using, for example, systemd:

AWS Policies
Akamas high-level architecture

Optimizing cost of a Java microservice on Kubernetes while preserving SLOs in production

In this example, you will use Akamas live optimization to minimize the cost of a Kubernetes deployment, while preserving application performance and reliability requirements.

Prerequisites

In this example, you need:

  • an Akamas instance

  • a Kubernetes cluster, with a deployment to be optimized

  • the kubectl command installed in the Akamas instance, configured to access the target Kubernetes and with privileges to get and update the deployment configurations

  • a supported telemetry data source (e.g. Prometheus or Dynatrace) configured to collect metrics from the target Kubernetes cluster

Optimization setup

Optimization packs

This example leverages the following optimization packs:

  • Kubernetes

  • Web application

System

The system represents the Kubernetes deployment to be optimized (let's call it "frontend"). You can create a system.yaml manifest like this:

name: frontend
description: Kubernetes frontend deployment

Create the new system resource:

akamas create system system.yaml

The system will then have two components:

  • A Kubernetes container component, which contains container-level metrics like CPU usage and parameters to be tuned like CPU limits

  • A Web Application component, which contains service-level metrics like throughput and response time

In this example, we assume the deployment to be optimized is called frontend, with a container named server, and is located within the boutique namespace. We also assume that Dynatrace is used as a telemetry provider.

Kubernetes component

Create a component-container.yaml manifest like the following:

name: container
description: Kubernetes container, part of the frontend deployment
componentType: Kubernetes Container
properties:
  dynatrace:
    type: CONTAINER_GROUP_INSTANCE
    kubernetes:
      namespace: boutique
      containerName: server
      basePodName: frontend-*

Then run:

akamas create component component-container.yaml frontend

Now create a component-webapp.yaml manifest like the following:

name: webapp
description: The service related to the frontend deployment
componentType: Web Application
properties:
  dynatrace:
    id: <TELEMETRY_DYNATRACE_WEBAPP_ID>

Then run:

akamas create component component-webapp.yaml frontend

Workflow

The workflow in this example is composed of four main steps:

  1. Update the Kubernetes deployment manifest with the Akamas recommended deployment parameters (CPU and memory limits)

  2. Apply the new parameters (kubectl apply)

  3. Wait for the rollout to complete

  4. Sleep for 30 minutes (observation interval)

Create a workflow.yaml manifest like the following:

name: frontend
tasks:
  - name: configure
    operator: FileConfigurator
    arguments:
      source:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml.templ
      target:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
        path: frontend.yaml

  - name: apply
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl apply -f frontend.yaml

  - name: verify
    operator: Executor
    arguments:
      timeout: 5m
      host:
        hostname: mymachine
        username: user
        key: /home/user/.ssh/key
      command: kubectl rollout status --timeout=5m deployment/frontend -n boutique;

  - name: observe
    operator: Sleep
    arguments:
      seconds: 1800

Then run:

akamas create workflow workflow.yaml

Telemetry

Create a telemetry.yaml manifest like the following:

provider: Dynatrace
config:
  url: <YOUR_DYNATRACE_URL>
  token: <YOUR_DYNATRACE_TOKEN>
  pushEvents: false

Then run:

akamas create telemetry-instance telemetry.yaml frontend

Study

In this live optimization:

  • the goal is to reduce the cost of the Kubernetes deployment. In this example, the cost is based on the amount of CPU and memory limits (assuming requests = limits).

  • the approval mode is set to manual, and a new recommendation is generated daily

  • to avoid impacting application performance, constraints are specified on desired response times and error rates

  • to avoid impacting application reliability, constraints are specified on peak resource usage and out-of-memory kills

  • the parameters to be tuned are the container CPU and memory limits (we assume requests=limits in the deployment file)

Create a study.yaml manifest like the following:

name: frontend
system: frontend
workflow: frontend
requireApproval: true

goal:
  objective: minimize
  function:
    formula: (((container.container_cpu_limit/1000) * 3) + (container.container_memory_limit/(1024*1024*1024)))
  constraints:
    absolute:
      - name: Response Time
        formula: webapp.requests_response_time <= 300
      - name: Error Rate
        formula: webapp.service_error_rate:max <= 0.05
      - name: Container CPU saturation
        formula: container.container_cpu_util:p95 < 0.8
      - name: Container memory saturation
        formula: container.container_memory_util:max < 0.7
      - name: Container out-of-memory kills
        formula: container.container_oom_kills_count == 0

parametersSelection:
  - name: container.cpu_limit
    domain: [300, 1000]
  - name: container.memory_limit
    domain: [800, 1536]

windowing:
  type: trim
  trim: [5m, 0m]
  task: observe

workloadsSelection:
  - name: webapp.requests_throughput

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 48
    values:
      container.cpu_limit: 1000
      container.memory_limit: 1536

  - name: optimize
    type: optimize
    numberOfTrials: 48
    numberOfExperiments: 100
    numberOfInitExperiments: 0
    maxFailedExperiments: 50
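
As a worked example of the goal formula in the study above, the following Python sketch evaluates the cost for the baseline and for the smallest configuration in the parameter domains. It assumes, as the formula suggests, that container_cpu_limit is expressed in millicores and container_memory_limit in bytes, and that the memory_limit parameter is expressed in MiB; deployment_cost is a hypothetical helper.

```python
import math

GIB = 1024 ** 3
MIB = 1024 ** 2


def deployment_cost(cpu_limit_millicores: float, memory_limit_bytes: float) -> float:
    """Cost proxy from the study goal: each CPU core weighs 3, each GiB of memory weighs 1."""
    return (cpu_limit_millicores / 1000) * 3 + memory_limit_bytes / GIB


baseline = deployment_cost(1000, 1536 * MIB)   # 1 core and 1.5 GiB
smallest = deployment_cost(300, 800 * MIB)     # lower bounds of both parameter domains

print(baseline)
print(round(smallest, 5))
```

The baseline evaluates to 3 + 1.5 = 4.5, while the smallest allowed configuration evaluates to roughly 0.9 + 0.78 = 1.68, which gives an idea of the maximum saving the optimizer could recommend if all constraints were still met.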

Then run:

akamas create study study.yaml

You can now follow the live optimization progress and explore the results using the Akamas UI for Live optimizations.
