
Reference

This guide provides a Glossary describing Akamas key concepts, with their associated construct templates, command-line commands, and user interfaces.

This guide also provides references to:

  • Construct templates

  • Workflow Operators: workflow structure and available operators

  • Telemetry metric mapping

  • Metrics and parameters

  • Telemetry Providers

  • Optimization Packs

  • Commands to administer Akamas, manage users, authenticate, and manage its resources

Glossary

This section provides a definition of Akamas' key concepts and terms and also provides references to the related construct properties, commands, and user interfaces.

Term | Definition
System | systems targeted by optimization studies
Component | elements of the system
Component Type | types associated to a system component
Optimization Pack | objects encapsulating knowledge about component types
Metric | a measured metric, collected via telemetry providers
Parameter | tunable parameters, set via native or other interfaces
Telemetry Provider | general definition of providers of collected metrics
Telemetry Instance | specific instances of telemetry providers
Workflow | automation workflow to set parameters, collect metrics and run load testing
Goal & Constraints | goal and constraints defined for an optimization study
Optimization Study | optimization studies for a target system
Offline Optimization Study | optimization studies for a non-live system
Live Optimization Study | optimization studies for a live system
Workspace | virtual environments to organize and isolate resources

System

A system represents the entire system that is the target of the optimization.

A system is a single object irrespective of the number or type of entities or layers that are in the scope of the optimization. It can be used to model and describe a wide set of entities, such as:

  • An N-layer application

  • The full micro-services stack of an application

  • A single micro-service

  • A single (or a collection of) batch job(s)

A system is made of one or more components. Each component represents one of the elements in the system, whose parameters are involved in the optimization or whose metrics are collected to evaluate the results of such an optimization.

Construct

A system is described by the following properties:

  • a name that uniquely identifies the system

  • a description that clarifies what the system refers to

The construct to be used to define a system is described on the System template page.

Commands

A system is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

The Akamas UI displays systems (depending on the user privileges on the defined workspaces) in a specific top-level menu.

Component Type

A component type is a blueprint for a component that describes the type of entity the component refers to. In Akamas, a component needs to be associated with a component type, from which the component inherits its metrics and parameters.

Component types are platform entities (i.e.: shared among all the users) usually provided off the shelf and shipped within the Optimization Packs. Typically, different component types within the same optimization pack are used to model different versions/releases of the same technology.

Akamas' users with appropriate privileges can create custom component types and optimization packs, as described on the Creating custom optimization pack page.

Construct

A component type is described by the following mandatory properties (other properties can be defined but are not mandatory):

  • a name that uniquely identifies the component type within the system

  • a description that clarifies what the component type refers to

  • a parameter definitions array (more on Parameters later)

  • a metrics array (more on Metrics later)

The construct to be used to define a component type is described on the Component type template page.

Commands

A component type is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

When visualizing system components, the component type is displayed.

The following figure shows the out-of-the-box JVM component types related to the JVM optimization pack.

Telemetry Provider

A telemetry provider is a software object that represents a data source of metrics. A telemetry instance is a specific instance of a telemetry provider that refers to a specific data source.

Examples of telemetry providers are:

  • monitoring tools (e.g. Prometheus or Dynatrace)

  • load testing tools (e.g. LoadRunner or Neoload)

  • CSV files

A telemetry provider is a platform-wide entity that can be reused across systems to ease the integration with metrics sources.

Akamas provides a number of out-of-the-box telemetry providers. Custom telemetry providers can also be created.

Construct

The construct to be used to define a telemetry provider is described on the Telemetry Provider template page.

Commands

A telemetry provider is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

The Akamas UI shows telemetry providers in a specific top-level menu.

Telemetry Instance

A telemetry instance is an instance of a telemetry provider that collects data from a specific data source, providing the required information on how to connect to it and which set of metrics to collect.

While telemetry providers are platform-wide entities, telemetry instances are defined at the level of each system.

Construct

The construct to be used to define a telemetry instance is described on the Telemetry Instance template page.

Commands

A telemetry instance is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

Telemetry instances are displayed in the Akamas UI when drilling down each system component.

Goals & Constraints

The optimization goal defines the objective of an optimization study to be achieved by changing the system parameters to modify the system behavior while also satisfying any defined optimization constraints on the system metrics, possibly representing SLOs.

Construct

A goal is defined by:

  • an optimization objective: either maximize or minimize

  • a scoring function (scalar): either a single metric or a formula defined by one or more metrics

One or more constraints can be associated with a goal:

  • a formula defined on one or more metrics, referring to either absolute values (absolute constraints) or values relative to a baseline (relative constraints)

Notice that relative constraints are only supported by offline optimization studies, while absolute constraints are supported by both offline and online optimization studies.

Goals and constraints are not an Akamas resource as they are defined as part of an optimization study. The construct to be used to define a goal and its constraints is described on the Goal & Constraint page of the Study template section.
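To make the structure concrete, a goal with one absolute constraint might be sketched as follows; the field names (goal, objective, function, constraints) and the metric names are illustrative assumptions, while the exact schema is given by the Goal & Constraint template:

    goal:
      objective: minimize                      # either maximize or minimize
      function:
        formula: jvm1.memory_used              # scoring function: a single metric or a formula over metrics
      constraints:
        absolute:
          - application.response_time <= 500   # an SLO expressed as an absolute constraint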

Commands

Goals and constraints are not an Akamas resource and are always defined as part of an optimization study.

User Interface

Goals and constraints are displayed in the Akamas UI when drilling down each optimization study.

The detail of the formula used to define the goal may also be displayed:

Workspace

A workspace is a virtual environment that groups systems, workflows, and studies to restrict user access to them: a user can access these resources only when granted the required permissions to that workspace.

Akamas defines two user roles according to the assigned permission on the workspace:

  • Contributors (write permission) can create and manage workspace resources (studies, telemetry instances, systems, and workflows) and can also do exports/imports, view all global resources (Optimization Packs, and Telemetry Providers), and see remaining credits;

  • Viewers (read permission) can only access optimization results but cannot create or modify workspace resources.

Workspaces and accesses are managed by users with administrative privileges. A user with administrator privileges can manage licenses, users, workspaces, and install/deinstall Optimization Packs, and Telemetry Providers.

Workspaces can be defined according to different criteria, such as:

  • By department (e.g. Performance, Development)

  • By initiative (e.g. PoC, Training)

  • By application (e.g. Registry, Banking)

A workspace is described by the following property:

  • a name that uniquely identifies the workspace

Commands

A workspace is an Akamas resource that can be managed via CLI using the resource management commands. See also the page devoted to commands on how to define users and workspaces.

User Interface

The workspace a study belongs to is always displayed. Filters can be used to select only studies belonging to specific workspaces.

Windowing policy

The Windowing field in a study specifies the windowing policy to be adopted to score the experiments of an optimization study.

The two available windowing strategies have different structures:

  • Trim windowing: trims the temporal interval of a trial, both from the start and from the end, by a specified temporal amount - this is the default strategy

  • Stability windowing: discards temporal intervals in which a given metric is not stable and selects the temporal interval in which a metric is maximized or minimized

In case the windowing strategy is not specified, the entire time window is considered.

Metric selection

The metricsSelection field in a study specifies which metrics of the system need to be tracked while running the study; it does not affect the optimization.

In case this selection is not specified, all metrics are considered.

A metrics selection can either assume the value all, to indicate that all the available metrics of the system of the study should be tracked, or a list of the names of the metrics of the system that should be tracked, each prepended with the name of the component.

Example

The following fragment is an example:

    metricsSelection:
      - application.response_time
      - application.error_rate
      - jvm1.gc_time

Component

A component represents an element of a system. Typically, systems are made up of different entities and layers, which can be modeled by components. In other words, a system can be considered a collection of related components.

Notice that a component is a black-box definition of each entity involved in an optimization study, so detailed modeling of the entities involved in the optimization is not required. The only relevant elements are the parameters that are involved in the optimization and the metrics that are collected to evaluate the results of such an optimization.

Notice that only the entities that are directly involved in the optimization need to be modeled and defined within Akamas. An entity is involved in an optimization study if it is optimized or monitored by Akamas, where "optimized" means that Akamas is optimizing at least one of its parameters, and "monitored" means that Akamas is monitoring at least one of its metrics.

KPI

A KPI is a metric that is worth considering when analyzing the result of an offline optimization study, looking for (sub)optimal configurations generated by Akamas AI to be applied.

Akamas automatically considers any metric referred to in the defined optimization goal and constraints for an offline optimization study as a KPI. Moreover, any other metrics of the system component can be specified as a KPI for an offline optimization study.

Construct

A KPI is defined as follows (from the UI or the CLI):

System template

Systems are defined using a YAML manifest with the following structure:

    name: "<string>"
    description: "<string>"

with the following properties:

Field | Type | Value restrictions | Is required | Default Value | Description
name | string | - | TRUE | - | The name of the system
description | string | - | TRUE | - | A description to characterize the system

Example

The following represents a system (for a Cassandra-related system):

    name: system1
    description: my system with 3 nodes of cassandra

Sleep Operator

The Sleep operator pauses a workflow for a given amount of time.

Operator arguments

Name | Type | Value Restrictions | Required | Default | Description
seconds | Number (integer) | seconds > 0 | Yes | - | The number of seconds for which to pause the workflow

Examples

Pause a workflow for 30 seconds:

    name: Pause
    operator: Sleep
    arguments:
      seconds: 30

Telemetry Provider template

Telemetry Providers are defined using a YAML manifest with the following structure:

    name: "<string>"
    description: "<string>"
    dockerImage: "<string>"

with the following properties:

Name | Type | Description | Mandatory
name | string | The name of the Telemetry Provider. This name will be used to reference the Telemetry Provider in the Telemetry Provider Instances. This is unique in an Akamas instance | yes
description | string | A description for the Telemetry Provider | yes
dockerImage | string | The docker image of the Telemetry Provider | yes

Refer to the page Integrating Telemetry Providers, which describes the out-of-the-box Telemetry Providers that are created automatically at Akamas install time.

Steps

The Steps field in a study specifies the sequence of steps executed while running the study. The steps are listed in exactly the same order in which they will be executed.

The following types of steps are available:

  • Baseline: performs an experiment and sets it as a baseline for all the other ones

  • Bootstrap: imports experiments from other studies

  • Preset: performs an experiment with a specific configuration

  • Optimize: performs experiments and generates optimized configurations

Notice that the structure corresponding to the steps is different for the different types of steps.

Optimization Pack

An optimization pack is a software object that provides a convenient facility for encapsulating all the knowledge (e.g. metrics, parameters with their default values and domain ranges) required to apply Akamas optimizations to a set of entities associated with the same technology.

Notice that while optimization packs are very convenient for modeling systems and creating studies, it is not required for these entities to be covered by an optimization pack.

Akamas provides a library of out-of-the-box optimization packs, and new custom optimization packs can be easily added (no coding is required).

Docker optimization pack

The Docker optimization pack provides support for optimizing Docker containers.

Component Types

The following component types are supported for Docker.

Component Type | Description
Docker container | -
Construct

A component is described by the following mandatory properties (other properties can be defined but are not mandatory):

  • a name that uniquely identifies the component within the system

  • a description that clarifies what the component refers to

  • a component type that identifies the technology of the component (see component type)

In general, a component contains a set of each of the following:

  • parameter(s) in the scope of the optimization

  • metric(s) needed to define the optimization goal

  • metric(s) needed to define the optimization constraints

  • metric(s) that are not needed to either define the optimization goal or constraints, and hence not used by Akamas to perform the optimization, but are collected in order to support the analysis (and which can possibly be added at a later time as part of the optimization goal or constraints when refining the optimization)

The construct to be used to define a component is described on the Component template page.

Commands

A component is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

The Akamas UI shows more details about components by drilling down into their respective system.

Field name | Field description
Name | Name of the KPI that will be used on UI labels
Formula | Must be defined as <Component_name>.<metric_name>
Direction | Must be 'minimize' or 'maximize'

KPIs are not an Akamas resource as they are defined as part of an optimization study. The construct to define KPIs is described on the KPIs page of the Study template section.

Commands

KPIs are not an Akamas resource and are always defined as part of an optimization study.

User Interface

The number and the first KPIs are displayed in the Akamas UI in the header of each offline optimization study.

Number and first KPIs displayed in the optimization study

The full list of KPIs is displayed by drilling down to the KPIs section.

KPI section displaying all KPIs

From this section, it is possible to modify the list of KPIs and change their names and other attributes.
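Based on the fields listed above, a KPI definition can be sketched as follows; the YAML layout (a kpis list) is an assumption for illustration, while the fields themselves are those described in the table:

    kpis:
      - name: Response Time                  # label used on UI labels
        formula: application.response_time   # <Component_name>.<metric_name>
        direction: minimize                  # 'minimize' or 'maximize'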

Construct

An optimization pack needs to include the entities that encapsulate technology-specific information related to the supported component types:

  • supported component types

  • parameters and metrics for each component type

  • supported telemetry providers (optional)

Commands

An optimization pack is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

The Akamas UI shows optimization packs in a specific top-level menu.

An optimization pack encapsulates one or more of the following technology-specific elements:

  • Component Types: these represent the type of the component(s) included, each with its associated parameters and metrics

  • Telemetry Providers: these define where to collect metrics

An optimization pack enables Akamas users to optimize a technology without necessarily being an expert in that technology, and to encode their knowledge about a technology or a specific application so that it can be reused in multiple optimization studies to ease the modeling process.

Installing

Here’s the command to install the Docker optimization pack using the Akamas CLI:

    akamas install optimization-pack Docker

Optimization Study

An optimization study (or study for short) represents an optimization initiative aimed at optimizing a goal on a target system. A study instructs Akamas about the space to explore and the KPIs used to evaluate whether a configuration is good or bad.

Akamas supports two types of optimizations:

  • Offline Optimization Studies are optimization studies where the workload is simulated by leveraging a load-testing tool.

  • Live Optimization Studies are applied to systems that need to be optimized in production with respect to varying workloads observed while running live. For example, a microservices application can be optimized live by having Kubernetes and JVM parameters dynamically tuned for multiple microservices so as to minimize costs while matching response time objectives.

Construct

A study is described by the following properties:

  • system: the system under optimization

  • parameters: the set of parameters being optimized

  • metrics: the set of metrics to be collected

  • workflow: the workflow describing tasks to perform experiments/trials

  • goal: the desired optimization goal to be achieved

  • constraints: the optimization constraints that any configuration needs to satisfy

  • steps: the steps that are executed to run specific configurations (e.g. the baseline) and run the optimization

The construct to be used to define an optimization study is described on the Study template page.

Commands

An optimization study is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

The Akamas UI shows optimization studies in 2 specific top-level menus: one for offline optimization studies and another for live optimization studies.

Parameter selection

The ParameterSelection field in a study specifies which parameters of the system should be tuned while running the optimization study.

In case this selection is not specified, all parameters are considered.

A parameter selection can either assume the value all, to indicate that all the available parameters of the system of the study should be tuned, or a list with items shaped like the one below:

Field | Type | Value restriction | Is required | Default value | Description

Notice that, by default, every parameter specified in the parameter selection of a study is applied. This can be modified by leveraging the parameter rendering options.

Example

The following fragment is an example:
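The sketch below assumes the selection is a list of component-prefixed parameter names, optionally with study-only overrides such as a custom domain or a restricted category list; all names and the exact field layout are illustrative assumptions:

    parametersSelection:
      - name: jvm1.maxHeapSize
        domain: [512, 4096]              # hypothetical custom domain used only for this study
      - name: jvm1.gcType
        categories: ["G1", "Parallel"]   # hypothetical restricted set of categories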

    Workflow

    A workflow is a set of tasks that run in sequence to evaluate a configuration as part of an optimization study. A task is a single action performed within a workflow.

    Workflows allow you to automate Akamas optimization studies, by automatically executing a sequence of tasks such as initializing an environment, triggering load testing, restoring a database, applying configurations, and much more.

    These are examples of common tasks:

    • Launch remote commands via SSH

    • Apply parameter values in configuration files

    • Execute Spark jobs via spark-submit API

    • Start performance tests by integrating with external tools such as Neoload
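Tasks like these are assembled into a workflow manifest. As a minimal sketch, reusing the Sleep task shown for the Sleep operator elsewhere in this guide (the overall layout with a tasks list is an assumption for illustration; see the Workflow template for the exact schema):

    name: sample-workflow
    tasks:
      - name: Pause
        operator: Sleep
        arguments:
          seconds: 30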

Workflows are first-class entities that can be defined globally and then used in multiple optimization studies.

Akamas provides several workflow operators that can be used to perform tasks in a workflow. Some operators are general-purpose, such as those executing a command or script on a specific host, while others provide native integrations with specific technologies and tools, such as Spark History Server or load testing tools.

Construct

The construct to be used to define a workflow is described on the Workflow template page.

Commands

A workflow is an Akamas resource that can be managed via CLI using the resource management commands.

User Interface

The Akamas UI shows workflows in a specific top-level menu.

The list of tasks is displayed when drilling down to each specific workflow.

    Telemetry metric mapping

    This section documents the mapping between the metrics provided by Telemetry Providers and the Akamas metrics for each supported component type.

Telemetry Provider | Telemetry Provider metric mapping
CSV | no predefined mapping as the CSV provider is extensible
Dynatrace | Dynatrace metrics mapping
Prometheus | Prometheus metrics mapping
NeoLoadWeb | NeoLoadWeb metrics mapping
Spark History Server | Spark History Server metrics mapping
Load Runner Enterprise | Load Runner metrics mapping
Load Runner Professional | Load Runner metrics mapping
AWS | -

    Java OpenJDK optimization pack

    The Java-OpenJDK optimization pack enables the ability to optimize Java applications based on the OpenJDK and Oracle HotSpot JVM. Through this optimization pack, Akamas is able to tackle the problem of performance of JVM-based applications from both the point of view of cost savings and quality of service.

    To achieve these goals the optimization pack provides parameters that focus on the following areas:

    • Garbage collection

    • Heap

    • JIT

    Similarly, the bundled metrics provide visibility on the following aspects of tuned applications:

    • Heap and memory utilization

    • Garbage collection

    • Execution threads

    Component Types

    The optimization pack supports the most used versions of OpenJDK and Oracle HotSpot JVM.

    Component Type
    Description

    Installing

Here’s the command to install the Java OpenJDK optimization pack using the Akamas CLI:

For more information on the process of installing or upgrading an optimization pack, refer to the dedicated page.

    Component template

Components are defined using a YAML manifest with the following structure:

    name: "<string>"
    description: "<string>"
    componentType: "<string>"

with the following properties:

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

Examples

Example of a component for OpenJDK11:

Example of a component for the Linux operating system:

Example of a component for the branin analytical function:

    name: branin
    description: The branin analytical function
    componentType: function_branin
    properties:
      hostname: function-server

    Workload selection

    The workloadsSelection is a structure used to define the metrics that are used by Akamas to model workloads as part of a live optimization study.

    workloadsSelection:
      - name: component1.metric1
      - name: component2.metric2:p95

    with the following fields:

    Field
    Type
    Value restriction
    Is required
    Default value
    Description

Notice that workload metrics must have been defined in the metricsSelection. Variables used in the name field can include an aggregation (a valid metric aggregation such as min, max, avg, sum, p95, etc.; if unspecified, the default is avg). The following aggregations are available: avg, min, max, sum, p90, p95, p99.

    Examples

    The following refers to a workload represented by the metric transactions_throughput of the konakart component with multiple aggregations:
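A sketch of such a selection, following the component.metric:aggregation syntax shown above; the component and metric names come from the text, while the exact fragment is illustrative:

    workloadsSelection:
      - name: konakart.transactions_throughput:avg
      - name: konakart.transactions_throughput:p95
      - name: konakart.transactions_throughput:p99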

    General operator arguments

    All operators accept some common, optional, arguments that allow you to control how the operator is executed within your workflow.

    The following table reports all the arguments that can be used with any operator.

Name | Type | Value Restrictions | Required | Default | Description
retries | integer | - | no | 1 | How many times a task can be re-executed in case of failures. If a task reaches the maximum number of retries and fails, the entire workflow execution is aborted and the trial is considered failed.
retry_delay | string | string (supporting seconds, minutes and hours) or int (seconds only) | no | 5m | How much time to wait before retrying a failed task.
timeout | string | string (supporting seconds, minutes and hours) or int (seconds only) | no | Infinite | The maximum time a task can run before it is considered failed.
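For illustration, here is the Sleep task from the Sleep operator example with the common arguments set; whether these arguments sit at the task level, as sketched here, or elsewhere in the task structure should be checked against the Workflow template:

    name: Pause
    operator: Sleep
    arguments:
      seconds: 30
    retries: 3           # re-execute up to 3 times on failure
    retry_delay: 30s     # wait 30 seconds between retries
    timeout: 10m         # consider the task failed if it runs longer than 10 minutes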

Go optimization pack

    The Go optimization pack provides support for optimizing Go applications.

    Component Types

    The following component types are supported for Go applications.

    Component Type
    Description

    Installing

    Here’s the command to install the Go optimization pack using the Akamas CLI:

    Node JS optimization pack

The Node JS optimization pack enables the ability to optimize applications based on Node.js running on the V8 engine.

    The following component types are supported for NodeJS applications.

    Component Type
    Description

    Node JS 18 runtime

    Installing

    Here’s the command to install the Node JS optimization pack using the Akamas CLI:

For more information on the process of installing or upgrading an optimization pack, refer to the dedicated page.

    Kubernetes Namespace

    This page describes the Optimization Pack for the Kubernetes Namespace component type.

Metrics

Name | Unit | Description
k8s_namespace_cpu_limit | millicores | The CPU limit for the namespace
k8s_namespace_cpu_request | millicores | The CPUs requested for the namespace
k8s_namespace_memory_limit | bytes | The memory limit for the namespace
k8s_namespace_memory_request | bytes | Memory requested for the namespace
k8s_namespace_running_pods | pods | The number of running pods in the namespace

    Parameters

    There are no parameters for the Kubernetes Namespace component type.

    WebSphere optimization pack

The WebSphere optimization pack provides support for optimizing WebSphere middleware.

    Component Types

    The following component types are supported for WebSphere middleware.

    Component Type
    Description

    Installing

    Here’s the command to install the WebSphere optimization pack using the Akamas CLI:

    Parameter

A parameter is a property of the system that can be applied and tuned to change the system's behavior. Akamas optimizes systems by changing parameters to achieve the stated goal while respecting the defined constraints.

    Examples of a parameter include:

    • Configuration knobs (e.g. JVM garbage collection type)

    Construct templates

    This section describes all the structures that can be used to define resources and objects in Akamas.

    Resource
    Construct template

    Bootstrap step

A bootstrap step imports experiments from other studies and makes them part of the current study, so that Akamas can use them as a reference when measuring the effectiveness of an optimization conducted on a system.

    When a bootstrap step imports an experiment from another study, the step copies not only the experiment but also its trials and the system metrics generated during its execution.

    The bootstrap step has the following structure:

    Field
    Type
    Value restrictions
    Is required

    Kubernetes Pod

    This page describes the Optimization Pack for the Kubernetes Pod component type.

    Metrics

    Name
    Unit
    Description

Trim windowing

A windowing policy of type trim trims the temporal interval of a trial, both from the start and from the end, by a specified temporal amount (e.g., 3 seconds).

The trim windowing has the following structure:

Field | Type | Value restrictions | Is required | Default value | Description
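A hypothetical fragment for a study using this policy; the windowing and trim field names and the value format are assumptions for illustration, and the authoritative schema is the one described by the fields above:

    windowing:
      type: trim
      trim: [30s, 30s]   # discard 30 seconds at the start and at the end of each trial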

    Parameter rendering

    The renderParameters and doNotRenderParameters can be used to specify which configuration parameters should be rendered when doing experiments within a step.

    Parameter rendering can be defined at the step level for baseline, preset, and optimize steps. This is not possible for bootstrap steps as bootstrapped experiments are not executed.

    Field
    Type
    Value restrictions
    Is required

    Metric

    A metric is a measured property of a system.

    Examples of a metric include:

    • the response time of an application

    • the utilization of a CPU

    Kubernetes optimization pack

    The Kubernetes optimization pack allows optimizing containerized applications running on a Kubernetes cluster. Through this optimization pack, Akamas is able to tackle the problem of distributing resources to containerized applications in order to minimize waste and ensure the quality of service.

    To achieve these goals the optimization pack provides parameters that focus on the following areas:

    • Memory allocation

    Kubernetes Cluster

    This page describes the Optimization Pack for the Kubernetes Cluster component type.

    Metrics

    Name
    Unit
    Description

    Web Application optimization pack

The Web Application optimization pack provides a component type apt for monitoring the performance of a generic web application from the end-user perspective, to evaluate the configuration of the technologies in the underlying stack.

    The bundled component type provides Akamas with performance metrics representing concepts like throughput, response time, error rate, and user load, split into different levels of detail such as transaction, page, and single request.

    hashtag
    Component Types

    name: branin
    description: The branin analytical function
    componentType: function_branin
    properties:
      hostname: function-server

    retries

    integer

    -

    no

    1

    How many times a task can be re-executed in case of failures. If a task reaches the maximum number of retries and fails the entier workflow execution is aborted and the trial is considered failed.

    retry_delay

    string

    string (supporting seconds, minutes, and hours) or int (seconds only)

    no

    5m

    How much time to wait before retrying a failed task.

    timeout

    string

    string (supporting seconds, minutes, and hours) or int (seconds only)

    no

    Infinite

    The maximum time a task is allowed to run: if the timeout is exceeded, the task is considered failed.
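
    Putting these fields together, a workflow task with retry and timeout settings might look like the following sketch (the task name, operator, and command are illustrative):

    ```yaml
    - name: run_benchmark            # illustrative task name
      operator: Executor             # operator documented in the Workflow Operators pages
      arguments:
        command: "bash run_benchmark.sh"
      retries: 2                     # re-execute the task up to 2 times on failure
      retry_delay: 5m                # wait 5 minutes between retries
      timeout: 1h                    # consider the task failed after 1 hour
    ```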

    workflow: the workflow describing tasks to perform experiments/trials

  • goal: the desired optimization goal to be achieved

  • constraints: the optimization constraints that any configuration needs to satisfy

  • steps: the steps that are executed to run specific configurations (e.g. the baseline) and run the optimization

  • Live Optimization Studies
    system
    parameters
    metrics
    Study template
    Akamas resource
    resource management commands.
    Container

    Dynatrace metrics mapping

    Prometheus

    Prometheus metrics mapping

    NeoLoadWeb

    NeoLoadWeb metrics mapping

    Spark History Server

    Spark History Server metrics mapping

    Load Runner Enterprise

    Load Runner metrics mapping

    Load Runner Professional

    Load Runner metrics mapping

    AWS

    CSV
    Dynatrace
    workflow operators
    Workflow template
    Akamas resource
    resource management commands.

    Aggregation

    A valid metric aggregation such as min, max, avg, sum, p95, etc. If unspecified, the default is avg

    bytes

    Memory requested for the namespace

    k8s_namespace_running_pods

    pods

    The number of running pods in the namespace

    k8s_namespace_cpu_limit

    millicores

    The CPU limit for the namespace

    k8s_namespace_cpu_request

    millicores

    The CPUs requested for the namespace

    k8s_namespace_memory_limit

    bytes

    The memory limit for the namespace

    k8s_namespace_memory_request

    A custom domain for the parameter to be used only for the study

    categories

    array of string

    should be set only if the parameter has a domain of type categorical, and be compatible with the domain defined in the component-type the component_name refers to.

    FALSE

    A custom set of categories for the parameter to be used only for the study.

    name

    string

    should match the following syntax:

    component_name.parameter_name

    where component_name is an existing component, where parameter_name is an existing parameter that is associated with the component-type of the component component_name

    TRUE

    The name of the parameter to be tuned including the name of the component it refers to

    domain

    array of numbers

    should be of size 2, contain either all integers or all real numbers (do not omit the "."), be set only if the parameter has a domain of type integer or real, and be compatible with the domain defined in the component-type the component_name refers to

    FALSE
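
    For example, a parameters selection can override the domain of a numerical parameter and the categories of a categorical one (component and parameter names are illustrative):

    ```yaml
    parametersSelection:
      - name: jvm.jvm_maxG1NewSizePercent
        domain: [10, 90]                             # custom integer domain for this study
      - name: webserver.aws_ec2_instance_size
        categories: ["large", "x.large", "2x.large"] # custom categories for this study
    ```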

    Parameter rendering

    Java OpenJDK 8

    Java OpenJDK 8 JVM

    Java OpenJDK 11

    Java OpenJDK 11 JVM

    Java OpenJDK 17

    Java OpenJDK 17 JVM

    Install Optimization Packs

    name

    string

    should match the following regexp:

    ^[a-zA-Z][a-zA-Z0-9_]*$

    that is, only letters, numbers, and underscores, not starting with a number or an underscore

    Notice: this should not match the name of another component

    TRUE

    The name of the component.

    description

    string

    TRUE

    A description to characterize the component.

    componentType

    string

    notice: this should match the name of an existing component-type

    TRUE

    The name of the component-type that defines the type of the component.

    properties

    object

    FALSE

    General custom properties of the component. These properties can be defined freely and usually have the purpose to expose information useful for configuring the component.

    name

    string

    should match the following syntax:

    component_name.metric_name<:aggregation>

    where component_name is an existing component, metric_name is an existing metric associated with the component-type of the component component_name, and aggregation is an optional aggregation (default avg)

    TRUE

    The metric of the component that represents the workload
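
    For example, the following workload selection references a metric with the default aggregation and the same metric with an explicit p95 aggregation (component and metric names are illustrative):

    ```yaml
    workloadsSelection:
      - name: konakart.transactions_throughput       # default aggregation (avg)
      - name: konakart.transactions_throughput:p95   # explicit p95 aggregation
    ```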

    Go 1

    The Golang runtime 1

    akamas install optimization-pack Go
    akamas install optimization-pack NodeJS
    Install Optimization Packs
    Node JS 18

    WebSphere 8.5

    IBM WebSphere Application Server 8.5

    WebSphere Liberty ND

    IBM WebSphere Liberty ND

    akamas install optimization-pack WebSphere
    Description

    renderParameters

    Array of strings

    should contain strings in the form component.parameter or component.*. The latter means every parameter of the component

    No

    Which configuration parameters should be rendered or applied when running experiments/trials, in addition to the ones in the parameters selection, or in the values if the step is of type baseline or preset

    doNotRenderParameters

    Array of strings

    should contain strings in the form component.parameter or component.*. The latter means every parameter of the component

    No

    Which configuration parameters should not be rendered or applied when doing experiments/trials

    hashtag
    Examples

    The following baseline step specifies that every parameter of the component 'os' should not be rendered while the parameter 'cpu_limit' of the component 'docker' should be rendered:

    The following preset step specifies that the parameter 'cpu_limit' of the component 'docker' should be rendered:

    The following optimize step specifies that every parameter of the component 'os' should not be rendered:

    parametersSelection:
      - name: jvm.jvm_gcType
      - name: jvm.jvm_maxG1NewSizePercent
        domain: [10, 90]
      - name: webserver.aws_ec2_instance_size
        categories: ["large", "x.large", "2x.large"]
    akamas install optimization-pack Java-OpenJDK
    name: JVM_1
    description: The first jvm of the system
    componentType: java-openjdk-11
    properties:
      hostname: ycsb1.dev.akamas.io
      username: ubuntu
    name: os_1
    description: The operating system of team 1
    componentType: Ubuntu-20.04
    properties:
      hostname: ycsb1.dev.akamas.io
      username: ubuntu
    workloadsSelection:
      - name: konakart.transactions_throughput
      - name: konakart.transactions_throughput:p95
    name: "mybaseline"
    type: "baseline"
    values: # every parameter in 'values' is rendered by default
      jvm.jvm_compilation_threads: 10
      jvm.jvm_gcType: -XX:+UseG1GC
    doNotRenderParameters: ["os.*"] # every parameter of the component 'os' will not be rendered
    renderParameters: ["docker.cpu_limit"] # the parameter 'cpu_limit' of the component 'docker' will be rendered
    name: "mypreset"
    type: "preset"
    values:
      jvm.jvm_compilation_threads: 10
      jvm.jvm_gcType: -XX:+UseG1GC
    renderParameters: ["docker.cpu_limit"] # the parameter 'cpu_limit' of the component 'docker' will be rendered
    name: "myoptimize"
    type: "optimize"
    numberOfExperiments: 200
    doNotRenderParameters: ["os.*"] # every parameter of the component 'os' will not be rendered
    Resource settings (e.g. amount of memory allocated to a Spark job)
  • Algorithms settings (e.g. learning rate of a neural network)

  • Architectural properties (e.g. how many caching layers in an enterprise application)

  • Type of resources (e.g. AWS EC2 instance or EBS volume type)

  • Any other thing (e.g. amount of sugar in your cookies)

  • The following table describes the parameter types:

    Parameter Type
    Domain
    Akamas normalized domain

    REAL

    real values

    Akamas normalizes the values

    [0.0, 10.0] → [0.0, 1.0]

    INTEGER

    integer values

    Akamas converts the integer into real and then normalizes the values

    [0, 3] → [0.0, 3.0] → [0.0, 1.0]

    hashtag
    Construct

    A parameter is described by the following properties:

    • a name that uniquely identifies the parameter

    • a description that clarifies the semantics of the parameter

    • a unit that defines the unit of measurement used by the parameter

    Although users can create parameters with any name, we suggest using the naming convention context_parameter where

    • context refers to the technology or more general environment in which that metric is defined (e.g. elasticsearch, jvm, mysql, spark)

    • parameter is the parameter name in the original context (e.g. gcType, numberOfExecutors)

    This makes it possible to identify parameters more easily and avoid any potential name clash.

    The construct to be used to define a parameter is described on the Parameter template page.

    hashtag
    User Interface

    Parameters are displayed in the Akamas UI when drilling down to each system component.

    For each optimization study, the optimization scope is the set of parameters that Akamas can change to achieve the defined optimization goal.

    goal
    constraints

    System

    System template

    Component

    Default value
    Description

    type

    string

    bootstrap

    yes

    The type of the step, in this case, bootstrap

    name

    string

    yes

    where the from field should have the following structure:

    with:

    • study contains the name or ID of the study from which to import experiments

    • experiments contains the numbers of the experiments from which to import

    hashtag
    Examples

    The following is an example of a bootstrap step that imports four experiments from two studies:
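
    For instance (study names are illustrative):

    ```yaml
    name: "my_bootstrap"
    type: "bootstrap"
    from:
      - study: "study_bootstrap_1"   # name or ID of the study from which to import
        experiments: [1, 2, 4]       # the numbers of the experiments to import
      - study: "study_bootstrap_2"
        experiments: [1]
    ```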

    You can also import all the experiments of a study by omitting the experiments field:
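
    A sketch of this variant (study name is illustrative):

    ```yaml
    name: "my_bootstrap"
    type: "bootstrap"
    from:
      - study: "study_bootstrap_1"   # no experiments field: import all experiments
    ```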

    k8s_pod_cpu_used

    millicores

    The CPUs used by all containers of the pod

    k8s_pod_memory_used

    bytes

    The total amount of memory used as sum of all containers in a pod

    k8s_pod_cpu_request

    millicores

    The CPUs requested for the pod as sum of all container cpu requests

    hashtag
    Parameters

    There are no parameters for the Kubernetes Pod component type.

    type

    string

    {trim}

    TRUE

    the type of windowing strategy

    trim

    array of strings

    The length of the array should be two.

    Valid values should have the form of a whole number followed by either "s", "m", or "h"

    TRUE

    If a windowing policy is not specified, the default windowing, corresponding to trim [0s, 0s], is applied.

    hashtag
    Example

    The following fragment shows a windowing strategy of type "trim" where the time window is specified to start 10s after the beginning of the trial and to end immediately before the end of the trial:
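
    Such a strategy can be expressed as:

    ```yaml
    windowing:
      type: "trim"
      trim: [10s, 0s]   # skip the first 10 seconds of the trial, trim nothing from the end
    ```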

    the amount of time spent in garbage collection

  • the cost of a cloud service

  • Metrics are used to both specify the optimization goal and constraints (e.g. minimize the heap size while keeping response time < 1000 and error rate <= 10% of a baseline value), and to assess the behavior of the system with respect to each specific configuration applied.

    hashtag
    Construct

    A metric is described by the following properties:

    • a name that uniquely identifies the metric

    • a description that clarifies the semantics of the metric

    • a unit that defines the unit of measurement used by the metric

    The construct to be used to define a metric is described on the Metric template page.

    hashtag
    User Interface

    Metrics are displayed in the Akamas UI when drilling down to each system component.

    and are represented in metric charts for each specific optimization study

    Please notice that in order for a metric to be displayed in the Akamas UI, it has to be collected from a Telemetry Provider by means of a specific Telemetry Instance defined for each specific target system.

    CPU allocation
  • Number of replicas

  • Similarly, the bundled metrics provide visibility on the following aspects of tuned applications:

    • Memory utilization

    • CPU utilization

    hashtag
    Component Types

    The component types provided in this optimization pack allow modeling the entities found in a Kubernetes-based application, optimizing their parameters, and monitoring the key performance metrics.

    Component Type
    Description

    Kubernetes Container

    Kubernetes Pod

    Kubernetes Workload

    hashtag
    Installing

    Here’s the command to install the Kubernetes optimization pack using the Akamas CLI:

    k8s_cluster_cpu

    millicores

    The CPUs in the cluster

    k8s_cluster_cpu_available

    millicores

    The CPUs available for additional pods in the cluster

    k8s_cluster_cpu_util

    percent

    The percentage of used CPUs in the cluster

    hashtag
    Parameters

    There are no parameters for the Kubernetes Cluster component type.

    Component Type
    Description

    Web Application

    hashtag
    Installing

    Here’s the command to install the Web Application optimization pack using the Akamas CLI:

    akamas install optimization-pack Web-Application

    KPIs

    The kpis field in a study specifies which metrics should be considered as KPIs for an offline optimization study.

    If this selection is not specified, all metrics mentioned in the goal and constraints of the optimization study are considered.

    A KPI is defined as follows:

    Field
    Type
    Value restriction
    Is required
    Default value
    Description

    Notice that the textual badge displayed in the Akamas UI uses the "Best name".

    hashtag
    Example

    The following fragment is an example of a list of KPIs:
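
    A sketch of a KPI list, using the formula syntax `<component_name>.<metric_name>` (component and metric names are illustrative):

    ```yaml
    kpis:
      - name: "Response time"
        formula: renaissance.response_time   # <component_name>.<metric_name>
        direction: minimize
      - name: "Memory used"
        formula: renaissance.mem_used
        direction: minimize
    ```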

    Kubernetes Workload

    This page describes the Optimization Pack for the Kubernetes Workload component type.

    hashtag
    Metrics

    Name
    Unit
    Description

    hashtag
    Parameters

    Name
    Unit
    Type
    Default Value
    Domain
    Restart
    Description

    OpenJ9 optimization pack

    The OpenJ9 optimization pack enables the optimization of Java applications based on the Eclipse OpenJ9 VM, formerly known as IBM J9. Through this optimization pack, Akamas is able to tackle the problem of performance of JVM-based applications from both the point of view of cost savings and quality of service.

    To achieve these goals the optimization pack provides parameters that focus on the following areas:

    • Garbage collection

    • Heap

    • JIT

    Similarly, the bundled metrics provide visibility on the following aspects of tuned applications:

    • Heap and memory utilization

    • Garbage Collection

    • Execution threads

    hashtag
    Component Types

    The optimization pack supports the most used versions of JVM.

    Component Type
    Description

    hashtag
    Installing

    Here’s the command to install the Eclipse OpenJ9 optimization pack using the Akamas CLI:

    For more information on the process of installing or upgrading an optimization pack refer to Install Optimization Packs.

    DotNet optimization pack

    The DotNet optimization pack provides support for optimizing MS .Net applications.

    hashtag
    Component Types

    The following component types are supported for DotNet.

    Component Type
    Description

    hashtag
    Installing

    Here’s the command to install the DotNet optimization pack using the Akamas CLI:

    Workflows template

    Workflows are defined using a YAML manifest with the following structure:

    with the following properties:

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    Optimize step

    An optimize step generates optimized configurations according to the defined optimization strategy. During this step, Akamas AI is used to generate such optimized configurations.

    The optimize step has the following structure:

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    Preset step

    A preset step performs a single experiment with a specific configuration. The purpose of this step is to help you quickly understand how good a particular configuration is.

    A preset step offers two options when selecting the configuration of the experiment to be executed:

    • Use a configuration taken from an experiment of a study (can be the same study)

    • Use a custom configuration

    Baseline step

    A baseline step performs an experiment (a baseline experiment) and marks it as the initial experiment of a study. The purpose of the step is to build a reference configuration that Akamas can use to measure the effectiveness of an optimization conducted on a system.

    A baseline step offers three options when it comes to selecting the configuration of the baseline experiment:

    • Use a configuration made with the default values of the parameters taken from the system of the study

    Parameter template

    Parameters are defined using a YAML manifest with the following structure:

    with the following properties:

    Field
    Type
    Value restrictions
    Is required
    Default Value
    Description

    Linux optimization pack

    The Linux optimization pack helps you optimize Linux-based systems. The optimization pack provides component types for various Linux distributions, thus enabling performance improvements on a plethora of different configurations.

    Through this optimization pack, Akamas is able to tackle the problem of performance of Linux-based systems from the point of view of both cost savings and quality and level of service: the included component types bring in parameters that act on the memory footprint of systems, on their ability to sustain higher levels of traffic, on their capacity to leverage all the available resources, and on their potential for lower-latency transactions.

    Each component type provides parameters that cover four main areas of tuning:

    • CPU task scheduling (for example, whether to auto-group and schedule similar tasks together)

    Workflow Operators

    This section documents the out-of-the-box workflow operators.

    Operator
    Description

    Telemetry Instance template

    Telemetry instances are defined using a YAML manifest with the following structure:

    with the following properties for the global section

    Name
    Type
    Description
    Mandatory
    study: "study_bootstrap_1"
    experiments: [1,2,3]
    name: "my_bootstrap"  # name of the step
    type: "bootstrap"     # type of the step (bootstrap)
    from:
      - study: "study_bootstap_1"  # name or ID of the study from which to import
        experiments: [1, 2, 4]     # the numbers of the experiments to import
      - study: "study_bootstrap_2"
        experiments: [1]
    name: "my_bootstrap"   # name of the step
    type: "bootstrap"      # type of the step (bootstrap)
    from:
      - study: "study_bootstrap_1" # name or ID of the study from which to import
    windowing: # the temporal window in which to compute the score of a trial
      type: "trim" # type of windowing is trim
      trim: [10s, 0s]
    akamas install optimization-pack Kubernetes

    The name of the step

    runOnFailure

    boolean

    true false

    no

    false

    The execution policy of the step:

    • false prevents the step from running in case the previous step failed

    • true allows the step to run even if the previous step failed

    from

    array of objects

    Each object should have the structure described below

    yes

    The experiments to import in the current study

    If the experiments field is not set, this step imports every experiment of the study

    k8s_pod_cpu_limit

    millicores

    The CPUs allowed for the pod as sum of all container cpu limits

    k8s_pod_memory_request

    bytes

    The memory requested for the pod as sum of all container memory requests

    k8s_pod_memory_limit

    bytes

    The memory limit for the pod as sum of all container memory limits

    k8s_pod_restarts

    events

    The number of container restarts in a pod

    How to trim the temporal interval of a trial to get the window. ["0s", "10m"] means trim 0 seconds from the start of the interval, 10 minutes from the end. ["0s", "1h"] means trim 0 seconds from the start, 1 hour from the end

    task

    string

    The name of a task of the workflow associated with the study

    FALSE

    If the field is specified, the trim offset calculation for the window will be applied from the start time of the assigned task. Otherwise, it will be calculated from the start time of the trial.

    k8s_cluster_cpu_request

    millicores

    The total CPUs requested in the cluster

    k8s_cluster_memory

    bytes

    The overall memory in the cluster

    k8s_cluster_memory_available

    bytes

    The amount of memory available for additional pods in the cluster

    k8s_cluster_memory_util

    percent

    The percentage of used memory in the cluster

    k8s_cluster_memory_request

    bytes

    The total memory requested in the cluster

    k8s_cluster_nodes

    nodes

    The number of nodes in the cluster

    Component template
    Metric
    Metric template
    Parameter
    Parameter template
    Component Type
    Component Type template
    Workflow
    Workflow template
    Telemetry Provider
    Telemetry Provider template
    Telemetry Instance
    Telemetry Instance template
    Workflow Operator
    Workflow Operator template
    Optimization Study
    Study template

    Kubernetes Namespace

    Kubernetes Namespace

    Kubernetes Cluster

    Kubernetes Cluster

    Kubernetes Container
    Kubernetes Pod
    Kubernetes Workload
    Web Application

    ORDINAL

    integer values

    Akamas converts the category into real and then normalizes the values

    ['a', 'b', 'c'] → [0, 2] → [0.0, 2.0] → [0.0, 1.0]

    CATEGORICAL

    categorical values

    Akamas converts each parameter value into a new parameter that may be either 1.0 (active) or 0.0 (inactive); only one of these new parameters can be active during each experiment:

    ['a', 'b', 'c'] → [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]

    millicores

    The total amount of CPUs used by the entire workload

    k8s_workload_memory_used

    bytes

    The total amount of memory used by the entire workload

    k8s_workload_cpu_request

    millicores

    The total amount of CPUs requests for the workload

    k8s_workload_cpu_limit

    millicores

    The total amount of CPUs limits for the entire workload

    k8s_workload_memory_request

    bytes

    The total amount of memory requests for the workload

    k8s_workload_memory_limit

    bytes

    The total amount of memory limits for the entire workload

    0 → 1024

    yes

    Number of desired pods in the deployment

    k8s_workload_desired_pods

    pods

    Number of desired pods per workload

    k8s_workload_running_pods

    pods

    The number of running pods per workload

    k8s_workload_ready_pods

    pods

    The number of ready pods per workload

    k8s_workload_replicas

    integer

    pods

    k8s_workload_cpu_used

    1

    The metric name associated with a component

    direction

    string

    minimize, maximize

    yes

    The direction corresponding to the metric

    aggregation

    string

    avg, min, max, sum, p90, p95, p99

    no

    avg

    A valid metric aggregation

    name

    string

    should match a component metric

    no

    <metric_name>

    Label that will be used in the UI

    formula

    string

    Must be defined as <component_name>.<metric_name>

    yes

    IBM J9 VM 6

    Eclipse OpenJ9 (formerly known as IBM J9) Virtual Machine version 6

    IBM J9 VM 8

    Eclipse OpenJ9 (formerly known as IBM J9) Virtual Machine version 8

    Eclipse Open J9 11

    Eclipse OpenJ9 (formerly known as IBM J9) version 11

    Install Optimization Packs

    DotNet Core 3.1

    MS .NET 3.1

    akamas install optimization-pack Docker

    string

    -

    yes

    -

    The name of the task.

    operator

    string

    -

    yes

    -

    The operator the task implements: the chosen operator affects available arguments.

    critical

    boolean

    -

    no

    true

    When set to true, task failure will determine workflow failure.

    alwaysRun

    boolean

    -

    no

    false

    When set to true, task will be executed regardless of workflow failure.

    collectMetricsOnFailure

    boolean

    -

    no

    false

    When set to true, failure of the task will not prevent metrics collection.

    arguments

    list

    Determined by operator choice

    yes

    -

    Arguments list required by operators to run.

    The full list of Operators and related options is provided on the Workflow Operators pages.

    hashtag
    Example

    A workflow for the Java-based renaissance benchmark application:

    name

    kpis:
      - name: "Response time"
        formula: renaissance.response_time
        direction: minimize
      - name: "CPU used"
        formula: renaissance.cpu_used
        direction: minimize
      - name: "Memory used"
        formula: renaissance.mem_used
        direction: minimize
    akamas install optimization-pack OpenJ9
    name: "insert_workflow_name_here"
    tasks:
    # these are the tasks that will be executed sequentially to complete a trial (e.g. configure the system under test with the parameters optimized by Akamas)
    - name: "insert_here_name_of_task"
      # an operator specifies which type of task should be used
      operator: "insert_here_which_operator_to_use_for_the_task"
      # each operator accepts different arguments necessary to specify how it should behave
      arguments:
    	...
    name: renaissance-optimize
    
    tasks:
    - name: Configure Benchmark
      operator: FileConfigurator
      arguments:
        source:
          hostname: benchmark
          username: akamas
          password: akamas
          path: launch_benchmark.sh.templ
        target:
          hostname: benchmark
          username: akamas
          password: akamas
          path: launch_benchmark.sh
    
    - name: Launch Benchmark
      operator: Executor
      arguments:
        command: "bash launch_benchmark.sh"
        host:
          hostname: benchmark
          username: akamas
          password: akamas
    
    - name: Parse Output
      operator: Executor
      arguments:
        command: "bash parse_output.sh"
        host:
          hostname: benchmark
          username: akamas
          password: akamas

    type

    string

    optimize

    yes

    The type of the step, in this case, optimize

    name

    string

    yes

    The name of the step

    runOnFailure

    boolean

    true false

    no

    false

    The execution policy of the step:

    • false prevents the step from running in case the previous step failed

    • true allows the step to run even if the previous step failed

    numberOfExperiments

    integer

    numberOfExperiments > 0 and

    numberOfExperiments >= numberOfInitExperiments

    yes

    The number of experiments to execute - see below

    numberOfTrials

    integer

    numberOfTrials > 0

    no

    1

    The number of trials to execute for each experiment

    numberOfInitExperiments

    integer

    numberOfInitExperiments < numberOfExperiments

    no

    10

    The number of initialization experiments to execute - see below.

    maxFailedExperiments

    integer

    maxFailedExperiments > 1

    no

    30

    The number of experiment failures (as either workflow errors or constraint violations) to accept before the step is marked as failed

    optimizer

    string

    AKAMAS SOBOL RANDOM

    no

    AKAMAS

    The type of optimizer to use to generate the configuration of the experiments - see below

    doNotRenderParameters

    string

    no

    Parameters not to be rendered - see Parameter rendering

    renderParameters

    string

    no

    Parameters to be rendered - see Parameter rendering

    hashtag
    Optimizer

    The optimizer field allows selecting the desired optimizer:

    • AKAMAS identifies the standard AI optimizer used by Akamas

    • SOBOL identifies an optimizer that generates configurations using Sobol sequences

    • RANDOM identifies an optimization that generates configurations using random numbers

    Notice that SOBOL and RANDOM optimizers do not perform initialization experiments, hence the field numberOfInitExperiments is ignored.

    Refer to the page Optimizer Options for more configuration options for the optimizer

    hashtag
    Failures

    The optimize step is fault-tolerant and tries to relaunch experiments on failure. Nevertheless, the step limits the number of failed experiments: if too many experiments fail, then the entire step fails too. By default, at most 30 experiments can fail while Akamas is optimizing systems. An experiment is considered failed when it fails to run (i.e., there is an error in the workflow) or violates some constraint.

    hashtag
    Initializations

    The optimize step launches some initialization experiments (by default 10) that do not apply the AI optimizer and are used to find good configurations.

    Initialization experiments take into account bootstrapped experiments, experiments executed in preset steps, and baseline experiments.

    hashtag
    Example

    The following snippet shows an optimization step that runs 50 experiments using the SOBOL optimizer:
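
    A sketch of such a step, using the fields described above:

    ```yaml
    name: "my_optimize"
    type: "optimize"
    numberOfExperiments: 50
    optimizer: SOBOL   # numberOfInitExperiments is ignored with SOBOL
    ```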

    The preset step has the following structure:
    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    type

    string

    preset

    yes

    where the from field should have the following structure:

    with

    • study contains the name or ID of the study from which to take the configuration. If this is omitted, the experiment from which to take the configuration is looked up in the same study as the step

    • experiments contains the number of the experiment from which to take the configuration

    hashtag
    Examples

    hashtag
    Custom configuration

    You can provide a custom configuration by setting values:
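
    A sketch of a preset step with custom values (parameter names and values are illustrative):

    ```yaml
    name: "mypreset"
    type: "preset"
    values:
      jvm.jvm_compilation_threads: 10
      jvm.jvm_gcType: -XX:+UseG1GC
    ```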

    hashtag
    Configuration from another study

    You can select a configuration taken from another study by setting from:
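
    A sketch of this variant (study name and experiment number are illustrative):

    ```yaml
    name: "mypreset"
    type: "preset"
    from:
      - study: "another_study"   # name or ID of the source study
        experiments: [3]         # the single experiment to take the configuration from
    ```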

    hashtag
    Configuration from the same study

    You can select a configuration taken from the same study by setting from but by omitting the study field:
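
    A sketch of this variant (the experiment number is illustrative):

    ```yaml
    name: "mypreset"
    type: "preset"
    from:
      - experiments: [3]   # no study field: the experiment is taken from this study
    ```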

    Notice: the from and experiments fields are defined as a list, but can only contain one element.

    Use a configuration taken from an experiment of another study
  • Use a custom configuration

  • The baseline step has the following structure:

    Field
    Type
    Value restriction
    Is required
    Default value
    Description

    type

    string

    baseline

    yes

    where the from field should have the following structure:

    with

    • study contains the name or ID of the study from which to take the configuration

    • experiments contains the number of the experiment from which to take the configuration

    hashtag
    Examples

    hashtag
    Baseline configuration with default values

    Default values for the baseline configuration only require setting the name and type fields:
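
    A minimal sketch of this case:

    ```yaml
    name: "mybaseline"
    type: "baseline"   # no values or from: parameter default values are used
    ```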

    hashtag
    Baseline configuration from another study

    The configuration taken from another study to be used as a baseline only requires setting the from field:
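
    A sketch of this variant (study name and experiment number are illustrative):

    ```yaml
    name: "mybaseline"
    type: "baseline"
    from:
      - study: "another_study"   # name or ID of the source study
        experiments: [1]
    ```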

    Notice: the from and experiments fields are defined as an array, but can only contain one element.

    hashtag
    Custom baseline configuration

    The custom configuration for the baseline only requires setting the values field:
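
    A sketch of a baseline step with custom values (parameter names and values are illustrative):

    ```yaml
    name: "mybaseline"
    type: "baseline"
    values:
      jvm.jvm_compilation_threads: 10
      jvm.jvm_gcType: -XX:+UseG1GC
    ```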

    string

    It should contain only lowercase or uppercase letters, numbers, or underscores, must start with a letter, and must not contain spaces.

    TRUE

    The name of the parameter

    description

    string

    TRUE

    A description characterizing the parameter

    unit

    string

    A supported unit or a custom unit (see supported units of measure)

    FALSE

    empty unit

    The unit of measure of the parameter

    restart

    boolean

    FALSE

    FALSE

    Whether applying this parameter to change the configuration of a system requires the system to be restarted.

    Notice that parameter definitions are shared across all the workspaces on the same Akamas installation, and require an account with administrative privileges to manage them.

    Example

    The following represents a set of parameters for a JVM component

    The following represents a set of CPU-related parameters for the Linux operating system

    name

  • Memory (for example, the limit on memory usage beyond which pages are swapped to disk)

  • Network (for example, the size of the buffers used to write/read network packets)

  • Storage (for example, the type of storage scheduler)

    Component Types

    Component Type
    Description

    Amazon Linux AMI

    Amazon Linux 2 AMI

    Amazon Linux 2022 AMI

    Installing

    Here’s the command to install the Linux optimization pack using the Akamas CLI:

    For more information on the process of installing or upgrading an optimization pack, refer to Install Optimization Packs.

    Configures Linux kernel parameters using different strategies

    Executes a command on a target Windows machine using WinRM

    Interpolates configuration parameters into files on remote Windows machines

    Pauses the execution of the workflow for a certain time

    Executes custom queries on Oracle database instances

    Configures Oracle database instances

    Executes a Spark application using spark-submit on a machine using SSH

    Executes a Spark application using spark-submit locally

    Executes a Spark application using the Livy web service

    Triggers the execution of performance tests using NeoLoad Web

    Runs a performance test with LoadRunner Professional

    Runs a performance test with LoadRunner Enterprise


    Executor

    Executes a shell command on a machine using SSH

    FileConfigurator

    Interpolates configuration parameters values into a file with templates and saves this file on a machine using SSH

    The name of the Telemetry Provider

    Yes

    config

    object

    Provider-specific configuration in a key-value format (see specific provider documentation for details)

    Yes

    name

    string

    Custom telemetry instance name

    No

    metrics

    object

    This section is used to specify the metrics to extract. This section is specific for each Telemetry Provider (see specific provider documentation for details)

    No

    and the metrics section

    Name
    Type
    Description
    Mandatory

    name

    string

    Name of the metric in Akamas.

    This metric must exist in at least one of the component types referred by the System associated with the Telemetry Provider Instance

    Yes

    datasourceName

    string

    provider

    string

    Optimization Packs

    This section documents Akamas out-of-the-box optimization packs.

    Optimization Pack
    Support for applications

    based on Linux operating system

    Web Application

    This page describes the Optimization Pack for Web Applications.

    Metrics

    Performance metrics

    Metric
    Unit
    Description

    Error metrics

    Metric
    Unit
    Description

    Load metrics

    Unit
    Description

    Parameters

    There are no parameters for Web Applications.

    GO 1

    This page describes the Optimization Pack for the component type Go 1.

    Metrics

    Metric
    Unit
    Description

    Parameters

    Parameter
    Type
    Unit
    Default
    Domain
    Restart
    Description

    LoadRunner metrics mapping

    This page describes the mapping between metrics provided by LoadRunner and Akamas metrics for each supported component type.

    Component Type
    Notes

    Web Application

    Component metric
    LoadRunner metric

    Container

    This page describes the Optimization Pack for the component type (Docker) Container.

    Metrics

    Metric
    Unit
    Description

    Parameters

    Parameter
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Stability windowing

    A windowing policy of type stability discards temporal intervals in which a given metric is not stable, and selects, among the remaining intervals, the ones in which another target metric is maximized or minimized. Stability windowing can be sample-based or time-frame based.

    The stability windowing has the following structure:

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    SparkLivy Operator

    The SparkLivy operator uses Livy to run Spark applications on a Spark instance.

    Operator arguments

    Name
    type
    Value restrictions
    name: "my_optimize"    # name of the step
    type: "optimize"       # type of the step (optimize)
    optimizer: "SOBOL"
    numberOfExperiments: 50  # amount of experiments to execute
    numberOfTrials: 2        # amount of trials for each experiment
    study: "study_preset_1"
    experiments: [1]
    name: "my_preset"   # name of the step
    type: "preset"      # type of the step (preset)
    values:
      jvm1.maxHeapSize: 1024 # parameter maxHeapSize of jvm1 is set to 1024
      jvm2.maxHeapSize: 2048 # parameter maxHeapSize of jvm2 is set to 2048
    name: "my_preset"  # name of the step
    type: "preset"     # type of the step (preset)
    from:
      - study: "preset_study_1" # name or ID of the study from which to take the configuration
        experiments: [1] # the number of the experiment from which to take the configuration
    name: "my_preset"    # name of the step
    type: "preset"       # type of the step (preset)
    from:
      - experiments: [1] # the step will take the configuration of the experiment #1 of the same study of the step
    study: "study_from_which_to_take_the_baseline"
    experiments: [1]
    name: "my_baseline"    # name of the step
    type: "baseline"       # type of the step (baseline)
    name: "my_baseline"   # name of the step
    type: "baseline"      # type of the step (baseline)
    from:
      - study: "study_from_which_take_the_baseline" # name or id of the study from which to take the configuration
        experiments: [1]  # the number of the experiment from which to take the configuration
    name: "my_baseline"         # name of the step
    type: "baseline"            # type of the step (baseline)
    values:
      jvm1.maxHeapSize: 1024    # parameter maxHeapSize of jvm1 is set to 1024
      jvm2.maxHeapSize: 2048    # parameter maxHeapSize of jvm2 is set to 2048
    parameters:
      - name: jvm_heap_size
        description: the size of the heap of the jvm
        unit: megabytes
        restart: false
      - name: jvm_survival_ratio
        description:  the ratio of the two survivor spaces in the JVM GC
    parameters:
      - name: jvm_maxHeapSize
        description: Maximum heap size
        unit: megabytes
        restart: true
      - name: jvm_newRatio
        description: Ratio of old/new generation sizes
        restart: true
      - name: jvm_maxTenuringThreshold
        description: Maximum value for tenuring threshold
        restart: true
      - name: jvm_survivorRatio
        description: Ratio of eden/survivor space size
        restart: true
      - name: jvm_concurrentGCThreads
        description: Number of threads concurrent garbage collection will use
        unit: threads
        restart: true
      - name: jvm_gcType
        description: Type of the garbage collection algorithm
        restart: true
    parameters:
      # CPU Related
      - name: os_cpu_sched_min_granularity_ns
        description: Target minimum scheduler period in which a single task will run
        unit: nanoseconds
        restart: false
      - name: os_cpu_sched_wakeup_granularity_ns
        unit: nanoseconds
        description: desc
        restart: false
      - name: os_cpu_sched_migration_cost_ns
        unit: nanoseconds
        description: desc
        restart: false
      - name: os_cpu_sched_child_runs_first
        description: desc
        restart: false
      - name: os_cpu_sched_latency_ns
        unit: nanoseconds
        description: desc
        restart: false
      - name: os_cpu_sched_autogroup_enabled
        description: desc
        restart: false
      - name: os_cpu_sched_nr_migrate
        description: desc
        restart: false
    akamas install optimization-pack Linux
    provider: Provider Name
    name: My Telemetry
    config:
      providerSpecificConfig1: "<value>"
      providerSpecificConfig2: 123
    metrics:
    - name: metric_name
      datasourceName: datasource_metric_name
      defaultValue: 1.23
      labels:
        - label1
        - label2
      staticLabels:
        staticLabel1: staticValue1
        staticLabel2: staticValue2

    The type of the step, in this case, preset

    name

    string

    yes

    The name of the step

    runOnFailure

    boolean

    true false

    no

    false

    The execution policy of the step:

    • false prevents the step from running in case the previous step failed

    • true allows the step to run even if the previous step failed

    from

    array of objects

    Each object should have the structure described below

    no

    The study and the experiment from which to take the configuration of the experiment

    The from and experiments fields are defined as arrays, but each can contain only one element

    This can be set only if values is not set

    values

    object

    The keys should match existing parameters

    no

    The configuration with which to execute the experiment

    This can be set only if from is not set

    doNotRenderParameters

    string

    this cannot be used when using a from option since no experiment is actually executed

    no

    Parameters not to be rendered - see Parameter rendering

    renderParameters

    string

    this cannot be used when using a from option since no experiment is actually executed

    no

    Parameters to be rendered - see Parameter rendering
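    As an illustrative sketch (the parameter names are hypothetical, not taken from this page), a preset step that applies a custom configuration and renders only one of its parameters could look as follows:

```yaml
name: "my_preset"
type: "preset"
values:
  jvm1.maxHeapSize: 1024   # parameter maxHeapSize of jvm1 is set to 1024
  jvm2.maxHeapSize: 2048   # parameter maxHeapSize of jvm2 is set to 2048
renderParameters: "jvm1.maxHeapSize"   # only this parameter is rendered into configuration files
```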

    The type of the step, in this case, baseline

    name

    string

    yes

    The name of the step

    runOnFailure

    boolean

    true false

    no

    false

    The execution policy of the step:

    • false prevents the step from running in case the previous step failed

    • true allows the step to run even if the previous step failed

    from

    array of objects

    Each object should have the structure described below

    no

    The study and the experiment from which to take the configuration of the baseline experiment

    The from and experiments fields are defined as arrays, but each can contain only one element

    This can be set only if values is not set

    values

    object

    The keys should match existing parameters

    no

    The configuration with which to execute the baseline experiment

    This can be set only if from is not set

    doNotRenderParameters

    string

    this cannot be used when using a from option since no experiment is actually executed

    no

    Parameters not to be rendered - see Parameter rendering

    renderParameters

    string

    this cannot be used when using a from option since no experiment is actually executed

    no

    Parameters to be rendered - see Parameter rendering

    Name of the metric (or extraction query) in the data source. The value of this parameter is specific to the data source.

    Yes

    defaultValue

    double

    Default value that, if specified, is used to create metrics in time-intervals where no other valid datapoint is available.

    No

    labels

    List of strings

    List of labels. For the specific usage of this parameter, see the documentation of the specific Telemetry Provider

    No

    staticLabels

    List of key-value pair

    List of key-value pairs that are interpreted as pairs of label name and value. These static labels are copied directly into each sample of the specific metric and sent to the Metric Service

    No

    aggregation

    String

    see Dynatrace metric aggregations

    No

    extras

    Object

    Only the parameter mergeEntities can be defined, set to either true or false

    No
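    As a sketch (the config keys and metric names below are placeholders, not taken from this page), a Dynatrace telemetry instance using the aggregation and extras fields might look as follows:

```yaml
provider: Dynatrace
name: My Dynatrace Telemetry
config:
  url: "<tenant-url>"       # placeholder: your Dynatrace tenant URL
  token: "<api-token>"      # placeholder: an API token with metrics read scope
metrics:
  - name: cpu_util
    datasourceName: "builtin:host.cpu.usage"   # hypothetical Dynatrace metric key
    aggregation: AVG          # one of the supported Dynatrace metric aggregations
    extras:
      mergeEntities: true     # merge samples coming from multiple monitored entities
```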

    transactions_response_time_max

    milliseconds

    The maximum recorded transaction response time

    transactions_response_time_min

    milliseconds

    The minimum recorded transaction response time

    pages_throughput

    pages/s

    The number of pages requested per second

    pages_response_time

    milliseconds

    The average page response time

    pages_response_time_max

    milliseconds

    The maximum recorded page response time

    pages_response_time_min

    milliseconds

    The minimum recorded page response time

    requests_throughput

    requests/s

    The number of requests performed per second

    requests_response_time

    milliseconds

    The average request response time

    requests_response_time_max

    milliseconds

    The maximum recorded request response time

    requests_response_time_min

    milliseconds

    The minimum recorded request response time

    pages_error_rate

    percent

    The percentage of pages flagged as error

    pages_error_throughput

    pages/s

    The number of pages flagged as error per second

    requests_error_rate

    percent

    The percentage of requests flagged as error

    requests_error_throughput

    requests/s

    The number of requests flagged as error per second

    transactions_throughput

    transactions/s

    The number of transactions executed per second

    transactions_response_time

    milliseconds

    The average transaction response time

    transactions_error_rate

    percent

    The percentage of transactions flagged as error

    transactions_error_throughput

    transactions/s

    The number of transactions flagged as error per second

    users

    users

    The number of users performing requests on the web application


    CentOS-7

    CentOS Linux distribution version 7.x

    CentOS-8

    CentOS Linux distribution version 8.x

    Rhel-7

    Red Hat Enterprise Linux distribution version 7.x

    Rhel-8

    Red Hat Enterprise Linux distribution version 8.x

    Ubuntu-16.04

    Ubuntu Linux distribution by Canonical version 16.04 (LTS)

    Ubuntu-18.04

    Ubuntu Linux distribution by Canonical version 18.04 (LTS)

    Ubuntu-20.04

    Ubuntu Linux distribution by Canonical version 20.04 (LTS)

    AmazonLinux
    AmazonLinux-2
    AmazonLinux-2022
    LinuxConfigurator
    WindowsExecutor
    WindowsFileConfigurator
    Sleep
    OracleExecutor
    OracleConfigurator
    SparkSSHSubmitOperator
    SparkSubmit
    SparkLivy
    NeoLoadWeb
    LoadRunner
    LoadRunner Enterprise

    based on MS .Net technology

    Java Open JDK

    based on OpenJDK and Oracle HotSpot JVM

    Eclipse OpenJ9

    based on Eclipse OpenJ9 VM (formerly known as IBM J9)

    NodeJS

    based on NodeJS

    GO

    based on GO runtime (aka Golang)

    Web Application

    exposed as web applications

    Docker

    based on Docker containers

    Kubernetes

    based on Kubernetes containers

    WebSphere

    based on WebSphere middleware

    Spark

    based on Apache Spark middleware

    PostgreSQL

    based on PostgreSQL database

    Cassandra

    based on Cassandra database

    MySQL

    based on MySQL database

    Oracle Database

    based on Oracle database

    MongoDB

    based on MongoDB database

    Elasticsearch

    based on Elasticsearch database

    AWS

    based on AWS EC2 or Lambda resources

    Linux
    MS DotNet

    transactions_response_time_max

    The max response time of LoadRunner transaction (requests)

    transactions_response_time

    The response time of LoadRunner transaction (requests)

    transactions_response_time_p50

    The 50th percentile (weighted median) of the response time of LoadRunner transaction (requests)

    transactions_response_time_p85

    The 85th percentile of the response time of LoadRunner transaction (requests)

    transactions_response_time_p95

    The 95th percentile of the response time of LoadRunner transaction (requests)

    transactions_response_time_p99

    The 99th percentile of the response time of LoadRunner transaction (requests)

    pages_throughput

    The average throughput of LoadRunner pages (transactions breakdown, first level), per second

    pages_response_time_min

    The min response time of LoadRunner pages (transactions breakdown, first level)

    pages_response_time_max

    The max response time of LoadRunner pages (transactions breakdown, first level)

    pages_response_time

    The response time of LoadRunner pages (transactions breakdown, first level)

    pages_response_time_p50

    The 50th percentile (weighted median) of the response time of LoadRunner transaction breakdown, first level (pages)

    pages_response_time_p85

    The 85th percentile of the response time of LoadRunner transaction breakdown, first level (pages)

    pages_response_time_p95

    The 95th percentile of the response time of LoadRunner transaction breakdown, first level (pages)

    pages_response_time_p99

    The 99th percentile of the response time of LoadRunner transaction breakdown, first level (pages)

    requests_throughput

    The average throughput of LoadRunner requests, per second

    requests_response_time_min

    The min response time of LoadRunner requests

    requests_response_time_max

    The max response time of LoadRunner requests

    requests_response_time

    The response time of LoadRunner requests

    requests_response_time_p50

    The 50th percentile (weighted median) of the response time of LoadRunner requests

    requests_response_time_p85

    The 85th percentile of the response time of LoadRunner transaction breakdown, second level (requests)

    requests_response_time_p95

    The 95th percentile of the response time of LoadRunner transaction breakdown, second level (requests)

    requests_response_time_p99

    The 99th percentile of the response time of LoadRunner transaction breakdown, second level (requests)

    requests_error_throughput

    The number of requests (transactions breakdown, second level) flagged as error by LoadRunner, per second

    users

    The average number of users active in a specific timeframe.

    transactions_throughput

    The average throughput of LoadRunner transaction (requests), per second

    transactions_response_time_min

    The min response time of LoadRunner transaction (requests)

    Web Application

    go_heap_used

    bytes

    The amount of heap memory used

    go_heap_util

    bytes

    The amount of heap memory used

    go_memory_used

    bytes

    The total amount of memory used by Go

    go_gc_time

    percent

    The percentage of wall-clock time the Go runtime spent doing stop-the-world garbage collection activities

    go_gc_duration

    seconds

    The average duration of a stop the world Go garbage collection

    go_gc_count

    collections/s

    The total number of stop the world Go garbage collections that have occurred per second

    go_threads_current

    threads

    The total number of active Go threads

    go_goroutines_current

    goroutines

    The total number of active Goroutines

    go_gcTargetPercentage

    integer

    percent

    100

    0 → 25000

    yes

    Sets the GOGC variable which controls the aggressiveness of the garbage collector

    go_maxProcs

    integer

    threads

    8

    0 → 100

    yes

    Limits the number of operating system threads that can execute user-level code simultaneously

    go_memLimit

    integer

    megabytes

    100

    0 → 1048576

    yes

    Sets a soft memory limit for the runtime. Available since Go 1.19

    cpu_used

    CPUs

    The total amount of CPUs used

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    go_heap_size

    bytes

    The largest size reached by the Go heap memory

    container_mem_util_nocache

    percent

    Percentage of working set memory used with respect to the limit

    container_mem_util

    percent

    Percentage of memory used with respect to the limit. Memory used includes all types of memory, including file system cache

    container_mem_used

    bytes

    The total amount of memory used by the container. Memory used includes all types of memory, including file system cache

    container_mem_limit

    bytes

    Memory limit for the container

    container_mem_working_set

    bytes

    Current working set in bytes

    container_mem_limit_hits

    hits/s

    Number of times memory usage hits memory limit per second

    limits_cpu

    integer

    CPUs

    0.7

    0.1 → 100.0

    Limits on the amount of CPU resources usage in CPU units

    requests_cpu

    integer

    CPUs

    0.7

    0.1 → 100.0

    Amount of CPU resources requests in CPU units

    limits_memory

    integer

    megabytes

    128

    64 → 64000

    Limits on the amount of memory resources usage

    requests_memory

    integer

    megabytes

    128

    64 → 64000

    Amount of memory resources requests

    container_cpu_limit

    CPUs

    The number of CPUs (or fraction of CPUs) allowed for a container

    container_cpu_util

    percent

    The average CPU utilization of the container

    container_cpu_used

    CPUs

    CPUs used by the container per second

    container_cpu_throttle_time

    seconds

    The amount of time the CPU usage of the container has been throttled

    type

    string

    {stability}

    TRUE

    The type of windowing.

    stability->metric

    string

    It should match the name of an existing metric monitored by Akamas

    TRUE

    and for the comparison metric section

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    metric

    string

    It should match the name of an existing metric monitored by Akamas

    TRUE

    Example

    The following fragment is an example of stability windowing (time-frame based):

    Is required
    Default
    Description

    file

    String

    It should be a path to a valid java or python spark application file

    Yes

    Spark application to submit (jar or python file)

    args

    List of Strings, Numbers or Booleans

    Yes

    Parameters applied from Experiments

    The operator fetches the following parameters from the current Experiment to apply them to the System under test.

    Name
    Description
    Restrictions

    spark_driver_memory

    Memory for the driver

    spark_executor_memory

    Memory per executor

    Examples

    Execute with Livy

    SparkSubmit Operator

    The SparkSubmit operator connects to a Spark instance and invokes a local spark-submit to schedule a job.

    Operator arguments

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    OracleConfigurator Operator

    This page introduces the OracleConfigurator operator, a workflow operator that allows configuring the optimized parameters of an Oracle instance.

    Prerequisites

    This section provides the minimum requirements that you should meet in order to use the OracleConfigurator operator.

    Supported versions

    • Oracle 12c or later

    Network requirements

    • The Oracle operator must be able to connect to the Oracle URL or IP address and port (default port: 1521).

    Permissions

    • The user used to log into the database must have ALTER SYSTEM privileges.

    Supported component types

    To configure the tuned parameters, the OracleConfigurator operator must be bound to a component with one of the following types:

    • Oracle Database 12c

    • Oracle Database 18c

    • Oracle Database 19c

    Databases hosted on Amazon RDS are not supported.

    Configuration options

    When you define an OracleConfigurator task in the workflow, you should specify the configuration information that allows the operator to connect to the Oracle instance.

    You can specify configuration information within the config part of the YAML of the instance definition. The operator can also inherit some specific arguments from the properties of a bound component when not specified in the task.

    Operator arguments

    The following table describes all the properties for the definition of a task using the OracleConfigurator operator.

    Field
    Type
    Description
    Default Value
    Restrictions
    Required
    Source

    Examples

    Update database entries

    In the following example, the workflow leverages the OracleConfigurator operator to update the database parameters before triggering the execution of the load test for a component oracledb:
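    The workflow fragment itself is not included in this export; a minimal sketch (task names, the load-test script, and the loadgen component are hypothetical) could look as follows:

```yaml
name: oracle-optimization-workflow
tasks:
  - name: Configure Oracle parameters
    operator: OracleConfigurator
    arguments:
      component: oracledb        # connection details are inherited from the bound component
  - name: Run load test
    operator: Executor
    arguments:
      command: "./run-load-test.sh"   # hypothetical load-test launcher script
      component: loadgen              # hypothetical component providing the SSH connection details
```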

    Optimizer Options

    Optimizer Options are a set of parameters used to fine-tune the study optimization strategy during the optimize step.

    Optimizer options have the following structure:

    Field
    Type
    Value restrictions
    Description

    onlineMode

    Options for both online and offline studies

    Safety Factor

    The safetyFactor field specifies how much the optimizer should stay on the safe side in evaluating a candidate configuration with respect to the goal constraints. A higher safety factor corresponds to a safer configuration, that is a configuration that is less likely to violate goal constraints.

    Acceptable values are all the real values ranging between 0 and 1, with (safetyFactor - 0.5) representing the allowed margin for staying within the defined constraint:

    • 0 means "no safety", as with this value the optimizer totally ignores goal constraint violations;

    • 0.5 means "safe, but no margin", as with this value the optimizer only tries configurations that do not violate the goal constraints, by remaining as close as possible to them;

    • 1 means "super safe", as with this value the optimizer only tries configurations that are very far from goal constraints.

    For live optimization studies, 0.6 is the default value, while for offline optimization studies, the default value is 0.5.
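    For example, a study could raise the safety factor above its default to keep candidate configurations further away from the goal-constraint boundaries:

```yaml
optimizerOptions:
  safetyFactor: 0.7   # larger margin with respect to goal constraints than the default
```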

    Options for offline studies

    For offline optimization studies, the optimizerOptions field can be used to specify whether beta-warping optimization (a more sophisticated optimization that requires a longer time) should be used and for how many experiments (as a percentage):

    where experimentsWithBeta can be:

    • A percentage between 0 and 100%

    • A number less than or equal to numberOfExperiments

    Options for live studies

    For live optimization studies, the optimizerOptions field can be used to specify several important parameters governing the live optimization:

    Notice that while available as independent options, the optimizer options onlineMode, workloadOptimizedForStrategy, and safetyFactor (each described below) work in conjunction according to the following schema:

    Online Mode
    Safety Mode
    Workload strategy

    All these optimizer options can be changed at any time, that is, while the optimization study is running, and become immediately effective. The reference guide provides the specific update commands.

    Online Mode

    The onlineMode field specifies how the Akamas optimizer should operate:

    • RECOMMEND: configurations are recommended to the user by Akamas and are only applied after having been approved (and possibly modified) by the user;

    • FULLY AUTONOMOUS: configurations are immediately applied by Akamas.
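    For example, to have configurations recommended for approval instead of being applied automatically:

```yaml
optimizerOptions:
  onlineMode: RECOMMEND   # configurations must be approved by the user before being applied
```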

    Safety Mode

    The safetyMode field describes how the Akamas optimizer should evaluate the goal constraints on a candidate configuration for that configuration to be considered valid:

    • GLOBAL: the constraints must be satisfied by the configuration under all observed workloads in the configuration history - this is the value taken in case onlineMode is set to RECOMMEND;

    • LOCAL: the constraints are evaluated only under the workload selected according to the workload strategy - this should be used with onlineMode set to FULLY AUTONOMOUS.

    Notice that when setting the safetyMode to LOCAL, the recommended configuration is only expected to be good for the specific workload selected under the defined workload strategy, but it might violate constraints under another workload.

    Workload Strategy

    The workloadOptimizedForStrategy field specifies the workload strategy that drives how Akamas leverages the workload information when looking for the next configuration:

    • MAXIMIN: the optimizer looks for a configuration that maximizes the minimum improvements for all the already observed workloads;

    • LAST: only the most recently observed workload is considered - this works well to find a configuration that is good for the latest workload - it is often used in conjunction with a LOCAL safety mode (see the Safety Mode section above);
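    A sketch combining these options for a live optimization driven by the most recent workload (the exact spelling of the onlineMode value is an assumption; check your Akamas version):

```yaml
optimizerOptions:
  onlineMode: FULLY_AUTONOMOUS        # assumed spelling; see the Online Mode section
  safetyMode: LOCAL                   # evaluate constraints only under the selected workload
  workloadOptimizedForStrategy: LAST  # optimize for the most recently observed workload
```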

    Exploration Factor

    The explorationFactor field specifies how much the optimizer explores the (unknown) optimization space when looking for new configurations. For any parameter, this factor measures the difference between already tried values and the value of a new possible configuration. A higher exploration factor corresponds to a broader exploration of never-tried-before parameter values.

    Acceptable values are all the real values ranging between 0 and 1, plus the special string FULL_EXPLORATION:

    • 0 means "no exploration", as with this value the optimizer chooses a value among the previously seen values for each parameter;

    • 1 means "full exploration, except for categories": with this value, for a non-categorical parameter the optimizer can choose any value in its domain, while for a categorical parameter only values (categories) already seen in previous configurations are chosen;

    • FULL_EXPLORATION means that any value in the domain can be chosen for every parameter, including the categories of categorical parameters.

    In case the desired explorationFactor is 1 but some specific parameters also need to be explored with respect to all their categories, then PRESET steps (refer to the dedicated page) can be used to run an optimization study with these values. For an example of a live optimization study where this approach is adopted, refer to the examples.
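    For example, to broaden the exploration of never-tried-before parameter values while still excluding unseen categories:

```yaml
optimizerOptions:
  explorationFactor: 0.8   # values closer to 1 explore more of the parameter domains
```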

    Example

    The following fragment refers to an optimization study that runs 100 experiments using the SOBOL optimizer and forces 50% of the experiments to use the beta-warping option, enabling a more sophisticated but longer optimization:
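    The fragment itself is missing from this export; following the optimize-step structure shown earlier, it can be sketched as follows (the field placement is an assumption):

```yaml
name: "my_optimize"
type: "optimize"
optimizer: "SOBOL"
numberOfExperiments: 100
optimizerOptions:
  experimentsWithBeta: 50   # 50 experiments (it may also be given as a percentage) use beta-warping
```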

    NeoLoadWeb metrics mapping

    This page describes the mapping between metrics provided by NeoLoadWeb and Akamas metrics for each supported component type.

    Component Type
    Notes

    Web Application

    Component metric
    NeoLoad element category
    NeoLoad monitor path
    NeoLoad stat

    Kubernetes Container

    This page describes the Optimization Pack for the Kubernetes Container component type.

    Metrics


    type: stability
    stability:
      metric: throughput
      labels:
        componentName: DB
      resolution: "30s"
      width: 10
      maxStdDev: .6
      # Comparison metric section
      when:
        metric: response_time
        labels:
          componentName: FE
        is: min
    - name: Run spark application
      operator: SparkLivy
      arguments:
        component: sparkemr
        file: /spark-examples.jar

    The metric whose stability is verified in order to exclude some temporal intervals over the duration of a trial.

    stability->labels

    set of key-value pairs

    FALSE

    A set of key-value pairs representing filtering conditions used to retrieve the value of the metric. These conditions can be used to select the right metric of the right component: you can filter by componentName or by any other custom property defined in the components of the system of the study.

    stability->resolution

    string

    Valid values are in the form 30s 40m 2h

    where s refers to seconds, m to minutes, h to hours

    FALSE

    0s

    The temporal resolution at which Akamas aggregates data points to determine feasible windows.

    stability->width

    integer string

    stability->width > 1 if an integer. If a string, valid values are in the form 30s 40m 2h, as specified for stability->resolution

    TRUE

    The width of the temporal intervals over the duration of a trial which are checked for the stability of the metric. Width can be sample-based (integer) or time-frame-based (string).

    stability->maxStdDev

    double

    TRUE

    The stability condition, i.e., the maximum standard deviation among the values of the data points of the metric tolerated within a temporal interval of size width; otherwise, the temporal interval is discarded.

    The metric whose value is analyzed to include or exclude temporal intervals over the duration of a trial, when another reference metric is stable.

    labels

    set of key-value pairs

    FALSE

    A set of key-value pairs representing filtering conditions used to retrieve the value of the metric. These conditions can be used to select the right metric of the right component: you can filter by componentName or by any other custom property defined in the components of the system of the study.

    is

    string

    {min,max}

    TRUE

    Whether the value of the metric should be at its minimum or maximum in order to include or exclude temporal intervals over the duration of a trial when another reference metric is stable.

    Additional application arguments

    className

    String

    No (required for Java applications)

    The entry point of the Java application.

    name

    String

    No

    Name of the task. When submitted, the IDs of the study, experiment, and trial are appended.

    queue

    String

    No

    The name of the YARN queue to which the Spark application is submitted

    pyFiles

    List of Strings

    Each item of the list should be a path that matches an existing python file

    No

    A list of python scripts to be added to the PYTHONPATH

    proxyUser

    String

    No

    The user to be used to launch Spark applications

    pollingInterval

    Number

    pollingInterval > 0

    No

    10

    The number of seconds to wait before checking if a launched Spark application has finished

    component

    String

    It should match the name of an existing Component of the System under test

    Yes

    The name of the component whose properties can be used as arguments of the operator

    spark_total_executor_cores

    Total cores used by the application

    Spark standalone and Mesos only

    spark_executor_cores

    Cores per executor

    Spark standalone and YARN only

    spark_num_executors

    The number of executors

    YARN only

    Component metric | NeoLoad element category | NeoLoad monitor path | NeoLoad stat
    users | - | Controller/User Load | AVG
    transactions_response_time | transactions | - | AVG_DURATION
    pages_response_time | pages | - | AVG_DURATION
    requests_response_time | requests | - | AVG_DURATION
    transactions_response_time_min | transactions | - | MIN_DURATION
    pages_response_time_min | pages | - | MIN_DURATION
    requests_response_time_min | requests | - | MIN_DURATION
    transactions_response_time_max | transactions | - | MAX_DURATION
    pages_response_time_max | pages | - | MAX_DURATION
    requests_response_time_max | requests | - | MAX_DURATION
    transactions_throughput | transactions | - | THROUGHPUT
    pages_throughput | pages | - | THROUGHPUT
    requests_throughput | requests | - | THROUGHPUT
    transactions_error_rate | transactions | - | ERROR_RATE
    pages_error_rate | pages | - | ERROR_RATE
    requests_error_rate | requests | - | ERROR_RATE
    transactions_error_throughput | transactions | - | ERRORS_PER_SECOND
    pages_error_throughput | pages | - | ERRORS_PER_SECOND
    requests_error_throughput | requests | - | ERRORS_PER_SECOND

    • MOST_VIOLATED: for each workload, the configuration with the most violations is considered.

    • FULL_EXPLORATION means "full exploration, including categories": the optimizer can choose any value among all domain values, including categories not seen in previous configurations.

    Field | Type | Values | Description
    onlineMode | string | RECOMMEND, FULLY_AUTONOMOUS | Whether configuration changes are applied automatically (FULLY_AUTONOMOUS) or must be edited/approved by the user (RECOMMEND)
    safetyMode | string | LOCAL, GLOBAL | Defines how the Akamas optimizer evaluates goal constraints
    safetyFactor | decimal | between 0 and 1 | Impacts the distance from goal constraints allowed for new configurations
    workloadOptimizedForStrategy | string | LAST, MOST_VIOLATED, MAXIMIN | Selects the strategy used to generate future configurations
    explorationFactor | decimal or string | between 0 and 1, or FULL_EXPLORATION | Sets the tendency to explore unexplored configuration values
    experimentsWithBeta | decimal or string | if a string, a percentage between 0% and 100%; if numeric, less than or equal to numberOfExperiments | Percentage/number of experiments computed with beta-warping optimization

    # Half the experiments should be done with beta-warping
    experimentsWithBeta: "50%"
    optimizerOptions:
      onlineMode: RECOMMEND                    # [RECOMMEND|FULLY_AUTONOMOUS]
      safetyMode: GLOBAL                       # [GLOBAL|LOCAL]
      workloadOptimizedForStrategy: MAXIMIN    # [MAXIMIN|LAST|MOST_VIOLATED]
      safetyFactor: 0.55                       # 0 <= safetyFactor <= 1
      explorationFactor: 0.05                  # 0 <= explorationFactor <= 1 or FULL_EXPLORATION
    name: "my_optimize"  # name of the step
    type: "optimize"     # type of the step (optimize)
    optimizer: "SOBOL"
    numberOfExperiments: 100  # amount of experiments to execute
    numberOfTrials: 2         # amount of trials for each experiment
    optimizerOptions:
      experimentsWithBeta: "50%"

    Additional application arguments

    master

    String

    It should be a valid supported Master URL:

    • local

    • local[K]

    • local[K,F]

    Yes

    The master URL for the Spark cluster

    deployMode

    client, cluster

    No

    cluster

    Whether to launch the driver locally (client) or in the cluster (cluster)

    className

    String

    No

    The entry point of the Java application. Required for Java applications.

    name

    String

    No

    Name of the task. When submitted, the IDs of the study, experiment, and trial are appended.

    jars

    List of Strings

    Each item of the list should be a path that matches an existing jar file

    No

    A list of jars to be added in the classpath.

    pyFiles

    List of Strings

    Each item of the list should be a path that matches an existing python file

    No

    A list of python scripts to be added to the PYTHONPATH

    files

    List of Strings

    Each item of the list should be a path that matches an existing file

    No

    A list of files to be added to the context of the spark-submit

    conf

    Object (key-value pairs)

    No

    Mapping containing additional Spark configurations. See Spark documentation.

    envVars

    Object (key-value pairs)

    No

    Env variables when running the spark-submit command

    sparkSubmitExec

    String

    It should be a path that matches an existing executable

    No

    The default for the Spark installation

    The path of the spark-submit executable command

    sparkHome

    String

    It should be a path that matches an existing directory

    No

    The default for the Spark installation

    The path of the SPARK_HOME

    proxyUser

    String

    No

    The user to be used to execute Spark applications

    verbose

    Boolean

    No

    true

    Whether additional debugging output should be displayed

    component

    String

    It should match the name of an existing Component of the System under test

    Yes

    The name of the component whose properties can be used as arguments of the operator

    file

    String

    It should be a path to a valid java or python spark application file

    Yes

    Spark application to submit (jar or python file)

    args

    List of Strings, Numbers or Booleans

    Yes
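Putting the arguments above together, a task using this spark-submit operator might look like the sketch below. The operator name "SparkSubmit" is a placeholder (check the operator reference for the exact name), and the component, class, and file values are made up:

```yaml
# Sketch of a workflow task submitting a Spark application; all names
# and paths are hypothetical.
- name: Run spark application
  operator: SparkSubmit          # placeholder operator name
  arguments:
    component: sparkcluster      # component holding connection properties
    master: yarn
    deployMode: cluster
    className: org.apache.spark.examples.SparkPi
    file: /opt/spark/examples/jars/spark-examples.jar
    args: [100]
```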

    It is possible to define only one of the following sets of connection settings:

    • dsn

    • host, service and optionally port

    task, component

    connection.host

    String

    Address of the database instance

    task, component

    connection.port

    Integer

    listening port of the database instance

    1521

    task, component

    connection.service

    String

    Database service name

    task, component

    connection.sid

    String

    Database SID

    task, component

    connection.user

    String

    User name

    Yes

    task, component

    connection.password

    String

    User password

    Yes

    task, component

    connection.mode

    String

    Connection mode

    sysdba, sysoper

    task, component

    component

    String

    Name of the component to fetch properties and parameters from

    Yes

    task

    connection.dsn

    String

    DSN or EasyConnect string

    CPU
    Name
    Unit
    Description

    container_cpu_used

    millicores

    The CPUs used by the container

    container_cpu_used_max

    millicores

    The maximum CPUs used by the container among all container replicas

    Memory

    Name
    Unit
    Description

    container_memory_used

    bytes

    The total amount of memory used by the container

    container_memory_used_max

    bytes

    The maximum memory used by the container among all container replicas

    Parameters

    Parameter
    Type
    Unit
    Default
    Domain
    Restart
    Description

    cpu_request

    integer

    millicores

    Constraints

    The following table shows a list of constraints that may be required in the definition of the study, depending on the tuned parameters:

    Formula
    Notes

    component_name.cpu_request <= component_name.cpu_limit

    component_name.memory_request <= component_name.memory_limit
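As a sketch of how such constraints could appear in a study definition (the `parameterConstraints` section and the component name `app` are assumptions; check the study reference for the exact field names):

```yaml
# Hypothetical study fragment: keep resource requests below their limits
# for a component named "app" (made-up name).
parameterConstraints:
  - name: cpu_request_below_limit
    formula: app.cpu_request <= app.cpu_limit
  - name: memory_request_below_limit
    formula: app.memory_request <= app.memory_limit
```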

    LoadRunnerEnterprise Operator

    This page introduces the LoadRunnerEnterprise operator, a workflow operator that allows piloting performance tests on a target system by leveraging Micro Focus LoadRunner Enterprise (formerly known as Performance Center).

    Prerequisites

    This section provides the minimum requirements that you should meet to use this operator.

    Supported versions

    • Micro Focus Performance Center 12.60 or 12.63

    • LoadRunner Enterprise 2020 SP3

    Configuration options

    When you define a task that uses the LoadRunnerEnterprise operator you should specify some configuration information to allow the operator to connect to LoadRunner Enterprise and execute a provided test scenario.

    You can specify configuration information within the arguments that are part of a task in the YAML of the definition of a workflow.

    You can avoid specifying each configuration item at the task level by including a component property with the name of a component; in this way, the operator takes any configuration information from the properties of the referenced component.

    Operator arguments

    This table reports the configuration reference for the arguments section

    Field
    Type
    Value Restrictions
    Required
    Default
    Description

    How to retrieve the testId value

    The following screenshot from Performance Center shows the testId value highlighted.

    How to retrieve the testSet value

    The following screenshot from Performance Center shows the testSet name highlighted.

    How to retrieve the testId value from LoadRunner Enterprise

    URL:

    then test management from the main menu

    Examples

    A simple performance test

    Executor Operator

    The Executor Operator can be used to execute a shell command on a target machine using SSH.

    Operator arguments

    Name
    Type
    Values restrictions

    WindowsFileConfigurator Operator

    The WindowsFileConfigurator operator allows configuring systems tuned by Akamas by interpolating configuration parameters into files on remote Windows machines.

    The operator performs the following operations:

    1. It reads an input file from a remote machine containing templates for interpolating the configuration parameters generated by Akamas

    2. It replaces the values of configuration parameters in the input file

    OracleExecutor Operator

    This page introduces the OracleExecutor operator, a workflow operator that allows executing custom queries on an Oracle instance.

    Prerequisites

    This section provides the minimum requirements that you should meet in order to use the Oracle Executor operator.

    LoadRunner Operator

    This page introduces the LoadRunner operator, a workflow operator that allows piloting performance tests on a target system by leveraging Micro Focus LoadRunner. This page assumes you are familiar with the definition of a workflow and its tasks. If this is not the case, then check the Creating automation workflows page.

    Prerequisites

    This section provides the minimum requirements that you should meet to use this operator.

    name: oracledb
    componentType: Oracle Database 18c
    properties:
      connection:
        user: application
        password: password
        host: oradb.dev.akamas.io
        service: XE
    tasks:
    - name: update parameters
      operator: OracleConfigurator
      arguments:
        component: oracledb
    
    - name: run load test
      operator: Executor
      arguments:
        command: sh run_test.sh
        component: generator

  • local[*]

  • local[*,F]

  • spark://HOST:PORT

  • spark://HOST1:PORT1,HOST2:PORT2

  • yarn

  • host, sid and optionally port

  • container_cpu_util

    percent

    The percentage of CPUs used with respect to the limit

    container_cpu_util_max

    percent

    The maximum percentage of CPUs used with respect to the limit among all container replicas

    container_cpu_throttle_time

    percent

    The amount of time the CPU has been throttled

    container_cpu_throttled_millicores

    millicores

    The CPUs throttling per container in millicores

    container_cpu_request

    millicores

    The CPUs requested for the container

    container_cpu_limit

    millicores

    The CPUs limit for the container

    container_memory_util

    percent

    The percentage of memory used with respect to the limit

    container_memory_util_max

    percent

    The maximum percentage of memory used with respect to the limit among all container replicas

    container_memory_working_set

    bytes

    The working set usage in bytes

    container_memory_resident_set

    bytes

    The resident set usage in bytes

    container_memory_cache

    bytes

    The memory cache usage in bytes

    container_memory_request

    bytes

    The memory requested for the container

    container_memory_limit

    bytes

    The memory limit for the container

    You should select your own default value.

    You should select your own domain.

    yes

    Amount of CPU resources requests in CPU units (millicores)

    cpu_limit

    integer

    millicores

    You should select your own default value.

    You should select your own domain.

    yes

    Limits on the amount of CPU resources usage in CPU units (millicores)

    memory_request

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Amount of memory resources requests in megabytes

    memory_limit

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Limits on the amount of memory resources usage in megabytes
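Since this optimization pack asks you to select your own domains and defaults, a study could select these parameters as sketched below (the `parametersSelection` section name, the component name `app`, and the domain bounds are assumptions for illustration):

```yaml
# Hypothetical study fragment: tune the container resources of a
# component named "app" with arbitrary example domains.
parametersSelection:
  - name: app.cpu_request
    domain: [100, 1000]      # millicores
  - name: app.cpu_limit
    domain: [100, 1000]      # millicores
  - name: app.memory_request
    domain: [128, 2048]      # megabytes
  - name: app.memory_limit
    domain: [128, 2048]      # megabytes
```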

    The information required to connect to LoadRunner Enterprise.

    username

    String

    -

    Yes

    -

    The username used to connect to LoadRunner Enterprise

    password

    String

    -

    Yes

    -

    The password for the specified user

    tenantID

    String

    -

    No

    -

    The id of the tenant (Only for LR2020)

    domain

    String

    -

    Yes

    The Domain of your load test projects.

    project

    String

    -

    Yes

    The Project name of your load test projects

    testId

    Number

    -

    Yes

    The id of the load test. See here below how to retrieve this from LoadRunner.

    testSet

    String

    -

    Yes

    -

    The name of the TestSet. See here below how to retrieve this from LoadRunner.

    timeSlot

    String

    A number followed by the time unit.

    Values must be a multiple of 15m and greater than 30m

    Valid units are:

    • m: minutes

    • h: hours

    Yes

    -

    The reserved time slot for the test.

    Examples:

    • 1h

    • 45m

    component

    String

    A valid component name

    No

    -

    The name of the component from which the operator will take its configuration options

    pollingInterval

    Number

    A positive integer number

    No

    30

    The frequency (in seconds) at which Akamas checks the load test status

    verifySSL

    String

    True, False

    No

    True

    Whether to validate the certificate provided by the LRE server when using an HTTPS connection

    address

    String

    A valid URL, e.g. http://loadrunner-enterprise.yourdomain.com

    Yes

    http://<LRE address>/Loadtest/

    -

    Required
    Default
    Description

    command

    String

    yes

    The shell command to be executed on the remote machine

    host

    Object

    See structure documented below

    no

    Host structure and arguments

    Here follows the structure of the host argument:

    with its arguments:

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    hostname

    String

    should be a valid SSH host address

    no, if the Component whose name is defined in component has a property named hostname

    Get operator arguments from component

    The component argument can refer to a component by name and use its properties as the arguments of the operator (see mapping here below). In case the mapped arguments are already provided to the operator, there is no override.

    Component property to operator argument mapping

    Component Property
    Operator Argument

    hostname

    host->hostname

    username

    host->username

    sshPort

    host->sshPort

    Examples

    Let's assume you want to run a script on a remote host and expect the script to be executed successfully within 30 seconds but might fail occasionally.

    Launch a script, wait for its completion, and in case of failures or timeout retry 3 times by waiting 10 seconds between retries:

    Execute a uname command with explicit host information (explicit SSH key)

    Execute a uname command with explicit host information (imported SSH key)

    Execute a uname command with host information taken from a Component

    Start a load-testing script and keep it running in the background during the workflow

    Troubleshooting

    Troubles in running sh scripts remotely

    Due to the stderr configuration, it could happen that invoking a bash script on a server has a different result than running the same script from Akamas Executor Operator. This is quite common with Tomcat startup scripts like $HOME/tomcat/apache-tomcat_1299/bin/startup.sh.

    To avoid this issue simply create a wrapper bash file on the target server adding the set -m instruction before the sh command, eg:

    and then configure the Executor Operator to run the wrapper script like:

    You can run the following to emulate the same behavior of Akamas running scripts over SSH:

    Troubles in keeping a script running in the background

    There are cases in which you would like to keep a script running for the whole duration of the test. Some examples could be:

    • A script applying load to your system for the duration of the workflow

    • The manual start of an application to be tested

    • The setup of a listener that gathers logs, metrics, or data

    In all the instances where you need to keep a task running beyond the task that started it, you must use the detach: true property. Note that a detached executor task returns immediately, so you should run only the background task in detached mode.

    Remember to keep all tasks requiring synchronous (standard) behavior out of the detached task.

    Example:

    Library references

    The library used to execute scripts remotely is Fabric, a high-level Python library designed to execute shell commands remotely over SSH, yielding useful Python objects in return.

    The Fabric library uses a connection object to execute scripts remotely (see connection — Fabric documentation). The option of a dedicated detach mode comes from implementing the more robust disown property from the Invoke Runner underlying the Connection (see runners — Invoke documentation). This is the reason you should rely on detach whenever possible instead of running the background processes straight into the script.

    In the Frequently Asked/Answered Questions (FAQ) — Fabric documentation you may find some further information about typical hanging problems for background processes and their solutions.

  • It writes the file with replaced configuration parameters on a specified path on another remote machine

  • Access on remote machines is performed using WinRM

    Templates for configuration parameters

    The Windows File Configurator allows writing templates for configuration parameters in two ways:

    • a single parameter is specified to be interpolated:

    • all parameters of a component to be interpolated:

    Adding a suffix or prefix for interpolated parameters

    It is possible to add a prefix or suffix to interpolated configuration parameters by acting at the component-type level:

    In the example above, the parameter x1 will be interpolated with the prefix PREFIX and the suffix SUFFIX, ${value} will be replaced with the actual value of the parameter at each experiment.

    Example

    Suppose we have the configuration of the following parameters for experiment 1 of a study:

    where component1 is of type MyComponentType defined as follows:

    A template file to interpolate only parameter component1.param1 and all parameters from component2 would look like this:

    The file after the configuration parameters are interpolated would look like this:

    Note that the file in this example contains a bash command whose arguments are constructed by interpolating configuration parameters. This represents a typical use case for the WindowsFileConfigurator: to construct the right bash commands that configure a system with the new configuration parameters computed by Akamas.
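The template and interpolated files referenced above are missing from this export; as an illustration (parameter names and values are hypothetical, and the exact rendering of the `*` token depends on the component type definition), a template line and its interpolated result could look like:

```
# Template (before interpolation):
bash launch.sh --param1 ${component1.param1} ${component2.*}

# After interpolation (assuming component1.param1=42 and component2 has
# a single parameter param2=5):
bash launch.sh --param1 42 param2=5
```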

    Operator arguments

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    source

    Object

    It should have a structure like the one defined in the next section

    No, if the Component whose name is defined in component has properties that map to the ones defined within source

    source and target structure and arguments

    Here follows the structure of either the source or target operator argument

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    hostname

    String

    It should be a valid host address

    Yes

    Get operator arguments from component

    The component argument can be used to refer to a Component by name and use its properties as the arguments of the operator. In case the mapped arguments are already provided to the operator, there is no override.

    Notice that in this case, the operator replaces in the template file only the tokens referring to the specified component. A parameter bound to any other component causes the substitution to fail.

    Component property to operator argument mapping

    Component property

    Operator argument

    hostname

    source->hostname target->hostname

    username

    source->username target->username

    password

    source->password target->password

    Examples

    Configure parameters for an Apache server with explicit source and target information

    Configure parameters for an Apache server with information taken from a Component

    Where the apache-server-1 component is defined as:

    Supported versions
    • Oracle 12c or later

    Network requirements

    • The OracleExecutor operator must be able to connect to the Oracle URL or IP address and port (default port is 1521)

    Permissions

    • The user used to log into the database must have enough privilege to perform the required queries

    Operator arguments

    When you define a task that uses the Oracle Executor operator you should specify some configuration information to allow the operator to connect to the Oracle instance and execute queries.

    The operator inherits the connection arguments from the properties of the component referenced in the task definition. The Akamas user can also override the properties of the component, or not reference a component at all and define the connection fields directly in the configuration of the task.

    The following table provides the list of all properties required to define a task that uses the OracleExecutor operator.

    Field
    Type
    Description
    Default Value
    Restrictions
    Required
    Source

    connection.dsn

    String

    The DSN or EasyConnect string

    Notice: it is a good practice to define only queries that update the state of the database. It is not possible to use SELECT queries to extract data from the database.

    Examples

    Truncate tables after a load test

    In the following example, the operator performs a cleanup action on a table of the database:
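The example block is missing from this export; a sketch of such a cleanup task is shown below. The `sql` argument name and the table name are assumptions (check the OracleExecutor reference for the exact field name):

```yaml
# Hypothetical cleanup task: truncate a table after a load test.
- name: Cleanup tables
  operator: OracleExecutor
  arguments:
    component: oracledb     # component holding the connection properties
    sql:                    # assumed argument listing statements to run
      - TRUNCATE TABLE orders
```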

    Update database entries

    In the following example, the operator leverages its templating features to update a table:

    The referenced oracledb component contains properties that specify how to connect to the Oracle database instance:

    Supported versions
    • Micro Focus LoadRunner 12.60 or 2020

    • Microsoft Windows Server 2016 or 2019

      • Powershell version 5.1 or greater

    User and WinRM configuration

    To configure WinRM to allow Akamas to launch tests please read the Integrating LoadRunner Professional page.

    All LoadRunner test files (VuGen scripts and folder, lrs files) and their parent folders, must be readable and writable by the user account used by Akamas.

    Configuration options

    When you define a task that uses the LoadRunner operator you should specify some configuration information to allow the operator to connect to the LoadRunner controller and execute a provided test scenario.

    You can specify configuration information within the arguments that are part of a task in the YAML of the definition of a workflow.

    You can avoid specifying each configuration item at the task level by including a component property with the name of a component; in this way, the operator takes any configuration information from the properties of the referenced component.

    Required properties

    • controller - a set of pieces of information useful for connecting to the LoadRunner controller

    • scenarioFile - the path to the scenario file within the LoadRunner controller to execute the performance test

    • resultFolder - the path to the performance tests results folder with the LoadRunner controller

    Connect to a LoadRunner controller

    To make it possible for the operator to connect to a LoadRunner controller to execute a performance test you can use the controller property within the workflow task definition:
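The code block for this property is missing from this export; a sketch of the task with the controller property is shown below (assuming WinRM-style hostname/username/password fields on the controller object; all hostnames, credentials, and paths are made-up values):

```yaml
# Hypothetical task fragment for the LoadRunner operator.
- name: Run performance test
  operator: LoadRunner
  arguments:
    controller:
      hostname: lr-controller.example.com   # made-up controller host
      username: akamas
      password: secret
    # paths escaped with four backslashes, per the notice below
    scenarioFile: C:\\\\Scenarios\\\\scenario1.lrs
    resultFolder: C:\\\\Results\\\\scenario1
```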

    Operator arguments

    This table reports the configuration reference for the arguments section.

    Field
    Type
    Value restrictions
    Required
    Default
    Description

    controller

    Object

    Yes

    Important notice: remember to escape your path with four backslashes (e.g. C:\\\\Users\\\\...)

    Controller arguments

    This table reports the configuration reference for the controller section, which is an object with the following fields:

    Field
    Type
    Value restrictions
    Required
    Default
    Description

    component

    String

    No

    Important notice: remember to escape your path with four backslashes (e.g. C:\\\\Users\\\\...)

    Examples

    A simple performance test

    Creating automation workflows

    WindowsExecutor Operator

    The WindowsExecutor operator executes a command on a target Windows machine using WinRM.

    The command can be anything that runs on a Windows Command Prompt.

    Operator arguments

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    host structure and arguments

    Here follows the structure of the host argument

    with its arguments:

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    Get operator arguments from component

    The component argument can refer to a Component by name and use its properties as the arguments of the operator. In case the mapped arguments are already provided to the operator, there is no override.

    Here is an example of a component that overrides the host and the command arguments:
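The referenced example block is missing from this export; a sketch of such a component is shown below (all names, credentials, and paths are made-up values):

```yaml
# Hypothetical component providing WinRM connection details and a command
# that the WindowsExecutor operator can pick up.
name: winserver1
description: A Windows target machine
componentType: Web Application
properties:
  hostname: winserver1.example.com   # made-up host
  username: Administrator
  password: secret
  command: dir C:\app                # overrides the command argument
```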

    Examples

    Execute a dir command with explicit host information

    Execute a dir command with host information taken from a Component

    Goal & Constraints

    Optimization goals and constraints are defined using a YAML manifest with the following structure:
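The manifest block is missing from this export; the general shape is sketched below (the objective/function/constraints fields follow the study reference where known, while the component name and formulas are hypothetical):

```yaml
# Sketch of a goal & constraints section; "app" and the formulas are
# made-up examples.
goal:
  objective: minimize              # minimize or maximize
  function:
    formula: app.response_time     # hypothetical component metric
  constraints:
    absolute:
      - name: max_error_rate
        formula: app.error_rate <= 0.05
```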

    where:

    Field
    Type
    Value restriction
    Is Required
    Default value
    Description
    name: test
    operator: LoadRunnerEnterprise
    arguments:
      address: "http://lr-pc.dev.akamas.io"
      username: akamas
      password: akamas
      tenantID: cf59c1a8-ad2d-4c9a-9324-edadaae5b8b9
      domain: AKAMASDOMAIN
      project: akamasproject
      testId: 1
      testSet: testsetname
      timeSlot: '30m'
    host:
      hostname: this_is_a_hostname
      username: this_is_a_username
      password: this_is_a_password
      sshPort: 22
      key: this_is_a_key
    name: Run Script
    operator: Executor
    arguments:
      timeout: 30s
      retries: 3
      retry_delay: 10s
      command: bash /tmp/myscript.sh
      host:
        hostname: frontend.akamas.io
        username: akamas
        key: secret.key
    name: TestConnectivity
    operator: Executor
    arguments:
      command: bash uname -a
      host:
        hostname: frontend.akamas.io
        username: akamas
        key: |-
          -----BEGIN RSA PRIVATE KEY-----
          RSA KEY HERE
          -----END RSA PRIVATE KEY-----
    name: TestConnectivity
    operator: Executor
    arguments:
      command: bash uname -a
      host:
        hostname: frontend.akamas.io
        username: akamas
        key: path/to/key
    name: TestConnectivity
    operator: Executor
    arguments:
      command: bash uname -a
      component: frontend1
    name: TestConnectivity
    operator: Executor
    arguments:
      command: bash start_load.sh
      component: tester
      detach: true
    #!/bin/bash
    set -m;
    $HOME/tomcat/apache-tomcat_1299/bin/startup.sh
    command: "bash $HOME/akamasScript/tomcatStart.sh"
    ssh -t <user>@<server> <your command here>
    switch on machine and wait for SSH
    run application test in background → detached mode
    execute test run
    ${component_name.parameter_name}
    ${component_name.*}
    name: Component Type 1
    description: My Component type
    parameters:
    - name: x1
      domain:
        type: real
        domain: [-5.0, 10.0]
      defaultValue: -5.0
      # Under this section, the operator to be used to configure the parameters is defined
      operators:
        WindowsFileConfigurator:
            # using this OPTIONAL confTemplate property is possible to interpolate the parameter value with a prefix and a suffix
            confTemplate: "PREFIX${value}SUFFIX"
    component1.param1: 1024
    component1.param2: Category1
    component2.param3: 7
    component2.param4: 35.4
    name: MyComponentType
    description: "MyComponentType"
    parameters:
    - name: param1
      domain:
        type: real
        domain: [-5.0, 10.0]
      defaultValue: -5.0
      # Under this section, the operator to be used to configure the parameters is defined
      operators:
        WindowsFileConfigurator:
            # using this OPTIONAL confTemplate property is possible to interpolate the parameter value with a prefix and a suffix
            confTemplate: "X1:${value}MB"
    # ...
    myexecutable.exe /PARAM ${component1.param1} /PARAMS ${component2.*}
    myexecutable.exe /PARAM X1:1024MB /PARAMS 7 35.4
    name: RemoteConfOperatorTestStandalone
    operator: WindowsFileConfigurator
    arguments:
      source:
        hostname: template-server
        username: akamas-user1
        password: akamas-password1
        path: C:\templates\frontend-httpd.conf
      target:
        hostname: frontend-server
        username: akamas-user2
        password: akamas-password22
        path: c:\httpd\httpd.conf
    name: RemoteConfOperatorTestStandalone
    operator: WindowsFileConfigurator
    arguments:
      component: apache-server-1
    name: apache-server-1
    description: The Apache server instance
    componentType: Apache Server 2.4
    
    properties:
      hostname: apache.akamas.io
      username: administrator
      sourcePath: c:\template\httpd.conf.template
      targetPath: c:\httpd\httpd.conf
    tasks:
    - name: clean database
      operator: OracleExecutor
      arguments:
        sql:
        - TRUNCATE TABLE user_action
        - DELETE FROM user WHERE id LIKE 'test%'
        connection:
          user: application
          password: password
          dsn: oradb.dev.akamas.io/XE
    tasks:
    - name: set value
      operator: OracleExecutor
      arguments:
        sql:
        - UPDATE rs_component_pros SET value='${app.max_connections}' WHERE property='maxconn'
        component: oracledb
    name: oracledb
    componentType: Oracle Database 18c
    properties:
      connection:
        user: application
        password: password
        host: oradb.dev.akamas.io
        service: XE
    controller:
      hostname: loadrunner.example.com
      username: Domain\LoadRunnerUser
      password: j(sBdH5fsG9.I56P%7n2XPjmgO6!ARm=
    name: "task1"
    operator: "LoadRunner"
    arguments:
      controller:
        hostname: loadrunner.example.com
        username: Domain\LoadRunnerUser
        password: j(sBdH5fsG9.I56P%7n2XPjmgO6!ARm=
      scenarioFile: 'C:\Users\LoadRunnerUser\Desktop\test\scenario\Scenario1.lrs'
      resultFolder: 'c:\Temp\{study}\{exp}\{trial}'
      timeout: 15m
      checkFrequency: 30s
    1h30m

    Information relative to the target machine onto which the command has to be executed using SSH

    component

    String

    It should match the name of an existing Component of the System under test

    no

    The name of the Component whose properties can be used as arguments of the operator

    detach

    Boolean

    no

    False

    The execution mode of the shell command. Default (False) execution will be synchronous, detached (True) execution will be asynchronous and will return immediately

    SSH endpoint

    username

    String

    no, if the Component whose name is defined in component has a property named username

    SSH login username

    password

    String

    cannot be set if key is already set

    no, if the Component whose name is defined in component has a property named password

    SSH login password

    sshPort

    Number

    1≤sshPort≤65535

    no

    22

    SSH port

    key

    String

    cannot be set if password is already set

    no, if the Component whose name is defined in component has a property named key

    SSH login key. Either provide the key value directly, or specify the path of a file (local to the CLI executing the create command) from which to read the key. The operator supports RSA and DSA keys.

    password

    host->password

    key

    host->key

    Information relative to the source/input file to be used to interpolate optimal configuration parameters discovered by Akamas

    target

    Object

    It should have a structure like the one defined in the next section

    No, if the Component whose name is defined in component has properties that map to the ones defined within target

    Information relative to the target/output file to be used to interpolate optimal configuration parameters discovered by Akamas

    component

    String

    It should match the name of an existing Component of the System under test

    No

    The name of the Component whose properties can be used as arguments of the operator

    Windows host

    username

    String

    Yes

    Login username

    password

    String

    Windows password for the specified user

    Yes

    Login password

    path

    String

    It should be a valid path

    Yes

    The path of the file to be used as either the source or the target when applying Akamas-computed configuration parameters using files

    sourcePath

    source->path

    targetPath

    target->path

    It is possible to define only one of the following sets of configurations:

    • dsn

    • host, service and optionally port

    • host, sid and optionally port

    task, component

    connection.host

    String

    The address of the database instance

    task, component

    connection.port

    Integer

    The listening port of the database instance

    1521

    task, component

    connection.service

    String

    The database service name

    task, component

    connection.sid

    String

    The database SID

    task, component

    connection.user

    String

    The user name

    Yes

    task, component

    connection.password

    String

    The user password

    Yes

    task, component

    connection.mode

    String

    The connection mode

    sysdba, sysoper

    task, component

    sql

    List[String]

    The list of queries to update the database status before or after the workload execution. Queries can be templatized, containing tokens referencing parameters of any component in the system.

    Yes

    task

    autocommit

    boolean

    A flag to enable the auto-commit feature

    False

    No

    task

    component

    String

    The name of the component to fetch properties from

    No

    task

    The information required to connect to the LoadRunner controller machine.

    component

    String

    No

    The name of the component from which the operator will take its configuration options

    scenarioFile

    String

    Matches an existing file within the LoadRunner controller

    Yes

    The LoadRunner scenario file to execute the performance test.

    resultFolder

    String

    Yes

    The folder, on the controller, where LoadRunner will put the results of a performance test.

    You can use the placeholders {study}, {exp}, {trial} to generate a path that is unique for the running Akamas trial.

    It can be a local path on the controller or on a network share

    loadrunnerResOverride

    String

    A valid name for a Windows folder

    No

    res

    The folder name where LoadRunner saves the analysis results.

    The default value can be changed in the LoadRunner controller.

    timeout

    String

    The string must contain a numeric value followed by a suffix (s, m, h, d).

    No

    2h

    The timeout for the LoadRunner scenario. If LoadRunner doesn’t finish the scenario within the specified amount of time, Akamas will consider the workflow as failed.

    checkFrequency

    String

    The string must contain a numeric value followed by a suffix (s, m, h, d).

    No

    1m

    The interval at which Akamas checks the status of the LoadRunner scenario.

    executable

    String

    A valid Windows path

    No

    C:\Program Files (x86)\Micro Focus\LoadRunner\bin\Wlrun.exe

    The LoadRunner executable path

    The name of the component from which the operator will take its configuration options.

    scenarioFile

    String

    Matches an existing file within the LoadRunner controller

    Yes

    The LoadRunner scenario file to execute the performance test.

    resultFolder

    String

    Yes

    The folder, on the controller, where LoadRunner will put the results of a performance test.

    You can use the placeholders {study}, {exp}, {trial} to generate a path that is unique for the running Akamas trial.

    It can be a local path on the controller or on a network share.

    loadrunnerResOverride

    String

    A valid name for a Windows folder

    No

    res

    The folder name where LoadRunner saves the analysis results.

    The default value can be changed in the LoadRunner controller.

    timeout

    String

    The string must contain a numeric value followed by a suffix (s, m, h, d).

    No

    2h

    The timeout for the LoadRunner scenario. If LoadRunner doesn’t finish the scenario within the specified amount of time, Akamas will consider the workflow as failed.

    checkFrequency

    String

    The string must contain a numeric value followed by a suffix (s, m, h, d).

    No

    1m

    The interval at which Akamas checks the status of the LoadRunner scenario.

    executable

    String

    A valid Windows path

    No

    C:\Program Files (x86)\Micro Focus\LoadRunner\bin\Wlrun.exe

    The LoadRunner executable path.

    Information relative to the target machine onto which the command has to be executed

    component

    String

    It should match the name of an existing Component of the System under test

    No

    The name of the Component whose properties can be used as arguments of the operator

    The protocol to use to connect to the Windows machine with WinRM

    hostname

    String

    Valid FQDN or ip address

    Yes, if the Component whose name is defined in component does not have a property named host->hostname

    -

    Windows machine’s hostname

    port

    Number

    1≤port≤65535

    Yes, if the Component whose name is defined in component does not have a property named host->port

    5863

    WinRM port

    path

    String

    -

    Yes, if the Component whose name is defined in component does not have a property named host->path

    /wsman

    The path where WinRM is listening

    username

    String

    • username

    • domain\username

    • username@domain

    Yes, if the Component whose name is defined in component does not have a property named host->username

    -

    User login (domain or local)

    password

    String

    -

    Yes, if the Component whose name is defined in component does not have a property named host->password

    -

    Login password

    authType

    String

    • ntlm

    • ssl

    Yes, if the Component whose name is defined in component does not have a property named host->authType

    ntlm

    The authentication method to use against Windows machine

    validateCertificate

    Boolean

    • true

    • false

    Yes, if the Component whose name is defined in component does not have a property named host->validateCertificate

    False

    Whether or not to validate the server certificate

    ca

    String

    A valid CA certificate

    Yes, if the Component whose name is defined in component does not have a property named host->ca

    -

    The CA that is required to validate the server certificate

    operationTimeoutSec

    Integer

    Must be greater than 0

    No

    The amount of time, in seconds, after which the execution of the command is considered failed

    Notice that the output of the command doesn’t reset the timeout.

    readTimeoutSec

    Integer

    Must be greater than operationTimeoutSec

    No

    The number of seconds to wait before an HTTP connect/read times out

    command

    String

    Yes

    The command to be executed on the remote machine

    host

    Object

    It should have a structure like the one described here below

    No

    protocol

    String

    • https

    • http

    Yes, if the Component whose name is defined in component does not have a property named host->protocol

    https

    host:
      protocol: [https|http]
      hostname: this_is_a_hostname
      port: 5863
      path: /wsman
      username: this_is_a_username
      password: this_is_a_password
      validateCertificate: false
    name: LoadRunnerMachine
    componentType: WebApplication
    properties:
      command: 'dir c:\'
      host:
        hostname: lr.mydomain.com
        username: MyLoadRunnerUser
        password: MyPassword
    name: TestConnectivity
    operator: WindowsExecutor
    arguments:
      command: 'dir c:\'
      host:
        hostname: frontend.akamas.io
        username: administrator
        password: MyPassword
    name: TestConnectivity
    operator: WindowsExecutor
    arguments:
      command: 'dir c:\'
      component: frontend1

    objective

    String

    minimize maximize

    Yes

    How Akamas should evaluate the goodness of a generated configuration: whether it should consider good a configuration that maximizes the goal function, or one that minimizes it.

    function

    Object

    It should have a structure like the one described in Goal function

    Yes

    The specification of the function to be evaluated to assess the goodness of a configuration generated by Akamas. This function is a function of the metrics of the different Components of the System under test.

    constraints

    List of objects

    It should have a structure like the one described in Goal constraints

    No

    A list of constraints on aggregated metrics of the Components of the System under test for which a generated configuration should not be considered valid.

    hashtag
    Function

    The function field of the Goal of a Study details the characteristics of the function Akamas should minimize or maximize to reach the desired performance objective.

    The function field has the following structure:

    Where:

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    formula

    String

    See formula

    Yes

    hashtag
    Formula

    The formula field represents the mathematical expression of the performance objective for the Study and contains variables and operators with the following characteristics:

    • Valid operators are: +, -, *, /, ^, sqrt(variable), log(variable), max(variable1, variable2), and min(variable1, variable2)

    • Valid variables are in the form:

      • <component_name>.<metric_name>, which correspond directly to metrics of Components of the System under test

      • <variable_name>, which should match variables specified in the variables field

    Each metric that is directly or indirectly part of the formula of the function of the Goal is aggregated by default by average; more specifically, Akamas computes the average of each metric within the time window specified by the windowing strategy of the Study. Variables in the formula can be expanded with an aggregation in the form of <variable>:<aggregation>. The available aggregations are listed in the section Aggregations.

    hashtag
    Variables

    The variables field contains the specification of additional variables present in the formula; these variables offer more flexibility compared to directly specifying each metric of each Component in the formula.

    Notice: each subfield of variables specifies a variable with its characteristics; the name of the subfield is the name of the variable.

    The variable subfield has the following structure:

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    metric

    String

    should match the name of a metric defined for the Components of the System under test

    Yes

    It is possible to use the notation <component_name>.<metric_name> in the metric field so that a filter on the metric’s data points by that component name is automatically applied.

    hashtag
    Constraints

    The constraints field specifies constraints on the metrics of the system under test. For a configuration to be valid for the defined goal, such constraints must be satisfied. Constraints can be defined as absolute or relativeToBaseline.

    Each constraint has the form of:

    mathematical_operation comparison_operator value_to_compare

    where valid mathematical operations include:

    • + - * / ^

    • min max

    • sqrt log (log is a natural logarithm)

    valid comparison operators include:

    • > < <= >=

    • == != (equality, inequality)

    and valid values to compare include:

    • absolute values (e.g., 104343)

    • percentage values relative to the baseline (e.g., 20%)

    As an example, you could define an absolute constraint with the following snippet:

    Relative constraints can be defined by adding other constraints under the relativeToBaseline section. In the example below, for the configuration to be considered valid, it's required that the metric jvm.memory_used does not exceed by 80% the value measured in the baseline.

    hashtag
    Aggregations

    Variables used in the study formula specification and in the constraints definition can include an aggregation. The following aggregations are available: avg, min, max, sum, p90, p95, p99.
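For instance, an aggregation can be appended to a variable defined in the variables section. A minimal sketch following the structure above (the component and metric names are illustrative):

```yaml
function:
  formula: "sqrt(x:p95)"        # p95 aggregation applied to variable x
  variables:
    x:
      metric: "response_time"
      labels:
        componentName: "jvm1"   # filter data points to the jvm1 component
```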

    hashtag
    Examples

    The following example refers to a study whose goal is to optimize the throughput of a Java service (jpetstore), that is to maximize the throughput (measured as elements_per_second) while keeping errors (error_rate) and latency (avg_duration, max_duration) under control (absolute values):

    The following example refers to a study whose goal is to optimize the memory consumption of Docker containers in a microservices application, that is to minimize the average memory consumption of Docker containers within the application with appId="app1" by observing memory limits, also normalizing by the maximum duration of a benchmark (containers_benchmark_duration).

    LinuxConfigurator Operator

    The LinuxConfigurator operator allows configuring systems tuned by Akamas by applying parameters related to the Linux kernel using different strategies.

    The operator can configure provided Components or can configure every Component which has parameters related to the Linux kernel.

    The parameters are applied via SSH protocol.

    hashtag
    Using

    In the most basic use of the Operator, it is sufficient to add a task of type LinuxConfigurator in the workflow.

    The operator makes use of properties specified in the component to identify which instance should be configured, how to access it, and any other information required to apply the configuration.
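A minimal workflow task sketch (the component name linux-host is illustrative):

```yaml
name: configure-os
operator: LinuxConfigurator
arguments:
  # optional: omit component to configure every Component
  # that has parameters related to the Linux kernel
  component: linux-host
```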

    hashtag
    Operator arguments

    If no component is provided, this operator will try to configure every parameter defined for the Components of the System under test

    hashtag
    Supported Component Properties

    The following table highlights the properties that can be specified on components and are used by this operator.

    hashtag
    Filter parameters and block/network devices

    The properties blockDevices and networkDevices allow specifying which parameters to apply to each block/network-device associated with the Component, as well as which block/network-device should be left untouched by the LinuxConfigurator.

    If the properties are omitted, then all block/network-devices associated with the Component will be configured with all the available related parameters.

    All block-devices called loopN (where N is an integer greater than or equal to 0) are automatically excluded from the Component’s block-devices.

    The properties blockDevices and networkDevices are lists of objects with the following structure:
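A sketch of how such a filter might look on a component, assuming each object carries a name regex and an optional list of parameters to apply (device and parameter names are illustrative):

```yaml
properties:
  blockDevices:
  - name: xvd[a-z]                      # regex selecting the block devices to configure
    parameters: [os_StorageReadAhead]   # if present, only these parameters are applied
```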

    hashtag
    Examples

    In this example, only the parameters os_StorageReadAhead and os_StorageQueueScheduler are applied to all the devices that match the regex "xvd[a-z]" (i.e. xvda, xvdb, …, xvdz).

    In this example, only the parameter os_StorageMaxSectorKb is applied to the block devices xvdb and loop0.

    circle-info

    Note that the parameter is also applied to the block device loop0: since it is explicitly specified in the name filter, this overrides the default behavior by which loopN devices are excluded by the Linux Optimization Pack.

    In this example, no parameters are applied to the wlp4s0 network device, which is therefore excluded from the optimization.

    hashtag
    How are parameters applied to the system?

    Some configuration parameters related to the Linux kernel may be applied using the strategies supported by this operator, while others may be applied with different strategies (e.g., using a file written to a remote machine). To support this scenario, it is necessary to specify, at the ComponentType level, which parameters should be applied with the LinuxConfigurator and which strategy should be used to configure each parameter. This information is already embedded in the Linux Optimization Pack and, usually, no customization is required.

    hashtag
    Sysctl strategy

    With this strategy, a parameter is configured by leveraging the sysctl utility. The sysctl variable to map to the parameter that needs to be configured is specified using the key argument.
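A sketch of a parameter definition using the sysctl strategy, assuming the same nesting as the echo strategy example in this guide (the parameter and key names are illustrative):

```yaml
parameters:
- name: os_MemorySwappiness
  operators:
    LinuxConfigurator:
      sysctl:
        key: vm.swappiness   # the sysctl variable mapped to this parameter
```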

    hashtag
    Echo strategy

    With this strategy, a parameter is configured by echoing and piping its value into a provided file. The path of the file is specified using the file argument.
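For instance, an echo-strategy parameter pipes its value into the given file, as in the component-type example elsewhere in this guide:

```yaml
parameters:
- name: os_StorageQueueScheduler
  operators:
    LinuxConfigurator:
      echo:
        file: /sys/class/block/nvme0n1/queue/scheduler   # value is echoed into this file
```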

    hashtag
    Map strategy

    With this strategy, each possible value of a parameter is mapped to a command to be executed on the machine the LinuxConfigurator operates on (this is especially useful for categorical parameters).
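A sketch of a categorical parameter using the map strategy, assuming each category maps to a shell command (the parameter name, categories, and commands are illustrative):

```yaml
parameters:
- name: os_CPUGovernor
  operators:
    LinuxConfigurator:
      map:
        performance: cpupower frequency-set -g performance   # run when value is "performance"
        powersave: cpupower frequency-set -g powersave       # run when value is "powersave"
```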

    hashtag
    Command strategy

    With this strategy, a parameter is configured by executing a command into which the parameter value is interpolated.
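A sketch of the command strategy, assuming the parameter value is interpolated into the command as ${value} (the parameter name and command are illustrative):

```yaml
parameters:
- name: os_MemorySwappiness
  operators:
    LinuxConfigurator:
      command: "sysctl -w vm.swappiness=${value}"   # ${value} is replaced at each experiment
```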

    FileConfigurator Operator

    The FileConfigurator operator allows configuring systems tuned by Akamas by interpolating configuration parameters into files on remote machines.

    The operator performs the following operations:

    1. It reads an input file from a remote machine containing templates for interpolating the configuration parameters generated by Akamas

    2. It replaces the values of configuration parameters in the input file

    3. It writes the file with replaced configuration parameters on a specified path on another remote machine

    Access on remote machines is performed using SFTP (SSH).

    hashtag
    Templates for configuration parameters

    The FileConfigurator allows writing templates for configuration parameters in two ways:

    • specify that a parameter should be interpolated directly:

    • specify that all parameters of a component should be interpolated:
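For instance, a template line combining both token forms, in the style of the example later in this section (the executable and component names are illustrative):

```
run_app --param1 ${component1.param1} ${component2.*}
```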

    hashtag
    Suffix or prefix for interpolated parameters

    It is possible to add a prefix or suffix to interpolated configuration parameters by acting at the component-type level:

    Notice that any parameter that does not contain the FileConfigurator element in its operators attribute is ignored and not written.

    In the example above, the parameter x1 will be interpolated with the prefix PREFIX and the suffix SUFFIX; ${value} will be replaced with the actual value of the parameter at each experiment.

    hashtag
    Example

    Let's assume we want to apply the following configuration:

    where component1 is of type MyComponentType and MyComponentType is defined as follows:

    A template file to interpolate only parameter component1.param1 and all parameters from component2 would look like this:

    The file after the configuration parameters are interpolated would look like this:

    Note that the file in this example contains a bash command whose arguments are constructed by interpolating configuration parameters. This represents a typical use case for the File Configurator: to construct the right bash commands that will configure a system with the new configuration parameters computed by Akamas.

    hashtag
    Operator arguments

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    hashtag
    source and target structures and arguments

    Here follows the structure of either the source or target operator argument:

    Name
    Type
    Value restrictions
    Required
    Default
    Description

    hashtag
    Get operator arguments from component

    The component argument can be used to refer to a component by name and use its properties as the arguments of the operator. If a mapped argument is already provided to the operator explicitly, the component property does not override it.

    In this case, the operator replaces in the template file only tokens referring to the specified component. A token bound to any other component will cause the substitution to fail.

    hashtag
    Component property to operator argument mapping

    Component property
    Operator argument

    hashtag
    Examples

    hashtag
    Configure parameters for an Apache server with explicit source and target machine information

    hashtag
    Configure parameters for an Apache server with information taken from a Component

    where the apache-server-1 component is defined as:

    Component Types template

    Component types are defined using a YAML manifest with the following structure:

    and properties for the general section:

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    The parameters section describes the relationship between the component type and already defined parameters with the following properties:

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    The metrics section describes the relationship between the component type and already defined metrics with the following properties:

    Field
    Type
    Value restrictions
    Is required
    Default value
    Description

    Notice that component type definitions are shared across all the workspaces on the same Akamas installation, and require an account with administrative privileges to manage them.

    hashtag
    Examples

    Example of a component for the Cassandra component type:

    Example of a component for the Linux operating system component type:

    Study template

    Optimization studies are defined using a YAML manifest with the following structure:

    with the following mandatory properties:

    Field
    Type
    Value restrictions
    Is required
    Default Value
    Description

    DotNet Core 3.1

    This page describes the Optimization Pack for the component type DotNet Core 3.1.

    hashtag
    Metrics

    Metric
    Unit
    Description
    goal:
      objective: "minimize"
      function:
        formula: "jvm1.response_time + jvm2.response_time"
      constraints:
        absolute:
          - name: heap_used
            formula: jvm1.heap_used <= 3221225472
        relativeToBaseline:
          - name: memory_used
            formula: jvm1.memory_used <= 80%
    function:
      formula: "jvm1.response_time / sqrt(x:max)"
      variables:
        x:
          metric: "throughput"
          labels:
            componentName: "jvm2"
    goal:
      objective: "minimize"
      function:
        formula: "jvm.response_time"
      constraints:
        absolute:
          - name: heap_used
            formula: jvm.heap_used <= 3221225472
    goal:
      objective: "minimize"
      function:
        formula: "jvm.response_time"
      constraints:
        absolute:
          - name: heap_used
            formula: jvm.heap_used <= 3221225472
        relativeToBaseline:
          - name: memory_used
            formula: jvm.memory_used <= 80%
    goal:
        objective: "maximize"
        function:
          formula: "jpetstore.elements_per_second"
        constraints:
          absolute:
            - name: elements_per_second
              formula: "jpetstore.elements_per_second > 55"
            - name: max_duration
              formula: "jpetstore.max_duration < 800"
            - name: avg_duration
              formula: "jpetstore.avg_duration < 70"
            - name: error_rate
              formula: "jpetstore.error_rate < 0.01"
    goal:
      objective: "minimize"
      function:
        formula: "containers_memory_limit/containers_benchmark_duration:max"
        variables:
          containers_memory_limit:
            metric: "memory_limit"
            labels:
              appId: "app1"
          containers_benchmark_duration:
            metric: "benchmark_duration"
            labels:
              appId: "app1"
    # General section
    name: function_branin
    description: A component type for the branin analytical function
    
    # Parameters section
    parameters:
      - name: x1
        domain:
          type: real
          domain: [-5.0, 10.0]
        defaultValue: -5.0
        decimals: 3
        operators:
          FileConfigurator:
            confTemplate: "${value}"
    
      - name: x2
        domain:
          type: real
          domain: [0.0, 15.0]
        defaultValue: 0.0
    
      - name: x3
        domain:
          type: categorical
          categories: [cat1,cat2,cat3]
        operators:
          LinuxConfigurator:
            echo:
              file: /sys/class/block/nvme0n1/queue/scheduler
    
    # Metrics section
    metrics:
      - name: function_value

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| formula | String | | | | The mathematical expression of what to minimize or maximize to reach the objective of the Study |
| variables | Object | See below | No | | The specification of additional variables present in the formula |

Each variable has the following structure:

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| metric | String | | | | The name of the metric of the Components of the System under test that maps to the variable |
| labels | A set of key-value pairs | | No | | A set of filters based on the values of the labels attached to the different data points of the metric. One of these labels is componentName, which contains the name of the Component the metric refers to |
| aggregation | String | MAX, MIN, AVG | No | AVG | The strategy through which data points of the metric are aggregated within the window produced by the selected windowing strategy. By default, an average is taken |
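Putting these fields together, a goal with a custom variable might look like the following sketch (the component name `jvm1` and the metric `response_time` are illustrative):

```yaml
goal:
  objective: minimize
  function:
    formula: rt
    variables:
      rt:
        metric: response_time   # a metric of a Component of the System under test
        labels:
          componentName: jvm1   # filter data points by Component name
        aggregation: MAX        # aggregate within the window (default: AVG)
```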

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| source | Object | It should have a structure like the one defined in the next section | No, if the Component whose name is defined in component has properties that map to the ones defined within source | | Information relative to the source/input file to be used to interpolate optimal configuration parameters discovered by Akamas |
| target | Object | It should have a structure like the one defined in the next section | No, if the Component whose name is defined in component has properties that map to the ones defined within target | | Information relative to the target/output file to be used to interpolate optimal configuration parameters discovered by Akamas |
| component | String | It should match the name of an existing Component of the System under test | No | | The name of the Component whose properties can be used as arguments of the operator |
| ignoreUnsubstitutedTokens | Boolean | | No | False | Behavior of the operator regarding leftover tokens in the target file: when False, FileConfigurator fails; when True, FileConfigurator succeeds regardless of leftover tokens |

SSH endpoint

Both source and target share the following structure:

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| hostname | String | It should be a valid SSH host address | Yes | | SSH host address |
| username | String | | Yes | | SSH login username |
| password | String | Cannot be set if key is already set | No | | SSH login password |
| key | String | Cannot be set if password is already set | No | | SSH login key, provided either directly as its value or as the path of the file to import it from. The operator supports RSA and DSA keys |
| sshPort | Number | 1 ≤ sshPort ≤ 65532 | No | 22 | SSH port |
| path | String | It should be a valid path | Yes | | The path of the file to be used as the source or target when applying Akamas-computed configuration parameters using files |

When component is specified, the following properties of the Component map to the operator's arguments:

| Component property | Maps to |
| --- | --- |
| hostname | source->hostname, target->hostname |
| username | source->username, target->username |
| sshPort | source->sshPort, target->sshPort |
| password | source->password, target->password |
| key | source->key, target->key |
| sourcePath | source->path |
| targetPath | target->path |

    ${component_name.parameter_name}
    ${component_name.*}
    name: Component Type 1
    description: My Component type
    parameters:
    - name: x1
      domain:
        type: real
        domain: [-5.0, 10.0]
      defaultValue: -5.0
      # Under this section, the operator to be used to configure the parameters is defined
      operators:
        FileConfigurator:
          # using this OPTIONAL confTemplate property is possible to interpolate the parameter value with a prefix and a suffix
          confTemplate: "PREFIX${value}SUFFIX"
    component1.param1: 1024
    component1.param2: Category1
    component2.param3: 7
    component2.param4: 35.4
    name: MyComponentType
    description: "MyComponentType"
    parameters:
    - name: param1
      domain:
        type: real
        domain: [-5.0, 10.0]
      defaultValue: -5.0
      # Under this section, the operator to be used to configure the parameters is defined
      operators:
        FileConfigurator:
          # using this OPTIONAL confTemplate property is possible to interpolate the parameter value with a prefix and a suffix
          confTemplate: "X1:${value}MB"
    ...
    myexecutable.sh -PARAM ${component1.param1} -PARAMS ${component2.*}
    myexecutable.sh -PARAM X1:1024MB -PARAMS 7 35.4
    name: RemoteConfOperatorTestStandalone
    operator: FileConfigurator
    arguments:
      source:
        hostname: template-server
        username: akamas-user1
        password: akamas-password1
        path: /templates/frontend-httpd.conf
      target:
        hostname: frontend-server
        username: akamas-user2
        password: akamas-password22
        path: /etc/httpd/httpd.conf
    name: RemoteConfOperatorTestStandalone
    operator: FileConfigurator
    arguments:
      component: apache-server-1
    name: apache-server-1
    description: The Apache server instance
    componentType: Apache Server 2.4
    
    properties:
      hostname: apache.akamas.io
      username: ubuntu
      key: key.pem
      sourcePath: templates/httpd.conf.template
      targetPath: /etc/httpd/httpd.conf
    

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| component | String | It should match the name of an existing Component of the System under test | Yes | | The name of the Component for which available Linux kernel parameters will be configured |
| blockDevices | List of objects | It should have a structure like the one described in the next section | No | | Allows the user to restrict and specify to which block devices the block-device-related parameters apply |
| networkDevices | List of objects | It should have a structure like the one described in the next section | No | | Allows the user to restrict and specify to which network devices the network-device-related parameters apply |

The SSH connection fields have the following structure:

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| hostname | String | It should be a valid SSH host address | Yes | | SSH host address |
| sshPort | Integer | 1 ≤ sshPort ≤ 65532 | Yes | 22 | SSH port |
| username | String | | Yes | | SSH login username |
| key | Multiline string | Either key or password is required | | | SSH login key, provided either directly as its value or as the path of the file to import it from. The operator supports RSA and DSA keys |
| password | String | Either key or password is required | | | SSH login password |

Each entry of blockDevices and networkDevices has the following structure:

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| name | String | It should be a valid regular expression to match block/network devices | Yes | | A regular expression that matches the block/network devices to configure with the related parameters of the Component |
| parameters | List of strings | It should contain the names of matching parameters of the Component | No | | The list of parameters to be configured for the specified block/network devices. If the list is empty, then no parameter will be applied for the block/network devices matched by name |

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| domain->type | string | {real, integer, categorical} | Yes | - | The type of domain to be set for the parameter in relationship with the component-type |
| domain->domain | array of numbers | The numbers should be either all integers or all real numbers (do not omit the "."), depending on domain->type. The size of the array must be 2 | No | - | The bounds to be used to define the domain of the parameter. These bounds are inclusive |
| domain->categories | array of strings | | No | - | The possible categories that the parameter can take |
| defaultValue | string, integer, real | The value must be included in the domain for real and integer types, and must be one of the categories for categorical types | Yes | - | The default value of the parameter |
| decimals | integer | [0-255] | No | 5 | The number of decimal digits rendered for this parameter |
| operators | object | The name and the parameters of a supported operator | Yes | - | Specifies which operators can be used to apply the parameter |

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| name | string | It should match the regexp `^[a-zA-Z][a-zA-Z0-9_]*$`, that is, only letters, numbers, and underscores, with no initial number or underscore. Notice: it should not match the name of another component | Yes | | The name of the component |
| description | string | | Yes | | A description to characterize the component |
| componentType | string | Notice: it should match the name of an existing component-type | Yes | | The name of the component-type that defines the type of the component |
| properties | object | | No | | General custom properties of the component. These properties can be defined freely and usually have the purpose of exposing information useful for configuring the component |

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| name | string | It should match the name of an existing parameter | Yes | - | The name of the parameter that should be related to the component-type |
| name | string | It should match the name of an existing metric | Yes | - | The name of the metric that should be related to the component-type |

| Name | Type | Value restrictions | Required | Default | Description |
| --- | --- | --- | --- | --- | --- |
| system | object reference | | TRUE | | The system the study refers to |
| name | string | | TRUE | | The name of the study |
| goal | object | | TRUE | | The goal and constraint description (see the related section) |
| kpis | list | | FALSE | | The KPIs description (see the related section) |
| numberOfTrials | integer | | FALSE | 1 | The number of trials for each experiment (see below) |
| trialAggregation | string | MAX, MIN, AVG | FALSE | AVG | The aggregation used to calculate the score across multiple trials (see below) |
| parametersSelection | list | | FALSE | all | The list of parameters to be tuned (see the related section) |
| metricsSelection | list | | FALSE | all | The list of metrics (see the related section) |
| workloadsSelection | object array | | FALSE | | The list of defined workloads; this only applies to live optimization studies (see the related section) |
| windowing | string | | FALSE | trim | The windowing strategy; this only applies to offline optimization studies (see the related section) |
| workflow | object reference | | TRUE | | The workflow the study refers to |
| steps | list | | TRUE | | The description of the steps (see the related section) |

Some of these optional properties depend on whether the study is an offline or live optimization study.

    Number of trials

    It is possible to perform more than one trial per experiment to validate the score of a configuration under test, e.g., to account for noisy environments.

    The following fragment of the YAML definition of a study sets the number of trials to 3:

    Notice: this is a global property of the study which can be overridden for each step.
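The fragment referenced above can be sketched as follows (the step name is illustrative; a step-level value overrides the study-level one):

```yaml
numberOfTrials: 3          # study-level default: 3 trials per experiment
steps:
  - name: optimize_step
    type: optimize
    numberOfExperiments: 10
    numberOfTrials: 2      # overrides the study-level value for this step
```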

    Trial aggregation

    The trial aggregation policy defines how trial scores are aggregated to form experiment scores.

    There are three different types of strategies to aggregate trial scores:

    • AVG: the score of an experiment is the average of the scores of its trials - this is the default

    • MIN: the score of an experiment is the minimum among the scores of its trials

    • MAX: the score of an experiment is the maximum among the scores of its trials

    The following fragment of the YAML definition of a study sets the trial aggregation to MAX:
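The fragment referenced above can be sketched as:

```yaml
numberOfTrials: 3
trialAggregation: MAX    # other possible values are AVG (the default) and MIN
```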

    Examples

    The following refers to an offline optimization study for a system modeling an e-commerce service, where a windowing strategy is specified:

    The following offline study refers to a tuning initiative for a Cassandra-based system (ID 2):

    The following offline study is for tuning another Cassandra-based system (ID 3) by acting only on JVM and Linux parameters:

| Metric | Unit | Description |
| --- | --- | --- |
| gc_count | collections/s | The total number of garbage collections |
| gc_duration | seconds | The garbage collection duration |
| heap_hard_limit | bytes | The size of the heap |

    Parameters

| Parameter | Type | Unit | Default | Domain | Restart | Description |
| --- | --- | --- | --- | --- | --- | --- |
| csproj_System_GC_Server | categorical | boolean | false | true, false | yes | The main flavor of the GC: set it to false for workstation GC or true for server GC. To be set in the csproj file; requires a rebuild |

    Safety Policies

    While Akamas leverages similar AI methods for both live optimizations and optimization studies, the way these methods are applied is radically different. For optimization studies running in pre-production environments, the approach is to explore the configuration space, accepting potentially failed experiments in order to identify regions that do not correspond to viable configurations. This approach is not acceptable for live optimizations running in production environments. For this purpose, Akamas live optimization combines observations of configuration changes with the automatic detection of workload contexts, and provides several customizable safety policies when recommending configurations to be approved, revised, and applied.

    Akamas provides a few customizable optimizer options (refer to the options described on the Optimize step page of the reference guide) that should be configured so as to make configurations recommended in live optimization and applied to production environments as safe as possible.


    - name: LinuxConf
      operator: LinuxConfigurator
      arguments:
        component: ComponentName
    blockDevices:
    - name: "xvd[a-z]"
      parameters:
        - os_StorageReadAhead
        - os_StorageQueueScheduler
    blockDevices:
    - name: "xvdb|loop0"
      parameters:
        - os_StorageMaxSectorsKb
    networkDevices:
      - name: wlp4s0
        parameters: []
    name: Component Type 1
    description: My Component type
    parameters:
      - name: net_forwarding
        domain:
          type: integer
          domain: [0, 1]
        defaultValue: 1
        operators:
          # the parameter is configured using LinuxConfigurator
          LinuxConfigurator:
            sysctl:
              key: net.ipv4.forwarding
    name: Component Type 1
    description: My Component type
    parameters:
      - name: os_MemoryTransparentHugepageEnabled
        domain:
          type: categorical
          categories: [always, never]
        defaultValue: always
        operators:
          LinuxConfigurator:
            echo:
              file: /sys/kernel/mm/transparent_hugepage/enabled
    name: Component Type 1
    description: My Component type
    parameters:
      - name: os_MemorySwap
        domain:
          type: categorical
          categories: [swapon, swapoff]
        defaultValue: swapon
        operators:
          LinuxConfigurator:
            map:
              swapon: command1
              swapoff: command2
    name: Component Type 1
    description: My Component type
    parameters:
      - name: os_MemorySwap
        domain:
          type: categorical
          categories: [swapon, swapoff]
        defaultValue: swapon
        operators:
          LinuxConfigurator:
            command:
              cmd: sudo ${value} -a
    name: Cassandra
    description: The Cassandra NoSQL database version 3
    parameters:
      - name: cassandra_compactionStrategy
        domain:
          type: categorical
          categories: [A, B]
        defaultValue: A
    
    metrics:
      - name: total_rate
      - name: read_rate
      - name: write_rate
      - name: read_response_time_avg
      - name: read_response_time_p90
      - name: read_response_time_p99
      - name: read_response_time_max
      - name: write_response_time_avg
      - name: write_response_time_p90
      - name: write_response_time_p99
      - name: write_response_time_max
    name: Linux OS
    description: A component type for the Linux Operating System
    parameters:
      #CPU Related
      - name: os_cpuSchedMinGranularity
        domain:
          type: integer
          domain: [300000, 30000000]
        defaultValue: 3000000
      - name: os_cpuSchedWakeupGranularity
        domain:
          type: integer
          domain: [400000, 40000000]
        defaultValue: 4000000
      - name: os_CPUSchedMigrationCost
        domain:
          type: integer
          domain: [100000, 5000000]
        defaultValue: 500000
      - name: os_CPUSchedChildRunsFirst
        domain:
          type: integer
          domain: [0, 1]
        defaultValue: 0
      - name: os_CPUSchedLatency
        domain:
          type: integer
          domain: [2400000, 240000000]
        defaultValue: 24000000
      - name: os_CPUSchedAutogroupEnabled
        domain:
          type: integer
          domain: [0, 1]
        defaultValue: 1
      - name: os_CPUSchedNrMigrate
        domain:
          type: integer
          domain: [3, 320]
        defaultValue: 32
      #Memory Related
      - name: os_MemorySwappiness
        domain:
          type: integer
          domain: [0, 100]
        defaultValue: 60
      - name: os_MemoryVmVfsCachePressure
        domain:
          type: integer
          domain: [10, 100]
        defaultValue: 100
      - name: os_MemoryVmMinFree
        domain:
          type: integer
          domain: [10240, 1024000]
        defaultValue: 67584
      - name: os_MemoryVmDirtyRatio
        domain:
          type: integer
          domain: [1, 99]
        defaultValue: 10
      - name: os_MemoryTransparentHugepageEnabled
        domain:
          type: categorical
          categories: ['True', 'False']
        defaultValue: 'True'
      - name: os_MemoryTransparentHugepageDefrag
        domain:
          type: categorical
          categories: ['True', 'False']
        defaultValue: 'True'
      - name: os_MemorySwap
        domain:
          type: categorical
          categories: ['True', 'False']
        defaultValue: 'True'
      - name: os_MemoryVmDirtyExpire
        domain:
          type: integer
          domain: [300, 30000]
        defaultValue: 3000
      - name: os_MemoryVmDirtyWriteback
        domain:
          type: integer
          domain: [50, 5000]
        defaultValue: 500
    
    metrics:
      - name: cpu_num
      - name: cpu_util
      - name: mem_util
      - name: load_avg
      - name: swapins
      - name: swapouts
      - name: disk_iops_writes
      - name: disk_iops_reads
      - name: disk_iops_total
      - name: disk_await_worst
      - name: proc_blocked
      - name: context_switch
      - name: tcp_retrans
      - name: tcp_tozerowin
      - name: net_band_rx_bits
      - name: net_band_tx_bits
      - name: network_in_byte_rate
      - name: network_out_byte_rate
      - name: mem_fault_minor
      - name: mem_fault_major
      - name: mem_active_file
      - name: mem_active_anon
      - name: mem_inactive_file
      - name: mem_inactive_anon
    system: 1
    name: Optimizing the e-shop application
    goal:
      objective: maximize
      function:
        formula: payments_per_sec
        variables:
          payments_per_sec:
            metric: eshop_payments
            labels:
              componentName: eshop
    
    workflow: eshop_jmeter_test
    steps:
      - name: baseline
        type: baseline
        values:
          tomcat.maxThreads: 1024
          jvm.maxHeap: 2048
          jvm.garbageCollectorType: G1GC
          postgres.shared_buffers: 4096
    numberOfTrials: 3
    trialAggregation: MAX    # Other possible values are AVG, MIN
    system: "bde4f259-9a51-4c67-87aa-3c5bc599c6b9" # id of the system to optimize with the actions defined in this study
    workflow: "eshop_jmeter_test" # name of the workflow to use to perform trials
    name: Optimizing the e-shop application # name of the study
    goal: # the performance goal to achieve
      objective: "maximize"
      function:
        formula: "eshop.payments_per_second"
    windowing: # the temporal window in which to compute the score of a trial
      type: "trim"
      trim: ["10s", "0s"] # discard the first 10s and the last 0s of the trial when computing the score
    parametersSelection: "all" # use all available configuration parameters
    metricsSelection: "all" # gather all metrics
    steps: # the steps to conduct to perform experiments and trials
      - name: "my_baseline" # do first a baseline with the provided configuration
        type: "baseline"
        values:
          jvm.maxHeap: 2048
          jvm.gcType: "-XX:+UseParallelGC"
      - name: my_optimization # then do 200 optimization experiments of 2 trials each
        type: optimize
        numberOfExperiments: 200
        numberOfTrials: 2
    system: 2
    name: Optimizing the cassandra - team 2
    goal:
      objective: minimize
      function:
        formula: read_response_time_p90
        variables:
          read_response_time_p90:
            metric: read_response_time_p90
            labels:
              componentName: cassandra
    
    windowing:
      type: trim
      trim: [5m, 1m]
    
    workflow: cassandra_workflow
    parametersSelection:
      - name: cassandra_jvm.jvm_maxHeapSize
      - name: cassandra.cassandra_concurrentReads
      - name: cassandra.cassandra_concurrentWrites
      - name: cassandra.cassandra_fileCacheSizeInMb
      - name: cassandra.cassandra_memtableCleanupThreshold
      - name: cassandra.cassandra_concurrentCompactors
    
    steps:
      - name: baseline_step
        type: baseline
        values:
          cassandra_jvm.jvm_maxHeapSize: 1024
          cassandra.cassandra_concurrentReads: 32
          cassandra.cassandra_concurrentWrites: 32
          cassandra.cassandra_fileCacheSizeInMb: 512
          cassandra.cassandra_memtableCleanupThreshold: 0.11
          cassandra.cassandra_concurrentCompactors: 2
    
      - name: optimization_step
        type: optimize
        optimizer: CALABI
        numberOfExperiments: 50
    system: 3
    name: Optimizing a Cassandra NoSQL database version 3 (jvm + os parameters)
    goal:
      objective: minimize
      function:
        formula: (x1+x2)/2
        variables:
          x1:
            metric: write_response_time_p90
            labels:
              componentName: cassandra_team1
          x2:
            metric: read_response_time_p90
            labels:
              componentName: cassandra_team1
    
    windowing:
      type: trim
      trim: [8m,2m]
    
    numberOfTrials: 2
    workflow: cassandra_workflow_jvm_os
    
    parametersSelection:
      - name: JVM1.jvm_maxHeapSize
      - name: JVM1.jvm_newRatio
      - name: JVM1.jvm_survivorRatio
      - name: JVM1.jvm_maxTenuringThreshold
      - name: JVM1.jvm_gcType
      - name: JVM1.jvm_concurrentGCThreads
      - name: os1.os_cpuSchedMinGranularity
      - name: os1.os_cpuSchedWakeupGranularity
      - name: os1.os_CPUSchedMigrationCost
      - name: os1.os_CPUSchedChildRunsFirst
      - name: os1.os_CPUSchedLatency
    
    steps:
      - name: baseline_step
        type: baseline
        values:
          JVM_team1.jvm_maxHeapSize: 1024
          JVM_team1.jvm_newRatio: 2
          JVM_team1.jvm_survivorRatio: 8
          JVM_team1.jvm_maxTenuringThreshold: 15
          JVM_team1.jvm_gcType: UseConcMarkSweepGC
          JVM_team1.jvm_concurrentGCThreads: 8
          os_team1.os_cpuSchedMinGranularity: 3000000
          os_team1.os_cpuSchedWakeupGranularity: 4000000
          os_team1.os_CPUSchedMigrationCost: 500000
          os_team1.os_CPUSchedChildRunsFirst: 0
          os_team1.os_CPUSchedLatency: 24000000
    
      - name: optimization_sobol
        type: optimize
        optimizer: SOBOL
        numberOfExperiments: 3
    
      - name: optimization_calabi
        type: optimize
        optimizer: CALABI
        numberOfExperiments: 50
    Exploration factor

    Akamas provides an optimizer option known as the exploration factor that only allows gradual changes to the parameters. This gradual optimization lets Akamas observe how each change impacts the system behavior before applying further changes.

    By properly configuring the optimizer, Akamas can gradually explore regions of the configuration space and slowly approach any potentially risky regions, thus avoiding recommending any configurations that may negatively impact the system. Gradual optimization takes into account the maximum recommended change for each parameter. This is defined as a percentage (default is 5%) with respect to the baseline value. For example, in the case of a container whose CPU limit is 1000 millicores, the corresponding maximum allowed change is 50 millicores. It is important to notice that this does not represent an absolute cap, as Akamas also takes into account any good configurations observed. For example, in the event of a traffic peak, Akamas would recommend a good configuration that was observed working fine for a similar workload in the past, even if the change is higher than 5% of the current configuration value.

    Notice that this feature does not work for categorical parameters (e.g. JVM GC type), as their values do not change incrementally. For these parameters, Akamas by default takes the conservative approach of only recommending values that have already been observed. Never-observed values can still be introduced, since users may modify categorical values when operating in human-in-the-loop mode: once Akamas has observed that such a configuration works fine, the corresponding value can then be recommended. For example, a user might change the recommended GC type from Serial to Parallel; once Parallel has been observed working fine, Akamas would consider it for future recommendations of GC type, while other values (e.g. G1) would not be considered until verified as safe.

    The exploration factor can be customized for each live optimization individually and changed while live optimizations are running.
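As a rough illustration of the gradual-change mechanism described above (this is only a sketch of the idea, not Akamas' actual implementation; the function names are hypothetical):

```python
def max_step(baseline: float, exploration_factor: float = 0.05) -> float:
    """Maximum recommended absolute change for a numeric parameter,
    expressed as a fraction of its baseline value (default: 5%)."""
    return baseline * exploration_factor

def clamp_recommendation(baseline: float, current: float, proposed: float,
                         exploration_factor: float = 0.05) -> float:
    """Limit a proposed value to within one gradual step of the current value."""
    step = max_step(baseline, exploration_factor)
    low, high = current - step, current + step
    return min(max(proposed, low), high)

# A container with a 1000-millicore CPU limit: one step is at most 50 millicores,
# so a proposal of 850 millicores is clamped to 950.
print(clamp_recommendation(baseline=1000, current=1000, proposed=850))  # → 950.0
```

Note that, as explained above, Akamas does not treat this as an absolute cap: previously observed good configurations may be recommended even beyond one step.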

    Safety factor

    Akamas provides an optimizer option known as the safety factor, designed to prevent Akamas from selecting configurations (even by slowly approaching them) that may impact the ability to meet defined SLOs. For example, when optimizing container CPU limits, lower and lower CPU limits might be recommended, up to the point where the limit becomes so low that application performance degrades.

    Akamas takes into account the magnitude of constraint breaches: a severe breach is considered more negative than a minor breach. For example, in the case of an SLO of 200 ms on response time, a configuration causing a 1 sec response time is assigned a very different penalty than a configuration causing a 210 ms response time. Moreover, Akamas leverages the smart constraint evaluation feature that takes into account if a configuration is causing constraints to approach their corresponding thresholds. For example, in the case of an SLO of 200 ms on response time, a configuration changing response time from 170 ms to 190 ms is considered more problematic than one causing a change from 100 ms to 120 ms. The first one is considered by Akamas as corresponding to a gray area that should not be explored.

    The safety factor is also used when starting the study, to validate the behavior of the baseline and determine whether exploring configurations close to it is safe. If the baseline itself presents constraint violations, even configurations close to the baseline might be risky: if more than (safety_factor × number_of_trials) baseline trials manifest constraint violations, the optimization is stopped.

    If some of your baseline trials fail constraint validation, we suggest analyzing them before proceeding with the optimization.

    The safety factor is set by default to 0.5 and can be customized for each live optimization individually and changed while live optimizations are running.
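The baseline validation rule described above can be sketched as follows (an illustrative snippet, not the product's actual code):

```python
def baseline_is_safe(violating_trials: int, number_of_trials: int,
                     safety_factor: float = 0.5) -> bool:
    """The optimization is stopped when more than
    safety_factor * number_of_trials baseline trials violate constraints."""
    return violating_trials <= safety_factor * number_of_trials

# With the default safety factor of 0.5 and 2 baseline trials,
# one violating trial is tolerated, two are not.
print(baseline_is_safe(1, 2))  # → True
print(baseline_is_safe(2, 2))  # → False
```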

    Outlier detection

    It is also worth mentioning that Akamas features an outlier detection capability to compensate for production environments, which are typically noisy and much less stable than staging environments and thus display highly fluctuating performance metrics. As a consequence, constraints may fail from time to time even for perfectly good configurations, due to a variety of causes such as shared cloud infrastructure or slowness of external systems.

    false

    true, false

    yes

    The main flavor of the GC: set it to false for workstation GC or true for server GC. To be set in csproj file and requires rebuild.

    csproj_System_GC_Concurrent

    categorical

    boolean

    true

    true, false

    yes

    Configures whether background (concurrent) garbage collection is enabled (setting to true). To be set in csproj file and requires rebuild.

    runtime_System_GC_Server

    categorical

    boolean

    false

    true, false

    yes

    The main flavor of the GC: set it to false for workstation GC or true for server GC. To be set in csproj file and requires rebuild.

    runtime_System_GC_Concurrent

    categorical

    boolean

    true

    true, false

    yes

    Configures whether background (concurrent) garbage collection is enabled (setting to true). To be set in csproj file and requires rebuild.

    runtime_System_GC_HeapCount

    integer

    heapcount

    8

    1 → 1000

    no

    Limits the number of heaps created by the garbage collector. To be set in runtimeconfig.json in runtimeOptions: configProperties

    runtime_System_GC_CpuGroup

    categorical

    boolean

    0

    1, 0

    no

    Configures whether the garbage collector uses CPU groups or not. Default is false. To be set in runtimeconfig.json

    runtime_System_GC_NoAffinitize

    categorical

    boolean

    false

    true, false

    no

    Specifies whether to affinitize garbage collection threads with processors. To affinitize a GC thread means that it can only run on its specific CPU. To be set in runtimeconfig.json in runtimeOptions: configProperties

    runtime_System_GC_HeapHardLimit

    integer

    bytes

    20971520

    16777216 → 1099511627776

    no

    Specifies the maximum commit size, in bytes, for the GC heap and GC bookkeeping. To be set in runtimeconfig.json in runtimeOptions: configProperties

    runtime_System_GC_HeapHardLimitPercent

    real

    percent

    0.75

    0.1 → 100.0

    no

    Specifies the allowable GC heap usage as a percentage of the total physical memory. To be set in runtimeconfig.json in runtimeOptions: configProperties.

    runtime_System_GC_HighMemoryPercent

    integer

    bytes

    20971520

    16777216 → 1099511627776

    no

    Specify the memory threshold that triggers the execution of a garbage collection. To be set in runtimeconfig.json.

    runtime_System_GC_RetainVM

    categorical

    boolean

    false

    true, false

    no

    Configures whether segments that should be deleted are put on a standby list for future use or are released back to the operating system (OS). Default is false. To be set in runtimeconfig.json in runtimeOptions: configProperties

    runtime_System_GC_LOHThreshold

    integer

    bytes

    85000

    850000 → 1099511627776

    no

    Specifies the threshold size, in bytes, that causes objects to go on the large object heap (LOH). To be set in runtimeconfig.json in runtimeOptions: configProperties

    webconf_maxconnection

    integer

    connections

    2

    2 → 1000

    no

    This setting controls the maximum number of outgoing HTTP connections that you can initiate from a client. To be set in web.config (target app only) or machine.config (global)

    webconf_maxIoThreads

    integer

    threads

    20

    20 → 1000

    no

    Controls the maximum number of I/O threads in the .NET thread pool. Automatically multiplied by the number of available CPUs. To be set in web.config (target app only) or machine.config (global). It requires autoConfig=false

    webconf_minIoThreads

    integer

    threads

    20

    20 → 1000

    no

    The minIoThreads setting enables you to configure a minimum number of worker threads and I/O threads for load conditions. To be set in web.config (target app only) or machine.config (global). It requires autoConfig=false

    webconf_maxWorkerThreads

    integer

    threads

    20

    20 → 1000

    no

    This setting controls the maximum number of worker threads in the thread pool. This number is then automatically multiplied by the number of available CPUs. To be set in web.config (target app only) or machine.config (global). It requires autoConfig=false

    webconf_minWorkerThreads

    integer

    threads

    20

    20 → 1000

    no

    The minWorkerThreads setting enables you to configure a minimum number of worker threads and I/O threads for load conditions. To be set in web.config (target app only) or machine.config (global). It requires autoConfig=false

    webconf_minFreeThreads

    integer

    threads

    8

    8 → 800

    no

    Used by the worker process to queue all the incoming requests if the number of available threads in the thread pool falls below its value. To be set in web.config (target app only) or machine.config (global). It requires autoConfig=false

    webconf_minLocalRequestFreeThreads

    integer

    threads

    4

    4 → 7600

    no

    Used to queue requests from localhost (where a Web application sends requests to a local Web service) if the number of available threads falls below it. To be set in web.config (target app only) or machine.config (global). It requires autoConfig=false

    webconf_autoConfig

    categorical

    boolean

    true

    true, false

    no

    Enables automatic configuration of the system.web settings. To be set in web.config (target app only) or machine.config (global)
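The webconf_* parameters above map to .NET Framework configuration elements. A hedged sketch of a web.config/machine.config fragment (all values are illustrative; the thread-pool settings only take effect with autoConfig="false"):

```xml
<configuration>
  <system.net>
    <connectionManagement>
      <!-- maxconnection: outgoing HTTP connections per client -->
      <add address="*" maxconnection="100" />
    </connectionManagement>
  </system.net>
  <system.web>
    <processModel autoConfig="false"
                  maxWorkerThreads="100" maxIoThreads="100"
                  minWorkerThreads="40" minIoThreads="30" />
    <httpRuntime minFreeThreads="88" minLocalRequestFreeThreads="76" />
  </system.web>
</configuration>
```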


    NeoLoadWeb Operator

    The NeoLoadWeb operator allows piloting performance tests on a target system by leveraging the Tricentis NeoLoad Web solution.

    Once triggered, this operator configures and starts the execution of a NeoLoad test run on the remote endpoint. If the test cannot run, the operator blocks the Akamas workflow and raises an error.

    Operator arguments

    This operator requires five pieces of information to successfully pilot performance tests within Akamas:

    1. The location of a .zip archive (project file) containing the definition of the performance test. This location can be a URL accessible via HTTP/HTTPS or a file path accessible via SFTP. Otherwise, the unique identifier of a previously uploaded project must be provided.

    2. The name of the scenario to be used for the test

    3. The URL of the NeoLoad Web API (either on-premise or SaaS)

    When a projectFile is specified, the operator uploads the provided project to NeoLoad and launches the specified scenario. After the execution of the scenario, the project is deleted from NeoLoad. When a projectId is specified, the operator expects the project to be already available on NeoLoad. Please refer to the NeoLoad official documentation on how to upload a project and obtain a project ID.
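As a minimal sketch (the UUID and scenario name are illustrative), a task referencing a previously uploaded project by its ID might look like:

```yaml
name: task1
operator: NeoLoadWeb
arguments:
  projectId: 123e4567-e89b-12d3-a456-426614174000
  scenarioName: scenario1
  accountToken: "ACCOUNT TOKEN HERE"
```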

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    ProjectFile structure and arguments

    The projectFile argument needs to be specified differently depending on the protocol used to get the specification of the performance test:

    • HTTP/HTTPS

    • SSH (SFTP)

    HTTP/HTTPS

    Here follows the structure of the projectFile argument in the case in which HTTP/HTTPS is used to get the specification of the performance test:

    with its arguments:

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    SSH (SFTP)

    Here follows the structure of the projectFile argument in the case in which SFTP is used to get the specification of the performance test.

    with its arguments:

    Type
    Value Restrictions
    Required
    Default
    Description

    component structure and arguments

    The component argument can be used to refer to a component by name and use its properties as the arguments of the operator.

    Component property to Operator argument mapping

    Component property
    Operator argument

    Examples

    Without component argument

    With component argument

  • The URL of the NeoLoad Web API for uploading project files

  • The account token used to access the NeoLoad Web APIs

  • The name of the scenario to be used for the performance test piloted by Akamas

    projectId

    String

    It should be a valid UUID

    No, if a projectFile is already defined

    The identifier of a previously uploaded project file. Has precedence over projectFile

    projectFile

    Object

    It should have a structure like the one described here below

    No, if a projectId is already defined

    The specification of the strategy to be used to get the archive containing the specification of the performance test to be piloted by Akamas. When both are defined, projectId takes precedence.

    neoloadProjectFilesApi

    String

    It should be a valid URL or IP

    No

    The address of the API to be used to upload project files to NeoLoad Web

    neoloadApi

    String

    It should be a valid URL or IP

    No

    The address of the Neotys' NeoLoad Web API

    lgZones

    String

    Comma-separated list of zones and number of LG

    No

    The list of LG zone ids with the number of LGs. Example: "ZoneId1:10,ZoneId2:5". If empty, the default zone will be used with one LG.

    controllerZoneId

    String

    A controller zone Id

    No

    The controller zone Id. If empty, the default zone will be used.

    component

    String

    It should match the name of an existing component of the System under test

    No

    The name of the component whose properties can be used as arguments of the operator.

    accountToken

    String

    It should match an existing access token registered with NeoLoad Web

    No, if specified in the component. See example below

    The token to be used to authenticate requests against the NeoLoad Web APIs

    The URL of the project file

    verifySSL

    Boolean

    No

    true

    Whether the HTTPS connection should be verified using the certificates available on the machine on which the operator is running

    SSH host address

    username

    String

    Yes

    SSH login username

    password

    String

    No. Either password or key should be provided

    SSH login password

    sshPort

    Number (integer)

    1≤sshPort≤65532

    22

    SSH port

    key

    String

    No. Either password or key should be provided

    SSH login key, provided either directly as its value or as the path of the file to import it from. The operator supports RSA and DSA keys.

    path

    String

    It should be a valid path on the SSH host machine

    Yes

    The path of the project file

    scenarioName

    scenarioName

    controllerZoneId

    controllerZoneId

    lgZones

    lgZones

    deleteProjectAfterTest

    deleteProjectAfterTest

    url

    projectFile->http->url

    verifySSL

    projectFile->http->verifySSL

    hostname

    projectFile->ssh->hostname

    username

    projectFile->ssh->username

    password

    projectFile->ssh->password

    key

    projectFile->ssh->key

    sshPort

    projectFile->ssh->sshPort

    path

    projectFile->ssh->path

    scenarioName

    String

    It should match an existing scenario in the project file. It can be retrieved from the "runtime" section of your NeoLoad Controller.

    No, if the component whose name is defined in component has a property that maps to scenarioName

    url

    String

    It should be a valid URL or IP

    Yes

    hostname

    String

    It should be a valid SSH host address

    Yes

    neoloadProjectFilesApi

    neoloadProjectFilesApi

    neoloadApi

    neoloadApi

    accountToken

    accountToken

    NeoLoad official documentation

    # ...
    projectFile:
        http:
            url: http://url_of_project_file
    projectFile:
      ssh:
        hostname: this_is_a_hostname
        username: this_is_a_username
        sshPort: 22
        key: this_is_a_key
        path: /path/to/project/file
    name: task1
    operator: NeoLoadWeb
    arguments:
      projectFile:
        ssh:
          hostname: akamas-machine-1
          username: akamas
          key: |-
            -----BEGIN RSA PRIVATE KEY-----
            RSA KEY HERE
            -----END RSA PRIVATE KEY-----
          path: projects/project1.zip
      scenarioName: scenario1
      accountToken: "ACCOUNT TOKEN HERE"
    name: task1
    operator: NeoLoadWeb
    arguments:
      component: component1
      accountToken: "ACCOUNT TOKEN HERE"
    https://neoload-files.saas.neotys.com
    https://neoload-api.saas.neotys.com
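The lgZones argument described earlier uses a comma-separated "ZoneId:count" string (e.g. "ZoneId1:10,ZoneId2:5"). The following sketch shows how that format decomposes; the helper function is hypothetical and not part of the operator:

```python
def parse_lg_zones(lg_zones: str) -> dict[str, int]:
    """Parse an lgZones string such as "ZoneId1:10,ZoneId2:5" into a mapping."""
    if not lg_zones:
        # An empty value means: use the default zone with one LG
        return {}
    result = {}
    for entry in lg_zones.split(","):
        zone_id, count = entry.split(":")
        result[zone_id.strip()] = int(count)
    return result

# parse_lg_zones("ZoneId1:10,ZoneId2:5") → {"ZoneId1": 10, "ZoneId2": 5}
```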

    Node JS 18

    This page describes the Optimization Pack for the component type NodeJS.

    Metrics

    Metric
    Unit
    Description

    Parameters

    Parameter
    Type
    Unit
    Default Value
    Domain
    restart
    Description

    SparkSSHSubmit Operator

    The SparkSSHSubmit operator connects to a Spark instance by invoking spark-submit on a machine reachable via SSH.

    Operator arguments

    Name
    Type
    Value Restrictions
    Required
    Default
    Description

    Get operator arguments from component

    This operator automatically maps some properties of its component to some arguments. If the mapped arguments are already provided to the operator, they are not overridden.

    Component property to operator argument mapping

    percent

    The average memory utilization %

    nodejs_gc_heap_used

    bytes

    GC heap used

    nodejs_rss

    bytes

    Process Resident Set Size (RSS)

    nodejs_v8_heap_total

    bytes

    V8 heap total

    nodejs_v8_heap_used

    bytes

    V8 heap used

    nodejs_number_active_threads

    threads

    Number of active threads

    nodejs_suspension_time

    percent

    Suspension time %

    nodejs_active_handles

    handles

    Number of active libuv handles grouped by handle type. Each handle type is a C++ class name

    nodejs_active_handles_total

    handles

    Total number of active handles

    nodejs_active_requests

    requests

    Number of active libuv requests grouped by request type. Each request type is a C++ class name

    nodejs_active_requests_total

    requests

    Total number of active requests

    nodejs_eventloop_lag_max_seconds

    seconds

    The maximum recorded event loop delay

    nodejs_eventloop_lag_mean_seconds

    seconds

    The mean of the recorded event loop delays

    nodejs_eventloop_lag_min_seconds

    seconds

    The minimum recorded event loop delay

    nodejs_eventloop_lag_p50_seconds

    seconds

    The 50th percentile of the recorded event loop delays

    nodejs_eventloop_lag_p90_seconds

    seconds

    The 90th percentile of the recorded event loop delays

    nodejs_eventloop_lag_p99_seconds

    seconds

    The 99th percentile of the recorded event loop delays

    nodejs_eventloop_lag_seconds

    seconds

    Lag of event loop in seconds

    nodejs_external_memory_bytes

    bytes

    NodeJS external memory size in bytes

    nodejs_gc_duration_seconds_bucket

    seconds

    The total count of observations for a bucket in the histogram. Garbage collection duration by kind, one of major, minor, incremental or weakcb

    nodejs_gc_duration_seconds_count

    seconds

    The total number of observations for Garbage collection duration by kind, one of major, minor, incremental or weakcb

    nodejs_gc_duration_seconds_sum

    seconds

    The total sum of observations for Garbage collection duration by kind, one of major, minor, incremental or weakcb

    nodejs_heap_size_total_bytes

    bytes

    Process heap size from NodeJS in bytes

    nodejs_heap_size_used_bytes

    bytes

    Process heap size used from NodeJS in bytes

    nodejs_heap_space_size_available_bytes

    bytes

    Process heap size available from NodeJS in bytes

    nodejs_heap_space_size_total_bytes

    bytes

    Process heap space size total from NodeJS in bytes

    nodejs_heap_space_size_used_bytes

    bytes

    Process heap space size used from NodeJS in bytes

    process_cpu_seconds_total

    seconds

    Total user and system CPU time spent in seconds

    process_cpu_system_seconds_total

    seconds

    Total system CPU time spent in seconds

    process_cpu_user_seconds_total

    seconds

    Total user CPU time spent in seconds

    process_heap_bytes

    bytes

    Process heap size in bytes

    process_max_fds

    fds

    Maximum number of open file descriptors

    process_open_fds

    fds

    Number of open file descriptors

    process_resident_memory_bytes

    bytes

    Resident memory size in bytes

    process_virtual_memory_bytes

    bytes

    Virtual memory size in bytes

    --allocation-site-pretenuring, --no-allocation-site-pretenuring

    yes

    Pretenure with allocation sites

    v8_min_semi_space_size

    integer

    megabytes

    0

    0 → 1048576

    yes

    Min size of a semi-space (in MBytes), the new space consists of two semi-spaces

    v8_max_semi_space_size

    integer

    megabytes

    0

    0 → 1048576

    yes

    Max size of a semi-space (in MBytes), the new space consists of two semi-spaces. This parameter is equivalent to v8_max_semi_space_size_ordinal.

    v8_max_semi_space_size_ordinal

    ordinal

    megabytes

    16

    2, 4, 6, 8, 16, 32, 64, 128, 256, 512, 1024, 2048

    yes

    Max size of a semi-space (in MBytes), the new space consists of two semi-spaces. This parameter is equivalent to v8_max_semi_space_size but forces power of 2 values.

    v8_semi_space_grouth_factor

    integer

    2

    0 → 100

    yes

    Factor by which to grow the new space

    v8_max_old_space_size

    integer

    megabytes

    0

    0 → 1048576

    yes

    Max size of the old space (in Mbytes)

    v8_max_heap_size

    integer

    megabytes

    0

    0 → 1048576

    yes

    Max size of the heap (in MBytes); both max_semi_space_size and max_old_space_size take precedence. All three flags cannot be specified at the same time.

    v8_initial_heap_size

    integer

    megabytes

    0

    0 → 1048576

    yes

    Initial size of the heap (in Mbytes)

    v8_initial_old_space_size

    integer

    megabytes

    0

    0 → 1048576

    yes

    Initial old space size (in Mbytes)

    v8_parallel_scavenge

    categorical

    --parallel-scavenge

    --parallel-scavenge, --no-parallel-scavenge

    yes

    Parallel scavenge

    v8_scavenge_task_trigger

    integer

    80

    1 → 100

    yes

    Scavenge task trigger in percent of the current heap limit

    v8_scavenge_separate_stack_scanning

    categorical

    --no-scavenge-separate-stack-scanning

    --scavenge-separate-stack-scanning, --no-scavenge-separate-stack-scanning

    yes

    Use a separate phase for stack scanning in scavenge

    v8_concurrent_marking

    categorical

    --concurrent-marking

    --concurrent-marking, --no-concurrent-marking

    yes

    Use concurrent marking

    v8_parallel_marking

    categorical

    --parallel-marking

    --parallel-marking, --no-parallel-marking

    yes

    Use parallel marking in atomic pause

    v8_concurrent_sweeping

    categorical

    --concurrent-sweeping

    --concurrent-sweeping, --no-concurrent-sweeping

    yes

    Use concurrent sweeping

    v8_heap_growing_percent

    integer

    0

    0 → 99

    yes

    Specifies heap growing factor as (1 + heap_growing_percent/100)

    v8_os_page_size

    integer

    kilobytes

    0

    0 → 1048576

    yes

    Override OS page size (in KBytes)

    v8_stack_size

    integer

    kilobytes

    984

    16 → 1048576

    yes

    Default size of stack region v8 is allowed to use (in kBytes)

    v8_single_threaded

    categorical

    --no-single-threaded

    --single-threaded, --no-single-threaded

    yes

    Disable the use of background tasks

    v8_single_threaded_gc

    categorical

    --no-single-threaded-gc

    --single-threaded-gc, --no-single-threaded-gc

    yes

    Disable the use of background gc tasks
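The growth formula quoted for v8_heap_growing_percent, (1 + heap_growing_percent/100), can be made concrete with a small sketch (the helper name and example figures are illustrative, not part of the optimization pack):

```python
def heap_growing_factor(heap_growing_percent: int) -> float:
    # Per the parameter description, V8 grows the heap limit by
    # a factor of (1 + heap_growing_percent / 100)
    return 1 + heap_growing_percent / 100

# Illustrative: with heap_growing_percent=25, a 400 MB heap limit grows toward 500 MB
new_limit_mb = 400 * heap_growing_factor(25)
```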

    cpu_used

    CPUs

    The total amount of CPUs used

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    memory_used

    bytes

    The total amount of memory used

    v8_allocation_size_pretenuring

    categorical

    --allocation-site-pretenuring

    memory_util

    Spark application to submit (jar or python file)

    args

    List of Strings, Numbers or Booleans

    Yes

    Additional application arguments

    master

    String

    It should be a valid supported Master URL:

    • local

    • local[K]

    • local[K,F]

    Yes

    The master URL for the Spark cluster

    deployMode

    client, cluster

    No

    cluster

    Whether to launch the driver locally (client) or in the cluster (cluster)

    className

    String

    No

    The entry point of the java application. Required for java applications.

    name

    String

    No

    Name of the task. When submitted, the IDs of the study, experiment, and trial will be appended.

    jars

    List of Strings

    Each item of the list should be a path that matches an existing jar file

    No

    A list of jars to be added in the classpath.

    pyFiles

    List of Strings

    Each item of the list should be a path that matches an existing python file

    No

    A list of python scripts to be added to the PYTHONPATH

    files

    List of Strings

    Each item of the list should be a path that matches an existing file

    No

    A list of files to be added to the context of the spark-submit command

    conf

    Object (key-value pairs)

    No

    Mapping containing additional Spark configurations. See Spark documentation.

    envVars

    Object (key-value pairs)

    No

    Env variables when running the spark-submit command

    verbose

    Boolean

    No

    true

    Whether additional debugging output should be produced

    sparkSubmitExec

    String

    It should be a path that matches an existing executable

    No

    The default for the Spark installation

    The path of the spark-submit executable command

    sparkHome

    String

    It should be a path that matches an existing directory

    No

    The default for the Spark installation

    The path of the SPARK_HOME

    proxyUser

    String

    No

    The user to be used to execute Spark applications

    hostname

    String

    It should be a valid SSH host address

    No, if the Component whose name is defined in component has a property named hostname

    SSH host address

    username

    String

    No, if the Component whose name is defined in component has a property named username

    SSH login username

    sshPort

    Number

    1≤sshPort≤65532

    No

    22

    SSH port

    password

    String

    Cannot be set if key is already set

    No, if the Component whose name is defined in component has a property named password

    SSH login password

    key

    String

    Cannot be set if password is already set

    No, if the Component whose name is defined in component has a property named key

    SSH login key, provided either directly as its value or as the path of the file to import it from. The operator supports RSA and DSA keys.

    component

    String

    It should match the name of an existing Component of the System under test

    Yes

    The name of the Component whose properties can be used as arguments of the operator

    key

    key

    Type

    Values restrictions

    Required

    Default

    Description

    file

    String

    It should be a path to a valid java or python spark application file

    Yes

    hostname

    hostname

    username

    username

    sshPort

    sshPort

    password

    password
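Putting the arguments and the component mapping together, a workflow task using this operator might look like the following sketch (the component name, file paths, and Spark settings are illustrative; SSH connection details are assumed to come from the component's properties):

```yaml
name: run-spark-job
operator: SparkSSHSubmit
arguments:
  component: sparkcluster1
  file: /opt/jobs/app.py
  master: yarn
  deployMode: cluster
  args:
    - /data/input.csv
  conf:
    spark.executor.memory: 4g
```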

    RHEL 7

    This page describes the Optimization Pack for the component type RHEL 7.

    Metrics

    CPU

    Metric
    Unit
    Description

    Memory

    Metric
    Unit
    Description

    Network

    Metric
    Unit
    Description

    Disk

    Notice: you can use a device custom filter to monitor a specific disk with Prometheus. You can find more information on Prometheus queries and the %FILTERS% placeholder in the Prometheus telemetry provider documentation.

    Metric
    Unit
    Description

    Filesystem

    Metric
    Unit
    Description

    Other metrics

    Metric
    Unit
    Description

    Parameters

    CPU

    Parameter
    Default Value
    Domain
    Description

    Memory

    Parameter
    Default Value
    Domain
    Description

    Network

    Parameter
    Default value
    Domain
    Description

    Storage

    Parameter
    Default value
    Domain
    Description

    CentOS 7

    This page describes the Optimization Pack for the component type CentOS 7.

    Metrics

    CPU

    Metric
    Unit
    Description

    Memory

    Metric
    Unit
    Description

    Network

    Metric
    Unit
    Description

    Disk

    Notice: you can use a device custom filter to monitor a specific disk with Prometheus. You can find more information on Prometheus queries and the %FILTERS% placeholder in the Prometheus telemetry provider documentation.

    Metric
    Unit
    Description

    Filesystem

    Metric
    Unit
    Description

    Other metrics

    Metric
    Unit
    Description

    Parameters

    CPU

    Parameter
    Default Value
    Domain
    Description

    Memory

    Parameter
    Default Value
    Domain
    Description

    Network

    Parameter
    Default value
    Domain
    Description

    Storage

    Parameter
    Default value
    Domain
    Description

    Constraints

    There are no general constraints among CentOS 7 parameters.

    Ubuntu 20.04

    This page describes the Optimization Pack for the component type Ubuntu 20.04.

    Metrics

    CPU

    Metric
    Unit
    Description

    Memory

    Metric
    Unit
    Description

    Network

    Metric
    Unit
    Description

    Disk

    Notice: you can use a device custom filter to monitor a specific disk with Prometheus. You can find more information on Prometheus queries and the %FILTERS% placeholder in the Prometheus telemetry provider documentation.

    Metric
    Unit
    Description

    Filesystem

    Metric
    Unit
    Description

    Other metrics

    Metric
    Unit
    Description

    Parameters

    CPU

    Parameter
    Default Value
    Domain
    Description

    Memory

    Parameter
    Default Value
    Domain
    Description

    Network

    Parameter
    Default value
    Domain
    Description

    Storage

    Parameter
    Default value
    Domain
    Description

    Ubuntu 16.04

    This page describes the Optimization Pack for the component type Ubuntu 16.04.

    Metrics

    CPU

    Metric
    Unit
    Description

    Memory

    Metric
    Unit
    Description

    Network

    Metric
    Unit
    Description

    Disk

    Notice: you can use a device custom filter to monitor a specific disk with Prometheus. You can find more information on Prometheus queries and the %FILTERS% placeholder in the Prometheus telemetry provider documentation.

    Metric
    Unit
    Description

    Filesystem

    Metric
    Unit
    Description

    Other metrics

    Metric
    Unit
    Description

    Parameters

    CPU

    Parameter
    Default Value
    Domain
    Description

    Memory

    Parameter
    Default Value
    Domain
    Description

    Network

    Parameter
    Default value
    Domain
    Description

    Storage

    Parameter
    Default value
    Domain
    Description

    RHEL 8

    This page describes the Optimization Pack for the component type RHEL 8.

    Metrics

    CPU

    Metric
    Unit
    Description

    Memory

    Metric
    Unit
    Description

    Network

    Metric
    Unit
    Description

    Disk

    Notice: you can use a device custom filter to monitor a specific disk with Prometheus. You can find more information on Prometheus queries and the %FILTERS% placeholder in the Prometheus telemetry provider documentation.

    Metric
    Unit
    Description

    Filesystem

    Metric
    Unit
    Description

    Other metrics

    Metric
    Unit
    Description

    Parameters

    CPU

    Parameter
    Default Value
    Domain
    Description

    Memory

    Parameter
    Default Value
    Domain
    Description

    Network

    Parameter
    Default value
    Domain
    Description

    Storage

    Parameter
    Default value
    Domain
    Description

    CentOS 8

    This page describes the Optimization Pack for the component type CentOS 8.

    Metrics

    CPU

    4096, 8192, 16384, 32768

  • local[*]

  • local[*,F]

  • spark://HOST:PORT

  • spark://HOST1:PORT1,HOST2:PORT2

  • yarn

    cpu_util_details

    percent

    The average CPU utilization % broken down by usage type and CPU number (e.g., cpu1 user, cpu2 system, cpu3 soft-irq)

    cpu_load_avg

    tasks

    The system load average (i.e., the number of active tasks in the system)

    mem_util_details

    percent

    The memory utilization % (i.e., the % of memory used) broken down by usage type (e.g., active memory)

    mem_used

    bytes

    The total amount of memory used

    mem_used_nocache

    bytes

    The total amount of memory used without considering memory reserved for caching purposes

    mem_total

    bytes

    The total amount of installed memory

    mem_fault_minor

    faults/s

    The number of minor memory faults (i.e., faults that do not cause disk access) per second

    mem_fault_major

    faults/s

    The number of major memory faults (i.e., faults that cause disk access) per second

    mem_fault

    faults/s

    The number of memory faults (major + minor)

    mem_swapins

    pages/s

    The number of memory pages swapped in per second

    mem_swapouts

    pages/s

    The number of memory pages swapped out per second

    network_out_bytes_details

    bytes/s

    The number of outbound network packets in bytes per second broken down by network device (e.g., eth01)

    disk_util_details

    percent

    The utilization % of disk, i.e., how much time a disk is busy doing work, broken down by disk (e.g., disk D://)

    disk_iops_writes

    ops/s

    The average number of IO disk-write operations per second across all disks

    disk_iops_reads

    ops/s

    The average number of IO disk-read operations per second across all disks

    disk_iops

    ops/s

    The average number of IO disk operations per second across all disks

    disk_response_time_read

    seconds

    The average response time of IO read-disk operations

    disk_response_time_worst

    seconds

    The average response time of IO disk operations of the slowest disk

    disk_response_time_write

    seconds

    The average response time of IO write-disk operations

    disk_response_time_details

    ops/s

    The average response time of IO disk operations broken down by disk (e.g., disk /dev/nvme01 )

    disk_iops_details

    ops/s

    The number of IO disk operations per second broken down by disk (e.g., disk /dev/nvme01)

    disk_io_inflight_details

    ops

    The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01)

    disk_write_bytes

    bytes/s

    The number of bytes per second written across all disks

    disk_read_bytes

    bytes/s

    The number of bytes per second read across all disks

    disk_read_write_bytes

    bytes/s

    The number of bytes per second read and written across all disks

    disk_write_bytes_details

    bytes/s

    The number of bytes per second written from the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE)

    disk_read_bytes_details

    bytes/s

    The number of bytes per second read from the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation READ)

    filesystem_size

    bytes

    The size of filesystems broken down by type and device (e.g., filesystem of type ext4 for device /dev/nvme01)

    400000→40000000 ns

    Scheduler Wakeup Granularity (in nanoseconds)

    os_CPUSchedMigrationCost

    500000 ns

    100000→5000000 ns

    Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations

    os_CPUSchedChildRunsFirst

    0

    0→1

    A freshly forked child runs before the parent continues execution

    os_CPUSchedLatency

    18000000 ns

    2400000→240000000 ns

    Targeted preemption latency (in nanoseconds) for CPU bound tasks

    os_CPUSchedAutogroupEnabled

    1

    0→1

    Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads

    os_CPUSchedNrMigrate

    32

    3→320

    Scheduler NR Migrate

    10→100 %

    VFS Cache Pressure

    os_MemoryVmMinFree

    67584 KB

    10240→1024000 KB

    Minimum Free Memory

    os_MemoryVmDirtyRatio

    20 %

    1→99 %

    When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write

    os_MemoryVmDirtyBackgroundRatio

    10 %

    1→99 %

    When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background

    os_MemoryTransparentHugepageEnabled

    always

    always, never

    Transparent Hugepage Enablement

    os_MemoryTransparentHugepageDefrag

    always

    always, never

    Transparent Hugepage Enablement Defrag

    os_MemorySwap

    swapon

    swapon, swapoff

    Memory Swap

    os_MemoryVmDirtyExpire

    3000 centisecs

    300→30000 centisecs

    Memory Dirty Expiration Time

    os_MemoryVmDirtyWriteback

    500 centisecs

    50→5000 centisecs

    Memory Dirty Writeback

    100→10000 packets

    Network Max Backlog

    os_NetworkNetIpv4TcpMaxSynBacklog

    1024 packets

    52→15120 packets

    Network IPV4 Max Sync Backlog

    os_NetworkNetCoreNetdevBudget

    300 packets

    30→3000 packets

    Network Budget

    os_NetworkNetCoreRmemMax

    212992 bytes

    21299→2129920 bytes

    Maximum network receive buffer size that applications can request

    os_NetworkNetCoreWmemMax

    21299→2129920 bytes

    21299→2129920 bytes

    Maximum network transmit buffer size that applications can request

    os_NetworkNetIpv4TcpSlowStartAfterIdle

    1

    0→1

    Network Slow Start After Idle Flag

    os_NetworkNetIpv4TcpFinTimeout

    60

    6 →600 seconds

    Network TCP timeout

    os_NetworkRfs

    0

    0→131072

    If enabled increases datacache hitrate by steering kernel processing of packets to the CPU where the application thread consuming the packet is running

    100→10000 packets

    Network Max Backlog

    os_StorageRqAffinity

    1

    1→2

    Storage Requests Affinity

    os_StorageQueueScheduler

    none

    none kyber

    Storage Queue Scheduler Type

    os_StorageNomerges

    0

    0→2

    Enables the user to disable the lookup logic involved with IO merging requests in the block layer. By default (0) all merges are enabled. With 1 only simple one-hit merges will be tried. With 2 no merge algorithms will be tried

    os_StorageMaxSectorsKb

    128 KB

    32→128 KB

    The largest IO size that the OS c

    cpu_num

    CPUs

    The number of CPUs available in the system (physical and logical)

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    mem_util

    percent

    The memory utilization % (i.e, the % of memory used)

    mem_util_nocache

    percent

    The memory utilization % (i.e., the % of memory used) without considering memory reserved for caching purposes

    network_tcp_retrans

    retrans/s

    The number of network TCP retransmissions per second

    network_in_bytes_details

    bytes/s

    The number of inbound network packets in bytes per second broken down by network device (e.g., wlp4s0)

    disk_swap_util

    percent

    The average space utilization % of swap disks

    disk_swap_used

    bytes

    The total amount of space used by swap disks

    filesystem_util

    percent

    The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1)

    filesystem_used

    bytes

    The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01)

    proc_blocked

    processes

    The number of processes blocked (e.g, for IO or swapping reasons)

    os_context_switch

    switches/s

    The number of context switches per second

    os_cpuSchedMinGranularity

    2250000 ns

    300000→30000000 ns

    Minimal preemption granularity (in nanoseconds) for CPU bound tasks

    os_cpuSchedWakeupGranularity

    os_MemorySwappiness

    1

    0→100

    Memory Swappiness

    os_MemoryVmVfsCachePressure

    os_NetworkNetCoreSomaxconn

    128 connections

    12→1200 connections

    Network Max Connections

    os_NetworkNetCoreNetdevMaxBacklog

    os_StorageReadAhead

    128 KB

    0→1024 KB

    Read-ahead speeds up file access by pre-fetching data and loading it into the page cache so that it can be available earlier in memory instead of from disk

    os_StorageNrRequests

    Prometheus provider
    Prometheus provider metrics mapping

    3000000 ns

    100 %

    1000 packets

    1000 packets

    cpu_util_details

    percent

    The average CPU utilization % broken down by usage type and cpu number (e.g., cp1 user, cp2 system, cp3 soft-irq)

    cpu_load_avg

    tasks

    The system load average (i.e., the number of active tasks in the system)

    mem_util_details

    percent

    The memory utilization % (i.e., the % of memory used) broken down by usage type (e.g., active memory)

    mem_used

    bytes

    The total amount of memory used

    mem_used_nocache

    bytes

    The total amount of memory used without considering memory reserved for caching purposes

    mem_total

    bytes

    The total amount of installed memory

    mem_fault_minor

    faults/s

    The number of minor memory faults (i.e., faults that do not cause disk access) per second

    mem_fault_major

    faults/s

    The number of major memory faults (i.e., faults that cause disk access) per second

    mem_fault

    faults/s

    The number of memory faults (major + minor)

    mem_swapins

    pages/s

    The number of memory pages swapped in per second

    mem_swapouts

    pages/s

    The number of memory pages swapped out per second

    network_out_bytes_details

    bytes/s

    The number of outbound network packets in bytes per second broken down by network device (e.g., eth01)

    disk_util_details

    percent

    The utilization % of disk, i.e how much time a disk is busy doing work broken down by disk (e.g., disk D://)

    disk_iops_writes

    ops/s

    The average number of IO disk-write operations per second across all disks

    disk_iops_reads

    ops/s

    The average number of IO disk-read operations per second across all disks

    disk_iops

    ops/s

    The average number of IO disk operations per second across all disks

    disk_response_time_read

    seconds

    The average response time of IO read-disk operations

    disk_response_time_worst

    seconds

    The average response time of IO disk operations of the slowest disk

    disk_response_time_write

    seconds

    The average response time of IO write-disk operations

    disk_response_time_details

    ops/s

    The average response time of IO disk operations broken down by disk (e.g., disk /dev/nvme01 )

    disk_iops_details

    ops/s

    The number of IO disk-write operations of per second broken down by disk (e.g., disk /dev/nvme01)

    disk_io_inflight_details

    ops

    The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01)

    disk_write_bytes

    bytes/s

    The number of bytes per second written across all disks

    disk_read_bytes

    bytes/s

    The number of bytes per second read across all disks

    disk_read_write_bytes

    bytes/s

    The number of bytes per second read and written across all disks

    disk_write_bytes_details

    bytes/s

    The number of bytes per second written from the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE)

    disk_read_bytes_details

    bytes/s

    The number of bytes per second read from the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation READ)

    filesystem_size

    bytes

    The size of filesystems broken down by type and device (e.g., filesystem of type ext4 for device /dev/nvme01)

    400000→40000000 ns

    Scheduler Wakeup Granularity (in nanoseconds)
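The ranges above use a `low→high unit` notation (e.g., `300000→30000000 ns`), where the unit may be absent for flag-like parameters such as `0→1`. As a quick illustration of this notation (not part of Akamas itself), a small parser might look like:

```python
import re

def parse_range(text):
    """Parse a 'low→high unit' range string, e.g. '300000→30000000 ns',
    into a (low, high, unit) tuple; unit is None when absent (e.g. '0→1')."""
    m = re.match(r"\s*(\d+)\s*→\s*(\d+)\s*(.*)", text)
    if not m:
        raise ValueError(f"not a numeric range: {text!r}")
    low, high = int(m.group(1)), int(m.group(2))
    unit = m.group(3).strip() or None
    return low, high, unit

print(parse_range("300000→30000000 ns"))  # → (300000, 30000000, 'ns')
print(parse_range("0→1"))                 # → (0, 1, None)
```

Categorical ranges such as `always, never` or `swapon, swapoff` list the admissible values instead and would need separate handling.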
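The `mem_used_nocache` and `mem_util_nocache` metrics exclude memory the kernel only holds for caching. As an illustration of the idea, a sketch using the conventional "used = total − free − buffers − cached" formula (an assumption for clarity, not necessarily how the telemetry provider computes these metrics):

```python
def mem_used_nocache(total, free, buffers, cached):
    # Conventional 'used without cache' formula (assumed for illustration);
    # all values in bytes, as reported by /proc/meminfo-style counters.
    return total - free - buffers - cached

def mem_util_nocache(total, free, buffers, cached):
    # Utilization % corresponding to mem_used_nocache.
    return 100.0 * mem_used_nocache(total, free, buffers, cached) / total

# Example with made-up numbers (bytes):
total, free, buffers, cached = 16_000_000_000, 4_000_000_000, 500_000_000, 3_500_000_000
print(mem_used_nocache(total, free, buffers, cached))  # 8000000000
print(mem_util_nocache(total, free, buffers, cached))  # 50.0
```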

| Parameter | Default | Range | Description |
| --- | --- | --- | --- |
| os_CPUSchedMigrationCost | 500000 ns | 100000→5000000 ns | Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations |
| os_CPUSchedChildRunsFirst | 0 | 0→1 | A freshly forked child runs before the parent continues execution |
| os_CPUSchedLatency | 18000000 ns | 2400000→240000000 ns | Targeted preemption latency (in nanoseconds) for CPU-bound tasks |
| os_CPUSchedAutogroupEnabled | 1 | 0→1 | Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads |
| os_CPUSchedNrMigrate | 32 | 3→320 | Scheduler NR Migrate |
| os_cpuSchedMinGranularity | 2250000 ns | 300000→30000000 ns | Minimal preemption granularity (in nanoseconds) for CPU-bound tasks |
| os_cpuSchedWakeupGranularity | 3000000 ns | 400000→40000000 ns | Scheduler Wakeup Granularity (in nanoseconds) |
| os_MemorySwappiness | 1 | 0→100 | Memory Swappiness |
| os_MemoryVmVfsCachePressure | 100 % | 10→100 % | VFS Cache Pressure |
| os_MemoryVmMinFree | 67584 KB | 10240→1024000 KB | Minimum Free Memory |
| os_MemoryVmDirtyRatio | 20 % | 1→99 % | When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write |
| os_MemoryVmDirtyBackgroundRatio | 10 % | 1→99 % | When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background |
| os_MemoryTransparentHugepageEnabled | madvise | always, never, madvise | Transparent Hugepage Enablement |
| os_MemoryTransparentHugepageDefrag | madvise | always, never, madvise, defer, defer+madvise | Transparent Hugepage Defrag |
| os_MemorySwap | swapon | swapon, swapoff | Memory Swap |
| os_MemoryVmDirtyExpire | 3000 centisecs | 300→30000 centisecs | Memory Dirty Expiration Time |
| os_MemoryVmDirtyWriteback | 500 centisecs | 50→5000 centisecs | Memory Dirty Writeback |
| os_NetworkNetCoreSomaxconn | 128 connections | 12→1200 connections | Network Max Connections |
| os_NetworkNetCoreNetdevMaxBacklog | 1000 packets | 100→10000 packets | Network Max Backlog |
| os_NetworkNetIpv4TcpMaxSynBacklog | 1024 packets | 52→15120 packets | Network IPv4 Max SYN Backlog |
| os_NetworkNetCoreNetdevBudget | 300 packets | 30→3000 packets | Network Budget |
| os_NetworkNetCoreRmemMax | 212992 bytes | 21299→2129920 bytes | Maximum network receive buffer size that applications can request |
| os_NetworkNetCoreWmemMax | 212992 bytes | 21299→2129920 bytes | Maximum network transmit buffer size that applications can request |
| os_NetworkNetIpv4TcpSlowStartAfterIdle | 1 | 0→1 | Network Slow Start After Idle Flag |
| os_NetworkNetIpv4TcpFinTimeout | 60 seconds | 6→600 seconds | Network TCP FIN timeout |
| os_NetworkRfs | 0 | 0→131072 | If enabled, increases the data-cache hit rate by steering kernel processing of packets to the CPU where the application thread consuming the packet is running |
| os_StorageReadAhead | 128 KB | 0→1024 KB | Read-ahead speeds up file access by pre-fetching data and loading it into the page cache so that it can be available earlier in memory instead of from disk |
| os_StorageNrRequests | 1000 | | |
| os_StorageRqAffinity | 1 | 1→2 | Storage Requests Affinity |
| os_StorageQueueScheduler | none | none, mq-deadline | Storage Queue Scheduler Type |
| os_StorageNomerges | 0 | 0→2 | Enables the user to disable the lookup logic involved with IO merging requests in the block layer. By default (0) all merges are enabled; with 1 only simple one-hit merges will be tried; with 2 no merge algorithms will be tried |
| os_StorageMaxSectorsKb | 128 KB | 32→128 KB | The largest IO size that the OS can issue to a block device |

| Metric | Unit | Description |
| --- | --- | --- |
| cpu_num | CPUs | The number of CPUs available in the system (physical and logical) |
| cpu_util | percent | The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work) |
| cpu_util_details | percent | The average CPU utilization % broken down by usage type and CPU number (e.g., cpu1 user, cpu2 system, cpu3 soft-irq) |
| cpu_load_avg | tasks | The system load average (i.e., the number of active tasks in the system) |
| mem_util | percent | The memory utilization % (i.e., the % of memory used) |
| mem_util_details | percent | The memory utilization % broken down by usage type (e.g., active memory) |
| mem_util_nocache | percent | The memory utilization % without considering memory reserved for caching purposes |
| mem_used | bytes | The total amount of memory used |
| mem_used_nocache | bytes | The total amount of memory used without considering memory reserved for caching purposes |
| mem_total | bytes | The total amount of installed memory |
| mem_fault_minor | faults/s | The number of minor memory faults (i.e., faults that do not cause disk access) per second |
| mem_fault_major | faults/s | The number of major memory faults (i.e., faults that cause disk access) per second |
| mem_fault | faults/s | The number of memory faults (major + minor) per second |
| mem_swapins | pages/s | The number of memory pages swapped in per second |
| mem_swapouts | pages/s | The number of memory pages swapped out per second |
| network_tcp_retrans | retrans/s | The number of network TCP retransmissions per second |
| network_in_bytes_details | bytes/s | The inbound network traffic in bytes per second broken down by network device (e.g., wlp4s0) |
| network_out_bytes_details | bytes/s | The outbound network traffic in bytes per second broken down by network device (e.g., eth01) |
| disk_swap_util | percent | The average space utilization % of swap disks |
| disk_swap_used | bytes | The total amount of space used by swap disks |
| disk_util_details | percent | The utilization % of a disk (i.e., how much time the disk is busy doing work) broken down by disk (e.g., disk D://) |
| disk_iops_reads | ops/s | The average number of IO disk-read operations per second across all disks |
| disk_iops_writes | ops/s | The average number of IO disk-write operations per second across all disks |
| disk_iops | ops/s | The average number of IO disk operations per second across all disks |
| disk_iops_details | ops/s | The number of IO disk operations per second broken down by disk (e.g., disk /dev/nvme01) |
| disk_response_time_read | seconds | The average response time of IO disk-read operations |
| disk_response_time_write | seconds | The average response time of IO disk-write operations |
| disk_response_time_worst | seconds | The average response time of IO disk operations of the slowest disk |
| disk_response_time_details | seconds | The average response time of IO disk operations broken down by disk (e.g., disk /dev/nvme01) |
| disk_io_inflight_details | ops | The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01) |
| disk_read_bytes | bytes/s | The number of bytes per second read across all disks |
| disk_write_bytes | bytes/s | The number of bytes per second written across all disks |
| disk_read_write_bytes | bytes/s | The number of bytes per second read and written across all disks |
| disk_read_bytes_details | bytes/s | The number of bytes per second read broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation READ) |
| disk_write_bytes_details | bytes/s | The number of bytes per second written broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE) |
| filesystem_util | percent | The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1) |
| filesystem_used | bytes | The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01) |
| filesystem_size | bytes | The size of filesystems broken down by type and device (e.g., filesystem of type ext4 on device /dev/nvme01) |
| proc_blocked | processes | The number of processes blocked (e.g., for IO or swapping reasons) |
| os_context_switch | switches/s | The number of context switches per second |

Prometheus provider
Prometheus provider metrics mapping
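Values such as `madvise` for `os_MemoryTransparentHugepageEnabled` correspond to the Linux sysfs file `/sys/kernel/mm/transparent_hugepage/enabled`, which lists all options and marks the active one in square brackets (e.g., `always [madvise] never`). As an illustrative sketch (the parsing code is not part of Akamas), the active mode can be extracted like this:

```python
import re

def active_thp_mode(sysfs_text):
    """Return the bracketed (active) option from a THP sysfs file's content,
    e.g. 'always [madvise] never' -> 'madvise'."""
    m = re.search(r"\[(\w+)\]", sysfs_text)
    if not m:
        raise ValueError(f"no active option in {sysfs_text!r}")
    return m.group(1)

print(active_thp_mode("always [madvise] never"))  # madvise
```

The defrag file `/sys/kernel/mm/transparent_hugepage/defrag` uses the same bracketed format for its `always`/`defer`/`defer+madvise`/`madvise`/`never` options.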

    os_CPUSchedMigrationCost

    500000 ns

    100000→5000000 ns

    Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations

    os_CPUSchedChildRunsFirst

    0

    0→1

    A freshly forked child runs before the parent continues execution

    os_CPUSchedLatency

    18000000 ns

    2400000→240000000 ns

    Targeted preemption latency (in nanoseconds) for CPU bound tasks

    os_CPUSchedAutogroupEnabled

    1

    0→1

    Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads

    os_CPUSchedNrMigrate

    32

    3→320

    Scheduler NR Migrate

    10→100 %

    VFS Cache Pressure

    os_MemoryVmMinFree

    67584 KB

    10240→1024000 KB

    Minimum Free Memory

    os_MemoryVmDirtyRatio

    20 %

    1→99 %

    When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write

    os_MemoryVmDirtyBackgroundRatio

    10 %

    1→99 %

    When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background

    os_MemoryTransparentHugepageEnabled

    always

    always never

    Transparent Hugepage Enablement

    os_MemoryTransparentHugepageDefrag

    always

    always never

    Transparent Hugepage Enablement Defrag

    os_MemorySwap

    swapon

    swapon swapoff

    Memory Swap

    os_MemoryVmDirtyExpire

    3000 centisecs

    300→30000 centisecs

    Memory Dirty Expiration Time

    os_MemoryVmDirtyWriteback

    500 centisecs

    50→5000 centisecs

    Memory Dirty Writeback

    100→10000 packets

    Network Max Backlog

    os_NetworkNetIpv4TcpMaxSynBacklog

    1024 packets

    52→15120 packets

    Network IPV4 Max Sync Backlog

    os_NetworkNetCoreNetdevBudget

    300 packets

    30→3000 packets

    Network Budget

    os_NetworkNetCoreRmemMax

    212992 bytes

    21299→2129920 bytes

    Maximum network receive buffer size that applications can request

    os_NetworkNetCoreWmemMax

    21299→2129920 bytes

    21299→2129920 bytes

    Maximum network transmit buffer size that applications can request

    os_NetworkNetIpv4TcpSlowStartAfterIdle

    1

    0→1

    Network Slow Start After Idle Flag

    os_NetworkNetIpv4TcpFinTimeout

    60

    6 →600 seconds

    Network TCP timeout

    os_NetworkRfs

    0

    0→131072

    If enabled increases datacache hitrate by steering kernel processing of packets to the CPU where the application thread consuming the packet is running

    100→10000 packets

    Network Max Backlog

    os_StorageRqAffinity

    1

    1→2

    Storage Requests Affinity

    os_StorageQueueScheduler

    none

    none kyber

    Storage Queue Scheduler Type

    os_StorageNomerges

    0

    0→2

    Enables the user to disable the lookup logic involved with IO merging requests in the block layer. By default (0) all merges are enabled. With 1 only simple one-hit merges will be tried. With 2 no merge algorithms will be tried

    os_StorageMaxSectorsKb

    128 KB

    32→128 KB

    The largest IO size that the OS c

    cpu_num

    CPUs

    The number of CPUs available in the system (physical and logical)

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    mem_util

    percent

    The memory utilization % (i.e, the % of memory used)

    mem_util_nocache

    percent

    The memory utilization % (i.e., the % of memory used) without considering memory reserved for caching purposes

    network_tcp_retrans

    retrans/s

    The number of network TCP retransmissions per second

    network_in_bytes_details

    bytes/s

    The number of inbound network packets in bytes per second broken down by network device (e.g., wlp4s0)

    disk_swap_util

    percent

    The average space utilization % of swap disks

    disk_swap_used

    bytes

    The total amount of space used by swap disks

    filesystem_util

    percent

    The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1)

    filesystem_used

    bytes

    The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01)

    proc_blocked

    processes

    The number of processes blocked (e.g, for IO or swapping reasons)

    os_context_switch

    switches/s

    The number of context switches per second

    os_cpuSchedMinGranularity

    2250000 ns

    300000→30000000 ns

    Minimal preemption granularity (in nanoseconds) for CPU bound tasks

Notice: you can use a device custom filter to monitor a specific disk with Prometheus. You can find more information on Prometheus queries and the %FILTERS% placeholder in the Prometheus provider and Prometheus provider metrics mapping pages.

| Metric | Unit | Description |
| --- | --- | --- |
| cpu_util_details | percent | The average CPU utilization % broken down by usage type and cpu number (e.g., cp1 user, cp2 system, cp3 soft-irq) |
| cpu_load_avg | tasks | The system load average (i.e., the number of active tasks in the system) |
| mem_util_details | percent | The memory utilization % (i.e., the % of memory used) broken down by usage type (e.g., active memory) |
| mem_used | bytes | The total amount of memory used |
| mem_used_nocache | bytes | The total amount of memory used without considering memory reserved for caching purposes |
| mem_total | bytes | The total amount of installed memory |
| mem_fault_minor | faults/s | The number of minor memory faults (i.e., faults that do not cause disk access) per second |
| mem_fault_major | faults/s | The number of major memory faults (i.e., faults that cause disk access) per second |
| mem_fault | faults/s | The number of memory faults (major + minor) per second |
| mem_swapins | pages/s | The number of memory pages swapped in per second |
| mem_swapouts | pages/s | The number of memory pages swapped out per second |
| network_out_bytes_details | bytes/s | The outbound network traffic in bytes per second broken down by network device (e.g., eth01) |
| disk_util_details | percent | The utilization % of each disk (i.e., how much time the disk is busy doing work), broken down by disk (e.g., disk D://) |
| disk_iops_writes | ops/s | The average number of IO disk-write operations per second across all disks |
| disk_iops_reads | ops/s | The average number of IO disk-read operations per second across all disks |
| disk_iops | ops/s | The average number of IO disk operations per second across all disks |
| disk_response_time_read | seconds | The average response time of IO read-disk operations |
| disk_response_time_worst | seconds | The average response time of IO disk operations of the slowest disk |
| disk_response_time_write | seconds | The average response time of IO write-disk operations |
| disk_response_time_details | seconds | The average response time of IO disk operations broken down by disk (e.g., disk /dev/nvme01) |
| disk_iops_details | ops/s | The number of IO disk operations per second broken down by disk (e.g., disk /dev/nvme01) |
| disk_io_inflight_details | ops | The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01) |
| disk_write_bytes | bytes/s | The number of bytes per second written across all disks |
| disk_read_bytes | bytes/s | The number of bytes per second read across all disks |
| disk_read_write_bytes | bytes/s | The number of bytes per second read and written across all disks |
| disk_write_bytes_details | bytes/s | The number of bytes per second written to the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE) |
| disk_read_bytes_details | bytes/s | The number of bytes per second read from the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation READ) |
| filesystem_size | bytes | The size of filesystems broken down by type and device (e.g., filesystem of type ext4 for device /dev/nvme01) |

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |
| os_cpuSchedWakeupGranularity | 3000000 ns | 400000→40000000 ns | Scheduler Wakeup Granularity (in nanoseconds) |
| os_CPUSchedMigrationCost | 500000 ns | 100000→5000000 ns | Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations |
| os_CPUSchedChildRunsFirst | 0 | 0→1 | A freshly forked child runs before the parent continues execution |
| os_CPUSchedLatency | 18000000 ns | 2400000→240000000 ns | Targeted preemption latency (in nanoseconds) for CPU bound tasks |
| os_CPUSchedAutogroupEnabled | 1 | 0→1 | Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads |
| os_CPUSchedNrMigrate | 32 | 3→320 | Scheduler NR Migrate |
| os_MemorySwappiness | 1 | 0→100 | Memory Swappiness |
| os_MemoryVmVfsCachePressure | 100 % | 10→100 % | VFS Cache Pressure |
| os_MemoryVmMinFree | 67584 KB | 10240→1024000 KB | Minimum Free Memory |
| os_MemoryVmDirtyRatio | 30 % | 1→99 % | When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write |
| os_MemoryVmDirtyBackgroundRatio | 10 % | 1→99 % | When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background |
| os_MemoryTransparentHugepageEnabled | never | always, never, madvise | Transparent Hugepage Enablement |
| os_MemoryTransparentHugepageDefrag | always | always, never, madvise, defer, defer+madvise | Transparent Hugepage Enablement Defrag |
| os_MemorySwap | swapon | swapon, swapoff | Memory Swap |
| os_MemoryVmDirtyExpire | 3000 centisecs | 300→30000 centisecs | Memory Dirty Expiration Time |
| os_MemoryVmDirtyWriteback | 500 centisecs | 50→5000 centisecs | Memory Dirty Writeback |
| os_NetworkNetCoreSomaxconn | 128 connections | 12→1200 connections | Network Max Connections |
| os_NetworkNetCoreNetdevMaxBacklog | 1000 packets | 100→10000 packets | Network Max Backlog |
| os_NetworkNetIpv4TcpMaxSynBacklog | 512 packets | 52→15120 packets | Network IPV4 Max Sync Backlog |
| os_NetworkNetCoreNetdevBudget | 300 packets | 30→3000 packets | Network Budget |
| os_NetworkNetCoreRmemMax | 212992 bytes | 21299→2129920 bytes | Maximum network receive buffer size that applications can request |
| os_NetworkNetCoreWmemMax | 212992 bytes | 21299→2129920 bytes | Maximum network transmit buffer size that applications can request |
| os_NetworkNetIpv4TcpSlowStartAfterIdle | 1 | 0→1 | Network Slow Start After Idle Flag |
| os_NetworkNetIpv4TcpFinTimeout | 60 seconds | 6→600 seconds | Network TCP timeout |
| os_NetworkRfs | 0 | 0→131072 | If enabled, increases the data cache hit rate by steering kernel processing of packets to the CPU where the application thread consuming the packet is running |
| os_StorageReadAhead | 128 KB | 0→1024 KB | Read-ahead speeds up file access by pre-fetching data and loading it into the page cache so that it can be available earlier in memory instead of from disk |
| os_StorageNrRequests | 1000 | | |
| os_StorageRqAffinity | 1 | 1→2 | Storage Requests Affinity |
| os_StorageQueueScheduler | none | none, kyber, mq-deadline, bfq | Storage Queue Scheduler Type |
| os_StorageNomerges | 0 | 0→2 | Enables the user to disable the lookup logic involved with IO merging requests in the block layer. By default (0) all merges are enabled. With 1 only simple one-hit merges will be tried. With 2 no merge algorithms will be tried |
| os_StorageMaxSectorsKb | 256 KB | 32→256 KB | The largest IO size that the OS can issue to a block device |
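Parameters like these typically correspond to Linux kernel tunables (for example, os_MemorySwappiness corresponds to the vm.swappiness sysctl). As a minimal sketch of how a chosen configuration could be rendered into `sysctl` commands — note that the parameter-to-sysctl mapping below is an illustrative assumption, not the actual mapping used by the Akamas operators:

```python
# Sketch: render OS parameter values into `sysctl -w` commands.
# The parameter-to-sysctl mapping is illustrative only (assumption).
PARAM_TO_SYSCTL = {
    "os_MemorySwappiness": "vm.swappiness",
    "os_MemoryVmDirtyRatio": "vm.dirty_ratio",
    "os_NetworkNetCoreSomaxconn": "net.core.somaxconn",
    "os_NetworkNetCoreNetdevMaxBacklog": "net.core.netdev_max_backlog",
}

def render_sysctl_commands(config: dict) -> list:
    """Build one `sysctl -w` command per parameter we know how to map."""
    return [
        "sysctl -w {}={}".format(PARAM_TO_SYSCTL[name], value)
        for name, value in config.items()
        if name in PARAM_TO_SYSCTL
    ]

print(render_sysctl_commands({"os_MemorySwappiness": 30,
                              "os_NetworkNetCoreSomaxconn": 128}))
```

A real workflow operator would apply such settings on the target host; this sketch only shows how the table's parameter names relate to kernel tunables.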

| Metric | Unit | Description |
| --- | --- | --- |
| cpu_num | CPUs | The number of CPUs available in the system (physical and logical) |
| cpu_util | percent | The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work) |
| mem_util | percent | The memory utilization % (i.e., the % of memory used) |
| mem_util_nocache | percent | The memory utilization % (i.e., the % of memory used) without considering memory reserved for caching purposes |
| network_tcp_retrans | retrans/s | The number of network TCP retransmissions per second |
| network_in_bytes_details | bytes/s | The inbound network traffic in bytes per second broken down by network device (e.g., wlp4s0) |
| disk_swap_util | percent | The average space utilization % of swap disks |
| disk_swap_used | bytes | The total amount of space used by swap disks |
| filesystem_util | percent | The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1) |
| filesystem_used | bytes | The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01) |
| proc_blocked | processes | The number of processes blocked (e.g., for IO or swapping reasons) |
| os_context_switch | switches/s | The number of context switches per second |

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |
| os_cpuSchedMinGranularity | 2250000 ns | 300000→30000000 ns | Minimal preemption granularity (in nanoseconds) for CPU bound tasks |
| os_cpuSchedWakeupGranularity | 3000000 ns | | |
| os_MemorySwappiness | 30 | 0→100 | Memory Swappiness |
| os_MemoryVmVfsCachePressure | 100 % | | |
| os_NetworkNetCoreSomaxconn | 128 connections | 12→1200 connections | Network Max Connections |
| os_NetworkNetCoreNetdevMaxBacklog | 1000 packets | | |
| os_StorageReadAhead | 128 KB | 0→1024 KB | Read-ahead speeds up file access by pre-fetching data and loading it into the page cache so that it can be available earlier in memory instead of from disk |
| os_StorageNrRequests | 1000 | | |

For more information on Prometheus queries, see Prometheus provider and Prometheus provider metrics mapping.

| Metric | Unit | Description |
| --- | --- | --- |
| cpu_num | CPUs | The number of CPUs available in the system (physical and logical) |
| cpu_util | percent | The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work) |

Memory

| Metric | Unit | Description |
| --- | --- | --- |
| mem_util | percent | The memory utilization % (i.e., the % of memory used) |
| mem_util_nocache | percent | The memory utilization % (i.e., the % of memory used) without considering memory reserved for caching purposes |

Network

| Metric | Unit | Description |
| --- | --- | --- |
| network_tcp_retrans | retrans/s | The number of network TCP retransmissions per second |
| network_in_bytes_details | bytes/s | The inbound network traffic in bytes per second broken down by network device (e.g., wlp4s0) |

Disk

Notice: you can use a device custom filter to monitor a specific disk with Prometheus. You can find more information on Prometheus queries and the %FILTERS% placeholder here: Prometheus provider and here: Prometheus provider metrics mapping.

| Metric | Unit | Description |
| --- | --- | --- |
| disk_swap_util | percent | The average space utilization % of swap disks |
| disk_swap_used | bytes | The total amount of space used by swap disks |

Filesystem

| Metric | Unit | Description |
| --- | --- | --- |
| filesystem_util | percent | The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1) |
| filesystem_used | bytes | The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01) |

Other metrics

| Metric | Unit | Description |
| --- | --- | --- |
| proc_blocked | processes | The number of processes blocked (e.g., for IO or swapping reasons) |
| os_context_switch | switches/s | The number of context switches per second |

Parameters

CPU

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |
| os_cpuSchedMinGranularity | 2250000 ns | 300000→30000000 ns | Minimal preemption granularity (in nanoseconds) for CPU bound tasks |
| os_cpuSchedWakeupGranularity | | | |

Memory

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |
| os_MemorySwappiness | 1 | 0→100 | Memory Swappiness |
| os_MemoryVmVfsCachePressure | | | |

Network

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |
| os_NetworkNetCoreSomaxconn | 128 connections | 12→1200 connections | Network Max Connections |
| os_NetworkNetCoreNetdevMaxBacklog | | | |

Storage

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |
| os_StorageReadAhead | 128 KB | 0→1024 KB | Read-ahead speeds up file access by pre-fetching data and loading it into the page cache so that it can be available earlier in memory instead of from disk |
| os_StorageNrRequests | | | |

Constraints

There are no general constraints among RHEL 8 parameters.
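The device custom filter mentioned in the Disk notice works by substituting a Prometheus label matcher into the provider's query template at the %FILTERS% placeholder. A minimal sketch of that substitution — the query text and label name here are assumptions for illustration; see the Prometheus provider pages for the real templates:

```python
# Sketch: substitute a device label matcher into a query template
# containing the %FILTERS% placeholder (query text is illustrative).
def apply_filters(template: str, filters: str) -> str:
    return template.replace("%FILTERS%", filters)

template = 'rate(node_disk_written_bytes_total{%FILTERS%}[5m])'
query = apply_filters(template, 'device="sda"')
print(query)  # rate(node_disk_written_bytes_total{device="sda"}[5m])
```

With an empty filter string the braces simply match all devices, which is why the metrics above default to aggregating across disks.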

| Metric | Unit | Description |
| --- | --- | --- |
| cpu_util_details | percent | The average CPU utilization % broken down by usage type and cpu number (e.g., cp1 user, cp2 system, cp3 soft-irq) |
| cpu_load_avg | tasks | The system load average (i.e., the number of active tasks in the system) |
| mem_util_details | percent | The memory utilization % (i.e., the % of memory used) broken down by usage type (e.g., active memory) |
| mem_used | bytes | The total amount of memory used |
| mem_used_nocache | bytes | The total amount of memory used without considering memory reserved for caching purposes |
| mem_total | bytes | The total amount of installed memory |
| mem_fault_minor | faults/s | The number of minor memory faults (i.e., faults that do not cause disk access) per second |
| mem_fault_major | faults/s | The number of major memory faults (i.e., faults that cause disk access) per second |
| mem_fault | faults/s | The number of memory faults (major + minor) per second |
| mem_swapins | pages/s | The number of memory pages swapped in per second |
| mem_swapouts | pages/s | The number of memory pages swapped out per second |
| network_out_bytes_details | bytes/s | The outbound network traffic in bytes per second broken down by network device (e.g., eth01) |
| disk_util_details | percent | The utilization % of each disk (i.e., how much time the disk is busy doing work), broken down by disk (e.g., disk D://) |
| disk_iops_writes | ops/s | The average number of IO disk-write operations per second across all disks |
| disk_iops_reads | ops/s | The average number of IO disk-read operations per second across all disks |
| disk_iops | ops/s | The average number of IO disk operations per second across all disks |
| disk_response_time_read | seconds | The average response time of IO read-disk operations |
| disk_response_time_worst | seconds | The average response time of IO disk operations of the slowest disk |
| disk_response_time_write | seconds | The average response time of IO write-disk operations |
| disk_response_time_details | seconds | The average response time of IO disk operations broken down by disk (e.g., disk /dev/nvme01) |
| disk_iops_details | ops/s | The number of IO disk operations per second broken down by disk (e.g., disk /dev/nvme01) |
| disk_io_inflight_details | ops | The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01) |
| disk_write_bytes | bytes/s | The number of bytes per second written across all disks |
| disk_read_bytes | bytes/s | The number of bytes per second read across all disks |
| disk_read_write_bytes | bytes/s | The number of bytes per second read and written across all disks |
| disk_write_bytes_details | bytes/s | The number of bytes per second written to the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE) |
| disk_read_bytes_details | bytes/s | The number of bytes per second read from the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation READ) |
| filesystem_size | bytes | The size of filesystems broken down by type and device (e.g., filesystem of type ext4 for device /dev/nvme01) |

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |
| os_cpuSchedWakeupGranularity | 3000000 ns | 400000→40000000 ns | Scheduler Wakeup Granularity (in nanoseconds) |
| os_CPUSchedMigrationCost | 500000 ns | 100000→5000000 ns | Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations |
| os_CPUSchedChildRunsFirst | 0 | 0→1 | A freshly forked child runs before the parent continues execution |
| os_CPUSchedLatency | 18000000 ns | 2400000→240000000 ns | Targeted preemption latency (in nanoseconds) for CPU bound tasks |
| os_CPUSchedAutogroupEnabled | 1 | 0→1 | Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads |
| os_CPUSchedNrMigrate | 32 | 3→320 | Scheduler NR Migrate |
| os_MemoryVmVfsCachePressure | 100 % | 10→100 % | VFS Cache Pressure |
| os_MemoryVmMinFree | 67584 KB | 10240→1024000 KB | Minimum Free Memory |
| os_MemoryVmDirtyRatio | 20 % | 1→99 % | When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write |
| os_MemoryVmDirtyBackgroundRatio | 10 % | 1→99 % | When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background |
| os_MemoryTransparentHugepageEnabled | never | always, never, madvise | Transparent Hugepage Enablement |
| os_MemoryTransparentHugepageDefrag | always | always, never, madvise, defer, defer+madvise | Transparent Hugepage Enablement Defrag |
| os_MemorySwap | swapon | swapon, swapoff | Memory Swap |
| os_MemoryVmDirtyExpire | 3000 centisecs | 300→30000 centisecs | Memory Dirty Expiration Time |
| os_MemoryVmDirtyWriteback | 500 centisecs | 50→5000 centisecs | Memory Dirty Writeback |
| os_NetworkNetCoreNetdevMaxBacklog | 1000 packets | 100→10000 packets | Network Max Backlog |
| os_NetworkNetIpv4TcpMaxSynBacklog | 512 packets | 52→15120 packets | Network IPV4 Max Sync Backlog |
| os_NetworkNetCoreNetdevBudget | 300 packets | 30→3000 packets | Network Budget |
| os_NetworkNetCoreRmemMax | 212992 bytes | 21299→2129920 bytes | Maximum network receive buffer size that applications can request |
| os_NetworkNetCoreWmemMax | 212992 bytes | 21299→2129920 bytes | Maximum network transmit buffer size that applications can request |
| os_NetworkNetIpv4TcpSlowStartAfterIdle | 1 | 0→1 | Network Slow Start After Idle Flag |
| os_NetworkNetIpv4TcpFinTimeout | 60 seconds | 6→600 seconds | Network TCP timeout |
| os_NetworkRfs | 0 | 0→131072 | If enabled, increases the data cache hit rate by steering kernel processing of packets to the CPU where the application thread consuming the packet is running |
| os_StorageRqAffinity | 1 | 1→2 | Storage Requests Affinity |
| os_StorageQueueScheduler | none | none, kyber, mq-deadline, bfq | Storage Queue Scheduler Type |
| os_StorageNomerges | 0 | 0→2 | Enables the user to disable the lookup logic involved with IO merging requests in the block layer. By default (0) all merges are enabled. With 1 only simple one-hit merges will be tried. With 2 no merge algorithms will be tried |
| os_StorageMaxSectorsKb | 128 KB | 32→128 KB | The largest IO size that the OS can issue to a block device |

Metric template

Metrics are defined using a YAML manifest with the following structure:

and properties:

| Field | Type | Value restrictions | Is required | Default value | Description |
| --- | --- | --- | --- | --- | --- |

Supported units of measure

The supported units of measure for metrics are:

| Type | Units |
| --- | --- |

Notice that supported units of measure are automatically scaled for visualization purposes. In particular, for units of information, Akamas uses a base 2 scaling for bytes, i.e., 1 kilobyte = 1024 bytes, 1 megabyte = 1024 kilobytes, and so on. Other units of measure are only scaled up using millions or billions (e.g., 124000000 custom units become 124 Mln custom units).
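The scaling rules just described can be sketched as follows — base-2 steps for units of information and plain Mln/Bln steps for everything else (the exact display format used by the Akamas UI is an assumption for illustration):

```python
def scale_bytes(value: float) -> str:
    # Base-2 scaling: 1 kilobyte = 1024 bytes, 1 megabyte = 1024 kilobytes, ...
    units = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes", "petabytes"]
    i = 0
    while value >= 1024 and i < len(units) - 1:
        value /= 1024
        i += 1
    return "{:g} {}".format(value, units[i])

def scale_other(value: float, unit: str) -> str:
    # Other units are only scaled up using millions (Mln) or billions (Bln).
    if value >= 1e9:
        return "{:g} Bln {}".format(value / 1e9, unit)
    if value >= 1e6:
        return "{:g} Mln {}".format(value / 1e6, unit)
    return "{:g} {}".format(value, unit)

print(scale_bytes(1024))                        # 1 kilobytes
print(scale_other(124000000, "custom units"))   # 124 Mln custom units
```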

Spark History Server metrics mapping

This page describes the mapping between the metrics provided by the Spark History Server and Akamas metrics, for each supported component type.

| Component Type | Notes |
| --- | --- |

Spark Application

| Component metric | Granularity | Document Path | JSON query |
| --- | --- | --- | --- |

Ubuntu 18.04

This page describes the Optimization Pack for the component type Ubuntu 18.04.

Metrics

CPU

| Metric | Unit | Description |
| --- | --- | --- |

Memory

| Metric | Unit | Description |
| --- | --- | --- |

Network

| Metric | Unit | Description |
| --- | --- | --- |

Disk

Notice: you can use a device custom filter to monitor a specific disk with Prometheus. You can find more information on Prometheus queries and the %FILTERS% placeholder in the Prometheus provider and Prometheus provider metrics mapping pages.

| Metric | Unit | Description |
| --- | --- | --- |

Filesystem

| Metric | Unit | Description |
| --- | --- | --- |

Other metrics

| Metric | Unit | Description |
| --- | --- | --- |

Parameters

CPU

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |

Memory

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |

Network

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |

Storage

| Parameter | Default value | Domain | Description |
| --- | --- | --- | --- |
```yaml
metrics:
  - name: "cpu_util"
    description: "cpu utilization"
    unit: "percent"
  - name: "mem_util"
    description: "memory utilization"
    unit: "percent"
```
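Per the metric template's field constraints (name is required and may not contain spaces, description is required, unit is optional), a small validation sketch — the checks reflect only what this guide states, not the full server-side validation:

```python
def validate_metric(metric: dict) -> list:
    """Check one metric entry against the documented field constraints."""
    errors = []
    name = metric.get("name")
    if not name:
        errors.append("name is required")
    elif " " in name:
        errors.append("no spaces are allowed in name")
    if not metric.get("description"):
        errors.append("description is required")
    return errors

print(validate_metric({"name": "cpu_util", "description": "cpu utilization",
                       "unit": "percent"}))  # []
print(validate_metric({"name": "cpu util"}))
```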

The manifest properties are:

| Field | Type | Value restrictions | Is required | Default value | Description |
| --- | --- | --- | --- | --- | --- |
| name | string | No spaces are allowed | TRUE | | The name of the metric |
| unit | string | A supported unit or a custom unit (see supported units of measure) | | | The unit of measure of the metric |
| description | string | | TRUE | | A description characterizing the metric |

The supported units of measure are:

| Type | Units |
| --- | --- |
| Temporal units | nanoseconds, microseconds, milliseconds, seconds, minutes, hours |
| Units of information | bits, kilobits, megabits, gigabits, terabits, petabits, bytes, kilobytes, megabytes, gigabytes, terabytes, petabytes |
| Others | percent |

    /{appId}/1/jobs/{jobId}

    .numCompletedTasks

    spark_active_tasks

    job

    /{appId}/1/jobs/{jobId}

    .numActiveTasks

    spark_skipped_tasks

    job

    /{appId}/1/jobs/{jobId}

    .numSkippedTasks

    spark_failed_tasks

    job

    /{appId}/1/jobs/{jobId}

    .numFailedTasks

    spark_killed_tasks

    job

    /{appId}/1/jobs/{jobId}

    .numKilledTasks

    spark_completed_stages

    job

    /{appId}/1/jobs/{jobId}

    .numCompletedStages

    spark_failed_stages

    job

    /{appId}/1/jobs/{jobId}

    .numFailedStages

    spark_skipped_stages

    job

    /{appId}/1/jobs/{jobId}

    .numSkippedStages

    spark_active_stages

    job

    /{appId}/1/jobs/{jobId}

    .numActiveStages

    spark_duration

    stage

    /{appId}/1/stages/{stageId}

    .getDuration

    spark_task_stage_executor_run_time

    stage

    /{appId}/1/stages/{stageId}

    .getExecutorRunTime

    spark_task_stage_executor_cpu_time

    stage

    /{appId}/1/stages/{stageId}

    .getExecutorCpuTime

    spark_active_tasks

    stage

    /{appId}/1/stages/{stageId}

    .getNumActiveTasks

    spark_completed_tasks

    stage

    /{appId}/1/stages/{stageId}

    .getNumCompleteTasks

    spark_failed_tasks

    stage

    /{appId}/1/stages/{stageId}

    .getNumFailedTasks

    spark_killed_tasks

    stage

    /{appId}/1/stages/{stageId}

    .getNumKilledTasks

    spark_task_stage_input_bytes_read

    stage

    /{appId}/1/stages/{stageId}

    .getInputBytes

    spark_task_stage_input_records_read

    stage

    /{appId}/1/stages/{stageId}

    .getInputRecords

    spark_task_stage_output_bytes_written

    stage

    /{appId}/1/stages/{stageId}

    .getOutputBytes

    spark_task_stage_output_records_written

    stage

    /{appId}/1/stages/{stageId}

    .getOutputRecords

    spark_stage_shuffle_read_bytes

    stage

    /{appId}/1/stages/{stageId}

    .getShuffleReadBytes

    spark_task_stage_shuffle_read_records

    stage

    /{appId}/1/stages/{stageId}

    .getShuffleReadRecords

    spark_task_stage_shuffle_write_bytes

    stage

    /{appId}/1/stages/{stageId}

    .getShuffleWriteBytes

    spark_task_stage_shuffle_write_records

    stage

    /{appId}/1/stages/{stageId}

    .getShuffleWriteRecords

    spark_task_stage_memory_bytes_spilled

    stage

    /{appId}/1/stages/{stageId}

    .getMemoryBytesSpilled

    spark_task_stage_disk_bytes_spilled

    stage

    /{appId}/1/stages/{stageId}

    .getDiskBytesSpilled

    spark_duration

    task

    /{appId}/1/stages/{stageId}

    .tasks[].duration

    spark_task_executor_deserialize_time

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.executorDeserializeTime

    spark_task_executor_deserialize_cpu_time

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.executorDeserializeCpuTime

    spark_task_stage_executor_run_time

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.executorRunTime

    spark_task_stage_executor_cpu_time

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.executorCpuTime

    spark_task_result_size

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.resultSize

    spark_task_jvm_gc_duration

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.jvmGcTime

    spark_task_result_serialization_time

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.resultSerializationTime

    spark_task_stage_memory_bytes_spilled

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.memoryBytesSpilled

    spark_task_stage_disk_bytes_spilled

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.diskBytesSpilled

    spark_task_peak_execution_memory

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.peakExecutionMemory

    spark_task_stage_input_bytes_read

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.inputMetrics.bytesRead

    spark_task_stage_input_records_read

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.inputMetrics.recordsRead

    spark_task_stage_output_bytes_written

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.outputMetrics.bytesWritten

    spark_task_stage_output_records_written

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.outputMetrics.recordsWritten

    spark_task_shuffle_read_remote_blocks_fetched

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleReadMetrics.remoteBlocksFetched

    spark_task_shuffle_read_local_blocks_fetched

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleReadMetrics.localBlocksFetched

    spark_task_shuffle_read_fetch_wait_time

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleReadMetrics.fetchWaitTime

    spark_task_shuffle_read_remote_bytes

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleReadMetrics.remoteBytesRead

    spark_task_shuffle_read_remote_bytes_to_disk

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleReadMetrics.remoteBytesReadToDisk

    spark_task_shuffle_read_local_bytes

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleReadMetrics.localBytesRead

    spark_task_stage_shuffle_read_records

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleReadMetrics.recordsRead

    spark_task_stage_shuffle_write_bytes

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleWriteMetrics.bytesWritten

    spark_task_shuffle_write_time

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleWriteMetrics.writeTime

    spark_task_stage_shuffle_write_records

    task

    /{appId}/1/stages/{stageId}

    .tasks[].taskMetrics.shuffleWriteMetrics.recordsWritten

    spark_executor_rdd_blocks

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .rddBlocks

    spark_executor_mem_used

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .memoryUsed

    spark_executor_disk_used

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .diskUsed

    spark_executor_cores

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .totalCores

    spark_active_tasks

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .activeTasks

    spark_failed_tasks

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .failedTasks

    spark_completed_tasks

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .completedTasks

    spark_executor_total_tasks

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .totalTasks

    spark_executor_total_duration

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .totalDuration

    spark_executor_total_jvm_gc_duration

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .totalGCTime

    spark_executor_total_input_bytes

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .totalInputBytes

    spark_executor_total_shuffle_read

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .totalShuffleRead

    spark_executor_total_shuffle_write

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .totalShuffleWrite

    spark_executor_max_mem_used

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .maxMemory

    spark_executor_used_on_heap_storage_memory

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .memoryMetrics.usedOnHeapStorageMemory

    spark_executor_used_off_heap_storage_memory

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .memoryMetrics.usedOffHeapStorageMemory

    spark_executor_total_on_heap_storage_memory

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .memoryMetrics.totalOnHeapStorageMemory

    spark_executor_total_off_heap_storage_memory

    executor

    /{appId}/1/allexecutors

    select(.id!='driver) | .memoryMetrics.totalOffHeapStorageMemory

    spark_driver_rdd_blocks

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .rddBlocks

    spark_driver_mem_used

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .memoryUsed

    spark_driver_disk_used

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .diskUsed

    spark_driver_cores

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .totalCores

    spark_driver_total_duration

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .totalDuration

    spark_driver_total_jvm_gc_duration

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .totalGCTime

    spark_driver_total_input_bytes

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .totalInputBytes

    spark_driver_total_shuffle_read

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .totalShuffleRead

    spark_driver_total_shuffle_write

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .totalShuffleWrite

    spark_driver_max_mem_used

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .maxMemory

    spark_driver_used_on_heap_storage_memory

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .memoryMetrics.usedOnHeapStorageMemory

    spark_driver_used_off_heap_storage_memory

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .memoryMetrics.usedOffHeapStorageMemory

    spark_driver_total_on_heap_storage_memory

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .memoryMetrics.totalOnHeapStorageMemory

    spark_driver_total_off_heap_storage_memory

    driver

    /{appId}/1/allexecutors

    select(.id=='driver') | .memoryMetrics.totalOffHeapStorageMemory

    spark_duration

    job

    /{appId}/1/jobs/{jobId}

    .duration
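Each mapping above pairs a Spark History Server REST endpoint with a jq-style filter applied to its JSON response. As a hedged sketch (the host/port `localhost:18080` and the application id are assumptions; `/api/v1/applications` is the base path of the Spark monitoring REST API), a metric such as spark_driver_mem_used could be retrieved like this:

```shell
# Hypothetical example: fetch the driver's memoryUsed from the allexecutors
# endpoint and apply the same filter shown in the mapping table.
APP_ID="app-20240101000000-0001"   # placeholder application id
curl -s "http://localhost:18080/api/v1/applications/${APP_ID}/1/allexecutors" \
  | jq '.[] | select(.id=="driver") | .memoryUsed'
```

The executor-level metrics use the same endpoint with `select(.id!="driver")`, aggregating over the remaining entries.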

    spark_completed_tasks

    Spark Application

    job

    cpu_util_details

    percent

    The average CPU utilization % broken down by usage type and cpu number (e.g., cp1 user, cp2 system, cp3 soft-irq)

    cpu_load_avg

    tasks

    The system load average (i.e., the number of active tasks in the system)

    mem_util_details

    percent

    The memory utilization % (i.e., the % of memory used) broken down by usage type (e.g., active memory)

    mem_used

    bytes

    The total amount of memory used

    mem_used_nocache

    bytes

    The total amount of memory used without considering memory reserved for caching purposes

    mem_total

    bytes

    The total amount of installed memory

    mem_fault_minor

    faults/s

    The number of minor memory faults (i.e., faults that do not cause disk access) per second

    mem_fault_major

    faults/s

    The number of major memory faults (i.e., faults that cause disk access) per second

    mem_fault

    faults/s

    The number of memory faults (major + minor)

    mem_swapins

    pages/s

    The number of memory pages swapped in per second

    mem_swapouts

    pages/s

    The number of memory pages swapped out per second

    network_out_bytes_details

    bytes/s

    The outbound network traffic in bytes per second broken down by network device (e.g., eth01)

    disk_util_details

    percent

    The utilization % of disk (i.e., how much time a disk is busy doing work) broken down by disk (e.g., disk D://)

    disk_iops_writes

    ops/s

    The average number of IO disk-write operations per second across all disks

    disk_iops_reads

    ops/s

    The average number of IO disk-read operations per second across all disks

    disk_iops

    ops/s

    The average number of IO disk operations per second across all disks

    disk_response_time_read

    seconds

    The average response time of IO read-disk operations

    disk_response_time_worst

    seconds

    The average response time of IO disk operations of the slowest disk

    disk_response_time_write

    seconds

    The average response time of IO write-disk operations

    disk_response_time_details

    seconds

    The average response time of IO disk operations broken down by disk (e.g., disk /dev/nvme01 )

    disk_iops_details

    ops/s

    The number of IO disk operations per second broken down by disk (e.g., disk /dev/nvme01)

    disk_io_inflight_details

    ops

    The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01)

    disk_write_bytes

    bytes/s

    The number of bytes per second written across all disks

    disk_read_bytes

    bytes/s

    The number of bytes per second read across all disks

    disk_read_write_bytes

    bytes/s

    The number of bytes per second read and written across all disks

    disk_write_bytes_details

    bytes/s

    The number of bytes per second written from the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE)

    disk_read_bytes_details

    bytes/s

    The number of bytes per second read from the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation READ)

    filesystem_size

    bytes

    The size of filesystems broken down by type and device (e.g., filesystem of type ext4 for device /dev/nvme01)


    os_CPUSchedMigrationCost

    500000 ns

    100000→5000000 ns

    Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations

    os_CPUSchedChildRunsFirst

    0

    0→1

    A freshly forked child runs before the parent continues execution

    os_CPUSchedLatency

    18000000 ns

    2400000→240000000 ns

    Targeted preemption latency (in nanoseconds) for CPU bound tasks

    os_CPUSchedAutogroupEnabled

    1

    0→1

    Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads

    os_CPUSchedNrMigrate

    32

    3→320

    Scheduler NR Migrate


    os_MemoryVmMinFree

    67584 KB

    10240→1024000 KB

    Minimum Free Memory

    os_MemoryVmDirtyRatio

    20 %

    1→99 %

    When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write

    os_MemoryVmDirtyBackgroundRatio

    10 %

    1→99 %

    When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background
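The two dirty-page thresholds above map to `vm.dirty_ratio` and `vm.dirty_background_ratio`. A minimal sketch of applying them with sysctl (the values shown are illustrative, not recommendations):

```shell
# Kernel starts asynchronous background writeback once dirty pages exceed 10%.
sysctl -w vm.dirty_background_ratio=10
# Writing processes are forced to flush synchronously once dirty pages exceed 20%.
sysctl -w vm.dirty_ratio=20
```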

    os_MemoryTransparentHugepageEnabled

    madvise

    always, never, madvise

    Transparent Hugepage Enablement

    os_MemoryTransparentHugepageDefrag

    madvise

    always, never, madvise, defer, defer+madvise

    Transparent Hugepage Enablement Defrag

    os_MemorySwap

    swapon

    swapon, swapoff

    Memory Swap

    os_MemoryVmDirtyExpire

    3000 centisecs

    300→30000 centisecs

    Memory Dirty Expiration Time

    os_MemoryVmDirtyWriteback

    500 centisecs

    50→5000 centisecs

    Memory Dirty Writeback


    os_NetworkNetIpv4TcpMaxSynBacklog

    1024 packets

    52→15120 packets

    Network IPv4 Max SYN Backlog

    os_NetworkNetCoreNetdevBudget

    300 packets

    30→3000 packets

    Network Budget

    os_NetworkNetCoreRmemMax

    212992 bytes

    21299→2129920 bytes

    Maximum network receive buffer size that applications can request

    os_NetworkNetCoreWmemMax

    212992 bytes

    21299→2129920 bytes

    Maximum network transmit buffer size that applications can request

    os_NetworkNetIpv4TcpSlowStartAfterIdle

    1

    0→1

    Network Slow Start After Idle Flag

    os_NetworkNetIpv4TcpFinTimeout

    60

    6→600 seconds

    Network TCP timeout

    os_NetworkRfs

    0

    0→131072

    If enabled, increases the data cache hit rate by steering kernel processing of packets to the CPU where the application thread consuming the packet is running
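Receive Flow Steering is enabled through the standard kernel interfaces: a global flow-table size plus a per-receive-queue count. A hedged sketch (the NIC name `eth0`, a single rx queue, and the value 131072 are assumptions for illustration):

```shell
# Size the global RFS flow table.
sysctl -w net.core.rps_sock_flow_entries=131072
# Enable RFS on the NIC's receive queue (repeat per rx-N queue on multi-queue NICs).
echo 131072 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
```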


    os_StorageRqAffinity

    1

    1→2

    Storage Requests Affinity

    os_StorageQueueScheduler

    none

    none, mq-deadline

    Storage Queue Scheduler Type

    os_StorageNomerges

    0

    0→2

    Enables the user to disable the lookup logic involved with IO merging requests in the block layer. By default (0) all merges are enabled. With 1 only simple one-hit merges will be tried. With 2 no merge algorithms will be tried
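The three merge modes described above are set per block device through sysfs. A small sketch (the device name `nvme0n1` is an assumption):

```shell
# Inspect the current merge policy: 0 (all merges), 1 (one-hit only), or 2 (none).
cat /sys/block/nvme0n1/queue/nomerges
# Restrict the block layer to simple one-hit merges.
echo 1 > /sys/block/nvme0n1/queue/nomerges
```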

    os_StorageMaxSectorsKb

    128 KB

    32→128 KB

    The largest IO size that the OS can issue to a block device

    cpu_num

    CPUs

    The number of CPUs available in the system (physical and logical)

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    mem_util

    percent

    The memory utilization % (i.e., the % of memory used)

    mem_util_nocache

    percent

    The memory utilization % (i.e., the % of memory used) without considering memory reserved for caching purposes

    network_tcp_retrans

    retrans/s

    The number of network TCP retransmissions per second

    network_in_bytes_details

    bytes/s

    The number of inbound network packets in bytes per second broken down by network device (e.g., wlp4s0)

    disk_swap_util

    percent

    The average space utilization % of swap disks

    disk_swap_used

    bytes

    The total amount of space used by swap disks

    filesystem_util

    percent

    The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1)

    filesystem_used

    bytes

    The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01)

    proc_blocked

    processes

    The number of processes blocked (e.g., for IO or swapping reasons)

    os_context_switch

    switches/s

    The number of context switches per second

    os_cpuSchedMinGranularity

    2250000 ns

    300000→30000000 ns

    Minimal preemption granularity (in nanoseconds) for CPU bound tasks
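On older kernels the scheduler granularity knobs are exposed as `kernel.sched_*` sysctls (on recent kernels they moved under `/sys/kernel/debug/sched/`). A sketch using the default value from the table:

```shell
# Set the minimal preemption granularity for CPU-bound tasks (nanoseconds).
sysctl -w kernel.sched_min_granularity_ns=2250000
```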

    os_cpuSchedWakeupGranularity

    3000000 ns

    400000→40000000 ns

    Scheduler Wakeup Granularity (in nanoseconds)

    os_MemorySwappiness

    1

    0→100

    Memory Swappiness

    os_MemoryVmVfsCachePressure

    100 %

    10→100 %

    VFS Cache Pressure

    os_NetworkNetCoreSomaxconn

    128 connections

    12→1200 connections

    Network Max Connections

    os_NetworkNetCoreNetdevMaxBacklog

    1000 packets

    100→10000 packets

    Network Max Backlog

    os_StorageReadAhead

    128 KB

    0→1024 KB

    Read-ahead speeds up file access by pre-fetching data and loading it into the page cache so that it can be available earlier in memory instead of from disk
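Read-ahead can be inspected and set per block device. A sketch (the device name and the 256 KB value are assumptions for illustration):

```shell
# Current read-ahead, reported in 512-byte sectors.
blockdev --getra /dev/nvme0n1
# Set read-ahead to 256 KB via sysfs.
echo 256 > /sys/block/nvme0n1/queue/read_ahead_kb
```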

    os_StorageNrRequests



    IBM J9 VM 6

    This page describes the Optimization Pack for Eclipse OpenJ9 (formerly known as IBM J9) Virtual Machine version 6.

    Metrics


    All metrics
    Name
    Unit
    Description

    jvm_heap_size

    bytes

    The size of the JVM heap memory

    jvm_heap_used

    bytes

    The amount of heap memory used

    Parameters

    Heap

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    j9vm_minHeapSize

    integer

    megabytes

    Garbage Collection

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    j9vm_gcPolicy

    categorical

    JIT

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    j9vm_jitOptlevel

    ordinal

    Other parameters

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    j9vm_compressedReferences

    categorical

    Domains

    The following parameters require their ranges or default values to be updated according to the described rules:

    Memory

    Parameter
    Default value
    Domain

    j9vm_minNewSpace

    25% of j9vm_minHeapSize

    must not exceed j9vm_minHeapSize

    j9vm_maxNewSpace

    25% of j9vm_maxHeapSize

    must not exceed j9vm_maxHeapSize

    Notice that the value nocompressedreferences for j9vm_compressedReferences can only be specified for JVMs compiled with the proper --with-noncompressedrefs flag. If this is not the case, you cannot actively disable compressed references, meaning:

    • for Xmx <= 57GB it is useless to tune this parameter, since compressed references are active by default and cannot be explicitly disabled

    • for Xmx > 57GB, compressed references are disabled by default (blank value), so Akamas can try to enable them. This requires removing nocompressedreferences from the domain
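A quick way to verify whether a given OpenJ9 build supports disabling compressed references is to pass the flag directly; on builds compiled without --with-noncompressedrefs the JVM refuses to start (a sketch, assuming `java` on the PATH is the OpenJ9 build under test):

```shell
# Succeeds and prints version info only if the build accepts non-compressed references.
java -Xnocompressedrefs -version
```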

    Constraints

    The following tables show a list of constraints that may be required in the definition of the study, depending on the tuned parameters:

    Memory

    Formula
    Notes

    jvm.j9vm_minHeapSize < jvm.j9vm_maxHeapSize

    jvm.j9vm_minNewSpace < jvm.j9vm_maxNewSpace && jvm.j9vm_minNewSpace < jvm.j9vm_minHeapSize && jvm.j9vm_maxNewSpace < jvm.j9vm_maxHeapSize

    jvm.j9vm_minOldSpace < jvm.j9vm_maxOldSpace && jvm.j9vm_minOldSpace < jvm.j9vm_minHeapSize && jvm.j9vm_maxOldSpace < jvm.j9vm_maxHeapSize

    Notice that

    • j9vm_newSpaceFixed is mutually exclusive with j9vm_minNewSpace and j9vm_maxNewSpace

    • j9vm_oldSpaceFixed is mutually exclusive with j9vm_minOldSpace and j9vm_maxOldSpace

    • the sum of j9vm_minNewSpace and j9vm_minOldSpace must be equal to j9vm_minHeapSize, so it is useless to tune all of them together. The relationship between the max values is more complex.

    IBM J9 VM 8

    This page describes the Optimization Pack for Eclipse OpenJ9 (formerly known as IBM J9) Virtual Machine version 8.

    Metrics

    All metrics

    Name
    Unit
    Description

    Parameters

    Heap

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Garbage Collection

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    JIT

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Other parameters

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Domains

    The following parameters require their ranges or default values to be updated according to the described rules:

    Memory

    Parameter
    Default value
    Domain

    Notice that the value nocompressedreferences for j9vm_compressedReferences can only be specified for JVMs compiled with the proper --with-noncompressedrefs flag. If this is not the case, you cannot actively disable compressed references, meaning:

    • for Xmx <= 57GB it is useless to tune this parameter, since compressed references are active by default and cannot be explicitly disabled

    • for Xmx > 57GB, compressed references are disabled by default (blank value), so Akamas can try to enable them. This requires removing nocompressedreferences from the domain

    Constraints

    The following tables show a list of constraints that may be required in the definition of the study, depending on the tuned parameters:

    Memory

    Formula
    Notes

    Notice that

    • j9vm_newSpaceFixed is mutually exclusive with j9vm_minNewSpace and j9vm_maxNewSpace

    • j9vm_oldSpaceFixed is mutually exclusive with j9vm_minOldSpace and j9vm_maxOldSpace

    Amazon Linux

    This page describes the Optimization Pack for the component type Amazon Linux.

    Metrics

    CPU

    Metric
    Description

    Memory

    Metric
    Description

    Disk & Filesystem

    Metric
    Description

    Network

    Metric
    Description

    Others

    Metric
    Description

    Parameters

    CPU

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    Memory

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    Network

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    Storage

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    Eclipse OpenJ9 11

    This page describes the Optimization Pack for Eclipse OpenJ9 (formerly known as IBM J9) version 11.

    Metrics


    jvm_heap_util

    percent

    The utilization % of heap memory

    jvm_memory_used

    bytes

    The total amount of memory used across all the JVM memory pools

    jvm_memory_used_details

    bytes

    The total amount of memory used broken down by pool (e.g., code-cache, compressed-class-space)

    jvm_memory_buffer_pool_used

    bytes

    The total amount of bytes used by buffers within the JVM buffer memory pool

    jvm_gc_time

    percent

    The % of wall clock time the JVM spent doing stop the world garbage collection activities

    jvm_gc_time_details

    percent

    The % of wall clock time the JVM spent doing stop the world garbage collection activities broken down by type of garbage collection algorithm (e.g., ParNew)

    jvm_gc_count

    collections/s

    The total number of stop the world JVM garbage collections that have occurred per second

    jvm_gc_count_details

    collections/s

    The total number of stop the world JVM garbage collections that have occurred per second, broken down by type of garbage collection algorithm (e.g., G1, CMS)

    jvm_gc_duration

    seconds

    The average duration of a stop the world JVM garbage collection

    jvm_gc_duration_details

    seconds

    The average duration of a stop the world JVM garbage collection broken down by type of garbage collection algorithm (e.g., G1, CMS)

    jvm_threads_current

    threads

    The total number of active threads within the JVM

    jvm_threads_deadlocked

    threads

    The total number of deadlocked threads within the JVM

    jvm_compilation_time

    milliseconds

    The total time spent by the JVM JIT compiler compiling bytecode

    You should select your own default value.

    You should select your own domain.

    yes

    Minimum heap size (in megabytes)

    j9vm_maxHeapSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Maximum heap size (in megabytes)

    j9vm_minFreeHeap

    real

    percent

    0.3

    0.1 → 0.5

    yes

    Specify the minimum % free heap required after global GC

    j9vm_maxFreeHeap

    real

    percent

    0.6

    0.4 → 0.9

    yes

    Specify the maximum % free heap required after global GC

    gencon

    gencon, subpool, optavgpause, optthruput, nogc

    yes

    GC policy to use

    j9vm_gcThreads

    integer

    threads

    You should select your own default value.

    1 → 64

    yes

    Number of threads the garbage collector uses for parallel operations

    j9vm_scvTenureAge

    integer

    10

    1 → 14

    yes

    Set the initial tenuring threshold for generational concurrent GC policy

    j9vm_scvAdaptiveTenureAge

    categorical

    blank

    blank, -Xgc:scvNoAdaptiveTenure

    yes

    Enable the adaptive tenure age for generational concurrent GC policy

    j9vm_newSpaceFixed

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The fixed size of the new area when using the gencon GC policy. Must not be set alongside min or max

    j9vm_minNewSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The initial size of the new area when using the gencon GC policy

    j9vm_maxNewSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum size of the new area when using the gencon GC policy

    j9vm_oldSpaceFixed

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The fixed size of the old area when using the gencon GC policy. Must not be set alongside min or max

    j9vm_minOldSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The initial size of the old area when using the gencon GC policy

    j9vm_maxOldSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum size of the old area when using the gencon GC policy

    j9vm_concurrentScavenge

    categorical

    concurrentScavenge

    concurrentScavenge, noConcurrentScavenge

    yes

    Support pause-less garbage collection mode with gencon

    j9vm_gcPartialCompact

    categorical

    nopartialcompactgc

    nopartialcompactgc, partialcompactgc

    yes

    Enable partial compaction

    j9vm_concurrentMeter

    categorical

    soa

    soa, loa, dynamic

    yes

    Determine which area is monitored by the concurrent mark

    j9vm_concurrentBackground

    integer

    0

    0 → 128

    yes

    The number of background threads assisting the mutator threads in concurrent mark

    j9vm_concurrentSlack

    integer

    megabytes

    0

    You should select your own domain.

    yes

    The target size of free heap space for concurrent collectors

    j9vm_concurrentLevel

    integer

    percent

    8

    0 → 100

    yes

    The ratio between the amount of heap allocated and the amount of heap marked

    j9vm_gcCompact

    categorical

    blank

    blank, -Xcompactgc, -Xnocompactgc

    yes

    Enables full compaction on all garbage collections (system and global)

    j9vm_minGcTime

    real

    percent

    0.05

    0.0 → 1.0

    yes

    The minimum percentage of time to be spent in garbage collection, triggering the resize of the heap to meet the specified values

    j9vm_maxGcTime

    real

    percent

    0.13

    0.0 → 1.0

    yes

    The maximum percentage of time to be spent in garbage collection, triggering the resize of the heap to meet the specified values

    j9vm_loa

    categorical

    loa

    loa, noloa

    yes

    Enable the allocation of the large area object during garbage collection

    j9vm_loa_initial

    real

    0.05

    0.0 → 0.95

    yes

    The initial portion of the tenure area allocated to the large area object

    j9vm_loa_minimum

    real

    0.01

    0.0 → 0.95

    yes

    The minimum portion of the tenure area allocated to the large area object

    j9vm_loa_maximum

    real

    0.5

    0.0 → 0.95

    yes

    The maximum portion of the tenure area allocated to the large area object

    noOpt

    noOpt, cold, warm, hot, veryHot, scorching

    yes

    Force the JIT compiler to compile all methods at a specific optimization level

    j9vm_codeCacheTotal

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Maximum size limit in MB for the JIT code cache

    j9vm_jit_count

    integer

    10000

    0 → 1000000

    yes

    The number of times a method is called before it is compiled

    blank

    blank, -Xcompressedrefs, -Xnocompressedrefs

    yes

    Enable/disable the use of compressed references

    j9vm_aggressiveOpts

    categorical

    blank

    blank, -Xaggressive

    yes

    Enable the use of aggressive performance optimization features, which are expected to become default in upcoming releases

    j9vm_virtualized

    categorical

    blank

    blank, -Xtune:virtualized

    yes

    Optimize the VM for virtualized environment, reducing CPU usage when idle

    j9vm_shareclasses

    categorical

    blank

    blank, -Xshareclasses

    yes

    Enable class sharing

    j9vm_quickstart

    categorical

    blank

    blank, -Xquickstart

    yes

    Run JIT with only a subset of optimizations, improving the performance of short-running applications

    j9vm_minimizeUserCpu

    categorical

    blank

    blank, -Xthr:minimizeUserCPU

    yes

    Minimizes user-mode CPU usage in thread synchronization where possible
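To illustrate how the parameters above translate into an actual launch line, here is a hedged sketch (the heap/thread values and `Demo.jar` are assumptions, not recommendations; the mapping is j9vm_minHeapSize/j9vm_maxHeapSize → -Xms/-Xmx, j9vm_minFreeHeap/j9vm_maxFreeHeap → -Xminf/-Xmaxf, j9vm_gcPolicy → -Xgcpolicy, j9vm_gcThreads → -Xgcthreads, j9vm_virtualized → -Xtune:virtualized):

```shell
# Illustrative OpenJ9 launch combining several tuned parameters.
java -Xms512m -Xmx2g \
     -Xminf0.3 -Xmaxf0.6 \
     -Xgcpolicy:gencon \
     -Xgcthreads4 \
     -Xtune:virtualized \
     -jar Demo.jar
```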

    j9vm_minOldSpace

    75% of j9vm_minHeapSize

    must not exceed j9vm_minHeapSize

    j9vm_maxOldSpace

    same as j9vm_maxHeapSize

    must not exceed j9vm_maxHeapSize

    j9vm_gcthreads

    number of CPUs - 1, up to a maximum of 64

    capped to default, no benefit in exceeding that value

    j9vm_compressedReferences

    enabled for j9vm_maxHeapSize<= 57 GB

    jvm.j9vm_loa_minimum <= jvm.j9vm_loa_initial && jvm.j9vm_loa_initial <= jvm.j9vm_loa_maximum

    jvm.j9vm_minFreeHeap + 0.05 < jvm.j9vm_maxFreeHeap

    jvm.j9vm_minGcTime < jvm.j9vm_maxGcTime

    jvm_heap_util

    percent

    The utilization % of heap memory

    jvm_memory_used

    bytes

    The total amount of memory used across all the JVM memory pools

    jvm_memory_used_details

    bytes

    The total amount of memory used broken down by pool (e.g., code-cache, compressed-class-space)

    jvm_memory_buffer_pool_used

    bytes

    The total amount of bytes used by buffers within the JVM buffer memory pool

    jvm_gc_time

    percent

    The % of wall clock time the JVM spent doing stop the world garbage collection activities

    jvm_gc_time_details

    percent

    The % of wall clock time the JVM spent doing stop the world garbage collection activities broken down by type of garbage collection algorithm (e.g., ParNew)

    jvm_gc_count

    collections/s

    The total number of stop the world JVM garbage collections that have occurred per second

    jvm_gc_count_details

    collections/s

    The total number of stop the world JVM garbage collections that have occurred per second, broken down by type of garbage collection algorithm (e.g., G1, CMS)

    jvm_gc_duration

    seconds

    The average duration of a stop the world JVM garbage collection

    jvm_gc_duration_details

    seconds

    The average duration of a stop the world JVM garbage collection broken down by type of garbage collection algorithm (e.g., G1, CMS)

    jvm_threads_current

    threads

    The total number of active threads within the JVM

    jvm_threads_deadlocked

    threads

    The total number of deadlocked threads within the JVM

    jvm_compilation_time

    milliseconds

    The total time spent by the JVM JIT compiler compiling bytecode

    You should select your own domain.

    yes

    Minimum heap size (in megabytes)

    j9vm_maxHeapSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Maximum heap size (in megabytes)

    j9vm_minFreeHeap

    real

    percent

    0.3

    0.1 → 0.5

    yes

    Specify the minimum % free heap required after global GC

    j9vm_maxFreeHeap

    real

    percent

    0.6

    0.4 → 0.9

    yes

    Specify the maximum % free heap required after global GC

    gencon, subpool, optavgpause, optthruput, nogc

    yes

    GC policy to use

    j9vm_gcThreads

    integer

    threads

    You should select your own default value.

    1 → 64

    yes

    Number of threads the garbage collector uses for parallel operations

    j9vm_scvTenureAge

    integer

    10

    1 → 14

    yes

    Set the initial tenuring threshold for generational concurrent GC policy

    j9vm_scvAdaptiveTenureAge

    categorical

    blank

    blank, -Xgc:scvNoAdaptiveTenure

    yes

    Enable the adaptive tenure age for generational concurrent GC policy

    j9vm_newSpaceFixed

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The fixed size of the new area when using the gencon GC policy. Must not be set alongside min or max

    j9vm_minNewSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The initial size of the new area when using the gencon GC policy

    j9vm_maxNewSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum size of the new area when using the gencon GC policy

    j9vm_oldSpaceFixed

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The fixed size of the old area when using the gencon GC policy. Must not be set alongside min or max

    j9vm_minOldSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The initial size of the old area when using the gencon GC policy

    j9vm_maxOldSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum size of the old area when using the gencon GC policy

    j9vm_concurrentScavenge

    categorical

    concurrentScavenge

    concurrentScavenge, noConcurrentScavenge

    yes

    Support pause-less garbage collection mode with gencon

    j9vm_gcPartialCompact

    categorical

    nopartialcompactgc

    nopartialcompactgc, partialcompactgc

    yes

    Enable partial compaction

    j9vm_concurrentMeter

    categorical

    soa

    soa, loa, dynamic

    yes

    Determine which area is monitored by the concurrent mark

    j9vm_concurrentBackground

    integer

    0

    0 → 128

    yes

    The number of background threads assisting the mutator threads in concurrent mark

    j9vm_concurrentSlack

    integer

    megabytes

    0

    You should select your own domain.

    yes

    The target size of free heap space for concurrent collectors

    j9vm_concurrentLevel

    integer

    percent

    8

    0 → 100

    yes

    The ratio between the amount of heap allocated and the amount of heap marked

    j9vm_gcCompact

    categorical

    blank

    blank, -Xcompactgc, -Xnocompactgc

    yes

    Enables full compaction on all garbage collections (system and global)

    j9vm_minGcTime

    real

    percent

    0.05

    0.0 → 1.0

    yes

    The minimum percentage of time to be spent in garbage collection, triggering the resize of the heap to meet the specified values

    j9vm_maxGcTime

    real

    percent

    0.13

    0.0 → 1.0

    yes

    The maximum percentage of time to be spent in garbage collection, triggering the resize of the heap to meet the specified values

    j9vm_loa

    categorical

    loa

    loa, noloa

    yes

    Enable the allocation of the large area object during garbage collection

    j9vm_loa_initial

    real

    0.05

    0.0 → 0.95

    yes

    The initial portion of the tenure area allocated to the large area object

    j9vm_loa_minimum

    real

    0.01

    0.0 → 0.95

    yes

    The minimum portion of the tenure area allocated to the large area object

    j9vm_loa_maximum

    real

    0.5

    0.0 → 0.95

    yes

    The maximum portion of the tenure area allocated to the large area object

    noOpt, cold, warm, hot, veryHot, scorching

    yes

    Force the JIT compiler to compile all methods at a specific optimization level

    j9vm_compilationThreads

    integer

    threads

    You should select your own default value.

    1 → 7

    yes

    Number of JIT threads

    j9vm_codeCacheTotal

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Maximum size limit in MB for the JIT code cache

    j9vm_jit_count

    integer

    10000

    0 → 1000000

    yes

    The number of times a method is called before it is compiled

    blank, -XlockReservation

    no

    Enables an optimization that presumes a monitor is owned by the thread that last acquired it

    j9vm_compressedReferences

    categorical

    blank

    blank, -Xcompressedrefs, -Xnocompressedrefs

    yes

    Enable/disable the use of compressed references

    j9vm_aggressiveOpts

    categorical

    blank

    blank, -Xaggressive

    yes

    Enable the use of aggressive performance optimization features, which are expected to become default in upcoming releases

    j9vm_virtualized

    categorical

    blank

    blank, -Xtune:virtualized

    yes

    Optimize the VM for virtualized environment, reducing CPU usage when idle

    j9vm_shareclasses

    categorical

    blank

    blank, -Xshareclasses

    yes

    Enable class sharing

    j9vm_quickstart

    categorical

    blank

    blank, -Xquickstart

    yes

    Run JIT with only a subset of optimizations, improving the performance of short-running applications

    j9vm_minimizeUserCpu

    categorical

    blank

    blank, -Xthr:minimizeUserCPU

    yes

    Minimizes user-mode CPU usage in thread synchronization where possible

    j9vm_minOldSpace

    75% of j9vm_minHeapSize

    must not exceed j9vm_minHeapSize

    j9vm_maxOldSpace

    same as j9vm_maxHeapSize

    must not exceed j9vm_maxHeapSize

    j9vm_gcthreads

    number of CPUs - 1, up to a maximum of 64

    capped at the default; there is no benefit in exceeding that value

    j9vm_compressedReferences

    enabled for j9vm_maxHeapSize <= 57 GB

    jvm.j9vm_minFreeHeap + 0.05 < jvm.j9vm_maxFreeHeap

    jvm.j9vm_minGcTime < jvm.j9vm_maxGcTime

  • the sum of j9vm_minNewSpace and j9vm_minOldSpace must equal j9vm_minHeapSize, so it is pointless to tune all three together. The relationship between the corresponding max values is more complex.

  • jvm_heap_size

    bytes

    The size of the JVM heap memory

    jvm_heap_used

    bytes

    The amount of heap memory used

    j9vm_minHeapSize

    integer

    megabytes

    j9vm_gcPolicy

    categorical

    j9vm_jitOptlevel

    ordinal

    j9vm_lockReservation

    categorical

    j9vm_minNewSpace

    25% of j9vm_minHeapSize

    must not exceed j9vm_minHeapSize

    j9vm_maxNewSpace

    25% of j9vm_maxHeapSize

    must not exceed j9vm_maxHeapSize

    jvm.j9vm_minHeapSize < jvm.j9vm_maxHeapSize

    jvm.j9vm_minNewSpace < jvm.j9vm_maxNewSpace && jvm.j9vm_minNewSpace < jvm.j9vm_minHeapSize && jvm.j9vm_maxNewSpace < jvm.j9vm_maxHeapSize

    jvm.j9vm_minOldSpace < jvm.j9vm_maxOldSpace && jvm.j9vm_minOldSpace < jvm.j9vm_minHeapSize && jvm.j9vm_maxOldSpace < jvm.j9vm_maxHeapSize

    You should select your own default value.

    gencon

    noOpt

    categorical

    jvm.j9vm_loa_minimum <= jvm.j9vm_loa_initial && jvm.j9vm_loa_initial <= jvm.j9vm_loa_maximum

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    cpu_used

    CPUs

    The average number of CPUs used in the system (physical and logical)

    cpu_util_details

    percent

    The average CPU utilization % broken down by usage type and CPU number (e.g., cp1 user, cp2 system, cp3 soft-irq)

    mem_fault_minor

    faults/s

    The number of minor memory faults (i.e., faults that do not cause disk access) per second

    mem_swapins

    pages/s

    The number of memory pages swapped in per second

    mem_swapouts

    pages/s

    The number of memory pages swapped out per second

    mem_total

    bytes

    The total amount of installed memory

    mem_used

    bytes

    The total amount of memory used

    mem_used_nocache

    bytes

    The total amount of memory used without considering memory reserved for caching purposes

    mem_util

    percent

    The memory utilization % (i.e., the % of memory used)

    mem_util_details

    percent

    The memory utilization % (i.e., the % of memory used) broken down by usage type (e.g., active memory)

    mem_util_nocache

    percent

    The memory utilization % (i.e., the % of memory used) without considering memory reserved for caching purposes

    disk_iops_details

    ops/s

    The number of IO disk operations per second broken down by disk (e.g., disk /dev/nvme01)

    disk_iops_reads

    ops/s

    The average number of IO disk-read operations per second across all disks

    disk_iops_writes

    ops/s

    The average number of IO disk-write operations per second across all disks

    disk_read_bytes

    bytes/s

    The number of bytes per second read across all disks

    disk_read_bytes_details

    bytes/s

    The number of bytes per second read broken down by disk (e.g., disk C://)

    disk_read_write_bytes

    bytes/s

    The number of bytes per second read and written across all disks

    disk_response_time_details

    seconds

    The average response time of IO disk operations broken down by disk (e.g., disk C://)

    disk_response_time_read

    seconds

    The average response time of read disk operations

    disk_response_time_worst

    seconds

    The average response time of IO disk operations of the slowest disk

    disk_response_time_write

    seconds

    The average response time of write on disk operations

    disk_swap_used

    bytes

    The total amount of space used by swap disks

    disk_swap_util

    percent

    The average space utilization % of swap disks

    disk_util_details

    percent

    The utilization % of a disk (i.e., how much time the disk is busy doing work), broken down by disk (e.g., disk D://)

    disk_write_bytes

    bytes/s

    The number of bytes per second written across all disks

    disk_write_bytes_details

    bytes/s

    The number of bytes per second written to disk, broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE)

    filesystem_size

    bytes

    The size of filesystems broken down by type and device (e.g., filesystem of type ext4 for device /dev/nvme01)

    filesystem_used

    bytes

    The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01)

    filesystem_util

    percent

    The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1)

    network_tcp_retrans

    retrans/s

    The number of network TCP retransmissions per second

    300000 → 30000000

    no

    Minimal preemption granularity (in nanoseconds) for CPU bound tasks

    os_cpuSchedWakeupGranularity

    integer

    nanoseconds

    2000000

    400000 → 40000000

    no

    Scheduler Wakeup Granularity (in nanoseconds)

    os_CPUSchedMigrationCost

    integer

    nanoseconds

    500000

    100000 → 5000000

    no

    Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations

    os_CPUSchedChildRunsFirst

    integer

    0

    0, 1

    no

    A freshly forked child runs before the parent continues execution

    os_CPUSchedLatency

    integer

    nanoseconds

    12000000

    2400000 → 240000000

    no

    Targeted preemption latency (in nanoseconds) for CPU bound tasks

    os_CPUSchedAutogroupEnabled

    integer

    0

    0, 1

    no

    Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads

    os_CPUSchedNrMigrate

    integer

    32

    3 → 320

    no

    Scheduler NR Migrate

    0 → 100

    no

    Controls how aggressively the kernel swaps memory pages to disk (0 avoids swapping, 100 swaps aggressively)

    os_MemoryVmVfsCachePressure

    integer

    100

    10 → 100

    no

    VFS Cache Pressure

    os_MemoryVmCompactionProactiveness

    integer

    20

    0 → 100

    no

    Determines how aggressively compaction is done in the background

    os_MemoryVmMinFree

    integer

    67584

    10240 → 1024000

    no

    Minimum Free Memory (in kbytes)

    os_MemoryTransparentHugepageEnabled

    categorical

    madvise

    always, never, madvise

    no

    Transparent Hugepage Enablement Flag

    os_MemoryTransparentHugepageDefrag

    categorical

    madvise

    always, never, defer+madvise, madvise, defer

    no

    Transparent Hugepage Enablement Defrag

    os_MemorySwap

    categorical

    swapon

    swapon, swapoff

    no

    Memory Swap

    os_MemoryVmDirtyRatio

    integer

    20

    1 → 99

    no

    When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write

    os_MemoryVmDirtyBackgroundRatio

    integer

    10

    1 → 99

    no

    When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background

    os_MemoryVmDirtyExpire

    integer

    centiseconds

    3000

    300 → 30000

    no

    The age (in centiseconds) at which dirty memory pages become old enough to be written out by the kernel flusher threads

    os_MemoryVmDirtyWriteback

    integer

    centiseconds

    500

    50 → 5000

    no

    Memory Dirty Writeback (in centisecs)

    12 → 8192

    no

    Network Max Connections

    os_NetworkNetCoreNetdevMaxBacklog

    integer

    packets

    1000

    100 → 10000

    no

    Network Max Backlog

    os_NetworkNetIpv4TcpMaxSynBacklog

    integer

    connections

    256

    52 → 5120

    no

    Network IPV4 Max Sync Backlog

    os_NetworkNetCoreNetdevBudget

    integer

    300

    30 → 30000

    no

    Network Budget

    os_NetworkNetCoreRmemMax

    integer

    212992

    21299 → 2129920

    no

    Maximum network receive buffer size that applications can request

    os_NetworkNetCoreWmemMax

    integer

    212992

    21299 → 2129920

    no

    Maximum network transmit buffer size that applications can request

    os_NetworkNetIpv4TcpSlowStartAfterIdle

    integer

    1

    0, 1

    no

    Network Slow Start After Idle Flag

    os_NetworkNetIpv4TcpFinTimeout

    integer

    60

    6 → 600

    no

    Network TCP timeout

    os_NetworkRfs

    integer

    0

    0 → 131072

    no

    If enabled, increases the data cache hit rate by steering kernel processing of packets to the CPU where the application thread consuming the packet is running

    0 → 4096

    no

    Read-ahead speeds up file access by pre-fetching data into the page cache, so that it is already available in memory when needed instead of being read from disk

    os_StorageNrRequests

    integer

    32

    12 → 1280

    no

    Storage Number of Requests

    os_StorageRqAffinity

    integer

    1

    1, 2

    no

    Storage Requests Affinity

    os_StorageNomerges

    integer

    0

    0 → 2

    no

    Allows disabling the lookup logic involved in IO request merging in the block layer. By default (0) all merges are enabled; with 1, only simple one-hit merges are tried; with 2, no merge algorithms are tried

    os_StorageMaxSectorsKb

    integer

    kilobytes

    256

    32 → 256

    no

    The largest IO size that the OS can issue to a block device

    cpu_load_avg

    tasks

    The system load average (i.e., the number of active tasks in the system)

    cpu_num

    CPUs

    The number of CPUs available in the system (physical and logical)

    mem_fault

    faults/s

    The number of memory faults (minor+major)

    mem_fault_major

    faults/s

    The number of major memory faults (i.e., faults that cause disk access) per second

    disk_io_inflight_details

    ops

    The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01)

    disk_iops

    ops/s

    The average number of IO disk operations per second across all disks

    network_in_bytes_details

    bytes/s

    The number of inbound network packets in bytes per second broken down by network device (e.g., wlp4s0)

    network_out_bytes_details

    bytes/s

    The number of outbound network packets in bytes per second broken down by network device (e.g., eth01)

    os_context_switch

    switches/s

    The number of context switches per second

    proc_blocked

    processes

    The number of processes blocked (e.g., on IO or swapping)

    os_cpuSchedMinGranularity

    integer

    nanoseconds

    os_MemorySwappiness

    integer

    percent

    os_NetworkNetCoreSomaxconn

    integer

    connections

    os_StorageReadAhead

    integer

    kilobytes

    1500000

    60

    128

    128

    All metrics
    Name
    Unit
    Description

    jvm_heap_size

    bytes

    The size of the JVM heap memory

    jvm_heap_used

    bytes

    The amount of heap memory used

    Parameters

    Heap

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    j9vm_minHeapSize

    integer

    megabytes

    Garbage Collection

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    j9vm_gcPolicy

    categorical

    JIT

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    j9vm_jitOptlevel

    ordinal

    Other parameters

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    j9vm_lockReservation

    categorical

    Domains

    The following parameters require their ranges or default values to be updated according to the described rules:

    Memory

    Parameter
    Default value
    Domain

    j9vm_minNewSpace

    25% of j9vm_minHeapSize

    must not exceed j9vm_minHeapSize

    j9vm_maxNewSpace

    25% of j9vm_maxHeapSize

    must not exceed j9vm_maxHeapSize

    Notice that the value -Xnocompressedrefs for j9vm_compressedReferences can only be specified for JVMs compiled with the proper --with-noncompressedrefs flag. If this is not the case, you cannot actively disable compressed references, meaning:

    • for Xmx <= 57 GB it is useless to tune this parameter, since compressed references are active by default and cannot be explicitly disabled

    • for Xmx > 57 GB, since compressed references are disabled by default (blank value), Akamas can try to enable them. This requires removing -Xnocompressedrefs from the domain
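    The rules above can be sketched as a small helper that decides which values of j9vm_compressedReferences are worth exploring. This is an illustrative sketch only, not part of Akamas: the function name and its boolean interface are assumptions; the 57 GB threshold and the flag values come from the text above.

```python
# Hypothetical helper (not an Akamas API): given the max heap size and whether
# the JVM build supports --with-noncompressedrefs, return the j9vm_compressedReferences
# domain values that are actually meaningful to tune.

def compressed_refs_domain(max_heap_gb: float, supports_nocompressedrefs: bool) -> list[str]:
    if max_heap_gb <= 57:
        if not supports_nocompressedrefs:
            # Compressed references are already on by default and cannot be
            # disabled on this build: nothing to tune.
            return ["blank"]
        return ["blank", "-Xcompressedrefs", "-Xnocompressedrefs"]
    # Above 57 GB compressed references are off by default; enabling them is
    # the only meaningful alternative to the blank value.
    return ["blank", "-Xcompressedrefs"]
```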

    Constraints

    The following tables show a list of constraints that may be required in the definition of the study, depending on the tuned parameters:

    Memory

    Formula
    Notes

    jvm.j9vm_minHeapSize < jvm.j9vm_maxHeapSize

    jvm.j9vm_minNewSpace < jvm.j9vm_maxNewSpace && jvm.j9vm_minNewSpace < jvm.j9vm_minHeapSize && jvm.j9vm_maxNewSpace < jvm.j9vm_maxHeapSize

    jvm.j9vm_minOldSpace < jvm.j9vm_maxOldSpace && jvm.j9vm_minOldSpace < jvm.j9vm_minHeapSize && jvm.j9vm_maxOldSpace < jvm.j9vm_maxHeapSize

    Notice that

    • j9vm_newSpaceFixed is mutually exclusive with j9vm_minNewSpace and j9vm_maxNewSpace

    • j9vm_oldSpaceFixed is mutually exclusive with j9vm_minOldSpace and j9vm_maxOldSpace

    • the sum of j9vm_minNewSpace and j9vm_minOldSpace must equal j9vm_minHeapSize, so it is pointless to tune all three together. The relationship between the corresponding max values is more complex.
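    As a quick sanity check, the memory constraints listed in the table above can be evaluated against a candidate configuration before running a study. A minimal sketch, assuming a plain dict of parameter values (the function name and interface are illustrative, not an Akamas API):

```python
# Evaluate the memory constraints from the Constraints table against a
# candidate configuration expressed as a dict of parameter values (megabytes).

def satisfies_memory_constraints(p: dict) -> bool:
    return (
        p["j9vm_minHeapSize"] < p["j9vm_maxHeapSize"]
        and p["j9vm_minNewSpace"] < p["j9vm_maxNewSpace"]
        and p["j9vm_minNewSpace"] < p["j9vm_minHeapSize"]
        and p["j9vm_maxNewSpace"] < p["j9vm_maxHeapSize"]
        and p["j9vm_minOldSpace"] < p["j9vm_maxOldSpace"]
        and p["j9vm_minOldSpace"] < p["j9vm_minHeapSize"]
        and p["j9vm_maxOldSpace"] < p["j9vm_maxHeapSize"]
    )

# Example: a configuration consistent with the default domain rules
# (new space ~25% of heap, old space the remainder).
ok = satisfies_memory_constraints({
    "j9vm_minHeapSize": 1024, "j9vm_maxHeapSize": 4096,
    "j9vm_minNewSpace": 256, "j9vm_maxNewSpace": 1024,
    "j9vm_minOldSpace": 768, "j9vm_maxOldSpace": 3072,
})
```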

    Amazon Linux 2

    This page describes the Optimization Pack for the component type Amazon Linux 2.

    Metrics


    Amazon Linux 2022

    This page describes the Optimization Pack for the component type Amazon Linux 2022.

    Metrics


    jvm_heap_util

    percent

    The utilization % of heap memory

    jvm_memory_used

    bytes

    The total amount of memory used across all the JVM memory pools

    jvm_memory_used_details

    bytes

    The total amount of memory used broken down by pool (e.g., code-cache, compressed-class-space)

    jvm_memory_buffer_pool_used

    bytes

    The total amount of bytes used by buffers within the JVM buffer memory pool

    jvm_gc_time

    percent

    The % of wall clock time the JVM spent doing stop the world garbage collection activities

    jvm_gc_time_details

    percent

    The % of wall clock time the JVM spent doing stop the world garbage collection activities broken down by type of garbage collection algorithm (e.g., ParNew)

    jvm_gc_count

    collections/s

    The total number of stop the world JVM garbage collections that have occurred per second

    jvm_gc_count_details

    collections/s

    The total number of stop the world JVM garbage collections that have occurred per second, broken down by type of garbage collection algorithm (e.g., G1, CMS)

    jvm_gc_duration

    seconds

    The average duration of a stop the world JVM garbage collection

    jvm_gc_duration_details

    seconds

    The average duration of a stop the world JVM garbage collection broken down by type of garbage collection algorithm (e.g., G1, CMS)

    jvm_threads_current

    threads

    The total number of active threads within the JVM

    jvm_threads_deadlocked

    threads

    The total number of deadlocked threads within the JVM

    jvm_compilation_time

    milliseconds

    The total time spent by the JVM JIT compiler compiling bytecode

    You should select your own default value.

    You should select your own domain.

    yes

    Minimum heap size (in megabytes)

    j9vm_maxHeapSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Maximum heap size (in megabytes)

    j9vm_minFreeHeap

    real

    percent

    0.3

    0.1 → 0.5

    yes

    Specify the minimum % free heap required after global GC

    j9vm_maxFreeHeap

    real

    percent

    0.6

    0.4 → 0.9

    yes

    Specify the maximum % free heap required after global GC

    gencon

    gencon, subpool, optavgpause, optthruput, nogc

    yes

    GC policy to use

    j9vm_gcThreads

    integer

    threads

    You should select your own default value.

    1 → 64

    yes

    Number of threads the garbage collector uses for parallel operations

    j9vm_scvTenureAge

    integer

    10

    1 → 14

    yes

    Set the initial tenuring threshold for generational concurrent GC policy

    j9vm_scvAdaptiveTenureAge

    categorical

    blank

    blank, -Xgc:scvNoAdaptiveTenure

    yes

    Enable the adaptive tenure age for generational concurrent GC policy

    j9vm_newSpaceFixed

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The fixed size of the new area when using the gencon GC policy. Must not be set alongside min or max

    j9vm_minNewSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The initial size of the new area when using the gencon GC policy

    j9vm_maxNewSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum size of the new area when using the gencon GC policy

    j9vm_oldSpaceFixed

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The fixed size of the old area when using the gencon GC policy. Must not be set alongside min or max

    j9vm_minOldSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The initial size of the old area when using the gencon GC policy

    j9vm_maxOldSpace

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum size of the old area when using the gencon GC policy

    j9vm_concurrentScavenge

    categorical

    concurrentScavenge

    concurrentScavenge, noConcurrentScavenge

    yes

    Support pause-less garbage collection mode with gencon

    j9vm_gcPartialCompact

    categorical

    nopartialcompactgc

    nopartialcompactgc, partialcompactgc

    yes

    Enable partial compaction

    j9vm_concurrentMeter

    categorical

    soa

    soa, loa, dynamic

    yes

    Determines which area is monitored by the concurrent mark

    j9vm_concurrentBackground

    integer

    0

    0 → 128

    yes

    The number of background threads assisting the mutator threads in concurrent mark

    j9vm_concurrentSlack

    integer

    megabytes

    0

    You should select your own domain.

    yes

    The target size of free heap space for concurrent collectors

    j9vm_concurrentLevel

    integer

    percent

    8

    0 → 100

    yes

    The ratio between the amount of heap allocated and the amount of heap marked

    j9vm_gcCompact

    categorical

    blank

    blank, -Xcompactgc, -Xnocompactgc

    yes

    Enables full compaction on all garbage collections (system and global)

    j9vm_minGcTime

    real

    percent

    0.05

    0.0 → 1.0

    yes

    The minimum percentage of time to be spent in garbage collection, triggering the resize of the heap to meet the specified values

    j9vm_maxGcTime

    real

    percent

    0.13

    0.0 → 1.0

    yes

    The maximum percentage of time to be spent in garbage collection, triggering the resize of the heap to meet the specified values

    j9vm_loa

    categorical

    loa

    loa, noloa

    yes

    Enable the allocation of the large object area (LOA) during garbage collection

    j9vm_loa_initial

    real

    0.05

    0.0 → 0.95

    yes

    The initial portion of the tenure area allocated to the large object area (LOA)

    j9vm_loa_minimum

    real

    0.01

    0.0 → 0.95

    yes

    The minimum portion of the tenure area allocated to the large object area (LOA)

    j9vm_loa_maximum

    real

    0.5

    0.0 → 0.95

    yes

    The maximum portion of the tenure area allocated to the large object area (LOA)

    noOpt

    noOpt, cold, warm, hot, veryHot, scorching

    yes

    Force the JIT compiler to compile all methods at a specific optimization level

    j9vm_compilationThreads

    integer

    threads

    You should select your own default value.

    1 → 7

    yes

    Number of JIT threads

    j9vm_codeCacheTotal

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Maximum size limit in MB for the JIT code cache

    j9vm_jit_count

    integer

    10000

    0 → 1000000

    yes

    The number of times a method is called before it is compiled

    blank

    blank, -XlockReservation

    no

    Enables an optimization that presumes a monitor is owned by the thread that last acquired it

    j9vm_compressedReferences

    categorical

    blank

    blank, -Xcompressedrefs, -Xnocompressedrefs

    yes

    Enable/disable the use of compressed references

    j9vm_aggressiveOpts

    categorical

    blank

    blank, -Xaggressive

    yes

    Enable the use of aggressive performance optimization features, which are expected to become default in upcoming releases

    j9vm_virtualized

    categorical

    blank

    blank, -Xtune:virtualized

    yes

    Optimize the VM for virtualized environment, reducing CPU usage when idle

    j9vm_shareclasses

    categorical

    blank

    blank, -Xshareclasses

    yes

    Enable class sharing

    j9vm_quickstart

    categorical

    blank

    blank, -Xquickstart

    yes

    Run JIT with only a subset of optimizations, improving the performance of short-running applications

    j9vm_minimizeUserCpu

    categorical

    blank

    blank, -Xthr:minimizeUserCPU

    yes

    Minimizes user-mode CPU usage in thread synchronization where possible

    j9vm_minOldSpace

    75% of j9vm_minHeapSize

    must not exceed j9vm_minHeapSize

    j9vm_maxOldSpace

    same as j9vm_maxHeapSize

    must not exceed j9vm_maxHeapSize

    j9vm_gcthreads

    number of CPUs - 1, up to a maximum of 64

    capped at the default; there is no benefit in exceeding that value

    j9vm_compressedReferences

    enabled for j9vm_maxHeapSize <= 57 GB

    jvm.j9vm_loa_minimum <= jvm.j9vm_loa_initial && jvm.j9vm_loa_initial <= jvm.j9vm_loa_maximum

    jvm.j9vm_minFreeHeap + 0.05 < jvm.j9vm_maxFreeHeap

    jvm.j9vm_minGcTime < jvm.j9vm_maxGcTime

    CPU
    Metric
    Description

    cpu_load_avg

    tasks

    The system load average (i.e., the number of active tasks in the system)

    cpu_num

    CPUs

    The number of CPUs available in the system (physical and logical)

    Memory

    Metric
    Description

    mem_fault

    faults/s

    The number of memory faults (minor+major)

    mem_fault_major

    faults/s

    The number of major memory faults (i.e., faults that cause disk access) per second

    Disk & Filesystem

    Metric
    Description

    disk_io_inflight_details

    ops

    The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01)

    disk_iops

    ops/s

    The average number of IO disk operations per second across all disks

    Network

    Metric
    Description

    network_in_bytes_details

    bytes/s

    The number of inbound network packets in bytes per second broken down by network device (e.g., wlp4s0)

    network_out_bytes_details

    bytes/s

    The number of outbound network packets in bytes per second broken down by network device (e.g., eth01)

    Others

    Metric
    Description

    os_context_switch

    switches/s

    The number of context switches per second

    proc_blocked

    processes

    The number of processes blocked (e.g., on IO or swapping)

    Parameters

    CPU

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    os_cpuSchedMinGranularity

    integer

    nanoseconds

    Memory

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    os_MemorySwappiness

    integer

    percent

    Network

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    os_NetworkNetCoreSomaxconn

    integer

    connections

    Storage

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    os_StorageReadAhead

    integer

    kilobytes

    CPU
    Metric
    Description

    cpu_load_avg

    tasks

    The system load average (i.e., the number of active tasks in the system)

    cpu_num

    CPUs

    The number of CPUs available in the system (physical and logical)

    Memory

    Metric
    Description

    mem_fault

    faults/s

    The number of memory faults (minor+major)

    mem_fault_major

    faults/s

    The number of major memory faults (i.e., faults that cause disk access) per second

    Disk & Filesystem

    Metric
    Description

    disk_io_inflight_details

    ops

    The number of IO disk operations in progress (outstanding) broken down by disk (e.g., disk /dev/nvme01)

    disk_iops

    ops/s

    The average number of IO disk operations per second across all disks

    Network

    Metric
    Description

    network_in_bytes_details

    bytes/s

    The number of inbound network packets in bytes per second broken down by network device (e.g., wlp4s0)

    network_out_bytes_details

    bytes/s

    The number of outbound network packets in bytes per second broken down by network device (e.g., eth01)

    Others

    Metric
    Description

    os_context_switch

    switches/s

    The number of context switches per second

    proc_blocked

    processes

    The number of processes blocked (e.g., on IO or swapping)

    Parameters

    CPU

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    os_cpuSchedMinGranularity

    integer

    nanoseconds

    Memory

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    os_MemorySwappiness

    integer

    percent

    Network

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    os_NetworkNetCoreSomaxconn

    integer

    connections

    Storage

    Parameter
    Type
    Unit
    Default Value
    Domain
    Restart
    Description

    os_StorageReadAhead

    integer

    kilobytes

    Dynatrace metrics mapping

    This page describes the mapping between the metrics provided by Dynatrace and the Akamas metrics for each supported component type.

    Component Type
    Notes

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    cpu_used

    CPUs

    The average number of CPUs used in the system (physical and logical)

    cpu_util_details

    percent

    The average CPU utilization % broken down by usage type and CPU number (e.g., cp1 user, cp2 system, cp3 soft-irq)

    mem_fault_minor

    faults/s

    The number of minor memory faults (i.e., faults that do not cause disk access) per second

    mem_swapins

    pages/s

    The number of memory pages swapped in per second

    mem_swapouts

    pages/s

    The number of memory pages swapped out per second

    mem_total

    bytes

    The total amount of installed memory

    mem_used

    bytes

    The total amount of memory used

    mem_used_nocache

    bytes

    The total amount of memory used without considering memory reserved for caching purposes

    mem_util

    percent

    The memory utilization % (i.e., the % of memory used)

    mem_util_details

    percent

    The memory utilization % (i.e., the % of memory used) broken down by usage type (e.g., active memory)

    mem_util_nocache

    percent

    The memory utilization % (i.e., the % of memory used) without considering memory reserved for caching purposes

    disk_iops_details

    ops/s

    The number of IO disk operations per second broken down by disk (e.g., disk /dev/nvme01)

    disk_iops_reads

    ops/s

    The average number of IO disk-read operations per second across all disks

    disk_iops_writes

    ops/s

    The average number of IO disk-write operations per second across all disks

    disk_read_bytes

    bytes/s

    The number of bytes per second read across all disks

    disk_read_bytes_details

    bytes/s

    The number of bytes per second read broken down by disk (e.g., disk C://)

    disk_read_write_bytes

    bytes/s

    The number of bytes per second read and written across all disks

    disk_response_time_details

    seconds

    The average response time of IO disk operations broken down by disk (e.g., disk C://)

    disk_response_time_read

    seconds

    The average response time of read disk operations

    disk_response_time_worst

    seconds

    The average response time of IO disk operations of the slowest disk

    disk_response_time_write

    seconds

    The average response time of write on disk operations

    disk_swap_used

    bytes

    The total amount of space used by swap disks

    disk_swap_util

    percent

    The average space utilization % of swap disks

    disk_util_details

    percent

    The utilization % of a disk (i.e., how much time the disk is busy doing work), broken down by disk (e.g., disk D://)

    disk_write_bytes

    bytes/s

    The number of bytes per second written across all disks

    disk_write_bytes_details

    bytes/s

    The number of bytes per second written to disk, broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE)

    filesystem_size

    bytes

    The size of filesystems broken down by type and device (e.g., filesystem of type ext4 for device /dev/nvme01)

    filesystem_used

    bytes

    The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01)

    filesystem_util

    percent

    The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1)

    network_tcp_retrans

    retrans/s

    The number of network TCP retransmissions per second

    1500000

    300000 → 30000000

    no

    Minimal preemption granularity (in nanoseconds) for CPU bound tasks

    os_cpuSchedWakeupGranularity

    integer

    nanoseconds

    2000000

    400000 → 40000000

    no

    Scheduler Wakeup Granularity (in nanoseconds)

    os_CPUSchedMigrationCost

    integer

    nanoseconds

    500000

    100000 → 5000000

    no

    Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations

    os_CPUSchedChildRunsFirst

    integer

    0

    0, 1

    no

    A freshly forked child runs before the parent continues execution

    os_CPUSchedLatency

    integer

    nanoseconds

    12000000

    2400000 → 240000000

    no

    Targeted preemption latency (in nanoseconds) for CPU bound tasks

    os_CPUSchedAutogroupEnabled

    integer

    0

    0, 1

    no

    Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads

    os_CPUSchedNrMigrate

    integer

    32

    3 → 320

    no

    Scheduler NR Migrate

    60

    0 → 100

    no

    The percentage of RAM free space for which the kernel will start swapping pages to disk

    os_MemoryVmVfsCachePressure

    integer

    100

    10 → 100

    no

    VFS Cache Pressure

    os_MemoryVmCompactionProactiveness

    integer

    20

    0 → 100

    no

    Determines how aggressively compaction is done in the background

    os_MemoryVmPageLockUnfairness

    integer

    5

    0 → 1000

    no

    Sets the level of unfairness in the page lock queue

    os_MemoryVmWatermarkScaleFactor

    integer

    10

    0 → 1000

    no

    The amount of memory, expressed as fractions of 10'000, left in a node/system before kswapd is woken up and how much memory needs to be free before kswapd goes back to sleep

    os_MemoryVmWatermarkBoostFactor

    integer

    15000

    0 → 30000

    no

    The level of reclaim when the memory is being fragmented, expressed as fractions of 10'000 of a zone's high watermark

    os_MemoryVmMinFree

    integer

    67584

    10240 → 1024000

    no

    Minimum Free Memory (in kbytes)

    os_MemoryTransparentHugepageEnabled

    categorical

    madvise

    always, never, madvise

    no

    Transparent Hugepage Enablement Flag

    os_MemoryTransparentHugepageDefrag

    categorical

    madvise

    always, never, defer+madvise, madvise, defer

    no

    Transparent Hugepage Enablement Defrag

    os_MemorySwap

    categorical

    swapon

    swapon, swapoff

    no

    Memory Swap

    os_MemoryVmDirtyRatio

    integer

    20

    1 → 99

    no

    When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write

    os_MemoryVmDirtyBackgroundRatio

    integer

    10

    1 → 99

    no

    When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background

    os_MemoryVmDirtyExpire

    integer

    centiseconds

    3000

    300 → 30000

    no

    The age (in centiseconds) after which dirty memory pages are considered old enough to be written out by the kernel flusher threads

    os_MemoryVmDirtyWriteback

    integer

    centiseconds

    500

    50 → 5000

    no

    Memory Dirty Writeback (in centisecs)

    128

    12 → 8192

    no

    Network Max Connections

    os_NetworkNetCoreNetdevMaxBacklog

    integer

    packets

    1000

    100 → 10000

    no

    Network Max Backlog

    os_NetworkNetIpv4TcpMaxSynBacklog

    integer

    connections

    256

    52 → 5120

    no

    Network IPV4 Max Sync Backlog

    os_NetworkNetCoreNetdevBudget

    integer

    300

    30 → 30000

    no

    Network Budget

    os_NetworkNetCoreRmemMax

    integer

    212992

    21299 → 2129920

    no

    Maximum network receive buffer size that applications can request

    os_NetworkNetCoreWmemMax

    integer

    212992

    21299 → 2129920

    no

    Maximum network transmit buffer size that applications can request

    os_NetworkNetIpv4TcpSlowStartAfterIdle

    integer

    1

    0, 1

    no

    Network Slow Start After Idle Flag

    os_NetworkNetIpv4TcpFinTimeout

    integer

    60

    6 → 600

    no

    Network TCP timeout

    os_NetworkRfs

    integer

    0

    0 → 131072

    no

    If enabled, increases the data cache hit rate by steering the kernel processing of packets to the CPU where the application thread consuming the packet is running

    128

    0 → 4096

    no

    Read-ahead speeds up file access by pre-fetching data and loading it into the page cache so that it can be available earlier in memory instead of from disk

    os_StorageNrRequests

    integer

    32

    12 → 1280

    no

    Storage Number of Requests

    os_StorageRqAffinity

    integer

    1

    1, 2

    no

    Storage Requests Affinity

    os_StorageQueueScheduler

    categorical

    none

    none, kyber, mq-deadline, bfq

    no

    Storage Queue Scheduler Type

    os_StorageNomerges

    integer

    0

    0 → 2

    no

    Enables the user to disable the lookup logic involved with IO merging requests in the block layer. By default (0) all merges are enabled. With 1 only simple one-hit merges will be tried. With 2 no merge algorithms will be tried

    os_StorageMaxSectorsKb

    integer

    kilobytes

    256

    32 → 256

    no

    The largest IO size that the OS can issue to a block device
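Several of the OS parameters above correspond one-to-one to standard Linux sysctl keys (e.g. `vm.dirty_ratio`, `vm.vfs_cache_pressure`, `net.core.rmem_max`). The sketch below illustrates that correspondence only — the sysctl key names are standard Linux, but the mapping and helper are illustrative assumptions, not how Akamas itself applies parameters:

```python
# Illustrative mapping from a few Akamas OS parameters to the Linux
# sysctl keys they tune. The mapping is for illustration only; Akamas
# applies parameters through its own workflow operators.
SYSCTL_KEYS = {
    "os_MemoryVmDirtyRatio": "vm.dirty_ratio",
    "os_MemoryVmDirtyBackgroundRatio": "vm.dirty_background_ratio",
    "os_MemoryVmVfsCachePressure": "vm.vfs_cache_pressure",
    "os_NetworkNetCoreRmemMax": "net.core.rmem_max",
    "os_NetworkNetCoreWmemMax": "net.core.wmem_max",
    "os_NetworkNetCoreNetdevMaxBacklog": "net.core.netdev_max_backlog",
}

def sysctl_command(param: str, value: int) -> str:
    """Render the sysctl invocation that would apply one parameter."""
    return f"sysctl -w {SYSCTL_KEYS[param]}={value}"

print(sysctl_command("os_MemoryVmDirtyRatio", 20))
```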

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    cpu_used

    CPUs

    The average number of CPUs used in the system (physical and logical)

    cpu_util_details

    percent

    The average CPU utilization % broken down by usage type and CPU number (e.g., cpu1 user, cpu2 system, cpu3 soft-irq)

    mem_fault_minor

    faults/s

    The number of minor memory faults (i.e., faults that do not cause disk access) per second

    mem_swapins

    pages/s

    The number of memory pages swapped in per second

    mem_swapouts

    pages/s

    The number of memory pages swapped out per second

    mem_total

    bytes

    The total amount of installed memory

    mem_used

    bytes

    The total amount of memory used

    mem_used_nocache

    bytes

    The total amount of memory used without considering memory reserved for caching purposes

    mem_util

    percent

    The memory utilization % (i.e, the % of memory used)

    mem_util_details

    percent

    The memory utilization % (i.e., the % of memory used) broken down by usage type (e.g., active memory)

    mem_util_nocache

    percent

    The memory utilization % (i.e., the % of memory used) without considering memory reserved for caching purposes

    disk_iops_details

    ops/s

    The number of IO disk-write operations per second broken down by disk (e.g., disk /dev/nvme01)

    disk_iops_reads

    ops/s

    The average number of IO disk-read operations per second across all disks

    disk_iops_writes

    ops/s

    The average number of IO disk-write operations per second across all disks

    disk_read_bytes

    bytes/s

    The number of bytes per second read across all disks

    disk_read_bytes_details

    bytes/s

    The number of bytes per second read from the disks broken down by disk (e.g., disk C://)

    disk_read_write_bytes

    bytes/s

    The number of bytes per second read and written across all disks

    disk_response_time_details

    seconds

    The average response time of IO disk operations broken down by disk (e.g., disk C://)

    disk_response_time_read

    seconds

    The average response time of read disk operations

    disk_response_time_worst

    seconds

    The average response time of IO disk operations of the slowest disk

    disk_response_time_write

    seconds

    The average response time of write on disk operations

    disk_swap_used

    bytes

    The total amount of space used by swap disks

    disk_swap_util

    percent

    The average space utilization % of swap disks

    disk_util_details

    percent

    The utilization % of disks (i.e., how much time a disk is busy doing work) broken down by disk (e.g., disk D://)

    disk_write_bytes

    bytes/s

    The number of bytes per second written across all disks

    disk_write_bytes_details

    bytes/s

    The number of bytes per second written to the disks broken down by disk and type of operation (e.g., disk /dev/nvme01 and operation WRITE)

    filesystem_size

    bytes

    The size of filesystems broken down by type and device (e.g., filesystem of type ext4 for device /dev/nvme01)

    filesystem_used

    bytes

    The amount of space used on the filesystems broken down by type and device (e.g., filesystem of type zfs on device /dev/nvme01)

    filesystem_util

    percent

    The space utilization % of filesystems broken down by type and device (e.g., filesystem of type overlayfs on device /dev/loop1)

    network_tcp_retrans

    retrans/s

    The number of network TCP retransmissions per second

    1500000

    300000 → 30000000

    no

    Minimal preemption granularity (in nanoseconds) for CPU bound tasks

    os_cpuSchedWakeupGranularity

    integer

    nanoseconds

    2000000

    400000 → 40000000

    no

    Scheduler Wakeup Granularity (in nanoseconds)

    os_CPUSchedMigrationCost

    integer

    nanoseconds

    500000

    100000 → 5000000

    no

    Amount of time (in nanoseconds) after the last execution that a task is considered to be "cache hot" in migration decisions. A "hot" task is less likely to be migrated to another CPU, so increasing this variable reduces task migrations

    os_CPUSchedChildRunsFirst

    integer

    0

    0, 1

    no

    A freshly forked child runs before the parent continues execution

    os_CPUSchedLatency

    integer

    nanoseconds

    12000000

    2400000 → 240000000

    no

    Targeted preemption latency (in nanoseconds) for CPU bound tasks

    os_CPUSchedAutogroupEnabled

    integer

    0

    0, 1

    no

    Enables the Linux task auto-grouping feature, where the kernel assigns related tasks to groups and schedules them together on CPUs to achieve higher performance for some workloads

    os_CPUSchedNrMigrate

    integer

    32

    3 → 320

    no

    Scheduler NR Migrate

    60

    0 → 100

    no

    The percentage of RAM free space for which the kernel will start swapping pages to disk

    os_MemoryVmVfsCachePressure

    integer

    100

    10 → 100

    no

    VFS Cache Pressure

    os_MemoryVmCompactionProactiveness

    integer

    20

    0 → 100

    no

    Determines how aggressively compaction is done in the background

    os_MemoryVmPageLockUnfairness

    integer

    5

    0 → 1000

    no

    Sets the level of unfairness in the page lock queue

    os_MemoryVmWatermarkScaleFactor

    integer

    10

    0 → 1000

    no

    The amount of memory, expressed as fractions of 10'000, left in a node/system before kswapd is woken up and how much memory needs to be free before kswapd goes back to sleep

    os_MemoryVmWatermarkBoostFactor

    integer

    15000

    0 → 30000

    no

    The level of reclaim when the memory is being fragmented, expressed as fractions of 10'000 of a zone's high watermark

    os_MemoryVmMinFree

    integer

    67584

    10240 → 1024000

    no

    Minimum Free Memory (in kbytes)

    os_MemoryTransparentHugepageEnabled

    categorical

    madvise

    always, never, madvise

    no

    Transparent Hugepage Enablement Flag

    os_MemoryTransparentHugepageDefrag

    categorical

    madvise

    always, never, defer+madvise, madvise, defer

    no

    Transparent Hugepage Enablement Defrag

    os_MemorySwap

    categorical

    swapon

    swapon, swapoff

    no

    Memory Swap

    os_MemoryVmDirtyRatio

    integer

    20

    1 → 99

    no

    When the dirty memory pages exceed this percentage of the total memory, processes are forced to write dirty buffers during their time slice instead of continuing to write

    os_MemoryVmDirtyBackgroundRatio

    integer

    10

    1 → 99

    no

    When the dirty memory pages exceed this percentage of the total memory, the kernel begins to write them asynchronously in the background

    os_MemoryVmDirtyExpire

    integer

    centiseconds

    3000

    300 → 30000

    no

    The age (in centiseconds) after which dirty memory pages are considered old enough to be written out by the kernel flusher threads

    os_MemoryVmDirtyWriteback

    integer

    centiseconds

    500

    50 → 5000

    no

    Memory Dirty Writeback (in centisecs)

    128

    12 → 8192

    no

    Network Max Connections

    os_NetworkNetCoreNetdevMaxBacklog

    integer

    packets

    1000

    100 → 10000

    no

    Network Max Backlog

    os_NetworkNetIpv4TcpMaxSynBacklog

    integer

    connections

    256

    52 → 5120

    no

    Network IPV4 Max Sync Backlog

    os_NetworkNetCoreNetdevBudget

    integer

    300

    30 → 30000

    no

    Network Budget

    os_NetworkNetCoreRmemMax

    integer

    212992

    21299 → 2129920

    no

    Maximum network receive buffer size that applications can request

    os_NetworkNetCoreWmemMax

    integer

    212992

    21299 → 2129920

    no

    Maximum network transmit buffer size that applications can request

    os_NetworkNetIpv4TcpSlowStartAfterIdle

    integer

    1

    0, 1

    no

    Network Slow Start After Idle Flag

    os_NetworkNetIpv4TcpFinTimeout

    integer

    60

    6 → 600

    no

    Network TCP timeout

    os_NetworkRfs

    integer

    0

    0 → 131072

    no

    If enabled, increases the data cache hit rate by steering the kernel processing of packets to the CPU where the application thread consuming the packet is running

    128

    0 → 4096

    no

    Read-ahead speeds up file access by pre-fetching data and loading it into the page cache so that it can be available earlier in memory instead of from disk

    os_StorageNrRequests

    integer

    32

    12 → 1280

    no

    Storage Number of Requests

    os_StorageRqAffinity

    integer

    1

    1, 2

    no

    Storage Requests Affinity

    os_StorageQueueScheduler

    categorical

    none

    none, kyber, mq-deadline, bfq

    no

    Storage Queue Scheduler Type

    os_StorageNomerges

    integer

    0

    0 → 2

    no

    Enables the user to disable the lookup logic involved with IO merging requests in the block layer. By default (0) all merges are enabled. With 1 only simple one-hit merges will be tried. With 2 no merge algorithms will be tried

    os_StorageMaxSectorsKb

    integer

    kilobytes

    256

    32 → 256

    no

    The largest IO size that the OS can issue to a block device

    Linux

    Component metric
    Labels
    Static labels
    Dynatrace metric
    Scale

    cpu_load_avg

    builtin:host.cpu.load

    JVM

    Component metric
    Dynatrace metric
    Scale
    Aggregate multiple Dynatrace entities
    Multiple entities aggregation

    jvm_gc_count

    builtin:tech.jvm.memory.pool.collectionCount:merge(poolname,gcname):sum

    1/60

    Yes

    Web Application

    Component metric
    Dynatrace metric
    Default Value
    Scale

    requests_response_time

    builtin:service.response.time

    0

    0.000001

    requests_response_time_min

    Kubernetes Container and Docker Container

    Component Metric
    Dynatrace Metric
    Scale
    Aggregate multiple Dynatrace entities
    Multiple entities aggregation

    container_cpu_limit

    builtin:containers.cpu.limit

    Yes

    Kubernetes Pod

    Component Metric
    Dynatrace Metric
    Default Value
    Aggregate multiple Dynatrace entities
    Multiple entities aggregation

    k8s_pod_cpu_limit

    builtin:cloud.kubernetes.pod.cpuLimits

    Yes

    Kubernetes workload

    Component Metric
    Dynatrace Metric
    Scale
    Aggregate multiple Dynatrace entities
    Multiple entities aggregation

    k8s_workload_desired_pods

    builtin:kubernetes.workload.pods_desired

    No


    Prometheus metrics mapping

    This page describes the mapping between the metrics provided by Prometheus and the Akamas metrics for each supported component type

    Component Type
    Notes

    cpu_num

    N/A

    cpu_util

    builtin:host.cpu.usage

    0.01

    cpu_util_details

    mode:

    • idle

    • user

    • system

    • iowait

    • builtin:host.cpu.idle (mode=idle)

    • builtin:host.cpu.system (mode=system)

    • builtin:host.cpu.user (mode=user)

    • builtin:host.cpu.iowait (mode=iowait)

    0.01

    mem_util

    N/A

    mem_util_nocache

    builtin:host.mem.usage

    0.01

    mem_util_details

    N/A

    mem_used

    N/A

    mem_used_nocache

    builtin:host.mem.used

    mem_total

    N/A

    mem_fault

    builtin:host.mem.avail.pfps

    mem_fault_minor

    N/A

    mem_fault_major

    N/A

    mem_swapins

    N/A

    mem_swapouts

    N/A

    disk_swap_util

    N/A

    disk_swap_used

    N/A

    filesystem_util

    • Disk

    builtin:host.disk.usedPct

    filesystem_used

    N/A

    filesystem_size

    N/A

    disk_util_details

    • Disk

    builtin:host.disk.free

    0.01

    disk_iops_writes

    N/A

    disk_iops_reads

    N/A

    disk_iops

    N/A

    disk_iops_details

    N/A

    disk_response_time_worst

    N/A

    disk_response_time

    N/A

    disk_io_inflight_details

    N/A

    0.01

    disk_write_bytes

    N/A

    disk_read_bytes

    N/A

    disk_read_write_bytes

    N/A

    disk_write_bytes_details

    • Disk

    builtin:host.disk.bytesWritten

    disk_read_bytes_details

    • Disk

    builtin:host.disk.bytesRead

    disk_response_time_details

    • Disk

    builtin:host.disk.readTime

    0.001

    proc_blocked

    N/A

    os_context_switch

    N/A

    network_tcp_retrans

    N/A

    network_in_bytes_details

    • Network interface

    builtin:host.net.nic.bytesRx

    network_out_bytes_details

    • Network interface

    builtin:host.net.nic.bytesTx

    avg

    jvm_gc_time

    builtin:tech.jvm.memory.gc.suspensionTime

    0.01

    Yes

    avg

    jvm_heap_size

    builtin:tech.jvm.memory.runtime.max

    Yes

    avg

    jvm_heap_committed

    Yes

    avg

    jvm_heap_used

    Yes

    avg

    jvm_off_heap_used

    Yes

    avg

    jvm_heap_old_gen_size

    Yes

    avg

    jvm_heap_old_gen_used

    Yes

    avg

    jvm_heap_young_gen_size

    Yes

    avg

    jvm_heap_young_gen_used

    Yes

    avg

    jvm_threads_current

    builtin:tech.jvm.threads.count

    Yes

    avg

    builtin:service.response.time:min

    0

    0.000001

    requests_response_time_max

    builtin:service.response.time:max

    0

    0.000001

    requests_throughput

    builtin:service.errors.total.successCount

    0

    1/60

    requests_error_rate

    builtin:service.errors.total.rate

    0

    0.01

    requests_response_time_p50

    builtin:service.response.time:percentile(50)

    0

    0.001

    requests_response_time_p85

    builtin:service.response.time:percentile(85)

    0

    0.001

    requests_response_time_p90

    builtin:service.response.time:percentile(90)

    0

    0.001

    requests_response_time_p95

    builtin:service.response.time:percentile(95)

    0

    0.001

    requests_response_time_p99

    builtin:service.response.time:percentile(99)

    0

    0.001
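The Scale values in the Dynatrace tables above are multiplicative factors applied to the raw Dynatrace value to obtain the Akamas metric in its base unit: 0.01 turns a percentage into a fraction, 0.000001 turns microseconds into seconds, 1/60 turns a per-minute count into a per-second rate. A minimal sketch, assuming the scale is applied as a simple multiplication:

```python
def to_akamas_value(raw: float, scale: float = 1.0) -> float:
    """Apply a Dynatrace-to-Akamas scale factor (a plain multiplier)."""
    return raw * scale

# builtin:service.response.time is reported in microseconds; scale
# 0.000001 yields seconds for requests_response_time.
assert abs(to_akamas_value(250_000, 0.000001) - 0.25) < 1e-9

# builtin:containers.cpu.usagePercent with scale 0.01 yields a 0-1
# utilization fraction.
assert abs(to_akamas_value(75, 0.01) - 0.75) < 1e-9
```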

    avg

    container_cpu_util

    builtin:containers.cpu.usagePercent

    0.01

    Yes

    avg

    container_cpu_util_max

    builtin:containers.cpu.usagePercent

    0.01

    Yes

    max

    container_cpu_throttled_millicores

    builtin:containers.cpu.throttledMilliCores

    Yes

    avg

    container_cpu_throttle_time

    builtin:containers.cpu.throttledTime

    1 / 10^9 / 60

    Yes

    avg

    container_cpu_used

    builtin:containers.cpu.usageMilliCores

    Yes

    avg

    container_cpu_used_max

    builtin:containers.cpu.usageMilliCores

    Yes

    max

    container_memory_limit

    builtin:containers.memory.limitBytes

    Yes

    avg

    container_memory_used

    builtin:containers.memory.residentSetBytes

    Yes

    avg

    container_memory_used_max

    builtin:containers.memory.residentSetBytes

    Yes

    max

    container_memory_util

    builtin:containers.memory.usagePercent

    0.01

    Yes

    avg

    container_memory_util_max

    builtin:containers.memory.usagePercent

    0.01

    Yes

    max

    container_oom_kills_count

    builtin:containers.memory.outOfMemoryKills

    1/60

    Yes

    avg

    avg

    k8s_pod_cpu_request

    builtin:cloud.kubernetes.pod.cpuRequests

    Yes

    avg

    k8s_pod_memory_limit

    builtin:cloud.kubernetes.pod.memoryLimits

    Yes

    avg

    k8s_pod_memory_request

    builtin:cloud.kubernetes.pod.memoryRequests

    Yes

    avg

    k8s_pod_restarts

    builtin:kubernetes.container.restarts:merge(k8s.container.name):sum

    0

    Yes

    avg

    k8s_workload_running_pods

    builtin:kubernetes.pods:filter(eq(pod_phase,Running))

    No

    k8s_workload_cpu_limit

    builtin:kubernetes.workload.limits_cpu

    No

    k8s_workload_cpu_request

    builtin:kubernetes.workload.requests_cpu

    No

    k8s_workload_memory_limit

    builtin:kubernetes.workload.limits_memory

    No

    k8s_workload_memory_request

    builtin:kubernetes.workload.requests_memory

    No

    k8s_workload_cpu_used

    builtin:containers.cpu.usageMilliCores

    Yes

    sum

    k8s_workload_memory_used

    builtin:containers.memory.residentSetBytes

    Yes

    sum

    builtin:tech.jvm.memory.pool.committed:filter(ne(poolname,Metaspace),ne(poolname,Code Cache),ne(poolname,CodeHeap 'non-nmethods'),ne(poolname,CodeHeap 'non-profiled nmethods'),ne(poolname,CodeHeap 'profiled nmethods'),ne(poolname,Compressed Class Space),ne(poolname,class storage),ne(poolname,miscellaneous non-heap storage),ne(poolname,JIT code cache),ne(poolname,JIT data cache)):merge(poolname):sum
    builtin:tech.jvm.memory.pool.used:filter(ne(poolname,Metaspace),ne(poolname,Code Cache),ne(poolname,CodeHeap 'non-nmethods'),ne(poolname,CodeHeap 'non-profiled nmethods'),ne(poolname,CodeHeap 'profiled nmethods'),ne(poolname,Compressed Class Space),ne(poolname,class storage),ne(poolname,miscellaneous non-heap storage),ne(poolname,JIT code cache),ne(poolname,JIT data cache)):merge(poolname):sum
    builtin:tech.jvm.memory.pool.used:filter(or(eq(poolname,Metaspace),eq(poolname,Code Cache),eq(poolname,CodeHeap 'non-nmethods'),eq(poolname,CodeHeap 'non-profiled nmethods'),eq(poolname,CodeHeap 'profiled nmethods'),eq(poolname,Compressed Class Space),eq(poolname,class storage),eq(poolname,miscellaneous non-heap storage),eq(poolname,JIT code cache),eq(poolname,JIT data cache))):merge(poolname):sum
    builtin:tech.jvm.memory.pool.max:filter(or(eq(poolname,CMS Old Gen),eq(poolname,G1 Old Gen),eq(poolname,PS Old Gen),eq(poolname,Tenured Gen),eq(poolname,tenured-LOA),eq(poolname,tenured-SOA))):merge(poolname):sum
    builtin:tech.jvm.memory.pool.used:filter(or(eq(poolname,CMS Old Gen),eq(poolname,G1 Old Gen),eq(poolname,PS Old Gen),eq(poolname,Tenured Gen),eq(poolname,tenured-LOA),eq(poolname,tenured-SOA))):merge(poolname):sum
    builtin:tech.jvm.memory.pool.max:filter(or(eq(poolname,Eden Space),eq(poolname,G1 Survivor Space),eq(poolname,Par Eden Space),eq(poolname,Par Survivor Space),eq(poolname,PS Eden Space),eq(poolname,PS Survivor Space),eq(poolname,nursery-survivor),eq(poolname,nursery-allocate))):merge(poolname):sum
     builtin:tech.jvm.memory.pool.used:filter(or(eq(poolname,Eden Space),eq(poolname,G1 Survivor Space),eq(poolname,Par Eden Space),eq(poolname,Par Survivor Space),eq(poolname,PS Eden Space),eq(poolname,PS Survivor Space),eq(poolname,nursery-survivor),eq(poolname,nursery-allocate))):merge(poolname):sum

    The default metrics in this table are based on the and

    The default metrics in this table are based on the and

    The default metrics in this table are based on the , configured with the attached

    The default metrics in this table are based on the , extending the default queries with the attached

    The default metrics in this table are based on the

    Linux

    Component metric
    Prometheus query

    cpu_load_avg

    node_load1{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    cpu_num

    count(node_cpu_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$", mode="system" %FILTERS%})

    cpu_used

    sum by (job) (sum by (cpu, job) (rate(node_cpu_seconds_total{instance=~"$INSTANCE$", mode=~"user|system|softirq|irq|nice", job=~"$JOB$" %FILTERS%}[$DURATION$])))
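The queries in these tables use `$INSTANCE$`, `$JOB$` and `$DURATION$` placeholders, plus a `%FILTERS%` slot for extra label matchers, which Akamas fills in from the telemetry-instance configuration. A minimal sketch of how such a template expands — the substitution logic here is illustrative, not the actual Akamas code:

```python
def expand(template: str, *, instance: str, job: str,
           duration: str = "1m", filters: str = "") -> str:
    """Fill the placeholder slots of a PromQL template (illustrative)."""
    return (template.replace("$INSTANCE$", instance)
                    .replace("$JOB$", job)
                    .replace("$DURATION$", duration)
                    .replace("%FILTERS%", filters))

q = expand('node_load1{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}',
           instance="10.0.0.5:9100", job="node")
print(q)  # an empty %FILTERS% leaves just the two matchers
```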

    JVM

    Component metric
    Prometheus query

    jvm_heap_size

    avg(jvm_memory_bytes_max{area="heap" %FILTERS%})

    jvm_heap_committed

    avg(jvm_memory_bytes_committed{area="heap" %FILTERS%})

    jvm_heap_used

    avg(jvm_memory_bytes_used{area="heap" %FILTERS%})

    Kubernetes workload

    Component metric
    Prometheus query

    k8s_workload_desired_pods

    kube_deployment_spec_replicas{namespace=~"$NAMESPACE$", deployment=~"$DEPLOYMENT$" %FILTERS%}

    k8s_workload_running_pods

    kube_deployment_status_replicas_available{namespace=~"$NAMESPACE$", deployment=~"$DEPLOYMENT$" %FILTERS%}

    k8s_workload_ready_pods

    kube_deployment_status_replicas_ready{namespace=~"$NAMESPACE$", deployment=~"$DEPLOYMENT$" %FILTERS%}

    Kubernetes Pod

    Component metric
    Prometheus metric

    k8s_pod_cpu_used

    1e3 * avg(rate(container_cpu_usage_seconds_total{container="", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}[$DURATION$]))

    k8s_pod_cpu_request

    1e3 * avg(sum by (pod) (kube_pod_container_resource_requests{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}))

    k8s_pod_cpu_limit

    1e3 * avg(sum by (pod) (kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}))
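The `1e3` factor in these pod queries converts core values into millicores: `rate(container_cpu_usage_seconds_total[...])` yields CPU-seconds consumed per second (i.e., cores), and requests/limits are expressed in cores. The same arithmetic, worked over two samples of the cumulative counter:

```python
def millicores(prev_cpu_seconds: float, curr_cpu_seconds: float,
               window_seconds: float) -> float:
    """rate() over two samples of a cumulative CPU counter, in millicores."""
    cores = (curr_cpu_seconds - prev_cpu_seconds) / window_seconds
    return 1e3 * cores

# A pod that consumed 15 extra CPU-seconds over a 60 s window ran at
# an average of 0.25 cores, i.e. 250 millicores.
assert millicores(100.0, 115.0, 60) == 250.0
```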

    Kubernetes Container and Docker Container

    The following metrics are configured to work for Kubernetes. When using the Docker optimization pack, override the required metrics in the telemetry instance configuration.

    Component metric
    Prometheus query

    container_cpu_used

    1e3 * avg(rate(container_cpu_usage_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    container_cpu_used_max

    1e3 * max(rate(container_cpu_usage_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    container_cpu_util

    avg(rate(container_cpu_usage_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / on (pod) group_left kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    EC2

    Component metric
    Prometheus query

    cpu_util

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_cpuutilization_average{job='$JOB$'}/100

    network_in_bytes_details

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_network_in_sum{job='$JOB$'} * count_over_time(aws_ec2_network_in_sum{job='$JOB$'}[300s]) / 300)

    network_out_bytes_details

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_network_out_sum{job='$JOB$'} * count_over_time(aws_ec2_network_out_sum{job='$JOB$'}[300s]) / 300)
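The `* count_over_time(...[300s]) / 300` pattern in the two network queries converts CloudWatch's per-period byte totals into an average bytes-per-second rate: each sample of `aws_ec2_network_in_sum` is the total for a 5-minute period, so with one sample per 300 s window the expression reduces to the period total divided by 300. A worked sketch of that arithmetic (the function name and the one-sample-per-period assumption are illustrative):

```python
def bytes_per_second(period_total_bytes: float,
                     samples_in_window: int = 1,
                     window_seconds: float = 300.0) -> float:
    """Convert CloudWatch 5-minute byte totals to an average bytes/s rate."""
    return period_total_bytes * samples_in_window / window_seconds

# 90 MB received in one 5-minute period is an average of 300 kB/s.
assert bytes_per_second(90_000_000) == 300_000.0
```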

    Oracle Database

    Component metric
    Prometheus query

    oracle_sga_total_size

    oracledb_memory_size{component='SGA Target', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_sga_free_size

    oracledb_memory_size{component='Free SGA Memory Available', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_sga_max_size

    oracledb_memory_size{component='Maximum SGA Size', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    Web Application

    Component metric
    Prometheus query

    transactions_response_time

    avg(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

    transactions_response_time_max

    max(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

    transactions_response_time_min

    min(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)
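These queries divide the rate of the histogram's `_sum` series by the rate of its `_count` series, which yields the mean response time over the window; the `> 0` guard drops series with no traffic so idle endpoints do not produce 0/0 samples. The same computation on raw counter deltas:

```python
def mean_response_time(sum_delta: float, count_delta: float):
    """Mean latency over a window from Prometheus _sum/_count deltas.

    Returns None where the PromQL `> 0` guard would drop the sample.
    """
    if count_delta <= 0:
        return None
    return sum_delta / count_delta

# 12.5 s of cumulative latency over 50 requests is a 0.25 s average.
assert mean_response_time(12.5, 50) == 0.25
assert mean_response_time(0.0, 0) is None
```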


    cpu_util

    avg by (job) (sum by (cpu, job) (rate(node_cpu_seconds_total{instance=~"$INSTANCE$", mode=~"user|system|softirq|irq|nice", job=~"$JOB$" %FILTERS%}[$DURATION$])))

    cpu_util_details

    avg by (instance, cpu, mode, job) (sum by (instance, cpu, mode, job) (rate(node_cpu_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])))

    disk_io_inflight_details

    node_disk_io_now{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    disk_iops

    sum by (instance, job) (rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) + sum by (instance, job) (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_iops_details

    sum by (instance, device, job) (rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_iops_details

    sum by (instance, device, job) (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_iops_details

    sum by (instance, device, job) (rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) + sum by (instance, device, job) (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_iops_reads

    sum by (instance, job) (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_iops_writes

    sum by (instance, job) (rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_read_bytes

    sum by (instance, device, job) (rate(node_disk_read_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_read_bytes_details

    sum by (instance, device, job) (rate(node_disk_read_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_read_write_bytes

    sum by (instance, device, job) (rate(node_disk_written_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_read_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_response_time

    avg by (instance, job) ((rate(node_disk_read_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_write_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) / (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) > 0 ))

    disk_response_time_details

    avg by (instance, device, job) ((rate(node_disk_read_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_write_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) / ((rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) > 0))

    disk_response_time_read

    rate(node_disk_read_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])/ rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    disk_response_time_worst

    max by (instance, job) ((rate(node_disk_read_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_write_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])) / (rate(node_disk_reads_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) + rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]) > 0 ))

    disk_response_time_write

    rate(node_disk_write_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])/ rate(node_disk_writes_completed_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    disk_swap_used

    node_memory_SwapTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_SwapFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    disk_swap_util

    ((node_memory_SwapTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_SwapFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}) / (node_memory_SwapTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} > 0)) or ((node_memory_SwapTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_SwapFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}))

    disk_util_details

    rate(node_disk_io_time_seconds_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    disk_write_bytes

    sum by (instance, device, job) (rate(node_disk_written_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    disk_write_bytes_details

    sum by (instance, device, job) (rate(node_disk_written_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$]))

    filesystem_size

    node_filesystem_size_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    filesystem_used

    node_filesystem_size_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_filesystem_free_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    filesystem_util

    ((node_filesystem_size_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_filesystem_free_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}) / node_filesystem_size_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

    mem_fault_major

    rate(node_vmstat_pgmajfault{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    mem_fault_minor

    rate(node_vmstat_pgfault{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    mem_swapins

    rate(node_vmstat_pswpin{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    mem_swapouts

    rate(node_vmstat_pswpout{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    mem_total

    node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    mem_used

    (node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_MemFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

    mem_util

    (node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_MemFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}) / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    mem_util_details

    (node_memory_Active_file_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

    mem_util_details

    (node_memory_Active_anon_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

    mem_util_details

    (node_memory_Inactive_file_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

    mem_util_details

    (node_memory_Inactive_anon_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%})

    mem_util_nocache

    (node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_Buffers_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_Cached_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%} - node_memory_MemFree_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}) / node_memory_MemTotal_bytes{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    network_in_bytes_details

    rate(node_network_receive_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    network_out_bytes_details

    rate(node_network_transmit_bytes_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    network_tcp_retrans

    rate(node_netstat_Tcp_RetransSegs{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    os_context_switch

    rate(node_context_switches_total{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}[$DURATION$])

    proc_blocked

    node_procs_blocked{instance=~"$INSTANCE$", job=~"$JOB$" %FILTERS%}

    jvm_off_heap_used

    avg(jvm_memory_bytes_used{area="nonheap" %FILTERS%})

    jvm_heap_util

    avg(jvm_memory_bytes_used{area="heap" %FILTERS%} / jvm_memory_bytes_max{area="heap" %FILTERS%})

    jvm_memory_used

    avg(sum by (instance) (jvm_memory_bytes_used))

    jvm_heap_young_gen_size

    avg(sum by (instance) (jvm_memory_pool_bytes_max{pool=~".*Eden Space|.*Survivor Space" %FILTERS%}))

    jvm_heap_young_gen_used

    avg(sum by (instance) (jvm_memory_pool_bytes_used{pool=~".*Eden Space|.*Survivor Space" %FILTERS%}))

    jvm_heap_old_gen_size

    avg(sum by (instance) (jvm_memory_pool_bytes_max{pool=~".*Tenured Gen|.*Old Gen" %FILTERS%}))

    jvm_heap_old_gen_used

    avg(sum by (instance) (jvm_memory_pool_bytes_used{pool=~".*Tenured Gen|.*Old Gen" %FILTERS%}))

    jvm_memory_buffer_pool_used

    avg(sum by (instance) (jvm_buffer_pool_used_bytes))

    jvm_gc_time

    avg(sum by (instance) (rate(jvm_gc_collection_seconds_sum[$DURATION$])))

    jvm_gc_count

    avg(sum by (instance) (rate(jvm_gc_collection_seconds_count[$DURATION$])))

    jvm_gc_duration

    (sum(rate(jvm_gc_collection_seconds_sum[$DURATION$])) / sum(rate(jvm_gc_collection_seconds_count[$DURATION$])) > 0 ) or sum(rate(jvm_gc_collection_seconds_count[$DURATION$]))

    jvm_threads_current

    avg(jvm_threads_current)

    jvm_threads_deadlocked

    avg(jvm_threads_deadlocked)

    transactions_response_time

    avg(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

    transactions_response_time_max

    max(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

    transactions_response_time_min

    min(rate(ResponseTime_sum{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])/rate(ResponseTime_count{code="200", job=~"$JOB$" %FILTERS%}[$DURATION$])>0)

    transactions_response_time_p50

    ResponseTime{quantile="0.5", code="200", job=~"$JOB$" %FILTERS%}

    transactions_response_time_p85

    ResponseTime{quantile="0.85", code="200", job=~"$JOB$" %FILTERS%}

    transactions_response_time_p90

    ResponseTime{quantile="0.9", code="200", job=~"$JOB$" %FILTERS%}

    transactions_response_time_p99

    ResponseTime{quantile="0.99", code="200", job=~"$JOB$" %FILTERS%}

    transactions_throughput

    sum(rate(Ratio_success{job=~"$JOB$" %FILTERS%}[$DURATION$]))

    transactions_error_throughput

    sum(rate(Ratio_failure{job=~"$JOB$" %FILTERS%}[$DURATION$]))

    transactions_error_rate

    (avg(rate(Ratio_failure{job=~"$JOB$" %FILTERS%}[$DURATION$]))/avg(rate(Ratio_total{job=~"$JOB$" %FILTERS%}[$DURATION$])))*100

    users

    sum(jmeter_threads{state="active", job=~"$JOB$" %FILTERS%})

    k8s_workload_cpu_used

    1e3 * sum(rate(container_cpu_usage_seconds_total{container="", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%}[$DURATION$]))

    k8s_workload_memory_used

    sum(last_over_time(container_memory_usage_bytes{container="", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%}[$DURATION$]))

    k8s_workload_cpu_request

    1e3 * sum(kube_pod_container_resource_requests{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%})

    k8s_workload_cpu_limit

    1e3 * sum(kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%})

    k8s_workload_memory_request

    sum(kube_pod_container_resource_requests{resource="memory", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%})

    k8s_workload_memory_limit

    sum(kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$DEPLOYMENT$.*" %FILTERS%})

    k8s_pod_memory_used

    avg(last_over_time(container_memory_usage_bytes{container="", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}[$DURATION$]))

    k8s_pod_memory_working_set

    avg(container_memory_working_set_bytes{container="", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%})

    k8s_pod_memory_request

    avg(sum by (pod) (kube_pod_container_resource_requests{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}))

    k8s_pod_memory_limit

    avg(sum by (pod) (kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}))

    k8s_pod_restarts

    avg(sum by (pod) (increase(kube_pod_container_status_restarts_total{namespace=~"$NAMESPACE$", pod=~"$POD$" %FILTERS%}[$DURATION$])))

    container_cpu_util_max

    max(rate(container_cpu_usage_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / on (pod) group_left kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    container_cpu_throttled_millicores

    1e3 * avg(rate(container_cpu_cfs_throttled_seconds_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    container_cpu_throttle_time

    avg(last_over_time(container_cpu_cfs_throttled_periods_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / container_cpu_cfs_periods_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    container_memory_used

    avg(last_over_time(container_memory_working_set_bytes{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    container_memory_used_max

    max(last_over_time(container_memory_working_set_bytes{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    container_memory_util

    avg(last_over_time(container_memory_working_set_bytes{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / on (pod) group_left kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    container_memory_util_max

    max(last_over_time(container_memory_working_set_bytes{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]) / on (pod) group_left kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    container_memory_resident_set_used

    avg(last_over_time(container_memory_rss{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    container_memory_cache

    avg(last_over_time(container_memory_cache{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    container_cpu_request

    1e3 * avg(kube_pod_container_resource_requests{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    container_cpu_limit

    1e3 * avg(kube_pod_container_resource_limits{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    container_memory_request

    avg(kube_pod_container_resource_requests{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    container_memory_limit

    avg(kube_pod_container_resource_limits{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})

    container_restarts

    avg(increase(kube_pod_container_status_restarts_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    container_oom_kills_count

    avg(increase(container_oom_events_total{namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%}[$DURATION$]))

    cost

    sum(kube_pod_container_resource_requests{resource="cpu", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})*29 + sum(kube_pod_container_resource_requests{resource="memory", namespace=~"$NAMESPACE$", pod=~"$POD$", container=~"$CONTAINER$" %FILTERS%})/1024/1024/1024*8
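Reading the cost query above: CPU requests (in cores) are weighted by a per-core price of 29, and memory requests (converted to GiB) by a per-GiB price of 8. A small worked example, with workload figures made up for illustration:

```python
# Worked example of the cost formula above: CPU cores * 29 + memory GiB * 8.
# The unit prices (29 per core, 8 per GiB) are the constants in the query;
# the request values below are illustrative only.
cpu_request_cores = 2.0              # sum of CPU requests across containers
memory_request_bytes = 4 * 1024**3   # 4 GiB of memory requests

cost = cpu_request_cores * 29 + memory_request_bytes / 1024**3 * 8
print(cost)  # 90.0
```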

    aws_ec2_credits_cpu_available

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_cpucredit_balance_average{job='$JOB$'}

    aws_ec2_credits_cpu_used

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_cpucredit_usage_sum{job='$JOB$'}

    disk_read_bytes

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_ebsread_bytes_sum{job='$JOB$'} * count_over_time(aws_ec2_ebsread_bytes_sum{job='$JOB$'}[300s]) / 300)

    disk_write_bytes

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_ebswrite_bytes_sum{job='$JOB$'} * count_over_time(aws_ec2_ebswrite_bytes_sum{job='$JOB$'}[300s]) / 300)

    aws_ec2_disk_iops

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() ((aws_ec2_ebsread_ops_sum{job='$JOB$'} + aws_ec2_ebswrite_ops_sum{job='$JOB$'}) * count_over_time(aws_ec2_ebsread_ops_sum{job='$JOB$'}[300s])/300)

    aws_ec2_disk_iops_reads

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_ebsread_ops_sum{job='$JOB$'} * count_over_time(aws_ec2_ebsread_ops_sum{job='$JOB$'}[300s]) / 300)

    aws_ec2_disk_iops_writes

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() (aws_ec2_ebswrite_ops_sum{job='$JOB$'} * count_over_time(aws_ec2_ebswrite_ops_sum{job='$JOB$'}[300s]) / 300)

    aws_ec2_ebs_credits_io_util

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_ebsiobalance__average{job='$JOB$'} / 100

    aws_ec2_ebs_credits_bytes_util

    aws_resource_info{instance='$INSTANCE$', job='$JOB$' %FILTERS%} * on(instance_id) group_left() aws_ec2_ebsbyte_balance__average{job='$JOB$'} / 100
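The EC2 queries above convert CloudWatch's per-period sums into per-second rates by multiplying the sum by the number of samples in a 300 s window and dividing by 300. With the usual one CloudWatch datapoint per 300 s period, `count_over_time(...)` is 1 and the sum is simply spread over 300 seconds. A minimal sketch of the arithmetic (the sample values are illustrative only):

```python
# The per-second conversion used in the EC2 queries above:
#   value * count_over_time(value[300s]) / 300
# With one datapoint per 300 s period (the usual CloudWatch case),
# count_over_time is 1, so this reduces to value / 300.
ebs_read_bytes_sum = 1_500_000  # bytes read in one 5-minute period (example)
samples_in_window = 1           # count_over_time(...[300s])

bytes_per_second = ebs_read_bytes_sum * samples_in_window / 300
print(bytes_per_second)  # 5000.0
```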

    oracle_pga_target_size

    oracledb_memory_size{component='PGA Target', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_redo_buffers_size

    oracledb_memory_size{component='Redo Buffers', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_default_buffer_cache_size

    oracledb_memory_size{component='DEFAULT buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_default_2k_buffer_cache_size

    oracledb_memory_size{component='DEFAULT 2K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_default_4k_buffer_cache_size

    oracledb_memory_size{component='DEFAULT 4K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_default_8k_buffer_cache_size

    oracledb_memory_size{component='DEFAULT 8K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_default_16k_buffer_cache_size

    oracledb_memory_size{component='DEFAULT 16K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_default_32k_buffer_cache_size

    oracledb_memory_size{component='DEFAULT 32K buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_keep_buffer_cache_size

    oracledb_memory_size{component='KEEP buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_recycle_buffer_cache_size

    oracledb_memory_size{component='RECYCLE buffer cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_asm_buffer_cache_size

    oracledb_memory_size{component='ASM Buffer Cache', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_shared_io_pool_size

    oracledb_memory_size{component='Shared IO Pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_java_pool_size

    oracledb_memory_size{component='java pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_large_pool_size

    oracledb_memory_size{component='large pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_shared_pool_size

    oracledb_memory_size{component='shared pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_streams_pool_size

    oracledb_memory_size{component='streams pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_sessions_active_user

    oracledb_sessions_value{type='USER', status='ACTIVE', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_sessions_inactive_user

    oracledb_sessions_value{type='USER', status='INACTIVE', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_sessions_active_background

    oracledb_sessions_value{type='BACKGROUND', status='ACTIVE', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_sessions_inactive_background

    oracledb_sessions_value{type='BACKGROUND', status='INACTIVE', instance='$INSTANCE$', job='$JOB$' %FILTERS%}

    oracle_buffer_cache_hit_ratio

    https://docs.oracle.com/database/121/TGDBA/tune_buffer_cache.htm#TGDBA533

    oracle_redo_log_space_requests

    rate(oracledb_activity_redo_log_space_requests{instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])

    oracle_wait_event_log_file_sync

    rate(oracledb_system_event_time_waited{event='log file sync', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_log_file_parallel_write

    rate(oracledb_system_event_time_waited{event='log file parallel write', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_log_file_sequential_read

    rate(oracledb_system_event_time_waited{event='log file sequential read', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_enq_tx_contention

    rate(oracledb_system_event_time_waited{event='enq: TX - contention', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_enq_tx_row_lock_contention

    rate(oracledb_system_event_time_waited{event='enq: TX - row lock contention', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_latch_row_cache_objects

    rate(oracledb_system_event_time_waited{event='latch: row cache objects', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_latch_shared_pool

    rate(oracledb_system_event_time_waited{event='latch: shared pool', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_resmgr_cpu_quantum

    rate(oracledb_system_event_time_waited{event='resmgr:cpu quantum', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_sql_net_message_from_client

    rate(oracledb_system_event_time_waited{event='SQL*Net message from client', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_rdbms_ipc_message

    rate(oracledb_system_event_time_waited{event='rdbms ipc message', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_db_file_sequential_read

    rate(oracledb_system_event_time_waited{event='db file sequential read', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_log_file_switch_checkpoint_incomplete

    rate(oracledb_system_event_time_waited{event='log file switch (checkpoint incomplete)', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_row_cache_lock

    rate(oracledb_system_event_time_waited{event='row cache lock', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_buffer_busy_waits

    rate(oracledb_system_event_time_waited{event='buffer busy waits', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_event_db_file_async_io_submit

    rate(oracledb_system_event_time_waited{event='db file async I/O submit', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$])/100

    oracle_wait_class_commit

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Commit', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_concurrency

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Concurrency', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_system_io

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='System I/O', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_user_io

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='User I/O', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_other

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Other', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_scheduler

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Scheduler', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_idle

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Idle', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_application

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Application', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_network

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Network', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    oracle_wait_class_configuration

    sum without(event) (rate(oracledb_system_event_time_waited{wait_class='Configuration', instance='$INSTANCE$', job='$JOB$' %FILTERS%}[$DURATION$]))/100

    transactions_response_time_p50

    ResponseTime{quantile="0.5", code="200", job=~"$JOB$" %FILTERS%}

    transactions_response_time_p85

    ResponseTime{quantile="0.85", code="200", job=~"$JOB$" %FILTERS%}

    transactions_response_time_p90

    ResponseTime{quantile="0.9", code="200", job=~"$JOB$" %FILTERS%}

    transactions_response_time_p99

    ResponseTime{quantile="0.99", code="200", job=~"$JOB$" %FILTERS%}

    transactions_throughput

    sum(rate(Ratio_success{job=~"$JOB$" %FILTERS%}[$DURATION$]))

    transactions_error_throughput

    sum(rate(Ratio_failure{job=~"$JOB$" %FILTERS%}[$DURATION$]))

    transactions_error_rate

    (avg(rate(Ratio_failure{job=~"$JOB$" %FILTERS%}[$DURATION$]))/avg(rate(Ratio_total{job=~"$JOB$" %FILTERS%}[$DURATION$])))*100

    users

    sum(jmeter_threads{state="active", job=~"$JOB$" %FILTERS%})
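The queries above are templates: the `$INSTANCE$`, `$JOB$`, `$NAMESPACE$`, `$POD$`, `$CONTAINER$`, and `$DURATION$` placeholders are replaced with values from the component configuration, and `%FILTERS%` with any additional label matchers. A minimal sketch of this substitution (the `render` helper below is illustrative, not part of the product):

```python
# Illustrative sketch of how a query template could be rendered.
# The placeholder names come from the queries above; the rendering
# logic itself is an assumption, not the actual implementation.
TEMPLATE = 'sum(rate(Ratio_success{job=~"$JOB$" %FILTERS%}[$DURATION$]))'

def render(template, values, filters=""):
    """Replace each $NAME$ placeholder and the %FILTERS% marker."""
    out = template
    for name, value in values.items():
        out = out.replace(f"${name}$", value)
    # %FILTERS% holds extra label matchers, e.g. ', env="prod"'
    return out.replace("%FILTERS%", filters)

query = render(TEMPLATE, {"JOB": "jmeter", "DURATION": "30s"}, ', env="prod"')
print(query)
# sum(rate(Ratio_success{job=~"jmeter" , env="prod"}[30s]))
```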

    Kubernetes Container
    cadvisor
    kube-state-metrics
    Kubernetes Pod
    cadvisor
    kube-state-metrics
    EC2
    CloudWatch Exporter
    custom configuration file
    Oracle Database
    OracleDB Exporter
    custom configuration file
    Web Application
    Prometheus Listener for Jmeter

    Java OpenJDK 8

    This page describes the Optimization Pack for Java OpenJDK 8 JVM.

    Metrics

    Memory

    Metric
    Unit
    Description

    CPU

    Metric
    Unit
    Description

    Garbage Collection

    Metric
    Unit
    Description

    Other metrics

    Metric
    Unit
    Description

    Parameters

    Memory

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Garbage Collection

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Compilation

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Other parameters

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Domains

    The following parameters require their ranges or default values to be updated according to the described rules:

    Constraints

    The following tables show a list of constraints that may be required in the definition of the study, depending on the tuned parameters:

    Formula
    Notes

    Java OpenJDK 11

    This page describes the Optimization Pack for Java OpenJDK 11 JVM.

    Metrics

    Memory

    jvm_heap_used

    bytes

    The amount of heap memory used

    jvm_heap_util

    percent

    The utilization % of heap memory

    jvm_off_heap_used

    bytes

    The amount of non-heap memory used

    jvm_heap_old_gen_used

    bytes

    The amount of heap memory used (old generation)

    jvm_heap_young_gen_used

    bytes

    The amount of heap memory used (young generation)

    jvm_heap_old_gen_size

    bytes

    The size of the JVM heap memory (old generation)

    jvm_heap_young_gen_size

    bytes

    The size of the JVM heap memory (young generation)

    jvm_memory_used

    bytes

    The total amount of memory used across all the JVM memory pools

    jvm_heap_committed

    bytes

    The size of the JVM committed memory

    jvm_memory_buffer_pool_used

    bytes

    The total number of bytes used by buffers within the JVM buffer memory pool

    jvm_gc_duration

    seconds

    The average duration of a stop-the-world JVM garbage collection

    jvm_compilation_time

    milliseconds

    The total time spent by the JVM JIT compiler compiling bytecode

    You should select your own domain.

    yes

    The minimum heap size.

    jvm_maxHeapSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum heap size.

    jvm_maxRAM

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum amount of memory used by the JVM.

    jvm_initialRAMPercentage

    real

    percent

    1.563

    0.1 → 100

    yes

    The initial percentage of memory used by the JVM.

    jvm_maxRAMPercentage

    real

    percent

    25.0

    0.1 → 100.0

    yes

    The percentage of memory used for maximum heap size, on systems with large physical memory size (more than 512 MB). Requires Java 10, or Java 8 Update 191 or later.

    jvm_alwaysPreTouch

    categorical

    -AlwaysPreTouch

    +AlwaysPreTouch, -AlwaysPreTouch

    yes

    Pretouch pages during initialization.

    jvm_metaspaceSize

    integer

    megabytes

    20

    You should select your own domain within 1 and 1024

    yes

    The initial size of the allocated class metadata space.

    jvm_maxMetaspaceSize

    integer

    megabytes

    20

    You should select your own domain within 1 and 1024

    yes

    The maximum size of the allocated class metadata space.

    jvm_useTransparentHugePages

    categorical

    -UseTransparentHugePages

    +UseTransparentHugePages, -UseTransparentHugePages

    yes

    Enables the use of large pages that can dynamically grow or shrink.

    jvm_allocatePrefetchInstr

    integer

    0

    0 → 3

    yes

    Prefetch ahead of the allocation pointer.

    jvm_allocatePrefetchDistance

    integer

    bytes

    0

    0 → 512

    yes

    The distance to prefetch ahead of the allocation pointer; -1 uses a system-specific value (automatically determined).

    jvm_allocatePrefetchLines

    integer

    lines

    3

    1 → 64

    yes

    The number of lines to prefetch ahead of array allocation pointer.

    jvm_allocatePrefetchStyle

    integer

    1

    0 → 3

    yes

    Selects the prefetch instruction to generate.

    jvm_useLargePages

    categorical

    +UseLargePages

    +UseLargePages, -UseLargePages

    yes

    Enable the use of large page memory.

    0 → 2147483647

    yes

    The ratio of old/new generation sizes.

    jvm_newSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Sets the initial and maximum size of the heap for the young generation (nursery).

    jvm_maxNewSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Specifies the upper bound for the young generation size.

    jvm_survivorRatio

    integer

    8

    1 → 100

    yes

    The ratio between the Eden and each Survivor-space within the JVM. For example, a jvm_survivorRatio of 6 would mean that the Eden-space is 6 times the size of one Survivor-space.

    jvm_useAdaptiveSizePolicy

    categorical

    +UseAdaptiveSizePolicy

    +UseAdaptiveSizePolicy, -UseAdaptiveSizePolicy

    yes

    Enables adaptive generation sizing. Disable it when tuning jvm_targetSurvivorRatio.

    jvm_adaptiveSizePolicyWeight

    integer

    10

    0 → 100

    yes

    The weighting given to the current Garbage Collection time versus previous GC times when checking the timing goal.

    jvm_targetSurvivorRatio

    integer

    50

    1 → 100

    yes

    The desired percentage of Survivor-space used after young garbage collection.

    jvm_minHeapFreeRatio

    integer

    40

    1 → 99

    yes

    The minimum percentage of free heap after garbage collection to avoid expansion.

    jvm_maxHeapFreeRatio

    integer

    70

    0 → 100

    yes

    The maximum percentage of heap free after garbage collection to avoid shrinking.

    jvm_maxTenuringThreshold

    integer

    15

    0 → 15

    yes

    The maximum value for the tenuring threshold.

    jvm_gcType

    categorical

    Parallel

    Serial, Parallel, ConcMarkSweep, G1, ParNew

    yes

    Type of the garbage collection algorithm.

    jvm_concurrentGCThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of threads concurrent garbage collection will use.

    jvm_parallelGCThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of threads garbage collection will use for parallel phases.

    jvm_maxGCPauseMillis

    integer

    milliseconds

    200

    1 → 1000

    yes

    Adaptive size policy maximum GC pause time goal in millisecond.

    jvm_resizePLAB

    categorical

    +ResizePLAB

    +ResizePLAB, -ResizePLAB

    yes

    Enables the dynamic resizing of promotion LABs.

    jvm_GCTimeRatio

    integer

    99

    0 → 100

    yes

    The target fraction of time that can be spent in garbage collection before increasing the heap, computed as 1 / (1 + GCTimeRatio).

    jvm_initiatingHeapOccupancyPercent

    integer

    45

    0 → 100

    yes

    Sets the percentage of the heap occupancy at which to start a concurrent GC cycle.

    jvm_youngGenerationSizeIncrement

    integer

    20

    0 → 100

    yes

    The increment size for Young Generation adaptive resizing.

    jvm_tenuredGenerationSizeIncrement

    integer

    20

    0 → 100

    yes

    The increment size for Old/Tenured Generation adaptive resizing.

    jvm_adaptiveSizeDecrementScaleFactor

    integer

    4

    1 → 1024

    yes

    Specifies the scale factor for goal-driven generation resizing.

    jvm_CMSTriggerRatio

    integer

    80

    0 → 100

    yes

    The percentage of MinHeapFreeRatio allocated before CMS GC starts.

    jvm_CMSInitiatingOccupancyFraction

    integer

    -1

    -1 → 99

    yes

    Configures the old generation occupancy fraction threshold for CMS GC. Negative values default to CMSTriggerRatio.

    jvm_CMSClassUnloadingEnabled

    categorical

    +CMSClassUnloadingEnabled

    +CMSClassUnloadingEnabled, -CMSClassUnloadingEnabled

    yes

    Enables class unloading when using CMS.

    jvm_useCMSInitiatingOccupancyOnly

    categorical

    -UseCMSInitiatingOccupancyOnly

    +UseCMSInitiatingOccupancyOnly, -UseCMSInitiatingOccupancyOnly

    yes

    Use the occupancy value as the only criterion for initiating the CMS collector.
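
    The two CMS parameters above are commonly tuned together: a fixed occupancy fraction is only honored deterministically when UseCMSInitiatingOccupancyOnly is also enabled. A sketch of how they might compose as HotSpot flags (the flag names are standard; the values and the composition itself are hypothetical):

    ```shell
    # Hypothetical candidate values for the CMS parameters above.
    occupancy_fraction=70   # jvm_CMSInitiatingOccupancyFraction
    CMS_OPTS="-XX:+UseConcMarkSweepGC"
    CMS_OPTS="$CMS_OPTS -XX:CMSInitiatingOccupancyFraction=${occupancy_fraction}"
    CMS_OPTS="$CMS_OPTS -XX:+UseCMSInitiatingOccupancyOnly"
    echo "$CMS_OPTS"
    ```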

    jvm_G1HeapRegionSize

    integer

    megabytes

    8

    1 → 32

    yes

    Sets the size of the regions for G1.

    jvm_G1ReservePercent

    integer

    10

    0 → 50

    yes

    Sets the percentage of the heap that is reserved as a false ceiling to reduce the possibility of promotion failure for the G1 collector.

    jvm_G1NewSizePercent

    integer

    5

    0 → 100

    yes

    Sets the percentage of the heap to use as the minimum for the young generation size.

    jvm_G1MaxNewSizePercent

    integer

    60

    0 → 100

    yes

    Sets the percentage of the heap size to use as the maximum for young generation size.

    jvm_G1MixedGCLiveThresholdPercent

    integer

    85

    0 → 100

    yes

    Sets the occupancy threshold for an old region to be included in a mixed garbage collection cycle.

    jvm_G1HeapWastePercent

    integer

    5

    0 → 100

    yes

    The maximum percentage of the reclaimable heap before starting mixed GC.

    jvm_G1MixedGCCountTarget

    integer

    collections

    8

    0 → 100

    yes

    Sets the target number of mixed garbage collections after a marking cycle to collect old regions with at most G1MixedGCLiveThresholdPercent live data. The default is 8 mixed garbage collections.

    jvm_G1OldCSetRegionThresholdPercent

    integer

    10

    0 → 100

    yes

    The upper limit on the number of old regions to be collected during mixed GC.

    jvm_reservedCodeCacheSize

    integer

    megabytes

    240

    3 → 2048

    yes

    The maximum size of the compiled code cache pool.

    jvm_tieredCompilation

    categorical

    +TieredCompilation

    +TieredCompilation, -TieredCompilation

    yes

    Enables the use of tiered compilation.

    jvm_tieredCompilationStopAtLevel

    integer

    4

    0 → 4

    yes

    The maximum compilation level for tiered compilation.

    jvm_compilationThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of compilation threads.

    jvm_backgroundCompilation

    categorical

    +BackgroundCompilation

    +BackgroundCompilation, -BackgroundCompilation

    yes

    Allow async interpreted execution of a method while it is being compiled.

    jvm_inline

    categorical

    +Inline

    +Inline, -Inline

    yes

    Enable inlining.

    jvm_maxInlineSize

    integer

    bytes

    35

    1 → 2097152

    yes

    The bytecode size limit (in bytes) of the inlined methods.

    jvm_inlineSmallCode

    integer

    bytes

    2000

    1 → 16384

    yes

    The maximum compiled code size limit (in bytes) of the inlined methods.
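
    The compilation parameters above usually map to well-known HotSpot JIT flags. A sketch of one candidate composition, using the table defaults (the flag names are standard HotSpot options; composing them this way is an illustrative assumption):

    ```shell
    # Table defaults rendered as HotSpot JIT flags (illustrative).
    JIT_OPTS="-XX:+TieredCompilation -XX:TieredStopAtLevel=4"
    JIT_OPTS="$JIT_OPTS -XX:MaxInlineSize=35 -XX:InlineSmallCode=2000"
    echo "$JIT_OPTS"
    ```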

    jvm_aggressiveOpts

    categorical

    -AggressiveOpts

    +AggressiveOpts, -AggressiveOpts

    yes

    Turn on point performance compiler optimizations.

    jvm_usePerfData

    categorical

    +UsePerfData

    +UsePerfData, -UsePerfData

    yes

    Enable monitoring of performance data.

    jvm_useNUMA

    categorical

    -UseNUMA

    +UseNUMA, -UseNUMA

    yes

    Enable NUMA.

    jvm_useBiasedLocking

    categorical

    +UseBiasedLocking

    +UseBiasedLocking, -UseBiasedLocking

    yes

    Manage the use of biased locking.

    jvm_activeProcessorCount

    integer

    CPUs

    1

    1 → 512

    yes

    Overrides the number of detected CPUs that the VM will use to calculate the size of thread pools.

    jvm_newSize

    Depends on the configured heap

    jvm_maxNewSize

    Depends on the configured heap

    jvm_concurrentGCThreads

    Depends on the available CPU cores

    Depends on the available CPU cores

    jvm_parallelGCThreads

    Depends on the available CPU cores

    Depends on the available CPU cores

    jvm_compilation_threads

    Depends on the available CPU cores

    Depends on the available CPU cores

    mem_used

    bytes

    The total amount of memory used

    jvm_heap_size

    bytes

    The size of the JVM heap memory

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    cpu_used

    CPUs

    The total number of CPUs used

    jvm_gc_time

    percent

    The % of wall clock time the JVM spent doing stop the world garbage collection activities

    jvm_gc_count

    collections/s

    The total number of stop the world JVM garbage collections that have occurred per second

    jvm_threads_current

    threads

    The total number of active threads within the JVM

    jvm_threads_deadlocked

    threads

    The total number of deadlocked threads within the JVM

    jvm_minHeapSize

    integer

    megabytes

    jvm_newRatio

    integer

    jvm_reservedCodeCacheSize

    integer

    megabytes

    jvm_aggressiveOpts

    categorical

    Parameter

    Default value

    Domain

    jvm_minHeapSize

    Depends on the instance available memory

    jvm_maxHeapSize

    jvm.jvm_minHeapSize <= jvm.jvm_maxHeapSize

    jvm.jvm_minHeapFreeRatio <= jvm.jvm_maxHeapFreeRatio

    jvm.jvm_maxNewSize < jvm.jvm_maxHeapSize

    You should select your own default value.

    2

    240

    -AggressiveOpts

    Depends on the instance available memory

    jvm.jvm_concurrentGCThreads <= jvm.jvm_parallelGCThreads

    Metric
    Unit
    Description

    mem_used

    bytes

    The total amount of memory used

    jvm_heap_size

    bytes

    The size of the JVM heap memory

    CPU

    Metric
    Unit
    Description

    cpu_util

    percent

    The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work)

    cpu_used

    CPUs

    The total number of CPUs used

    Garbage Collection

    Metric
    Unit
    Description

    jvm_gc_time

    percent

    The % of wall clock time the JVM spent doing stop the world garbage collection activities

    jvm_gc_count

    collections/s

    The total number of stop the world JVM garbage collections that have occurred per second

    Other metrics

    Metric
    Unit
    Description

    jvm_threads_current

    threads

    The total number of active threads within the JVM

    jvm_threads_deadlocked

    threads

    The total number of deadlocked threads within the JVM

    Parameters

    Memory

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    jvm_minHeapSize

    integer

    megabytes

    Garbage Collection

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    jvm_newRatio

    integer

    Compilation

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    jvm_reservedCodeCacheSize

    integer

    megabytes

    Other parameters

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    jvm_usePerfData

    categorical

    Domains

    The following parameters require their ranges or default values to be updated according to the described rules:

    Parameter

    Default value

    Domain

    jvm_minHeapSize

    Depends on the instance available memory

    jvm_maxHeapSize

    Constraints

    The following tables show a list of constraints that may be required in the definition of the study, depending on the tuned parameters:

    Formula
    Notes

    jvm.jvm_minHeapSize <= jvm.jvm_maxHeapSize

    jvm.jvm_minHeapFreeRatio <= jvm.jvm_maxHeapFreeRatio

    jvm.jvm_maxNewSize < jvm.jvm_maxHeapSize * 0.8
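
    A quick sanity check of a candidate configuration against these constraints can be sketched as follows; the heap values are hypothetical (in megabytes), and the 80% bound is computed with integer arithmetic:

    ```shell
    # Hypothetical candidate configuration (megabytes).
    minHeapSize=512
    maxHeapSize=2048
    maxNewSize=1024

    # jvm.jvm_minHeapSize <= jvm.jvm_maxHeapSize
    [ "$minHeapSize" -le "$maxHeapSize" ] && echo "minHeapSize constraint: ok"

    # jvm.jvm_maxNewSize < jvm.jvm_maxHeapSize * 0.8 (integer form: * 8 / 10)
    [ "$maxNewSize" -lt $(( maxHeapSize * 8 / 10 )) ] && echo "maxNewSize constraint: ok"
    ```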

    jvm_heap_used

    bytes

    The amount of heap memory used

    jvm_heap_util

    percent

    The utilization % of heap memory

    jvm_off_heap_used

    bytes

    The amount of non-heap memory used

    jvm_heap_old_gen_used

    bytes

    The amount of heap memory used (old generation)

    jvm_heap_young_gen_used

    bytes

    The amount of heap memory used (young generation)

    jvm_heap_old_gen_size

    bytes

    The size of the JVM heap memory (old generation)

    jvm_heap_young_gen_size

    bytes

    The size of the JVM heap memory (young generation)

    jvm_memory_used

    bytes

    The total amount of memory used across all the JVM memory pools

    jvm_heap_committed

    bytes

    The size of the JVM committed memory

    jvm_memory_buffer_pool_used

    bytes

    The total amount of bytes used by buffers within the JVM buffer memory pool

    jvm_gc_duration

    seconds

    The average duration of a stop the world JVM garbage collection

    jvm_compilation_time

    milliseconds

    The total time spent by the JVM JIT compiler compiling bytecode

    You should select your own default value.

    You should select your own domain.

    yes

    The minimum heap size.

    jvm_maxHeapSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum heap size.

    jvm_maxRAM

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum amount of memory used by the JVM.

    jvm_initialRAMPercentage

    real

    percent

    1.563

    0.1 → 100

    yes

    The percentage of memory used for initial heap size.

    jvm_maxRAMPercentage

    real

    percent

    25.0

    0.1 → 100.0

    yes

    The percentage of memory used for maximum heap size, on systems with large physical memory size (more than 512MB).
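
    On containerized deployments, these two parameters typically correspond to the JDK 10+ flags -XX:InitialRAMPercentage and -XX:MaxRAMPercentage, which size the heap relative to available memory instead of using absolute -Xms/-Xmx values. A sketch using the table defaults (the composition is illustrative):

    ```shell
    # Table defaults rendered as RAM-percentage flags (illustrative).
    RAM_OPTS="-XX:InitialRAMPercentage=1.563 -XX:MaxRAMPercentage=25.0"
    echo "$RAM_OPTS"
    ```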

    jvm_alwaysPreTouch

    categorical

    -AlwaysPreTouch

    +AlwaysPreTouch, -AlwaysPreTouch

    yes

    Pretouch pages during initialization.

    jvm_metaspaceSize

    integer

    megabytes

    20

    You should select your own domain within 1 and 1024

    yes

    The initial size of the allocated class metadata space.

    jvm_maxMetaspaceSize

    integer

    megabytes

    20

    You should select your own domain within 1 and 1024

    yes

    The maximum size of the allocated class metadata space.

    jvm_useTransparentHugePages

    categorical

    -UseTransparentHugePages

    +UseTransparentHugePages, -UseTransparentHugePages

    yes

    Enables the use of large pages that can dynamically grow or shrink.

    jvm_allocatePrefetchInstr

    integer

    0

    0 → 3

    yes

    Prefetch ahead of the allocation pointer.

    jvm_allocatePrefetchDistance

    integer

    bytes

    0

    0 → 512

    yes

    Distance to prefetch ahead of the allocation pointer. A value of -1 uses a system-specific value (automatically determined).

    jvm_allocatePrefetchLines

    integer

    lines

    3

    1 → 64

    yes

    The number of lines to prefetch ahead of array allocation pointer.

    jvm_allocatePrefetchStyle

    integer

    1

    0 → 3

    yes

    Selects the prefetch instruction to generate.

    jvm_useLargePages

    categorical

    +UseLargePages

    +UseLargePages, -UseLargePages

    yes

    Enable the use of large page memory.

    jvm_aggressiveHeap

    categorical

    -AggressiveHeap

    -AggressiveHeap, +AggressiveHeap

    yes

    Optimize heap options for long-running memory intensive apps.

    jvm_newRatio

    integer

    2

    0 → 2147483647

    yes

    The ratio of old/new generation sizes.

    jvm_newSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Sets the initial and maximum size of the heap for the young generation (nursery).

    jvm_maxNewSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Specifies the upper bound for the young generation size.

    jvm_survivorRatio

    integer

    8

    1 → 100

    yes

    The ratio between the Eden space and each Survivor space within the JVM. For example, a jvm_survivorRatio of 6 means that the Eden space is 6 times the size of one Survivor space.

    jvm_useAdaptiveSizePolicy

    categorical

    +UseAdaptiveSizePolicy

    +UseAdaptiveSizePolicy, -UseAdaptiveSizePolicy

    yes

    Enable adaptive generation sizing. Disable it when tuning jvm_targetSurvivorRatio.

    jvm_adaptiveSizePolicyWeight

    integer

    10

    0 → 100

    yes

    The weighting given to the current Garbage Collection time versus previous GC times when checking the timing goal.

    jvm_targetSurvivorRatio

    integer

    50

    1 → 100

    yes

    The desired percentage of Survivor-space used after young garbage collection.

    jvm_minHeapFreeRatio

    integer

    40

    1 → 99

    yes

    The minimum percentage of heap free after garbage collection to avoid shrinking.

    jvm_maxHeapFreeRatio

    integer

    70

    0 → 100

    yes

    The maximum percentage of heap free after garbage collection to avoid shrinking.

    jvm_maxTenuringThreshold

    integer

    15

    0 → 15

    yes

    The maximum value for the tenuring threshold.

    jvm_gcType

    categorical

    G1

    Serial, Parallel, ConcMarkSweep, G1

    yes

    Type of the garbage collection algorithm.

    jvm_concurrentGCThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of threads concurrent garbage collection will use.

    jvm_parallelGCThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of threads garbage collection will use for parallel phases.

    jvm_maxGCPauseMillis

    integer

    milliseconds

    200

    1 → 1000

    yes

    Adaptive size policy maximum GC pause time goal in milliseconds.

    jvm_resizePLAB

    categorical

    +ResizePLAB

    +ResizePLAB, -ResizePLAB

    yes

    Enables the dynamic resizing of promotion LABs.

    jvm_GCTimeRatio

    integer

    99

    2 → 100

    yes

    The target fraction of time that can be spent in garbage collection before increasing the heap, computed as 1 / (1 + GCTimeRatio).

    jvm_initiatingHeapOccupancyPercent

    integer

    45

    5 → 90

    yes

    Sets the percentage of the heap occupancy at which to start a concurrent GC cycle.

    jvm_youngGenerationSizeIncrement

    integer

    20

    0 → 100

    yes

    The increment size for Young Generation adaptive resizing.

    jvm_tenuredGenerationSizeIncrement

    integer

    20

    0 → 100

    yes

    The increment size for Old/Tenured Generation adaptive resizing.

    jvm_adaptiveSizeDecrementScaleFactor

    integer

    4

    1 → 1024

    yes

    Specifies the scale factor for goal-driven generation resizing.

    jvm_CMSTriggerRatio

    integer

    80

    0 → 100

    yes

    The percentage of MinHeapFreeRatio allocated before CMS GC starts.

    jvm_CMSInitiatingOccupancyFraction

    integer

    -1

    -1 → 99

    yes

    Configures the old generation occupancy fraction threshold for CMS GC. Negative values default to CMSTriggerRatio.

    jvm_CMSClassUnloadingEnabled

    categorical

    +CMSClassUnloadingEnabled

    +CMSClassUnloadingEnabled, -CMSClassUnloadingEnabled

    yes

    Enables class unloading when using CMS.

    jvm_useCMSInitiatingOccupancyOnly

    categorical

    -UseCMSInitiatingOccupancyOnly

    +UseCMSInitiatingOccupancyOnly, -UseCMSInitiatingOccupancyOnly

    yes

    Use the occupancy value as the only criterion for initiating the CMS collector.

    jvm_G1HeapRegionSize

    integer

    megabytes

    8

    1 → 32

    yes

    Sets the size of the regions for G1.

    jvm_G1ReservePercent

    integer

    10

    0 → 50

    yes

    Sets the percentage of the heap that is reserved as a false ceiling to reduce the possibility of promotion failure for the G1 collector.

    jvm_G1NewSizePercent

    integer

    5

    0 → 100

    yes

    Sets the percentage of the heap to use as the minimum for the young generation size.

    jvm_G1MaxNewSizePercent

    integer

    60

    0 → 100

    yes

    Sets the percentage of the heap size to use as the maximum for young generation size.

    jvm_G1MixedGCLiveThresholdPercent

    integer

    85

    0 → 100

    yes

    Sets the occupancy threshold for an old region to be included in a mixed garbage collection cycle.

    jvm_G1HeapWastePercent

    integer

    5

    0 → 100

    yes

    The maximum percentage of the reclaimable heap before starting mixed GC.

    jvm_G1MixedGCCountTarget

    integer

    collections

    8

    0 → 100

    yes

    Sets the target number of mixed garbage collections after a marking cycle to collect old regions with at most G1MixedGCLiveThresholdPercent live data. The default is 8 mixed garbage collections.

    jvm_G1OldCSetRegionThresholdPercent

    integer

    10

    0 → 100

    yes

    The upper limit on the number of old regions to be collected during mixed GC.

    jvm_G1AdaptiveIHOPNumInitialSamples

    integer

    3

    1 → 2097152

    yes

    The number of completed time periods from initial mark to first mixed GC required to use the input values for prediction of the optimal occupancy to start marking.

    jvm_G1UseAdaptiveIHOP

    categorical

    +G1UseAdaptiveIHOP

    +G1UseAdaptiveIHOP, -G1UseAdaptiveIHOP

    yes

    Adaptively adjust the initiating heap occupancy from the initial value of InitiatingHeapOccupancyPercent.

    jvm_reservedCodeCacheSize

    integer

    megabytes

    240

    3 → 2048

    yes

    The maximum size of the compiled code cache pool.

    jvm_tieredCompilation

    categorical

    +TieredCompilation

    +TieredCompilation, -TieredCompilation

    yes

    Enables the use of tiered compilation.

    jvm_tieredCompilationStopAtLevel

    integer

    4

    0 → 4

    yes

    The maximum compilation level for tiered compilation.

    jvm_compilationThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of compilation threads.

    jvm_backgroundCompilation

    categorical

    +BackgroundCompilation

    +BackgroundCompilation, -BackgroundCompilation

    yes

    Allow async interpreted execution of a method while it is being compiled.

    jvm_inline

    categorical

    +Inline

    +Inline, -Inline

    yes

    Enable inlining.

    jvm_maxInlineSize

    integer

    bytes

    35

    1 → 2097152

    yes

    The bytecode size limit (in bytes) of the inlined methods.

    jvm_inlineSmallCode

    integer

    bytes

    2000

    1 → 16384

    yes

    The maximum compiled code size limit (in bytes) of the inlined methods.

    jvm_usePerfData

    categorical

    +UsePerfData

    +UsePerfData, -UsePerfData

    yes

    Enable monitoring of performance data.

    jvm_useNUMA

    categorical

    -UseNUMA

    +UseNUMA, -UseNUMA

    yes

    Enable NUMA.

    jvm_useBiasedLocking

    categorical

    +UseBiasedLocking

    +UseBiasedLocking, -UseBiasedLocking

    yes

    Manage the use of biased locking.

    jvm_activeProcessorCount

    integer

    CPUs

    1

    1 → 512

    yes

    Overrides the number of detected CPUs that the VM will use to calculate the size of thread pools.

    Depends on the instance available memory

    jvm_newSize

    Depends on the configured heap

    jvm_maxNewSize

    Depends on the configured heap

    jvm_concurrentGCThreads

    Depends on the available CPU cores

    Depends on the available CPU cores

    jvm_parallelGCThreads

    Depends on the available CPU cores

    Depends on the available CPU cores

    jvm_compilation_threads

    Depends on the available CPU cores

    Depends on the available CPU cores

    jvm.jvm_concurrentGCThreads <= jvm.jvm_parallelGCThreads

    jvm_activeProcessorCount < container.cpu_limits/1000 + 1
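
    The last constraint keeps jvm_activeProcessorCount below the container CPU limit, which is expressed in millicores. The largest integer n satisfying n < cpu_limits/1000 + 1 is ceil(cpu_limits/1000), which can be computed with integer arithmetic; the millicore value below is a hypothetical example:

    ```shell
    cpu_limits=1500   # hypothetical container CPU limit in millicores
    # Largest integer n with n < cpu_limits/1000 + 1, i.e. ceil(cpu_limits/1000).
    max_count=$(( (cpu_limits + 999) / 1000 ))
    echo "-XX:ActiveProcessorCount=${max_count}"   # prints -XX:ActiveProcessorCount=2
    ```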

    Java OpenJDK 17

    This page describes the Optimization Pack for Java OpenJDK 17 JVM.

    Metrics

    Memory

    Metric
    Unit
    Description

    CPU

    Metric
    Unit
    Description

    Garbage Collection

    Metric
    Unit
    Description

    Other metrics

    Metric
    Unit
    Description

    Parameters

    Memory

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Garbage Collection

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Compilation

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Other parameters

    Name
    Type
    Unit
    Default
    Domain
    Restart
    Description

    Domains

    The following parameters require their ranges or default values to be updated according to the described rules:

    Constraints

    The following tables show a list of constraints that may be required in the definition of the study, depending on the tuned parameters:

    Formula
    Notes

    jvm_heap_used

    bytes

    The amount of heap memory used

    jvm_heap_util

    percent

    The utilization % of heap memory

    jvm_off_heap_used

    bytes

    The amount of non-heap memory used

    jvm_heap_old_gen_used

    bytes

    The amount of heap memory used (old generation)

    jvm_heap_young_gen_used

    bytes

    The amount of heap memory used (young generation)

    jvm_heap_old_gen_size

    bytes

    The size of the JVM heap memory (old generation)

    jvm_heap_young_gen_size

    bytes

    The size of the JVM heap memory (young generation)

    jvm_memory_used

    bytes

    The total amount of memory used across all the JVM memory pools

    jvm_heap_committed

    bytes

    The size of the JVM committed memory

    jvm_memory_buffer_pool_used

    bytes

    The total amount of bytes used by buffers within the JVM buffer memory pool

    jvm_gc_duration

    seconds

    The average duration of a stop the world JVM garbage collection

    jvm_compilation_time

    milliseconds

    The total time spent by the JVM JIT compiler compiling bytecode

    You should select your own domain.

    yes

    The minimum heap size.

    jvm_maxHeapSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum heap size.

    jvm_maxRAM

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    The maximum amount of memory used by the JVM.

    jvm_initialRAMPercentage

    real

    percent

    2

    1 → 100

    yes

    The percentage of memory used for initial heap size.

    jvm_maxRAMPercentage

    integer

    percent

    25

    1 → 100

    yes

    The percentage of memory used for maximum heap size, on systems with large physical memory size (more than 512MB).

    jvm_minRAMPercentage

    integer

    percent

    25

    1 → 100

    yes

    The percentage of memory used for maximum heap size, on systems with small physical memory size (up to 256MB).

    jvm_alwaysPreTouch

    categorical

    -AlwaysPreTouch

    +AlwaysPreTouch, -AlwaysPreTouch

    yes

    Pretouch pages during initialization.

    jvm_metaspaceSize

    integer

    megabytes

    20

    You should select your own domain within 1 and 1024

    yes

    The initial size of the allocated class metadata space.

    jvm_maxMetaspaceSize

    integer

    megabytes

    20

    You should select your own domain within 1 and 1024

    yes

    The maximum size of the allocated class metadata space.

    jvm_useTransparentHugePages

    categorical

    -UseTransparentHugePages

    +UseTransparentHugePages, -UseTransparentHugePages

    yes

    Enables the use of large pages that can dynamically grow or shrink.

    jvm_allocatePrefetchInstr

    integer

    0

    0 → 3

    yes

    Prefetch ahead of the allocation pointer.

    jvm_allocatePrefetchDistance

    integer

    bytes

    0

    0 → 512

    yes

    Distance to prefetch ahead of the allocation pointer. A value of -1 uses a system-specific value (automatically determined).

    jvm_allocatePrefetchLines

    integer

    lines

    3

    0 → 64

    yes

    The number of lines to prefetch ahead of array allocation pointer.

    jvm_allocatePrefetchStyle

    integer

    1

    0 → 3

    yes

    Selects the prefetch instruction to generate.

    jvm_useLargePages

    categorical

    +UseLargePages

    +UseLargePages, -UseLargePages

    yes

    Enable the use of large page memory.

    jvm_aggressiveHeap

    categorical

    -AggressiveHeap

    -AggressiveHeap, +AggressiveHeap

    yes

    Optimize heap options for long-running memory intensive apps.

    jvm_newRatio

    integer

    0 → 2147483647

    yes

    The ratio of old/new generation sizes.

    jvm_newSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Sets the initial and maximum size of the heap for the young generation (nursery).

    jvm_maxNewSize

    integer

    megabytes

    You should select your own default value.

    You should select your own domain.

    yes

    Specifies the upper bound for the young generation size.

    jvm_survivorRatio

    integer

    8

    1 → 100

    yes

    The ratio between the Eden space and each Survivor space within the JVM. For example, a jvm_survivorRatio of 6 means that the Eden space is 6 times the size of one Survivor space.

    jvm_useAdaptiveSizePolicy

    categorical

    +UseAdaptiveSizePolicy

    +UseAdaptiveSizePolicy, -UseAdaptiveSizePolicy

    yes

    Enable adaptive generation sizing. Disable it when tuning jvm_targetSurvivorRatio.

    jvm_adaptiveSizePolicyWeight

    integer

    10

    0 → 100

    yes

    The weighting given to the current Garbage Collection time versus previous GC times when checking the timing goal.

    jvm_targetSurvivorRatio

    integer

    50

    1 → 100

    yes

    The desired percentage of Survivor-space used after young garbage collection.

    jvm_minHeapFreeRatio

    integer

    40

    1 → 99

    yes

    The minimum percentage of heap free after garbage collection to avoid shrinking.

    jvm_maxHeapFreeRatio

    integer

    70

    0 → 100

    yes

    The maximum percentage of heap free after garbage collection to avoid shrinking.

    jvm_maxTenuringThreshold

    integer

    15

    0 → 15

    yes

    The maximum value for the tenuring threshold.

    jvm_gcType

    categorical

    G1

    Serial, Parallel, G1, Z, Shenandoah

    yes

    Type of the garbage collection algorithm.

    jvm_concurrentGCThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of threads concurrent garbage collection will use.

    jvm_parallelGCThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of threads garbage collection will use for parallel phases.

    jvm_maxGCPauseMillis

    integer

    milliseconds

    200

    1 → 1000

    yes

    Adaptive size policy maximum GC pause time goal in milliseconds.

    jvm_resizePLAB

    categorical

    +ResizePLAB

    +ResizePLAB, -ResizePLAB

    yes

    Enables the dynamic resizing of promotion LABs.

    jvm_GCTimeRatio

    integer

    99

    0 → 100

    yes

    The target fraction of time that can be spent in garbage collection before increasing the heap, computed as 1 / (1 + GCTimeRatio).

    jvm_initiatingHeapOccupancyPercent

    integer

    45

    0 → 100

    yes

    Sets the percentage of the heap occupancy at which to start a concurrent GC cycle.

    jvm_youngGenerationSizeIncrement

    integer

    20

    0 → 100

    yes

    The increment size for Young Generation adaptive resizing.

    jvm_tenuredGenerationSizeIncrement

    integer

    20

    0 → 100

    yes

    The increment size for Old/Tenured Generation adaptive resizing.

    jvm_adaptiveSizeDecrementScaleFactor

    integer

    4

    1 → 1024

    yes

    Specifies the scale factor for goal-driven generation resizing.

    jvm_G1HeapRegionSize

    integer

    megabytes

    8

    1 → 32

    yes

    Sets the size of the regions for G1.

    jvm_G1ReservePercent

    integer

    10

    0 → 50

    yes

    Sets the percentage of the heap that is reserved as a false ceiling to reduce the possibility of promotion failure for the G1 collector.

    jvm_G1NewSizePercent

    integer

    5

    0 → 100

    yes

    Sets the percentage of the heap to use as the minimum for the young generation size.

    jvm_G1MaxNewSizePercent

    integer

    60

    0 → 100

    yes

    Sets the percentage of the heap size to use as the maximum for young generation size.

    jvm_G1MixedGCLiveThresholdPercent

    integer

    85

    0 → 100

    yes

    Sets the occupancy threshold for an old region to be included in a mixed garbage collection cycle.

    jvm_G1HeapWastePercent

    integer

    5

    0 → 100

    yes

    The maximum percentage of the reclaimable heap before starting mixed GC.

    jvm_G1MixedGCCountTarget

    integer

    collections

    8

    0 → 100

    yes

    Sets the target number of mixed garbage collections after a marking cycle to collect old regions with at most G1MixedGCLiveThresholdPercent live data. The default is 8 mixed garbage collections.

    jvm_G1OldCSetRegionThresholdPercent

    integer

    10

    0 → 100

    yes

    The upper limit on the number of old regions to be collected during mixed GC.

    jvm_G1AdaptiveIHOPNumInitialSamples

    integer

    3

    1 → 2097152

    yes

    The number of completed time periods from initial mark to first mixed GC required to use the input values for prediction of the optimal occupancy to start marking.

    jvm_G1UseAdaptiveIHOP

    categorical

    +G1UseAdaptiveIHOP

    +G1UseAdaptiveIHOP, -G1UseAdaptiveIHOP

    yes

    Adaptively adjust the initiating heap occupancy from the initial value of InitiatingHeapOccupancyPercent.

    jvm_G1PeriodicGCInterval

    integer

    milliseconds

    0

    0 → 3600000

    yes

    The number of milliseconds to wait after a previous GC before triggering a periodic GC. A value of zero disables periodically enforced GC cycles.

    jvm_ZProactive

    categorical

    +ZProactive

    +ZProactive, -ZProactive

    yes

    Enable proactive GC cycles.

    jvm_ZUncommit

    categorical

    +ZUncommit

    +ZUncommit, -ZUncommit

    yes

    Enable uncommit (free) of unused heap memory back to the OS.

    jvm_ZAllocationSpikeTolerance

    integer

    2

    1 → 10

    yes

    The allocation spike tolerance factor for ZGC.

    jvm_ZFragmentationLimit

    integer

    25

    10 → 90

    yes

    The maximum allowed heap fragmentation for ZGC.

    jvm_ZCollectionInterval

    integer

    seconds

    0

    0 → 3600

    yes

    Force GC at a fixed time interval (in seconds) for ZGC.

    jvm_ZMarkStackSpaceLimit

    integer

    bytes

    8589934592

    33554432 → 1099511627776

    yes

    The maximum number of bytes allocated for mark stacks for ZGC.

    32 → 2048

    yes

    The maximum size of the compiled code cache pool.

    jvm_tieredCompilation

    categorical

    +TieredCompilation

    +TieredCompilation, -TieredCompilation

    yes

    The type of the garbage collection algorithm.

    jvm_tieredCompilationStopAtLevel

    integer

    4

    0 → 4

    yes

    Overrides the number of detected CPUs that the VM will use to calculate the size of thread pools.

    jvm_compilationThreads

    integer

    threads

    You should select your own default value.

    You should select your own domain.

    yes

    The number of compilation threads.

    jvm_backgroundCompilation

    categorical

    +BackgroundCompilation

    +BackgroundCompilation, -BackgroundCompilation

    yes

    Allow async interpreted execution of a method while it is being compiled.

    jvm_inline

    categorical

    +Inline

    +Inline, -Inline

    yes

    Enable inlining.

    jvm_maxInlineSize

    integer

    bytes

    35

    1 → 2097152

    yes

    The bytecode size limit (in bytes) of the inlined methods.

    jvm_inlineSmallCode

    integer

    bytes

    2000

    500 → 5000

    yes

    The maximum compiled code size limit (in bytes) of the inlined methods.

    jvm_maxInlineLevel

    integer

    15

    1 → 64

    yes

    The maximum number of nested calls that are inlined by high tier compiler.

    jvm_freqInlineSize

    integer

    bytes

    325

    1 → 3250

    yes

    The maximum number of bytecode instructions to inline for a method.

    jvm_compilationMode

    categorical

    default

    default, quick-only, high-only, high-only-quick-internal

    yes

    The JVM compilation mode.

    jvm_typeProfileWidth

    integer

    2

    1 → 8

    yes

    The number of receiver types to record in call/cast profile.

    +UsePerfData, -UsePerfData

    yes

    Enable monitoring of performance data.

    jvm_useNUMA

    categorical

    -UseNUMA

    +UseNUMA, -UseNUMA

    yes

    Enable NUMA.

    jvm_useBiasedLocking

    categorical

    +UseBiasedLocking

    +UseBiasedLocking, -UseBiasedLocking

    yes

    Manage the use of biased locking.

    jvm_activeProcessorCount

    integer

    CPUs

    1

    1 → 512

    yes

    Overrides the number of detected CPUs that the VM will use to calculate the size of thread pools.

    jvm_threadStackSize

    integer

    kilobytes

    1024

    128 → 16384

    yes

    The thread Stack Size (in Kbytes).
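As a worked illustration of how values of these parameters become JVM options: categorical values such as `+TieredCompilation` toggle boolean `-XX` flags, while integer parameters become `-XX:Name=value` settings. The sketch below assumes a simple naming convention (strip the `jvm_` prefix and capitalize the first letter); it is illustrative only, not the actual Akamas configuration logic, and parameters that map to short options such as `-Xmx` would need special-casing.

```python
def to_jvm_option(name, value):
    """Render one tuned parameter as a -XX command-line option (illustrative)."""
    # Categorical values already embed the flag name, e.g. "+TieredCompilation"
    # becomes -XX:+TieredCompilation.
    if isinstance(value, str) and value[:1] in ("+", "-"):
        return f"-XX:{value}"
    # Integer parameters: strip the "jvm_" prefix and capitalize the first
    # letter, e.g. jvm_maxInlineLevel=15 becomes -XX:MaxInlineLevel=15.
    flag = name[len("jvm_"):] if name.startswith("jvm_") else name
    flag = flag[0].upper() + flag[1:]
    return f"-XX:{flag}={value}"

config = {
    "jvm_G1HeapWastePercent": 5,
    "jvm_tieredCompilation": "+TieredCompilation",
    "jvm_maxInlineLevel": 15,
}
print(" ".join(to_jvm_option(k, v) for k, v in config.items()))
# -XX:G1HeapWastePercent=5 -XX:+TieredCompilation -XX:MaxInlineLevel=15
```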

The following parameters have a default value and domain that depend on the target environment:

| Parameter | Default value | Domain |
| --- | --- | --- |
| jvm_newSize | Depends on the configured heap | |
| jvm_maxNewSize | Depends on the configured heap | |
| jvm_concurrentGCThreads | Depends on the available CPU cores | Depends on the available CPU cores |
| jvm_parallelGCThreads | Depends on the available CPU cores | Depends on the available CPU cores |
| jvm_compilation_threads | Depends on the available CPU cores | Depends on the available CPU cores |

Constraint:

- `jvm_activeProcessorCount < container.cpu_limits/1000 + 1`
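The constraint above ties the active processor count to the container CPU limit, assuming `container.cpu_limits` is expressed in millicores (consistent with the division by 1000 to obtain whole CPUs). A minimal sketch of evaluating it:

```python
def constraint_ok(active_processor_count, cpu_limits_millicores):
    """Check jvm_activeProcessorCount < container.cpu_limits/1000 + 1."""
    return active_processor_count < cpu_limits_millicores / 1000 + 1

# With a 2000-millicore (2 CPU) limit, at most 2 active processors satisfy
# the constraint: 2 < 3 holds, 3 < 3 does not.
print(constraint_ok(2, 2000))  # True
print(constraint_ok(3, 2000))  # False
```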

| Metric | Unit | Description |
| --- | --- | --- |
| mem_used | bytes | The total amount of memory used |
| jvm_heap_size | bytes | The size of the JVM heap memory |
| cpu_util | percent | The average CPU utilization % across all the CPUs (i.e., how much time on average the CPUs are busy doing work) |
| cpu_used | CPUs | The total amount of CPUs used |
| jvm_gc_time | percent | The % of wall-clock time the JVM spent doing stop-the-world garbage collection activities |
| jvm_gc_count | collections/s | The number of stop-the-world JVM garbage collections that occurred per second |
| jvm_threads_current | threads | The total number of active threads within the JVM |
| jvm_threads_deadlocked | threads | The total number of deadlocked threads within the JVM |
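As an illustration of the jvm_gc_time metric above (the percentage of wall-clock time spent in stop-the-world GC), it can be derived from the increase in cumulative GC pause time between two samples; the sample values below are made up:

```python
def gc_time_percent(gc_seconds_delta, interval_seconds):
    """Percent of a sampling interval spent in stop-the-world GC pauses."""
    return 100.0 * gc_seconds_delta / interval_seconds

# 1.5 s of accumulated GC pauses over a 60 s sampling interval -> 2.5%
print(gc_time_percent(1.5, 60))  # 2.5
```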

| Parameter | Type | Unit |
| --- | --- | --- |
| jvm_minHeapSize | integer | megabytes |
| jvm_newRatio | integer | |
| jvm_reservedCodeCacheSize | integer | megabytes |
| jvm_usePerfData | categorical | |

| Parameter | Default value | Domain |
| --- | --- | --- |
| jvm_minHeapSize | You should select your own default value. | Depends on the instance available memory |
| jvm_maxHeapSize | | Depends on the instance available memory |
| jvm_newRatio | 2 | |
| jvm_reservedCodeCacheSize | 240 | |
| jvm_usePerfData | +UsePerfData | |

Constraints:

- `jvm.jvm_minHeapSize <= jvm.jvm_maxHeapSize`
- `jvm.jvm_minHeapFreeRatio <= jvm.jvm_maxHeapFreeRatio`
- `jvm.jvm_maxNewSize < jvm.jvm_maxHeapSize * 0.8`
- `jvm.jvm_concurrentGCThreads <= jvm.jvm_parallelGCThreads`
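A minimal sketch of checking these constraints against a candidate configuration; the dictionary keys mirror the `jvm.*` parameter names and the sample values are made up for illustration (this is not the Akamas constraint engine):

```python
def satisfies_constraints(cfg):
    """Evaluate the four parameter constraints listed above."""
    return (
        cfg["jvm_minHeapSize"] <= cfg["jvm_maxHeapSize"]
        and cfg["jvm_minHeapFreeRatio"] <= cfg["jvm_maxHeapFreeRatio"]
        and cfg["jvm_maxNewSize"] < cfg["jvm_maxHeapSize"] * 0.8
        and cfg["jvm_concurrentGCThreads"] <= cfg["jvm_parallelGCThreads"]
    )

candidate = {
    "jvm_minHeapSize": 512,        # megabytes
    "jvm_maxHeapSize": 2048,       # megabytes
    "jvm_minHeapFreeRatio": 40,
    "jvm_maxHeapFreeRatio": 70,
    "jvm_maxNewSize": 1024,        # 1024 < 2048 * 0.8 = 1638.4
    "jvm_concurrentGCThreads": 2,
    "jvm_parallelGCThreads": 4,
}
print(satisfies_constraints(candidate))  # True
```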