Study

Now that Akamas knows about your application, how to configure it, and how to monitor it, the final step is to define your optimization study.

The study defines the objective of the optimization activity. It specifies what we want to achieve (e.g. reduce costs, improve latency), the parameters that can be optimized, and any SLOs that must not be breached by the optimized configuration.

Studies are divided into two main categories:

  • Offline Studies are generally executed in test environments, where the application workload is generated by a load-testing tool. You can read more here.

  • Live Studies are usually executed in production environments. You can read more here.

The setup of the two study types is similar, as both consist of the following core elements:

  • Name: A unique identifier that can be used to identify different studies.

  • System: The name of the system that we want to optimize.

  • Workflow: The name of the workflow that will be used to configure the application.

  • Goal: The objective of the optimization (e.g. minimize cost, maximize throughput, reduce latency).

  • Parameter Selection: A list of parameters that will be tuned in the optimization (e.g. container memory and CPU limits, EC2 instance family).

  • Steps: The flow of the optimization study (e.g. assessing the baseline performance, optimizing the system, restoring the configuration).

The system and the workflow, already introduced in the previous sections, are referenced in the study definition to provide Akamas with information on how to apply the parameters (through the workflow) and retrieve the metrics (through the telemetry instances in the system) that are used to calculate the goal.

Goal

The goal defines the objective of our optimization. Specifying a goal is as simple as defining the metric we want to optimize and the direction of the optimization, such as maximizing throughput or minimizing cost. If you want to optimize more complex scenarios, or lack a single metric that represents your objective, you can also specify a formula and define a goal such as minimizing memory and CPU utilization.

Metrics are identified within a study with the notation component.metric_name, where component is the name of a component of the system linked to the study and metric_name is the name of a metric. As an example, the CPU utilization of a container might be identified by MyContainer.cpu_util.
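As a minimal sketch, a goal that minimizes a combination of CPU and memory utilization could look like the following. It reuses the goal layout of the study file shown later on this page. The component MyContainer, the metric memory_util, and the use of a plain sum in the formula are assumptions made for illustration, not names taken from a real system.

```yaml
goal:
  objective: minimize
  function:
    # Hypothetical formula: sum of two metrics of the same component,
    # each written as component.metric_name
    formula: MyContainer.cpu_util + MyContainer.memory_util
```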

Another important, although optional, element of the goal is the definition of constraints on other metrics of the system. In many cases, optimizing a system involves finding a tradeoff between multiple aspects: reducing the amount of CPU assigned to a container, for instance, might reduce the cost of running the system but increase its response time. Goal constraints can be used to map SLOs and inform Akamas about the aspects of the system that we want to safeguard during the optimization. As an example, constraints can specify an upper limit on the response time or on the memory utilization of the system. You can find more information on how to specify constraints in the reference documentation section.
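The snippet below sketches how such upper limits might be expressed, following the constraint layout of the study file shown later on this page. The component names (MyApi, MyContainer), the metric names, and the thresholds are placeholders, and the units depend on how each metric is defined in your system.

```yaml
goal:
  objective: minimize
  function:
    formula: MyContainer.cost
  constraints:
    absolute:
      # Hypothetical upper limit on the response time
      - name: response_time_limit
        formula: MyApi.response_time <= 100
      # Hypothetical upper limit on the memory utilization
      - name: memory_util_limit
        formula: MyContainer.memory_util <= 0.8
```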

Parameter Selection

The parameter selection contains the list of parameters that are subject to the optimization process. These might include several components and layers, as in the following example.

Similarly to metrics, parameters are identified with the notation component.parameter_name.

Optionally, you can also specify the range of values that can be assigned to a parameter. This is very useful when you want to explore a specific optimization area or add some context to the optimization (e.g. avoid setting a memory limit greater than 8 GB because that much memory is not available on the system).

The parameter selection can include any component and parameter of the system. During the optimization process, Akamas will provide values for those parameters and apply them to the system using the workflow provided in the study definition.
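As an illustration, a parameter selection that caps the memory of a container while leaving another parameter unrestricted might be sketched as follows. The component name MyContainer is a placeholder, and the upper bound of 8192 assumes that memory_limit is expressed in megabytes.

```yaml
parametersSelection:
  # Custom domain: explore only values up to 8 GB
  # (assumes the parameter is expressed in MB)
  - name: MyContainer.memory_limit
    domain: [512, 8192]
  # No domain specified: no custom range is applied to this parameter
  - name: MyContainer.cpu_limit
```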

Steps

If the goal describes where we are heading, steps describe the road to get there. Usually, when optimizing an application, we want to assess its performance before the tuning activity so that we can evaluate the benefits; this initial assessment is called the Baseline. Then, we want to run the optimization process for a definite number of iterations; this is called an Optimization step. Many other use cases can be achieved by providing additional steps to the study. Some of these include:

  • Re-using knowledge gathered by other optimization studies

  • Applying the baseline configuration to the test environment after the optimization has ended

  • Evaluating a specific configuration suggested by the user

You can find more information on the steps in the reference documentation section.
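A possible sequence of steps is sketched below. The baseline and optimize steps mirror the study file shown later on this page. The final step, which re-applies the baseline values once the optimization has ended, assumes a step type named preset that accepts a values map. Treat that step and all component names as assumptions and check the reference documentation for the exact step types and fields.

```yaml
steps:
  # Assess the performance of the current configuration
  - name: baseline
    type: baseline
    values:
      MyContainer.cpu_limit: 500
      MyContainer.memory_limit: 1024

  # Run a fixed number of optimization experiments
  - name: optimize
    type: optimize
    numberOfExperiments: 30

  # Assumption: a "preset" step that applies a user-defined configuration,
  # used here to restore the baseline values
  - name: restore-baseline
    type: preset
    values:
      MyContainer.cpu_limit: 500
      MyContainer.memory_limit: 1024
```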

Besides the goal, parameter selection, and steps, the study can be enriched with other optional elements that help tailor it to your specific needs. These include, for example, automated windowing and parameter constraints. You can find more information on these optional elements in the specific subsections, or read the entire study definition in the reference documentation section.

Optimizing the Online Boutique

Recalling the application example introduced in this section, our optimization objective is to reduce the cost of running the Ad service while meeting our SLO on the response time.

As shown in the image below, you can use the study creation wizard in the UI to specify all the required information.

If you prefer to define it via YAML, you can use the following file.

name: Reduce Costs
system: Online Boutique
workflow: Configure and Test Boutique

goal:
  objective: minimize
  function:
    formula: Adservice.cost
  constraints:
    absolute:
      - name: response_time
        formula: Apis.requests_response_time <= 20

parametersSelection:
  - name: Adservice.cpu_limit
    domain: [150, 1000]
  - name: Adservice.memory_limit
    domain: [64, 2048]
  - name: AdserviceJVM.jvm_maxRAMPercentage
  - name: AdserviceJVM.jvm_gcType

steps:
  - name: baseline
    type: baseline
    values:
      Adservice.cpu_limit: 500
      Adservice.memory_limit: 1024
      AdserviceJVM.jvm_maxRAMPercentage: 25

  - name: optimize
    type: optimize
    numberOfExperiments: 30

Save it to a file named, for example, study.yaml and then issue the command:

akamas create study study.yaml

This study's definition contains three main parts.

The goal

In this section, we instruct Akamas to minimize the cost of the Adservice component, subject to a constraint: the value of the metric requests_response_time of the Apis component must not exceed 20 ms. This is an absolute constraint, as it is defined on the actual value of the metric, and it can easily map an SLO. You can also express constraints like "do not make the response time increase more than 10%" by using relative constraints. You can find more info on the supported constraint types in the reference documentation section.

The parameters selection

In this section, we defined which parameters Akamas can change to achieve its goal. We decided to include parameters from both the JVM and the container layers to let Akamas tune them together. We also specified a custom domain for two of the parameters (the CPU and memory limits), so that Akamas explores only values within those ranges. Note that this is optional, as Akamas already knows the range of possible values of many parameters. You can find more info on the available parameters, and guidelines for choosing them in different use cases, in the optimization guides section.

The steps

This final section instructs Akamas to first assess the performance and cost of the current configuration, which we refer to as the baseline, and then run 30 experiments that change the parameters to optimize the goal.

You can now start your optimization study and wait for Akamas to find the best configuration!
