This section provides some best practices to adopt before launching optimization studies, in particular offline optimization studies.
It is recommended to execute a dry-run of the study to verify that the workflow works as expected and in particular that the telemetry and configuration management steps are correctly executed.
Verify that the workflow actually works
It is important to verify that all the steps of the workflow complete successfully and produce the expected results.
Verify that parameters are applied and effective
When approaching the optimization of new applications or technologies, it is important to make sure all the parameters that are being set are actually applied and used by the system.
Depending on the specific technology at hand, the following issues can be found:
parameters are set but not applied - for example, they were written to the wrong configuration file or the file path is incorrect;
some automatic (corrective) mechanism is in place that overrides the values applied for the parameters.
Therefore, it is important to always verify the actual values of the parameters once the system is up and running with the new configuration, and to make sure they match the values applied by Akamas. This is typically done by leveraging:
monitoring tools, when the parameters are available as metrics or properties of the system;
native administration tools, which are typically available for introspection or troubleshooting activities (e.g. jcmd for the JVM).
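As an example, the following is a minimal sketch for the JVM case: it compares the flag values reported by jcmd VM.flags against the expected ones. The process id, the flag names, and the expected values are purely illustrative and should be replaced with the parameters actually being optimized.

```python
# Minimal sketch: verify that the JVM flags applied by the optimizer are in effect.
# Assumptions: the JVM runs on the same host, `jcmd` is on the PATH, and the pid and
# expected values below are purely illustrative.
import subprocess

EXPECTED_FLAGS = {                      # hypothetical parameters under optimization
    "MaxHeapSize": str(2 * 1024**3),    # 2 GiB
    "UseG1GC": "true",
}

def jvm_flags(pid: int) -> dict:
    """Return the flags reported by `jcmd <pid> VM.flags` as a name -> value dict."""
    out = subprocess.run(["jcmd", str(pid), "VM.flags"],
                         capture_output=True, text=True, check=True).stdout
    flags = {}
    for token in out.split():
        if token.startswith("-XX:"):
            name, _, value = token[4:].lstrip("+-").partition("=")
            # Boolean flags appear as -XX:+Flag / -XX:-Flag, without '='
            flags[name] = value if value else ("true" if token[4] == "+" else "false")
    return flags

def verify(pid: int) -> None:
    actual = jvm_flags(pid)
    for name, expected in EXPECTED_FLAGS.items():
        print(f"{name}: expected={expected} actual={actual.get(name, '<missing>')}")

if __name__ == "__main__":
    verify(12345)  # replace with the actual JVM process id
```

A similar check can be built for any other technology that exposes its effective configuration through an administration tool or API.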
Verify that load testing works
It is important to verify that the integration with load testing tools actually executes the intended load test scenarios.
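As a sketch of such a check, the following script summarizes a load test results file, assuming the load testing tool is JMeter and that it writes results in CSV (JTL) format with the default column headers; the file name is illustrative. Comparing the number of samples, the error count, and the duration against the intended scenario helps confirm that the right test was actually executed.

```python
# Minimal sketch: sanity-check a load test run by inspecting its results file.
# Assumptions: the results are a JMeter JTL file in CSV format with the default
# column names; the file path is illustrative.
import csv
from collections import Counter

def summarize_jtl(path: str) -> None:
    samples, errors, labels = 0, 0, Counter()
    first_ts, last_ts = None, None
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            samples += 1
            labels[row["label"]] += 1
            if row["success"].lower() != "true":
                errors += 1
            ts = int(row["timeStamp"])
            first_ts = ts if first_ts is None else min(first_ts, ts)
            last_ts = ts if last_ts is None else max(last_ts, ts)
    duration_s = (last_ts - first_ts) / 1000 if samples else 0
    print(f"samples={samples} errors={errors} duration={duration_s:.0f}s")
    print("requests per label:", dict(labels))

if __name__ == "__main__":
    summarize_jtl("results.jtl")  # hypothetical results file produced by the load test
```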
Verify that telemetry collects all the relevant metrics
It is important to make sure that the integration with telemetry providers works correctly and that all the relevant metrics of the system are correctly collected.
Data gathering from the telemetry data sources is launched at the end of the workflow tasks. The status of the telemetry process can be inspected in the Progress tab, where it is also possible to review the telemetry logs in case of failures.
Please note that the telemetry process fails if the key metrics of the study cannot be gathered. This includes the metrics defined in the goal function or in the constraints.
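As an example, the following minimal sketch checks that the key metrics were actually collected over the experiment time window, assuming the telemetry provider is Prometheus; the endpoint, metric names, and time window are illustrative.

```python
# Minimal sketch: check that the key metrics of the study were collected over the
# experiment time window. Assumptions: the telemetry provider is Prometheus, reachable
# at the URL below; the metric names and the time window are illustrative.
import json
import time
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://prometheus.example.com:9090"  # hypothetical endpoint
KEY_METRICS = ["jvm_memory_used_bytes", "http_server_requests_seconds_count"]  # hypothetical

def samples_in_window(metric: str, start: float, end: float) -> int:
    """Count the data points Prometheus returns for `metric` over [start, end]."""
    params = urllib.parse.urlencode(
        {"query": metric, "start": start, "end": end, "step": "30s"})
    with urllib.request.urlopen(f"{PROMETHEUS_URL}/api/v1/query_range?{params}") as resp:
        payload = json.load(resp)
    return sum(len(series["values"]) for series in payload["data"]["result"])

if __name__ == "__main__":
    end = time.time()
    start = end - 3600  # adjust to the actual experiment window
    for metric in KEY_METRICS:
        count = samples_in_window(metric, start, end)
        print(f"{metric}: {count} samples {'[OK]' if count else '[MISSING]'}")
```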
Before running the optimization study, it is important to make sure that the system and the environment where the optimization is running provide stable and reproducible performance.
Make sure the system performance is stable
In order to ensure a successful optimization, it is important to make sure that the target system displays stable and predictable performance and does not suffer from random variations.
To make sure this is the case, it is recommended to create a study that only runs a single baseline experiment. To assess the performance of the system, Akamas trials can be used to execute the same experiment (hence, the same configuration) multiple times (e.g. three times). Once the experiment is completed, the resulting performance metrics can be analyzed to assess stability. The analysis can be done either on aggregate metrics in the Analysis tab or, at a deeper level, on the actual time series in the Metrics tab of the Akamas UI.
Ideally, no significant performance variation should be observed across the different trials for the key system performance metrics. Otherwise, it is strongly recommended to identify the root cause before proceeding with the actual optimization activity.
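As a simple way to quantify stability, the following sketch computes the coefficient of variation of a key metric across trials; the per-trial values (e.g. average throughput taken from the Analysis tab) and the 5% threshold are illustrative, not prescribed values.

```python
# Minimal sketch: assess baseline stability by comparing a key metric across trials.
# Assumptions: the per-trial values below are illustrative (e.g. average throughput
# per trial); the 5% threshold is a rule of thumb, not a prescribed value.
from statistics import mean, stdev

TRIAL_THROUGHPUT = [1520.0, 1498.0, 1533.0]  # hypothetical values, one per trial
CV_THRESHOLD = 0.05                          # flag variations above 5%

def coefficient_of_variation(values: list[float]) -> float:
    return stdev(values) / mean(values)

cv = coefficient_of_variation(TRIAL_THROUGHPUT)
print(f"mean={mean(TRIAL_THROUGHPUT):.1f} cv={cv:.2%}")
if cv > CV_THRESHOLD:
    print("Performance looks unstable: investigate before starting the optimization.")
else:
    print("Performance looks stable across trials.")
```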
Before launching the optimization, it is a good idea to take note of (or back up) the original configuration. This is particularly important when optimizing Linux OS parameters.
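For the Linux OS case, the following minimal sketch snapshots the current kernel parameters to a timestamped file before the optimization starts; it assumes the script runs on the target host with sysctl available, and the output path is illustrative.

```python
# Minimal sketch: snapshot the current Linux kernel parameters before the optimization,
# so the original configuration can be consulted or restored later. Assumptions: the
# script runs on the target host and `sysctl` is available; the output path is
# illustrative.
import subprocess
from datetime import datetime
from pathlib import Path

backup_file = Path(f"sysctl-backup-{datetime.now():%Y%m%d-%H%M%S}.conf")
snapshot = subprocess.run(["sysctl", "-a"], capture_output=True, text=True).stdout
backup_file.write_text(snapshot)
print(f"Saved {len(snapshot.splitlines())} kernel parameters to {backup_file}")
```

Since the saved file uses the same key = value format as sysctl.conf, it can later serve as a reference for restoring the original values (read-only parameters may need to be skipped when reloading).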