General optimization process and methodology

Akamas is designed to support organizations in implementing their own approach to optimization, in particular through its Infrastructure as Code (IaC) design, its modular and reusable constructs, and its delegation-of-duty features that support multiple teams.

An optimization process can be a one-shot exercise that optimizes a specific critical application, either to remediate performance issues or to support a cost-reduction initiative. In general, however, optimization is conceived as a continuous and iterative process. This process can be seen as composed of multiple optimization campaigns running in parallel, each typically involving a single application (see the following figure).

In Akamas, an optimization campaign is structured into one or more optimization studies, each representing an optimization initiative aimed at optimizing a target system with respect to defined goals and constraints.

These studies can be either offline optimization studies, which are typically executed in test or pre-production environments (including to validate planned changes or what-if scenarios), or live optimization studies, which run directly in production environments.

At any given time, for a specific application, multiple studies may be executing either in parallel or in sequence (see the following figure):

multiple live optimizations, one for each critical microservice of the application; typically, a live optimization focuses on a microservice supporting a specific business function, with its own optimization goals and constraints: for some microservices the aim may be to improve performance at the expense of cost, while for others it may be to keep performance within the SLOs while reducing infrastructure or cloud costs;
multiple offline optimization studies, which may correspond to the different layers of the target system being optimized in several stages (typically starting with the backend layer, then the middleware, and finally the front-end layer), to several application releases with different resource footprints (e.g. higher memory usage), to technology changes in the application stack (e.g. moving from Oracle to MongoDB), to a migration to a different cloud provider (or cloud managed service), to the need to sustain a higher workload (e.g. due to a marketing campaign), or to the need to ensure application resilience under failure scenarios (identified by chaos engineering).

The following figure illustrates the variety of scenarios in a real optimization process:

For example (with reference to the previous figure):

the optimization campaign for the microservices-based application App-1 runs an offline optimization study for the App-1-1 microservice in Q1 and for the App-1-2 microservice in Q2, then runs live optimizations for both microservices in parallel starting from Q3; note that in Q4 an offline optimization for App-1-2 (possibly the most resource-demanding microservice) is also executed, possibly to anticipate workload growth and assess the required infrastructure;
the optimization campaign for the standalone application App-2 runs several offline optimizations in sequence: in Q1 and Q2, first separately on the frontend and backend layers of App-2 (App-2-FE and App-2-BE, respectively), and then in Q3 on the entire application; in Q4, in addition to the quarterly optimization of App-2 with respect to Goal-2-1 (the goal used in the previous optimizations), another offline optimization is executed with respect to a different goal, Goal-2-2, which could be either a refinement of the previous goal (e.g. with tighter SLOs) or a completely different goal (e.g. a cost-reduction goal instead of a performance-improvement goal);
the optimization campaign for the microservices-based application App-3 first runs a live optimization for the most critical microservice, App-3-1, starting at some point in Q2 (for example, when the application is first released), and then in Q3 adds a live optimization for another microservice, App-3-2, possibly as a refinement of the modeling of App-3 based on the observed optimization results.
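
The three campaigns above can also be written down as plain data, which makes it easy to check which studies overlap in a given quarter. The following Python sketch is purely illustrative and is not an Akamas construct or API: the `Study` class, its fields, and the helper functions are hypothetical names. It also assumes that live optimizations keep running through Q4 once started, that App-2-FE is optimized in Q1 and App-2-BE in Q2, and that all App-2 studies before the Goal-2-2 one use Goal-2-1.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Hypothetical record for one optimization study (not an Akamas type)."""
    campaign: str   # application the campaign targets
    target: str     # component or system being optimized
    kind: str       # "offline" or "live"
    start: int      # first quarter in which the study runs (1-4)
    end: int        # last quarter in which the study runs, inclusive
    goal: str = ""  # optional goal label


# Timeline from the example. Assumptions: live studies run through Q4,
# and App-2-FE / App-2-BE are optimized in Q1 / Q2 respectively.
STUDIES = [
    Study("App-1", "App-1-1", "offline", 1, 1),
    Study("App-1", "App-1-2", "offline", 2, 2),
    Study("App-1", "App-1-1", "live", 3, 4),
    Study("App-1", "App-1-2", "live", 3, 4),
    Study("App-1", "App-1-2", "offline", 4, 4),
    Study("App-2", "App-2-FE", "offline", 1, 1, "Goal-2-1"),
    Study("App-2", "App-2-BE", "offline", 2, 2, "Goal-2-1"),
    Study("App-2", "App-2", "offline", 3, 3, "Goal-2-1"),
    Study("App-2", "App-2", "offline", 4, 4, "Goal-2-1"),
    Study("App-2", "App-2", "offline", 4, 4, "Goal-2-2"),
    Study("App-3", "App-3-1", "live", 2, 4),
    Study("App-3", "App-3-2", "live", 3, 4),
]


def active_in(quarter):
    """Return the studies that are running during the given quarter."""
    return [s for s in STUDIES if s.start <= quarter <= s.end]


def by_campaign(quarter):
    """Group the studies running in a quarter by their campaign."""
    grouped = defaultdict(list)
    for study in active_in(quarter):
        grouped[study.campaign].append(study)
    return dict(grouped)


if __name__ == "__main__":
    for quarter in range(1, 5):
        print(f"Q{quarter}")
        for campaign, studies in sorted(by_campaign(quarter).items()):
            labels = ", ".join(f"{s.target} ({s.kind})" for s in studies)
            print(f"  {campaign}: {labels}")
```

Under these assumptions, Q4 is the busiest quarter, since every campaign has at least two studies running at once.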

More complex scenarios may arise when multiple teams work (jointly or separately) on the same or different applications, which in Akamas can be organized in different workspaces.
