In this example, you will go through the optimization of a Spark application running on AWS instances. We’ll be using a PageRank implementation included in Renaissance, an industry-standard Java benchmarking suite, tuning both Java and AWS parameters to improve the performance of our application.
For this example, you’re expected to use two dedicated machines:
an Akamas instance
a Linux-based AWS EC2 instance
The Akamas instance provisions and manipulates EC2 instances, so it must be enabled to do so: it needs the appropriate permissions on AWS, an integration with an orchestration tool (such as Ansible), and an inventory linked to your AWS EC2 environment.
The Linux-based instance will run the application benchmark, so it requires the latest OpenJDK 11 release.
For this study you’re going to require the following telemetry providers:
the CSV telemetry provider, to parse the results of the benchmark
the Prometheus telemetry provider, to monitor the instance
a telemetry provider to extract the instance price
The Renaissance suite provides the benchmark we’re going to optimize.
Since the application consists of a jar file only, the setup is rather straightforward; just download the binary in the ~/renaissance/ folder:
In the same folder, upload the template file launch_benchmark.sh.templ, containing the script that executes the benchmark using the provided parameters and parses the results:
You may find further info about the suite and its benchmarks in the official Renaissance documentation.
In this section, we will guide you through the steps required to set up the optimization on Akamas.
This example requires the installation of the optimization packs that provide the java-openjdk-11 and ec2 component types used below.
Our system could be named renaissance after its application, so you’ll have a system.yaml file like this:
Then create the new system resource:
The renaissance system will then have three components:
A benchmark component
A Java component
An EC2 component, i.e. the underlying instance
Java component
Create a component-jvm.yaml file like the following:
Then type:
Benchmark component
Since there is no optimization pack associated with this component, you have to create some extra resources.
A metrics.yaml file for a new metric tracking execution time:
A component-type benchmark.yaml:
The component pagerank.yaml:
Create your new resources by typing the following commands in your terminal:
EC2 component
Create a component-ec2.yaml file like the following:
Then create its resource by typing in your terminal:
The workflow in this example is composed of three main steps:
Update the instance type
Run the application benchmark
Stop the instance
To manage the instance we are going to integrate a very simple Ansible playbook in our workflow: the FileConfigurator operator replaces the parameters in the template file to generate the playbook, which is then run by the Executor operator.
In detail:
Update the instance size
  Generate the playbook file from the template
  Update the instance using the playbook
  Wait for the instance to be available
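As a rough illustration of the "generate the playbook file from the template" step, the sketch below replaces the two EC2 placeholders in a one-line template with concrete values using sed. It only mimics the idea of token substitution and is not how the FileConfigurator operator is implemented; the function name render_template is hypothetical, and the values c5 and 2xlarge are simply the ones used for the baseline of this study.

```shell
#!/bin/bash
# Sketch: substitute ${ec2.*} placeholders in a template file with concrete values.
# render_template is a hypothetical helper, not part of Akamas.

render_template() {
  local template="$1" type="$2" size="$3"
  # [$] and [.] match the literal characters, so the shell does not try to expand the token.
  sed -e "s/[$]{ec2[.]aws_ec2_instance_type}/${type}/g" \
      -e "s/[$]{ec2[.]aws_ec2_instance_size}/${size}/g" "$template"
}

tmpl=$(mktemp)
cat > "$tmpl" <<'EOF'
--instance-type "${ec2.aws_ec2_instance_type}.${ec2.aws_ec2_instance_size}"
EOF

render_template "$tmpl" c5 2xlarge
```

Each experiment produces a different rendered file, so the same playbook template can drive every instance type and size explored by the study.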
The following is the template of the Ansible playbook:
The following is the workflow configuration file:
If you have not installed the Prometheus telemetry provider or the CSV telemetry provider yet, take a look at the Prometheus and CSV telemetry provider pages to proceed with the installation.
Prometheus
Prometheus allows us to gather JVM execution metrics through the JMX exporter: download the Java agent required to gather the metrics, then update the following two files:
The prometheus.yml file, located in your Prometheus folder:
The config.yml file you have to create in the ~/renaissance folder:
Now you can create a prometheus-instance.yaml file:
Then you can install the telemetry instance:
You may find further info on exporting Java metrics to Prometheus in the JMX exporter documentation.
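For the exporter to publish metrics, the JVM has to be started with the agent attached. The sketch below shows one way the java command line could be assembled, using the JMX exporter's -javaagent:<jar>=<port>:<config> form. The jar file name and its location are assumptions (use the name of the file you actually downloaded); port 9110 matches the jmx job target in prometheus.yml, and config.yml is the file created in ~/renaissance. The command is printed rather than executed.

```shell
#!/bin/bash
# Sketch: assemble a java command line that attaches the JMX exporter agent.
# Assumption: the agent jar was saved as jmx_prometheus_javaagent.jar in ~/renaissance.
AGENT_JAR="$HOME/renaissance/jmx_prometheus_javaagent.jar"
AGENT_PORT=9110                               # must match the jmx target in prometheus.yml
AGENT_CONFIG="$HOME/renaissance/config.yml"   # exporter configuration file

cmd=(java "-javaagent:${AGENT_JAR}=${AGENT_PORT}:${AGENT_CONFIG}"
     -jar renaissance.jar -r 50 --csv renaissance.csv page-rank)

# Print the command instead of running it, so it can be reviewed first.
printf '%s\n' "${cmd[*]}"
```

Keeping the port in a single variable makes it easier to keep the agent and the Prometheus scrape target aligned.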
CSV - Telemetry instance
Create a telemetry-csv.yaml file to read the benchmark output:
Then create the resource by typing in your terminal:
Here we provide a reference study for AWS. As anticipated, the goal of this study is to optimize a sample Java application: the PageRank benchmark you may find in the Renaissance benchmark suite by Oracle.
Our goal is rather simple: minimizing the product of the benchmark execution time and the instance price, that is, finding the most cost-effective instance for our application.
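To see why the product is a useful score, consider the small sketch below. It multiplies an elapsed time by an hourly price for two made-up configurations; the timings and prices are invented for illustration and are neither measurements nor AWS list prices. The function name cost_score is hypothetical, and the elapsed time is converted from nanoseconds to seconds only for readability, since a constant factor does not change which configuration scores lowest.

```shell
#!/bin/bash
# Sketch: score = elapsed time * instance price (lower is better).
# All numbers below are hypothetical.

cost_score() {
  local elapsed_ns="$1" price_per_hour="$2"
  awk -v t="$elapsed_ns" -v p="$price_per_hour" \
    'BEGIN { printf "%.2f\n", (t / 1e9) * p }'
}

cost_score 120000000000 0.34   # hypothetical: 120 s on a cheaper instance
cost_score 70000000000 0.68    # hypothetical: 70 s on an instance costing twice as much
```

In this invented case the second instance is faster, yet the first one gets the lower score because its price advantage outweighs the longer run time. This is the kind of trade-off the study goal captures.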
Create a study.yaml file with the following content:
Then create the corresponding Akamas resource and start the study:
The remaining workflow steps, in detail:
Run the application benchmark
  Configure the benchmark Java launch script
  Execute the launch script
  Parse PageRank output to make it consumable by the CSV telemetry instance
Stop the instance
  Configure the playbook to stop an instance with a specific instance id
  Run the playbook to stop the instance
sudo apt install openjdk-11-jre

mkdir ~/renaissance
cd ~/renaissance
wget -O renaissance.jar https://github.com/renaissance-benchmarks/renaissance/releases/download/v0.10.0/renaissance-gpl-0.10.0.jar

#!/bin/bash
# Run the benchmark; ${jvm.*} is the placeholder replaced with the JVM parameters under test
java -XX:MaxRAMPercentage=60 ${jvm.*} -jar renaissance.jar -r 50 --csv renaissance.csv page-rank
# Sum the second column over all repetitions
total_time=$(awk -F"," '{total_time+=$2}END{print total_time}' ./renaissance.csv)
first_line=$(head -n 1 renaissance.csv)
end_time=$(tail -n 1 renaissance.csv | cut -d',' -f3)
start_time=$(sed '2q;d' renaissance.csv | cut -d',' -f4)
# Write a single summary row, adding the timestamp and component columns read by the CSV telemetry instance
echo "$first_line,TS,COMPONENT" > renaissance-parsed.csv
ts=$(date -d "@$(($start_time/1000))" "+%Y-%m-%d %H:%M:%S")
echo "page-rank,$total_time,$end_time,$start_time,$ts,pagerank" >> renaissance-parsed.csv

name: jvm
description: The JVM running the benchmark
componentType: java-openjdk-11
properties:
  prometheus:
    job: jmx
    instance: jmx_instance

akamas create component component-jvm.yaml renaissance

metrics:
  - name: elapsed
    unit: nanoseconds
    description: The duration of the benchmark execution

name: benchmark
description: A component type for the Renaissance Java benchmarking suite
metrics:
  - name: elapsed
parameters: []

name: pagerank
description: The pagerank application included in Renaissance benchmarks
componentType: benchmark

akamas create metrics metrics.yaml
akamas create component-type benchmark.yaml
akamas create component pagerank.yaml renaissance

name: instance
description: The ec2 instance the benchmark runs on
componentType: ec2
properties:
  hostname: renaissance.akamas.io
  sshPort: 22
  instance: ec2_instance
  username: ubuntu
  key: # SSH KEY
  ec2:
    region: us-east-2 # This is just a reference

akamas create component component-ec2.yaml renaissance

# Change instance type, requires AWS CLI
- name: Resize the instance
  hosts: localhost
  gather_facts: no
  connection: local
  tasks:
    - name: save instance info
      ec2_instance_info:
        filters:
          "tag:Name": <your-instance-name>
      register: ec2
    - name: Stop the instance
      ec2:
        region: <your-aws-region>
        state: stopped
        instance_ids:
          - "{{ ec2.instances[0].instance_id }}"
        instance_type: "{{ ec2.instances[0].instance_type }}"
        wait: True
    - name: Change the instances ec2 type
      shell: >
        aws ec2 modify-instance-attribute --instance-id "{{ ec2.instances[0].instance_id }}"
        --instance-type "${ec2.aws_ec2_instance_type}.${ec2.aws_ec2_instance_size}"
      delegate_to: localhost
    - name: restart the instance
      ec2:
        region: <your-aws-region>
        state: running
        instance_ids:
          - "{{ ec2.instances[0].instance_id }}"
        wait: True
      register: ec2
    - name: wait for SSH to come up
      wait_for:
        host: "{{ item.public_dns_name }}"
        port: 22
        delay: 60
        timeout: 320
        state: started
      with_items: "{{ ec2.instances }}"

name: Pagerank AWS optimization
tasks:
  # Creating the EC2 instance
  - name: Configure provisioning
    operator: FileConfigurator
    arguments:
      sourcePath: /home/ubuntu/ansible/resize.yaml.templ
      targetPath: /home/ubuntu/ansible/resize.yaml
      host:
        hostname: bastion.akamas.io
        username: ubuntu
        key: # SSH KEY
  - name: Execute Provisioning
    operator: Executor
    arguments:
      command: ansible-playbook /home/akamas/ansible/resize.yaml
      host:
        hostname: bastion.akamas.io
        username: akamas
        key: # SSH KEY
  # Waiting for the instance to come up and set up its DNS
  - name: Pause
    operator: Sleep
    arguments:
      seconds: 120
  # Running the benchmark
  - name: Configure Benchmark
    operator: FileConfigurator
    arguments:
      source:
        hostname: renaissance.akamas.io
        username: ubuntu
        path: /home/ubuntu/renaissance/launch_benchmark.sh.templ
        key: # SSH KEY
      target:
        hostname: renaissance.akamas.io
        username: ubuntu
        path: /home/ubuntu/renaissance/launch_benchmark.sh
        key: # SSH KEY
  - name: Launch Benchmark
    operator: Executor
    arguments:
      command: bash /home/ubuntu/renaissance/launch_benchmark.sh
      host:
        hostname: renaissance.akamas.io
        username: ubuntu
        key: # SSH KEY

Create the workflow resource by typing in your terminal:

# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
  - job_name: jmx
    static_configs:
      - targets: ["localhost:9110"]
    relabel_configs:
      - source_labels: ["__address__"]
        regex: "(.*):.*"
        target_label: instance
        replacement: jmx_instance

startDelaySeconds: 0
username:
password:
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
# using the property below we are telling the exporter to export only relevant Java metrics
whitelistObjectNames:
  - "java.lang:*"
  - "jvm:*"

provider: Prometheus
config:
  address: renaissance.akamas.io
  port: 9090

akamas create telemetry-instance prometheus-instance.yaml renaissance

provider: CSV
config:
  protocol: scp
  address: renaissance.akamas.io
  username: ubuntu
  authType: key
  auth: # SSH KEY
  remoteFilePattern: /home/ubuntu/renaissance/renaissance-parsed.csv
  csvFormat: horizontal
  componentColumn: COMPONENT
  timestampColumn: TS
  timestampFormat: yyyy-MM-dd HH:mm:ss
metrics:
  - metric: elapsed
    datasourceMetric: nanos

akamas create telemetry-instance telemetry-csv.yaml renaissance

name: aws
description: Tweaking aws and the JVM to optimize the page-rank application.
system: renaissance
goal:
  objective: minimize
  function:
    formula: benchmark.elapsed * aws.aws_ec2_price
workflow: workflow-aws
parametersSelection:
  - name: aws.aws_ec2_instance_type
    categories: [c5,c5d,c5a,m5,m5d,m5a,r5,r5d,r5a]
  - name: aws.aws_ec2_instance_size
    categories: [large,xlarge,2xlarge,4xlarge]
  - name: jvm.jvm_gcType
  - name: jvm.jvm_newSize
  - name: jvm.jvm_maxHeapSize
  - name: jvm.jvm_minHeapSize
  - name: jvm.jvm_survivorRatio
  - name: jvm.jvm_maxTenuringThreshold
steps:
  - name: baseline
    type: baseline
    numberOfTrials: 2
    values:
      aws.aws_ec2_instance_type: c5
      aws.aws_ec2_instance_size: 2xlarge
      jvm.jvm_gcType: G1
  - name: optimize
    type: optimize
    numberOfExperiments: 60

akamas create study study.yaml
akamas start study aws