Optimizing a sample application running on AWS
In this example, you will optimize a Spark-based PageRank algorithm running on AWS instances. We’ll be using the PageRank implementation included in Renaissance, an industry-standard Java benchmarking suite developed by Oracle Labs, and tweaking both Java and AWS parameters to improve the performance of our application.
For this example, you’re expected to use two dedicated machines:
- an Akamas instance
- a Linux-based AWS EC2 instance
The Akamas instance needs to provision and manipulate EC2 instances, so it must be enabled to do so by setting AWS policies and by integrating with an orchestration tool (such as Ansible) and an inventory linked to your AWS EC2 environment.
The Linux-based instance will run the application benchmark, so it requires the latest OpenJDK 11 release:
sudo apt install openjdk-11-jre
For this study you need the Prometheus and CSV telemetry providers.
Since the application consists of a single jar file, the setup is rather straightforward; just download the binary into the ~/renaissance/ folder:
mkdir ~/renaissance
cd ~/renaissance
wget -O renaissance.jar https://github.com/renaissance-benchmarks/renaissance/releases/download/v0.10.0/renaissance-gpl-0.10.0.jar
In the same folder, upload the template file launch_benchmark.sh.templ, containing the script that executes the benchmark using the provided parameters and parses the results:
#!/bin/bash
# ${jvm.*} is a template token, replaced with the JVM options chosen for each experiment
java -XX:MaxRAMPercentage=60 ${jvm.*} -jar renaissance.jar -r 50 --csv renaissance.csv page-rank

# Sum the duration (column 2) of all repetitions, skipping the header line
total_time=$(awk -F',' 'NR > 1 { total += $2 } END { printf "%.0f\n", total }' ./renaissance.csv)
first_line=$(head -n 1 renaissance.csv)
end_time=$(tail -n 1 renaissance.csv | cut -d',' -f3)
start_time=$(sed '2q;d' renaissance.csv | cut -d',' -f4)
# Extend the header with the timestamp and component columns
echo "${first_line},TS,COMPONENT" > renaissance-parsed.csv
# Divide by 1000 to pass seconds to date
ts=$(date -d "@$((start_time / 1000))" "+%Y-%m-%d %H:%M:%S")

# Write a single row summarizing the whole run
echo "page-rank,${total_time},${end_time},${start_time},${ts},pagerank" >> renaissance-parsed.csv
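To see what the parsing half of the script produces, the sketch below runs the same kind of steps on a tiny hand-written results file. The sample rows, the temporary directory, and every column name other than `nanos` (the name the CSV telemetry configuration maps to the elapsed metric) are assumptions made for illustration; the real header is whatever Renaissance writes. It assumes GNU `date`, as found on a typical Linux instance.

```shell
#!/bin/bash
# Illustration only: sample data and column names (except "nanos") are hypothetical.
workdir=$(mktemp -d)
cd "$workdir"

# A made-up Renaissance-style CSV: header plus two repetitions.
cat > renaissance.csv <<'EOF'
benchmark,nanos,col3,col4
page-rank,1500000000,111,1600000000000
page-rank,1000000000,222,1600000005000
EOF

# Steps mirroring the launch script.
total_time=$(awk -F',' 'NR > 1 { total += $2 } END { printf "%.0f\n", total }' renaissance.csv)
first_line=$(head -n 1 renaissance.csv)
end_time=$(tail -n 1 renaissance.csv | cut -d',' -f3)
start_time=$(sed '2q;d' renaissance.csv | cut -d',' -f4)
# -u pins the output to UTC so the example is reproducible (the launch script uses local time).
ts=$(date -u -d "@$((start_time / 1000))" "+%Y-%m-%d %H:%M:%S")

echo "${first_line},TS,COMPONENT" > renaissance-parsed.csv
echo "page-rank,${total_time},${end_time},${start_time},${ts},pagerank" >> renaissance-parsed.csv
cat renaissance-parsed.csv
```

With this sample the parsed file holds the extended header and one data row, `page-rank,2500000000,222,1600000000000,2020-09-13 12:26:40,pagerank`: the two durations are summed, while the other fields are taken from the first and last repetition.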
In this section, we will guide you through the steps required to set up the optimization on Akamas.
This example requires the optimization packs that provide the java-openjdk-11 and ec2 component types used by the components below.
Our system could be named renaissance after its application, so you’ll have a system.yaml file like this:
name: renaissance
description: A system for the Renaissance PageRank benchmark
Then create the new system resource:
akamas create system system.yaml
The renaissance system will then have three components:
- A benchmark component
- A Java component
- An EC2 component, i.e. the underlying instance
Java component
Create a component-jvm.yaml file like the following:
name: jvm
description: The JVM running the benchmark
componentType: java-openjdk-11
properties:
  prometheus:
    job: jmx
    instance: jmx_instance
Then type:
akamas create component component-jvm.yaml renaissance
Benchmark component
Since there is no optimization pack associated with this component, you have to create some extra resources.
- A metrics.yaml file for a new metric tracking execution time:
metrics:
  - name: elapsed
    unit: nanoseconds
    description: The duration of the benchmark execution
- A component-type benchmark.yaml:
name: benchmark
description: A component type for the Renaissance Java benchmarking suite
metrics:
  - name: elapsed
parameters: []
- The component pagerank.yaml:
name: pagerank
description: The pagerank application included in Renaissance benchmarks
componentType: benchmark
Create the new resources by running the following commands in your terminal:
akamas create metrics metrics.yaml
akamas create component-type benchmark.yaml
akamas create component pagerank.yaml renaissance
EC2 component
Create a component-ec2.yaml file like the following:
name: instance
description: The ec2 instance the benchmark runs on
componentType: ec2
properties:
  hostname: renaissance.akamas.io
  sshPort: 22
  instance: ec2_instance
  username: ubuntu
  key: # SSH KEY
  ec2:
    region: us-east-2 # This is just a reference
Then create its resource by typing in your terminal:
akamas create component component-ec2.yaml renaissance
The workflow in this example is composed of three main steps:
1. Update the instance type
2. Run the application benchmark
3. Stop the instance
To manage the instance we are going to integrate a very simple Ansible playbook in our workflow: the FileConfigurator operator will replace the parameters in the template file in order to generate the playbook run by the Executor operator, as explained in the Ansible page.
In detail:
1. Update the instance type
   1. Generate the playbook file from the template
   2. Update the instance using the playbook
   3. Wait for the instance to be available
2. Run the application benchmark
   1. Configure the benchmark Java launch script
   2. Execute the launch script
   3. Parse PageRank output to make it consumable by the CSV telemetry instance
3. Stop the instance
   1. Configure the playbook to stop an instance with a specific instance id
   2. Run the playbook to stop the instance
The following is the template of the Ansible playbook:
# Change instance type, requires AWS CLI

- name: Resize the instance
  hosts: localhost
  gather_facts: no
  connection: local
  tasks:
    - name: save instance info
      ec2_instance_info:
        filters:
          "tag:Name": <your-instance-name>
      register: ec2
    - name: Stop the instance
      ec2:
        region: <your-aws-region>
        state: stopped
        instance_ids:
          - "{{ ec2.instances[0].instance_id }}"
        instance_type: "{{ ec2.instances[0].instance_type }}"
        wait: True
    - name: Change the instances ec2 type
      shell: >
        aws ec2 modify-instance-attribute --instance-id "{{ ec2.instances[0].instance_id }}"
        --instance-type "${ec2.aws_ec2_instance_type}.${ec2.aws_ec2_instance_size}"
      delegate_to: localhost
    - name: restart the instance
      ec2:
        region: <your-aws-region>
        state: running
        instance_ids:
          - "{{ ec2.instances[0].instance_id }}"
        wait: True
      register: ec2
    - name: wait for SSH to come up
      wait_for:
        host: "{{ item.public_dns_name }}"
        port: 22
        delay: 60
        timeout: 320
        state: started
      with_items: "{{ ec2.instances }}"
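The `${ec2.aws_ec2_instance_type}` and `${ec2.aws_ec2_instance_size}` tokens in the template are the parameters that the FileConfigurator operator replaces before each experiment. The sketch below imitates that substitution with `sed` on the single `--instance-type` line, only to show what the rendered playbook ends up containing. It is not how Akamas implements the operator, `render_line` is a hypothetical helper, and `c5` and `2xlarge` are simply the baseline values used in the study.

```shell
#!/bin/bash
# Illustration only: imitates token replacement with sed; not the Akamas implementation.
render_line() {
  local type="$1" size="$2"
  # Single quotes keep the ${...} tokens literal so the shell does not expand them.
  echo '--instance-type "${ec2.aws_ec2_instance_type}.${ec2.aws_ec2_instance_size}"' |
    sed -e "s/\${ec2\.aws_ec2_instance_type}/${type}/" \
        -e "s/\${ec2\.aws_ec2_instance_size}/${size}/"
}

render_line c5 2xlarge
```

The rendered line reads `--instance-type "c5.2xlarge"`, which is the form `aws ec2 modify-instance-attribute` receives when the playbook runs.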
The following is the workflow configuration file:
name: Pagerank AWS optimization
tasks:

  # Creating the EC2 instance
  - name: Configure provisioning
    operator: FileConfigurator
    arguments:
      sourcePath: /home/ubuntu/ansible/resize.yaml.templ
      targetPath: /home/ubuntu/ansible/resize.yaml
      host:
        hostname: bastion.akamas.io
        username: ubuntu
        key: # SSH KEY

  - name: Execute Provisioning
    operator: Executor
    arguments:
      command: ansible-playbook /home/akamas/ansible/resize.yaml
      host:
        hostname: bastion.akamas.io
        username: akamas
        key: # SSH KEY

  # Waiting for the instance to come up and set up its DNS
  - name: Pause
    operator: Sleep
    arguments:
      seconds: 120

  # Running the benchmark
  - name: Configure Benchmark
    operator: FileConfigurator
    arguments:
      source:
        hostname: renaissance.akamas.io
        username: ubuntu
        path: /home/ubuntu/renaissance/launch_benchmark.sh.templ
        key: # SSH KEY
      target:
        hostname: renaissance.akamas.io
        username: ubuntu
        path: /home/ubuntu/renaissance/launch_benchmark.sh
        key: # SSH KEY

  - name: Launch Benchmark
    operator: Executor
    arguments:
      command: bash /home/ubuntu/renaissance/launch_benchmark.sh
      host:
        hostname: renaissance.akamas.io
        username: ubuntu
        key: # SSH KEY
Save this configuration to a file (workflow.yaml in this example), then create the workflow resource by typing in your terminal:
akamas create workflow workflow.yaml
If you have not installed the Prometheus telemetry provider or the CSV telemetry provider yet, take a look at the telemetry provider pages Prometheus provider and CSV Provider to proceed with the installation.
Prometheus
Prometheus allows us to gather JVM execution metrics through the JMX exporter: download the Java agent required to gather metrics from here, then update the following two files:
- The prometheus.yml file, located in your Prometheus folder:
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']

  - job_name: jmx
    static_configs:
      - targets: ["localhost:9110"]
    relabel_configs:
      - source_labels: ["__address__"]
        regex: "(.*):.*"
        target_label: instance
        replacement: jmx_instance
- The config.yml file, which you have to create in the ~/renaissance folder:
startDelaySeconds: 0
username:
password:
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
# using the property below we are telling the exporter to export only relevant java metrics
whitelistObjectNames:
  - "java.lang:*"
  - "jvm:*"
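For the `jmx` scrape job to find anything on port 9110, the JVM running the benchmark has to load the JMX exporter agent with this `config.yml`. The launch script shown earlier does not include this step, so the following is only a sketch under assumptions: the agent jar name is a placeholder for whichever release you downloaded, `jmx_agent_opt` is a hypothetical helper, and the option is built as a string so you can inspect it before adding it to the `java` command line in your template.

```shell
#!/bin/bash
# Sketch: builds the -javaagent option for the Prometheus JMX exporter.
# AGENT_JAR is a placeholder name; use the file you actually downloaded.
AGENT_JAR="${AGENT_JAR:-jmx_prometheus_javaagent.jar}"
JMX_PORT=9110                      # must match the jmx target in prometheus.yml
BENCH_DIR="${BENCH_DIR:-$HOME/renaissance}"

jmx_agent_opt() {
  # Format expected by the exporter: -javaagent:<jar>=<port>:<config file>
  echo "-javaagent:${BENCH_DIR}/${AGENT_JAR}=${JMX_PORT}:${BENCH_DIR}/config.yml"
}

# Print the resulting command line instead of running it.
echo "java $(jmx_agent_opt) -XX:MaxRAMPercentage=60 -jar renaissance.jar -r 50 --csv renaissance.csv page-rank"
```

Keeping the port in one variable makes it harder for the agent and the Prometheus target to drift apart.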
Now you can create a prometheus-instance.yaml file:
provider: Prometheus
config:
  address: renaissance.akamas.io
  port: 9090
Then you can install the telemetry instance:
akamas create telemetry-instance prometheus-instance.yaml renaissance
CSV - Telemetry instance
Create a telemetry-csv.yaml file to read the benchmark output:
provider: CSV
config:
  protocol: scp
  address: renaissance.akamas.io
  username: ubuntu
  authType: key
  auth: # SSH KEY
  remoteFilePattern: /home/ubuntu/renaissance/renaissance-parsed.csv
  csvFormat: horizontal
  componentColumn: COMPONENT
  timestampColumn: TS
  timestampFormat: yyyy-MM-dd HH:mm:ss

metrics:
  - metric: elapsed
    datasourceMetric: nanos
Then create the resource by typing in your terminal:
akamas create telemetry-instance telemetry-csv.yaml renaissance
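Because the CSV telemetry instance depends on the `nanos`, `TS` and `COMPONENT` columns and on the `yyyy-MM-dd HH:mm:ss` timestamp format, it can help to check the parsed file before starting a study. The function below is a hypothetical pre-flight check written for this example; it is not part of Akamas, and the sample row is made up.

```shell
#!/bin/bash
# Sketch of a pre-flight check; not part of Akamas. Column names come from telemetry-csv.yaml.
check_parsed_csv() {
  local file="$1" header col
  header=$(head -n 1 "$file")
  for col in nanos TS COMPONENT; do
    # Wrap the header in commas so only whole column names match.
    case ",${header}," in
      *",${col},"*) ;;
      *) echo "missing column: ${col}"; return 1 ;;
    esac
  done
  # Every data row must carry a timestamp shaped like yyyy-MM-dd HH:mm:ss.
  if tail -n +2 "$file" | grep -Eqv '(^|,)[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}(,|$)'; then
    echo "row without a valid timestamp"; return 1
  fi
  echo "ok"
}

# Usage with a made-up parsed file.
sample=$(mktemp)
printf '%s\n' 'benchmark,nanos,col3,col4,TS,COMPONENT' \
  'page-rank,2500000000,222,1600000000000,2020-09-13 12:26:40,pagerank' > "$sample"
check_parsed_csv "$sample"
```

The check only looks at the shape of the file; it does not verify that the values themselves are plausible.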
Here we provide a reference study for AWS.
As anticipated, the goal of this study is to optimize a sample Java application: the PageRank benchmark included in the Renaissance benchmark suite by Oracle.
Our goal is rather simple: minimizing the product of the benchmark execution time and the instance price, that is, finding the most cost-effective instance for our application.
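As a worked illustration of that objective, the sketch below scores two hypothetical configurations by multiplying elapsed time by price. The timings and prices are invented for the example; they are neither measurements nor real AWS prices, and `score` is a hypothetical helper. The point is only that a slower run on a cheaper instance can still win.

```shell
#!/bin/bash
# Illustration only: all numbers below are made up.
score() {
  local elapsed_ns="$1" price="$2"
  # The study goal: elapsed time multiplied by instance price (lower is better).
  awk -v e="$elapsed_ns" -v p="$price" 'BEGIN { printf "%.0f\n", e * p }'
}

fast_big=$(score 40000000000 0.40)    # 40 s on a pricier instance
slow_small=$(score 60000000000 0.20)  # 60 s on a cheaper instance

echo "fast_big=${fast_big} slow_small=${slow_small}"
if [ "$slow_small" -lt "$fast_big" ]; then
  echo "the cheaper instance is more cost-effective here"
fi
```

The resulting number is a relative score rather than a bill: it only makes sense when comparing configurations against each other.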
Create a study.yaml file with the following content:
name: aws
description: Tweaking aws and the JVM to optimize the page-rank application.
system: renaissance

goal:
  objective: minimize
  function:
    formula: benchmark.elapsed * aws.aws_ec2_price

workflow: workflow-aws

parametersSelection:
  - name: aws.aws_ec2_instance_type
    categories: [c5,c5d,c5a,m5,m5d,m5a,r5,r5d,r5a]
  - name: aws.aws_ec2_instance_size
    categories: [large,xlarge,2xlarge,4xlarge]
  - name: jvm.jvm_gcType
  - name: jvm.jvm_newSize
  - name: jvm.jvm_maxHeapSize
  - name: jvm.jvm_minHeapSize
  - name: jvm.jvm_survivorRatio
  - name: jvm.jvm_maxTenuringThreshold

steps:
  - name: baseline
    type: baseline
    numberOfTrials: 2
    values:
      aws.aws_ec2_instance_type: c5
      aws.aws_ec2_instance_size: 2xlarge
      jvm.jvm_gcType: G1
  - name: optimize
    type: optimize
    numberOfExperiments: 60
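The two categorical AWS parameters in `parametersSelection` define the instance search space. The loop below is a plain enumeration, independent of Akamas, that lists the family and size combinations in the same `type.size` form the playbook passes to `modify-instance-attribute`. Whether every combination is offered in your region is something to check against AWS; `list_candidates` is a hypothetical helper.

```shell
#!/bin/bash
# Enumerates the instance types implied by the two categorical parameters in study.yaml.
types="c5 c5d c5a m5 m5d m5a r5 r5d r5a"
sizes="large xlarge 2xlarge 4xlarge"

list_candidates() {
  local t s
  for t in $types; do
    for s in $sizes; do
      echo "${t}.${s}"
    done
  done
}

list_candidates
echo "candidates: $(list_candidates | wc -l | tr -d ' ')"
```

Nine families times four sizes gives 36 candidate instances, and each of them is further combined with the JVM parameters selected in the study.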
Then create the corresponding Akamas resource and start the study:
akamas create study study.yaml
akamas start study aws