Akamas Docs 3.5
Optimizing a Spark application

In this example study we’ll tune the parameters of SparkPi, one of the example applications provided by most of the Apache Spark distributions, to minimize its execution time. Application monitoring is provided by the Spark History Server APIs.

Environment setup

The test environment includes the following instances:

  • Akamas: instance running Akamas

  • Spark cluster: composed of instances with 16 vCPUs and 64 GB of memory, where the Spark binaries are installed under /usr/lib/spark. In particular, the roles are:

    • 1x master instance: the Spark node running the resource manager and Spark History Server (host: sparkmaster.akamas.io)

    • 2x worker instances: the other instances in the cluster

Telemetry Infrastructure setup

To gather metrics about the application we will leverage the Spark History Server. If it is not already running, start it on the master instance with the following command:

/usr/lib/spark/sbin/start-history-server.sh

Application and Test tools

To make sure the tested application is available on your cluster and runs correctly, execute the following commands:

file /usr/lib/spark/examples/jars/spark-examples.jar
spark-submit \
  --master yarn --deploy-mode client \
  --class 'org.apache.spark.examples.SparkPi' \
  /usr/lib/spark/examples/jars/spark-examples.jar 100

Optimization setup

In this section, we will guide you through the steps required to set up the optimization of the Spark application execution in Akamas.

System

System spark

Here’s the definition of the system we will use to group our components and telemetry instances for this example:

name: spark
description: A system to tune the Spark Pi example application

To create the system run the following command:

akamas create system system.yaml

Component sparkPi

In the snippet shown below, we specify:

  • the connection properties required by Akamas to connect via SSH to the cluster master instance

  • the parameters required by spark-submit to execute the application

  • the sparkApplication flag required by the telemetry instance to associate the metrics from the History Server with this component

name: sparkPi
description: The SparkPi example application
componentType: Spark Application 2.3.0

properties:
  hostname: sparkmaster.akamas.io
  username: hadoop
  key: ssh_key

  master: yarn
  deployMode: client
  className: org.apache.spark.examples.SparkPi
  file: /usr/lib/spark/examples/jars/spark-examples.jar
  args: [ 1000 ]

  sparkApplication: 'true'

To create the component in the system run the following command:

akamas create component sparkPi.yaml spark
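For reference, the submission properties above (master, deployMode, className, file, args) map to a spark-submit invocation equivalent to the manual run shown earlier. This is a sketch: the actual command is rendered by the workflow operator at runtime.

```shell
# Sketch of the spark-submit command corresponding to the component
# properties above; note the args value of 1000 from the component definition.
spark-submit \
  --master yarn --deploy-mode client \
  --class 'org.apache.spark.examples.SparkPi' \
  /usr/lib/spark/examples/jars/spark-examples.jar 1000
```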

Workflow

The workflow used for this study contains a single task, in which the operator submits the application to the cluster along with the Spark parameters under test.

Here’s the definition of the workflow:

name: Run SparkPi
tasks:
- name: run application
  operator: SparkSSHSubmit
  arguments:
    component: sparkPi
    retries: 0

To create the workflow run the following command:

akamas create workflow workflow.yaml

Telemetry

Here’s the definition of the telemetry instance, specifying the History Server endpoint:

provider: SparkHistoryServer
config:
  address: sparkmaster.akamas.io
  port: 18080

  importLevel: job

To create the telemetry instance in the system run the following command:

akamas create telemetry-instance telemetry.yaml spark

This telemetry instance will be able to bind the fetched metrics to the sparkPi component thanks to the sparkApplication property we previously added to its definition.
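You can also check what the History Server exposes by querying its REST API directly (Spark publishes monitoring data under the /api/v1 path); the host and port below are the ones configured in the telemetry instance:

```shell
# List the applications indexed by the Spark History Server: these are the
# runs the telemetry provider imports metrics for.
curl -sf http://sparkmaster.akamas.io:18080/api/v1/applications
```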

Study

The goal of this study is to find a Spark configuration that minimizes the execution time for the example application.

To achieve this goal we’ll operate on the number of executor processes available to run the application, along with the memory and CPUs allocated to the driver and the executors. The domains are configured so that no single driver or executor process exceeds the size of the underlying instance. The constraints ensure that the application as a whole does not request more resources than the cluster provides, taking into account that some resources must be reserved for other services such as the cluster manager.

Note that this study uses two constraints on the total amount of resources used by the Spark application. This example refers to a cluster of three nodes with 16 cores and 64 GB of memory each, where at least one core per instance should be reserved for the system.
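The caps used in the two constraints can be derived from this cluster sizing; here is a minimal sketch of the math, assuming one core and 4 GB of memory per node are set aside for system services:

```shell
# Derive the resource caps for the study constraints from the cluster sizing
# described above: 3 nodes, 16 vCPUs and 64 GB of memory each.
NODES=3
CORES_PER_NODE=16
MEM_GB_PER_NODE=64
RESERVED_CORES=1     # reserved per node for the system (assumption: 1 core)
RESERVED_MEM_GB=4    # reserved per node for the system (assumption: 4 GB)

CORE_CAP=$(( NODES * (CORES_PER_NODE - RESERVED_CORES) ))
MEM_CAP_GB=$(( NODES * (MEM_GB_PER_NODE - RESERVED_MEM_GB) ))

echo "core cap: ${CORE_CAP}"       # 15*3 = 45, as in cap_total_allocated_cpus
echo "memory cap: ${MEM_CAP_GB}"   # 60*3 = 180, as in cap_total_allocated_memory
```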

Here’s the definition of the study:

name: Speedup SparkPi execution
system: spark
workflow: Run SparkPi

goal:
  objective: minimize
  function:
    formula: sparkPi.spark_application_duration

parametersSelection:
- name: sparkPi.driverCores
  domain: [1, 10]
- name: sparkPi.driverMemory
  domain: [32, 2048]
- name: sparkPi.executorCores
  domain: [1, 15]
- name: sparkPi.executorMemory
  domain: [32, 2048]
- name: sparkPi.numExecutors
  domain: [1, 45]

parameterConstraints:
- name: cap_total_allocated_cpus
  formula: (sparkPi.driverCores + sparkPi.executorCores*sparkPi.numExecutors) <= 15*3

- name: cap_total_allocated_memory
  formula: (sparkPi.driverMemory + sparkPi.executorMemory*sparkPi.numExecutors) <= 60*3

steps:
- name: baseline
  type: baseline

- name: tune
  type: optimize
  numberOfExperiments: 200
  maxFailedExperiments: 200

To create and run the study execute the following commands:

akamas create study study.yaml
akamas start study 'Speedup SparkPi execution'

Last updated 10 months ago


We’ll use a component of type Spark Application 2.3.0 to represent the application running on the Apache Spark 2.3 framework.

If you have not installed the Spark History Server telemetry provider yet, take a look at the Spark History Server provider page to proceed with the installation.