Akamas Docs
3.6
3.6
  • Home
  • Getting started
    • Introduction
    • Insights for Kubernetes
    • Free Trial
    • Licensing
    • Deployment
      • Cloud Hosting
    • Security
    • Maintenance & Support (M&S) Services
      • Customer Support Services
      • Support levels for Customer Support Services
      • Support levels for software versions
      • Support levels with Akamas
  • Installing
    • Architecture
    • Docker compose installation
      • Prerequisites
        • Hardware Requirements
        • Software Requirements
        • Network requirements
      • Install Akamas dependencies
      • Install the Akamas Server
        • Online installation mode
          • Online installation behind a Proxy server
        • Offline installation mode
        • Changing UI Ports
        • Setup HTTPS configuration
      • Troubleshoot Docker installation issues
    • Kubernetes installation
      • Prerequisites
        • Cluster Requirements
        • Software Requirements
      • Install Akamas
        • Online Installation
        • Offline Installation - Private registry
      • Installing on OpenShift
      • Accessing Akamas
      • Useful commands
      • Selecting Cluster Nodes
    • Install the CLI
      • Setup the CLI
      • Initialize the CLI
      • Change CLI configuration
      • Use a proxy server
    • Verify the installation
    • Installing the toolbox
    • Install the license
    • Manage anonymous data collection
  • Managing Akamas
    • Akamas logs
    • Audit logs
    • Upgrade Akamas
      • Docker compose
      • Kubernetes
    • Monitor Akamas status
    • Backup & Recover of the Akamas Server
    • Users management
      • Accessing Keycloak admin console
      • Configure an external identity provider
        • Azure Active Directory
        • Google
      • Limit users sessions
        • Local users
        • Identity provider users
    • Collecting support information
  • Using
    • System
    • Telemetry
    • Workflow
    • Study
      • Offline Study
      • Live Study
        • Analyzing results of live optimization studies
      • Windowing
      • Parameters and constraints
  • Optimization Guides
    • Optimize application costs and resource efficiency
      • Kubernetes microservices
        • Optimize cost of a Kubernetes deployment subject to Horizontal Pod Autoscaler
        • Optimize cost of a Kubernetes microservice while preserving SLOs in production
        • Optimize cost of a Java microservice on Kubernetes while preserving SLOs in production
      • Application runtime
        • Optimizing a sample Java OpenJDK application
        • Optimizing cost of a Node.js application with performance tests
        • Optimizing cost of a Golang application with performance tests
        • Optimizing cost of a .NET application with performance tests
      • Applications running on cloud instances
        • Optimizing a sample application running on AWS
      • Spark applications
        • Optimizing a Spark application
    • Optimize application performance and reliability
      • Kubernetes microservices
        • Optimizing cost of a Kubernetes microservice while preserving SLOs in production
        • Optimizing cost of a Java microservice on Kubernetes while preserving SLOs in production
  • Integrating
    • Integrating Telemetry Providers
      • CSV provider
        • Install CSV provider
        • Create CSV telemetry instances
      • Dynatrace provider
        • Install Dynatrace provider
        • Create Dynatrace telemetry instances
          • Import Key Requests
      • Prometheus provider
        • Install Prometheus provider
        • Create Prometheus telemetry instances
        • CloudWatch Exporter
        • OracleDB Exporter
      • Spark History Server provider
        • Install Spark History Server provider
        • Create Spark History Server telemetry instances
      • NeoLoadWeb provider
        • Install NeoLoadWeb telemetry provider
        • Create NeoLoadWeb telemetry instances
      • LoadRunner Professional provider
        • Install LoadRunner Professional provider
        • Create LoadRunner Professional telemetry instances
      • LoadRunner Enterprise provider
        • Install LoadRunner Enterprise provider
        • Create LoadRunner Enterprise telemetry instances
      • AWS provider
        • Install AWS provider
        • Create AWS telemetry instances
    • Integrating Configuration Management
    • Integrating with pipelines
    • Integrating Load Testing
      • Integrating NeoLoad
      • Integrating LoadRunner Professional
      • Integrating LoadRunner Enterprise
  • Reference
    • Glossary
      • System
      • Component
      • Metric
      • Parameter
      • Component Type
      • Workflow
      • Telemetry Provider
      • Telemetry Instance
      • Optimization Pack
      • Goals & Constraints
      • KPI
      • Optimization Study
      • Workspace
      • Safety Policies
    • Construct templates
      • System template
      • Component template
      • Parameter template
      • Metric template
      • Component Types template
      • Telemetry Provider template
      • Telemetry Instance template
      • Workflows template
      • Study template
        • Goal & Constraints
        • Windowing policy
          • Trim windowing
          • Stability windowing
        • Parameter selection
        • Metric selection
        • Workload selection
        • KPIs
        • Steps
          • Baseline step
          • Bootstrap step
          • Preset step
          • Optimize step
        • Parameter rendering
        • Optimizer Options
    • Workflow Operators
      • General operator arguments
      • Executor Operator
      • FileConfigurator Operator
      • LinuxConfigurator Operator
      • WindowsExecutor Operator
      • WindowsFileConfigurator Operator
      • Sleep Operator
      • OracleExecutor Operator
      • OracleConfigurator Operator
      • SparkSSHSubmit Operator
      • SparkSubmit Operator
      • SparkLivy Operator
      • NeoLoadWeb Operator
      • LoadRunner Operator
      • LoadRunnerEnteprise Operator
    • Telemetry metric mapping
      • Dynatrace metrics mapping
      • Prometheus metrics mapping
      • NeoLoadWeb metrics mapping
      • Spark History Server metrics mapping
      • LoadRunner metrics mapping
    • Optimization Packs
      • Linux optimization pack
        • Amazon Linux
        • Amazon Linux 2
        • Amazon Linux 2022
        • CentOS 7
        • CentOS 8
        • RHEL 7
        • RHEL 8
        • Ubuntu 16.04
        • Ubuntu 18.04
        • Ubuntu 20.04
      • DotNet optimization pack
        • DotNet Core 3.1
      • Java OpenJDK optimization pack
        • Java OpenJDK 8
        • Java OpenJDK 11
        • Java OpenJDK 17
      • OpenJ9 optimization pack
        • IBM J9 VM 6
        • IBM J9 VM 8
        • Eclipse Open J9 11
      • Node JS optimization pack
        • Node JS 18
      • GO optimization pack
        • GO 1
      • Web Application optimization pack
        • Web Application
      • Docker optimization pack
        • Container
      • Kubernetes optimization pack
        • Horizontal Pod Autoscaler v1
        • Horizontal Pod Autoscaler v2
        • Kubernetes Pod
        • Kubernetes Container
        • Kubernetes Workload
        • Kubernetes Namespace
        • Kubernetes Cluster
      • WebSphere optimization pack
        • WebSphere 8.5
        • WebSphere Liberty ND
      • AWS optimization pack
        • EC2
        • Lambda
      • PostgreSQL optimization pack
        • PostgreSQL 11
        • PostgreSQL 12
      • Cassandra optimization pack
        • Cassandra
      • MySQL Database optimization pack
        • MySQL 8.0
      • Oracle Database optimization pack
        • Oracle Database 12c
        • Oracle Database 18c
        • Oracle Database 19c
        • RDS Oracle Database 11g
        • RDS Oracle Database 12c
      • MongoDB optimization pack
        • MongoDB 4
        • MongoDB 5
      • Elasticsearch optimization pack
        • Elasticsearch 6
      • Spark optimization pack
        • Spark Application 2.2.0
        • Spark Application 2.3.0
        • Spark Application 2.4.0
    • Command Line commands
      • Administration commands
      • User and Workspace management commands
      • Authentication commands
      • Resource management commands
      • Optimizer options commands
    • Release Notes
  • Knowledge Base
    • Performing load testing to support optimization activities
    • Creating custom optimization packs
    • Setting up a Konakart environment for testing Akamas
    • Modeling a sample Java-based e-commerce application (Konakart)
    • Optimizing a web application
    • Optimizing a sample Java OpenJ9 application
    • Optimizing a sample Linux system
    • Optimizing a MongoDB server instance
    • Optimizing a Kubernetes application
    • Leveraging Ansible to automate AWS instance management
    • Guidelines for optimizing AWS EC2 instances
    • Optimizing an Oracle Database server instance
    • Optimizing an Oracle Database for an e-commerce service
    • Guidelines for optimizing Oracle RDS
    • Optimizing a MySQL server database running Sysbench
    • Optimizing a MySQL server database running OLTPBench
    • Optimizing a live full-stack deployment (K8s + JVM)
    • Setup Instana integration
    • Setup Locust telemetry via CSV
    • Setup AppDynamics integration
Powered by GitBook
On this page
  • Create a telemetry instance
  • Configuration options
  • Telemetry instance reference
  • Use cases
  • Collect stage metrics of a Spark Application
  • Best practices

Was this helpful?

Export as PDF
  1. Integrating
  2. Integrating Telemetry Providers
  3. Spark History Server provider

Create Spark History Server telemetry instances

Create a telemetry instance

To create an instance of the Spark History Server provider, build a YAML file (instance.yml in this example) with the definition of the instance:

provider: SparkHistoryServer
config:
  address: spark_master_node
  port: 18080
  importLevel: stage

Then you can create the instance for the system spark-system using the Akamas CLI:

akamas create telemetry-instance instance.yml spark-system

Configuration options

When you create an instance of the Spark History Server provider, you should specify some configuration information to allow the provider to correctly extract and process metrics from the Spark History server.

You can specify configuration information within the config part of the YAML of the instance definition.

Required properties

  • address - hostname of the Spark History Server instance

Telemetry instance reference

The following YAML file describes the definition of a telemetry instance.

provider: SparkHistoryServer  # This is an instance of the Spark History Server provider
config:
  address: spark_master_node # The adress of Spark History Server
  port: 18080   # The port of Spark History Server
  importLevel: job  # The granularity of the imported metrics

The following table reports the reference for the config section within the definition of the Spark History Server provider instance:

Field
Type
Description
Default value
Restriction
Required

address

URL

Spark History Server address

Yes

importLevel

String

Granularity of the imported metrics

job

Allowed values: job, stage, task

No

port

Integer

Spark History Server listening port

18080

No

Use cases

This section reports common use cases addressed by this provider.

Collect stage metrics of a Spark Application

As a first step, you need to create a YAML file (spark_instance.yml) containing the configuration the provider needs to connect to the Spark History Server, plus the filter on the desired level of granularity for the imported metrics:

provider: SparkHistoryServer
config:
  address: spark_master_node
  port: 18080
  importLevel: stage

and then create the telemetry instance using the Akamas CLI:

akamas create telemetry-instance spark_instance.yml
name: spark_workflow
tasks:
  - name: Run Spark application
    operator: SSHSparkSubmit
    arguments:
      component: spark

Best practices

This section reports common best practices you can adopt to ease the use of this telemetry provider.

  • configure metrics granularity: in order to reduce the collection time, configure the importLevel to import metrics with a granularity no finer than the study requires.

  • wait for metrics publication: make sure in the workflow there is a few-minute interval between the end of the Spark application and the execution of the Spark telemetry instance, since the Spark History Server may take some time to complete the publication of the metrics.

Last updated 1 year ago

Was this helpful?

Check for a list of all Spark application metrics available in Akamas

This example shows how to configure a Spark History Server provider in order to collect performance metrics about a Spark application submitted to the cluster using the operator.

Finally, you will need to define for your study a workflow that includes the submission of the Spark application to the cluster, in this case using the :

Spark Application page
Spark SSH Submit
Spark SSH Submit operator