Optimizing Spark
When optimizing applications running on the Apache Spark framework, the goal is to identify the configuration that makes the best use of the allocated resources or minimizes the execution time.
Please refer to the reference page of this optimization pack for the list of component types, parameters, metrics, and constraints.

Akamas offers several operators that you can use to apply the parameters to the tuned Spark application. In particular, we suggest using the dedicated Spark submit operator, which connects to a target instance and submits the application with the configuration parameters under test.
Other solutions include:
- an operator that submits the application along with the configuration parameters
- the standard executor operator, which runs a custom command or script after Akamas has updated the default Spark configuration file, or a custom one, through a template
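For the file-based approach, the configuration template could look like the following minimal sketch; the placeholder syntax and the parameter names are illustrative and depend on how the study defines its parameters:

```
# spark-defaults.conf.template -- illustrative sketch.
# Akamas renders the ${...} placeholders with the parameter values
# chosen for each experiment (parameter names below are assumptions).
spark.executor.memory     ${spark.executor_memory}
spark.executor.cores      ${spark.executor_cores}
spark.executor.instances  ${spark.executor_instances}
spark.driver.memory       ${spark.driver_memory}
```

The custom command can then point `spark-submit` at the rendered file, for example with `spark-submit --properties-file /tmp/spark-defaults.conf ...`.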
You can organize a typical workflow to optimize a Spark application in three parts:

1. Set up the test environment:
   - prepare any required input data
   - apply the Spark configuration parameters, if you opt for a file-based solution
2. Execute the Spark application.
3. Perform cleanup.
Here’s an example of a typical workflow where Akamas executes the Spark application using the suggested submit operator:
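As a sketch of what such a workflow could look like (the operator names, argument fields, paths, and hosts below are assumptions; check the operator reference for the actual schema):

```yaml
name: spark-workflow                     # illustrative workflow definition
tasks:
  # 1. Setup: prepare any required input data
  - name: Prepare input data
    operator: Executor                   # hypothetical task running a shell command
    arguments:
      command: "hdfs dfs -put -f /data/dataset.csv /tmp/dataset.csv"

  # 2. Execute the Spark application with the configuration under test
  - name: Run Spark application
    operator: SparkSubmit                # assumption: the suggested submit operator
    arguments:
      host: spark.mycompany.com          # hypothetical target instance
      file: /opt/jobs/myapp.jar

  # 3. Cleanup
  - name: Cleanup
    operator: Executor
    arguments:
      command: "hdfs dfs -rm -r -f /tmp/dataset.csv"
```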
Here’s a configuration example for a telemetry provider instance:
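A minimal instance definition could look like the following sketch; the provider name and fields are assumptions, so refer to the telemetry provider reference for the exact schema:

```yaml
provider: SparkHistoryServer             # assumption: provider reading History Server statistics
config:
  address: sparkhistory.mycompany.com    # hypothetical History Server host
  port: 18080                            # default Spark History Server port
```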
Akamas can access the Spark application statistics using the Spark History Server provider. This provider maps the metrics in this optimization pack to the statistics exposed by the History Server endpoint.

Refer to the examples section for a study leveraging the Spark pack.
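As an illustration, such a study might combine a goal on execution time with parameters from this pack; every name below is an assumption, not the pack's actual naming:

```yaml
name: tune-spark-app                     # illustrative study sketch
goal:
  objective: minimize
  function:
    formula: spark_app.duration          # hypothetical metric name from the pack
parametersSelection:
  - name: spark_app.executor_memory      # hypothetical parameter names
  - name: spark_app.executor_cores
  - name: spark_app.driver_memory
```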