# SparkSubmit Operator

The **SparkSubmit** operator connects to a Spark instance and invokes a local *spark-submit* command to submit a job.

## Operator arguments <a href="#operator-arguments" id="operator-arguments"></a>

| Name              | Type                                 | Value Restrictions                                                                                                                                                                                                                 | Required | Default                                | Description                                                                                  |
| ----------------- | ------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------- | -------------------------------------- | -------------------------------------------------------------------------------------------- |
| `file`            | String                               | It should be a path to a valid Java or Python Spark application file                                                                                                                                                               | Yes      |                                        | Spark application to submit (JAR or Python file)                                             |
| `args`            | List of Strings, Numbers or Booleans |                                                                                                                                                                                                                                    | Yes      |                                        | Additional application arguments                                                             |
| `master`          | String                               | <p>It should be a valid supported Master URL:</p><ul><li>local</li><li>local\[K]</li><li>local\[K,F]</li><li>local\[\*]</li><li>local\[\*,F]</li><li>spark://HOST:PORT</li><li>spark://HOST1:PORT1,HOST2:PORT2</li><li>yarn</li></ul> | Yes      |                                        | The master URL for the Spark cluster                                                         |
| `deployMode`      | `client` `cluster`                   |                                                                                                                                                                                                                                    | No       | `cluster`                              | Whether to launch the driver locally (`client`) or in the cluster (`cluster`)                |
| `className`       | String                               |                                                                                                                                                                                                                                    | No       |                                        | The entry point of the Java application; required for Java applications.                     |
| `name`            | String                               |                                                                                                                                                                                                                                    | No       |                                        | Name of the task. When submitted, the IDs of the study, experiment, and trial are appended.  |
| `jars`            | List of Strings                      | Each item of the list should be a path that matches an existing JAR file                                                                                                                                                           | No       |                                        | A list of JARs to be added to the classpath.                                                 |
| `pyFiles`         | List of Strings                      | Each item of the list should be a path that matches an existing Python file                                                                                                                                                        | No       |                                        | A list of Python files to be added to the PYTHONPATH                                         |
| `files`           | List of Strings                      | Each item of the list should be a path that matches an existing file                                                                                                                                                               | No       |                                        | A list of files to be added to the context of the *spark-submit* command                     |
| `conf`            | Object (key-value pairs)             |                                                                                                                                                                                                                                    | No       |                                        | Mapping containing additional Spark configurations. See Spark documentation.                 |
| `envVars`         | Object (key-value pairs)             |                                                                                                                                                                                                                                    | No       |                                        | Environment variables to set when running the *spark-submit* command                         |
| `sparkSubmitExec` | String                               | It should be a path that matches an existing executable                                                                                                                                                                            | No       | The default for the Spark installation | The path of the *spark-submit* executable command                                            |
| `sparkHome`       | String                               | It should be a path that matches an existing directory                                                                                                                                                                             | No       | The default for the Spark installation | The path of the SPARK\_HOME                                                                  |
| `proxyUser`       | String                               |                                                                                                                                                                                                                                    | No       |                                        | The user to be used to execute Spark applications                                            |
| `verbose`         | Boolean                              |                                                                                                                                                                                                                                    | No       | true                                   | Whether additional debugging output should be displayed                                      |
| `component`       | String                               | It should match the name of an existing Component of the System under test                                                                                                                                                         | Yes      |                                        | The name of the component whose properties can be used as arguments of the operator          |
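To make the relationship between these arguments and the underlying *spark-submit* invocation concrete, the sketch below assembles the command line that such an operator could produce. This is an illustrative assumption, not the operator's actual implementation: the function name and its signature are hypothetical, though the flags it emits (`--master`, `--deploy-mode`, `--class`, `--jars`, `--py-files`, `--files`, `--conf`, `--proxy-user`, `--verbose`) are standard *spark-submit* options.

```python
# Hypothetical sketch: build the spark-submit argv from operator-style
# arguments. Names and structure are illustrative, not the real operator.
import shlex

def build_spark_submit_command(
    file,
    args=(),
    master="local[*]",
    deploy_mode="cluster",   # operator default is `cluster`
    class_name=None,
    name=None,
    jars=(),
    py_files=(),
    files=(),
    conf=None,
    spark_submit_exec="spark-submit",
    proxy_user=None,
    verbose=True,            # operator default is true
):
    """Assemble the argv list for a spark-submit invocation."""
    cmd = [spark_submit_exec, "--master", master, "--deploy-mode", deploy_mode]
    if class_name:
        cmd += ["--class", class_name]
    if name:
        cmd += ["--name", name]
    if jars:
        cmd += ["--jars", ",".join(jars)]
    if py_files:
        cmd += ["--py-files", ",".join(py_files)]
    if files:
        cmd += ["--files", ",".join(files)]
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    if proxy_user:
        cmd += ["--proxy-user", proxy_user]
    if verbose:
        cmd.append("--verbose")
    cmd.append(file)                      # application file comes last...
    cmd += [str(a) for a in args]         # ...followed by its own arguments
    return cmd

cmd = build_spark_submit_command(
    file="app.jar",
    args=[10],
    master="spark://host:7077",
    class_name="org.example.Main",
    conf={"spark.executor.memory": "2g"},
)
print(shlex.join(cmd))
```

Note that list-valued arguments (`jars`, `pyFiles`, `files`) collapse into a single comma-separated flag value, while each `conf` entry becomes its own `--conf key=value` pair.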
