The SparkProcessor performs feature ETL using Spark as the compute engine. The following sections describe the deployment modes it supports and the configuration keys each mode accepts.
- Spark 3.3
The SparkProcessor runs the Spark job in client mode, that is, the Spark driver runs on the machine that launches the job.
The configuration dict passed to the SparkProcessor accepts the following keys.
| Key | Required | Default | Type | Description |
|---|---|---|---|---|
| master | Required | (none) | String | The Spark master URL to connect to, for example "local[*]" or "spark://host:7077". See the "Master URLs" section of the Spark documentation for valid URL formats. |
| native.* | Optional | (none) | String | Any key with the "native." prefix is forwarded to the Spark Session config after the prefix is removed. For example, if the processor config has an entry "native.spark.default.parallelism": 2, then the Spark Session config will have an entry "spark.default.parallelism": 2. |
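
The prefix handling described in the table can be illustrated with a small sketch. This is not the SparkProcessor implementation: the helper `split_processor_config` and the constant `NATIVE_PREFIX` are hypothetical names introduced here only to show how a config dict with a `master` key and `native.*` entries would map onto Spark Session settings.

```python
# Hypothetical illustration of the "native." prefix rule; not the library's code.
NATIVE_PREFIX = "native."


def split_processor_config(config):
    """Return the master URL and the entries to forward to the Spark Session."""
    if "master" not in config:
        raise ValueError("the 'master' key is required")
    # Keep only "native."-prefixed keys and strip the prefix from each one.
    spark_conf = {
        key[len(NATIVE_PREFIX):]: value
        for key, value in config.items()
        if key.startswith(NATIVE_PREFIX)
    }
    return config["master"], spark_conf


processor_config = {
    "master": "local[2]",  # run Spark locally with two worker threads
    "native.spark.default.parallelism": 2,
}

master, spark_conf = split_processor_config(processor_config)
print(master)
print(spark_conf)
```

With PySpark, the resulting values would typically be applied through `SparkSession.builder.master(master)` followed by one `.config(key, value)` call per entry in `spark_conf`; whether SparkProcessor does exactly this internally is an assumption.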