The Elasticsearch connector can be used in both streaming and batch scenarios. It writes to Elasticsearch with at-least-once semantics and supports flexible construction of write requests.
- Supports Elasticsearch 7.x
<dependency>
  <groupId>com.bytedance.bitsail</groupId>
  <artifactId>connector-elasticsearch</artifactId>
  <version>${revision}</version>
</dependency>
Basic data types supported by Elasticsearch connectors:
- String type:
  - string
  - text
  - keyword
- Integer type:
  - long
  - integer
  - short
  - byte
- Float type:
  - double
  - float
  - half_float
  - scaled_float
- Bool type:
  - boolean
- Binary type:
  - binary
- Date type:
  - date
Users can add the following parameters to the job.writer block of the task configuration file.
Param name | Default value | Optional value | Description |
---|---|---|---|
class | - | | Class name of Elasticsearch connector, com.bytedance.bitsail.connector.elasticsearch.sink.ElasticsearchSink |
es_hosts | - | | Address list for Elasticsearch handling REST requests |
es_index | - | | Elasticsearch index |
columns | - | | Describing fields' names and types |
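As a rough illustration of how these four parameters fit together, the sketch below builds a job.writer block as a Python dict and prints it as JSON. The host URL, the index name, and the shape of each `columns` entry (a `name`/`type` pair) are assumptions made for the example, as is the idea that all four parameters must be present; the `missing_required` helper is hypothetical and not part of the connector.

```python
import json

# Hypothetical values: the host URL, index name and column layout are
# illustrative assumptions, not taken from the connector documentation.
writer = {
    "class": "com.bytedance.bitsail.connector.elasticsearch.sink.ElasticsearchSink",
    "es_hosts": ["http://localhost:9200"],
    "es_index": "example_index",
    "columns": [
        {"name": "id", "type": "integer"},
        {"name": "title", "type": "text"},
        {"name": "created_at", "type": "date"},
    ],
}

# Assumption: the four parameters from the table must all be set.
REQUIRED = ("class", "es_hosts", "es_index", "columns")


def missing_required(writer_conf):
    """Return the names from REQUIRED that are absent from a writer block."""
    return [key for key in REQUIRED if key not in writer_conf]


job_conf = {"job": {"writer": writer}}
print(json.dumps(job_conf, indent=2))
```

Calling `missing_required(writer)` before submitting a job is one cheap way to catch an incomplete writer block early.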
Param name | Default value | Optional value | Description |
---|---|---|---|
writer_parallelism_num | | | Writer parallelism |
Param name | Default value | Optional value | Description |
---|---|---|---|
request_path_prefix | - | | Path prefix used by the HTTP client when making a request |
connection_request_timeout_ms | 10000 | | Timeout (ms) used by the HTTP connection manager when requesting a connection |
connection_timeout_ms | 10000 | | HTTP connection establishment timeout (ms) |
socket_timeout_ms | 60000 | | Socket timeout (ms) for the HTTP connection |
Param name | Default value | Optional value | Description |
---|---|---|---|
bulk_flush_max_actions | 300 | | Execute a bulk operation when the number of buffered requests reaches this value |
bulk_flush_max_size_mb | 10 | | Execute a bulk operation when the buffered request data reaches this size (in MB) |
bulk_flush_interval_ms | 10000 | | Interval (ms) between bulk operations |
bulk_backoff_policy | EXPONENTIAL | CONSTANT, EXPONENTIAL, NONE | Backoff policy when a bulk operation fails: 1. CONSTANT: fixed-delay backoff; 2. EXPONENTIAL: exponential backoff; 3. NONE: no backoff |
bulk_backoff_delay_ms | 100 | | Delay (ms) before retrying a failed bulk operation |
bulk_backoff_max_retry_count | 5 | | Maximum number of retries for a failed bulk operation |
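To make the interaction of the bulk parameters concrete, here is a minimal sketch of the two decisions they control: when to flush, and how long to wait between retries. It is a simplified model and not the connector's implementation. The assumptions are that a flush fires as soon as any one threshold is reached, that 1 MB means 1024 * 1024 bytes, and that EXPONENTIAL doubles the delay on each retry; the underlying Elasticsearch client may compute its delays differently. Both function names are hypothetical.

```python
def should_flush(pending_actions, pending_bytes, ms_since_last_flush,
                 max_actions=300, max_size_mb=10, interval_ms=10000):
    """Return True when any of the three bulk thresholds is reached.

    Defaults mirror bulk_flush_max_actions, bulk_flush_max_size_mb and
    bulk_flush_interval_ms from the table above.
    """
    return (pending_actions >= max_actions
            or pending_bytes >= max_size_mb * 1024 * 1024  # assumed MB size
            or ms_since_last_flush >= interval_ms)


def backoff_delays(policy="EXPONENTIAL", delay_ms=100, max_retry_count=5):
    """List the wait (ms) before each retry under a simplified model."""
    if policy == "NONE":
        return []  # no retries at all
    if policy == "CONSTANT":
        return [delay_ms] * max_retry_count
    if policy == "EXPONENTIAL":
        # Assumption: the delay doubles on every retry.
        return [delay_ms * 2 ** attempt for attempt in range(max_retry_count)]
    raise ValueError(f"unknown bulk_backoff_policy: {policy}")


print(should_flush(120, 2 * 1024 * 1024, 10500))
print(backoff_delays("CONSTANT"))
print(backoff_delays("EXPONENTIAL"))
```

Under this model the default settings retry a failed bulk request up to five times, so raising bulk_backoff_max_retry_count with EXPONENTIAL grows the worst-case wait much faster than it does with CONSTANT.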
Param name | Default value | Optional value | Description |
---|---|---|---|
es_operation_type | "index" | "index", "create", "update", "upsert", "delete" | Type of ActionRequest |
es_dynamic_index_field | - | | Field from which the index name for each record is read |
es_operation_type_field | - | | Field from which the ActionRequest type for each record is read |
es_version_field | - | | Field from which the version of each record is read |
es_id_fields | "" | | Fields from which the document ID is built. Comma-separated string, e.g. "1,2" |
doc_exclude_fields | "" | | Fields to ignore when creating a document. Comma-separated string, e.g. "1,2" |
ignore_blank_value | false | | Whether to ignore fields with null values when creating documents |
flatten_map | false | | Whether to expand Map-type data into the document when creating the document |
id_delimiter | # | | Separator used when merging multiple fields into one document ID |
json_serializer_features | - | | JSON features used when building JSON strings. Comma-separated string, e.g. "QuoteFieldNames,UseSingleQuotes" |
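The document-building options above can be pictured with a small sketch. It models one plausible reading of es_id_fields, id_delimiter, doc_exclude_fields, ignore_blank_value and flatten_map, and is not the connector's actual code. It assumes that the comma-separated entries are field names, that "blank" means a null value, and that flattening lifts the keys of a map to the top level of the document. The function `build_document` and the sample row are hypothetical.

```python
def build_document(row, id_fields="", id_delimiter="#", exclude_fields="",
                   ignore_blank_value=False, flatten_map=False):
    """Return (doc_id, source) for one row under a simplified model."""
    id_names = [name for name in id_fields.split(",") if name]
    excluded = {name for name in exclude_fields.split(",") if name}

    # Join the ID field values with the delimiter; None means no explicit ID.
    doc_id = (id_delimiter.join(str(row[name]) for name in id_names)
              if id_names else None)

    source = {}

    def put(key, value):
        # Skip null values only when ignore_blank_value is enabled.
        if ignore_blank_value and value is None:
            return
        source[key] = value

    for key, value in row.items():
        if key in excluded:
            continue
        if flatten_map and isinstance(value, dict):
            # Lift the nested keys to the top level of the document.
            for inner_key, inner_value in value.items():
                put(inner_key, inner_value)
        else:
            put(key, value)
    return doc_id, source


row = {"tenant": "a", "id": 7, "name": None,
       "attrs": {"color": "red"}, "tmp": 1}
print(build_document(row, id_fields="tenant,id", exclude_fields="tmp",
                     ignore_blank_value=True, flatten_map=True))
```

With the default settings the row passes through unchanged and no explicit document ID is produced.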
Configuration examples: Elasticsearch connector example