[fix][doc] Add missing files linked in io-connectors #17732

Merged 7 commits on Sep 26, 2022
@@ -0,0 +1,22 @@
---
id: io-aerospike
title: Aerospike Sink Connector
sidebar_label: "Aerospike Sink Connector"
original_id: io-aerospike
---

The Aerospike Sink connector is used to write messages to an Aerospike Cluster.

## Sink Configuration Options

The following configuration options are specific to the Aerospike Connector:

| Name | Required | Default | Description |
|------|----------|---------|-------------|
| `seedHosts` | `true` | `null` | Comma-separated list of one or more Aerospike cluster hosts. Each host is a valid IP address or hostname, optionally followed by a port number (default is 3000). |
| `keyspace` | `true` | `null` | Aerospike namespace to use. |
| `keySet` | `false` | `null` | Aerospike set name to use. |
| `columnName` | `true` | `null` | Aerospike bin name to use. |
| `maxConcurrentRequests` | `false` | `100` | Maximum number of concurrent Aerospike transactions that a Sink can open. |
| `timeoutMs` | `false` | `100` | A single timeout value controls `socketTimeout` and `totalTimeout` for Aerospike transactions. |
| `retries` | `false` | `1` | Maximum number of retries before aborting a write transaction to Aerospike. |
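
To make the `seedHosts` format concrete, here is a minimal Python sketch of how such a value could be split into host/port pairs. It assumes a colon separates host and port, which the table does not state, and it does not handle IPv6 literals. `parse_seed_hosts` is a hypothetical helper for illustration, not part of the connector.

```python
DEFAULT_AEROSPIKE_PORT = 3000  # default port stated in the table above


def parse_seed_hosts(seed_hosts):
    """Split a comma-separated seedHosts string into (host, port) pairs.

    Assumption: host and port are separated by a colon, e.g. "10.0.0.1:3100".
    """
    pairs = []
    for entry in seed_hosts.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, sep, port = entry.partition(":")
        # Fall back to the default port when no port is given.
        pairs.append((host, int(port) if sep else DEFAULT_AEROSPIKE_PORT))
    return pairs


print(parse_seed_hosts("10.0.0.1:3100, aerospike-2"))
```

A host listed without a port, such as `aerospike-2` here, picks up the default of 3000.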
@@ -0,0 +1,23 @@
---
id: io-cassandra
title: Cassandra Sink Connector
sidebar_label: "Cassandra Sink Connector"
original_id: io-cassandra
---

The Cassandra Sink connector is used to write messages to a Cassandra Cluster.

The tutorial [Connecting Pulsar with Apache Cassandra](io-quickstart) shows how to use the Cassandra Sink
connector to write messages to a Cassandra table.

## Sink Configuration Options

The Cassandra sink settings are listed below. All of them are required to run a Cassandra sink.

| Name | Default | Required | Description |
|------|---------|----------|-------------|
| `roots` | `null` | `true` | Cassandra contact points: a comma-separated `String` listing one or more node addresses. |
| `keyspace` | `null` | `true` | Cassandra Keyspace name. The keyspace should be created prior to creating the sink. |
| `columnFamily` | `null` | `true` | Cassandra ColumnFamily name. The column family should be created prior to creating the sink. |
| `keyname` | `null` | `true` | Key column name. The key column is used for storing Pulsar message keys. If a Pulsar message doesn't have any key associated, the message value will be used as the key. |
| `columnName` | `null` | `true` | Value column name. The value column is used for storing Pulsar message values. |
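
The key fallback described for `keyname` can be sketched in a few lines of Python. This only illustrates the rule stated in the table and is not the connector's code. `to_row` and all sample values are hypothetical; only the five option names come from the table.

```python
# Hypothetical sink settings; the five option names come from the table above.
config = {
    "roots": "localhost:9042",
    "keyspace": "example_keyspace",
    "columnFamily": "example_table",
    "keyname": "key",
    "columnName": "col",
}


def to_row(config, message_key, message_value):
    """Map one Pulsar message to a {column name: value} dict."""
    # A message without a key uses its value as the key.
    key = message_key if message_key is not None else message_value
    return {config["keyname"]: key, config["columnName"]: message_value}


print(to_row(config, "user-1", "hello"))
print(to_row(config, None, "hello"))
```

In the second call the message has no key, so the value `hello` is written to both the key column and the value column.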
267 changes: 15 additions & 252 deletions site2/website/versioned_docs/version-2.1.1-incubating/io-connectors.md
@@ -1,256 +1,19 @@
---
id: io-connectors
title: Built-in connector
sidebar_label: "Built-in connector"
title: Builtin Connectors
sidebar_label: "Builtin Connectors"
original_id: io-connectors
---

Pulsar distribution includes a set of common connectors that have been packaged and tested with the rest of Apache Pulsar. These connectors import and export data from some of the most commonly used data systems.

Using any of these connectors is as easy as writing a simple connector and running the connector locally or submitting the connector to a Pulsar Functions cluster.

## Source connector

Pulsar has various source connectors, which are sorted alphabetically as below.

### Canal

* [Configuration](io-canal-source.md#configuration)

* [Example](io-canal-source.md#usage)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/canal/src/main/java/org/apache/pulsar/io/canal/CanalStringSource.java)


### Debezium MySQL

* [Configuration](io-debezium-source.md#configuration)

* [Example](io-debezium-source.md#example-of-mysql)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/debezium/mysql/src/main/java/org/apache/pulsar/io/debezium/mysql/DebeziumMysqlSource.java)

### Debezium PostgreSQL

* [Configuration](io-debezium-source.md#configuration)

* [Example](io-debezium-source.md#example-of-postgresql)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/debezium/postgres/src/main/java/org/apache/pulsar/io/debezium/postgres/DebeziumPostgresSource.java)

### Debezium MongoDB

* [Configuration](io-debezium-source.md#configuration)

* [Example](io-debezium-source.md#example-of-mongodb)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/debezium/mongodb/src/main/java/org/apache/pulsar/io/debezium/mongodb/DebeziumMongoDbSource.java)

### Debezium Oracle

* [Configuration](io-debezium-source.md#configuration)

* [Example](io-debezium-source.md#example-of-oracle)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/debezium/oracle/src/main/java/org/apache/pulsar/io/debezium/oracle/DebeziumOracleSource.java)

### Debezium Microsoft SQL Server

* [Configuration](io-debezium-source.md#configuration)

* [Example](io-debezium-source.md#example-of-microsoft-sql)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/debezium/mssql/src/main/java/org/apache/pulsar/io/debezium/mssql/DebeziumMsSqlSource.java)


### DynamoDB

* [Configuration](io-dynamodb-source.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/dynamodb/src/main/java/org/apache/pulsar/io/dynamodb/DynamoDBSource.java)

### File

* [Configuration](io-file-source.md#configuration)

* [Example](io-file-source.md#usage)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/file/src/main/java/org/apache/pulsar/io/file/FileSource.java)

### Flume

* [Configuration](io-flume-source.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/flume/src/main/java/org/apache/pulsar/io/flume/FlumeConnector.java)

### Twitter firehose

* [Configuration](io-twitter-source.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/twitter/src/main/java/org/apache/pulsar/io/twitter/TwitterFireHose.java)

### Kafka

* [Configuration](io-kafka-source.md#configuration)

* [Example](io-kafka-source.md#usage)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/kafka/src/main/java/org/apache/pulsar/io/kafka/KafkaAbstractSource.java)

### Kinesis

* [Configuration](io-kinesis-source.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/kinesis/src/main/java/org/apache/pulsar/io/kinesis/KinesisSource.java)

### Netty

* [Configuration](io-netty-source.md#configuration)

* [Example of TCP](io-netty-source.md#tcp)

* [Example of HTTP](io-netty-source.md#http)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/netty/src/main/java/org/apache/pulsar/io/netty/NettySource.java)

### NSQ

* [Configuration](io-nsq-source.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/nsq/src/main/java/org/apache/pulsar/io/nsq/NSQSource.java)

### RabbitMQ

* [Configuration](io-rabbitmq-source.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/rabbitmq/src/main/java/org/apache/pulsar/io/rabbitmq/RabbitMQSource.java)

## Sink connector

Pulsar has various sink connectors, which are sorted alphabetically as below.

### Aerospike

* [Configuration](io-aerospike-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/aerospike/src/main/java/org/apache/pulsar/io/aerospike/AerospikeStringSink.java)

### Cassandra

* [Configuration](io-cassandra-sink.md#configuration)

* [Example](io-cassandra-sink.md#usage)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/cassandra/src/main/java/org/apache/pulsar/io/cassandra/CassandraStringSink.java)

### ElasticSearch

* [Configuration](io-elasticsearch-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/elastic-search/src/main/java/org/apache/pulsar/io/elasticsearch/ElasticSearchSink.java)

### Flume

* [Configuration](io-flume-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/flume/src/main/java/org/apache/pulsar/io/flume/sink/StringSink.java)

### HBase

* [Configuration](io-hbase-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/hbase/src/main/java/org/apache/pulsar/io/hbase/HbaseAbstractConfig.java)

### HDFS2

* [Configuration](io-hdfs2-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/hdfs2/src/main/java/org/apache/pulsar/io/hdfs2/AbstractHdfsConnector.java)

### HDFS3

* [Configuration](io-hdfs3-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/hdfs3/src/main/java/org/apache/pulsar/io/hdfs3/AbstractHdfsConnector.java)

### InfluxDB

* [Configuration](io-influxdb-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/influxdb/src/main/java/org/apache/pulsar/io/influxdb/InfluxDBGenericRecordSink.java)

### JDBC ClickHouse

* [Configuration](io-jdbc-sink.md#configuration)

* [Example](io-jdbc-sink.md#example-for-clickhouse)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/jdbc/clickhouse/src/main/java/org/apache/pulsar/io/jdbc/ClickHouseJdbcAutoSchemaSink.java)

### JDBC MariaDB

* [Configuration](io-jdbc-sink.md#configuration)

* [Example](io-jdbc-sink.md#example-for-mariadb)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/jdbc/mariadb/src/main/java/org/apache/pulsar/io/jdbc/MariadbJdbcAutoSchemaSink.java)

### JDBC OpenMLDB

* [Configuration](io-jdbc-sink.md#configuration)

* [Example](io-jdbc-sink.md#example-for-openmldb)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/jdbc/openmldb/src/main/java/org/apache/pulsar/io/jdbc/OpenMLDBJdbcAutoSchemaSink.java)

### JDBC PostgreSQL

* [Configuration](io-jdbc-sink.md#configuration)

* [Example](io-jdbc-sink.md#example-for-postgresql)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/jdbc/postgres/src/main/java/org/apache/pulsar/io/jdbc/PostgresJdbcAutoSchemaSink.java)

### JDBC SQLite

* [Configuration](io-jdbc-sink.md#configuration)

* [Example](io-jdbc-sink.md#example-for-sqlite)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/jdbc/sqlite/src/main/java/org/apache/pulsar/io/jdbc/SqliteJdbcAutoSchemaSink.java)

### Kafka

* [Configuration](io-kafka-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/kafka/src/main/java/org/apache/pulsar/io/kafka/KafkaAbstractSink.java)

### Kinesis

* [Configuration](io-kinesis-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/kinesis/src/main/java/org/apache/pulsar/io/kinesis/KinesisSink.java)

### MongoDB

* [Configuration](io-mongo-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/mongo/src/main/java/org/apache/pulsar/io/mongodb/MongoSink.java)

### RabbitMQ

* [Configuration](io-rabbitmq-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/rabbitmq/src/main/java/org/apache/pulsar/io/rabbitmq/RabbitMQSink.java)

### Redis

* [Configuration](io-redis-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/redis/src/main/java/org/apache/pulsar/io/redis/RedisAbstractConfig.java)

### Solr

* [Configuration](io-solr-sink.md#configuration)

* [Java class](https://github.com/apache/pulsar/blob/master/pulsar-io/solr/src/main/java/org/apache/pulsar/io/solr/SolrSinkConfig.java)

The Pulsar distribution includes a set of common connectors that have been packaged and tested with the rest of Apache Pulsar.
These connectors import and export data from some of the most commonly used data systems. Using any of these connectors is
as easy as writing a simple connector configuration and running the connector locally or submitting the connector to a
Pulsar Functions cluster.

- [Aerospike Sink Connector](io-aerospike.md)
- [Cassandra Sink Connector](io-cassandra.md)
- [Kafka Sink Connector](io-kafka.md#sink)
- [Kafka Source Connector](io-kafka.md#source)
- [Kinesis Sink Connector](io-kinesis.md#sink)
- [RabbitMQ Source Connector](io-rabbitmq.md#source)
- [Twitter Firehose Source Connector](io-twitter.md)
41 changes: 41 additions & 0 deletions site2/website/versioned_docs/version-2.1.1-incubating/io-kafka.md
@@ -0,0 +1,41 @@
---
id: io-kafka
title: Kafka Connector
sidebar_label: "Kafka Connector"
original_id: io-kafka
---

## Source

The Kafka Source Connector is used to pull messages from Kafka topics and persist the messages
to a Pulsar topic.

### Source Configuration Options

| Name | Required | Default | Description |
|------|----------|---------|-------------|
| `bootstrapServers` | `true` | `null` | A list of host/port pairs to use for establishing the initial connection to the Kafka cluster. |
| `groupId` | `true` | `null` | A unique string that identifies the consumer group this consumer belongs to. |
| `fetchMinBytes` | `false` | `null` | Minimum bytes expected for each fetch response. |
| `autoCommitEnabled` | `false` | `false` | If true, the consumer periodically commits to ZooKeeper the offsets of messages it has already fetched. If the process fails, a new consumer resumes from the committed offset. |
| `autoCommitIntervalMs` | `false` | `null` | How often, in milliseconds, consumer offsets are committed to ZooKeeper. |
| `sessionTimeoutMs` | `false` | `null` | The timeout used to detect consumer failures when using Kafka's group management facility. |
| `topic` | `true` | `null` | Name of the Kafka topic to receive records from. |
| `keySerializerClass` | `false` | `org.apache.kafka.common.serialization.StringSerializer` | Serializer class for the key; it must implement the `org.apache.kafka.common.serialization.Serializer` interface. |
| `valueSerializerClass` | `false` | `org.apache.kafka.common.serialization.StringSerializer` | Serializer class for the value; it must implement the `org.apache.kafka.common.serialization.Serializer` interface. |
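
As a rough illustration of how the Required and Default columns interact, the following Python sketch fills in the table's non-null defaults and rejects a configuration that lacks a required option. `resolve_source_config` and the sample values are hypothetical; the connector's own validation may differ.

```python
# Option names, required flags, and defaults transcribed from the table above.
# Options whose default is null are left out of DEFAULTS.
REQUIRED = ("bootstrapServers", "groupId", "topic")
DEFAULTS = {
    "autoCommitEnabled": False,
    "keySerializerClass": "org.apache.kafka.common.serialization.StringSerializer",
    "valueSerializerClass": "org.apache.kafka.common.serialization.StringSerializer",
}


def resolve_source_config(user_config):
    """Return user_config with defaults filled in; raise if a required option is missing."""
    missing = [name for name in REQUIRED if user_config.get(name) is None]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))
    # User-supplied values take precedence over defaults.
    return {**DEFAULTS, **user_config}


resolved = resolve_source_config({
    "bootstrapServers": "localhost:9092",  # hypothetical broker address
    "groupId": "example-group",
    "topic": "example-topic",
})
print(resolved["autoCommitEnabled"])
```

Only the three required options need to be supplied; everything else falls back to the defaults listed in the table.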

## Sink

The Kafka Sink Connector is used to pull messages from Pulsar topics and persist the messages
to a Kafka topic.

### Sink Configuration Options

| Name | Required | Default | Description |
|------|----------|---------|-------------|
| `acks` | `true` | `null` | The Kafka producer acks mode. |
| `batchSize` | `true` | `null` | The Kafka producer batch size. |
| `maxRequestSize` | `true` | `null` | The maximum size of a request in bytes. |
| `topic` | `true` | `null` | Name of the Kafka topic that receives records from Pulsar. |
| `keySerializerClass` | `false` | `org.apache.kafka.common.serialization.StringSerializer` | Serializer class for the key; it must implement the `org.apache.kafka.common.serialization.Serializer` interface. |
| `valueSerializerClass` | `false` | `org.apache.kafka.common.serialization.StringSerializer` | Serializer class for the value; it must implement the `org.apache.kafka.common.serialization.Serializer` interface. |