LarkSheet connector-v1

Parent document: Connectors

The BitSail LarkSheet connector supports reading Lark sheets. Its main features are:

  • Batch reads from one or more Lark sheets at once.
  • Authentication by static token or by application credentials.
  • Reading only a subset of the columns in a sheet.

Maven dependency

<dependency>
   <groupId>com.bytedance.bitsail</groupId>
   <artifactId>connector-larksheet</artifactId>
   <version>${revision}</version>
</dependency>

LarkSheet reader

Supported data types

The BitSail LarkSheet reader processes all data as strings.

Parameters

Add the following parameters to the job.reader block, for example:

{
  "job": {
    "reader": {
      "class": "com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat",
      "sheet_urls": "https://e4163pj5kq.feishu.cn/sheets/shtcnQmZNlZ9PjZUJKT5oU3Sjjg?sheet=ZbzDHq",
      "columns": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "datetime",
          "type": "string"
        }
      ]
    }
  }
}

Necessary parameters

| Param name | Required | Optional value | Description |
|:-----------|:---------|:---------------|:------------|
| class | Yes | | LarkSheet reader class name: com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat |
| sheet_urls | Yes | | The sheets to read. Separate multiple sheet URLs with commas. |
| columns | Yes | | The names and types of the fields to read. |

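As a minimal sketch of the comma-separated sheet_urls form, the configuration below reads two sheets in one job. Both URLs are placeholders: every angle-bracket part is hypothetical and must be replaced with real values. The sketch also assumes that both sheets contain the listed columns.

```json
{
  "job": {
    "reader": {
      "class": "com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat",
      "sheet_urls": "https://<tenant>.feishu.cn/sheets/<spreadsheet_token>?sheet=<sheet_id_1>,https://<tenant>.feishu.cn/sheets/<spreadsheet_token>?sheet=<sheet_id_2>",
      "columns": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "datetime",
          "type": "string"
        }
      ]
    }
  }
}
```
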
The following parameters are for authentication. Set either sheet_token, or both app_id and app_secret, in your configuration.

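| Param name | Required | Optional value | Description |
|:-----------|:---------|:---------------|:------------|
| sheet_token | Set at least one of: 1. sheet_token; 2. app_id and app_secret | | Token that grants access to the Feishu open API. |
| app_id | See sheet_token | | Used together with app_secret to generate a token for the Feishu open API. |
| app_secret | See sheet_token | | Used together with app_id to generate a token for the Feishu open API. |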

Note that a sheet_token may expire while the job is running. If you use app_id and app_secret instead, the token is refreshed when it expires.
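For a job that may outlive a static token, application credentials are the safer choice. The following sketch shows where app_id and app_secret would go in the job.reader block. The credential values and the sheet URL are placeholders, not real values, and the columns simply mirror the earlier example.

```json
{
  "job": {
    "reader": {
      "class": "com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat",
      "sheet_urls": "https://<tenant>.feishu.cn/sheets/<spreadsheet_token>?sheet=<sheet_id>",
      "app_id": "<your_app_id>",
      "app_secret": "<your_app_secret>",
      "columns": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "datetime",
          "type": "string"
        }
      ]
    }
  }
}
```

To authenticate with a static token instead, replace the app_id and app_secret entries with a single sheet_token entry.
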

Optional parameters

| Param name | Required | Optional value | Description |
|:-----------|:---------|:---------------|:------------|
| reader_parallelism_num | No | | Read parallelism. |
| batch_size | No | | Number of lines extracted per batch. |
| skip_nums | No | | A list of numbers indicating how many lines to skip in each sheet. |
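The optional parameters might be combined as in the sketch below. The value formats are assumptions rather than confirmed syntax: reader_parallelism_num and batch_size are written as integers, and skip_nums is written as a JSON array with one entry per sheet, in the same order as sheet_urls. The numbers are illustrative only and the URLs are placeholders.

```json
{
  "job": {
    "reader": {
      "class": "com.bytedance.bitsail.connector.legacy.larksheet.source.LarkSheetInputFormat",
      "sheet_urls": "https://<tenant>.feishu.cn/sheets/<spreadsheet_token>?sheet=<sheet_id_1>,https://<tenant>.feishu.cn/sheets/<spreadsheet_token>?sheet=<sheet_id_2>",
      "reader_parallelism_num": 2,
      "batch_size": 1000,
      "skip_nums": [1, 0],
      "columns": [
        {
          "name": "id",
          "type": "string"
        },
        {
          "name": "datetime",
          "type": "string"
        }
      ]
    }
  }
}
```

Under these assumptions, the job would skip one line in the first sheet and none in the second.
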

Related documents

Configuration examples: LarkSheet connector example