🛈 Please note that this is an open source project which is officially supported by Exasol. For any question, you can contact our support team.
This repository contains helper code for creating Exasol user-defined functions (UDFs) that read from and write to public cloud storage systems.
Additionally, it provides UDF scripts to import data from Apache Kafka clusters.
- Import formatted data from public cloud storage systems.
- The following source file formats are supported when importing: Apache Avro, Apache ORC and Apache Parquet.
- Export Exasol table data to public cloud storage systems.
- The following sink file format is supported when exporting: Apache Parquet.
- The following cloud storage systems are supported: Amazon S3, Google Cloud Storage, Azure Blob Storage and Azure Data Lake (Gen1) Storage.
- Import Apache Avro formatted data from Apache Kafka clusters.
For more information please check out the following guides.
See CONTRIBUTING.md for contribution guidelines.
To request a feature, provide feedback or report an issue, please open a GitHub issue.
The following sections list all the dependencies that are required for compiling, testing and running the project.
We compile and build the cloud-storage-etl-udfs releases using Java 8; however, it should be safe to run them on newer JVM versions. Java 8 is also the recommended version for building the Scala code.
Dependency | Purpose | License |
---|---|---|
Exasol Script API | Accessing Exasol IMPORT / EXPORT API | MIT License |
Hadoop AWS | Access for Amazon S3 object store and compatible implementations | Apache License 2.0 |
Hadoop Azure | Access support for Azure Blob Storage | Apache License 2.0 |
Hadoop Azure Datalake | Access support for Azure Data Lake Store | Apache License 2.0 |
Hadoop Client | Apache Hadoop common dependencies such as configuration or filesystem | Apache License 2.0 |
Google Cloud Storage | Access support for Google Cloud Storage | Apache License 2.0 |
Apache Avro | Integration support for Avro format | Apache License 2.0 |
Apache ORC | Integration support for ORC format | Apache License 2.0 |
Apache Parquet | Integration support for Parquet format | Apache License 2.0 |
Apache Kafka Clients | Apache Kafka client support for Java / Scala | Apache License 2.0 |
Kafka Avro Serializer | Support for serializing / deserializing Avro formats with Kafka | Apache License 2.0 |
SLF4J API | A simple logging facade for Java (SLF4J) | MIT License |
Scala Logging Library | Scala logging library wrapping SLF4J | Apache License 2.0 |
Dependency | Purpose | License |
---|---|---|
Scalatest | A testing tool for Scala and Java developers | Apache License 2.0 |
Scalatest Plus | Integration support between Scalatest and Mockito | Apache License 2.0 |
Mockito Core | A mocking framework for unit tests | MIT License |
Embedded Kafka Schema Registry | In-memory instances of Kafka and Schema Registry for tests | MIT License |
These plugins help with project development.
Plugin Name | Purpose | License |
---|---|---|
SBT Coursier | Pure Scala artifact fetching | Apache License 2.0 |
SBT Wartremover | Flexible Scala code linting tool | Apache License 2.0 |
SBT Wartremover Contrib | Community managed additional warts for wartremover | Apache License 2.0 |
SBT Assembly | Create fat jars with all project dependencies | MIT License |
SBT API Mappings | A plugin that fetches API mappings for common Scala libraries | Apache License 2.0 |
SBT Scoverage | Integrates the scoverage code coverage library | Apache License 2.0 |
SBT Coveralls | Uploads Scala code coverage results to https://coveralls.io | Apache License 2.0 |
SBT Updates | Checks Maven and Ivy repositories for dependency updates | BSD 3-Clause License |
SBT Scalafmt | A plugin for https://scalameta.org/scalafmt/ formatting | Apache License 2.0 |
SBT Scalastyle | A plugin for http://www.scalastyle.org/ Scala style checker | Apache License 2.0 |
SBT Dependency Graph | A plugin for visualizing the dependency graph of your project | Apache License 2.0 |
SBT Explicit Dependencies | Checks which direct libraries are required to compile your code | Apache License 2.0 |
SBT Git | A plugin for Git integration, used to version the release jars | BSD 2-Clause License |