The BucketFS Client (BFSC) is a client program for Exasol's BucketFS distributed file system.
BFSC allows you to read and write the contents of buckets.
A bucket is a storage space on an Exasol cluster that automatically takes care of distributing files loaded onto it. Those files can then be used in User Defined Functions (UDFs), e.g. for Virtual Schemas which are a special type of UDFs. See official documentation for more details on BucketFS.
The most common use case for BucketFS is storing files on it that UDFs need:
- Libraries for your UDFs
- Drivers for external data sources or sinks
- Configuration files
- Immutable lookup tables
The client is implemented in Java and provides native executables for x86 CPU with operating system Ubuntu Linux or Windows.
Platforms
Platform | Available? | Notes |
---|---|---|
Linux | ✅ yes | Built using GitHub hosted runner latest_ubuntu. For release 1.1.2 this is Ubuntu 22.04 with glibc 2.32. |
Windows | ✅ yes | When downloading the native executable some browsers may display a warning about security risks imposed by downloading unsigned binaries. |
macOS | ❌ no | Latest macOS Ventura (e.g. version 13.2.1) refuses to execute unsigned binaries. |
Signed binaries for Windows and macOS will be provided as soon as a signing process is established.
For all platforms not supported currently or in case you do not want to execute an unsigned binary on your platform please refer to the jar file provided with each release.
To make BFSC really handy you can either rename the binary or set an alias
alias bfsc='bfsc-linux_x86'
Besides executing the binaries you can also download BFSC's JAR file and let a Java VM execute it.
For that you need Java 17 or later, the installation procedure depends on your operating system.
- Any operating system: Adoptium OpenJDK build
- Linux
- Ubuntu, Debian:
sudo apt install openjdk-17-jre-headless
- RedHat
sudo yum install java-17-openjdk
- SuSE: OpenJDK build of the Leap project](https://software.opensuse.org/download/package?package=java-17-openjdk&project=openSUSE%3AFactory)
- Ubuntu, Debian:
As soon as Java 17 is available you can run BFSC's JAR file the following command:
java -jar "<path-to-bfsc-jar>" <command> <option> ...
An alias in your shell is even more useful here:
alias bfsc='java -jar <path-to-bfsc-jar>'
Sub-commands control the action that BFSC is taking. For each type of action there is a sub-command. The way to use this is:
bfsc <sub-command> <option> ...
BFSC supports the following sub commands to inspect and manipulate files in the BucketFS. You can use a configuration file or rely on default values to abbreviate URLs in BucketFS.
Full | Abbreviated | Description |
---|---|---|
bfsc ls bfs://localhost:2580/default |
bfsc ls |
List the contents of BucketFS |
bfsc ls bfs://localhost:2580/default/folder |
bfsc ls folder |
List the contents of a directory in BucketFS |
bfsc cp a.jar bfs://localhost:2580/default/ |
bfsc cp a.jar bfs:/ |
Upload a file from local fie system to BucketFS |
bfsc cp bfs://localhost:2580/default/a.jar . |
bfsc cp bfs:/a.jar . |
Download a file from BucketFS to local file system |
bfsc rm bfs://localhost:2580/a.jar |
bfsc rm /a.jar |
Remove a file from BucketFS |
The following sections will first explain some general concepts and then describe each of the commands in detail.
BFSC uses specific URLs to refer to files or paths in a bucket.
bfs[s]://<host>:<port>/<path-in-bucket>
The protocol is either bfs
or bfss
for a connection that is TLS secured.
You specify the host or IP address of the machine where the BucketFS service runs. Each service has its own port that you set in the BucketFS service configuration so a port number identifies the service.
The first path element is the name of the bucket. The next elements are the path relative to that bucket's root.
There are separate passwords for read- and write-operations:
Operation | Examples | Password protection |
---|---|---|
Write operations | Upload files to BucketFS, delete files from the BucketFS | Always password protected |
Read operations | List bucket contents, download files | Unprotected for public buckets |
BFSC supports to read required passwords either from the profile in the configuration file or from an interactive prompt hiding the characters typed by the user.
BFSC does not support to provide passwords on the command line to avoid the password showing up in the command history. As a general rule you should never put any credentials directly into a command line.
BFSC will normally not ask for a read-password interactively.
- This is fine for public buckets as these are not password-protected.
- This is also fine if the password is contained in your profile.
- In case of accessing a private bucket without the read-password being contained in your profile you must add option
-pw
or--require-read-password
to force BFSC to ask for the password via interactive prompt.
The passwords are usually stored in file /exa/etc/EXAConf
:
WritePasswd = <value>
ReadPasswd = <value>
Please note that each of the passwords is base64-encoded. So before providing the passwords to BFSC please apply echo <password> | base64 -d
.
Besides specifying the complete URL on the command line you can use defaults for some parts of the URL.
BFSC uses the following precedence for URI parts
- Supplied on the command line, or entered interactively in case of a password
- Data in your profile
- Hard coded default value for host and port
You can define the profile in BFSC's configuration file in your home directory: ~/.bucketfs-client/config.ini
. On Windows the home directory is %USERPROFILE%
.
The configuration file uses the INI-file syntax. An INI file is divided into sections. Each section contains a number of lines defining a profile. The first line of each section specifies the name of the profile in brackets. The default profile's name is default
. Each of the following lines of the section may assign a value to a variable.
BFCS uses the following elements of your profile with the specified hard coded default values:
Parameter | Variable | Default value |
---|---|---|
Host address of BucketFS server | host |
localhost |
Port | port |
2580 |
Name of root bucket | bucket |
(none) |
Password for read operations | password.read |
(none) |
Password for write operations | password.write |
(none) |
Here is an example for the content of a configuration file for BFSC:
[default]
host=1.2.3.4
port=8888
bucket=simba
password.read=abc
password.write=def
Using this configuration file the command line
bfsc cp foo.jar bfs://1.2.3.4:8888/simba/drivers/foo.jar
... can then be abbreviated to:
bfsc cp foo.jar bfs:/drivers
BFSC cannot detect whether the URL on the command line contains a bucket name or directly starts with the path inside the bucket. If the profile in your configuration file specifies a bucket name, then BFSC will always prepend that bucket name to the path of the URL.
If you want to provide the name of the root bucket on the command line then please do not set bucket
in your profile.
In case your profile specifes the name of the root bucket all of the following commands will list the contents of directory folder
in the root bucket:
bfsc ls folder
bfsc ls /folder
bfsc ls bfs:///folder
bfsc ls bfs:/folder
In case your profile does not specify the name of the root bucket all of the following commands will list the contents of root bucket default
:
bfsc ls /default/
bfsc ls bfs:/default/
bfsc ls bfs:///default/
With bfsc ls
you can list the contents of a directory in the bucket. If you do not specify a directory then the ls
command will list all files and directories in the bucket on top level.
Command (abbreviated) | Output |
---|---|
bfsc ls |
Contents of the bucket's root directory |
bfsc ls folder |
Contents of directory folder |
Example output
a.txt
b.txt
folder/
BFSC appends a trailing slash /
to separate directories from files potentially having the same name.
If you want to list all buckets rather than the contents of a specific bucket, then please ensure to not not set bucket
in your profile.
Example output for listing root buckets
default
simba
With option -r
or --recursive
you can obtain a recursive listing of all contents inside a directory of a bucket. However when listing buckets then BFSC will ignore option -r
.
Example (abbreviated):
$ bfsc ls -r folder
a.txt
b.txt
sub/a1.txt
With bfsc cp
you can copy files between BucketFS and the local file system.
bfsc cp <from> <to>
In the majority of all cases you will copy files to a bucket. For example if you want to install a library that you plan to use in a Python or Java UDF (User Defined Function):
bfsc cp foo-driver-1.2.3.jar bfs://192.168.0.1:2580/default/drivers/foo-driver-1.2.3.jar
While the cp
command can rename the copied file in the target location you can also omit the filename to tell BFSC to leave the filename unchanged just as the GNU cp
command does:
bfsc cp foo.jar bfs:/drivers/
bfsc cp bfs:/drivers/foo.jar .
Command cp
supports recursive copying:
bfsc cp -r Dir bfs:/
However there are some special cases explained in the following sections.
BucketFS differs from the local file system as it may contain ambigue entries. An ambigue entry represents a regular file and a directory at the same time. Obviously you cannot download both flavors of such an entry to the local file system and BFSC will abort with an error message.
You can however tell BFSC to download one of the flavors of such an ambigue entry.
- For recursively downloading the directory please specify option
-r
and append a slash/
to the name of the ambigue entry. - For downloading the regular file omit both of them.
Command (abbreviated) | Semantics |
---|---|
bfsc cp bfs:/ambigue/ |
error message |
bfsc cp -r bfs:/ambigue |
error message |
bfsc cp bfs:/ambigue |
download regular file ambigue |
bfsc cp -r bfs:/ambigue/ |
recursively download directory ambigue |
For non-ambigue entries the trailing slash is not relevant.
For ambigue entries on lower levels during recursive download BFSC will report a warning and only download the directory flavor of the ambigue entry.
When downloading from BucketFS the target in the local file system might already exist and could be either a directory or a regular file.
The following table shows all cases assuming the local file system contains a regular file b.txt
and a directory B
.
Command (abbreviated) | Semantics / error message |
---|---|
cp bfs://a.txt b.txt |
download regular file a.txt and rename it to b.txt |
cp bfs://a.txt B |
download regular file a.txt into existing local directory B and keep the original filename resulting in B/a.txt |
cp -r bfs://A B |
recursively download directory A to B/A |
cp -r bfs://A b.txt |
error message |
When downloading a directory from BucketFS you can rename the directory by specifying a local name that does not exist, yet.
With bfsc rm
you can delete files in BucketFS.
Examples (abbreviated):
bfsc rm a.txt
bfsc rm folder/b.txt
bfsc rm -r folder
As for the other commands command rm
supports recursive removal, too. However there are some special remarks
- BFSC will not display an error message if you specify
-r
for deleting a regular file. - With option
-r
BFSC will remove files, directories, and ambigue entries. - Applying
rm
without option-r
on an ambigue entry will remove only the regular file but not the directory with the same name.