This document describes the current design of datamon sidecars.
The current design favors the sidecar container approach, with bespoke signaling between containers, over the CSI driver approach. Therefore, datamon is not available as a kubernetes persistent volume plugin.
This design choice stems from GKE's current inability to handle Kubernetes ephemeral volumes (a v1.16 feature): managing many short-lived Kubernetes volumes is not practical at the moment. When GKE eventually makes Kubernetes v1.16 available, we may revive our attempt to build a CSI driver for datamon.
The sidecar approach requires coordination between the different containers running in the pod. Signaling is implemented with files on a shared volume.
In particular, we need to keep the pod running and let the sidecars finish uploading results once the main ARGO workflow is done processing.
Ensuring that data is ready for access (sidecar to main-container messaging), as well as notifying the sidecar that the data-science program has produced output data to upload (main-container to sidecar messaging), is the responsibility of a few shell scripts shipped as part and parcel of the Docker images that constitute the sidecars.
The coordination signaling defines the following protocol:
| main (`wrap_application.sh`) | sidecar (`wrap_datamon.sh`) | what happens |
|---|---|---|
| | `<= mountdone` | application waits for input bundles to be mounted |
| (do some work...) | | |
| `initupload =>` | | datamon starts running the upload commands |
| | `<= uploaddone` | application waits until its output is archived |
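For intuition, the signals above are just marker files created and polled for in the shared coordination directory. The following is a minimal sketch of that exchange, not the actual wrap scripts: the signal file names come from the table, while the paths, the polling loop, and the illustrative `app` and upload commands are assumptions.

```sh
COORD_DIR=/tmp/coord   # shared coordination directory (e.g. a memory-backed emptyDir)

# --- sidecar side (wrap_datamon.sh), simplified ---
# mount the input bundles, then signal the application
touch "${COORD_DIR}/mountdone"
# later: wait for the application's go-ahead, upload the results, signal completion
while [ ! -f "${COORD_DIR}/initupload" ]; do sleep 1; done
datamon bundle upload --path /tmp/upload --repo example-repo --message "results"
touch "${COORD_DIR}/uploaddone"

# --- main side (wrap_application.sh), simplified ---
# wait until input data is mounted, run the application, then request the upload
while [ ! -f "${COORD_DIR}/mountdone" ]; do sleep 1; done
app param1 param2
touch "${COORD_DIR}/initupload"
# wait until the output has been archived before letting the pod exit
while [ ! -f "${COORD_DIR}/uploaddone" ]; do sleep 1; done
```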
A similar process is repeated for each configured datamon-postgres sidecar:
| main (`wrap_application.sh`) | sidecar (`wrap_datamon_pg.sh`) | what happens |
|---|---|---|
| | `<= dbstarted` | application waits for the DB instance to be ready |
| (do some work...) | | |
| `initdbupload =>` | | datamon archives the database as a bundle |
| | `<= dbuploaddone` | application waits until its output is archived |
Users need only place the `wrap_application.sh` script in the root directory of the main container. This can be accomplished via an initContainer, without duplicating the version of the Datamon sidecar image in both the main application Dockerfile and the YAML.
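For example, the initContainer can run the Datamon sidecar image and simply copy the wrapper script into a volume shared with the main container. This is a sketch only; the source path inside the sidecar image and the `/scripts` mount point are illustrative assumptions, not the actual image layout.

```sh
# command run by the initContainer (based on the Datamon sidecar image);
# /scripts is a shared volume also mounted by the main container
cp /usr/local/bin/wrap_application.sh /scripts/wrap_application.sh
chmod +x /scripts/wrap_application.sh
```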
When using a block-storage GCS product, we might have specified a data-science application's Argo DAG node with something like
command: ["app"]
args: ["param1", "param2"]
whereas with `wrap_application.sh` in place, this would be something to the effect of
command: ["/path/to/wrap_application.sh"]
args: ["-c", "/path/to/coordination_directory", "-b", "fuse", "--", "app", "param1", "param2"]
That is, `wrap_application.sh` has the following usage

```sh
wrap_application.sh -c <coordination_directory> -b <sidecar_kind> -- <application_command>
```
where

* `<coordination_directory>` is an empty directory in a shared volume (an `emptyDir` using memory-backed storage suffices). Each coordination directory (not necessarily the volume) corresponds to a particular DAG node (i.e. Kubernetes pod) and vice-versa.
* `<sidecar_kind>` is in correspondence with the containers specified in the YAML and may be among `fuse` and `postgres`.
* `<application_command>` is the data-science application command exactly as it would appear without the wrapper script. That is, the wrapper script relies on the conventional UNIX `--` syntax to state that options to the command are done being declared (see the example below).
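Invoked directly from a shell rather than from YAML, the same wrapped invocation shown above reads (paths and application arguments are illustrative):

```sh
/path/to/wrap_application.sh -c /path/to/coordination_directory -b fuse -- app param1 param2
```

Everything after the `--` separator is passed through verbatim as the application command.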
Meanwhile, each sidecar's datamon-specific batteries have their corresponding usages.
The FUSE sidecar provides filesystem representations (i.e. a folder) of datamon bundles. Since bundles' filelists are serialized filesystem representations, the `wrap_datamon.sh` interface is tightly coupled to that of the self-documenting `datamon` binary itself.
```sh
./wrap_datamon.sh -c <coord_dir> -d <bin_cmd_I> -d <bin_cmd_J> ...
```
* `-c` the same coordination directory passed to `wrap_application.sh`
* `-d` all parameters, exactly as passed to the datamon binary, except as a single scalar (quoted) parameter, for one of the following commands:
  * `config` sets user information associated with any bundles created by the node
  * `bundle mount` provides sources for data-science applications
  * `bundle upload` provides sinks for data-science applications
Multiple `bundle mount` and `bundle upload` commands (or none at all) may be specified, and at most one `config` command is allowed, so that an example `wrap_datamon.sh` YAML might be
command: ["./wrap_datamon.sh"]
args: ["-c", "/tmp/coord", "-d", "config create", "-d", "bundle upload --path /tmp/upload --message \"result of container coordination demo\" --repo ransom-datamon-test-repo --label coordemo", "-d", "bundle mount --repo ransom-datamon-test-repo --label testlabel --mount /tmp/mount --stream"]
or from the shell
```sh
./wrap_datamon.sh -c /tmp/coord -d 'config create' -d 'bundle upload --path /tmp/upload --message "result of container coordination demo" --repo ransom-datamon-test-repo --label coordemo' -d 'bundle mount --repo ransom-datamon-test-repo --label testlabel --mount /tmp/mount --stream'
```
Aside on serialization format
Each of these environment variables contains a serialized dictionary according to the following format
```
<entry_separator><key_value_separator><entry_1><entry_separator><entry_2>...
```
where `<entry_separator>` and `<key_value_separator>` are each a single character, anything other than a `.`, and each `<entry>` takes one of two forms, either `<option>` or `<option><key_value_separator><arg>`.
So for example `;:a;b:c` expresses something like the Python map `{'a': True, 'b': 'c'}` or the shell option args `<argv0> -a -b c`.
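As a concrete illustration, a string in this format can be parsed using nothing more than its first two characters. The following is a sketch under the stated format, not code shipped with datamon:

```sh
#!/usr/bin/env bash
# Parse a serialized dictionary of the form
# <entry_separator><key_value_separator><entry_1><entry_separator><entry_2>...
spec=';:a;b:c'
entry_sep=${spec:0:1}   # ';'
kv_sep=${spec:1:1}      # ':'
body=${spec:2}          # 'a;b:c'
IFS="$entry_sep" read -r -a entries <<< "$body"
for e in "${entries[@]}"; do
  key=${e%%"$kv_sep"*}
  val=${e#*"$kv_sep"}
  [ "$key" = "$e" ] && val=true   # bare option with no argument
  echo "$key=$val"
done
# prints: a=true
#         b=c
```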