diff --git a/pr-preview/pr-54/404.html b/pr-preview/pr-54/404.html new file mode 100644 index 0000000..51e4c8f --- /dev/null +++ b/pr-preview/pr-54/404.html @@ -0,0 +1,748 @@ + + + +
+You can configure a DTS instance by creating a YAML text file similar to
+dts.yaml.example in the repository. Typically this file is named dts.yaml and
+is passed as an argument to the dts executable. Here we describe the different
+sections in this file and how they affect your DTS instance.
Click on any of the links below to see the relevant details for a section.
+Each of these sections is described below, with a motivating example.
+service
service:
+ port: 8080
+ max_connections: 100
+ poll_interval: 60000
+ endpoint: globus-local
+ data_dir: /path/to/dir
+ manifest_dir: /path/to/dir
+ delete_after: 604800
+ debug: true
+
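The two time-related fields in this example use different units: poll_interval is given in milliseconds and delete_after in seconds. As a rough illustration of how the example values relate to ordinary durations, here is a minimal Go sketch. The serviceTimings type and its methods are hypothetical names for illustration and are not taken from the DTS source.

```go
package main

import (
	"fmt"
	"time"
)

// serviceTimings holds the two time-related fields from the example above.
// The type and its field names are hypothetical, for illustration only.
type serviceTimings struct {
	PollInterval int // poll_interval, in milliseconds
	DeleteAfter  int // delete_after, in seconds
}

// pollDuration converts poll_interval (milliseconds) to a time.Duration.
func (s serviceTimings) pollDuration() time.Duration {
	return time.Duration(s.PollInterval) * time.Millisecond
}

// retention converts delete_after (seconds) to a time.Duration.
func (s serviceTimings) retention() time.Duration {
	return time.Duration(s.DeleteAfter) * time.Second
}

func main() {
	s := serviceTimings{PollInterval: 60000, DeleteAfter: 604800}
	fmt.Println("poll every:", s.pollDuration())                        // poll every: 1m0s
	fmt.Println("keep records for:", s.retention().Hours()/24, "days") // keep records for: 7 days
}
```

Converting both values to time.Duration as early as possible keeps the unit difference from leaking into the rest of a program.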
+The service section contains parameters that control nuts-and-bolts behavior
+of the web service portion of the Data Transfer Service. The fields in this
+section are:
+port: the port on which the service listens
+max_connections: the maximum number of connections that are simultaneously
+  available for DTS clients. If a client sends a request to the DTS when all
+  connections are occupied, the request is denied.
+poll_interval: the interval (in milliseconds) at which the DTS checks for
+  progress in any ongoing transfers. Because the file transfers orchestrated by
+  the DTS typically take a long time, it's reasonable to set this parameter to
+  a minute (60000 ms) or even longer. However, sometimes it's useful to have a
+  smaller polling interval, like when you're testing a feature. This parameter
+  is optional and defaults to 60000 ms.
+endpoint: the name of an endpoint (defined in the endpoints
+  section) used by the DTS to generate and transfer manifests to destination
+  endpoints. This endpoint must have access to the file system to which the DTS
+  writes its manifests.
+data_dir: a path to a directory on the local file system that the DTS uses
+  for its own storage. The DTS should have read/write access to this directory.
+manifest_dir: a path to a directory on the local file system in which the
+  DTS writes transfer manifests. The endpoint named in the endpoint parameter
+  must have read access to this directory in order to send the manifest to its
+  destination.
+delete_after: the interval (in seconds) after which the DTS deletes the
+  record for a completed transfer, whether the transfer completed successfully
+  or unsuccessfully. This makes it possible for users to query the status of
+  completed transfers for the given interval. This parameter is optional and
+  defaults to 7 days (604800 seconds).
+debug: an optional parameter that, if set to true, enables more detailed
+  logging and other features that are helpful for troubleshooting and
+  development work. The default value is false.
+endpoints
endpoints:
+ globus-local:
+ name: name-of-local-endpoint
+ id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
+ provider: globus
+ auth:
+ client_id: <ID of client with authentication secret>
+ client_secret: <secret>
+ globus-jdp:
+ name: name-of-jdp-endpoint
+ id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
+ provider: globus
+ auth:
+ client_id: <ID of client with authentication secret>
+ client_secret: <secret>
+ globus-kbase:
+ name: name-of-kbase-endpoint
+ id: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
+ provider: globus
+ auth:
+ client_id: <ID of client with authentication secret>
+ client_secret: <secret>
+
+This section is a mapping (set of key-value pairs) that associates the names
+of endpoints (keys) with sets of parameters that define their behaviors
+(values). The endpoints defined here can be referred to in the other sections.
+The fields that define the behavior of each endpoint are:
+name: a human-readable name for the endpoint, which can be helpful in
+  diagnostic and error-related messages
+id: a UUID that uniquely identifies the endpoint in a (provider-specific)
+  way that allows the DTS to access it
+provider: the name of the service providing the endpoint capability.
+  Valid values for this parameter are:
+    globus: identifies the endpoint as a Globus Collection (in which case
+      the id parameter is the corresponding UUID)
+    local: identifies the endpoint as a local endpoint with access only to
+      the DTS's local file system. This type of endpoint is only useful for
+      testing.
+auth: this optional parameter provides authentication information to the
+  endpoint's provider if necessary. Its fields are:
+    client_id: an ID that identifies the DTS to the endpoint's provider as
+      a client
+    client_secret: a string containing a secret corresponding to the ID
+      provided by the client_id parameter
+root: this optional parameter specifies the root directory used by DTS to
+  refer to files on the underlying filesystem of the endpoint. If left blank,
+  the root directory is set to /.
+databases
databases:
+ jdp:
+ name: JGI Data Portal
+ organization: Joint Genome Institute
+ endpoint: globus-jdp
+ kbase:
+ name: KBase Workspace Service (KSS)
+ organization: KBase
+ endpoint: globus-kbase
+
+This section is a mapping (set of key-value pairs) that associates the names
+of databases (keys) with sets of parameters that define the databases themselves
+(at least, as far as the DTS is concerned). These databases are the sources and
+destinations for all file transfers performed by the DTS. The keys in this
+section identify the databases that are configured for the DTS, and are referred
+to in transfer requests specified by DTS clients. Supported databases are:
+jdp: the Joint Genome Institute Data Portal
+kbase: the Department of Energy Systems Biology Knowledgebase (KBase)
+Valid fields for each database are:
+name: a human-readable name for the database, useful in diagnostic and
+  error-related messages
+organization: a human-readable name for the organization that provides the
+  database (again, purely informational)
+endpoint: the name of the endpoint defined in the endpoints
+  section that provides the DTS with access to the file staging area for the
+  database
+You can use the Dockerfile and dts.yaml files in the deployment folder to
+build a Docker image for DTS. The Docker image contains two files:
+/bin/dts: the statically-linked dts executable
+/etc/dts.yaml: a DTS configuration file with embedded
+  environment variables that control parameters of interest
+This image can be deployed in any Docker-friendly environment. The use of
+environment variables in the configuration file allows you to configure
+DTS without regenerating the image.
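To make the idea of embedded environment variables concrete, here is a minimal Go sketch of one way such substitution could work. It assumes `${NAME}`-style placeholders and uses the standard library's os.Expand; the mechanism DTS actually uses is not shown in this documentation. DATA_DIRECTORY and MANIFEST_DIRECTORY are variable names mentioned in this documentation, while the configuration fragment and the expandConfig helper are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// expandConfig substitutes $VAR and ${VAR} references in a configuration text
// using the supplied lookup function. It is a hypothetical helper, not part of
// DTS. Names for which lookup returns "" expand to an empty string.
func expandConfig(text string, lookup func(string) string) string {
	return os.Expand(text, lookup)
}

func main() {
	// A fragment resembling the service section. The placement of the
	// placeholders is an assumption made for illustration.
	template := "service:\n  data_dir: ${DATA_DIRECTORY}\n  manifest_dir: ${MANIFEST_DIRECTORY}\n"

	os.Setenv("DATA_DIRECTORY", "/data")
	os.Setenv("MANIFEST_DIRECTORY", "/manifests")

	// os.Getenv serves as the lookup function for the real environment.
	fmt.Print(expandConfig(template, os.Getenv))
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without touching the process environment.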
+DTS is hosted in NERSC's Spin environment under Rancher 2. It runs in the
+Production environment under the kbase organization. You can read about Spin
+in NERSC's documentation, and Rancher 2 here. The documentation isn't great,
+but fortunately there's not a lot to know--most of the materials you'll need
+are right here in the deployment folder.
Deploying DTS to Spin involves
+dts
Spin deployment via NERSC's
+ Rancher 2 console
+Each of these steps is fairly simple.
+Before you perform an update, take some time to familiarize yourself
+with the Rancher 2 console and the dts production deployment.
+The most important details are:
+/global/cfs/cdirs/kbase/dts/. This volume is visible to the
+  service as /data, so the DATA_DIRECTORY environment variable should be
+  set to /data.
+/global/cfs/cdirs/kbase/gsharing/dts/manifests
+  so that it is accessible via a Globus endpoint. This volume is visible to
+  the service as /manifests, so the MANIFEST_DIRECTORY environment variable
+  should be set to /manifests. NOTE: the directory must be the same when
+  viewed by the service and the Globus Collection! If there is a mismatch,
+  the service will not be able to write the manifest OR Globus will not be
+  able to transfer it.
+Let's walk through the process of updating and redeploying the DTS in Spin.
+From within a clone of the DTS GitHub repo, make sure the repo is up to date
+by typing git pull in the main branch.
+Then, from the top-level dts source folder, execute the deploy-to-spin.sh
+script, passing as arguments
For example,
+./deployment/deploy-to-spin.sh v1.1 johnson 52710 kbase 54643
+
+builds a new DTS Docker image to be run as the user johnson, with the tag v1.1.
+The script pushes the Docker image to Harbor, the NERSC Docker registry. Make
+sure the tag indicates the current version of dts (e.g. v1.1) for clarity.
+After building the Docker image and tagging it, the script prompts you for the
+NERSC password for the user you specified. This allows it to push the image to
+Harbor so it can be accessed via the Rancher 2 console.
+Now log in to Rancher 2 and navigate to the dts deployment.
+dts pod to view its status and information
+Edit to update its configuration.
+Volumes section and edit the CFS directory for the volume mounted at /data.
+  This is usually set to /global/cfs/cdirs/kbase/dts/, so you rarely need to
+  edit it. Similarly, check the volume mounted at /manifests (usually set to
+  /global/cfs/cdirs/kbase/gsharing/manifests/).
+deploy-to-spin.sh above.
+Recreate: KILL ALL pods, then start new pods. This ensures that the
+  service in the existing pod can save a record of its ongoing tasks before a
+  service in a new pod tries to restore them.
+Save to restart the deployment with this new information.
+That's it! You've now updated the service with new features and bugfixes.
+The Data Transfer Service relies heavily on Globus
+for performing file transfers between different databases. Globus is an elaborate
+and continuously evolving platform, so configuring access from an application
+can be confusing. Here we describe all the things you need to know to grant
+DTS access to a Globus endpoint.
+Globus has its own terminology that is slightly different from the terminology
+we've used to describe DTS, so let's clarify some definitions first.
+This guide gives a complete set of instructions using the terminology above.
+Below, we briefly summarize the steps in the guide. Of course, you need a
+Globus user account to play this game.
+Obtain an Application/Service Credential for DTS. The credential consists of
+  a unique client ID and an associated client secret. The client ID can be used to
+  identify the DTS as an entity that can be granted access permissions. Of course,
+  the primary instance of the DTS already has one of these.
+Create a Guest Collection on the Globus Endpoint or on an existing Collection.
+  Without a Guest Collection, you can't grant the DTS access to anything. You might
+  have to poke around a bit to find an endpoint or existing collection that (a) you
+  have access to and (b) exposes the resources that you want to grant to the
+  DTS.
+Grant DTS read or read/write access to the Guest Collection. Since the DTS
+  has its own client ID, you can grant it access to a Guest Collection just as you
+  would any other Globus user.
+The DTS stores its Globus credentials (client ID, client secret) in environment
+variables to prevent them from being read from a file or mined from the executable.
+The deployment section describes how these environment variables
+are managed in practice.
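As a rough illustration of reading credentials from the environment, here is a minimal Go sketch. It assumes the variable names DTS_GLOBUS_CLIENT_ID and DTS_GLOBUS_CLIENT_SECRET, which this documentation lists for running the unit tests; whether a deployed service reads the same names is an assumption. The globusCredentials type and loadGlobusCredentials function are hypothetical and are not taken from the DTS source.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// globusCredentials is a hypothetical container for the client ID and secret.
type globusCredentials struct {
	ClientID     string
	ClientSecret string
}

// loadGlobusCredentials reads both values through the supplied lookup function
// (os.LookupEnv in normal use) and reports an error if either is missing or empty.
func loadGlobusCredentials(lookup func(string) (string, bool)) (globusCredentials, error) {
	id, ok := lookup("DTS_GLOBUS_CLIENT_ID")
	if !ok || id == "" {
		return globusCredentials{}, errors.New("DTS_GLOBUS_CLIENT_ID is not set")
	}
	secret, ok := lookup("DTS_GLOBUS_CLIENT_SECRET")
	if !ok || secret == "" {
		return globusCredentials{}, errors.New("DTS_GLOBUS_CLIENT_SECRET is not set")
	}
	return globusCredentials{ClientID: id, ClientSecret: secret}, nil
}

func main() {
	creds, err := loadGlobusCredentials(os.LookupEnv)
	if err != nil {
		fmt.Println("Globus credentials not available:", err)
		return
	}
	// Report only the client ID; the secret is never printed.
	fmt.Println("loaded Globus credentials for client", creds.ClientID)
}
```

Failing early with a clear message when a variable is missing tends to be easier to diagnose than an authentication error later on.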
+More soon!
+Here we describe how to build, test, and install the Data Transfer Service
+in a local environment.
+DTS is written in Go, so you'll need a working Go compiler
+to build, test, and run it locally. If you have a Go compiler, you can clone
+this repository and build it from the top-level directory:
+go build
+
+DTS comes with several unit tests that demonstrate its capabilities, and you can
+run these tests as you would for any other Go project:
+go test ./...
+
+You can add a -v flag to see output from the tests.
+Because DTS is primarily an orchestrator of network resources, its unit tests
+must be able to connect to and utilize these resources. Accordingly, you must
+set the following environment variables to make sure DTS can do what it needs
+to do:
+DTS_KBASE_DEV_TOKEN: a developer token for the KBase production
+  environment (available to KBase developers) used to connect to the KBase
+  Auth Server, which provides a context for authenticating and authorizing
+  DTS for its basic operations. You can create a token from your KBase
+  developer account.
+DTS_KBASE_TEST_ORCID: an ORCID identifier that can be
+  used to run DTS's unit tests. This identifier must match a registered ORCID ID
+  associated with a KBase user account.
+DTS_KBASE_TEST_USER: the KBase user associated with the ORCID specified
+  by DTS_KBASE_TEST_ORCID. NOTE: at the time of writing, KBase does not have
+  a mechanism for mapping ORCID IDs to local users, so the DTS uses a file in
+  its data directory called kbase_users.json consisting of a single JSON
+  object whose keys are ORCID IDs and whose values are local usernames.
+DTS_GLOBUS_CLIENT_ID: a client ID registered using the
+  Globus Developers
+  web interface. This ID must be registered specifically for an instance of DTS.
+DTS_GLOBUS_CLIENT_SECRET: a client secret associated with the client ID
+  specified by DTS_GLOBUS_CLIENT_ID
+DTS_GLOBUS_TEST_ENDPOINT: a Globus endpoint used to test DTS's transfer
+  capabilities
+DTS_JDP_SECRET: a string containing a shared secret that allows the DTS to
+  authenticate with the JGI Data Portal
+The only remaining step is to copy the dts executable from your source
+directory to wherever you want it to reside. This executable is statically
+linked against all libraries, so it's completely portable.
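The kbase_users.json file mentioned in the description of DTS_KBASE_TEST_USER is described as a single JSON object whose keys are ORCID IDs and whose values are local usernames. A minimal Go sketch of decoding such an object might look like the following. The parseUserTable function, the sample ORCID, and the username are made up for illustration and are not taken from DTS or from any real account.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseUserTable decodes a JSON object that maps ORCID IDs (keys) to local
// usernames (values). It is a hypothetical helper, not part of DTS.
func parseUserTable(data []byte) (map[string]string, error) {
	table := make(map[string]string)
	if err := json.Unmarshal(data, &table); err != nil {
		return nil, err
	}
	return table, nil
}

func main() {
	// Made-up contents in the shape described for kbase_users.json.
	data := []byte(`{"0000-0000-0000-0001": "testuser"}`)

	table, err := parseUserTable(data)
	if err != nil {
		fmt.Println("could not parse user table:", err)
		return
	}
	fmt.Println(table["0000-0000-0000-0001"]) // testuser
}
```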