diff --git a/src/Documentation/sidebar.json b/src/Documentation/sidebar.json
index 21e1634083..fc2d96f866 100644
--- a/src/Documentation/sidebar.json
+++ b/src/Documentation/sidebar.json
@@ -92,8 +92,11 @@
"destroy.md",
"diff.md",
"fetch.md",
+ "get-url.md",
+ "get.md",
"gc.md",
"import-url.md",
+ "import.md",
"init.md",
"install.md",
"lock.md",
@@ -135,8 +138,11 @@
"destroy.md": "destroy",
"diff.md": "diff",
"fetch.md": "fetch",
+ "get-url.md": "get-url",
+ "get.md": "get",
"gc.md": "gc",
"import-url.md": "import-url",
+ "import.md": "import",
"init.md": "init",
"install.md": "install",
"lock.md": "lock",
diff --git a/static/docs/commands-reference/add.md b/static/docs/commands-reference/add.md
index b5e78342bd..7c1c5b03aa 100644
--- a/static/docs/commands-reference/add.md
+++ b/static/docs/commands-reference/add.md
@@ -69,12 +69,12 @@ to work with directory hierarchies with `dvc add`.
the single DVC-file points to a file in the DVC cache that contains
references to the files in the added hierarchy.
-In a DVC project `dvc add` can be used to version control any data artifacts -
-input, intermediate, output files and directories, as well as model files. It is
-useful by itself to go back and forth between different versions of datasets or
-models. Usually though, it is recommended to use `dvc run` and `dvc repro`
-mechanism to version control intermediate and output artifacts (like models).
-This way you bring data provenance and make your project reproducible.
+In a DVC project `dvc add` can be used to version control any data
+artifact (input, intermediate, or output files and directories, and model
+files). It is useful by itself to go back and forth between different versions
+of datasets or models. Usually though, it is recommended to use the `dvc run`
+and `dvc repro` mechanism to version control intermediate and final results
+(like models). This adds data provenance and makes your project reproducible.
## Options
diff --git a/static/docs/commands-reference/cache.md b/static/docs/commands-reference/cache.md
index 429f417439..9a8a26be76 100644
--- a/static/docs/commands-reference/cache.md
+++ b/static/docs/commands-reference/cache.md
@@ -21,9 +21,8 @@ default `cache` directory.
The DVC cache is where your data files, models, etc (anything you want to
version with DVC) are actually stored. The corresponding files you see in the
-working directory or "workspace" simply link to the ones in cache. (See
-`dvc config cache` `type` setting for more information on file links on
-different platforms.)
+workspace simply link to the ones in cache. (See the `type` config option in
+`dvc config cache` for more information on file links on different platforms.)
> For more cache-related configuration options refer to `dvc config cache`.
diff --git a/static/docs/commands-reference/cache_dir.md b/static/docs/commands-reference/cache_dir.md
index 0890694f08..95275daee7 100644
--- a/static/docs/commands-reference/cache_dir.md
+++ b/static/docs/commands-reference/cache_dir.md
@@ -1,4 +1,4 @@
-# dir
+# cache dir
Set/unset the cache directory location intuitively (compared to using
`dvc config cache`).
@@ -18,7 +18,7 @@ positional arguments:
Helper to set the `cache.dir` configuration option. Unlike doing so with
`dvc config cache`, this command transform paths (`value`) that are provided
-relative to the present working directory into paths **relative to the config
+relative to the current working directory into paths **relative to the config
file location**. They are required in the latter form for the config file.
## Options
@@ -29,12 +29,11 @@ file location**. They are required in the latter form for the config file.
- `--system` - modify a system config file (e.g. `/etc/dvc.config`) instead of
`.dvc/config`.
-- `--local` - modify a local
- [config file](/doc/user-guide/dvc-files-and-directories) instead of
- `.dvc/config`. It is located in `.dvc/config.local` and is Git-ignored. This
- is useful when you need to specify private config options in your config that
- you don't want to track and share through Git (credentials, private locations,
- etc).
+- `--local` - modify a local [config file](/doc/commands-reference/config)
+ instead of `.dvc/config`. It is located in `.dvc/config.local` and is
+ Git-ignored. This is useful when you need to specify private config options in
+ your config that you don't want to track and share through Git (credentials,
+ private locations, etc).
- `-u`, `--unset` - remove the `cache.dir` config option from the config file.
Don't provide a `value` when using this flag.
diff --git a/static/docs/commands-reference/checkout.md b/static/docs/commands-reference/checkout.md
index ef73a0b025..71abd16840 100644
--- a/static/docs/commands-reference/checkout.md
+++ b/static/docs/commands-reference/checkout.md
@@ -179,7 +179,7 @@ MD5 (model.pkl) = 3863d0e317dee0a55c4e59d2ec0eef33
```
What if we want to rewind history, so to speak? The `git checkout` command lets
-us checkout at any point in the commit history, or even check out other tags. It
+us checkout at any point in the commit history, or even checkout other tags. It
automatically adjusts the files, by replacing file content and adding or
deleting files as necessary.
diff --git a/static/docs/commands-reference/commit.md b/static/docs/commands-reference/commit.md
index 925719f064..5576b2b778 100644
--- a/static/docs/commands-reference/commit.md
+++ b/static/docs/commands-reference/commit.md
@@ -55,12 +55,12 @@ to the DVC cache as the last step. What _commit_ means is that DVC:
- Adds the file/directory or to the DVC cache.
There are many cases where the last step is not desirable (usually, rapid
-iteration on some experiment). For the DVC commands where it is appropriate the
-`--no-commit` option prevents the last step from occurring - thus, we are saving
-some time and space, by not storing all the data artifacts for all the attempts
-we do. The checksum is still computed and added to the DVC-file, but the file is
-not added to the cache. That's where the `dvc commit` command comes into play.
-It handles that last step of adding the file to the DVC cache.
+iteration on some experiment). For the DVC commands where it is available, the
+`--no-commit` option prevents the last step from occurring, which saves time
+and space by not storing the data artifacts produced by every
+command attempt. The checksum is still computed and added to the DVC-file, but
+the file is not added to the cache. That's where the `dvc commit` command comes
+into play. It handles that last step of adding the file to the DVC cache.
## Options
diff --git a/static/docs/commands-reference/config.md b/static/docs/commands-reference/config.md
index 0dcea52aea..c25b227fd0 100644
--- a/static/docs/commands-reference/config.md
+++ b/static/docs/commands-reference/config.md
@@ -19,7 +19,7 @@ You can query/set/replace/unset DVC configuration options with this command. It
takes a config option `name` (a section and a key, separated by a dot) and its
`value` (any valid alpha-numeric string generally).
-This command reads and overwrites the DVC config file `.dvc/config`. If
+This command reads and overwrites the DVC configuration file `.dvc/config`. If
`--local` option is specified, `.dvc/config.local` is modified instead.
If the config option `value` is not provided and `--unset` option is not used,
@@ -95,7 +95,7 @@ details.)
config location results in `.dvc/cache`.
> See also helper command `dvc cache dir` to intuitively set this config
- > option, properly transforming paths relative to the present working
+ > option, properly transforming paths relative to the current working
> directory into paths relative to the config file location.
- `cache.protected` - makes files in the workspace read-only. Possible values
@@ -103,8 +103,8 @@ details.)
effect. (It affects only files that are under DVC control.)
Due to the way DVC handles linking between the data files in the cache and
- their counterparts in the working directory, it's easy to accidentally corrupt
- the cached version of a file by editing or overwriting it. Turning this config
+ their counterparts in the workspace, it's easy to accidentally corrupt the
+ cached version of a file by editing or overwriting it. Turning this config
option on forces you to run `dvc unprotect` before updating a file, providing
an additional layer of security to your data.
@@ -158,7 +158,7 @@ details.)
### state
-State config options. Check the
+State config options. See
[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories) to learn
more about the state file that is used for optimization.
diff --git a/static/docs/commands-reference/diff.md b/static/docs/commands-reference/diff.md
index acf5010b22..4df02d7789 100644
--- a/static/docs/commands-reference/diff.md
+++ b/static/docs/commands-reference/diff.md
@@ -37,7 +37,7 @@ by the Git SCM, for example when `dvc init` was used with the `--no-scm` option.
- `-t TARGET`, `--target TARGET` - Source path to a data file or directory. If
not specified, compares all files and directories that are under DVC control
- in the current workspace.
+ in the workspace.
- `-h`, `--help` - prints the usage/help message, and exit.
diff --git a/static/docs/commands-reference/gc.md b/static/docs/commands-reference/gc.md
index ff857111b0..a17e86ae5a 100644
--- a/static/docs/commands-reference/gc.md
+++ b/static/docs/commands-reference/gc.md
@@ -69,7 +69,7 @@ $ du -sh .dvc/cache/
```
When you run `dvc gc` it removes all objects from cache that are not referenced
-in the current workspace (by collecting hash sums from the DVC-files):
+in the workspace (by collecting hash sums from the DVC-files):
```dvc
$ dvc gc
diff --git a/static/docs/commands-reference/get-url.md b/static/docs/commands-reference/get-url.md
new file mode 100644
index 0000000000..a3816f0e34
--- /dev/null
+++ b/static/docs/commands-reference/get-url.md
@@ -0,0 +1,158 @@
+# get-url
+
+Download or copy file or directory from any supported URL (for example `s3://`,
+`ssh://`, and other protocols) or local directory to the local file system.
+
+> Unlike `dvc import-url`, this command does not track the downloaded data
+> file(s) (does not create a DVC-file).
+
+## Synopsis
+
+```usage
+usage: dvc get-url [-h] [-q | -v] url [out]
+
+positional arguments:
+ url (See supported URLs in the description.)
+ out Destination path to put data to.
+```
+
+## Description
+
+In some cases it's convenient to get a data file or directory from a remote
+location into the current working directory, regardless of whether it's a DVC
+project. The `dvc get-url` command helps the user do just that.
+
+The `url` argument should provide the location of the data to be downloaded,
+while `out` can be used to specify the (path and) file name desired for the
+downloaded data file or directory.
+
+Note that this command doesn't require an existing DVC project to run in. It's a
+single-purpose command that can be used out of the box after installing DVC.
+
+> See `dvc get` to download data or model files or directories from other DVC
+> repositories (e.g. GitHub URLs).
+
+DVC supports several types of (local or) remote locations (protocols):
+
+| Type | Discussion | URL format |
+| ------- | ------------------------------------------------------- | ------------------------------------------ |
+| `local` | Local path | `/path/to/local/file` |
+| `s3` | Amazon S3 | `s3://mybucket/data.csv` |
+| `gs` | Google Storage | `gs://mybucket/data.csv` |
+| `ssh` | SSH server | `ssh://user@example.com:/path/to/data.csv` |
+| `hdfs` | HDFS | `hdfs://user@example.com/path/to/data.csv` |
+| `http` | HTTP to file with _strong ETag_ (see explanation below) | `https://example.com/path/to/data.csv` |
+
+> Depending on the type of remote location you plan to download data from, you
+> might need to specify one of the optional dependencies: `[s3]`, `[ssh]`,
+> `[gs]`, `[azure]`, and `[oss]` (or `[all]` to include them all) when
+> [installing DVC](/doc/get-started/install) with `pip`.
+
+Another way to understand the `dvc get-url` command is as a tool for downloading
+data files.
+
+On GNU/Linux systems, for example, instead of `dvc get-url` with HTTP(S) it's
+possible to use:
+
+```dvc
+$ wget https://example.com/path/to/data.csv
+```
+
+## Options
+
+- `-h`, `--help` - prints the usage/help message, and exit.
+
+- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
+ problems arise, otherwise 1.
+
+- `-v`, `--verbose` - displays detailed tracing information.
+
+## Examples
+
+
+
+### Click and expand for a local example
+
+```dvc
+$ dvc get-url /local/path/to/data
+```
+
+The above command will copy the `/local/path/to/data` file or directory into
+the current working directory.
+
+
+
+
+
+### Click for AWS S3 example
+
+This command will copy an S3 object into the current working directory with the
+same file name:
+
+```dvc
+$ dvc get-url s3://bucket/path
+```
+
+By default, DVC expects that your AWS CLI is already
+[configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html).
+DVC will use the default AWS credentials file to access S3. To override some of
+these settings, you can use the options described in `dvc remote modify`.
+
+> We use the `boto3` library to communicate with AWS S3. The following API
+> methods may be performed:
+>
+> - `head_object`
+> - `download_file`
+>
+> So make sure you have the `s3:GetObject` permission enabled.
+
+
+
+
+
+### Click for Google Cloud Storage example
+
+```dvc
+$ dvc get-url gs://bucket/path file
+```
+
+The above command downloads the `/path` file (or directory) into `./file`.
+
+
+
+
+
+### Click for SSH example
+
+```dvc
+$ dvc get-url ssh://user@example.com/path/to/data
+```
+
+Using default SSH credentials, the above command gets the `data` file (or
+directory).
+
+
+
+
+
+### Click for HDFS example
+
+```dvc
+$ dvc get-url hdfs://user@example.com/path/to/data
+```
+
+
+
+
+
+### Click for HTTP example
+
+> Both HTTP and HTTPS protocols are supported.
+
+```dvc
+$ dvc get-url https://example.com/path/to/data
+```
+
+
+
+
diff --git a/static/docs/commands-reference/get.md b/static/docs/commands-reference/get.md
new file mode 100644
index 0000000000..eb124fbfdf
--- /dev/null
+++ b/static/docs/commands-reference/get.md
@@ -0,0 +1,47 @@
+# get
+
+Download or copy file or directory from another DVC repository (on a Git server
+such as GitHub) into the local file system.
+
+> Unlike `dvc import`, this command does not track the downloaded data file(s)
+> (does not create a DVC-file).
+
+## Synopsis
+
+```usage
+usage: dvc get [-h] [-q | -v] [-o [OUT]] [--rev [REV]] url path
+
+positional arguments:
+ url URL of Git repository with DVC project to download from.
+ path Path to data within DVC repository.
+```
+
+## Description
+
+DVC provides an easy way to reuse datasets, intermediate results, ML models, or
+other files and directories tracked in another DVC repository, in the current
+working directory, regardless of whether it's a DVC project. The `dvc get`
+command downloads such a data artifact.
+
+The `url` argument specifies the external DVC project's Git repository URL (both
+HTTP and SSH protocols supported, e.g. `[user@]server:project.git`), while
+`path` is used to specify the path to the data to be downloaded within the repo.
+
+Note that this command doesn't require an existing DVC project to run in. It's a
+single-purpose command that can be used out of the box after installing DVC.
+
+> See `dvc get-url` to download data from other supported URLs.
+
+After running this command successfully, the data found in the `url` `path` is
+created in the current working directory with its original file name.
+
+## Options
+
+- `-h`, `--help` - prints the usage/help message, and exit.
+
+- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
+ problems arise, otherwise 1.
+
+- `-v`, `--verbose` - displays detailed tracing information.
+
+
diff --git a/static/docs/commands-reference/import-url.md b/static/docs/commands-reference/import-url.md
index 0e99208b6d..ad56d97afe 100644
--- a/static/docs/commands-reference/import-url.md
+++ b/static/docs/commands-reference/import-url.md
@@ -1,8 +1,11 @@
# import-url
-Import file from any supported URL (it could be `http://`, as well as `s3://`,
-`ssh://`, and other supported external storage URLs) or local directory to local
-workspace and track changes in remote file or directory.
+Download or copy file or directory from any supported URL (for example `s3://`,
+`ssh://`, and other protocols) or local directory to the workspace,
+and track changes in the remote source with DVC. Creates a DVC-file.
+
+> See also `dvc get-url` which corresponds to the first step this command
+> performs (just download the data).
## Synopsis
@@ -16,30 +19,32 @@ positional arguments:
## Description
-In some cases it is convenient to add a data file or a directory to a workspace
-such that it will be automatically updated when the data source is updated.
-Examples:
+In some cases it's convenient to add a data file or directory from a remote
+location into the workspace, such that it will be automatically updated when the
+external data source changes. Examples:
- A remote system may produce occasional data files that are used in other
projects.
- A batch process running regularly updates a data file to import.
- A shared dataset on a remote storage that is managed and updated outside DVC.
-DVC supports [DVC-files](/doc/user-guide/dvc-file-format) which refer to an
-external data location, see
-[External Dependencies](/doc/user-guide/external-dependencies). In such a DVC
-file, the `deps` section specifies a remote URL, and the `outs` section lists
-the corresponding local path in the workspace. It records enough data from the
-remote file or directory to enable DVC to efficiently check it to determine if
-the local copy is out of date. DVC uses this remote URL to download the data to
-the workspace initially, and to re-download it upon changes.
-
The `dvc import-url` command helps the user create such an external data
-dependency. The `url` argument should provide the location of the data to be
-imported, while `out` is used to specify the (path and) name of the imported
-data file or directory in the workspace.
+dependency. The `url` argument specifies the external location of the data to be
+imported, while `out` can be used to specify the (path and) file name desired
+for the imported data file or directory in the workspace.
+
+> See `dvc import` to download and track data or model files or directories from
+> other DVC repositories (e.g. GitHub URLs).
+
+DVC supports [DVC-files](/doc/user-guide/dvc-file-format) which refer to data in
+an external location, see
+[External Dependencies](/doc/user-guide/external-dependencies). In such a
+DVC-file, the `deps` section stores the remote URL, and the `outs` section
+contains the corresponding local path in the workspace. It records enough data
+from the external file or directory to enable DVC to efficiently check it to
+determine whether the local copy is out of date.
-DVC supports several types of (local or) remote locations:
+DVC supports several types of (local or) remote locations (protocols):
| Type | Discussion | URL format |
| -------- | ------------------------------------------------------- | ------------------------------------------ |
@@ -51,15 +56,20 @@ DVC supports several types of (local or) remote locations:
| `http` | HTTP to file with _strong ETag_ (see explanation below) | `https://example.com/path/to/data.csv` |
| `remote` | Remote path (see explanation below) | `remote://myremote/path/to/file` |
+> Depending on the type of remote location you plan to download data from, you
+> might need to specify one of the optional dependencies: `[s3]`, `[ssh]`,
+> `[gs]`, `[azure]`, and `[oss]` (or `[all]` to include them all) when
+> [installing DVC](/doc/get-started/install) with `pip`.
+
> In case of HTTP,
> [strong ETag](https://en.wikipedia.org/wiki/HTTP_ETag#Strong_and_weak_validation)
> is necessary to track if the specified remote file (URL) changed to download
> it again.
-> `remote://myremote/path/to/file` notation just means that there is a DVC
+> `remote://myremote/path/to/file` notation just means that a DVC
> [remote](/doc/commands-reference/remote) `myremote` is defined and when DVC is
-> running it internally expands this URL into a regular S3, SSH, GS, etc URL by
-> appending `/path/to/file` to the `myremote`'s configured base path.
+> running, it automatically expands this URL into a regular S3, SSH, GS, etc
+> URL by appending `/path/to/file` to `myremote`'s configured base path.
Another way to understand the `dvc import-url` command is as a short-cut for a
more verbose `dvc run` command. This is discussed in the
@@ -72,7 +82,7 @@ Instead of `dvc import-url`:
$ dvc import-url https://example.com/path/to/data.csv data.csv
```
-It is possible to instead use `dvc run`:
+It is possible to instead use `dvc run`, for example (HTTP URL):
```dvc
$ dvc run -d https://example.com/path/to/data.csv \
@@ -80,10 +90,10 @@ $ dvc run -d https://example.com/path/to/data.csv \
wget https://example.com/path/to/data.csv -O data.csv
```
-Both methods generate a stage file (DVC-file) with an external dependency, and
-they produce equivalent results. The `dvc import-url` command saves the user
-from having to manually copy files from each of the remote storage schemes, and
-from having to install CLI tools for each service.
+Both methods generate an equivalent stage file (DVC-file) with an external
+dependency. The `dvc import-url` command saves the user from having to manually
+copy files from each of the remote storage schemes, and from having to install
+CLI tools for each service.
When DVC inspects a DVC-file, its dependencies will be checked to see if any
have changed. A changed dependency will appear in the `dvc status` report,
@@ -142,6 +152,29 @@ Now, we can install requirements for the project:
$ pip install -r requirements.txt
```
+
+
+### Click for AWS S3 example
+
+This command will copy an S3 object into the current working directory with the
+same file name:
+
+```dvc
+$ dvc get-url s3://bucket/path
+```
+
+Note that the examples use
+
+> We use the `boto3` library to communicate with AWS S3. The following API
+> methods may be performed:
+>
+> - `head_object`
+> - `download_file`
+>
+> So make sure you have the `s3:GetObject` permission enabled.
+
+
+
## Example: Tracking a remote file
@@ -215,8 +248,9 @@ file has changed.
## Example: Detecting remote file changes
What if that remote file is one which will be updated regularly? The project
-goal might include regenerating some artifact based on the updated data. A
-pipeline can be triggered to re-execute based on a changed external dependency.
+goal might include regenerating a data artifact based on the
+updated source. A pipeline can be triggered to re-execute based on a changed
+external dependency.
Let us again use the [Getting Started](/doc/get-started) example, in a way which
will mimic an updated external data source.
diff --git a/static/docs/commands-reference/import.md b/static/docs/commands-reference/import.md
new file mode 100644
index 0000000000..aef93621de
--- /dev/null
+++ b/static/docs/commands-reference/import.md
@@ -0,0 +1,69 @@
+# import
+
+> **Note!** This command has been repurposed after its original release. The
+> previous version is still available as the `dvc import-url` command.
+
+Download or copy file or directory from another DVC repository (on a Git server
+such as GitHub) into the workspace, and track changes in the remote
+source with DVC. Creates a DVC-file.
+
+> See also `dvc get` which corresponds to the first step this command performs
+> (just download the data).
+
+## Synopsis
+
+```usage
+usage: dvc import [-h] [-q | -v] [-o [OUT]] [--rev [REV]] url path
+
+positional arguments:
+ url URL of Git repository with DVC project to download from.
+ path Path to data within DVC repository.
+```
+
+## Description
+
+DVC provides an easy way to reuse datasets, intermediate results, ML models, or
+other files and directories tracked in another DVC repository, in the
+workspace. The `dvc import` command downloads such a data
+artifact in a way that it can be tracked with DVC, resulting in automatic
+updates when the external data source changes.
+
+The `url` argument specifies the external DVC project's Git repository URL (both
+HTTP and SSH protocols supported, e.g. `[user@]server:project.git`), while
+`path` is used to specify the path to the data to be downloaded within the repo.
+
+> See `dvc import-url` to download and track data from other supported URLs.
+
+After running this command successfully, the data found in the `url` `path` is
+created in the current working directory with its original file name, e.g.
+`data.txt`. An import stage (DVC-file) is then created (similar to having used
+`dvc run` to generate the same output), named by appending `.dvc` to the full
+file or directory name of the imported data, e.g. `data.txt.dvc`.
+
+DVC supports [DVC-files](/doc/user-guide/dvc-file-format) which refer to data in
+an external DVC repository (hosted on a Git server). In such a DVC-file, the
+`deps` section specifies the `repo` URL and data `path`, and the `outs` section
+contains the corresponding local path in the workspace. It records enough data
+from the external file or directory to enable DVC to efficiently check it to
+determine whether the local copy is out of date.
+
+To actually [track the data](https://dvc.org/doc/get-started/add-files),
+`git add` (and `git commit`) the import stage (DVC-file).
+
+## Options
+
+- `-o`, `--out` - specify a location in the workspace to place the imported data
+ in, as a path to the desired directory. The default value (when this option
+ isn't used) is the current working directory (`.`).
+
+- `--rev` - specific Git revision of the DVC repository to import the data from.
+ `HEAD` by default.
+
+- `-h`, `--help` - prints the usage/help message, and exit.
+
+- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
+ problems arise, otherwise 1.
+
+- `-v`, `--verbose` - displays detailed tracing information.
+
+
diff --git a/static/docs/commands-reference/index.md b/static/docs/commands-reference/index.md
index d95f92ca0f..5d64dc2853 100644
--- a/static/docs/commands-reference/index.md
+++ b/static/docs/commands-reference/index.md
@@ -10,7 +10,8 @@ DVC is a command-line tool. The typical use case for DVC goes as follows
- Use `--outs` option to specify `dvc run` command outputs which will be
converted to DVC data files after the code runs.
- Clone a git repo with the code of your ML application pipeline. However, this
- will not copy your DVC cache. Use cloud storage settings and `dvc push` to
- share the cache (data).
+ will not copy your DVC cache. Use
+ [data remotes](/doc/commands-reference/remote) and `dvc push` to share the
+ cache (data).
- Use `dvc repro` to quickly reproduce your pipeline on a new iteration, after
your data item files or source code of your ML application are modified.
diff --git a/static/docs/commands-reference/init.md b/static/docs/commands-reference/init.md
index 7432978587..e6571e5729 100644
--- a/static/docs/commands-reference/init.md
+++ b/static/docs/commands-reference/init.md
@@ -1,6 +1,9 @@
# init
-This command initializes a DVC environment in a current Git repository.
+This command initializes a DVC project in a directory.
+
+Note that by default the current working directory is expected to contain a Git
+repository, unless the `--no-scm` option is used.
## Synopsis
@@ -8,10 +11,22 @@ This command initializes a DVC environment in a current Git repository.
usage: dvc init [-h] [-q | -v] [--no-scm] [-f]
```
+## Description
+
+After DVC initialization, a new directory `.dvc/` will be created with `config`
+and `.gitignore` files and `cache` directory. These files and directories are
+hidden from the user generally and are not meant to be manipulated directly.
+
+`.dvc/cache` is one of the most important
+[DVC directories](/doc/user-guide/dvc-files-and-directories). It will hold all
+the contents of tracked data files. Note that `.dvc/.gitignore` lists this
+directory, which means that the cache directory is not under Git control. This
+is your local cache and you cannot push it to any Git remote.
+
## Options
-- `--no-scm` - skip Git specific initializations, `.dvc/.gitignore` will not be
- populated and added to Git.
+- `--no-scm` - skip Git-specific initialization; `.dvc/.gitignore` will not be
+ written.
- `-f`, `--force` - remove `.dvc/` if it exists before initialization. Will
remove all local cache. Useful when first `dvc init` got corrupted for some
@@ -24,21 +39,9 @@ usage: dvc init [-h] [-q | -v] [--no-scm] [-f]
- `-v`, `--verbose` - displays detailed tracing information.
-## Details
-
-After DVC initialization, a new directory `.dvc/` will be created with `config`
-and `.gitignore` files and `cache` directory. These files and directories are
-hidden from the user generally and are not meant to be manipulated directly.
-
-`.dvc/cache directory` is one of the most important parts of any DVC
-repositories. The directory contains all content of data files. The most
-important part about this directory is that `.dvc/.gitignore` file is containing
-this directory which means that the cache directory is not under Git control —
-this is your local directory and you cannot push it to any Git remote.
-
## Examples
-- Creating a new DVC repository:
+- Creating a new DVC repository (requires a Git repository):
```dvc
$ mkdir tag_classifier
diff --git a/static/docs/commands-reference/install.md b/static/docs/commands-reference/install.md
index 9ebce721cf..0e99c80c7f 100644
--- a/static/docs/commands-reference/install.md
+++ b/static/docs/commands-reference/install.md
@@ -154,7 +154,7 @@ bigrams-experiment
These tags are used to mark points in the development of this workspace, and to
document specific experiments conducted in the workspace. To take a look at one
-we check-out the workspace using the SCM (in this case Git):
+we checkout the workspace using the SCM (in this case Git):
```dvc
$ git checkout 6-featurization
diff --git a/static/docs/commands-reference/metrics.md b/static/docs/commands-reference/metrics.md
index f84266b368..b52db7e2f8 100644
--- a/static/docs/commands-reference/metrics.md
+++ b/static/docs/commands-reference/metrics.md
@@ -56,7 +56,7 @@ $ dvc run -d code/evaluate.py -M data/eval.json \
> running `dvc metrics add data/eval.json` to explicitly mark `data/eval.json`
> as a metric file.
-Now let's print metric values that we are tracking in the current project:
+Now let's print metric values that we are tracking in this DVC project:
```dvc
$ dvc metrics show -a
diff --git a/static/docs/commands-reference/metrics_add.md b/static/docs/commands-reference/metrics_add.md
index 73365e6dfc..92ca0ebf31 100644
--- a/static/docs/commands-reference/metrics_add.md
+++ b/static/docs/commands-reference/metrics_add.md
@@ -1,4 +1,4 @@
-# add
+# metrics add
Tag the file located at `path` as a metric file.
@@ -41,8 +41,8 @@ contains multiple metrics.
`dvc metrics show`. Accepted value depends on the metric file type (`-t`
option):
- - `json` - check [JSONPath spec](https://goessner.net/articles/JsonPath/) to
- see available options. For example, `"AUC"` extracts the value from the
+ - `json` - see [JSONPath spec](https://goessner.net/articles/JsonPath/) for
+ available options. For example, `"AUC"` extracts the value from the
following json-formatted metric file: `{"AUC": "0.624652"}`.
- `tsv`/`csv` - `row,column`, e.g. `1,2`. Indices are 0-based.
- `htsv`/`hcsv` - `row,column name`. Row index is 0-based. First row is used
diff --git a/static/docs/commands-reference/metrics_modify.md b/static/docs/commands-reference/metrics_modify.md
index 6fef1ef0da..1f94394380 100644
--- a/static/docs/commands-reference/metrics_modify.md
+++ b/static/docs/commands-reference/metrics_modify.md
@@ -1,4 +1,4 @@
-# modify
+# metrics modify
Modify metric settings (like type, path expression that is used to parse it,
etc).
@@ -46,8 +46,8 @@ ERROR: failed to modify metric file settings -
`dvc metrics show`. Accepted value depends on the metric file type (`-t`
option):
- - `json` - check [JSONPath spec](https://goessner.net/articles/JsonPath/) to
- see available options. For example, `"AUC"` extracts the value from the
+ - `json` - see [JSONPath spec](https://goessner.net/articles/JsonPath/) for
+ available options. For example, `"AUC"` extracts the value from the
following json-formatted metric file: `{"AUC": "0.624652"}`.
- `tsv`/`csv` - `row,column`, e.g. `1,2`. Indices are 0-based.
- `htsv`/`hcsv` - `row,column name`. Row index is 0-based. First row is used
diff --git a/static/docs/commands-reference/metrics_remove.md b/static/docs/commands-reference/metrics_remove.md
index 19b6c55764..9df94a4a44 100644
--- a/static/docs/commands-reference/metrics_remove.md
+++ b/static/docs/commands-reference/metrics_remove.md
@@ -1,4 +1,4 @@
-# remove
+# metrics remove
Keep file as an output, remove metric flag and stop tracking as a metric file.
diff --git a/static/docs/commands-reference/metrics_show.md b/static/docs/commands-reference/metrics_show.md
index 5e77913891..b1353e6335 100644
--- a/static/docs/commands-reference/metrics_show.md
+++ b/static/docs/commands-reference/metrics_show.md
@@ -1,4 +1,4 @@
-# show
+# metrics show
Find and print project metrics.
@@ -51,14 +51,13 @@ supported.
corresponding format in this case. Accepted value depends on the metric file
type (`-t` option):
- - `json` - check [JSONPath spec](https://goessner.net/articles/JsonPath/) or
- [jsonpath-ng](https://github.com/h2non/jsonpath-ng) to see available
- options. For example, `"AUC"` extracts the value from the following
- json-formatted metric file: `{"AUC": "0.624652"}`. You can also filter on
- certain values. For example,
- `"$.metrics[?(@.deviation_mse<0.30) & (@.value_mse>0.4)]"` extracts only the
- values for model versions if they meet the given condition(s) from the
- metric file:
+ - `json` - see [JSONPath spec](https://goessner.net/articles/JsonPath/) or
+ [jsonpath-ng](https://github.com/h2non/jsonpath-ng) for available options.
+ For example, `"AUC"` extracts the value from the following json-formatted
+ metric file: `{"AUC": "0.624652"}`. You can also filter on certain values.
+ For example, `"$.metrics[?(@.deviation_mse<0.30) & (@.value_mse>0.4)]"`
+ extracts only the values for model versions if they meet the given
+ condition(s) from the metric file:
`{"metrics": [{"dataset": "train", "deviation_mse": 0.173461, "value_mse": 0.421601}]}`
- `tsv`/`csv` - `row,column`, e.g. `1,2`. Indices are 0-based.
- `htsv`/`hcsv` - `row,column name`. Row index is 0-based. First row is used
diff --git a/static/docs/commands-reference/pipeline_list.md b/static/docs/commands-reference/pipeline_list.md
index 2b213c4842..71a8bdedf9 100644
--- a/static/docs/commands-reference/pipeline_list.md
+++ b/static/docs/commands-reference/pipeline_list.md
@@ -1,4 +1,4 @@
-# list
+# pipeline list
Show connected groups (pipelines) of [stage](/doc/commands-reference/run) that
are independent of each other.
diff --git a/static/docs/commands-reference/pipeline_show.md b/static/docs/commands-reference/pipeline_show.md
index 0c1f6d632c..f3e171e5c3 100644
--- a/static/docs/commands-reference/pipeline_show.md
+++ b/static/docs/commands-reference/pipeline_show.md
@@ -1,4 +1,4 @@
-# show
+# pipeline show
Show [stages](/doc/commands-reference/run) in a pipeline that lead to the
specified stage. By default it lists
diff --git a/static/docs/commands-reference/pull.md b/static/docs/commands-reference/pull.md
index bd74a7aea5..d53b76c630 100644
--- a/static/docs/commands-reference/pull.md
+++ b/static/docs/commands-reference/pull.md
@@ -85,10 +85,10 @@ reflinks or hardlinks to put it in the workspace without copying. See
path for this option to have effect. Determines the files to pull by searching
each target directory and its subdirectories for DVC-files to inspect.
-- `-f`, `--force` - does not prompt when removing working directory files, which
- occurs during the process of updating the workspace. This option surfaces
- behavior from the `dvc checkout` command because `dvc pull` in effect performs
- a _checkout_ after downloading files.
+- `-f`, `--force` - does not prompt when removing workspace files, which occurs
+ during the process of updating the workspace. This option surfaces behavior
+ from the `dvc checkout` command because `dvc pull` in effect performs a
+ _checkout_ after downloading files.
- `-j JOBS`, `--jobs JOBS` - specifies number of jobs to run simultaneously
while downloading files from the remote cache. The effect is to control the
@@ -116,7 +116,7 @@ $ dvc remote list
r1 ssh://_username_@_host_/path/to/dvc/cache/directory
```
-> DVC supports several protocols for remote storage. For details, see the
+> DVC supports several remote types. For details, see the
> [`remote add`](/doc/commands-reference/remote-add) documentation.
With a remote cache containing some images and other files, we can pull all
diff --git a/static/docs/commands-reference/push.md b/static/docs/commands-reference/push.md
index e503d332f5..62c68beee7 100644
--- a/static/docs/commands-reference/push.md
+++ b/static/docs/commands-reference/push.md
@@ -122,7 +122,7 @@ the example, let's define an SSH remote with the `dvc remote add` command:
r1 ssh://_username_@_host_/path/to/dvc/cache/directory
```
-> DVC supports several protocols for remote storage. For details, see the
+> DVC supports several remote types. For details, see the
> [`remote add`](/doc/commands-reference/remote-add) documentation.
Push all data file caches from the current Git branch to the default remote:
diff --git a/static/docs/commands-reference/remote.md b/static/docs/commands-reference/remote.md
index 377f305447..213e550371 100644
--- a/static/docs/commands-reference/remote.md
+++ b/static/docs/commands-reference/remote.md
@@ -33,11 +33,11 @@ models and re-process data files. It also saves space on your local
environment - DVC can [fetch](/doc/commands-reference/fetch) into the local
cache only the data you need for a specific branch/commit.
-> If you installed DVC via `pip`, and depending on the remote type you plan to
-> use you might need to install optional dependencies: `s3`, `gs`, `azure`,
-> `ssh`. Or `all_remotes` to include them all. The command should look like
-> this: `pip install -U "dvc[s3]"` - it installs `boto3` library along with DVC
-> to support AWS S3 storage.
+> If you installed DVC via `pip`, depending on the remote type you plan to use
+> you might need to install optional dependencies: `[s3]`, `[ssh]`, `[gs]`,
+> `[azure]`, and `[oss]`; or `[all]` to include them all. The command should
+> look like this: `pip install "dvc[s3]"` - it installs `boto3` library along
+> with DVC to support AWS S3 storage.
Using DVC with a remote data storage is optional. By default, DVC is configured
to use a local data storage only (usually `.dvc/cache` directory inside your
@@ -48,8 +48,8 @@ repository), which enables basic DVC usage scenarios out of the box.
[list](/doc/commands-reference/remote-list),
[modify](/doc/commands-reference/remote-modify), and
[remove](/doc/commands-reference/remote-remove) commands read or modify DVC
-[config files](/doc/user-guide/dvc-files-and-directories). Alternatively,
-`dvc config` can be used or these files could be edited manually.
+[config files](/doc/commands-reference/config). Alternatively, `dvc config` can
+be used or these files could be edited manually.
For the typical process to share the project via remote, see
[Share Data And Model Files](/doc/use-cases/share-data-and-model-files).
diff --git a/static/docs/commands-reference/remote_add.md b/static/docs/commands-reference/remote_add.md
index feda3b9e55..491b489351 100644
--- a/static/docs/commands-reference/remote_add.md
+++ b/static/docs/commands-reference/remote_add.md
@@ -18,7 +18,7 @@ usage: dvc remote add [-h] [--global] [--system] [--local] [-q | -v]
positional arguments:
name Name of the remote.
- url URL. (See supported URLs below.)
+ url URL. (See supported URLs in the examples below.)
```
## Description
@@ -26,20 +26,20 @@ positional arguments:
`name` and `url` are required. `url` specifies a location to store your data. It
could be S3 path, SSH path, Azure, Google cloud, Aliyun OSS local directory,
etc. (See more examples below.) If `url` is a local relative path, it will be
-resolved relative to the current directory but saved **relative to the config
-file location** (see LOCAL example below). Whenever possible DVC will create a
-remote directory if it doesn't exists yet. It won't create an S3 bucket though
-and will rely on default access settings.
+resolved relative to the current working directory but saved **relative to the
+config file location** (see LOCAL example below). Whenever possible DVC will
+create a remote directory if it doesn't exist yet. It won't create an S3 bucket
+though and will rely on default access settings.
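
The relative-path rule described above can be illustrated with a small Python sketch. This is not DVC's own code: the helper name `saved_remote_path` and the example directories are hypothetical, and the sketch only mirrors the stated behavior (resolve against the current working directory, then store relative to the config file's directory).

```python
import posixpath


def saved_remote_path(url, cwd, config_dir):
    """Illustrative only: re-express a relative local remote path.

    The path is first resolved against the current working directory,
    then rewritten relative to the directory holding the config file.
    """
    absolute = posixpath.normpath(posixpath.join(cwd, url))
    return posixpath.relpath(absolute, config_dir)


# Hypothetical layout: the command is run from the project root,
# and the config file lives in the project's .dvc/ directory.
print(saved_remote_path("../dvc-storage",
                        "/home/user/project",
                        "/home/user/project/.dvc"))
```

Under these assumptions, a path typed as `../dvc-storage` from the project root would be stored with one extra `../`, because `.dvc/` sits one level below the directory the command was run from.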
-> If you installed DVC via `pip`, and depending on the remote type you plan to
-> use you might need to install optional dependencies: `s3`, `gs`, `azure`,
-> `ssh`. Or `all_remotes` to include them all. The command should look like
-> this: `pip install -U "dvc[s3]"` - it installs `boto3` library along with DVC
-> to support AWS S3 storage.
+> If you installed DVC via `pip`, depending on the remote type you plan to use
+> you might need to install optional dependencies: `[s3]`, `[ssh]`, `[gs]`,
+> `[azure]`, and `[oss]`; or `[all]` to include them all. The command should
+> look like this: `pip install "dvc[s3]"` - it installs `boto3` library along
+> with DVC to support AWS S3 storage.
This command creates a section in the DVC
-[config file](/doc/user-guide/dvc-files-and-directories) and optionally assigns
-a default remote in the core section if the `--default` option is used:
+[config file](/doc/commands-reference/config) and optionally assigns a default
+remote in the core section if the `--default` option is used:
```ini
['remote "myremote"']
@@ -64,12 +64,11 @@ Use `dvc config` to unset/change the default remote as so:
- `--system` - save remote configuration to the system config (e.g.
`/etc/dvc.config`) instead of `.dvc/config`.
-- `--local` - modify a local
- [config file](/doc/user-guide/dvc-files-and-directories) instead of
- `.dvc/config`. It is located in `.dvc/config.local` and is Git-ignored. This
- is useful when you need to specify private config options in your config that
- you don't want to track and share through Git (credentials, private locations,
- etc).
+- `--local` - modify a local [config file](/doc/commands-reference/config)
+ instead of `.dvc/config`. It is located in `.dvc/config.local` and is
+ Git-ignored. This is useful when you need to specify private config options in
+ your config that you don't want to track and share through Git (credentials,
+ private locations, etc).
- `-d`, `-default` - commands like `dvc pull`, `dvc push`, `dvc fetch` will be
using this remote by default to save or retrieve data files unless `-r` option
@@ -79,6 +78,8 @@ Use `dvc config` to unset/change the default remote as so:
## Examples
+The following are the types of remotes (protocols) supported:
+
### Click for a local remote example
@@ -135,8 +136,8 @@ By default DVC expects your AWS CLI is already
DVC will be using default AWS credentials file to access S3. To override some of
these settings, you could the options described in `dvc remote modify`.
-We use `boto3` library to set up a client and communicate with AWS S3. The
-following API methods are performed:
+We use the `boto3` library to communicate with AWS S3. The following API methods
+are performed:
- `list_objects_v2`, `list_objects`
- `head_object`
@@ -147,10 +148,10 @@ following API methods are performed:
So, make sure you have the following permissions enabled:
-- s3:ListBucket
-- s3:GetObject
-- s3:PutObject
-- s3:DeleteObject
+- `s3:ListBucket`
+- `s3:GetObject`
+- `s3:PutObject`
+- `s3:DeleteObject`
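
As a rough sketch of how the permissions listed above might be granted, the following Python snippet builds an AWS IAM policy document. The function name `s3_remote_policy` and the bucket name `mybucket` are placeholders, not part of DVC; the sketch assumes the usual IAM convention that `s3:ListBucket` applies to the bucket itself while the object-level actions apply to the keys inside it.

```python
import json


def s3_remote_policy(bucket):
    """Build an illustrative IAM policy covering the four listed actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket resource itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions are granted on the keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# "mybucket" is a placeholder bucket name.
print(json.dumps(s3_remote_policy("mybucket"), indent=2))
```

The printed JSON can be adapted and attached to the user or role whose credentials are used for the remote; narrow the object resource to a prefix if the remote only uses part of the bucket.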
@@ -258,9 +259,8 @@ $ dvc remote add myremote hdfs://user@example.com/path/to/dir
> **Note!** Currently HTTP remotes only support downloads operations:
>
-> - `pull`
-> - `fetch`
-> - `import`
+> - `pull` and `fetch`
+> - `import-url` and `get-url`
> - As an [external dependency](/doc/user-guide/external-dependencies)
```dvc
diff --git a/static/docs/commands-reference/remote_default.md b/static/docs/commands-reference/remote_default.md
index 2dd2054741..0d9f0c4dc4 100644
--- a/static/docs/commands-reference/remote_default.md
+++ b/static/docs/commands-reference/remote_default.md
@@ -31,7 +31,7 @@ $ dvc remote default myremote
```
This command assigns the default remote in the core section of the DVC
-[config file](/doc/user-guide/dvc-files-and-directories).
+[config file](/doc/commands-reference/config).
```ini
[core]
@@ -42,10 +42,8 @@ For the commands which take a `--remote` option (`dvc pull`, `dvc push`,
`dvc status`, `dvc gc`, `dvc fetch`), default remote is used if that option is
not specified.
-You can also use [`dvc config`](/doc/user-guide/dvc-files-and-directories),
-[`dvc remote add`](/doc/commands-reference/remote-add) and
-[`dvc remote modify`](/doc/commands-reference/remote-modify) commands to
-set/unset/change the default remote configurations.
+You can also use `dvc config`, `dvc remote add` and `dvc remote modify` commands
+to set/unset/change the default remote configurations.
## Options
@@ -57,12 +55,11 @@ set/unset/change the default remote configurations.
- `--system` - save remote configuration to the system config (e.g.
`/etc/dvc.config`) instead of `.dvc/config`.
-- `--local` - modify a local
- [config file](/doc/user-guide/dvc-files-and-directories) instead of
- `.dvc/config`. It is located in `.dvc/config.local` and is Git-ignored. This
- is useful when you need to specify private config options in your config that
- you don't want to track and share through Git (credentials, private locations,
- etc).
+- `--local` - modify a local [config file](/doc/commands-reference/config)
+ instead of `.dvc/config`. It is located in `.dvc/config.local` and is
+ Git-ignored. This is useful when you need to specify private config options in
+ your config that you don't want to track and share through Git (credentials,
+ private locations, etc).
- `-h`, `--help` - prints the usage/help message and exit.
diff --git a/static/docs/commands-reference/remote_list.md b/static/docs/commands-reference/remote_list.md
index 20482407a0..87e890e823 100644
--- a/static/docs/commands-reference/remote_list.md
+++ b/static/docs/commands-reference/remote_list.md
@@ -26,10 +26,8 @@ Including names and URLs.
- `--system` - save remote configuration to the system config (e.g.
`/etc/dvc.config`) instead of `.dvc/config`.
-- `--local` - list remotes specified in the
- [local](/doc/user-guide/dvc-files-and-directories) configuration file
- (`.dvc/config.local`). Local configuration files stores private settings that
- should not be tracked by Git.
+- `--local` - read a local [config file](/doc/commands-reference/config) instead
+ of `.dvc/config`. It is located in `.dvc/config.local` and is Git-ignored.
## Examples
diff --git a/static/docs/commands-reference/remote_modify.md b/static/docs/commands-reference/remote_modify.md
index 8f6b1263ae..027c1c51c4 100644
--- a/static/docs/commands-reference/remote_modify.md
+++ b/static/docs/commands-reference/remote_modify.md
@@ -1,6 +1,6 @@
# remote modify
-Modify remote settings.
+Modify configuration of remotes.
> This command is commonly needed after `dvc remote add` or
> [default](/doc/commands-reference/remote-default) to setup credentials or
@@ -30,9 +30,9 @@ Remote `name` and `option` name are required. Option names are remote type
specific. See below examples and a list of per remote type - AWS S3, Google
cloud, Azure, SSH, ALiyun OSS, and others.
-This command modifies a `remote` section in the DVC
-[config file](/doc/user-guide/dvc-files-and-directories). Alternatively,
-`dvc config` or manual editing could be used to change settings.
+This command modifies a `remote` section in the DVC project's
+[config file](/doc/commands-reference/config). Alternatively, `dvc config` or
+manual editing could be used to change the configuration.
## Options
@@ -45,15 +45,16 @@ This command modifies a `remote` section in the DVC
- `--system` - save remote configuration to the system config (e.g.
`/etc/dvc.config`) instead of `.dvc/config`.
-- `--local` - modify a local
- [config file](/doc/user-guide/dvc-files-and-directories) instead of
- `.dvc/config`. It is located in `.dvc/config.local` and is Git-ignored. This
- is useful when you need to specify private config options in your config that
- you don't want to track and share through Git (credentials, private locations,
- etc).
+- `--local` - modify a local [config file](/doc/commands-reference/config)
+ instead of `.dvc/config`. It is located in `.dvc/config.local` and is
+ Git-ignored. This is useful when you need to specify private config options in
+ your config that you don't want to track and share through Git (credentials,
+ private locations, etc).
## Examples
+The following are the types of remotes (protocols) supported:
+
### Click for AWS S3 available options
diff --git a/static/docs/commands-reference/remote_remove.md b/static/docs/commands-reference/remote_remove.md
index 361ad73c29..2f759bd48a 100644
--- a/static/docs/commands-reference/remote_remove.md
+++ b/static/docs/commands-reference/remote_remove.md
@@ -23,8 +23,8 @@ positional arguments:
Remote `name` is required.
This command removes a section in the DVC
-[config file](/doc/user-guide/dvc-files-and-directories). Alternatively, it is
-possible to edit config files manually.
+[config file](/doc/commands-reference/config). Alternatively, it is possible to
+edit config files manually.
## Options
@@ -34,10 +34,9 @@ possible to edit config files manually.
- `--system` - save remote configuration to the system config (e.g.
`/etc/dvc.config`) instead of `.dvc/config`.
-- `--local` - remove remote specified in the
- [local](/doc/user-guide/dvc-files-and-directories) configuration file
- (`.dvc/config.local`). Local configuration files stores private settings or
- local environment specific settings that should not be tracked by Git.
+- `--local` - modify a local [config file](/doc/commands-reference/config)
+ instead of `.dvc/config`. It is located in `.dvc/config.local` and is
+ Git-ignored.
## Examples
diff --git a/static/docs/commands-reference/remove.md b/static/docs/commands-reference/remove.md
index 6c3459df16..e1879b6c78 100644
--- a/static/docs/commands-reference/remove.md
+++ b/static/docs/commands-reference/remove.md
@@ -19,8 +19,8 @@ positional arguments:
DVC-files in the workspace by default.)
```
-Check also [Update Tracked Files](/doc/user-guide/update-tracked-file) to see
-how it can be used to replace or modify files that are under DVC control.
+Refer to [Update Tracked Files](/doc/user-guide/update-tracked-file) to see how
+it can be used to replace or modify files that are under DVC control.
## Options
diff --git a/static/docs/commands-reference/repro.md b/static/docs/commands-reference/repro.md
index 7d3cefb224..19a2524c88 100644
--- a/static/docs/commands-reference/repro.md
+++ b/static/docs/commands-reference/repro.md
@@ -20,8 +20,8 @@ positional arguments:
`dvc repro` provides an interface to run the commands in a computational graph
(a.k.a. pipeline) again, as defined in the stage files (DVC-files) found in the
-current workspace. (A pipeline is typically defined using the `dvc run` command,
-while data input nodes are defined by the `dvc add` command.)
+workspace. (A pipeline is typically defined using the `dvc run` command, while
+data input nodes are defined by the `dvc add` command.)
There's a few ways to restrict the stages that will be run again by this
command: by specifying stage file(s) as `targets`, or by using the
@@ -112,7 +112,7 @@ specified), and updates stage files with the new checksum information.
## Examples
For simplicity, let's build a pipeline defined below (if you want get your hands
-on something more real, check this
+on something more real, see this
[mini-tutorial](/doc/get-started/example-pipeline)). It takes this `text.txt`
file:
diff --git a/static/docs/commands-reference/run.md b/static/docs/commands-reference/run.md
index db6bd84e7a..99e895fa37 100644
--- a/static/docs/commands-reference/run.md
+++ b/static/docs/commands-reference/run.md
@@ -45,7 +45,7 @@ be no cycles, etc.
Note that `dvc repro` provides an interface to check state and reproduce this
graph later. This concept is similar to the one of the `Makefile` but DVC
-captures data and caches data artifacts along the way. Check this
+captures data and caches data artifacts along the way. See this
[example](/doc/get-started/example-pipeline) to learn more and try to build a
pipeline.
@@ -84,7 +84,7 @@ pipeline.
- `-m`, `--metrics` - another kind of output files. It is usually a small human
readable file (JSON, CSV, text, whatnot) with some numbers or other
- information that describes a model or other outputs. Check `dvc metrics` to
+ information that describes a model or other outputs. See `dvc metrics` to
learn more about tracking metrics and comparing them across different model or
experiment versions.
diff --git a/static/docs/commands-reference/status.md b/static/docs/commands-reference/status.md
index 83f7290c11..75adcadd1e 100644
--- a/static/docs/commands-reference/status.md
+++ b/static/docs/commands-reference/status.md
@@ -8,8 +8,7 @@ cache and remote cache.
```usage
usage: dvc status [-h] [-v] [-j JOBS] [--show-checksums] [-q] [-c]
- [-r REMOTE] [-a] [-T] [-d]
- [targets [targets ...]]
+ [-r REMOTE] [-a] [-T] [-d] [targets [targets ...]]
positional arguments:
targets Limit command scope to these DVC-files. Using -R,
@@ -32,12 +31,12 @@ synchronize them). The two modes, _local_ and _cloud_ are triggered by using the
| remote | `--cloud` | Comparisons are made between the local cache, and the default remote, defined with `dvc remote --default` command. |
DVC determines data and code files to compare by analyzing all
-[DVC-files](/doc/user-guide/dvc-file-format) in the current workspace
-(`--all-branches` and `--all-tags` in the `cloud` mode compare multiple
-workspaces - across all branches or tags). The comparison can be limited to
-specific DVC-files by listing them as `targets`. Changes are reported only
-against the given `targets`. When combined with the `--with-deps` option, a
-search is made for changes in other stages that affect the target.
+[DVC-files](/doc/user-guide/dvc-file-format) in the workspace (`--all-branches`
+and `--all-tags` in the `cloud` mode compare multiple workspaces - across all
+branches or tags). The comparison can be limited to specific DVC-files by
+listing them as `targets`. Changes are reported only against the given
+`targets`. When combined with the `--with-deps` option, a search is made for
+changes in other stages that affect the target.
In the `local` mode, changes are detected through the checksum of every file
listed in every DVC-file in question against the corresponding file in the file
@@ -91,15 +90,14 @@ cache. For the typical process to update workspaces, see
name defined using the `dvc remote` command. Implies `--cloud`.
- `-a`, `--all-branches` - compares cache content against all Git branches.
- Instead of checking just the currently checked out workspace, it checks
- against all other branches of this workspace. The corresponding branches are
- shown in the status output. Applies only if `--cloud` or a remote is
- specified.
+ Instead of checking just the workspace, it runs the same status command in all
+ the branches of this repo. The corresponding branches are shown in the status
+ output. Applies only if `--cloud` or a remote is specified.
- `-T`, `--all-tags` - compares cache content against all Git tags. Both the
`--all-branches` and `--all-tags` options cause DVC to check more than just
- the currently checked out workspace. The corresponding tags are shown in the
- status output. Applies only if `--cloud` or a remote is specified.
+ the workspace. The corresponding tags are shown in the status output. Applies
+ only if `--cloud` or a remote is specified.
- `--show-checksums` - shows the DVC checksum for the file, rather than the file
name. Applies only if `--cloud` is specified.
diff --git a/static/docs/get-started/add-files.md b/static/docs/get-started/add-files.md
index 92d4387a3c..ae00f56ba8 100644
--- a/static/docs/get-started/add-files.md
+++ b/static/docs/get-started/add-files.md
@@ -42,8 +42,8 @@ $ git commit -m "add source data to DVC"
### Expand to learn about DVC internals
You can see that actual data file has been moved to the `.dvc/cache` directory,
-while the entries in the working directory may be links to the actual files in
-the DVC cache. (See
+while the entries in the workspace may be links to the actual files in the DVC
+cache. (See
[File link types](/docs/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache)
to learn about the supported file linking options, their tradeoffs, and how to
enable them).
diff --git a/static/docs/get-started/agenda.md b/static/docs/get-started/agenda.md
index 5a8e04919b..ea411e5157 100644
--- a/static/docs/get-started/agenda.md
+++ b/static/docs/get-started/agenda.md
@@ -26,10 +26,11 @@ contrary, DVC is designed to be pretty agnostic of frameworks, languages, etc.
If you have data files or data sets and/or you produce other data files, models,
data sets and you want to:
-- capture and save those data artifacts the same way we capture code,
-- track and switch between different versions of these artifacts easily,
-- being able to answer the question of how those artifacts or models were built
- in the first place,
+- capture and save those data artifacts the same way we capture
+ code,
+- track and switch between different versions of the data easily,
+- being able to answer the question of how data artifacts (e.g. ML models) were
+ built in the first place,
- being able to compare them,
- bring best practices to your team and get everyone on the same page.
diff --git a/static/docs/get-started/compare-experiments.md b/static/docs/get-started/compare-experiments.md
index dec676c8a4..e373e7a2b4 100644
--- a/static/docs/get-started/compare-experiments.md
+++ b/static/docs/get-started/compare-experiments.md
@@ -38,5 +38,5 @@ bigram-experiment:
```
DVC provides built-in support to track and navigate `JSON`, `TSV` or `CSV`
-metric files if you want to track additional information. Check `dvc metrics` to
+metric files if you want to track additional information. See `dvc metrics` to
learn more.
diff --git a/static/docs/get-started/configure.md b/static/docs/get-started/configure.md
index 43e9afad31..90acffc13c 100644
--- a/static/docs/get-started/configure.md
+++ b/static/docs/get-started/configure.md
@@ -32,8 +32,8 @@ $ git commit .dvc/config -m "initialize DVC local remote"
> [use cases](/doc/use-cases), other "more remote" types of remotes will be
> required.
-Adding a remote should be specified by both its type prefix and its path. DVC
-currently supports seven types of remotes:
+When adding a remote, specify both its type prefix (protocol) and its path. DVC
+currently supports seven types of remotes:
- `local` - Local directory
- `s3` - Amazon Simple Storage Service
@@ -41,15 +41,13 @@ currently supports seven types of remotes:
- `azure` - Azure Blob Storage
- `ssh` - Secure Shell
- `hdfs` - The Hadoop Distributed File System
-- `http` - Support for HTTP and HTTPS protocol
-
-> Depending on the [remote storage](/doc/commands-reference/remote) type you
-> plan to use to keep and share your data you might need to specify one of the
-> optional dependencies: `s3`, `gs`, `azure`, `ssh`. Or `all_remotes` to include
-> them all. The command should look like this: `pip install "dvc[s3]"` - it will
-> install `boto3` library along with DVC to support AWS S3 storage. This is
-> valid for `pip install` option only. Other ways to install DVC already include
-> support for all remotes.
+- `http` - HTTP and HTTPS protocols
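
To make the "type prefix" idea concrete, here is a minimal Python sketch that classifies a remote URL by its scheme. It is an illustration only and does not reproduce how DVC parses URLs: the function name `remote_type`, the scheme set, and the example URLs are assumptions based on the list above.

```python
from urllib.parse import urlparse

# Assumed scheme names, taken from the remote types listed above.
KNOWN_SCHEMES = {"s3", "gs", "azure", "ssh", "hdfs", "http", "https"}


def remote_type(url):
    """Return an illustrative remote type for a URL or local path."""
    scheme = urlparse(url).scheme
    if scheme in KNOWN_SCHEMES:
        # HTTP and HTTPS are treated as a single remote type here.
        return "http" if scheme == "https" else scheme
    # Anything without a recognized prefix is treated as a local directory.
    return "local"


for example in ("s3://mybucket/myproject",
                "ssh://user@example.com/path/to/dir",
                "/tmp/dvc-storage"):
    print(example, "->", remote_type(example))
```

A plain filesystem path carries no prefix, which is why it falls through to the `local` type in this sketch.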
+
+> If you installed DVC via `pip`, depending on the remote type you plan to use
+> you might need to install optional dependencies: `[s3]`, `[ssh]`, `[gs]`,
+> `[azure]`, and `[oss]`; or `[all]` to include them all. The command should
+> look like this: `pip install "dvc[s3]"` - it installs `boto3` library along
+> with DVC to support AWS S3 storage.
For example, to setup an S3 remote we would use something like (make sure that
`mybucket` exists):
diff --git a/static/docs/get-started/connect-code-and-data.md b/static/docs/get-started/connect-code-and-data.md
index fe59822e33..8867b0417c 100644
--- a/static/docs/get-started/connect-code-and-data.md
+++ b/static/docs/get-started/connect-code-and-data.md
@@ -61,9 +61,9 @@ $ git commit -m "add code"
Having installed the `src/prepare.py` script in your repo, the following command
-transforms it into a reproducible
-[stage](/doc/user-guide/dvc-files-and-directories) for the ML pipeline we're
-building (described in the [next chapter](/doc/get-started/example-pipeline)).
+transforms it into a reproducible [stage](/doc/commands-reference/run) for the
+ML pipeline we're building (described in the
+[next chapter](/doc/get-started/example-pipeline)).
```dvc
$ dvc run -f prepare.dvc \
diff --git a/static/docs/get-started/example-pipeline.md b/static/docs/get-started/example-pipeline.md
index ca1bdb1e78..cdf6d027e8 100644
--- a/static/docs/get-started/example-pipeline.md
+++ b/static/docs/get-started/example-pipeline.md
@@ -9,14 +9,14 @@ it `python`. This is a short version of the [Tutorial](/doc/tutorial).
In this example, we will focus on building a simple ML pipeline that takes an
archive with StackOverflow posts and trains the prediction model and saves it as
-an output. Check [get started](/doc/get-started) to see links to other examples,
+an output. See [get started](/doc/get-started) for links to other examples,
tutorials, use cases if you want to cover other aspects of the DVC. The pipeline
itself is a sequence of transformation we apply to the data file:
![](/static/img/example-flow-2x.png)
DVC helps to describe these transformations and capture actual data involved -
-input data set we are processing, intermediate artifacts (useful if some
+input data set we are processing, intermediate results (useful if some
transformations take a lot of time to run), output models. This way we can
capture what data and code were used to produce a specific model in a sharable
and reproducible way.
@@ -94,7 +94,7 @@ When we run `dvc add` `Posts.xml.zip`, DVC creates a
`dvc init` created a new directory `example/.dvc/` with `config`, `.gitignore`
files and the `cache` directory. These files and directories are hidden from
-users in general. Users don't interact with these files directly. Check
+users in general. Users don't interact with these files directly. See
[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories) to learn
more.
@@ -129,10 +129,10 @@ $ git commit -m "add dataset"
## Define stages
-Each [stage](/doc/user-guide/dvc-files-and-directories) – the parts of a
-pipeline – is described by providing a command to run, input data it takes and a
-list of output files. DVC is not Python or any other language specific and can
-wrap any command runnable via CLI.
+Each [stage](/doc/commands-reference/run) – the parts of a pipeline – is
+described by providing a command to run, input data it takes and a list of
+output files. DVC is not Python or any other language specific and can wrap any
+command runnable via CLI.
- The first stage is to extract XML from the archive. Note that we don't need to
run `dvc add` on `Posts.xml` below, `dvc run` saves the data automatically
diff --git a/static/docs/get-started/example-versioning.md b/static/docs/get-started/example-versioning.md
index 3c323cf455..8718a094d5 100644
--- a/static/docs/get-started/example-versioning.md
+++ b/static/docs/get-started/example-versioning.md
@@ -70,7 +70,7 @@ $ pip install -r requirements.txt
The repository you cloned is already DVC-initialized. There should be a `.dvc/`
directory with `config`, `.gitignore` files and the `cache` directory. These
files and directories are hidden from users in general. Users don't interact
-with these files directly. Check
+with these files directly. See
[DVC Files and Directories](/doc/user-guide/dvc-files-and-directories) to learn
more.
@@ -341,14 +341,14 @@ changed.
Here where DVC pipelines feature comes very handy and was designed for. We
touched it briefly when we described `dvc run` and `dvc repro` at the very end.
The next step here would be splitting the script into two steps and utilizing
-DVC pipelines. Check this [example](/doc/get-started/example-pipeline) to get a
+DVC pipelines. See this [example](/doc/get-started/example-pipeline) to get a
hands-on experience with them and try to apply it here. Don't hesitate to join
our [community](/chat) to ask any questions!
Another thing, you should have noticed, is the metrics file - `metrics.json` and
the way we captured it with `-M metrics.json` option. Metric file is a special
type of output DVC provides an interface on top to compare across tags or
-branches. Check `dvc metrics` command and
+branches. See `dvc metrics` command and
[Compare Experiments](/doc/get-started/compare-experiments) to learn more about
managing metrics. Next step you should try on your own is converting both
iterations we had into `dvc run` and then utilize `dvc metrics show` to compare
diff --git a/static/docs/get-started/index.md b/static/docs/get-started/index.md
index 9db6fbd937..5782ca6771 100644
--- a/static/docs/get-started/index.md
+++ b/static/docs/get-started/index.md
@@ -8,11 +8,11 @@ hands-on experience with real-life scenarios - first is about model and data set
[versioning](/doc/get-started/example-versioning), and the second one is focused
on [pipelines and reproducibility](/doc/get-started/example-pipeline).
-✅ Please, join our [community](/chat) or check these [support](/support)
-options if you have any questions or need any help. We are very responsive ⚡.
+✅ Please, join our [community](/chat) or see these [support](/support) options
+if you have any questions or need any help. We are very responsive ⚡.
-✅ Check out the [Github](https://github.com/iterative/dvc) page and give us a
-⭐ if you like the project!
+✅ Check out the [Github](https://github.com/iterative/dvc) repository and give
+us a ⭐ if you like the project!
✅ Contribute either on [Github](https://github.com/iterative/dvc) or
[Patreon](https://www.patreon.com/DVCorg/overview) to support the Project.
diff --git a/static/docs/get-started/install.md b/static/docs/get-started/install.md
index efe3d6c479..13cc0506a6 100644
--- a/static/docs/get-started/install.md
+++ b/static/docs/get-started/install.md
@@ -9,13 +9,12 @@ To install DVC from terminal, run:
$ pip install dvc
```
-> Depending on the [remote storage](/doc/commands-reference/remote) type you
-> plan to use to keep and share your data, you might need to specify one of the
-> optional dependencies: `s3`, `gs`, `azure`, `ssh`. Or `all_remotes` to include
-> them all. The command should look like this: `pip install "dvc[s3]"` - it
-> installs the `boto3` library along with DVC to support the AWS S3 storage.
-> This is valid for `pip install` option only. Other ways to install DVC already
-> include support for all remotes.
+> If you installed DVC via `pip`, depending on the
+> [remote](/doc/commands-reference/remote) type you plan to use you might need
+> to install optional dependencies: `[s3]`, `[ssh]`, `[gs]`, `[azure]`, and
+> `[oss]`; or `[all]` to include them all. The command should look like this:
+> `pip install "dvc[s3]"` - it installs the `boto3` library along with DVC to
+> support AWS S3 storage.
The easiest option, self-contained binary packages (or Windows installer), are
available by using the big "Download" button in the [home page](/). You may also
diff --git a/static/docs/get-started/retrieve-data.md b/static/docs/get-started/retrieve-data.md
index 64f3faf8f0..7288f84d20 100644
--- a/static/docs/get-started/retrieve-data.md
+++ b/static/docs/get-started/retrieve-data.md
@@ -12,8 +12,8 @@ $ dvc pull
```
This command retrieves data files that are referenced in _all_
-[DVC-files](/doc/user-guide/dvc-file-format) in the current workspace. So, you
-usually run it after `git clone`, `git pull`, or `git checkout`.
+[DVC-files](/doc/user-guide/dvc-file-format) in the workspace. So, you usually
+run it after `git clone`, `git pull`, or `git checkout`.
As an easy way to test it:
diff --git a/static/docs/get-started/share-data.md b/static/docs/get-started/share-data.md
index 88f864ba74..1e9d5bcc48 100644
--- a/static/docs/get-started/share-data.md
+++ b/static/docs/get-started/share-data.md
@@ -16,11 +16,9 @@ Usually, you run it along with `git commit` and `git push` to save changed
[DVC-files](/doc/user-guide/dvc-file-format) to Git.
The `dvc push` command allows one to upload data to remote storage. It doesn't
-save any changes in the code or DVC-files. Those should be saved by using
+save any changes in the code or DVC-files. Those should be saved by using
`git commit` and `git push`.
-See `dvc push` for more details and options for this command.
-
> \*As noted in the DVC [configuration](/doc/get-started/configure) chapter, we
> are using a **local remote** in this guide for educational purposes.
diff --git a/static/docs/tutorial/define-ml-pipeline.md b/static/docs/tutorial/define-ml-pipeline.md
index 5f944e1f94..b4ac31df58 100644
--- a/static/docs/tutorial/define-ml-pipeline.md
+++ b/static/docs/tutorial/define-ml-pipeline.md
@@ -54,7 +54,7 @@ Refer to
files with DVC.
Note that to modify or replace a data file that is under DVC control you may
-need to run `dvc unprotect` or `dvc remove` first (check the
+need to run `dvc unprotect` or `dvc remove` first (see the
[Update Tracked File](/doc/user-guide/update-tracked-file) guide). Use
`dvc move` to rename or move a data file that is under DVC control.
diff --git a/static/docs/tutorial/reproducibility.md b/static/docs/tutorial/reproducibility.md
index afdd189b6f..7cb2d4caab 100644
--- a/static/docs/tutorial/reproducibility.md
+++ b/static/docs/tutorial/reproducibility.md
@@ -110,7 +110,7 @@ master:
data/eval.txt: AUC: 0.624652
```
-> It is convenient to keep track of information even for failed experiments.
+> It's convenient to keep track of information even for failed experiments.
> Sometimes a failed hypothesis gives more information than a successful one.
Let’s keep the result in the repository. Later we can find out why bigram does
diff --git a/static/docs/tutorial/sharing-data.md b/static/docs/tutorial/sharing-data.md
index 32d3343fc9..f828990b30 100644
--- a/static/docs/tutorial/sharing-data.md
+++ b/static/docs/tutorial/sharing-data.md
@@ -12,8 +12,8 @@ DVC is able to push the cache to a cloud.
> Using your shared cache a colleague can reuse ML models that were trained on
> your machine.
-First, you need to modify the cloud settings in the DVC config file. This can be
-done using the CLI as shown below.
+First, you need to set up a data remote, which will be stored in the project's
+config file. This can be done using the CLI as shown below.
> Note that we are using `dvc-share` s3 bucket as an example and you don't have
> write access to it, so in order to follow the tutorial you will need to either
@@ -54,7 +54,7 @@ $ dvc pull
```
After executing this command, all the data files will be in the right place. You
-can check that by trying to reproduce the default goal:
+can confirm this by trying to reproduce the default goal:
```dvc
# Nothing to reproduce:
diff --git a/static/docs/use-cases/data-and-model-files-versioning.md b/static/docs/use-cases/data-and-model-files-versioning.md
index cd0eb5e961..f820b78a95 100644
--- a/static/docs/use-cases/data-and-model-files-versioning.md
+++ b/static/docs/use-cases/data-and-model-files-versioning.md
@@ -5,23 +5,23 @@
> along the [versioning](/doc/get-started/example-versioning) get started
> example.
-DVC allows storing and versioning source data files, ML models, intermediate
-results with Git, without checking the file contents into Git. It is useful when
-dealing with files that are too large for Git to handle. DVC stores information
-about your data file in a special [DVC-file](/doc/user-guide/dvc-file-format),
-that has a description of a file that can be used for versioning. DVC supports
-various types of remote locations for your data files and allows you to easily
-store and share your data alongside your code.
+DVC allows storing and versioning source data files and directories, ML models,
+intermediate results with Git, without checking the file contents into Git. It
+is useful when dealing with files that are too large for Git to handle. DVC
+stores information about your data file in a special
+[DVC-file](/doc/user-guide/dvc-file-format), that has a description of a file
+that can be used for versioning. DVC supports various types of remote locations
+for your data files and allows you to easily store and share your data alongside
+your code.
![](/static/img/model-versioning-diagram.png)
-In this very basic scenario, DVC is a better replacement for `git-lfs` (check
-the [Related Technologies](/doc/understanding-dvc/related-technologies) to get a
-better sense why) and ad-hoc scripts on top of Amazon S3 (or name-it cloud) that
-are usually used to manage ML artifacts like model files, data files, etc.
-Unlike `git-lfs`, DVC doesn't require installing a server; it can be used
-on-premises (NAS, SSH, for example) or with any major cloud provider (S3, Google
-Cloud, Azure).
+In this very basic scenario, DVC is a better replacement for `git-lfs` (see
+[Related Technologies](/doc/understanding-dvc/related-technologies)) and ad-hoc
+scripts on top of Amazon S3 (or any other cloud) that are usually used to manage
+ML data artifacts like data files, models, etc. Unlike `git-lfs`,
+DVC doesn't require installing a server; it can be used on-premises (NAS, SSH,
+for example) or with any major cloud provider (S3, Google Cloud, Azure).
Let's say you already have a project that uses a bunch of images that are stored
in `images` directory and has a `model.pkl` file - your model file that is
@@ -55,13 +55,15 @@ $ git status
$ git commit -m "Initialize dvc"
```
-Start tracking images and models with DVC:
+Start tracking images and models with `dvc add`:
```dvc
$ dvc add images
$ dvc add model.pkl
```
+> Refer also to `dvc run` for more advanced ways to version data.
+
Commit your changes:
```dvc
@@ -109,9 +111,9 @@ points to the `v1.0` of the data set. While code and model files are from the
![](/static/img/versioning.png)
-To share your data with others you need to setup a remote repository. Check the
-[Share Data And Model Files] use case to get a high level overview on how to
-setup it and use `dvc pull` and `dvc push` commands to collaborate. Please,
-don't forget to check the [versioning](/doc/get-started/example-versioning) get
-started example to get a hands-on experience with datasets and models
-versioning.
+To share your data with others you need to set up a remote repository. See the
+[Share Data And Model Files](/doc/use-cases/share-data-and-model-files) use case
+to get a high-level overview of how to set it up and use the `dvc pull` and
+`dvc push` commands to collaborate. Also see the
+[versioning](/doc/get-started/example-versioning) example to get hands-on
+experience with dataset and model versioning.
diff --git a/static/docs/user-guide/autocomplete.md b/static/docs/user-guide/autocomplete.md
index 593b9b0ff5..a374f1da62 100644
--- a/static/docs/user-guide/autocomplete.md
+++ b/static/docs/user-guide/autocomplete.md
@@ -31,10 +31,8 @@ Depending on what you typed on the command line so far, it completes:
Depending upon your preference and the availability of both Bash and Zsh on your
system, follow the steps given below to Configure Bash and/or Zsh.
-If you are new to working with shell or uncertain about your active shell, use
-`$0` to check your active shell.
-
-For example:
+If you are new to working with the shell or unsure which shell is active, print
+`$0` to check. For example:
```dvc
$ echo $0
diff --git a/static/docs/user-guide/contributing-documentation.md b/static/docs/user-guide/contributing-documentation.md
index f49529e10a..fdb5110453 100644
--- a/static/docs/user-guide/contributing-documentation.md
+++ b/static/docs/user-guide/contributing-documentation.md
@@ -28,8 +28,8 @@ to update the docs and redeploy the website.
## Submitting changes
In case of a minor change, you can use the **Edit on Github** button (found to
-the right of each page) to fork the project, edit it in place (check the right
-top corner for an Edit button on Github), and create a pull request (PR).
+the right of each page) to fork the project, edit it in place (with the source
+file **Edit** button in Github), and create a pull request (PR).
Otherwise, please refer to the following procedure:
diff --git a/static/docs/user-guide/contributing.md b/static/docs/user-guide/contributing.md
index 6f1f92e370..f52fc4ecca 100644
--- a/static/docs/user-guide/contributing.md
+++ b/static/docs/user-guide/contributing.md
@@ -1,7 +1,7 @@
# Contributing
We welcome contributions to [DVC](https://github.com/iterative/dvc) by the
-community. Check the
+community. See the
[Contributing to the Documentation](/doc/user-guide/contributing-documentation)
guide if you want to fix or update the documentation or this website.
diff --git a/static/docs/user-guide/dvc-file-format.md b/static/docs/user-guide/dvc-file-format.md
index dff021b3f6..89f3b81d84 100644
--- a/static/docs/user-guide/dvc-file-format.md
+++ b/static/docs/user-guide/dvc-file-format.md
@@ -7,8 +7,8 @@ the `.dvc` file extension (e.g. `process.dvc`), or with the default name
to track your data and reproduce pipeline stages. The file itself contains a
simple YAML format that could be easily written or altered manually.
-Check the [Syntax Highlighting](/doc/user-guide/plugins) to enable the
-highlighting for your editor.
+See [Syntax Highlighting](/doc/user-guide/plugins) to learn how to enable
+highlighting in your editor.
Here is an example of a DVC-file:
@@ -34,7 +34,7 @@ outs:
locked: True
# Comments like this line persist through multiple executions of
-# dvc repro/commit but not through dvc run/add/import commands.
+# dvc repro/commit but not through dvc run/add/import-url/get-url commands.
meta: # Special key to contain arbitary user data
name: John
@@ -80,4 +80,5 @@ meta values are preserved between multiple executions of `dvc repro` and
`dvc commit` commands.
> Note that comments and meta values are not preserved when a DVC-file is
-> overwritten with the `dvc run`,`dvc add`,`dvc import-url` commands.
+> overwritten with the `dvc run`, `dvc add`, `dvc import-url`, and
+> `dvc get-url` commands.
diff --git a/static/docs/user-guide/external-dependencies.md b/static/docs/user-guide/external-dependencies.md
index 045a9e27f0..4f0a0532a6 100644
--- a/static/docs/user-guide/external-dependencies.md
+++ b/static/docs/user-guide/external-dependencies.md
@@ -2,8 +2,8 @@
With DVC you can specify external files as dependencies for your pipeline
stages. DVC will track changes in those files and will reflect that in your
-pipeline state. Currently DVC supports the following types of external
-dependencies:
+pipeline state. Currently, the following types of external dependencies
+(protocols) are supported:
1. Local files and directories outside of your dvc repository;
2. Amazon S3;
@@ -90,7 +90,7 @@ $ dvc run -d remote://example/data.txt \
Please refer to `dvc remote add` for more details like setting up access
credentials for certain remotes.
-## Using import
+## Using import-url
In the previous command examples, downloading commands were used: `aws s3 cp`,
`scp`, `wget`, etc. `dvc import-url` simplifies the downloading part for all the
diff --git a/static/docs/user-guide/external-outputs.md b/static/docs/user-guide/external-outputs.md
index be60d51713..fb85251891 100644
--- a/static/docs/user-guide/external-outputs.md
+++ b/static/docs/user-guide/external-outputs.md
@@ -3,8 +3,8 @@
You can specify external files as outputs for
[DVC-files](/doc/user-guide/dvc-file-format) created by `dvc run` (stage files).
DVC will track changes in those files and will reflect so in your pipeline
-[status](/doc/commands-reference/status). Currently DVC supports these types of
-external outputs:
+[status](/doc/commands-reference/status). Currently, the following types of
+external outputs (protocols) are supported:
1. Local files and directories outside of your dvc repository;
2. Amazon S3;
diff --git a/static/docs/user-guide/update-tracked-file.md b/static/docs/user-guide/update-tracked-file.md
index 7b822a225e..f69e063124 100644
--- a/static/docs/user-guide/update-tracked-file.md
+++ b/static/docs/user-guide/update-tracked-file.md
@@ -1,7 +1,7 @@
# Update a Tracked File
Due to the way DVC handles linking between the data files in the cache and their
-counterparts in the working directory (refer to
+counterparts in the workspace (refer to
[Large Dataset Optimization](/docs/user-guide/large-dataset-optimization)),
updating tracked files has to be carried out with caution to avoid data
corruption when the DVC config option `cache.type` is set to `hardlink` or/and