From 4faf132ec9cabd750cab0db1d7f025a74d34714a Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Thu, 12 Jan 2023 23:36:37 +0100 Subject: [PATCH 01/67] rfc: CARGO_TARGET_DIRECTORIES: initial proposal version --- text/0000-cargo-target-directories.md | 283 ++++++++++++++++++++++++++ 1 file changed, 283 insertions(+) create mode 100644 text/0000-cargo-target-directories.md diff --git a/text/0000-cargo-target-directories.md b/text/0000-cargo-target-directories.md new file mode 100644 index 00000000000..bfb31d4d285 --- /dev/null +++ b/text/0000-cargo-target-directories.md @@ -0,0 +1,283 @@ +- Feature Name: `cargo_target_directories` +- Start Date: 2023-01-12 +- RFC PR: [rust-lang/rfcs#3371](https://github.com/rust-lang/rfcs/pull/3371) + + +# Summary +[summary]: #summary + + + +Introduce a new configuration option for cargo to tell it to move the crate/workspace's target directory into a crate/workspace-specific subdirectory of the configured absolute path, +named `CARGO_TARGET_DIRECTORIES`. + +# Motivation +[motivation]: #motivation + + + +The original motivating issue can be found here: [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156). + +1. Not having to find and clean all `target/` dirs everywhere while not having all projects collide (which is the effect of setting `CARGO_TARGET_DIR` globally) +1. Being able to easily exclude a directory from saves (Apple's Time Machine, ZFS snapshots, BRTS, ...) +1. Allows easily having separate directories for Rust-Analyzer and Cargo itself, allowing concurrent builds + (technically already doable with arguments/env vars but `CARGO_TARGET_DIR` collides all projects into big target dir, leading to frequent recompilation because of conflicting features and locking builds) +1. Allows using a different disk, partition or mount point for cargo artifacts +1. Avoids having to set `CARGO_TARGET_DIR` for every project to get the same effect as proposed here + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + + + +For a single project, it is possible to use the `CARGO_TARGET_DIR` environment variable (or the `target-dir` TOML config option or the `--target-dir` command-line flag) to change the position of the `target/` directory used for build artifacts during compilation with Cargo. + +While this option is useful for single-project environments (CI builds, builds through other build systems like Meson or Bazel), in multi-projects environment, like personal machines, it conflates every under the configured path: `CARGO_TARGET_DIR` directly replaces the `/target/` directory. + +`CARGO_TARGET_DIRECTORIES` (or the `target-directories` TOML option or the `--target-directories` command-line flag) instead acts as a parent for those `target` directories. + +Below is an example of the behavior with `CARGO_TARGET_DIR` versus the one with `CARGO_TARGET_DIRECTORIES`: + +## Example + +Consider this directory tree: + +```text +/Users/ +├─ poliorcetics/ +│ ├─ work/ +│ │ ├─ work-project/ +│ │ │ ├─ Cargo.toml +│ │ │ ├─ crate-1/ +│ │ │ │ ├─ Cargo.toml +│ │ │ ├─ crate-2/ +│ │ │ │ ├─ Cargo.toml +│ ├─ perso/ +│ │ ├─ perso-1/ +│ │ │ ├─ Cargo.toml +│ │ ├─ perso-2/ +│ │ │ ├─ Cargo.toml + +/cargo-cache/ +``` + +#### With `CARGO_TARGET_DIR=/cargo-cache` + +`cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts directly in `/cargo-cache/debug/...` + +A subsequent `cargo build` in `project-1` will work with the same artifact, potentially having conflicting features for dependencies for example. + +A `cargo clean` will delete the entire `/cargo-cache` directory, for all projects at once. + +It's possible to produce invalid state in the target dir by having unrelated projects writing in the same place. + +It's not possible to have to projects building at once because Cargo locks its target directory during builds. + +#### With `CARGO_TARGET_DIRECTORIES=/cargo-cache` + +`cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache/work-project/debug/...` + +A `cargo build` in `project-1` will produce new artifacts in `/cargo-cache/project-1/debug/...`. + +A `cargo clean` will only remove the `/cargo-cache//` subdirectory, not all the artifacts. + +It's not possible for Cargo to produce invalid state without a `build.rs` deliberately writing outside its target directory. + +Two projects can be built in parallel without troubles. + +#### With both set + +`CARGO_TARGET_DIR` was present long before `CARGO_TARGET_DIRECTORIES`: backward compatibility is important, so the first always trumps the second, +there is no mixing going on. + +#### Absolute and relative paths + +`CARGO_TARGET_DIR` can be either a relative or absolute path, which makes sense since it's mostly intended for a single project, which can then +work from its own position to configure the target directory. + +On the other hand `CARGO_TARGET_DIRECTORIES` is intended to be used with several projects, possibly completely unrelated to each other. As such, +it does not accept relative paths, only absolute ones. If a compelling use case is present for a relative path, it can added in the future as a +backward-compatible change. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + + + +## Setting `CARGO_TARGET_DIRECTORIES` + +The option is similar to `CARGO_TARGET_DIR` and can be set in the same places. From less to most specific: + +- Through the `config.toml`: + + ```toml + [build] + target-directories = "/absolute/path/to/target/directories" + ``` + +- Through the environment variable: `CARGO_TARGET_DIRECTORIES="/absolute/path/to/target/directories" cargo build` +- Through the command line flag: `cargo build --target-directories /absolute/path/to/target/directories` + +The given path must be absolute: setting `CARGO_TARGET_DIRECTORIES` to an empty or relative path is an error (when used and not instantly overriden by `CARGO_TARGET_DIR`). + +## Resolution order relative to `CARGO_TARGET_DIR` + +The resolution order favors `CARGO_TARGET_DIR` in all its forms, in the interest of both backward compatibility and allowing overriding for a singular workspace: + +`--target-dir` > `CARGO_TARGET_DIR` > `target-dir = ...` > `--target-directories` > `CARGO_TARGET_DIRECTORIES` > `target-directories = ...` + +## Naming + +In the example in the previous section, using `CARGO_TARGET_DIRECTORIES` with `cargo build` produces named subdirectories. The name of those is deterministic: +it is the name of the parent directory of the workspace's `Cargo.toml` manifest, so building `work-project/crate-1` will still use the `/cargo-caches/work-project/debug/...` directory for a `cargo build` call. + +This naming scheme is chosen to be simple for people to navigate but is **not considered stable**: since it can easily conflict, Cargo maintainers reserve the right to change it if too many conflicts happen. +In practice, it should be rare to have two different projects using the same name on a single machine so conflicts are not considered more important than readability and predictability. + +In case the parent directory is `/` or `C:\`, the subdirectory name is implementation defined. + +## Impact on `cargo ...` calls + +When calling `cargo` where `CARGO_TARGET_DIRECTORIES` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, including calls to third-party tools. + +In the same vein, `cargo metadata` will fill the target directory and make no mention of `CARGO_TARGET_DIRECTORIES` since it can only be used in a single workspace at once. + +### `cargo clean` + +Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists. The same behavior is used for +`CARGO_TARGET_DIRECTORIES`. + +# Drawbacks +[drawbacks]: #drawbacks + + + +## One more option to find the target directory + +This introduces one more option to look at to find the target directory, which may complicate the life of external tools. + +This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_DIRECTORIES`, so an external tool can set it and go on its way. +Also, having `cargo` set `CARGO_TARGET_DIR` when inside a workspace where `CARGO_TARGET_DIRECTORIES` is used will help current tools (those not +yet using `cargo metadata`) continue working without trouble. + +## Conflicting names are easy to produce + +This option easily conflicts. + +Entirely true, and for now ignored because of the rationale in the "Naming" subsection above. It's an option set by the people controlling the machine +for their convenience and does nothing when absent. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + + + +## Do nothing + +It is already possible today to use `CARGO_TARGET_DIR` to remap workspaces and projects but this has two problems: + +1) If done globally, the `CARGO_TARGET_DIR` becomes a hodge-podge of every project, which is not often the goal. +2) If done per-project, it is very cumbersome to maintain. + +For those reason, this option has not been retained. + +## Using `XDG_CACHE_HOME` instead of a cargo-specific env-var + +Not all OSes use the XDG convention, notably Windows and macOS (though the latter can be somewhat made to) and it is +very easy to define `CARGO_TARGET_DIRECTORIES=${XDG_CACHE_HOME:-~/.cache}/cargo_target_directories` if wanted by users. + +## Just put the directories inside of `.cargo/cache/...` + +There are already lots of discussion about `.cargo` and `.rustup` being home to both cache and config files and why this +is annoying for lots of users. What's more, it would not be as helpful to external build tools, they don't care about +bringing the registry cache in their build directory for example. + +## Stabilize the naming scheme + +I feel this require an hard-to-break naming scheme, which I don't have the skills nor motivation to design. Instead, I prefer explicitely telling the naming scheme is not to +be considered stable and allow more invested people to experiment with the feature and find something solid. + +# Prior art +[prior-art]: #prior-art + + + +- [`targo`](https://github.com/sunshowers/targo) by @sunshowers +- [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) + +Both express the needs for either remapping or a global target directory that is not shared between different Cargo workspaces. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + + + +None I can think of. + +# Future possibilities +[future-possibilities]: #future-possibilities + + + +- Allowing relative paths: I feel this is counter-productive to the stated goal and have thought of no use for it, but it's entirely possible someone else will. +- Introduce remapping into the concept in some way. From 0a0df4c2affdc31544d2bc5626aacef9b2e4842d Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 5 Feb 2023 22:43:05 +0100 Subject: [PATCH 02/67] chore: use PR number in file name --- ...rgo-target-directories.md => 3371-cargo-target-directories.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{0000-cargo-target-directories.md => 3371-cargo-target-directories.md} (100%) diff --git a/text/0000-cargo-target-directories.md b/text/3371-cargo-target-directories.md similarity index 100% rename from text/0000-cargo-target-directories.md rename to text/3371-cargo-target-directories.md From 0b53b3979d354854f924b6745326224f9b45e641 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 5 Feb 2023 23:50:58 +0100 Subject: [PATCH 03/67] feat: Improve "alternatives" and "prior art" sections following review. --- text/3371-cargo-target-directories.md | 50 ++++++++++++++++++++++++--- 1 file changed, 45 insertions(+), 5 deletions(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index bfb31d4d285..6e7ce3c9abf 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -155,6 +155,8 @@ In practice, it should be rare to have two different projects using the same nam In case the parent directory is `/` or `C:\`, the subdirectory name is implementation defined. +See "Rationale and alternatives" about conflicts in the naming scheme. + ## Impact on `cargo ...` calls When calling `cargo` where `CARGO_TARGET_DIRECTORIES` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, including calls to third-party tools. @@ -202,8 +204,12 @@ It is already possible today to use `CARGO_TARGET_DIR` to remap workspaces and p 1) If done globally, the `CARGO_TARGET_DIR` becomes a hodge-podge of every project, which is not often the goal. 2) If done per-project, it is very cumbersome to maintain. +3) [`targo`](https://github.com/sunshowers/targo) by @sunshowers +4) [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) + +`targo` and the cargo issue express a need for either remapping or a global target directory that is not shared between different Cargo workspaces. -For those reason, this option has not been retained. +For those reason, this option has not been retained and the `targo` tool is discussed more in details below. ## Using `XDG_CACHE_HOME` instead of a cargo-specific env-var @@ -221,6 +227,32 @@ bringing the registry cache in their build directory for example. I feel this require an hard-to-break naming scheme, which I don't have the skills nor motivation to design. Instead, I prefer explicitely telling the naming scheme is not to be considered stable and allow more invested people to experiment with the feature and find something solid. +## Make the naming conflict less easily + +The (possibly) conflicting naming scheme used here is something that can easily be fixed: instead of just `/cargo-caches/work-project`, use something like `/cargo-caches/work-project-`, like `targo` does. This approach has at least two defaults and one advantage: + +- **Advantage**: conflicts are pretty much impossible while the naming scheme is still predictable +- **Disadvantage**: external tools now have to know about the naming scheme internal more than just "set `$CARGO_TARGET_DIRECTORIES` and append crate directory name" and changing the hash method or length could heavily break them (even with us specifying it as unstable, we all know how that goes in real life). We would also have to specify when is the hash resolved (before symlink resolution or after). +- **Disadvantage**: moving the crate directory implies a full rebuild because the hash has changed. This is especially impactful for CIs, where different steps could be executed in different temporary directories, preventing them from enjoying the benefits of such a cache. For CIs, this is strongly mitigated by using `$CARGO_TARGET_DIR` directly and so may not be that much of a disadvantage. For local builds, I expect users are not moving their crates all over the place often, although I only have myself as data on this. + +Also, I don't expect people are using similarly-named directories for unrelated projects. I checked my machines, it's zero for all of them. I have two `rust-lang/rust` worktree, named `rust-lang-1` and `rust-lang-2`, and they would not interfere with each other if they used a system like `$CARGO_TARGET_DIRECTORIES` since their directory names are different. + +## Just use `targo` + +While a very nice tool, `targo` is not integrated with `cargo` and has a few shortcomings: + +- It uses symlinks, which are not always handled well by other tools. Specifically, since it's not integrated inside `cargo`, it uses a `target` symlink to avoid having to remap `cargo`'s execution using `CARGO_TARGET_DIR` and such,making it less useful for external build tools that would use this functionality. Using such a symlink also means `cargo clean` does not work, it just removes the symlink and not the data. +- It completely ignores `CARGO_TARGET_DIR`-related options, which again may break workflows. +- It needs more metadata to work well, which means an external tool using it would have to understand that metadata too. +- It uses `$CARGO_HOME/targo` to place its cache, making it less useful for external build tools and people wanting to separate caches and configuration. +- It needs to intercept `cargo`'s arguments, making it more brittle than an integrated solution. +- It uses a hash-based naming scheme, making it less predictable and compatible with external build tools and moving directories, as seen above. + +Some of those could be fixed of course, and I don't expect `cargo`'s `--target-dir` and `--manifest-path` to change or disappear anytime soon, but still, it could happen. An external tool like `targo` will never be able to +solve some of these or ensure forward compatibility as well as the solution proposed in this RFC. + +On the other hand, `targo` is already here and working for at least one person, making it the most viable alternative for now. + # Prior art [prior-art]: #prior-art @@ -240,10 +272,17 @@ Note that while precedent set by other languages is some motivation, it does not Please also take into consideration that rust sometimes intentionally diverges from common language features. --> -- [`targo`](https://github.com/sunshowers/targo) by @sunshowers -- [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) +## `bazel` + +The [`bazel`] build system has a similar feature called the [`outputRoot`](https://docs.bazel.build/versions/5.4.0/output_directories.html), which is always active and has default directories on all major platforms (Linux, macOS, Windows). + +The naming scheme is as follow: `/_bazel_$USER/` is the `outputUserRoot`, used for all builds done by `$USER`. Below that, projects are identified by the MD5 hash of the path name of the workspace directory (computed after resolving symlinks). + +The `outputRoot` can be overridden using `--output_base=...` (this is `$CARGO_TARGET_DIRECTORIES`, the subject of this RFC) and the `outputUserRoot` with `--output_user_root=...` (this is close to using `$CARGO_TARGET_DIR`, already possible in today's `cargo`). + +**Conclusion**: `bazel` shows that a hash-based workflow seems to work well enough, making an argument for the use of it in `cargo` too. It also uses the current user, to prevent attacks by having compiled a program as root and making the directory accessible to other users later on by also compiling there for them. `cargo` could also do this, though I do not know what happens when `--output_user_root` is set to the same path for two different users. -Both express the needs for either remapping or a global target directory that is not shared between different Cargo workspaces. +*Note: I looked at Bazel 5.4.0, the latest stable version as of this writing, things may change in the future or be different for older versions.* # Unresolved questions [unresolved-questions]: #unresolved-questions @@ -254,7 +293,8 @@ Both express the needs for either remapping or a global target directory that is - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> -None I can think of. +- Should we use a hash-based solution or simply a directory-named based one ? `bazel` using hashes indicates this solution is viable for them, it probably would be for `cargo` too and if we use both the hash and the directory name, it would stay fairly human-readable but this solution also has disadvantages. +- Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for us ? # Future possibilities [future-possibilities]: #future-possibilities From 60822431f929d1d1599f6ab2ef3cc193a637d685 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 27 Feb 2023 22:47:23 +0100 Subject: [PATCH 04/67] nit: clarify some lines --- text/3371-cargo-target-directories.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 6e7ce3c9abf..7e61b124d29 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -43,7 +43,7 @@ For implementation-oriented RFCs (e.g. for compiler internals), this section sho For a single project, it is possible to use the `CARGO_TARGET_DIR` environment variable (or the `target-dir` TOML config option or the `--target-dir` command-line flag) to change the position of the `target/` directory used for build artifacts during compilation with Cargo. -While this option is useful for single-project environments (CI builds, builds through other build systems like Meson or Bazel), in multi-projects environment, like personal machines, it conflates every under the configured path: `CARGO_TARGET_DIR` directly replaces the `/target/` directory. +While this option is useful for single-project environments (simple CI builds, builds through other build systems like Meson or Bazel), in multi-projects environment, like personal machines or repos with multiple workspaces, it conflates every build directory under the configured path: `CARGO_TARGET_DIR` directly replaces the `/target/` directory. `CARGO_TARGET_DIRECTORIES` (or the `target-directories` TOML option or the `--target-directories` command-line flag) instead acts as a parent for those `target` directories. @@ -92,7 +92,7 @@ A `cargo build` in `project-1` will produce new artifacts in `/cargo-cache/proje A `cargo clean` will only remove the `/cargo-cache//` subdirectory, not all the artifacts. -It's not possible for Cargo to produce invalid state without a `build.rs` deliberately writing outside its target directory. +In this situation, it's not possible for Cargo to produce invalid state without a `build.rs` deliberately writing outside its target directory. Two projects can be built in parallel without troubles. From ed55d46e350929dc16428431306bf6d9849620ac Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 27 Feb 2023 23:37:11 +0100 Subject: [PATCH 05/67] fix: improve bazel section, fix missing link --- text/3371-cargo-target-directories.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 7e61b124d29..9b009967b2f 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -274,12 +274,13 @@ Please also take into consideration that rust sometimes intentionally diverges f ## `bazel` -The [`bazel`] build system has a similar feature called the [`outputRoot`](https://docs.bazel.build/versions/5.4.0/output_directories.html), which is always active and has default directories on all major platforms (Linux, macOS, Windows). +The [`bazel`](https://bazel.build) build system has a similar feature called the [`outputRoot`](https://docs.bazel.build/versions/5.4.0/output_directories.html), which is always active and has default directories on all major build/development platforms (Linux, macOS, Windows). The naming scheme is as follow: `/_bazel_$USER/` is the `outputUserRoot`, used for all builds done by `$USER`. Below that, projects are identified by the MD5 hash of the path name of the workspace directory (computed after resolving symlinks). The `outputRoot` can be overridden using `--output_base=...` (this is `$CARGO_TARGET_DIRECTORIES`, the subject of this RFC) and the `outputUserRoot` with `--output_user_root=...` (this is close to using `$CARGO_TARGET_DIR`, already possible in today's `cargo`). +It should be noted that `bazel` is integrated with [remote caching](https://bazel.build/remote/caching) and has different needs from `cargo`, the latter only working locally. **Conclusion**: `bazel` shows that a hash-based workflow seems to work well enough, making an argument for the use of it in `cargo` too. It also uses the current user, to prevent attacks by having compiled a program as root and making the directory accessible to other users later on by also compiling there for them. `cargo` could also do this, though I do not know what happens when `--output_user_root` is set to the same path for two different users. *Note: I looked at Bazel 5.4.0, the latest stable version as of this writing, things may change in the future or be different for older versions.* From 5f869bc4206489f1160f395b24c66b79818f5a11 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 27 Feb 2023 23:38:14 +0100 Subject: [PATCH 06/67] feat: use hashing in the naming scheme --- text/3371-cargo-target-directories.md | 41 +++++++++------------------ 1 file changed, 14 insertions(+), 27 deletions(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 9b009967b2f..d8f023bf75e 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -88,7 +88,7 @@ It's not possible to have to projects building at once because Cargo locks its t `cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache/work-project/debug/...` -A `cargo build` in `project-1` will produce new artifacts in `/cargo-cache/project-1/debug/...`. +A `cargo build` in `project-1` will produce new artifacts in `/cargo-cache/project-1-/debug/...`. A `cargo clean` will only remove the `/cargo-cache//` subdirectory, not all the artifacts. @@ -147,26 +147,22 @@ The resolution order favors `CARGO_TARGET_DIR` in all its forms, in the interest ## Naming -In the example in the previous section, using `CARGO_TARGET_DIRECTORIES` with `cargo build` produces named subdirectories. The name of those is deterministic: -it is the name of the parent directory of the workspace's `Cargo.toml` manifest, so building `work-project/crate-1` will still use the `/cargo-caches/work-project/debug/...` directory for a `cargo build` call. +In the example in the previous section, using `CARGO_TARGET_DIRECTORIES` with `cargo build` produces named subdirectories. The name of those is partially deterministic: +it is the name of the parent directory of the workspace's `Cargo.toml` manifest and an unspecified hash of the absolute path to the workspace's root, so building `work-project/crate-1` will still use the `/cargo-caches/work-project-/debug/...` directory for a `cargo build` call. -This naming scheme is chosen to be simple for people to navigate but is **not considered stable**: since it can easily conflict, Cargo maintainers reserve the right to change it if too many conflicts happen. -In practice, it should be rare to have two different projects using the same name on a single machine so conflicts are not considered more important than readability and predictability. +This naming scheme is chosen to be simple for people to navigate but is **not considered stable**: the hashing method (and so the hash) will not change within a single minor version of cargo (1.68.0 -> 1.68.1) but it can change between any two minor versions (1.68 -> 1.69) and tools using that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value. In case the parent directory is `/` or `C:\`, the subdirectory name is implementation defined. -See "Rationale and alternatives" about conflicts in the naming scheme. - ## Impact on `cargo ...` calls When calling `cargo` where `CARGO_TARGET_DIRECTORIES` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, including calls to third-party tools. -In the same vein, `cargo metadata` will fill the target directory and make no mention of `CARGO_TARGET_DIRECTORIES` since it can only be used in a single workspace at once. +In the same vein, `cargo metadata` will fill the target directory information with the absolute path and make no mention of `CARGO_TARGET_DIRECTORIES` since it can only be used in a single workspace at once. ### `cargo clean` -Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists. The same behavior is used for -`CARGO_TARGET_DIRECTORIES`. +Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists. The same behavior is used for `CARGO_TARGET_DIRECTORIES`. # Drawbacks [drawbacks]: #drawbacks @@ -181,13 +177,6 @@ This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_D Also, having `cargo` set `CARGO_TARGET_DIR` when inside a workspace where `CARGO_TARGET_DIRECTORIES` is used will help current tools (those not yet using `cargo metadata`) continue working without trouble. -## Conflicting names are easy to produce - -This option easily conflicts. - -Entirely true, and for now ignored because of the rationale in the "Naming" subsection above. It's an option set by the people controlling the machine -for their convenience and does nothing when absent. - # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -224,18 +213,16 @@ bringing the registry cache in their build directory for example. ## Stabilize the naming scheme -I feel this require an hard-to-break naming scheme, which I don't have the skills nor motivation to design. Instead, I prefer explicitely telling the naming scheme is not to -be considered stable and allow more invested people to experiment with the feature and find something solid. +I feel this require an hard-to-break naming scheme (a recent hash algorithm should be good enough in 99% of the cases but collisions are always possible), which I don't have the skills nor motivation to design. Instead, I prefer explicitely telling the naming scheme is not to be considered stable and allow more invested people to experiment with the feature and find something solid. -## Make the naming conflict less easily +What's more, by explicitely not stabilizing it (and maybe voluntarily changing it between versions sometimes, since a version change recompiles everything anyway ?) we can instead reroute people and tools towards `CARGO_TARGET_DIR` / `cargo metadata` instead, which are much more likely to be suited to their use case if they need the path to the target directory. -The (possibly) conflicting naming scheme used here is something that can easily be fixed: instead of just `/cargo-caches/work-project`, use something like `/cargo-caches/work-project-`, like `targo` does. This approach has at least two defaults and one advantage: +## Use only the hash, not the name of the workspace directory -- **Advantage**: conflicts are pretty much impossible while the naming scheme is still predictable -- **Disadvantage**: external tools now have to know about the naming scheme internal more than just "set `$CARGO_TARGET_DIRECTORIES` and append crate directory name" and changing the hash method or length could heavily break them (even with us specifying it as unstable, we all know how that goes in real life). We would also have to specify when is the hash resolved (before symlink resolution or after). -- **Disadvantage**: moving the crate directory implies a full rebuild because the hash has changed. This is especially impactful for CIs, where different steps could be executed in different temporary directories, preventing them from enjoying the benefits of such a cache. For CIs, this is strongly mitigated by using `$CARGO_TARGET_DIR` directly and so may not be that much of a disadvantage. For local builds, I expect users are not moving their crates all over the place often, although I only have myself as data on this. +While this solution is just as easy to work with for tools, I think having a somewhat human-readable part in the name make it easier to work with through things like logging or just listing the subdirectories in `CARGO_TARGET_DIRECTORIES`. It also makes the path printed by `cargo run` or +the local URL used when calling `cargo doc --open` much clearer. -Also, I don't expect people are using similarly-named directories for unrelated projects. I checked my machines, it's zero for all of them. I have two `rust-lang/rust` worktree, named `rust-lang-1` and `rust-lang-2`, and they would not interfere with each other if they used a system like `$CARGO_TARGET_DIRECTORIES` since their directory names are different. +Ultimately, tools are not negatively affected since they should be using `CARGO_TARGET_DIR` or `cargo metadata` and humans have an easier time working with such paths so I think it's worthwhile to include the workspace's name. ## Just use `targo` @@ -246,7 +233,6 @@ While a very nice tool, `targo` is not integrated with `cargo` and has a few sho - It needs more metadata to work well, which means an external tool using it would have to understand that metadata too. - It uses `$CARGO_HOME/targo` to place its cache, making it less useful for external build tools and people wanting to separate caches and configuration. - It needs to intercept `cargo`'s arguments, making it more brittle than an integrated solution. -- It uses a hash-based naming scheme, making it less predictable and compatible with external build tools and moving directories, as seen above. Some of those could be fixed of course, and I don't expect `cargo`'s `--target-dir` and `--manifest-path` to change or disappear anytime soon, but still, it could happen. An external tool like `targo` will never be able to solve some of these or ensure forward compatibility as well as the solution proposed in this RFC. @@ -281,6 +267,7 @@ The naming scheme is as follow: `/_bazel_$USER/` is the `outputUserR The `outputRoot` can be overridden using `--output_base=...` (this is `$CARGO_TARGET_DIRECTORIES`, the subject of this RFC) and the `outputUserRoot` with `--output_user_root=...` (this is close to using `$CARGO_TARGET_DIR`, already possible in today's `cargo`). It should be noted that `bazel` is integrated with [remote caching](https://bazel.build/remote/caching) and has different needs from `cargo`, the latter only working locally. + **Conclusion**: `bazel` shows that a hash-based workflow seems to work well enough, making an argument for the use of it in `cargo` too. It also uses the current user, to prevent attacks by having compiled a program as root and making the directory accessible to other users later on by also compiling there for them. `cargo` could also do this, though I do not know what happens when `--output_user_root` is set to the same path for two different users. *Note: I looked at Bazel 5.4.0, the latest stable version as of this writing, things may change in the future or be different for older versions.* @@ -294,7 +281,6 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> -- Should we use a hash-based solution or simply a directory-named based one ? `bazel` using hashes indicates this solution is viable for them, it probably would be for `cargo` too and if we use both the hash and the directory name, it would stay fairly human-readable but this solution also has disadvantages. - Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for us ? # Future possibilities @@ -322,3 +308,4 @@ The section merely provides additional information. - Allowing relative paths: I feel this is counter-productive to the stated goal and have thought of no use for it, but it's entirely possible someone else will. - Introduce remapping into the concept in some way. +- Use `CARGO_TARGET_DIRECTORIES` as the default instead of the current `target` dir, with platform-defined defaults. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. From aafee1cb5e5d19321dc41ddfa5370d60e490b895 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Tue, 28 Feb 2023 00:04:38 +0100 Subject: [PATCH 07/67] feat: discuss remapping --- text/3371-cargo-target-directories.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index d8f023bf75e..66fa502a56e 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -239,6 +239,23 @@ solve some of these or ensure forward compatibility as well as the solution prop On the other hand, `targo` is already here and working for at least one person, making it the most viable alternative for now. +## Remapping + +[rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) was originally about remapping the target directory, not about having a +central one but reading the issue, there seems to be no needs for more than the simple redefinition of the target directory proposed in this document. +In the future, if `CARGO_TARGET_DIR_REMAP` is introduced, it could be used to be the prefix to the target directory like so: + +- Set `CARGO_TARGET_DIR_REMAP=/home/user/projects=/tmp/cargo-build` +- Compile the crate under `/home/user/projects/foo/` **without** `CARGO_TARGET_DIR` or `CARGO_TARGET_DIRECTORIES` set +- The resulting target directory will be at `/tmp/cargo-build/foo/target` + +By making the priority order `CARGO_TARGET_DIR` > `CARGO_TARGET_DIRECTORIES` > `CARGO_TARGET_DIR_REMAP` (when all are absolute paths) we would keep backward compatibility. Or we could disallow having the last two set at once, so that they're alternatives and not ordered. + +When `CARGO_TARGET_DIR` is relative, the result could be `/tmp/cargo-build/foo/$CARGO_TARGET_DIR`. + +Overall, I feel remapping is much harder to implement well and can be added later without interfering with `CARGO_TARGET_DIRECTORIES` (and without +this RFC interfering with remapping), though the design space is probably bigger than the one for this RFC. + # Prior art [prior-art]: #prior-art From 4c89e347affe08d7b57d8c61c084b8e88f292dd5 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Tue, 28 Feb 2023 09:29:22 +0100 Subject: [PATCH 08/67] nit: missing `` --- text/3371-cargo-target-directories.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 66fa502a56e..9131337268c 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -86,11 +86,11 @@ It's not possible to have to projects building at once because Cargo locks its t #### With `CARGO_TARGET_DIRECTORIES=/cargo-cache` -`cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache/work-project/debug/...` +`cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache/work-project-/debug/...` A `cargo build` in `project-1` will produce new artifacts in `/cargo-cache/project-1-/debug/...`. -A `cargo clean` will only remove the `/cargo-cache//` subdirectory, not all the artifacts. +A `cargo clean` will only remove the `/cargo-cache/-/` subdirectory, not all the artifacts. In this situation, it's not possible for Cargo to produce invalid state without a `build.rs` deliberately writing outside its target directory. From a3b8c7c0ecceb3b5ce5adacbfb25bf8c98ab54e3 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Tue, 28 Feb 2023 09:30:05 +0100 Subject: [PATCH 09/67] fix: decisively declare non-stability of naming scheme --- text/3371-cargo-target-directories.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 9131337268c..a8f786e09ae 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -150,7 +150,7 @@ The resolution order favors `CARGO_TARGET_DIR` in all its forms, in the interest In the example in the previous section, using `CARGO_TARGET_DIRECTORIES` with `cargo build` produces named subdirectories. The name of those is partially deterministic: it is the name of the parent directory of the workspace's `Cargo.toml` manifest and an unspecified hash of the absolute path to the workspace's root, so building `work-project/crate-1` will still use the `/cargo-caches/work-project-/debug/...` directory for a `cargo build` call. -This naming scheme is chosen to be simple for people to navigate but is **not considered stable**: the hashing method (and so the hash) will not change within a single minor version of cargo (1.68.0 -> 1.68.1) but it can change between any two minor versions (1.68 -> 1.69) and tools using that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value. +This naming scheme is chosen to be simple for people to navigate but is **not considered stable**: the hashing method (and so the hash) will probably not change often but `cargo` offers no guarantee and may change it in any release. Tools that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value each time they are invoked. In case the parent directory is `/` or `C:\`, the subdirectory name is implementation defined. From 4ea813c3080e7b1e3830d99aaf89eab42514ca66 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Tue, 28 Feb 2023 09:30:27 +0100 Subject: [PATCH 10/67] feat: talk about symlinks resolving --- text/3371-cargo-target-directories.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index a8f786e09ae..ff0f3e20c35 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -154,6 +154,21 @@ This naming scheme is chosen to be simple for people to navigate but is **not co In case the parent directory is `/` or `C:\`, the subdirectory name is implementation defined. +### Symbolic links + +In the following situation + +``` +/Users/ +├─ poliorcetics/ +│ ├─ projects/ +│ │ ├─ actual-crate/ +│ │ │ ├─ Cargo.toml +│ │ ├─ symlink-to-crate/ -> actual-crate/ +``` + +When calling `cargo metadata` in the `symlink-to-crate` path, the result contains `"manifest_path": "/Users/poliorcetics/projects/actual-crate/Cargo.toml"` and `"workspace_root":"/Users/poliorcetics/projects/actual-crate"`. This behaviour means that symlinks won't change the final directory used inside `CARGO_TARGET_DIRECTORIES`, or in other words: symbolic links are resolved. + ## Impact on `cargo ...` calls When calling `cargo` where `CARGO_TARGET_DIRECTORIES` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, including calls to third-party tools. From a8fae16585b94364190b79419cdfcb0d6c2a9092 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Tue, 28 Feb 2023 09:30:50 +0100 Subject: [PATCH 11/67] nit: clarify `cargo clean` behaviour explanation --- text/3371-cargo-target-directories.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index ff0f3e20c35..1edfd1f7c4c 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -177,7 +177,7 @@ In the same vein, `cargo metadata` will fill the target directory information wi ### `cargo clean` -Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists. The same behavior is used for `CARGO_TARGET_DIRECTORIES`. +Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_DIRECTORIES`: if it set, `cargo clean` will delete `CARGO_TARGET_DIRECTORIES/-` and not `target/`. # Drawbacks [drawbacks]: #drawbacks From f2a8448fccfad37341603085e823ab9b2b1c9524 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Tue, 28 Feb 2023 09:31:12 +0100 Subject: [PATCH 12/67] fix: link to targo and add backlinks subsection --- text/3371-cargo-target-directories.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 1edfd1f7c4c..0875db26a49 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -241,7 +241,7 @@ Ultimately, tools are not negatively affected since they should be using `CARGO_ ## Just use `targo` -While a very nice tool, `targo` is not integrated with `cargo` and has a few shortcomings: +While a very nice tool, [`targo`](https://github.com/sunshowers/targo) is not integrated with `cargo` and has a few shortcomings: - It uses symlinks, which are not always handled well by other tools. Specifically, since it's not integrated inside `cargo`, it uses a `target` symlink to avoid having to remap `cargo`'s execution using `CARGO_TARGET_DIR` and such,making it less useful for external build tools that would use this functionality. Using such a symlink also means `cargo clean` does not work, it just removes the symlink and not the data. - It completely ignores `CARGO_TARGET_DIR`-related options, which again may break workflows. @@ -254,6 +254,12 @@ solve some of these or ensure forward compatibility as well as the solution prop On the other hand, `targo` is already here and working for at least one person, making it the most viable alternative for now. +### Providing backlinks + +`targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target dire. + +`cargo` does not for `CARGO_TARGET_DIR` and will not do it either for `CARGO_TARGET_DIRECTORIES` : it provides `cargo metadata` that is the blessed way to obtain the actual target directory and when `CARGO_TARGET_DIRECTORIES` is used, it will set `CARGO_TARGET_DIR` on all invocation (if not already set) to make it easy to obtain the target directory for simple task (e.g. a test needing to launch another binary in the repo). + ## Remapping [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) was originally about remapping the target directory, not about having a From bd8b69074beb991047e931b623e7840dfa22a6b3 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 28 May 2023 19:41:06 +0200 Subject: [PATCH 13/67] Expand on using `CARGO_TARGET_DIRECTORIES` as the default --- text/3371-cargo-target-directories.md | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 0875db26a49..431c2a74b05 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -145,6 +145,8 @@ The resolution order favors `CARGO_TARGET_DIR` in all its forms, in the interest `--target-dir` > `CARGO_TARGET_DIR` > `target-dir = ...` > `--target-directories` > `CARGO_TARGET_DIRECTORIES` > `target-directories = ...` +See the final section for discussion on how to make `CARGO_TARGET_DIRECTORIES` the default. + ## Naming In the example in the previous section, using `CARGO_TARGET_DIRECTORIES` with `cargo build` produces named subdirectories. The name of those is partially deterministic: @@ -346,4 +348,27 @@ The section merely provides additional information. - Allowing relative paths: I feel this is counter-productive to the stated goal and have thought of no use for it, but it's entirely possible someone else will. - Introduce remapping into the concept in some way. -- Use `CARGO_TARGET_DIRECTORIES` as the default instead of the current `target` dir, with platform-defined defaults. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. + +## Use `CARGO_TARGET_DIRECTORIES` as the default instead of `target` + +This option has several complications I'm not sure how to resolve: + +1. How do we decide on good platform defaults ? + - Subsequently, when platform defaults are decided, how do we ensure a new platform has a good default too ? +2. How do we communicate on said default values ? +3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. + +I see `CARGO_TARGET_DIRECTORIES` as `cargo`-specific `XDG_CACHE_DIR` variable: if unset, using a local `target/` is what *all* of the Web has documented for years, it's what people expect, it's what tools expect, it makes it easy to reorganize Rust projects by just moving them around in directories, ... + +`CARGO_TARGET_DIRECTORIES` is more intended for CI, peoples with specific setups or people wanting to dive just a little deeper in their directory management, but just like `npm` puts `.node_modules/` in the project's workspace directory by default, I think `cargo` should also keep this as the default behaviour. + +### We really want to do this, how do we do it ? + +Well, first, heavy advertising of the option. Very **heavy** advertising of it and of `cargo-metadata` for all interactions with the cargo `target/` directory. After that, it will be a waiting game: + +1. Wait for tools to be updated +2. Wait for MSRVs to catch up to the cargo version introducing the option and its defaults (this could take a year or ten, I have no idea) +3. Add a config option to `cargo` to make it the default behaviour (when there is not already a `target/` dir locally is `CARGO_TARGET_DIR` set), something like `use-user-wide-cargo-target-directories-by-default = true` and communicate so that people can try it out +4. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to directory where there is no `CARGO_TARGET_DIR` and no `target/` dir locally + +I expect this to take a long time and `cargo` would need to be very clear about where it puts the target directory, probably through a simple command like `cargo metadata --print-target-dir` to make it easy for CIs and scripts to use it programatically without having to parse JSON all the time like a simple `cargo metadata` would do. From 7a1ee2019955794f490de3017ae658bf9249b029 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 28 May 2023 19:50:53 +0200 Subject: [PATCH 14/67] Clarify we hash on the symlink-resolved form of the path --- text/3371-cargo-target-directories.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 431c2a74b05..9f35e74cc86 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -156,6 +156,8 @@ This naming scheme is chosen to be simple for people to navigate but is **not co In case the parent directory is `/` or `C:\`, the subdirectory name is implementation defined. +The path used for the hashing and naming of the final directory is the one found *after* symlink resolution: `bazel` does it too and I have not found any complaints about this and it has the distinct advantage of allowing to make a symlink to a project somewhere else on the machine (for example to organise work projects) and avoid duplicating the build directory and all its data (which can be quite heavy). + ### Symbolic links In the following situation From 35e99f635e2aa736235842ccadcd147693c3643c Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:09:59 +0200 Subject: [PATCH 15/67] rename CARGO_TARGET_DIRECTORIES to CARGO_TARGET_BASE_DIR --- text/3371-cargo-target-directories.md | 64 +++++++++++++-------------- 1 file changed, 32 insertions(+), 32 deletions(-) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-directories.md index 9f35e74cc86..7c49f4ca290 100644 --- a/text/3371-cargo-target-directories.md +++ b/text/3371-cargo-target-directories.md @@ -1,4 +1,4 @@ -- Feature Name: `cargo_target_directories` +- Feature Name: `cargo_target_base_dir` - Start Date: 2023-01-12 - RFC PR: [rust-lang/rfcs#3371](https://github.com/rust-lang/rfcs/pull/3371) @@ -9,7 +9,7 @@ Introduce a new configuration option for cargo to tell it to move the crate/workspace's target directory into a crate/workspace-specific subdirectory of the configured absolute path, -named `CARGO_TARGET_DIRECTORIES`. +named `CARGO_TARGET_BASE_DIR`. # Motivation [motivation]: #motivation @@ -45,9 +45,9 @@ For a single project, it is possible to use the `CARGO_TARGET_DIR` environment v While this option is useful for single-project environments (simple CI builds, builds through other build systems like Meson or Bazel), in multi-projects environment, like personal machines or repos with multiple workspaces, it conflates every build directory under the configured path: `CARGO_TARGET_DIR` directly replaces the `/target/` directory. -`CARGO_TARGET_DIRECTORIES` (or the `target-directories` TOML option or the `--target-directories` command-line flag) instead acts as a parent for those `target` directories. +`CARGO_TARGET_BASE_DIR` (or the `target-base-dir` TOML option or the `--target-base-dir` command-line flag) instead acts as a parent for those `target` directories. -Below is an example of the behavior with `CARGO_TARGET_DIR` versus the one with `CARGO_TARGET_DIRECTORIES`: +Below is an example of the behavior with `CARGO_TARGET_DIR` versus the one with `CARGO_TARGET_BASE_DIR`: ## Example @@ -84,7 +84,7 @@ It's possible to produce invalid state in the target dir by having unrelated pro It's not possible to have to projects building at once because Cargo locks its target directory during builds. -#### With `CARGO_TARGET_DIRECTORIES=/cargo-cache` +#### With `CARGO_TARGET_BASE_DIR=/cargo-cache` `cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache/work-project-/debug/...` @@ -98,7 +98,7 @@ Two projects can be built in parallel without troubles. #### With both set -`CARGO_TARGET_DIR` was present long before `CARGO_TARGET_DIRECTORIES`: backward compatibility is important, so the first always trumps the second, +`CARGO_TARGET_DIR` was present long before `CARGO_TARGET_BASE_DIR`: backward compatibility is important, so the first always trumps the second, there is no mixing going on. #### Absolute and relative paths @@ -106,7 +106,7 @@ there is no mixing going on. `CARGO_TARGET_DIR` can be either a relative or absolute path, which makes sense since it's mostly intended for a single project, which can then work from its own position to configure the target directory. -On the other hand `CARGO_TARGET_DIRECTORIES` is intended to be used with several projects, possibly completely unrelated to each other. As such, +On the other hand `CARGO_TARGET_BASE_DIR` is intended to be used with several projects, possibly completely unrelated to each other. As such, it does not accept relative paths, only absolute ones. If a compelling use case is present for a relative path, it can added in the future as a backward-compatible change. @@ -123,7 +123,7 @@ This is the technical portion of the RFC. Explain the design in sufficient detai The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work. --> -## Setting `CARGO_TARGET_DIRECTORIES` +## Setting `CARGO_TARGET_BASE_DIR` The option is similar to `CARGO_TARGET_DIR` and can be set in the same places. From less to most specific: @@ -131,25 +131,25 @@ The option is similar to `CARGO_TARGET_DIR` and can be set in the same places. F ```toml [build] - target-directories = "/absolute/path/to/target/directories" + target-base-dir = "/absolute/path/to/target/directories" ``` -- Through the environment variable: `CARGO_TARGET_DIRECTORIES="/absolute/path/to/target/directories" cargo build` -- Through the command line flag: `cargo build --target-directories /absolute/path/to/target/directories` +- Through the environment variable: `CARGO_TARGET_BASE_DIR="/absolute/path/to/target/directories" cargo build` +- Through the command line flag: `cargo build --target-base-dir /absolute/path/to/target/directories` -The given path must be absolute: setting `CARGO_TARGET_DIRECTORIES` to an empty or relative path is an error (when used and not instantly overriden by `CARGO_TARGET_DIR`). +The given path must be absolute: setting `CARGO_TARGET_BASE_DIR` to an empty or relative path is an error (when used and not instantly overriden by `CARGO_TARGET_DIR`). ## Resolution order relative to `CARGO_TARGET_DIR` The resolution order favors `CARGO_TARGET_DIR` in all its forms, in the interest of both backward compatibility and allowing overriding for a singular workspace: -`--target-dir` > `CARGO_TARGET_DIR` > `target-dir = ...` > `--target-directories` > `CARGO_TARGET_DIRECTORIES` > `target-directories = ...` +`--target-dir` > `CARGO_TARGET_DIR` > `target-dir = ...` > `--target-base-dir` > `CARGO_TARGET_BASE_DIR` > `target-base-dir = ...` -See the final section for discussion on how to make `CARGO_TARGET_DIRECTORIES` the default. +See the final section for discussion on how to make `CARGO_TARGET_BASE_DIR` the default. ## Naming -In the example in the previous section, using `CARGO_TARGET_DIRECTORIES` with `cargo build` produces named subdirectories. The name of those is partially deterministic: +In the example in the previous section, using `CARGO_TARGET_BASE_DIR` with `cargo build` produces named subdirectories. The name of those is partially deterministic: it is the name of the parent directory of the workspace's `Cargo.toml` manifest and an unspecified hash of the absolute path to the workspace's root, so building `work-project/crate-1` will still use the `/cargo-caches/work-project-/debug/...` directory for a `cargo build` call. This naming scheme is chosen to be simple for people to navigate but is **not considered stable**: the hashing method (and so the hash) will probably not change often but `cargo` offers no guarantee and may change it in any release. Tools that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value each time they are invoked. @@ -171,17 +171,17 @@ In the following situation │ │ ├─ symlink-to-crate/ -> actual-crate/ ``` -When calling `cargo metadata` in the `symlink-to-crate` path, the result contains `"manifest_path": "/Users/poliorcetics/projects/actual-crate/Cargo.toml"` and `"workspace_root":"/Users/poliorcetics/projects/actual-crate"`. This behaviour means that symlinks won't change the final directory used inside `CARGO_TARGET_DIRECTORIES`, or in other words: symbolic links are resolved. +When calling `cargo metadata` in the `symlink-to-crate` path, the result contains `"manifest_path": "/Users/poliorcetics/projects/actual-crate/Cargo.toml"` and `"workspace_root":"/Users/poliorcetics/projects/actual-crate"`. This behaviour means that symlinks won't change the final directory used inside `CARGO_TARGET_BASE_DIR`, or in other words: symbolic links are resolved. ## Impact on `cargo ...` calls -When calling `cargo` where `CARGO_TARGET_DIRECTORIES` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, including calls to third-party tools. +When calling `cargo` where `CARGO_TARGET_BASE_DIR` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, including calls to third-party tools. -In the same vein, `cargo metadata` will fill the target directory information with the absolute path and make no mention of `CARGO_TARGET_DIRECTORIES` since it can only be used in a single workspace at once. +In the same vein, `cargo metadata` will fill the target directory information with the absolute path and make no mention of `CARGO_TARGET_BASE_DIR` since it can only be used in a single workspace at once. ### `cargo clean` -Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_DIRECTORIES`: if it set, `cargo clean` will delete `CARGO_TARGET_DIRECTORIES/-` and not `target/`. +Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` will delete `CARGO_TARGET_BASE_DIR/-` and not `target/`. # Drawbacks [drawbacks]: #drawbacks @@ -192,8 +192,8 @@ Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, This introduces one more option to look at to find the target directory, which may complicate the life of external tools. -This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_DIRECTORIES`, so an external tool can set it and go on its way. -Also, having `cargo` set `CARGO_TARGET_DIR` when inside a workspace where `CARGO_TARGET_DIRECTORIES` is used will help current tools (those not +This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_BASE_DIR`, so an external tool can set it and go on its way. +Also, having `cargo` set `CARGO_TARGET_DIR` when inside a workspace where `CARGO_TARGET_BASE_DIR` is used will help current tools (those not yet using `cargo metadata`) continue working without trouble. # Rationale and alternatives @@ -222,7 +222,7 @@ For those reason, this option has not been retained and the `targo` tool is disc ## Using `XDG_CACHE_HOME` instead of a cargo-specific env-var Not all OSes use the XDG convention, notably Windows and macOS (though the latter can be somewhat made to) and it is -very easy to define `CARGO_TARGET_DIRECTORIES=${XDG_CACHE_HOME:-~/.cache}/cargo_target_directories` if wanted by users. +very easy to define `CARGO_TARGET_BASE_DIR=${XDG_CACHE_HOME:-~/.cache}/cargo_target_base_dir` if wanted by users. ## Just put the directories inside of `.cargo/cache/...` @@ -238,7 +238,7 @@ What's more, by explicitely not stabilizing it (and maybe voluntarily changing i ## Use only the hash, not the name of the workspace directory -While this solution is just as easy to work with for tools, I think having a somewhat human-readable part in the name make it easier to work with through things like logging or just listing the subdirectories in `CARGO_TARGET_DIRECTORIES`. It also makes the path printed by `cargo run` or +While this solution is just as easy to work with for tools, I think having a somewhat human-readable part in the name make it easier to work with through things like logging or just listing the subdirectories in `CARGO_TARGET_BASE_DIR`. It also makes the path printed by `cargo run` or the local URL used when calling `cargo doc --open` much clearer. Ultimately, tools are not negatively affected since they should be using `CARGO_TARGET_DIR` or `cargo metadata` and humans have an easier time working with such paths so I think it's worthwhile to include the workspace's name. @@ -262,7 +262,7 @@ On the other hand, `targo` is already here and working for at least one person, `targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target dire. -`cargo` does not for `CARGO_TARGET_DIR` and will not do it either for `CARGO_TARGET_DIRECTORIES` : it provides `cargo metadata` that is the blessed way to obtain the actual target directory and when `CARGO_TARGET_DIRECTORIES` is used, it will set `CARGO_TARGET_DIR` on all invocation (if not already set) to make it easy to obtain the target directory for simple task (e.g. a test needing to launch another binary in the repo). +`cargo` does not for `CARGO_TARGET_DIR` and will not do it either for `CARGO_TARGET_BASE_DIR` : it provides `cargo metadata` that is the blessed way to obtain the actual target directory and when `CARGO_TARGET_BASE_DIR` is used, it will set `CARGO_TARGET_DIR` on all invocation (if not already set) to make it easy to obtain the target directory for simple task (e.g. a test needing to launch another binary in the repo). ## Remapping @@ -271,14 +271,14 @@ central one but reading the issue, there seems to be no needs for more than the In the future, if `CARGO_TARGET_DIR_REMAP` is introduced, it could be used to be the prefix to the target directory like so: - Set `CARGO_TARGET_DIR_REMAP=/home/user/projects=/tmp/cargo-build` -- Compile the crate under `/home/user/projects/foo/` **without** `CARGO_TARGET_DIR` or `CARGO_TARGET_DIRECTORIES` set +- Compile the crate under `/home/user/projects/foo/` **without** `CARGO_TARGET_DIR` or `CARGO_TARGET_BASE_DIR` set - The resulting target directory will be at `/tmp/cargo-build/foo/target` -By making the priority order `CARGO_TARGET_DIR` > `CARGO_TARGET_DIRECTORIES` > `CARGO_TARGET_DIR_REMAP` (when all are absolute paths) we would keep backward compatibility. Or we could disallow having the last two set at once, so that they're alternatives and not ordered. +By making the priority order `CARGO_TARGET_DIR` > `CARGO_TARGET_BASE_DIR` > `CARGO_TARGET_DIR_REMAP` (when all are absolute paths) we would keep backward compatibility. Or we could disallow having the last two set at once, so that they're alternatives and not ordered. When `CARGO_TARGET_DIR` is relative, the result could be `/tmp/cargo-build/foo/$CARGO_TARGET_DIR`. -Overall, I feel remapping is much harder to implement well and can be added later without interfering with `CARGO_TARGET_DIRECTORIES` (and without +Overall, I feel remapping is much harder to implement well and can be added later without interfering with `CARGO_TARGET_BASE_DIR` (and without this RFC interfering with remapping), though the design space is probably bigger than the one for this RFC. # Prior art @@ -306,7 +306,7 @@ The [`bazel`](https://bazel.build) build system has a similar feature called the The naming scheme is as follow: `/_bazel_$USER/` is the `outputUserRoot`, used for all builds done by `$USER`. Below that, projects are identified by the MD5 hash of the path name of the workspace directory (computed after resolving symlinks). -The `outputRoot` can be overridden using `--output_base=...` (this is `$CARGO_TARGET_DIRECTORIES`, the subject of this RFC) and the `outputUserRoot` with `--output_user_root=...` (this is close to using `$CARGO_TARGET_DIR`, already possible in today's `cargo`). +The `outputRoot` can be overridden using `--output_base=...` (this is `$CARGO_TARGET_BASE_DIR`, the subject of this RFC) and the `outputUserRoot` with `--output_user_root=...` (this is close to using `$CARGO_TARGET_DIR`, already possible in today's `cargo`). It should be noted that `bazel` is integrated with [remote caching](https://bazel.build/remote/caching) and has different needs from `cargo`, the latter only working locally. @@ -351,7 +351,7 @@ The section merely provides additional information. - Allowing relative paths: I feel this is counter-productive to the stated goal and have thought of no use for it, but it's entirely possible someone else will. - Introduce remapping into the concept in some way. -## Use `CARGO_TARGET_DIRECTORIES` as the default instead of `target` +## Use `CARGO_TARGET_BASE_DIR` as the default instead of `target` This option has several complications I'm not sure how to resolve: @@ -360,9 +360,9 @@ This option has several complications I'm not sure how to resolve: 2. How do we communicate on said default values ? 3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. -I see `CARGO_TARGET_DIRECTORIES` as `cargo`-specific `XDG_CACHE_DIR` variable: if unset, using a local `target/` is what *all* of the Web has documented for years, it's what people expect, it's what tools expect, it makes it easy to reorganize Rust projects by just moving them around in directories, ... +I see `CARGO_TARGET_BASE_DIR` as `cargo`-specific `XDG_CACHE_DIR` variable: if unset, using a local `target/` is what *all* of the Web has documented for years, it's what people expect, it's what tools expect, it makes it easy to reorganize Rust projects by just moving them around in directories, ... -`CARGO_TARGET_DIRECTORIES` is more intended for CI, peoples with specific setups or people wanting to dive just a little deeper in their directory management, but just like `npm` puts `.node_modules/` in the project's workspace directory by default, I think `cargo` should also keep this as the default behaviour. +`CARGO_TARGET_BASE_DIR` is more intended for CI, peoples with specific setups or people wanting to dive just a little deeper in their directory management, but just like `npm` puts `.node_modules/` in the project's workspace directory by default, I think `cargo` should also keep this as the default behaviour. ### We really want to do this, how do we do it ? @@ -370,7 +370,7 @@ Well, first, heavy advertising of the option. Very **heavy** advertising of it a 1. Wait for tools to be updated 2. Wait for MSRVs to catch up to the cargo version introducing the option and its defaults (this could take a year or ten, I have no idea) -3. Add a config option to `cargo` to make it the default behaviour (when there is not already a `target/` dir locally is `CARGO_TARGET_DIR` set), something like `use-user-wide-cargo-target-directories-by-default = true` and communicate so that people can try it out +3. Add a config option to `cargo` to make it the default behaviour (when there is not already a `target/` dir locally is `CARGO_TARGET_DIR` set), something like `use-user-wide-cargo-target-base-dir-by-default = true` and communicate so that people can try it out 4. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to directory where there is no `CARGO_TARGET_DIR` and no `target/` dir locally I expect this to take a long time and `cargo` would need to be very clear about where it puts the target directory, probably through a simple command like `cargo metadata --print-target-dir` to make it easy for CIs and scripts to use it programatically without having to parse JSON all the time like a simple `cargo metadata` would do. From c2300bb67afaf156e7ff13e193ee48f692e41057 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:10:40 +0200 Subject: [PATCH 16/67] rename file to follow feature renaming --- ...-cargo-target-directories.md => 3371-cargo-target-base-dir.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{3371-cargo-target-directories.md => 3371-cargo-target-base-dir.md} (100%) diff --git a/text/3371-cargo-target-directories.md b/text/3371-cargo-target-base-dir.md similarity index 100% rename from text/3371-cargo-target-directories.md rename to text/3371-cargo-target-base-dir.md From 4a6cb6491fb9f8744d77ffe6cff7e006c30ceede Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:24:18 +0200 Subject: [PATCH 17/67] question: how do we handle possibly thousand of directories in the same parent --- text/3371-cargo-target-base-dir.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 7c49f4ca290..a58f89b7109 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -325,6 +325,16 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze - Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for us ? +## Handle possibly thousands of directories in a single `CARGO_TARGET_BASE_DIR` path + +While a single dev machine is unlikely to have enough projects that the simple scheme of `-` will produce enough directories to slow down working in `$CARGO_TARGET_BASE_DIR/`, it could still happen, and notably in private CI, which are often less compartimentalized than public ones. Simple cruft over time (i.e, never calling `cargo clean` over years) could also make it happen, if much slower. + +A quick design that would probably work and still be relatively navigable by humans is: instead of `project-name-/`, use `p/r/o/j/e/ct-name-/`. Here I used an arbitrary limit of 5 subdirectories but really, anything could be used. A problem occurs if a `.` is present in the directory name,but in that case we could simply append it to the symbol before like so: `proj....name/` -> `p/r/o/j..../n/ame-/`. + +If the project name contains a space in the first 5 symbols, maybe replace it like so: `proj name/` -> `p/r/o/j-space/n/ame-/`. + +I don't know of any filesystem that has troubles with `_` or `-` (often used as delimiters) so those have no special handling. + # Future possibilities [future-possibilities]: #future-possibilities From 7ac2013307c5378dad298faacea72d7388fcc675 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:27:23 +0200 Subject: [PATCH 18/67] clarify situation for third party cargo tools --- text/3371-cargo-target-base-dir.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index a58f89b7109..1de438d3ddd 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -175,7 +175,7 @@ When calling `cargo metadata` in the `symlink-to-crate` path, the result contain ## Impact on `cargo ...` calls -When calling `cargo` where `CARGO_TARGET_BASE_DIR` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, including calls to third-party tools. +When calling `cargo` where `CARGO_TARGET_BASE_DIR` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, as much as possible. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool would have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today. In the same vein, `cargo metadata` will fill the target directory information with the absolute path and make no mention of `CARGO_TARGET_BASE_DIR` since it can only be used in a single workspace at once. From b1aa45e02e41f5d75f1750bd2bfa74688305028e Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:28:28 +0200 Subject: [PATCH 19/67] typo: dire -> directory --- text/3371-cargo-target-base-dir.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 1de438d3ddd..c46f01b5fd8 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -260,7 +260,7 @@ On the other hand, `targo` is already here and working for at least one person, ### Providing backlinks -`targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target dire. +`targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. `cargo` does not for `CARGO_TARGET_DIR` and will not do it either for `CARGO_TARGET_BASE_DIR` : it provides `cargo metadata` that is the blessed way to obtain the actual target directory and when `CARGO_TARGET_BASE_DIR` is used, it will set `CARGO_TARGET_DIR` on all invocation (if not already set) to make it easy to obtain the target directory for simple task (e.g. a test needing to launch another binary in the repo). From 367681631d08c4289b8c4d26e35e918749d8a249 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:33:20 +0200 Subject: [PATCH 20/67] clarify CARGO_HOME as target base dir --- text/3371-cargo-target-base-dir.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index c46f01b5fd8..c27b7b1a110 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -253,8 +253,7 @@ While a very nice tool, [`targo`](https://github.com/sunshowers/targo) is not in - It uses `$CARGO_HOME/targo` to place its cache, making it less useful for external build tools and people wanting to separate caches and configuration. - It needs to intercept `cargo`'s arguments, making it more brittle than an integrated solution. -Some of those could be fixed of course, and I don't expect `cargo`'s `--target-dir` and `--manifest-path` to change or disappear anytime soon, but still, it could happen. An external tool like `targo` will never be able to -solve some of these or ensure forward compatibility as well as the solution proposed in this RFC. +Some of those could be fixed of course, and I don't expect `cargo`'s `--target-dir` and `--manifest-path` to change or disappear anytime soon, but still, it could happen. An external tool like `targo` will never be able to solve some of these or ensure forward compatibility as well as the solution proposed in this RFC. On the other hand, `targo` is already here and working for at least one person, making it the most viable alternative for now. @@ -367,6 +366,7 @@ This option has several complications I'm not sure how to resolve: 1. How do we decide on good platform defaults ? - Subsequently, when platform defaults are decided, how do we ensure a new platform has a good default too ? + - `CARGO_HOME` is already criticized for being both a *cache* and a *config* home (using the XDG spec semantics), adding more local cache to it in the form of `CARGO_HOME/target-base-dir/` would not improve the situation and should probably not be done, but, if no good alternatives are found, there is precedent to use it for this. 2. How do we communicate on said default values ? 3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. From 0520ea8f355df82aef064fc243d442cf0b70d12a Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:39:14 +0200 Subject: [PATCH 21/67] don't use line breaks, they make following changes harder since they have to be maintained manually --- text/3371-cargo-target-base-dir.md | 40 +++++++++--------------------- 1 file changed, 12 insertions(+), 28 deletions(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index c27b7b1a110..3608f46bfec 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -8,8 +8,7 @@ -Introduce a new configuration option for cargo to tell it to move the crate/workspace's target directory into a crate/workspace-specific subdirectory of the configured absolute path, -named `CARGO_TARGET_BASE_DIR`. +Introduce a new configuration option for cargo to tell it to move the crate/workspace's target directory into a crate/workspace-specific subdirectory of the configured absolute path, named `CARGO_TARGET_BASE_DIR`. # Motivation [motivation]: #motivation @@ -20,8 +19,7 @@ The original motivating issue can be found here: [rust-lang/cargo#11156](https:/ 1. Not having to find and clean all `target/` dirs everywhere while not having all projects collide (which is the effect of setting `CARGO_TARGET_DIR` globally) 1. Being able to easily exclude a directory from saves (Apple's Time Machine, ZFS snapshots, BRTS, ...) -1. Allows easily having separate directories for Rust-Analyzer and Cargo itself, allowing concurrent builds - (technically already doable with arguments/env vars but `CARGO_TARGET_DIR` collides all projects into big target dir, leading to frequent recompilation because of conflicting features and locking builds) +1. Allows easily having separate directories for Rust-Analyzer and Cargo itself, allowing concurrent builds (technically already doable with arguments/env vars but `CARGO_TARGET_DIR` collides all projects into big target dir, leading to frequent recompilation because of conflicting features and locking builds) 1. Allows using a different disk, partition or mount point for cargo artifacts 1. Avoids having to set `CARGO_TARGET_DIR` for every project to get the same effect as proposed here @@ -98,17 +96,13 @@ Two projects can be built in parallel without troubles. #### With both set -`CARGO_TARGET_DIR` was present long before `CARGO_TARGET_BASE_DIR`: backward compatibility is important, so the first always trumps the second, -there is no mixing going on. +`CARGO_TARGET_DIR` was present long before `CARGO_TARGET_BASE_DIR`: backward compatibility is important, so the first always trumps the second, there is no mixing going on. #### Absolute and relative paths -`CARGO_TARGET_DIR` can be either a relative or absolute path, which makes sense since it's mostly intended for a single project, which can then -work from its own position to configure the target directory. +`CARGO_TARGET_DIR` can be either a relative or absolute path, which makes sense since it's mostly intended for a single project, which can then work from its own position to configure the target directory. -On the other hand `CARGO_TARGET_BASE_DIR` is intended to be used with several projects, possibly completely unrelated to each other. As such, -it does not accept relative paths, only absolute ones. If a compelling use case is present for a relative path, it can added in the future as a -backward-compatible change. +On the other hand `CARGO_TARGET_BASE_DIR` is intended to be used with several projects, possibly completely unrelated to each other. As such, it does not accept relative paths, only absolute ones. If a compelling use case is present for a relative path, it can added in the future as a backward-compatible change. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -149,8 +143,7 @@ See the final section for discussion on how to make `CARGO_TARGET_BASE_DIR` the ## Naming -In the example in the previous section, using `CARGO_TARGET_BASE_DIR` with `cargo build` produces named subdirectories. The name of those is partially deterministic: -it is the name of the parent directory of the workspace's `Cargo.toml` manifest and an unspecified hash of the absolute path to the workspace's root, so building `work-project/crate-1` will still use the `/cargo-caches/work-project-/debug/...` directory for a `cargo build` call. +In the example in the previous section, using `CARGO_TARGET_BASE_DIR` with `cargo build` produces named subdirectories. The name of those is partially deterministic: it is the name of the parent directory of the workspace's `Cargo.toml` manifest and an unspecified hash of the absolute path to the workspace's root, so building `work-project/crate-1` will still use the `/cargo-caches/work-project-/debug/...` directory for a `cargo build` call. This naming scheme is chosen to be simple for people to navigate but is **not considered stable**: the hashing method (and so the hash) will probably not change often but `cargo` offers no guarantee and may change it in any release. Tools that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value each time they are invoked. @@ -192,9 +185,7 @@ Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, This introduces one more option to look at to find the target directory, which may complicate the life of external tools. -This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_BASE_DIR`, so an external tool can set it and go on its way. -Also, having `cargo` set `CARGO_TARGET_DIR` when inside a workspace where `CARGO_TARGET_BASE_DIR` is used will help current tools (those not -yet using `cargo metadata`) continue working without trouble. +This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_BASE_DIR`, so an external tool can set it and go on its way. Also, having `cargo` set `CARGO_TARGET_DIR` when inside a workspace where `CARGO_TARGET_BASE_DIR` is used will help current tools (those not yet using `cargo metadata`) continue working without trouble. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -221,14 +212,11 @@ For those reason, this option has not been retained and the `targo` tool is disc ## Using `XDG_CACHE_HOME` instead of a cargo-specific env-var -Not all OSes use the XDG convention, notably Windows and macOS (though the latter can be somewhat made to) and it is -very easy to define `CARGO_TARGET_BASE_DIR=${XDG_CACHE_HOME:-~/.cache}/cargo_target_base_dir` if wanted by users. +Not all OSes use the XDG convention, notably Windows and macOS (though the latter can be somewhat made to) and it is very easy to define `CARGO_TARGET_BASE_DIR=${XDG_CACHE_HOME:-~/.cache}/cargo_target_base_dir` if wanted by users. ## Just put the directories inside of `.cargo/cache/...` -There are already lots of discussion about `.cargo` and `.rustup` being home to both cache and config files and why this -is annoying for lots of users. What's more, it would not be as helpful to external build tools, they don't care about -bringing the registry cache in their build directory for example. +There are already lots of discussion about `.cargo` and `.rustup` being home to both cache and config files and why this is annoying for lots of users. What's more, it would not be as helpful to external build tools, they don't care about bringing the registry cache in their build directory for example. ## Stabilize the naming scheme @@ -238,8 +226,7 @@ What's more, by explicitely not stabilizing it (and maybe voluntarily changing i ## Use only the hash, not the name of the workspace directory -While this solution is just as easy to work with for tools, I think having a somewhat human-readable part in the name make it easier to work with through things like logging or just listing the subdirectories in `CARGO_TARGET_BASE_DIR`. It also makes the path printed by `cargo run` or -the local URL used when calling `cargo doc --open` much clearer. +While this solution is just as easy to work with for tools, I think having a somewhat human-readable part in the name make it easier to work with through things like logging or just listing the subdirectories in `CARGO_TARGET_BASE_DIR`. It also makes the path printed by `cargo run` or the local URL used when calling `cargo doc --open` much clearer. Ultimately, tools are not negatively affected since they should be using `CARGO_TARGET_DIR` or `cargo metadata` and humans have an easier time working with such paths so I think it's worthwhile to include the workspace's name. @@ -265,9 +252,7 @@ On the other hand, `targo` is already here and working for at least one person, ## Remapping -[rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) was originally about remapping the target directory, not about having a -central one but reading the issue, there seems to be no needs for more than the simple redefinition of the target directory proposed in this document. -In the future, if `CARGO_TARGET_DIR_REMAP` is introduced, it could be used to be the prefix to the target directory like so: +[rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) was originally about remapping the target directory, not about having a central one but reading the issue, there seems to be no needs for more than the simple redefinition of the target directory proposed in this document. In the future, if `CARGO_TARGET_DIR_REMAP` is introduced, it could be used to be the prefix to the target directory like so: - Set `CARGO_TARGET_DIR_REMAP=/home/user/projects=/tmp/cargo-build` - Compile the crate under `/home/user/projects/foo/` **without** `CARGO_TARGET_DIR` or `CARGO_TARGET_BASE_DIR` set @@ -277,8 +262,7 @@ By making the priority order `CARGO_TARGET_DIR` > `CARGO_TARGET_BASE_DIR` > `CAR When `CARGO_TARGET_DIR` is relative, the result could be `/tmp/cargo-build/foo/$CARGO_TARGET_DIR`. -Overall, I feel remapping is much harder to implement well and can be added later without interfering with `CARGO_TARGET_BASE_DIR` (and without -this RFC interfering with remapping), though the design space is probably bigger than the one for this RFC. +Overall, I feel remapping is much harder to implement well and can be added later without interfering with `CARGO_TARGET_BASE_DIR` (and without this RFC interfering with remapping), though the design space is probably bigger than the one for this RFC. # Prior art [prior-art]: #prior-art From f604f0a0f5091313e320af74d7824d6e6457048b Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:45:22 +0200 Subject: [PATCH 22/67] expand upon targo caching scheme --- text/3371-cargo-target-base-dir.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 3608f46bfec..c724a09557e 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -239,11 +239,14 @@ While a very nice tool, [`targo`](https://github.com/sunshowers/targo) is not in - It needs more metadata to work well, which means an external tool using it would have to understand that metadata too. - It uses `$CARGO_HOME/targo` to place its cache, making it less useful for external build tools and people wanting to separate caches and configuration. - It needs to intercept `cargo`'s arguments, making it more brittle than an integrated solution. +- Its naming scheme is a base58-encoded blake3 hash of the workspace directory ([source]), making it unusable by human (that's not necessarily a bad thing, just a difference with this RFC) and not taking into account the use case of thousands of target directories within `$CARGO_HOME/targo`. Some of those could be fixed of course, and I don't expect `cargo`'s `--target-dir` and `--manifest-path` to change or disappear anytime soon, but still, it could happen. An external tool like `targo` will never be able to solve some of these or ensure forward compatibility as well as the solution proposed in this RFC. On the other hand, `targo` is already here and working for at least one person, making it the most viable alternative for now. +[source]: https://github.com/rust-lang/cargo/issues/11156#issuecomment-1285951209 + ### Providing backlinks `targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. From d7830f303184ad335a9d963e15f2782afc04451b Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Jun 2023 22:50:55 +0200 Subject: [PATCH 23/67] propose alternate naming scheme to simply scaling in unresolved questions --- text/3371-cargo-target-base-dir.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index c724a09557e..2d2cb23c00d 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -321,6 +321,8 @@ If the project name contains a space in the first 5 symbols, maybe replace it li I don't know of any filesystem that has troubles with `_` or `-` (often used as delimiters) so those have no special handling. +If, on the other, we don't care (or care less) for human usability of the directories inside `CARGO_TARGET_BASE_DIR`, we could simply invert the format to use `-`: since the hash would have fixed length and set of characters, it would be easy to split into subdirectories without special casing and a simple `find . -n 'project name'` would point to the final target directory for the project (or all of them, if several similarly named projects exists on the machine). + # Future possibilities [future-possibilities]: #future-possibilities From e79c7521425b5994ee4ae1c59c476f5bad886143 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Wed, 21 Jun 2023 22:53:52 +0200 Subject: [PATCH 24/67] underspecify naming scheme and use manifest instead of workspace dir, in preparation for cargo-script --- text/3371-cargo-target-base-dir.md | 54 ++++++++++++------------------ 1 file changed, 22 insertions(+), 32 deletions(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 2d2cb23c00d..803312adf0b 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -8,7 +8,7 @@ -Introduce a new configuration option for cargo to tell it to move the crate/workspace's target directory into a crate/workspace-specific subdirectory of the configured absolute path, named `CARGO_TARGET_BASE_DIR`. +Introduce a new configuration option for cargo to tell it to move the workspace's target directory into a workspace-specific subdirectory of the configured absolute path, named `CARGO_TARGET_BASE_DIR`. # Motivation [motivation]: #motivation @@ -74,7 +74,7 @@ Consider this directory tree: `cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts directly in `/cargo-cache/debug/...` -A subsequent `cargo build` in `project-1` will work with the same artifact, potentially having conflicting features for dependencies for example. +A subsequent `cargo build` in `perso-1` will work with the same artifact, potentially having conflicting features for dependencies for example. A `cargo clean` will delete the entire `/cargo-cache` directory, for all projects at once. @@ -84,13 +84,13 @@ It's not possible to have to projects building at once because Cargo locks its t #### With `CARGO_TARGET_BASE_DIR=/cargo-cache` -`cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache/work-project-/debug/...` +`cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache//debug/...` (where `project-path` is a directory or several chained directories unique to the workspace, with an unspecified naming scheme). -A `cargo build` in `project-1` will produce new artifacts in `/cargo-cache/project-1-/debug/...`. +A `cargo build` in `perso-1` will produce new artifacts in `/cargo-cache//debug/...`. -A `cargo clean` will only remove the `/cargo-cache/-/` subdirectory, not all the artifacts. +A `cargo clean` will only remove the `/cargo-cache//` subdirectory, not all the artifacts. -In this situation, it's not possible for Cargo to produce invalid state without a `build.rs` deliberately writing outside its target directory. +In this situation, it's much less likely for Cargo to produce invalid state without a `build.rs` deliberately writing outside its target directory. Two projects can be built in parallel without troubles. @@ -143,13 +143,13 @@ See the final section for discussion on how to make `CARGO_TARGET_BASE_DIR` the ## Naming -In the example in the previous section, using `CARGO_TARGET_BASE_DIR` with `cargo build` produces named subdirectories. The name of those is partially deterministic: it is the name of the parent directory of the workspace's `Cargo.toml` manifest and an unspecified hash of the absolute path to the workspace's root, so building `work-project/crate-1` will still use the `/cargo-caches/work-project-/debug/...` directory for a `cargo build` call. +In the example in the previous section, using `CARGO_TARGET_BASE_DIR` with `cargo build` produces named subdirectories. The name of those is computed from the full and canonicalized path to the manifest for the workspace, so building `work-project/crate-1` will still use the directory for the whole workspace during a `cargo build` call. -This naming scheme is chosen to be simple for people to navigate but is **not considered stable**: the hashing method (and so the hash) will probably not change often but `cargo` offers no guarantee and may change it in any release. Tools that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value each time they are invoked. +This naming scheme is **not considered stable**: the method will probably not change often but `cargo` offers no guarantee and may change it in any release. Tools that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value each time they are invoked. -In case the parent directory is `/` or `C:\`, the subdirectory name is implementation defined. +The path used for the naming of the final target directory is the one found *after* symlink resolution: `bazel` does it too and I have not found any complaints about this and it has the distinct advantage of allowing to make a symlink to a project somewhere else on the machine (for example to organise work projects) and avoid duplicating the build directory and all its data (which can be quite heavy). -The path used for the hashing and naming of the final directory is the one found *after* symlink resolution: `bazel` does it too and I have not found any complaints about this and it has the distinct advantage of allowing to make a symlink to a project somewhere else on the machine (for example to organise work projects) and avoid duplicating the build directory and all its data (which can be quite heavy). +To prevent collisions by craftings paths, the `` directory will be computed from a hash of the workspace manifest's full path (and possibly other data, for example `bazel` uses its version and the current user too). ### Symbolic links @@ -166,6 +166,12 @@ In the following situation When calling `cargo metadata` in the `symlink-to-crate` path, the result contains `"manifest_path": "/Users/poliorcetics/projects/actual-crate/Cargo.toml"` and `"workspace_root":"/Users/poliorcetics/projects/actual-crate"`. This behaviour means that symlinks won't change the final directory used inside `CARGO_TARGET_BASE_DIR`, or in other words: symbolic links are resolved. +### Handle possibly thousands of directories in a single `CARGO_TARGET_BASE_DIR` path + +While a single dev machine is unlikely to have enough projects that the scheme of `` will produce enough directories to slow down working in `$CARGO_TARGET_BASE_DIR/`, it could still happen, and notably in private CI, which are often less compartimentalized than public ones. Simple cruft over time (i.e, never calling `cargo clean` over years) could also make it happen, if much slower. + +A proposed solution for `cargo` to split the hash into something like `$CARG_TARGET_BASE_DIR/hash[:2]/hash[2:4]/hash[4:]/...`. Since the naming scheme is considered an implementation detail, if this prove insufficient it could be change in a subsequent version of `cargo`. + ## Impact on `cargo ...` calls When calling `cargo` where `CARGO_TARGET_BASE_DIR` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, as much as possible. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool would have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today. @@ -174,7 +180,7 @@ In the same vein, `cargo metadata` will fill the target directory information wi ### `cargo clean` -Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` will delete `CARGO_TARGET_BASE_DIR/-` and not `target/`. +Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` will delete `CARGO_TARGET_BASE_DIR//` and not `target/`. # Drawbacks [drawbacks]: #drawbacks @@ -199,12 +205,13 @@ This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_B ## Do nothing -It is already possible today to use `CARGO_TARGET_DIR` to remap workspaces and projects but this has two problems: +It is already possible today to use `CARGO_TARGET_DIR` to remap workspaces and projects but this has a few problems: 1) If done globally, the `CARGO_TARGET_DIR` becomes a hodge-podge of every project, which is not often the goal. 2) If done per-project, it is very cumbersome to maintain. 3) [`targo`](https://github.com/sunshowers/targo) by @sunshowers 4) [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) +5) The upcoming `cargo script` command needs someplace to put its cache and having a dedicated directory for that would be nice. `targo` and the cargo issue express a need for either remapping or a global target directory that is not shared between different Cargo workspaces. @@ -224,12 +231,6 @@ I feel this require an hard-to-break naming scheme (a recent hash algorithm shou What's more, by explicitely not stabilizing it (and maybe voluntarily changing it between versions sometimes, since a version change recompiles everything anyway ?) we can instead reroute people and tools towards `CARGO_TARGET_DIR` / `cargo metadata` instead, which are much more likely to be suited to their use case if they need the path to the target directory. -## Use only the hash, not the name of the workspace directory - -While this solution is just as easy to work with for tools, I think having a somewhat human-readable part in the name make it easier to work with through things like logging or just listing the subdirectories in `CARGO_TARGET_BASE_DIR`. It also makes the path printed by `cargo run` or the local URL used when calling `cargo doc --open` much clearer. - -Ultimately, tools are not negatively affected since they should be using `CARGO_TARGET_DIR` or `cargo metadata` and humans have an easier time working with such paths so I think it's worthwhile to include the workspace's name. - ## Just use `targo` While a very nice tool, [`targo`](https://github.com/sunshowers/targo) is not integrated with `cargo` and has a few shortcomings: @@ -239,7 +240,8 @@ While a very nice tool, [`targo`](https://github.com/sunshowers/targo) is not in - It needs more metadata to work well, which means an external tool using it would have to understand that metadata too. - It uses `$CARGO_HOME/targo` to place its cache, making it less useful for external build tools and people wanting to separate caches and configuration. - It needs to intercept `cargo`'s arguments, making it more brittle than an integrated solution. -- Its naming scheme is a base58-encoded blake3 hash of the workspace directory ([source]), making it unusable by human (that's not necessarily a bad thing, just a difference with this RFC) and not taking into account the use case of thousands of target directories within `$CARGO_HOME/targo`. +- Its naming scheme is a base58-encoded blake3 hash of the workspace directory ([source]), making it unusable by human and not taking into account the use case of thousands of target directories within `$CARGO_HOME/targo`. +- It uses the workspace root dir and not manifest, which means a `targo script` would share cache between all the scripts in a directory, which may not be the desired effect. Some of those could be fixed of course, and I don't expect `cargo`'s `--target-dir` and `--manifest-path` to change or disappear anytime soon, but still, it could happen. An external tool like `targo` will never be able to solve some of these or ensure forward compatibility as well as the solution proposed in this RFC. @@ -251,7 +253,7 @@ On the other hand, `targo` is already here and working for at least one person, `targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. -`cargo` does not for `CARGO_TARGET_DIR` and will not do it either for `CARGO_TARGET_BASE_DIR` : it provides `cargo metadata` that is the blessed way to obtain the actual target directory and when `CARGO_TARGET_BASE_DIR` is used, it will set `CARGO_TARGET_DIR` on all invocation (if not already set) to make it easy to obtain the target directory for simple task (e.g. a test needing to launch another binary in the repo). +`cargo` does not for `CARGO_TARGET_DIR` and will not do it either for `CARGO_TARGET_BASE_DIR` : it provides `cargo metadata` that is the blessed way to obtain the actual target directory and when `CARGO_TARGET_BASE_DIR` is used, it will set `CARGO_TARGET_DIR` on all invocation within a workspace (if not already set) to make it easy to obtain the target directory for simple task (e.g. a test needing to launch another binary in the repo). ## Remapping @@ -311,18 +313,6 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze - Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for us ? -## Handle possibly thousands of directories in a single `CARGO_TARGET_BASE_DIR` path - -While a single dev machine is unlikely to have enough projects that the simple scheme of `-` will produce enough directories to slow down working in `$CARGO_TARGET_BASE_DIR/`, it could still happen, and notably in private CI, which are often less compartimentalized than public ones. Simple cruft over time (i.e, never calling `cargo clean` over years) could also make it happen, if much slower. - -A quick design that would probably work and still be relatively navigable by humans is: instead of `project-name-/`, use `p/r/o/j/e/ct-name-/`. Here I used an arbitrary limit of 5 subdirectories but really, anything could be used. A problem occurs if a `.` is present in the directory name,but in that case we could simply append it to the symbol before like so: `proj....name/` -> `p/r/o/j..../n/ame-/`. - -If the project name contains a space in the first 5 symbols, maybe replace it like so: `proj name/` -> `p/r/o/j-space/n/ame-/`. - -I don't know of any filesystem that has troubles with `_` or `-` (often used as delimiters) so those have no special handling. - -If, on the other, we don't care (or care less) for human usability of the directories inside `CARGO_TARGET_BASE_DIR`, we could simply invert the format to use `-`: since the hash would have fixed length and set of characters, it would be easy to split into subdirectories without special casing and a simple `find . -n 'project name'` would point to the final target directory for the project (or all of them, if several similarly named projects exists on the machine). - # Future possibilities [future-possibilities]: #future-possibilities From a245144532582087f86fcb47686efed9f3d42d00 Mon Sep 17 00:00:00 2001 From: Poliorcetics Date: Wed, 21 Jun 2023 23:14:19 +0200 Subject: [PATCH 25/67] Clarify one advantage Co-authored-by: Rain --- text/3371-cargo-target-base-dir.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 803312adf0b..8b7a764711e 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -18,7 +18,7 @@ Introduce a new configuration option for cargo to tell it to move the workspace' The original motivating issue can be found here: [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156). 1. Not having to find and clean all `target/` dirs everywhere while not having all projects collide (which is the effect of setting `CARGO_TARGET_DIR` globally) -1. Being able to easily exclude a directory from saves (Apple's Time Machine, ZFS snapshots, BRTS, ...) +1. Being able to easily exclude a directory from backups (Apple's Time Machine, ZFS and btrfs snapshots, ...) 1. Allows easily having separate directories for Rust-Analyzer and Cargo itself, allowing concurrent builds (technically already doable with arguments/env vars but `CARGO_TARGET_DIR` collides all projects into big target dir, leading to frequent recompilation because of conflicting features and locking builds) 1. Allows using a different disk, partition or mount point for cargo artifacts 1. Avoids having to set `CARGO_TARGET_DIR` for every project to get the same effect as proposed here From a7af62361ee81630b1563041c48a87a0982fd0f5 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 25 Jun 2023 21:31:20 +0200 Subject: [PATCH 26/67] Add drawback about windows path length limits --- text/3371-cargo-target-base-dir.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 8b7a764711e..5e48e1f5337 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -193,6 +193,12 @@ This introduces one more option to look at to find the target directory, which m This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_BASE_DIR`, so an external tool can set it and go on its way. Also, having `cargo` set `CARGO_TARGET_DIR` when inside a workspace where `CARGO_TARGET_BASE_DIR` is used will help current tools (those not yet using `cargo metadata`) continue working without trouble. +## Hitting windows path length limits + +Depending on what naming scheme is used (e.g., a very long hash), we could hit the Windows path length limits if not careful. + +A mitigation for this is recommending a short prefix (in `CARGO_TARGET_BASE_DIR`) and using a has that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. + # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives From 001f27f70c6dd0022eef4fce5c6a2d3a155c5460 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 25 Jun 2023 21:35:20 +0200 Subject: [PATCH 27/67] keeping local target dir even if the default changes --- text/3371-cargo-target-base-dir.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 5e48e1f5337..eafe882c2eb 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -369,3 +369,11 @@ Well, first, heavy advertising of the option. Very **heavy** advertising of it a 4. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to directory where there is no `CARGO_TARGET_DIR` and no `target/` dir locally I expect this to take a long time and `cargo` would need to be very clear about where it puts the target directory, probably through a simple command like `cargo metadata --print-target-dir` to make it easy for CIs and scripts to use it programatically without having to parse JSON all the time like a simple `cargo metadata` would do. + +### A user still wants the current behaviour and not this RFC's + +- We could use a config option (mentioned in previous section) +- `CARGO_TARGET_DIR=target` would still be available +- We could add special behaviour like `CARGO_TARGET_BASE_DIR=""` meaning "use current" directory + +The first two are probably enough, the third is a bandaid. From e3fda640b24ea262a0ec0be40eda8b71d1ce18c2 Mon Sep 17 00:00:00 2001 From: Poliorcetics Date: Tue, 27 Jun 2023 11:14:11 +0200 Subject: [PATCH 28/67] Typo fix Co-authored-by: Jacob Lifshay --- text/3371-cargo-target-base-dir.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index eafe882c2eb..dc597533404 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -197,7 +197,7 @@ This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_B Depending on what naming scheme is used (e.g., a very long hash), we could hit the Windows path length limits if not careful. -A mitigation for this is recommending a short prefix (in `CARGO_TARGET_BASE_DIR`) and using a has that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. +A mitigation for this is recommending a short prefix (in `CARGO_TARGET_BASE_DIR`) and using a hash that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives From 5d33e78e9b04baf36eb57513301659983a4d3285 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 8 Jul 2023 18:45:12 +0200 Subject: [PATCH 29/67] introduce backlinks and expand on them --- text/3371-cargo-target-base-dir.md | 41 ++++++++++++++++-------------- 1 file changed, 22 insertions(+), 19 deletions(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index dc597533404..4183d2d0188 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -182,6 +182,18 @@ In the same vein, `cargo metadata` will fill the target directory information wi Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` will delete `CARGO_TARGET_BASE_DIR//` and not `target/`. +### Providing backlinks + +`targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. + +`cargo` does not for `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). + +When `CARGO_TARGET_BASE_DIR` is used (in any form) and not superseded by other configurations (`CARGO_TARGET_DIR`), it *will* use a backlink by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if `CARGO_TARGET_BASE_DIR` and `CARGO_TARGET_DIR` weren't set at all. + +This has a two big advantages: not breaking external tools and giving easy access to artifacts produced by `cargo build/test/doc` to users (they're in the habit of typing `./target/debug/my-bin`, this would continue working with backlinks). + +A config option (CLI, `config.toml` and env var), `link-target-dir`, will be introduced to deactivate this behaviour, but it will `true` by default, for the reasons provided in favor of backlinks just above. + # Drawbacks [drawbacks]: #drawbacks @@ -199,6 +211,10 @@ Depending on what naming scheme is used (e.g., a very long hash), we could hit t A mitigation for this is recommending a short prefix (in `CARGO_TARGET_BASE_DIR`) and using a hash that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. +## Backlinks + +There a few cases where a symlink instead of a real dir will break programs: at least SQLite can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since `CARGO_TARGET_BASE_DIR` won't become a default in this RFC, we are not breaking any existing use cases. + # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -255,12 +271,6 @@ On the other hand, `targo` is already here and working for at least one person, [source]: https://github.com/rust-lang/cargo/issues/11156#issuecomment-1285951209 -### Providing backlinks - -`targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. - -`cargo` does not for `CARGO_TARGET_DIR` and will not do it either for `CARGO_TARGET_BASE_DIR` : it provides `cargo metadata` that is the blessed way to obtain the actual target directory and when `CARGO_TARGET_BASE_DIR` is used, it will set `CARGO_TARGET_DIR` on all invocation within a workspace (if not already set) to make it easy to obtain the target directory for simple task (e.g. a test needing to launch another binary in the repo). - ## Remapping [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) was originally about remapping the target directory, not about having a central one but reading the issue, there seems to be no needs for more than the simple redefinition of the target directory proposed in this document. In the future, if `CARGO_TARGET_DIR_REMAP` is introduced, it could be used to be the prefix to the target directory like so: @@ -317,6 +327,7 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> +- Do we want to add backlinks for `CARGO_TARGET_DIR` too in the process of this RFC ? - Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for us ? # Future possibilities @@ -353,26 +364,18 @@ This option has several complications I'm not sure how to resolve: - Subsequently, when platform defaults are decided, how do we ensure a new platform has a good default too ? - `CARGO_HOME` is already criticized for being both a *cache* and a *config* home (using the XDG spec semantics), adding more local cache to it in the form of `CARGO_HOME/target-base-dir/` would not improve the situation and should probably not be done, but, if no good alternatives are found, there is precedent to use it for this. 2. How do we communicate on said default values ? -3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. - -I see `CARGO_TARGET_BASE_DIR` as `cargo`-specific `XDG_CACHE_DIR` variable: if unset, using a local `target/` is what *all* of the Web has documented for years, it's what people expect, it's what tools expect, it makes it easy to reorganize Rust projects by just moving them around in directories, ... - -`CARGO_TARGET_BASE_DIR` is more intended for CI, peoples with specific setups or people wanting to dive just a little deeper in their directory management, but just like `npm` puts `.node_modules/` in the project's workspace directory by default, I think `cargo` should also keep this as the default behaviour. +3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. Note that having backlinks active by default (see relevant section earlier in the RFC) will help offset a lot of the problems here. ### We really want to do this, how do we do it ? -Well, first, heavy advertising of the option. Very **heavy** advertising of it and of `cargo-metadata` for all interactions with the cargo `target/` directory. After that, it will be a waiting game: - -1. Wait for tools to be updated -2. Wait for MSRVs to catch up to the cargo version introducing the option and its defaults (this could take a year or ten, I have no idea) -3. Add a config option to `cargo` to make it the default behaviour (when there is not already a `target/` dir locally is `CARGO_TARGET_DIR` set), something like `use-user-wide-cargo-target-base-dir-by-default = true` and communicate so that people can try it out -4. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to directory where there is no `CARGO_TARGET_DIR` and no `target/` dir locally +Well, first, advertising of the option and its behaviour, as well as the backlink behaviour (so people know we're not just breaking tools for fun). After that, it becomes necessary to test it quite heavily to really ensure nothing has broken irremediably. -I expect this to take a long time and `cargo` would need to be very clear about where it puts the target directory, probably through a simple command like `cargo metadata --print-target-dir` to make it easy for CIs and scripts to use it programatically without having to parse JSON all the time like a simple `cargo metadata` would do. +1. Introduce it as the default behaviour in nightly, wait for a few stable releases so that beta has it too and it can start being used for a few months after that so that esoteric setups can also try it. +2. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to directory where there is no `CARGO_TARGET_DIR` config set. ### A user still wants the current behaviour and not this RFC's -- We could use a config option (mentioned in previous section) +- We could use a config option to entirely deactivate it - `CARGO_TARGET_DIR=target` would still be available - We could add special behaviour like `CARGO_TARGET_BASE_DIR=""` meaning "use current" directory From 46f4b806d10a3f5d874535491a6c01788dc581c2 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 8 Jul 2023 18:48:00 +0200 Subject: [PATCH 30/67] mention garbage collection in future possibilities --- text/3371-cargo-target-base-dir.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 4183d2d0188..b4ba2ed726d 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -355,6 +355,7 @@ The section merely provides additional information. - Allowing relative paths: I feel this is counter-productive to the stated goal and have thought of no use for it, but it's entirely possible someone else will. - Introduce remapping into the concept in some way. +- Introduce a form of garbage collection. Expanded upon this [Zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/last-use.20tracking). ## Use `CARGO_TARGET_BASE_DIR` as the default instead of `target` From 10e2d190161c80581f287be82dc6b23f71970455 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 3 Sep 2023 20:18:35 +0200 Subject: [PATCH 31/67] nit: fix unclear text --- text/3371-cargo-target-base-dir.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index b4ba2ed726d..582ac6252d3 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -186,7 +186,7 @@ Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. -`cargo` does not for `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). +`cargo` does not provide them for `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). When `CARGO_TARGET_BASE_DIR` is used (in any form) and not superseded by other configurations (`CARGO_TARGET_DIR`), it *will* use a backlink by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if `CARGO_TARGET_BASE_DIR` and `CARGO_TARGET_DIR` weren't set at all. From c386c979f6a2f8103e71214a0c326ede1774e9ed Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 3 Sep 2023 20:29:03 +0200 Subject: [PATCH 32/67] clarify text about setting `CARGO_TARGET_DIR` for cargo calls --- text/3371-cargo-target-base-dir.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 582ac6252d3..45279eb6597 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -174,10 +174,12 @@ A proposed solution for `cargo` to split the hash into something like `$CARG_TAR ## Impact on `cargo ...` calls -When calling `cargo` where `CARGO_TARGET_BASE_DIR` is active, `CARGO_TARGET_DIR` is set by all `cargo` calls that happen in a Cargo workspace, as much as possible. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool would have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today. +When calling `cargo` where `CARGO_TARGET_BASE_DIR` is active, a `CARGO_TARGET_DIR` is set by all `cargo` calls. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool would have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today. In the same vein, `cargo metadata` will fill the target directory information with the absolute path and make no mention of `CARGO_TARGET_BASE_DIR` since it can only be used in a single workspace at once. +Example: calling `cargo install` can initially work outside of a workspace by installing a binary from *crates.io*, *git* or other sources: in this case, `cargo` will use the manifest path used when compiling the crate (from somewhere inside `CARGO_HOME` I suppose) to create the hashed path to use inside `CARGO_TARGET_BASE_DIR`. + ### `cargo clean` Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` will delete `CARGO_TARGET_BASE_DIR//` and not `target/`. From a519e7cdfce0f1d8f889731dcb3cc781faf85229 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 4 Sep 2023 13:37:01 +0200 Subject: [PATCH 33/67] clarify backlinks and forward links, I misunderstood them at first --- text/3371-cargo-target-base-dir.md | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 45279eb6597..7b4ce02d96e 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -184,17 +184,29 @@ Example: calling `cargo install` can initially work outside of a workspace by in Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` will delete `CARGO_TARGET_BASE_DIR//` and not `target/`. -### Providing backlinks +### Providing forward links -`targo` provides backlink (it links from it's own target directory to `/target`) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. +`targo` provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. `cargo` does not provide them for `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). -When `CARGO_TARGET_BASE_DIR` is used (in any form) and not superseded by other configurations (`CARGO_TARGET_DIR`), it *will* use a backlink by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if `CARGO_TARGET_BASE_DIR` and `CARGO_TARGET_DIR` weren't set at all. +When `CARGO_TARGET_BASE_DIR` is used (in any form) and not superseded by other configurations (`CARGO_TARGET_DIR`), it *will* use a forward link by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if `CARGO_TARGET_BASE_DIR` and `CARGO_TARGET_DIR` weren't set at all. + +This has a two big advantages: not breaking external tools and giving easy access to artifacts produced by `cargo build/test/doc` to users (they're in the habit of typing `./target/debug/my-bin`, this would continue working with forward links). + +A config option (CLI, `config.toml` and env var), `link-target-dir`, will be introduced to deactivate this behaviour, but it will `true` by default, for the reasons provided in favor of forward links just above. -This has a two big advantages: not breaking external tools and giving easy access to artifacts produced by `cargo build/test/doc` to users (they're in the habit of typing `./target/debug/my-bin`, this would continue working with backlinks). +#### Detailed working of forward links + +When creating a forward link `cargo` will first attempt to create a symbolic link (regardless of the platform). If that fails, it will attempt zero or more platform-specific solutions, like junction points on NTFS. If that fails too, a warning will be emitted but this will not prevent the rest of the action to go on: regular calls like `cargo check/clippy/build/test` likely won't need this forward link and after the user has been warned they could either resolve the problem themselves or ignore it, depending on their own use case and domain-specific knowledge. + +### Providing backlinks -A config option (CLI, `config.toml` and env var), `link-target-dir`, will be introduced to deactivate this behaviour, but it will `true` by default, for the reasons provided in favor of backlinks just above. +Backlinks are metadata in `CARGO_TARGET_BASE_DIR` that links directories back to the workspace they came from. + +`targo` uses them in its own form of the feature and `cargo` will use them too in `CARGO_TARGET_BASE_DIR`. + +While details of the stored data should probably be left to the implementation (there is no need for `cargo` to expose this data directly, though it could be exposed through `cargo metadata` in the future, see the relevant section later), one could imagine using it to clean target directories whose corresponding workspace does not exist anymore when calling something like `cargo clean --all-workspaces` (doing it automatically is not possible, else any workspace on external disks would have its target directory cleaned up each time the disk is unmounted, which is probably way too aggressive a default). # Drawbacks [drawbacks]: #drawbacks @@ -213,7 +225,7 @@ Depending on what naming scheme is used (e.g., a very long hash), we could hit t A mitigation for this is recommending a short prefix (in `CARGO_TARGET_BASE_DIR`) and using a hash that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. -## Backlinks +## Forward links There a few cases where a symlink instead of a real dir will break programs: at least SQLite can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since `CARGO_TARGET_BASE_DIR` won't become a default in this RFC, we are not breaking any existing use cases. @@ -383,3 +395,9 @@ Well, first, advertising of the option and its behaviour, as well as the backlin - We could add special behaviour like `CARGO_TARGET_BASE_DIR=""` meaning "use current" directory The first two are probably enough, the third is a bandaid. + +### Expose `CARGO_TARGET_BASE_DIR` metadata + +`cargo` will use backlinks in an implementation-defined form to keep track in the `CARGO_TARGET_BASE_DIR` of the relation from a target directory to its source workspace. + +In the future, we could envisage letting external tools and users access this data in a well-defined form through `cargo metadata`. From 50251e4a4288e613379a9dce8a8bfb3c6b4df02f Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 4 Sep 2023 23:26:23 +0200 Subject: [PATCH 34/67] cleanup of RFC text to make less use of 'I', 'we', add a link to targo and clarify in response to Ehuss' question --- text/3371-cargo-target-base-dir.md | 72 +++++++++++++++--------------- 1 file changed, 36 insertions(+), 36 deletions(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index 7b4ce02d96e..d06ccef9c42 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -74,9 +74,9 @@ Consider this directory tree: `cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts directly in `/cargo-cache/debug/...` -A subsequent `cargo build` in `perso-1` will work with the same artifact, potentially having conflicting features for dependencies for example. +A subsequent `cargo build` in `perso-1` works with the same artifact, potentially having conflicting features for dependencies for example. -A `cargo clean` will delete the entire `/cargo-cache` directory, for all projects at once. +A `cargo clean` deletes the entire `/cargo-cache` directory, for all projects at once. It's possible to produce invalid state in the target dir by having unrelated projects writing in the same place. @@ -86,9 +86,9 @@ It's not possible to have to projects building at once because Cargo locks its t `cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache//debug/...` (where `project-path` is a directory or several chained directories unique to the workspace, with an unspecified naming scheme). -A `cargo build` in `perso-1` will produce new artifacts in `/cargo-cache//debug/...`. +A `cargo build` in `perso-1` produces new artifacts in `/cargo-cache//debug/...`. -A `cargo clean` will only remove the `/cargo-cache//` subdirectory, not all the artifacts. +A `cargo clean` only removed the `/cargo-cache//` subdirectory, not all the artifacts for all other projects that are also in the cache. In this situation, it's much less likely for Cargo to produce invalid state without a `build.rs` deliberately writing outside its target directory. @@ -102,7 +102,7 @@ Two projects can be built in parallel without troubles. `CARGO_TARGET_DIR` can be either a relative or absolute path, which makes sense since it's mostly intended for a single project, which can then work from its own position to configure the target directory. -On the other hand `CARGO_TARGET_BASE_DIR` is intended to be used with several projects, possibly completely unrelated to each other. As such, it does not accept relative paths, only absolute ones. If a compelling use case is present for a relative path, it can added in the future as a backward-compatible change. +On the other hand `CARGO_TARGET_BASE_DIR` is intended to be used with several projects, possibly completely unrelated to each other. As such, it does not accept relative paths, only absolute ones. If a compelling use case is present for a relative path, it can be added in the future as a backward-compatible change. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -168,25 +168,23 @@ When calling `cargo metadata` in the `symlink-to-crate` path, the result contain ### Handle possibly thousands of directories in a single `CARGO_TARGET_BASE_DIR` path -While a single dev machine is unlikely to have enough projects that the scheme of `` will produce enough directories to slow down working in `$CARGO_TARGET_BASE_DIR/`, it could still happen, and notably in private CI, which are often less compartimentalized than public ones. Simple cruft over time (i.e, never calling `cargo clean` over years) could also make it happen, if much slower. +While a single dev machine is unlikely to have enough projects that the naming scheme of `` will produce enough directories to slow down working in `$CARGO_TARGET_BASE_DIR/`, it could still happen, and notably in private CI, which are often less compartimentalized than public ones. Simple cruft over time (i.e, never calling `cargo clean` over years) could also make it happen, if much slower. -A proposed solution for `cargo` to split the hash into something like `$CARG_TARGET_BASE_DIR/hash[:2]/hash[2:4]/hash[4:]/...`. Since the naming scheme is considered an implementation detail, if this prove insufficient it could be change in a subsequent version of `cargo`. +To prevent this, `cargo` splits the hash into something like `$CARG_TARGET_BASE_DIR/hash[:2]/hash[2:4]/hash[4:]/...`. Since the naming scheme is considered an implementation detail, if this prove insufficient it could be change in a subsequent version of `cargo`. ## Impact on `cargo ...` calls -When calling `cargo` where `CARGO_TARGET_BASE_DIR` is active, a `CARGO_TARGET_DIR` is set by all `cargo` calls. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool would have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today. +When calling `cargo` with a builtin call (e.g., `build`, `check` or `test`) where `CARGO_TARGET_BASE_DIR` is active, `cargo` will first resolve the effective `CARGO_TARGET_DIR` and then proceed with the command as if `CARGO_TARGET_DIR` had been set directly. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool will have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today, to learn about the effective target directory. -In the same vein, `cargo metadata` will fill the target directory information with the absolute path and make no mention of `CARGO_TARGET_BASE_DIR` since it can only be used in a single workspace at once. - -Example: calling `cargo install` can initially work outside of a workspace by installing a binary from *crates.io*, *git* or other sources: in this case, `cargo` will use the manifest path used when compiling the crate (from somewhere inside `CARGO_HOME` I suppose) to create the hashed path to use inside `CARGO_TARGET_BASE_DIR`. +In the same vein, `cargo metadata` fills the target directory information with the absolute path and make no mention of `CARGO_TARGET_BASE_DIR` since it can only be used with a single workspace at once. ### `cargo clean` -Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` will delete `CARGO_TARGET_BASE_DIR//` and not `target/`. +Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` deletes `CARGO_TARGET_BASE_DIR//` and not `target/`. ### Providing forward links -`targo` provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. +[`targo`][tg] provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. `cargo` does not provide them for `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). @@ -202,11 +200,11 @@ When creating a forward link `cargo` will first attempt to create a symbolic lin ### Providing backlinks -Backlinks are metadata in `CARGO_TARGET_BASE_DIR` that links directories back to the workspace they came from. +Backlinks are metadata in `CARGO_TARGET_BASE_DIR` that links target directories back to the workspace they came from. -`targo` uses them in its own form of the feature and `cargo` will use them too in `CARGO_TARGET_BASE_DIR`. +[`targo`][tg] uses them in its own form of the feature and `cargo` uses them too in `CARGO_TARGET_BASE_DIR`. -While details of the stored data should probably be left to the implementation (there is no need for `cargo` to expose this data directly, though it could be exposed through `cargo metadata` in the future, see the relevant section later), one could imagine using it to clean target directories whose corresponding workspace does not exist anymore when calling something like `cargo clean --all-workspaces` (doing it automatically is not possible, else any workspace on external disks would have its target directory cleaned up each time the disk is unmounted, which is probably way too aggressive a default). +While details of the stored data are left to the implementation (there is no need for `cargo` to expose this data directly, though it could be exposed through `cargo metadata` in the future, see the relevant section below), one could imagine using it to clean target directories whose corresponding workspace does not exist anymore when calling something like `cargo clean --all-workspaces` (doing it automatically is not possible, else any workspace on external disks would have its target directory cleaned up each time the disk is unmounted, which is way too aggressive a default). # Drawbacks [drawbacks]: #drawbacks @@ -217,7 +215,7 @@ While details of the stored data should probably be left to the implementation ( This introduces one more option to look at to find the target directory, which may complicate the life of external tools. -This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_BASE_DIR`, so an external tool can set it and go on its way. Also, having `cargo` set `CARGO_TARGET_DIR` when inside a workspace where `CARGO_TARGET_BASE_DIR` is used will help current tools (those not yet using `cargo metadata`) continue working without trouble. +This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_BASE_DIR`, so an external tool can set it and go on its way. For tools not yet using `cargo metadata`, they'll be able to depend in the forward link provided by default by `cargo` when using `CARGO_TARGET_BASE_DIR`. ## Hitting windows path length limits @@ -227,7 +225,7 @@ A mitigation for this is recommending a short prefix (in `CARGO_TARGET_BASE_DIR` ## Forward links -There a few cases where a symlink instead of a real dir will break programs: at least SQLite can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since `CARGO_TARGET_BASE_DIR` won't become a default in this RFC, we are not breaking any existing use cases. +There a few cases where a symlink instead of a real dir will break programs: at least SQLite 3 can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since `CARGO_TARGET_BASE_DIR` won't become a default in this RFC, we are not breaking any existing use cases. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -245,13 +243,13 @@ It is already possible today to use `CARGO_TARGET_DIR` to remap workspaces and p 1) If done globally, the `CARGO_TARGET_DIR` becomes a hodge-podge of every project, which is not often the goal. 2) If done per-project, it is very cumbersome to maintain. -3) [`targo`](https://github.com/sunshowers/targo) by @sunshowers +3) [`targo`][tg] by @sunshowers 4) [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) 5) The upcoming `cargo script` command needs someplace to put its cache and having a dedicated directory for that would be nice. -`targo` and the cargo issue express a need for either remapping or a global target directory that is not shared between different Cargo workspaces. +[`targo`][tg] and the cargo issue express a need for either remapping or a global target directory that is not shared between different Cargo workspaces. -For those reason, this option has not been retained and the `targo` tool is discussed more in details below. +For those reason, this option has not been retained and the [`targo`][tg] tool is discussed more in details below. ## Using `XDG_CACHE_HOME` instead of a cargo-specific env-var @@ -263,25 +261,25 @@ There are already lots of discussion about `.cargo` and `.rustup` being home to ## Stabilize the naming scheme -I feel this require an hard-to-break naming scheme (a recent hash algorithm should be good enough in 99% of the cases but collisions are always possible), which I don't have the skills nor motivation to design. Instead, I prefer explicitely telling the naming scheme is not to be considered stable and allow more invested people to experiment with the feature and find something solid. +This require an hard-to-break naming scheme (a recent hash algorithm should be good enough in 99% of the cases but collisions are always possible), which is something the `cargo` team probably does not want to offer guarantees about. Instead, explicitely telling the naming scheme is not to be considered stable allows more invested people to experiment with the feature and find something solid if stability proves itself necessary. -What's more, by explicitely not stabilizing it (and maybe voluntarily changing it between versions sometimes, since a version change recompiles everything anyway ?) we can instead reroute people and tools towards `CARGO_TARGET_DIR` / `cargo metadata` instead, which are much more likely to be suited to their use case if they need the path to the target directory. +What's more, by explicitely not stabilizing it (and maybe voluntarily changing it between versions sometimes, since a version change recompiles everything anyway ?) `cargo` can instead reroute people and tools towards `CARGO_TARGET_DIR` / `cargo metadata` instead, which are much more likely to be suited to their use case if they need the path to the target directory. ## Just use `targo` -While a very nice tool, [`targo`](https://github.com/sunshowers/targo) is not integrated with `cargo` and has a few shortcomings: +While a very nice tool, [`targo`][tg] is not integrated with `cargo` and has a few shortcomings: -- It uses symlinks, which are not always handled well by other tools. Specifically, since it's not integrated inside `cargo`, it uses a `target` symlink to avoid having to remap `cargo`'s execution using `CARGO_TARGET_DIR` and such,making it less useful for external build tools that would use this functionality. Using such a symlink also means `cargo clean` does not work, it just removes the symlink and not the data. +- It uses symlinks, which are not always handled well by other tools. Specifically, since it's not integrated inside `cargo`, it uses a `target` symlink to avoid having to remap `cargo`'s execution using `CARGO_TARGET_DIR` and such,making it less useful for external build tools that would use this functionality. Using such a symlink without setting the `CARGO_TARGET_DIR` env var also means `cargo clean` does not work, it just removes the symlink and not the data. - It completely ignores `CARGO_TARGET_DIR`-related options, which again may break workflows. - It needs more metadata to work well, which means an external tool using it would have to understand that metadata too. - It uses `$CARGO_HOME/targo` to place its cache, making it less useful for external build tools and people wanting to separate caches and configuration. - It needs to intercept `cargo`'s arguments, making it more brittle than an integrated solution. -- Its naming scheme is a base58-encoded blake3 hash of the workspace directory ([source]), making it unusable by human and not taking into account the use case of thousands of target directories within `$CARGO_HOME/targo`. -- It uses the workspace root dir and not manifest, which means a `targo script` would share cache between all the scripts in a directory, which may not be the desired effect. +- Its naming scheme is a base58-encoded blake3 hash of the workspace directory ([source]), not taking into account the use case of thousands of target directories within `$CARGO_HOME/targo`. +- It uses the workspace root dir and not manifest, which means a `targo script` would share cache between all the scripts (`cargo script`) in a directory, which may not be the desired effect. -Some of those could be fixed of course, and I don't expect `cargo`'s `--target-dir` and `--manifest-path` to change or disappear anytime soon, but still, it could happen. An external tool like `targo` will never be able to solve some of these or ensure forward compatibility as well as the solution proposed in this RFC. +Some of those could be fixed of course, and I don't expect `cargo`'s `--target-dir` and `--manifest-path` to change or disappear anytime soon, but still, it could happen. An external tool like [`targo`][tg] will never be able to solve some of these or ensure forward compatibility as well as the solution proposed in this RFC. -On the other hand, `targo` is already here and working for at least one person, making it the most viable alternative for now. +On the other hand, [`targo`][tg] is already here and working for at least one person, making it the most viable alternative for now. [source]: https://github.com/rust-lang/cargo/issues/11156#issuecomment-1285951209 @@ -328,9 +326,9 @@ The `outputRoot` can be overridden using `--output_base=...` (this is `$CARGO_TA It should be noted that `bazel` is integrated with [remote caching](https://bazel.build/remote/caching) and has different needs from `cargo`, the latter only working locally. -**Conclusion**: `bazel` shows that a hash-based workflow seems to work well enough, making an argument for the use of it in `cargo` too. It also uses the current user, to prevent attacks by having compiled a program as root and making the directory accessible to other users later on by also compiling there for them. `cargo` could also do this, though I do not know what happens when `--output_user_root` is set to the same path for two different users. +**Conclusion**: `bazel` shows that a hash-based workflow seems to work well enough, making an argument for the use of it in `cargo` too. It also uses the current user, to prevent attacks by having compiled a program as root and making the directory accessible to other users later on by also compiling there for them. `cargo` could also do this, though it is not clear what happens when `--output_user_root` is set to the same path for two different users. -*Note: I looked at Bazel 5.4.0, the latest stable version as of this writing, things may change in the future or be different for older versions.* +*Note: Bazel 5.4.0 was used as the reference, the latest stable version as of this writing, things may change in the future or be different for older versions.* # Unresolved questions [unresolved-questions]: #unresolved-questions @@ -341,8 +339,8 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> -- Do we want to add backlinks for `CARGO_TARGET_DIR` too in the process of this RFC ? -- Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for us ? +- Do we want to add forward links for `CARGO_TARGET_DIR` too in the process of this RFC ? +- Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for the latter ? # Future possibilities [future-possibilities]: #future-possibilities @@ -367,7 +365,7 @@ in the section on motivation or rationale in this or subsequent RFCs. The section merely provides additional information. --> -- Allowing relative paths: I feel this is counter-productive to the stated goal and have thought of no use for it, but it's entirely possible someone else will. +- Allowing relative paths: this is counter-productive to the stated goal and their is not obvious use for it, but it's entirely possible someone will. - Introduce remapping into the concept in some way. - Introduce a form of garbage collection. Expanded upon this [Zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/last-use.20tracking). @@ -379,11 +377,11 @@ This option has several complications I'm not sure how to resolve: - Subsequently, when platform defaults are decided, how do we ensure a new platform has a good default too ? - `CARGO_HOME` is already criticized for being both a *cache* and a *config* home (using the XDG spec semantics), adding more local cache to it in the form of `CARGO_HOME/target-base-dir/` would not improve the situation and should probably not be done, but, if no good alternatives are found, there is precedent to use it for this. 2. How do we communicate on said default values ? -3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. Note that having backlinks active by default (see relevant section earlier in the RFC) will help offset a lot of the problems here. +3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. Note that having forward links active by default (see relevant section earlier in the RFC) will help offset a lot of the problems here. ### We really want to do this, how do we do it ? -Well, first, advertising of the option and its behaviour, as well as the backlink behaviour (so people know we're not just breaking tools for fun). After that, it becomes necessary to test it quite heavily to really ensure nothing has broken irremediably. +Well, first, advertising of the option and its behaviour, as well as the forward link behaviour (so people know we're not just breaking tools for fun). After that, it becomes necessary to test it quite heavily to really ensure nothing has broken irremediably. 1. Introduce it as the default behaviour in nightly, wait for a few stable releases so that beta has it too and it can start being used for a few months after that so that esoteric setups can also try it. 2. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to directory where there is no `CARGO_TARGET_DIR` config set. @@ -401,3 +399,5 @@ The first two are probably enough, the third is a bandaid. `cargo` will use backlinks in an implementation-defined form to keep track in the `CARGO_TARGET_BASE_DIR` of the relation from a target directory to its source workspace. In the future, we could envisage letting external tools and users access this data in a well-defined form through `cargo metadata`. + +[tg]: https://github.com/sunshowers/targo From 0e5d30e000d77825ef0937ed12b89eadc1d83558 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 17 Feb 2024 22:27:52 +0100 Subject: [PATCH 35/67] feat: move to templating `CARGO_TARGET_DIR` instead of a new config `CARGO_TARGET_BASE_DIR` --- text/3371-cargo-target-base-dir.md | 121 ++++++++++++++--------------- 1 file changed, 58 insertions(+), 63 deletions(-) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index d06ccef9c42..f4830a21e0f 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -1,4 +1,4 @@ -- Feature Name: `cargo_target_base_dir` +- Feature Name: `cargo_target_dir_templates` - Start Date: 2023-01-12 - RFC PR: [rust-lang/rfcs#3371](https://github.com/rust-lang/rfcs/pull/3371) @@ -8,7 +8,7 @@ -Introduce a new configuration option for cargo to tell it to move the workspace's target directory into a workspace-specific subdirectory of the configured absolute path, named `CARGO_TARGET_BASE_DIR`. +Introduce templating to `CARGO_TARGET_DIR` to have `cargo` adapts its target directory dynamically depending on (at least) the manifest's path. # Motivation [motivation]: #motivation @@ -39,13 +39,21 @@ Explain the proposal as if it was already included in the language and you were For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms. --> -For a single project, it is possible to use the `CARGO_TARGET_DIR` environment variable (or the `target-dir` TOML config option or the `--target-dir` command-line flag) to change the position of the `target/` directory used for build artifacts during compilation with Cargo. +For a single project, it is possible to use the `CARGO_TARGET_DIR` environment variable (or the `target-dir` TOML config option or the `--target-dir` command-line flag) with either an absolute or relative path to change the position of the `target/` directory used for build artifacts during compilation with Cargo. While this option is useful for single-project environments (simple CI builds, builds through other build systems like Meson or Bazel), in multi-projects environment, like personal machines or repos with multiple workspaces, it conflates every build directory under the configured path: `CARGO_TARGET_DIR` directly replaces the `/target/` directory. -`CARGO_TARGET_BASE_DIR` (or the `target-base-dir` TOML option or the `--target-base-dir` command-line flag) instead acts as a parent for those `target` directories. +Templating introduces one new templating key for `CARGO_TARGET_DIR`, in the same spirit as [the index configuration format][icf]: -Below is an example of the behavior with `CARGO_TARGET_DIR` versus the one with `CARGO_TARGET_BASE_DIR`: +- `{manifest-path-hash}`: a hash of the manifest's absolute path as a path. This is **not** an absolute path. + +It can be used like this: `CARGO_TARGET_DIR="$HOME/.cache/cargo-target-dirs/{manifest-path-hash}"`. + +When compiling `/home/ferris/src/cargo/` with user `ferris`, `manifest-path-hash` would be something like `ab/cd/` and the artifacts would be found in `/home/ferris/.cache/cargo-target-dirs/ab/cd//...`. Note the hash used and the path derived from that for `{manifest-path-hash}` are implementation details and the values here are just example. + +Below is an example of the behavior with untemplated versus templated forms: + +[icf]: https://doc.rust-lang.org/cargo/reference/registry-index.html#index-configuration ## Example @@ -82,27 +90,21 @@ It's possible to produce invalid state in the target dir by having unrelated pro It's not possible to have to projects building at once because Cargo locks its target directory during builds. -#### With `CARGO_TARGET_BASE_DIR=/cargo-cache` +#### With `CARGO_TARGET_DIR=/cargo-cache/{manifest-path-hash}` -`cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache//debug/...` (where `project-path` is a directory or several chained directories unique to the workspace, with an unspecified naming scheme). +`cd /Users/poliorcetics/work/work-project && cargo build` produces artifacts in `/cargo-cache//debug/...` (where `manifest-path-hash` is a directory or several chained directories unique to the workspace, with an unspecified naming scheme). -A `cargo build` in `perso-1` produces new artifacts in `/cargo-cache//debug/...`. +A `cargo build` in `perso-1` produces new artifacts in `/cargo-cache//debug/...`. -A `cargo clean` only removed the `/cargo-cache//` subdirectory, not all the artifacts for all other projects that are also in the cache. +A `cargo clean` only removed the `/cargo-cache//` subdirectory, not all the artifacts for all other projects that are also in the cache. In this situation, it's much less likely for Cargo to produce invalid state without a `build.rs` deliberately writing outside its target directory. Two projects can be built in parallel without troubles. -#### With both set - -`CARGO_TARGET_DIR` was present long before `CARGO_TARGET_BASE_DIR`: backward compatibility is important, so the first always trumps the second, there is no mixing going on. - #### Absolute and relative paths -`CARGO_TARGET_DIR` can be either a relative or absolute path, which makes sense since it's mostly intended for a single project, which can then work from its own position to configure the target directory. - -On the other hand `CARGO_TARGET_BASE_DIR` is intended to be used with several projects, possibly completely unrelated to each other. As such, it does not accept relative paths, only absolute ones. If a compelling use case is present for a relative path, it can be added in the future as a backward-compatible change. +`CARGO_TARGET_DIR` can be either a relative or absolute path, which makes sense since it's mostly intended for a single project, which can then work from its own position to configure the target directory, and that stays the case with templates. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -117,39 +119,29 @@ This is the technical portion of the RFC. Explain the design in sufficient detai The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work. --> -## Setting `CARGO_TARGET_BASE_DIR` +## Templating values of `CARGO_TARGET_DIR` -The option is similar to `CARGO_TARGET_DIR` and can be set in the same places. From less to most specific: +Templating does not interfere with the resolution order of `CARGO_TARGET_DIR`. From less to most specific: - Through the `config.toml`: ```toml [build] - target-base-dir = "/absolute/path/to/target/directories" + target-base-dir = "/absolute/path/to/cache/{manifest-path-hash}" ``` -- Through the environment variable: `CARGO_TARGET_BASE_DIR="/absolute/path/to/target/directories" cargo build` -- Through the command line flag: `cargo build --target-base-dir /absolute/path/to/target/directories` - -The given path must be absolute: setting `CARGO_TARGET_BASE_DIR` to an empty or relative path is an error (when used and not instantly overriden by `CARGO_TARGET_DIR`). - -## Resolution order relative to `CARGO_TARGET_DIR` - -The resolution order favors `CARGO_TARGET_DIR` in all its forms, in the interest of both backward compatibility and allowing overriding for a singular workspace: - -`--target-dir` > `CARGO_TARGET_DIR` > `target-dir = ...` > `--target-base-dir` > `CARGO_TARGET_BASE_DIR` > `target-base-dir = ...` - -See the final section for discussion on how to make `CARGO_TARGET_BASE_DIR` the default. +- Through the environment variable: `CARGO_TARGET_DIR="/absolute/path/to/cache/{manifest-path-hash}" cargo build` +- Through the command line flag: `cargo build --target-dir "/absolute/path/to/cache/{manifest-path-hash}"` ## Naming -In the example in the previous section, using `CARGO_TARGET_BASE_DIR` with `cargo build` produces named subdirectories. The name of those is computed from the full and canonicalized path to the manifest for the workspace, so building `work-project/crate-1` will still use the directory for the whole workspace during a `cargo build` call. +In the example in the previous section, using a templated `CARGO_TARGET_DIR` with `cargo build` produces named subdirectories. The name of those is computed from the full and canonicalized path to the manifest for the workspace, so building `work-project/crate-1` will still use the directory for the whole workspace during a `cargo build` call. This naming scheme is **not considered stable**: the method will probably not change often but `cargo` offers no guarantee and may change it in any release. Tools that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value each time they are invoked. The path used for the naming of the final target directory is the one found *after* symlink resolution: `bazel` does it too and I have not found any complaints about this and it has the distinct advantage of allowing to make a symlink to a project somewhere else on the machine (for example to organise work projects) and avoid duplicating the build directory and all its data (which can be quite heavy). -To prevent collisions by craftings paths, the `` directory will be computed from a hash of the workspace manifest's full path (and possibly other data, for example `bazel` uses its version and the current user too). +To prevent collisions by craftings paths, the `` directory will be computed from a hash of the workspace manifest's full path (and possibly other data, for example `bazel` uses its version and the current user too). ### Symbolic links @@ -164,31 +156,31 @@ In the following situation │ │ ├─ symlink-to-crate/ -> actual-crate/ ``` -When calling `cargo metadata` in the `symlink-to-crate` path, the result contains `"manifest_path": "/Users/poliorcetics/projects/actual-crate/Cargo.toml"` and `"workspace_root":"/Users/poliorcetics/projects/actual-crate"`. This behaviour means that symlinks won't change the final directory used inside `CARGO_TARGET_BASE_DIR`, or in other words: symbolic links are resolved. +When calling `cargo metadata` in the `symlink-to-crate` path, the result contains `"manifest_path": "/Users/poliorcetics/projects/actual-crate/Cargo.toml"` and `"workspace_root":"/Users/poliorcetics/projects/actual-crate"`. This behaviour means that symlinks won't change the final directory used inside `{manifest-path-hash}`, or in other words: symbolic links are resolved. -### Handle possibly thousands of directories in a single `CARGO_TARGET_BASE_DIR` path +### Handle possibly thousands of directories in a single templated `CARGO_TARGET_DIR` path -While a single dev machine is unlikely to have enough projects that the naming scheme of `` will produce enough directories to slow down working in `$CARGO_TARGET_BASE_DIR/`, it could still happen, and notably in private CI, which are often less compartimentalized than public ones. Simple cruft over time (i.e, never calling `cargo clean` over years) could also make it happen, if much slower. +While a single dev machine is unlikely to have enough projects that the naming scheme of `` will produce enough directories to slow down working in `$CARGO_TARGET_DIR/`, it could still happen, and notably in private CI, which are often less compartimentalized than public ones. Simple cruft over time (i.e, never calling `cargo clean` over years) could also make it happen, if much slower. -To prevent this, `cargo` splits the hash into something like `$CARG_TARGET_BASE_DIR/hash[:2]/hash[2:4]/hash[4:]/...`. Since the naming scheme is considered an implementation detail, if this prove insufficient it could be change in a subsequent version of `cargo`. +To prevent this, `cargo` splits the hash into something like `$CARGO_TARGET_DIR/hash[:2]/hash[2:4]/hash[4:]/...`. Since the naming scheme is considered an implementation detail, if this prove insufficient it could be changed in a subsequent version of `cargo`. ## Impact on `cargo ...` calls -When calling `cargo` with a builtin call (e.g., `build`, `check` or `test`) where `CARGO_TARGET_BASE_DIR` is active, `cargo` will first resolve the effective `CARGO_TARGET_DIR` and then proceed with the command as if `CARGO_TARGET_DIR` had been set directly. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool will have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today, to learn about the effective target directory. +When calling `cargo` with a builtin call (e.g., `build`, `check` or `test`) where a templated `CARGO_TARGET_DIR` is active, `cargo` will first resolve the effective `CARGO_TARGET_DIR` and then proceed with the command as if `CARGO_TARGET_DIR` had been set directly. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool will have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today, to learn about the effective target directory. -In the same vein, `cargo metadata` fills the target directory information with the absolute path and make no mention of `CARGO_TARGET_BASE_DIR` since it can only be used with a single workspace at once. +In the same vein, `cargo metadata` fills the target directory information with the absolute path and make no mention of the template in `CARGO_TARGET_DIR` since it can only be used with a single workspace at once. ### `cargo clean` -Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for `CARGO_TARGET_BASE_DIR`: if it set, `cargo clean` deletes `CARGO_TARGET_BASE_DIR//` and not `target/`. +Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for the templated version: if it set, `cargo clean` deletes `/path/to//` and not `target/`. ### Providing forward links -[`targo`][tg] provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no `CARGO_TARGET_DIR` set for them to find the real target directory. +[`targo`][tg] provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no explicit `CARGO_TARGET_DIR` set for them to find the real target directory. -`cargo` does not provide them for `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). +`cargo` does not provide them for regular (untemplated) `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). -When `CARGO_TARGET_BASE_DIR` is used (in any form) and not superseded by other configurations (`CARGO_TARGET_DIR`), it *will* use a forward link by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if `CARGO_TARGET_BASE_DIR` and `CARGO_TARGET_DIR` weren't set at all. +When a templated `CARGO_TARGET_DIR` is used (in any form) (and not overriden, for example a templated env var overriden by a precise `--target-dir` option), it *will* use a forward link by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if the templated `CARGO_TARGET_DIR` wasn't set at all. This has a two big advantages: not breaking external tools and giving easy access to artifacts produced by `cargo build/test/doc` to users (they're in the habit of typing `./target/debug/my-bin`, this would continue working with forward links). @@ -200,9 +192,9 @@ When creating a forward link `cargo` will first attempt to create a symbolic lin ### Providing backlinks -Backlinks are metadata in `CARGO_TARGET_BASE_DIR` that links target directories back to the workspace they came from. +Backlinks are metadata in templated `CARGO_TARGET_DIR` that links target directories back to the workspace they came from. -[`targo`][tg] uses them in its own form of the feature and `cargo` uses them too in `CARGO_TARGET_BASE_DIR`. +[`targo`][tg] uses them in its own form of the feature and `cargo` uses them too for templated `CARGO_TARGET_DIR`. While details of the stored data are left to the implementation (there is no need for `cargo` to expose this data directly, though it could be exposed through `cargo metadata` in the future, see the relevant section below), one could imagine using it to clean target directories whose corresponding workspace does not exist anymore when calling something like `cargo clean --all-workspaces` (doing it automatically is not possible, else any workspace on external disks would have its target directory cleaned up each time the disk is unmounted, which is way too aggressive a default). @@ -215,17 +207,17 @@ While details of the stored data are left to the implementation (there is no nee This introduces one more option to look at to find the target directory, which may complicate the life of external tools. -This is mitigated by having `CARGO_TARGET_DIR` entirely override `CARGO_TARGET_BASE_DIR`, so an external tool can set it and go on its way. For tools not yet using `cargo metadata`, they'll be able to depend in the forward link provided by default by `cargo` when using `CARGO_TARGET_BASE_DIR`. +This is mitigated by the forward link provided by default by `cargo` when using the templated form of `CARGO_TARGET_DIR`. ## Hitting windows path length limits Depending on what naming scheme is used (e.g., a very long hash), we could hit the Windows path length limits if not careful. -A mitigation for this is recommending a short prefix (in `CARGO_TARGET_BASE_DIR`) and using a hash that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. +A mitigation for this is recommending a short prefix (in `CARGO_TARGET_DIR`) and using a hash that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. ## Forward links -There a few cases where a symlink instead of a real dir will break programs: at least SQLite 3 can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since `CARGO_TARGET_BASE_DIR` won't become a default in this RFC, we are not breaking any existing use cases. +There a few cases where a symlink instead of a real dir will break programs: at least SQLite 3 can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since templated `CARGO_TARGET_DIR` won't become a default in this RFC, we are not breaking any existing use cases. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -251,10 +243,6 @@ It is already possible today to use `CARGO_TARGET_DIR` to remap workspaces and p For those reason, this option has not been retained and the [`targo`][tg] tool is discussed more in details below. -## Using `XDG_CACHE_HOME` instead of a cargo-specific env-var - -Not all OSes use the XDG convention, notably Windows and macOS (though the latter can be somewhat made to) and it is very easy to define `CARGO_TARGET_BASE_DIR=${XDG_CACHE_HOME:-~/.cache}/cargo_target_base_dir` if wanted by users. - ## Just put the directories inside of `.cargo/cache/...` There are already lots of discussion about `.cargo` and `.rustup` being home to both cache and config files and why this is annoying for lots of users. What's more, it would not be as helpful to external build tools, they don't care about bringing the registry cache in their build directory for example. @@ -263,7 +251,7 @@ There are already lots of discussion about `.cargo` and `.rustup` being home to This require an hard-to-break naming scheme (a recent hash algorithm should be good enough in 99% of the cases but collisions are always possible), which is something the `cargo` team probably does not want to offer guarantees about. Instead, explicitely telling the naming scheme is not to be considered stable allows more invested people to experiment with the feature and find something solid if stability proves itself necessary. -What's more, by explicitely not stabilizing it (and maybe voluntarily changing it between versions sometimes, since a version change recompiles everything anyway ?) `cargo` can instead reroute people and tools towards `CARGO_TARGET_DIR` / `cargo metadata` instead, which are much more likely to be suited to their use case if they need the path to the target directory. +What's more, by explicitely not stabilizing it (and maybe voluntarily changing it between versions sometimes, since a version change recompiles everything anyway ?) `cargo` can instead reroute people and tools towards untemplated `CARGO_TARGET_DIR` / `cargo metadata` instead, which are much more likely to be suited to their use case if they need the path to the target directory. ## Just use `targo` @@ -288,14 +276,14 @@ On the other hand, [`targo`][tg] is already here and working for at least one pe [rust-lang/cargo#11156](https://github.com/rust-lang/cargo/issues/11156) was originally about remapping the target directory, not about having a central one but reading the issue, there seems to be no needs for more than the simple redefinition of the target directory proposed in this document. In the future, if `CARGO_TARGET_DIR_REMAP` is introduced, it could be used to be the prefix to the target directory like so: - Set `CARGO_TARGET_DIR_REMAP=/home/user/projects=/tmp/cargo-build` -- Compile the crate under `/home/user/projects/foo/` **without** `CARGO_TARGET_DIR` or `CARGO_TARGET_BASE_DIR` set +- Compile the crate under `/home/user/projects/foo/` **without** `CARGO_TARGET_DIR` set - The resulting target directory will be at `/tmp/cargo-build/foo/target` -By making the priority order `CARGO_TARGET_DIR` > `CARGO_TARGET_BASE_DIR` > `CARGO_TARGET_DIR_REMAP` (when all are absolute paths) we would keep backward compatibility. Or we could disallow having the last two set at once, so that they're alternatives and not ordered. +By making the priority order `CARGO_TARGET_DIR` > `CARGO_TARGET_DIR_REMAP` (when both are absolute paths) we would keep backward compatibility. Or we could disallow having the two set at once, so that they're alternatives and not ordered. When `CARGO_TARGET_DIR` is relative, the result could be `/tmp/cargo-build/foo/$CARGO_TARGET_DIR`. -Overall, I feel remapping is much harder to implement well and can be added later without interfering with `CARGO_TARGET_BASE_DIR` (and without this RFC interfering with remapping), though the design space is probably bigger than the one for this RFC. +Overall, I feel remapping is much harder to implement well and can be added later without interfering with templates in `CARGO_TARGET_DIR` (and without this RFC interfering with remapping), though the design space is probably bigger than the one for this RFC. # Prior art [prior-art]: #prior-art @@ -322,7 +310,7 @@ The [`bazel`](https://bazel.build) build system has a similar feature called the The naming scheme is as follow: `/_bazel_$USER/` is the `outputUserRoot`, used for all builds done by `$USER`. Below that, projects are identified by the MD5 hash of the path name of the workspace directory (computed after resolving symlinks). -The `outputRoot` can be overridden using `--output_base=...` (this is `$CARGO_TARGET_BASE_DIR`, the subject of this RFC) and the `outputUserRoot` with `--output_user_root=...` (this is close to using `$CARGO_TARGET_DIR`, already possible in today's `cargo`). +The `outputRoot` can be overridden using `--output_base=...` (this is the untemplated `$CARGO_TARGET_DIR` when it is used with a template) and the `outputUserRoot` with `--output_user_root=...` (this is close to using `$CARGO_TARGET_DIR`, already possible in today's `cargo`). It should be noted that `bazel` is integrated with [remote caching](https://bazel.build/remote/caching) and has different needs from `cargo`, the latter only working locally. @@ -339,7 +327,7 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> -- Do we want to add forward links for `CARGO_TARGET_DIR` too in the process of this RFC ? +- Do we want to add forward links for untemplated `CARGO_TARGET_DIR` too in the process of this RFC ? - Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for the latter ? # Future possibilities @@ -365,11 +353,18 @@ in the section on motivation or rationale in this or subsequent RFCs. The section merely provides additional information. --> -- Allowing relative paths: this is counter-productive to the stated goal and their is not obvious use for it, but it's entirely possible someone will. - Introduce remapping into the concept in some way. - Introduce a form of garbage collection. Expanded upon this [Zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/last-use.20tracking). -## Use `CARGO_TARGET_BASE_DIR` as the default instead of `target` +## Adding `XDG_CACHE_HOME`, `HOME` or `CARGO_HOME` as templates + +Not all OSes use the XDG convention, notably Windows and macOS (though the latter can be somewhat made to) and it is very easy to define `CARGO_TARGET_DIR=${XDG_CACHE_HOME:-~/.cache}/cargo-target-directories/{manifest-path-hash}` if wanted by users. + +It won't work in the `config.toml` but it will work with the environment variable and the command line option, both of which override the TOML config. + +It is certainly possible to add at least `{home}` and `{cargo_home}` but it can be done in the future without interfering at all with `{manifest-path-hash}`, making it relatively easy to defer and add if there is demand. + +## Use templated `CARGO_TARGET_DIR` as the default instead of `target` This option has several complications I'm not sure how to resolve: @@ -384,7 +379,7 @@ This option has several complications I'm not sure how to resolve: Well, first, advertising of the option and its behaviour, as well as the forward link behaviour (so people know we're not just breaking tools for fun). After that, it becomes necessary to test it quite heavily to really ensure nothing has broken irremediably. 1. Introduce it as the default behaviour in nightly, wait for a few stable releases so that beta has it too and it can start being used for a few months after that so that esoteric setups can also try it. -2. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to directory where there is no `CARGO_TARGET_DIR` config set. +2. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to projects where there is no untemplated `CARGO_TARGET_DIR` config set. ### A user still wants the current behaviour and not this RFC's @@ -394,9 +389,9 @@ Well, first, advertising of the option and its behaviour, as well as the forward The first two are probably enough, the third is a bandaid. -### Expose `CARGO_TARGET_BASE_DIR` metadata +### Expose template metadata -`cargo` will use backlinks in an implementation-defined form to keep track in the `CARGO_TARGET_BASE_DIR` of the relation from a target directory to its source workspace. +`cargo` will use backlinks in an implementation-defined form to keep track in the templated `CARGO_TARGET_DIR` of the relation from a target directory to its source workspace. In the future, we could envisage letting external tools and users access this data in a well-defined form through `cargo metadata`. From 9e119efec5bb97da633fedce9ed4072ca704e8fb Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 17 Feb 2024 23:35:35 +0100 Subject: [PATCH 36/67] fix: add transition period notice --- text/3371-cargo-target-base-dir.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-base-dir.md index f4830a21e0f..abe6781b4a3 100644 --- a/text/3371-cargo-target-base-dir.md +++ b/text/3371-cargo-target-base-dir.md @@ -219,6 +219,10 @@ A mitigation for this is recommending a short prefix (in `CARGO_TARGET_DIR`) and There a few cases where a symlink instead of a real dir will break programs: at least SQLite 3 can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since templated `CARGO_TARGET_DIR` won't become a default in this RFC, we are not breaking any existing use cases. +## Transition period + +During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide warnings about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the warning, but it's better than silently breaking workflows. + # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives From d4fd860a0d4e3778b82ae44e8448e7a7fdd0004a Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 18 Feb 2024 19:06:15 +0100 Subject: [PATCH 37/67] rename file to follow feature rename --- ...argo-target-base-dir.md => 3371-cargo-target-dir-templates.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename text/{3371-cargo-target-base-dir.md => 3371-cargo-target-dir-templates.md} (100%) diff --git a/text/3371-cargo-target-base-dir.md b/text/3371-cargo-target-dir-templates.md similarity index 100% rename from text/3371-cargo-target-base-dir.md rename to text/3371-cargo-target-dir-templates.md From 9945d9ad6db70fde4ce1973068e6c4164fca02e8 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 18 Feb 2024 19:06:53 +0100 Subject: [PATCH 38/67] nit: missed BASE_ --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index abe6781b4a3..7d529adc8cf 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -389,7 +389,7 @@ Well, first, advertising of the option and its behaviour, as well as the forward - We could use a config option to entirely deactivate it - `CARGO_TARGET_DIR=target` would still be available -- We could add special behaviour like `CARGO_TARGET_BASE_DIR=""` meaning "use current" directory +- We could add special behaviour like `CARGO_TARGET_DIR=""` meaning "use current" directory The first two are probably enough, the third is a bandaid. From d222d65db26efcb8925def0bc9fdd1e40f3566e0 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 18 Feb 2024 19:17:00 +0100 Subject: [PATCH 39/67] point to the discussion on garbage collection --- text/3371-cargo-target-dir-templates.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 7d529adc8cf..063c8cf07f9 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -198,6 +198,8 @@ Backlinks are metadata in templated `CARGO_TARGET_DIR` that links target directo While details of the stored data are left to the implementation (there is no need for `cargo` to expose this data directly, though it could be exposed through `cargo metadata` in the future, see the relevant section below), one could imagine using it to clean target directories whose corresponding workspace does not exist anymore when calling something like `cargo clean --all-workspaces` (doing it automatically is not possible, else any workspace on external disks would have its target directory cleaned up each time the disk is unmounted, which is way too aggressive a default). +Garbage collection of unused target directories is discussed in [rust-lang/cargo#13136](https://github.com/rust-lang/cargo/issues/13136), follow the discussions there for more details. + # Drawbacks [drawbacks]: #drawbacks From aea0adb99c682037e873e5804c77382744fe8216 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 18 Feb 2024 19:26:43 +0100 Subject: [PATCH 40/67] expand on exposing template metadata --- text/3371-cargo-target-dir-templates.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 063c8cf07f9..5bcef1a45f6 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -395,10 +395,10 @@ Well, first, advertising of the option and its behaviour, as well as the forward The first two are probably enough, the third is a bandaid. -### Expose template metadata +## Expose template metadata `cargo` will use backlinks in an implementation-defined form to keep track in the templated `CARGO_TARGET_DIR` of the relation from a target directory to its source workspace. -In the future, we could envisage letting external tools and users access this data in a well-defined form through `cargo metadata`. +In the future, we could envisage letting external tools and users access this data in a well-defined form through `cargo metadata`, but without a use case to plan for, we cannot make any decision today. [tg]: https://github.com/sunshowers/targo From 376c30697d580c0ce715708d85902cdadb916e8b Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 18 Feb 2024 19:28:52 +0100 Subject: [PATCH 41/67] caching in cargo home discussion --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 5bcef1a45f6..02c4c9ba194 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -376,7 +376,7 @@ This option has several complications I'm not sure how to resolve: 1. How do we decide on good platform defaults ? - Subsequently, when platform defaults are decided, how do we ensure a new platform has a good default too ? - - `CARGO_HOME` is already criticized for being both a *cache* and a *config* home (using the XDG spec semantics), adding more local cache to it in the form of `CARGO_HOME/target-base-dir/` would not improve the situation and should probably not be done, but, if no good alternatives are found, there is precedent to use it for this. + - `CARGO_HOME` is already criticized for being both a *cache* and a *config* home (using the XDG spec semantics), adding more local cache to it in the form of `CARGO_HOME/target-base-dir/` would not improve the situation and should probably not be done, but, if no good alternatives are found, there is precedent to use it for this. (Note: this is discussed more in [rust-lang/cargo#1734](https://github.com/rust-lang/cargo/issues/1734)) 2. How do we communicate on said default values ? 3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. Note that having forward links active by default (see relevant section earlier in the RFC) will help offset a lot of the problems here. From befae7d9caad64f81df5c2389d28b85ff5b69c86 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 18 Feb 2024 19:45:06 +0100 Subject: [PATCH 42/67] move forward links sections to Unresolved questions --- text/3371-cargo-target-dir-templates.md | 51 +++++++++++++++---------- 1 file changed, 31 insertions(+), 20 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 02c4c9ba194..6c6d567028a 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -174,22 +174,6 @@ In the same vein, `cargo metadata` fills the target directory information with t Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for the templated version: if it set, `cargo clean` deletes `/path/to//` and not `target/`. -### Providing forward links - -[`targo`][tg] provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no explicit `CARGO_TARGET_DIR` set for them to find the real target directory. - -`cargo` does not provide them for regular (untemplated) `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). - -When a templated `CARGO_TARGET_DIR` is used (in any form) (and not overriden, for example a templated env var overriden by a precise `--target-dir` option), it *will* use a forward link by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if the templated `CARGO_TARGET_DIR` wasn't set at all. - -This has a two big advantages: not breaking external tools and giving easy access to artifacts produced by `cargo build/test/doc` to users (they're in the habit of typing `./target/debug/my-bin`, this would continue working with forward links). - -A config option (CLI, `config.toml` and env var), `link-target-dir`, will be introduced to deactivate this behaviour, but it will `true` by default, for the reasons provided in favor of forward links just above. - -#### Detailed working of forward links - -When creating a forward link `cargo` will first attempt to create a symbolic link (regardless of the platform). If that fails, it will attempt zero or more platform-specific solutions, like junction points on NTFS. If that fails too, a warning will be emitted but this will not prevent the rest of the action to go on: regular calls like `cargo check/clippy/build/test` likely won't need this forward link and after the user has been warned they could either resolve the problem themselves or ignore it, depending on their own use case and domain-specific knowledge. - ### Providing backlinks Backlinks are metadata in templated `CARGO_TARGET_DIR` that links target directories back to the workspace they came from. @@ -217,10 +201,6 @@ Depending on what naming scheme is used (e.g., a very long hash), we could hit t A mitigation for this is recommending a short prefix (in `CARGO_TARGET_DIR`) and using a hash that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. -## Forward links - -There a few cases where a symlink instead of a real dir will break programs: at least SQLite 3 can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since templated `CARGO_TARGET_DIR` won't become a default in this RFC, we are not breaking any existing use cases. - ## Transition period During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide warnings about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the warning, but it's better than silently breaking workflows. @@ -336,6 +316,37 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze - Do we want to add forward links for untemplated `CARGO_TARGET_DIR` too in the process of this RFC ? - Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for the latter ? +## Providing forward links + +[`targo`][tg] provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no explicit `CARGO_TARGET_DIR` set for them to find the real target directory. + +`cargo` does not provide them for regular (untemplated) `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). + +When a templated `CARGO_TARGET_DIR` is used (in any form) (and not overriden, for example a templated env var overriden by a precise `--target-dir` option), it *could* use a forward link by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if the templated `CARGO_TARGET_DIR` wasn't set at all. + +### Detailed working of forward links + +When creating a forward link `cargo` will first attempt to create a symbolic link (regardless of the platform). If that fails, it will attempt zero or more platform-specific solutions, like junction points on NTFS. If that fails too, a warning or note will be emitted but this will not prevent the rest of the action to go on: regular calls like `cargo check/clippy/build/test` likely won't need this forward link and after the user has been warned they could either resolve the problem themselves or ignore it, depending on their own use case and domain-specific knowledge. + +### Advantages of forward links + +This has a two big advantages: not breaking external tools and giving easy access to artifacts produced by `cargo build/test/doc` to users (they're in the habit of typing `./target/debug/my-bin`, this would continue working with forward links). + +A config option (CLI, `config.toml` and env var), `link-target-dir`, should be introduced to deactivate this behaviour, but it will `true` by default, for the reasons provided in favor of forward links just above. + +### Drawbacks of forward links + +There a few cases where a symlink instead of a real dir will break programs: at least SQLite 3 can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since templated `CARGO_TARGET_DIR` won't become a default in this RFC, we are not breaking any existing use cases. + +Additionally, there are issue when competing `CARGO_TARGET_DIR`s are set, for example if a different one is used for rust-analyzer and for nightly/stable versions: + +- Do we update the symlink all the time ? Rust-Analyzer and a `cargo build` could be running in parallel, with unpredictable results +- Do we only set it if it doesn't exist yet ? It could become outdated, a tool could run first and make it point somewhere unexpected + +Another drawback is about the warning/note when creating the symlink fails: + +- Do we warn/notify on each command ? Only on creation ? Only with `--verbose` ? All have issues with either being too verbose or users potentially missing relevant information. This can be determined during stabilization though, once more people have experience with it, and is easy to change later on. + # Future possibilities [future-possibilities]: #future-possibilities From 07f899b4982fb5ced22a05172365a513c77f29b0 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 18 Feb 2024 19:49:15 +0100 Subject: [PATCH 43/67] specify link-target-dir --- text/3371-cargo-target-dir-templates.md | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 6c6d567028a..d4de96be8dc 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -203,7 +203,7 @@ A mitigation for this is recommending a short prefix (in `CARGO_TARGET_DIR`) and ## Transition period -During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide warnings about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the warning, but it's better than silently breaking workflows. +During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide errors about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the error, but it's better than silently breaking workflows. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -322,18 +322,26 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze `cargo` does not provide them for regular (untemplated) `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). -When a templated `CARGO_TARGET_DIR` is used (in any form) (and not overriden, for example a templated env var overriden by a precise `--target-dir` option), it *could* use a forward link by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if the templated `CARGO_TARGET_DIR` wasn't set at all. +After this RFC, `cargo` will provide the option of creating forward links regardless of templating when the target directory is not the default one. + +When a `CARGO_TARGET_DIR` is used (in any form), it *could* use a forward link by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if `CARGO_TARGET_DIR` wasn't set at all. ### Detailed working of forward links -When creating a forward link `cargo` will first attempt to create a symbolic link (regardless of the platform). If that fails, it will attempt zero or more platform-specific solutions, like junction points on NTFS. If that fails too, a warning or note will be emitted but this will not prevent the rest of the action to go on: regular calls like `cargo check/clippy/build/test` likely won't need this forward link and after the user has been warned they could either resolve the problem themselves or ignore it, depending on their own use case and domain-specific knowledge. +When creating a forward link `cargo` will first attempt to create a symbolic link (regardless of the platform). If that fails, it will attempt zero or more platform-specific solutions, like junction points on NTFS. If that fails too, a warning or note will be emitted (or error, see the configuration option below) but this will not prevent the rest of the action to go on: regular calls like `cargo check/clippy/build/test` likely won't need this forward link and after the user has been warned they could either resolve the problem themselves or ignore it, depending on their own use case and domain-specific knowledge. + +A config option (CLI, `config.toml` and env var), `link-target-dir`, should be introduced to control this behaviour, it will `auto` by default. + +Its possible values would be: + +- `true`: create the symlink and produce an error if it fails +- `"auto"`: create the symlink, produce a warning (or note) but do not fail the command +- `false`: don't create the symlink at all (don't touch it if it exists already though) ### Advantages of forward links This has a two big advantages: not breaking external tools and giving easy access to artifacts produced by `cargo build/test/doc` to users (they're in the habit of typing `./target/debug/my-bin`, this would continue working with forward links). -A config option (CLI, `config.toml` and env var), `link-target-dir`, should be introduced to deactivate this behaviour, but it will `true` by default, for the reasons provided in favor of forward links just above. - ### Drawbacks of forward links There a few cases where a symlink instead of a real dir will break programs: at least SQLite 3 can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since templated `CARGO_TARGET_DIR` won't become a default in this RFC, we are not breaking any existing use cases. From fac2d9c0852b8b043d3baf22f3fe5313de743072 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Feb 2024 23:15:22 +0100 Subject: [PATCH 44/67] Explain the advantages of hashes over copied folder tree structure --- text/3371-cargo-target-dir-templates.md | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index d4de96be8dc..01dd13fc881 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -271,6 +271,15 @@ When `CARGO_TARGET_DIR` is relative, the result could be `/tmp/cargo-build/foo/$ Overall, I feel remapping is much harder to implement well and can be added later without interfering with templates in `CARGO_TARGET_DIR` (and without this RFC interfering with remapping), though the design space is probably bigger than the one for this RFC. +## Advantages of using a hash over regular subdirectories + +It's possible to achieve most of the same system proposed here by setting a value like `CARGO_TARGET_DIR="/base/dir/{manifest-path-dirs}"`, where a manifest in `/tmp/test1/test2/Cargo.toml` would resolve the build directory to `/base/dir/tmp/test1/test2/`, but most is not *all* of them: + +1. Predictable depth of subdirectories: even with a hash, depth can be unpredictable in some way since the untemplated part can be as long as necessary but by using a hash of predictable length, users can prepare for the limits imposed by their filesystem (for example on Windows, some forms of path are quite short): it would be surprising for a project to build when in `/` but not in `/tmp/some/what/deep/inside/folders` for example +2. More traversals: by using a hash always cut in the same way to form subdirectories, it's easy to plan traversals for the filesystem +3. Encourage interactions through `cargo-metadata`: by using paths computed through cargo and not easily derivable from the file tree, future tools will be incentivized to work through `cargo-metadata` to find the target directory, widening adoption and making it easier for the cargo team to ensure nothing breaks in subsequent cargo updates +4. Avoid introducing a strong dependency on only the path: by using a hash, cargo can add element to it to help differentiate builds: for example we could use more parameters in the hash, see the relevant section in Future Possibilites + # Prior art [prior-art]: #prior-art @@ -355,6 +364,10 @@ Another drawback is about the warning/note when creating the symlink fails: - Do we warn/notify on each command ? Only on creation ? Only with `--verbose` ? All have issues with either being too verbose or users potentially missing relevant information. This can be determined during stabilization though, once more people have experience with it, and is easy to change later on. +## Rename the template from `{manifest-path-hash}` to `{project-hash}` + +This works in concert with the subsection `Introducing more data to the hash` later below: the currently proposed key name introduces a dependency in its name, should we prepare for possible changes by renaming it ? + # Future possibilities [future-possibilities]: #future-possibilities @@ -414,6 +427,13 @@ Well, first, advertising of the option and its behaviour, as well as the forward The first two are probably enough, the third is a bandaid. +## Introducing more data to the hash + +This RFC uses only the manifest's full path to produce the hash but we could conceivably introduce more data into it to make it more unique for more use cases: + +- By considering the user, we could avoid sharing caches between `sudo cargo build` and `cargo build` for example (and more). It could be especially useful for shared artifacts storage. Bazel already does this, but Bazel was built to be a distributed build system, whereas Cargo was not. +- By considering features, build flags and a host of other parameters we could share builds of crates that use the same set of features between various projects. This is already discussed in [rust-lang/cargo#5931](https://github.com/rust-lang/cargo/issues/5931). + ## Expose template metadata `cargo` will use backlinks in an implementation-defined form to keep track in the templated `CARGO_TARGET_DIR` of the relation from a target directory to its source workspace. From 23ae70a625866e5ccac0937a7faf6aec99a6b468 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Feb 2024 23:35:05 +0100 Subject: [PATCH 45/67] fix: put section back where it belong, it was one level too deep --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 01dd13fc881..0d0344f7590 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -174,7 +174,7 @@ In the same vein, `cargo metadata` fills the target directory information with t Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for the templated version: if it set, `cargo clean` deletes `/path/to//` and not `target/`. -### Providing backlinks +## Providing backlinks Backlinks are metadata in templated `CARGO_TARGET_DIR` that links target directories back to the workspace they came from. From 025b8effa206b047a18fceca2661db1ec4c4d554 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Feb 2024 23:35:18 +0100 Subject: [PATCH 46/67] drawback: brace expansion --- text/3371-cargo-target-dir-templates.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 0d0344f7590..b50cc209b6a 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -205,6 +205,12 @@ A mitigation for this is recommending a short prefix (in `CARGO_TARGET_DIR`) and During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide errors about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the error, but it's better than silently breaking workflows. +## Brace expansion + +Bash has [brace expansion](https://www.gnu.org/software/bash/manual/html_node/Brace-Expansion.html), other shells too. By using `{manifest-path-hash}` we risk users getting bitten by that behaviour. Brace expansion is not activated when there are no `,` nor `..` inside the `{}` so cargo should be fine. Since brace expansion is done at the shell level, cargo won't be able to detect it if it happens. + +Escaping, using single quotes (`'`) or even double quotes (`"`) will work to disable brace expansion, making it even easier to work around it if needed. + # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives From 8d1ddc29d8d637d0e2599d5aa31169704c30f511 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Feb 2024 23:36:50 +0100 Subject: [PATCH 47/67] future possibilities: remove section about metadata since there is no clear possible use --- text/3371-cargo-target-dir-templates.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index b50cc209b6a..08b4998cf34 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -440,10 +440,4 @@ This RFC uses only the manifest's full path to produce the hash but we could con - By considering the user, we could avoid sharing caches between `sudo cargo build` and `cargo build` for example (and more). It could be especially useful for shared artifacts storage. Bazel already does this, but Bazel was built to be a distributed build system, whereas Cargo was not. - By considering features, build flags and a host of other parameters we could share builds of crates that use the same set of features between various projects. This is already discussed in [rust-lang/cargo#5931](https://github.com/rust-lang/cargo/issues/5931). -## Expose template metadata - -`cargo` will use backlinks in an implementation-defined form to keep track in the templated `CARGO_TARGET_DIR` of the relation from a target directory to its source workspace. - -In the future, we could envisage letting external tools and users access this data in a well-defined form through `cargo metadata`, but without a use case to plan for, we cannot make any decision today. - [tg]: https://github.com/sunshowers/targo From b81b3edea0353fcf63568eac84a5e95360154459 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Feb 2024 23:44:39 +0100 Subject: [PATCH 48/67] link to issue 1734 for other possible templates discussions --- text/3371-cargo-target-dir-templates.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 08b4998cf34..8589c9d02fe 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -400,13 +400,15 @@ The section merely provides additional information. - Introduce remapping into the concept in some way. - Introduce a form of garbage collection. Expanded upon this [Zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/last-use.20tracking). -## Adding `XDG_CACHE_HOME`, `HOME` or `CARGO_HOME` as templates +## Adding `{xdg-cache-home}`, `{home}`, `{cargo-home}` or `{user-base-target-directory}` as templates -Not all OSes use the XDG convention, notably Windows and macOS (though the latter can be somewhat made to) and it is very easy to define `CARGO_TARGET_DIR=${XDG_CACHE_HOME:-~/.cache}/cargo-target-directories/{manifest-path-hash}` if wanted by users. +OS-native cache directories are discussed more in details in [rust-lang/cargo#1734](https://github.com/rust-lang/cargo/issues/1734), there are semantic and naming issues to resolve before moving forward with them in cargo (and so as template for `CARGO_TARGET_DIR`). + +As a workaround, it is possible to use `CARGO_TARGET_DIR="${XDG_CACHE_HOME:-~/.cache}/cargo-target-directories/{manifest-path-hash}"`. It won't work in the `config.toml` but it will work with the environment variable and the command line option, both of which override the TOML config. -It is certainly possible to add at least `{home}` and `{cargo_home}` but it can be done in the future without interfering at all with `{manifest-path-hash}`, making it relatively easy to defer and add if there is demand. +It is certainly possible to add at least `{home}` and `{cargo-home}` but it can be done in the future without interfering at all with `{manifest-path-hash}`, making it a good option for a future addition without blocking. ## Use templated `CARGO_TARGET_DIR` as the default instead of `target` From 2ae536e3d73ae211b0342aa0448f95123a240516 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Feb 2024 23:52:03 +0100 Subject: [PATCH 49/67] fix link to garbage collection issue --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 8589c9d02fe..84255bdce50 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -398,7 +398,7 @@ The section merely provides additional information. --> - Introduce remapping into the concept in some way. -- Introduce a form of garbage collection. Expanded upon this [Zulip thread](https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/last-use.20tracking). +- Introduce a form of garbage collection for these target directories so we don't leak them when projects are deleted, see [rust-lang/cargo#13136](https://github.com/rust-lang/cargo#13136). ## Adding `{xdg-cache-home}`, `{home}`, `{cargo-home}` or `{user-base-target-directory}` as templates From 11ab0e38c6351d8d89fa2a6abb063f19c2a18916 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 19 Feb 2024 23:53:31 +0100 Subject: [PATCH 50/67] delete section about providing backlinks --- text/3371-cargo-target-dir-templates.md | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 84255bdce50..0d3e3f8e464 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -174,16 +174,6 @@ In the same vein, `cargo metadata` fills the target directory information with t Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for the templated version: if it set, `cargo clean` deletes `/path/to//` and not `target/`. -## Providing backlinks - -Backlinks are metadata in templated `CARGO_TARGET_DIR` that links target directories back to the workspace they came from. - -[`targo`][tg] uses them in its own form of the feature and `cargo` uses them too for templated `CARGO_TARGET_DIR`. - -While details of the stored data are left to the implementation (there is no need for `cargo` to expose this data directly, though it could be exposed through `cargo metadata` in the future, see the relevant section below), one could imagine using it to clean target directories whose corresponding workspace does not exist anymore when calling something like `cargo clean --all-workspaces` (doing it automatically is not possible, else any workspace on external disks would have its target directory cleaned up each time the disk is unmounted, which is way too aggressive a default). - -Garbage collection of unused target directories is discussed in [rust-lang/cargo#13136](https://github.com/rust-lang/cargo/issues/13136), follow the discussions there for more details. - # Drawbacks [drawbacks]: #drawbacks @@ -398,7 +388,7 @@ The section merely provides additional information. --> - Introduce remapping into the concept in some way. -- Introduce a form of garbage collection for these target directories so we don't leak them when projects are deleted, see [rust-lang/cargo#13136](https://github.com/rust-lang/cargo#13136). +- Introduce a form of garbage collection for these target directories so we don't leak them when projects are deleted, see [rust-lang/cargo#13136](https://github.com/rust-lang/cargo#13136) (essentially, backlinks) ## Adding `{xdg-cache-home}`, `{home}`, `{cargo-home}` or `{user-base-target-directory}` as templates From 6540d11b8e2a2f4def265989a925932843ced8f3 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Tue, 20 Feb 2024 00:19:31 +0100 Subject: [PATCH 51/67] expand a lot on forward links, adding them to the RFC and not just as an unresolved question --- text/3371-cargo-target-dir-templates.md | 77 ++++++++++++------------- 1 file changed, 36 insertions(+), 41 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 0d3e3f8e464..7f4f1e432c2 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -143,7 +143,7 @@ The path used for the naming of the final target directory is the one found *aft To prevent collisions by craftings paths, the `` directory will be computed from a hash of the workspace manifest's full path (and possibly other data, for example `bazel` uses its version and the current user too). -### Symbolic links +## Symbolic links In the following situation @@ -164,6 +164,41 @@ While a single dev machine is unlikely to have enough projects that the naming s To prevent this, `cargo` splits the hash into something like `$CARGO_TARGET_DIR/hash[:2]/hash[2:4]/hash[4:]/...`. Since the naming scheme is considered an implementation detail, if this prove insufficient it could be changed in a subsequent version of `cargo`. +## Providing forward links + +[`targo`][tg] provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no explicit `CARGO_TARGET_DIR` set for them to find the real target directory. + +`cargo` currently does not provide them for regular (untemplated) `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). + +After this RFC, when the `CARGO_TARGET_DIR` will provide the option of creating a forward link, configurable via a new configuration option, `target-dir-link` (see below for details). + +### Detailed working of forward links + +When creating a forward link `cargo` will first attempt to create a symbolic link (regardless of the platform). If that fails, it will attempt zero or more platform-specific solutions, like junction points on NTFS. If that fails too, a warning or note will be emitted (or error, see the configuration option below) and after the user has been warned they could either resolve the problem themselves or ignore it, depending on their own use case and domain-specific knowledge. + +A config option (CLI, `config.toml` and env var), `target-dir-link`, controls this behaviour, it is `auto` by default. + +Its possible values would be: + +- `true`: create the symlink and produce an error if it fails +- `"auto"`: create the symlink, produce a warning (or note) but do not fail the command +- `false`: don't create the symlink at all (don't touch it if it exists already though) + +### Handling of concurrents builds + +It is possible to call `cargo build` for the same project twice with two different target directories to avoid build locks (common when building with different features or to have Rust-Analyzer work in a different target directory for example), which poses a problem for forward links: if `target-dir-link` is active, `cargo` could be replacing the `./target` symlink constantly. + +This can be configured through another option, `target-dir-link-replace`, with the following possible values: + +- `"notify"` (the default): replace and show the user a note about the replacement +- `"warn"`: replace and show the user a warning about the replacement +- `"silent"`: replace silently +- `"never"`: do not replace + +If the forward link doesn't exist already, this option has no effect. If the forward link exist and is the same as the would-be new one, no note nor warning will be produced. + +With both of these options, callers of cargo will be able to use `--config KEY=VALUE` to override it, for example a Rust-Analyzer config could use `cargo.extraArgs = ["--config", "target-dir-link=false"]` to ensure R-A never touches them. + ## Impact on `cargo ...` calls When calling `cargo` with a builtin call (e.g., `build`, `check` or `test`) where a templated `CARGO_TARGET_DIR` is active, `cargo` will first resolve the effective `CARGO_TARGET_DIR` and then proceed with the command as if `CARGO_TARGET_DIR` had been set directly. For third party tools (`cargo-*`), where cargo does not know about the relevant `Cargo.toml`, the tool will have to use [`cargo_metadata`](https://docs.rs/cargo_metadata), as is already expected today, to learn about the effective target directory. @@ -318,48 +353,8 @@ It should be noted that `bazel` is integrated with [remote caching](https://baze - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> -- Do we want to add forward links for untemplated `CARGO_TARGET_DIR` too in the process of this RFC ? - Do we want to differentiate according to users ? `bazel` is a generic build tool, whereas `cargo` is not, so maybe differentiating on users is not necessary for the latter ? -## Providing forward links - -[`targo`][tg] provides forward link (it links from `/target` to its own target directory) as a way for existing tools to continue working despite there being no explicit `CARGO_TARGET_DIR` set for them to find the real target directory. - -`cargo` does not provide them for regular (untemplated) `CARGO_TARGET_DIR`. This is not a limitation when using the environment variable set globally, since all processes can read it, but it is one when this config is only set on specific calls or via `target-dir` in the config, meaning others tools cannot easily pick it up (and most external tools don't use `cargo-metadata`, which makes them all broken by default, but fixing this situation is not this RFC's purpose). - -After this RFC, `cargo` will provide the option of creating forward links regardless of templating when the target directory is not the default one. - -When a `CARGO_TARGET_DIR` is used (in any form), it *could* use a forward link by adding a `target` symlink to the real target directory. This `target` symlink will be in the exact place the real target directory would have been if `CARGO_TARGET_DIR` wasn't set at all. - -### Detailed working of forward links - -When creating a forward link `cargo` will first attempt to create a symbolic link (regardless of the platform). If that fails, it will attempt zero or more platform-specific solutions, like junction points on NTFS. If that fails too, a warning or note will be emitted (or error, see the configuration option below) but this will not prevent the rest of the action to go on: regular calls like `cargo check/clippy/build/test` likely won't need this forward link and after the user has been warned they could either resolve the problem themselves or ignore it, depending on their own use case and domain-specific knowledge. - -A config option (CLI, `config.toml` and env var), `link-target-dir`, should be introduced to control this behaviour, it will `auto` by default. - -Its possible values would be: - -- `true`: create the symlink and produce an error if it fails -- `"auto"`: create the symlink, produce a warning (or note) but do not fail the command -- `false`: don't create the symlink at all (don't touch it if it exists already though) - -### Advantages of forward links - -This has a two big advantages: not breaking external tools and giving easy access to artifacts produced by `cargo build/test/doc` to users (they're in the habit of typing `./target/debug/my-bin`, this would continue working with forward links). - -### Drawbacks of forward links - -There a few cases where a symlink instead of a real dir will break programs: at least SQLite 3 can be configured to raise an error if the database is behind a symlink anywhere in its opening path, it's probably other programs can also be configured to check this (or do it by default). Since templated `CARGO_TARGET_DIR` won't become a default in this RFC, we are not breaking any existing use cases. - -Additionally, there are issue when competing `CARGO_TARGET_DIR`s are set, for example if a different one is used for rust-analyzer and for nightly/stable versions: - -- Do we update the symlink all the time ? Rust-Analyzer and a `cargo build` could be running in parallel, with unpredictable results -- Do we only set it if it doesn't exist yet ? It could become outdated, a tool could run first and make it point somewhere unexpected - -Another drawback is about the warning/note when creating the symlink fails: - -- Do we warn/notify on each command ? Only on creation ? Only with `--verbose` ? All have issues with either being too verbose or users potentially missing relevant information. This can be determined during stabilization though, once more people have experience with it, and is easy to change later on. - ## Rename the template from `{manifest-path-hash}` to `{project-hash}` This works in concert with the subsection `Introducing more data to the hash` later below: the currently proposed key name introduces a dependency in its name, should we prepare for possible changes by renaming it ? From 1195538cc598c7dd5b83c910ef2eb5022aa7cb06 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Tue, 20 Feb 2024 00:29:23 +0100 Subject: [PATCH 52/67] default with template fix --- text/3371-cargo-target-dir-templates.md | 21 +-------------------- 1 file changed, 1 insertion(+), 20 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 7f4f1e432c2..ad47acfe054 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -397,29 +397,10 @@ It is certainly possible to add at least `{home}` and `{cargo-home}` but it can ## Use templated `CARGO_TARGET_DIR` as the default instead of `target` -This option has several complications I'm not sure how to resolve: - -1. How do we decide on good platform defaults ? - - Subsequently, when platform defaults are decided, how do we ensure a new platform has a good default too ? - - `CARGO_HOME` is already criticized for being both a *cache* and a *config* home (using the XDG spec semantics), adding more local cache to it in the form of `CARGO_HOME/target-base-dir/` would not improve the situation and should probably not be done, but, if no good alternatives are found, there is precedent to use it for this. (Note: this is discussed more in [rust-lang/cargo#1734](https://github.com/rust-lang/cargo/issues/1734)) +1. This has several unresolved issues, notably: what default value do we use ? This discussion is already happening in [rust-lang/cargo#1734](https://github.com/rust-lang/cargo/issues/1734)) and does not block this RFC 2. How do we communicate on said default values ? 3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. Note that having forward links active by default (see relevant section earlier in the RFC) will help offset a lot of the problems here. -### We really want to do this, how do we do it ? - -Well, first, advertising of the option and its behaviour, as well as the forward link behaviour (so people know we're not just breaking tools for fun). After that, it becomes necessary to test it quite heavily to really ensure nothing has broken irremediably. - -1. Introduce it as the default behaviour in nightly, wait for a few stable releases so that beta has it too and it can start being used for a few months after that so that esoteric setups can also try it. -2. Write a post saying "we'll do the change in version X" (where current version is like X-2 to warn 3 months before at least ?) and then only apply the change to projects where there is no untemplated `CARGO_TARGET_DIR` config set. - -### A user still wants the current behaviour and not this RFC's - -- We could use a config option to entirely deactivate it -- `CARGO_TARGET_DIR=target` would still be available -- We could add special behaviour like `CARGO_TARGET_DIR=""` meaning "use current" directory - -The first two are probably enough, the third is a bandaid. - ## Introducing more data to the hash This RFC uses only the manifest's full path to produce the hash but we could conceivably introduce more data into it to make it more unique for more use cases: From 3cead0dc905baa7ca7a41dca75e7ef1583348a14 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Fri, 22 Mar 2024 19:21:43 +0100 Subject: [PATCH 53/67] nit: simplify section title for template variables --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index ad47acfe054..2ddaafdb1ff 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -385,7 +385,7 @@ The section merely provides additional information. - Introduce remapping into the concept in some way. - Introduce a form of garbage collection for these target directories so we don't leak them when projects are deleted, see [rust-lang/cargo#13136](https://github.com/rust-lang/cargo#13136) (essentially, backlinks) -## Adding `{xdg-cache-home}`, `{home}`, `{cargo-home}` or `{user-base-target-directory}` as templates +## Adding more template variables OS-native cache directories are discussed more in details in [rust-lang/cargo#1734](https://github.com/rust-lang/cargo/issues/1734), there are semantic and naming issues to resolve before moving forward with them in cargo (and so as template for `CARGO_TARGET_DIR`). From 3a6fa3e6848a295f7544bf6ac430119b8c1ffc94 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Fri, 22 Mar 2024 19:50:42 +0100 Subject: [PATCH 54/67] nit: no double negatives --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 2ddaafdb1ff..df9bbf778b3 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -232,7 +232,7 @@ During the transition period, any `CARGO_TARGET_DIR` that was defined as contain ## Brace expansion -Bash has [brace expansion](https://www.gnu.org/software/bash/manual/html_node/Brace-Expansion.html), other shells too. By using `{manifest-path-hash}` we risk users getting bitten by that behaviour. Brace expansion is not activated when there are no `,` nor `..` inside the `{}` so cargo should be fine. Since brace expansion is done at the shell level, cargo won't be able to detect it if it happens. +Bash has [brace expansion](https://www.gnu.org/software/bash/manual/html_node/Brace-Expansion.html), other shells too. By using `{manifest-path-hash}` we risk users getting bitten by that behaviour. Brace expansion is only activated when there are `,` or `..` inside the `{}` so cargo should be fine. Since brace expansion is done at the shell level, cargo won't be able to detect it if it happens. Escaping, using single quotes (`'`) or even double quotes (`"`) will work to disable brace expansion, making it even easier to work around it if needed. From 645d9c41cd581b3c9aaac7f78169cce391ad59dc Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 23 Mar 2024 15:23:57 +0100 Subject: [PATCH 55/67] nit: clarify the point about path length as an argument for hashes --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index df9bbf778b3..54b6155b688 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -306,7 +306,7 @@ Overall, I feel remapping is much harder to implement well and can be added late It's possible to achieve most of the same system proposed here by setting a value like `CARGO_TARGET_DIR="/base/dir/{manifest-path-dirs}"`, where a manifest in `/tmp/test1/test2/Cargo.toml` would resolve the build directory to `/base/dir/tmp/test1/test2/`, but most is not *all* of them: -1. Predictable depth of subdirectories: even with a hash, depth can be unpredictable in some way since the untemplated part can be as long as necessary but by using a hash of predictable length, users can prepare for the limits imposed by their filesystem (for example on Windows, some forms of path are quite short): it would be surprising for a project to build when in `/` but not in `/tmp/some/what/deep/inside/folders` for example +1. Hashes have a fixed length while `manifest-path-dirs` is dependent on the context, making it a hazard for cross-platform compatibility. Say on Windows the target-dir is rooted in a user tmp dir and the manifest path is inside of the user documents. Especially combined with corporate policies on names, those base paths alone can take up a good amount of the character budget without getting into project names, etc. 2. More traversals: by using a hash always cut in the same way to form subdirectories, it's easy to plan traversals for the filesystem 3. Encourage interactions through `cargo-metadata`: by using paths computed through cargo and not easily derivable from the file tree, future tools will be incentivized to work through `cargo-metadata` to find the target directory, widening adoption and making it easier for the cargo team to ensure nothing breaks in subsequent cargo updates 4. Avoid introducing a strong dependency on only the path: by using a hash, cargo can add element to it to help differentiate builds: for example we could use more parameters in the hash, see the relevant section in Future Possibilites From 05b96f8c6638b4bdb72ad240a0e8f8098715f7f6 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 23 Mar 2024 15:27:45 +0100 Subject: [PATCH 56/67] clarify summary by being more explicit (and fix a typo) --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 54b6155b688..50604620879 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -8,7 +8,7 @@ -Introduce templating to `CARGO_TARGET_DIR` to have `cargo` adapts its target directory dynamically depending on (at least) the manifest's path. +Introduce templating to `CARGO_TARGET_DIR`, allowing `cargo` to adapt its target directory dynamically depending on (at least) the manifest's path with the `{manifest-path-hash}` key. # Motivation [motivation]: #motivation From 226f9ec125f1735c3d1596257d41341aa9e5e815 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 23 Mar 2024 15:54:01 +0100 Subject: [PATCH 57/67] nit: fix a forgotten 'target-base-dir' remnant --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 50604620879..c85e3b75a23 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -127,7 +127,7 @@ Templating does not interfere with the resolution order of `CARGO_TARGET_DIR`. F ```toml [build] - target-base-dir = "/absolute/path/to/cache/{manifest-path-hash}" + target-dir = "/absolute/path/to/cache/{manifest-path-hash}" ``` - Through the environment variable: `CARGO_TARGET_DIR="/absolute/path/to/cache/{manifest-path-hash}" cargo build` From 2802659393186077d79969e3169926b07b672a4f Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 23 Mar 2024 15:58:31 +0100 Subject: [PATCH 58/67] feat: only one new option, not two, handle concurrent builds predictably --- text/3371-cargo-target-dir-templates.md | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index c85e3b75a23..0c2500395c3 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -188,16 +188,13 @@ Its possible values would be: It is possible to call `cargo build` for the same project twice with two different target directories to avoid build locks (common when building with different features or to have Rust-Analyzer work in a different target directory for example), which poses a problem for forward links: if `target-dir-link` is active, `cargo` could be replacing the `./target` symlink constantly. -This can be configured through another option, `target-dir-link-replace`, with the following possible values: +Cargo, when trying to create the forward link (so for `true` and `"auto"`), will handle the situation predictably in the following way: -- `"notify"` (the default): replace and show the user a note about the replacement -- `"warn"`: replace and show the user a warning about the replacement -- `"silent"`: replace silently -- `"never"`: do not replace +- If no `target` *link or directory* is present: create it as expected by `target-dir-link` +- If a `target` *link* is present: update the link +- If a `target` *directory* is present: consider it a failure, respond accordingly to `true` or `"auto"` -If the forward link doesn't exist already, this option has no effect. If the forward link exist and is the same as the would-be new one, no note nor warning will be produced. - -With both of these options, callers of cargo will be able to use `--config KEY=VALUE` to override it, for example a Rust-Analyzer config could use `cargo.extraArgs = ["--config", "target-dir-link=false"]` to ensure R-A never touches them. +Callers of cargo will be able to use `--config KEY=VALUE` to override it, for example a Rust-Analyzer config could use `cargo.extraArgs = ["--config", "target-dir-link=false"]` to ensure R-A never touches forward links. ## Impact on `cargo ...` calls From d8797514d8b167c09f941b20606a409c35b6db3c Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sat, 23 Mar 2024 16:09:36 +0100 Subject: [PATCH 59/67] feat: clarify template-as-default section in future possibilites --- text/3371-cargo-target-dir-templates.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 0c2500395c3..e63a890d022 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -394,9 +394,8 @@ It is certainly possible to add at least `{home}` and `{cargo-home}` but it can ## Use templated `CARGO_TARGET_DIR` as the default instead of `target` -1. This has several unresolved issues, notably: what default value do we use ? This discussion is already happening in [rust-lang/cargo#1734](https://github.com/rust-lang/cargo/issues/1734)) and does not block this RFC -2. How do we communicate on said default values ? -3. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. Note that having forward links active by default (see relevant section earlier in the RFC) will help offset a lot of the problems here. +1. What default value do we use ? This discussion is already happening in [rust-lang/cargo#1734](https://github.com/rust-lang/cargo/issues/1734)) and does not block this RFC (and also came up with `cargo script` and as been discussed at length there, we would use the same default as chosen for it) +2. This would probably break backward compatibility and lots of tools ? We could heavily advertise the option in the Rust book and Cargo's documentation but making it the default is probably not something we will be able (or even willing) to do any time soon. Note that having forward links active by default (see relevant section earlier in the RFC) will help offset a lot of the problems here. ## Introducing more data to the hash From 0f71cf57ee98bce40695eb39c0c9c12b2d18446b Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 24 Mar 2024 21:56:02 +0100 Subject: [PATCH 60/67] nit: remove nonsense line --- text/3371-cargo-target-dir-templates.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index e63a890d022..20a37c9a629 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -304,9 +304,8 @@ Overall, I feel remapping is much harder to implement well and can be added late It's possible to achieve most of the same system proposed here by setting a value like `CARGO_TARGET_DIR="/base/dir/{manifest-path-dirs}"`, where a manifest in `/tmp/test1/test2/Cargo.toml` would resolve the build directory to `/base/dir/tmp/test1/test2/`, but most is not *all* of them: 1. Hashes have a fixed length while `manifest-path-dirs` is dependent on the context, making it a hazard for cross-platform compatibility. Say on Windows the target-dir is rooted in a user tmp dir and the manifest path is inside of the user documents. Especially combined with corporate policies on names, those base paths alone can take up a good amount of the character budget without getting into project names, etc. -2. More traversals: by using a hash always cut in the same way to form subdirectories, it's easy to plan traversals for the filesystem -3. Encourage interactions through `cargo-metadata`: by using paths computed through cargo and not easily derivable from the file tree, future tools will be incentivized to work through `cargo-metadata` to find the target directory, widening adoption and making it easier for the cargo team to ensure nothing breaks in subsequent cargo updates -4. Avoid introducing a strong dependency on only the path: by using a hash, cargo can add element to it to help differentiate builds: for example we could use more parameters in the hash, see the relevant section in Future Possibilites +2. Encourage interactions through `cargo-metadata`: by using paths computed through cargo and not easily derivable from the file tree, future tools will be incentivized to work through `cargo-metadata` to find the target directory, widening adoption and making it easier for the cargo team to ensure nothing breaks in subsequent cargo updates +3. Avoid introducing a strong dependency on only the path: by using a hash, cargo can add element to it to help differentiate builds: for example we could use more parameters in the hash, see the relevant section in Future Possibilites # Prior art [prior-art]: #prior-art From 927226dbd2abd0b6be700b524f3e887ea3841f62 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 24 Mar 2024 21:59:50 +0100 Subject: [PATCH 61/67] nit: move transition period section --- text/3371-cargo-target-dir-templates.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 20a37c9a629..e4a7e6265c6 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -211,6 +211,10 @@ Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, +## Transition period + +During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide errors about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the error, but it's better than silently breaking workflows. + ## One more option to find the target directory This introduces one more option to look at to find the target directory, which may complicate the life of external tools. @@ -223,10 +227,6 @@ Depending on what naming scheme is used (e.g., a very long hash), we could hit t A mitigation for this is recommending a short prefix (in `CARGO_TARGET_DIR`) and using a hash that doesn't include that many characters but those are only mitigations and do not fully fix the underlying problem. -## Transition period - -During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide errors about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the error, but it's better than silently breaking workflows. - ## Brace expansion Bash has [brace expansion](https://www.gnu.org/software/bash/manual/html_node/Brace-Expansion.html), other shells too. By using `{manifest-path-hash}` we risk users getting bitten by that behaviour. Brace expansion is only activated when there are `,` or `..` inside the `{}` so cargo should be fine. Since brace expansion is done at the shell level, cargo won't be able to detect it if it happens. From a821d25a624e1ddf3143e854a3f77dc9cff07dd9 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 24 Mar 2024 22:18:59 +0100 Subject: [PATCH 62/67] feat: expand on transition period --- text/3371-cargo-target-dir-templates.md | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index e4a7e6265c6..ea3145cd05a 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -213,7 +213,11 @@ Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, ## Transition period -During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide errors about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the error, but it's better than silently breaking workflows. +During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide errors about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. + +"How to fix it" will have two solutions: change the configured target directory to not use the new key or use a newer version of cargo (which will not be available at the beginning since it won't exist). + + In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the error, but it's probably better than silently breaking workflows. ## One more option to find the target directory @@ -307,6 +311,19 @@ It's possible to achieve most of the same system proposed here by setting a valu 2. Encourage interactions through `cargo-metadata`: by using paths computed through cargo and not easily derivable from the file tree, future tools will be incentivized to work through `cargo-metadata` to find the target directory, widening adoption and making it easier for the cargo team to ensure nothing breaks in subsequent cargo updates 3. Avoid introducing a strong dependency on only the path: by using a hash, cargo can add element to it to help differentiate builds: for example we could use more parameters in the hash, see the relevant section in Future Possibilites +## Alternatives to the transition period + +While unlikely, the transition period may break workflows by introducing errors for previously valid target directories. +Several alternatives exist for this: + +1. Do the transition immediately and silently: any `CARGO_TARGET_DIR` previously using the exact new key will change meaning, though as noted before, it is unlikely to have been set to that in practice. + - Rust and Cargo both tend to shy away from making silent changes when they can affect observable behaviour. +2. Introduce a config to deactivate it fully, something like `CARGO_TARGET_DIR_USE_TEMPLATING=false`. + - We want to avoid config proliferation, especially a config that is intended to disappear once the transition period is over. +3. Ignore the templating if a directory exists at `/path/to/{manifest-path-hash}` (without interpreting the key) and do the transition immediately. + - The newer `cargo` could print a note or warning noting the change occured. + - This could work well, though any existing workflow relying on the target dir path would break if not using `cargo metadata` (again, it is very unlikely for this exact key to be present in a target directory). + # Prior art [prior-art]: #prior-art From 5dcfb2906a337605b6c25aa7a9d743462828b6a5 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 25 Mar 2024 21:18:15 +0100 Subject: [PATCH 63/67] feat: mention key for a future default cargo default target directory --- text/3371-cargo-target-dir-templates.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index ea3145cd05a..7137114e19b 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -406,7 +406,7 @@ As a workaround, it is possible to use `CARGO_TARGET_DIR="${XDG_CACHE_HOME:-~/.c It won't work in the `config.toml` but it will work with the environment variable and the command line option, both of which override the TOML config. -It is certainly possible to add at least `{home}` and `{cargo-home}` but it can be done in the future without interfering at all with `{manifest-path-hash}`, making it a good option for a future addition without blocking. +It is certainly possible to add at least `{home}`, `{cargo-home}` and something like `{cargo-default-target-dir}` but it can be done in the future without interfering at all with `{manifest-path-hash}`, making it a good option for a future addition without blocking. ## Use templated `CARGO_TARGET_DIR` as the default instead of `target` From 774e32843e841892814a6fd77a060f2cbe5119f0 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Mon, 25 Mar 2024 21:38:52 +0100 Subject: [PATCH 64/67] chore: move Transition Period section to reference level explanation --- text/3371-cargo-target-dir-templates.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 7137114e19b..449e1f84fe4 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -206,18 +206,20 @@ In the same vein, `cargo metadata` fills the target directory information with t Currently, if `CARGO_TARGET_DIR` is set to anything but `target` for a project, `cargo clean` does not delete the `target/` directory if it exists, instead deleting the directory pointed by `CARGO_TARGET_DIR`. The same behavior is used for the templated version: if it set, `cargo clean` deletes `/path/to//` and not `target/`. -# Drawbacks -[drawbacks]: #drawbacks - - - ## Transition period -During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, should provide errors about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. +During the transition period, any `CARGO_TARGET_DIR` that was defined as containing `{manifest-path-hash}` will change meaning. `cargo`, for at least one stable version of Rust, will provide errors about this and point to either this RFC or its documentation to explain why the incompatiblity arised and how to fix it. "How to fix it" will have two solutions: change the configured target directory to not use the new key or use a newer version of cargo (which will not be available at the beginning since it won't exist). - In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the error, but it's probably better than silently breaking workflows. + In practice, paths with `{` or `}` in it are unlikely, even more with the exact key used by cargo here, so maybe no one will ever see the error, but it's better than silently breaking workflows. + +# Drawbacks +[drawbacks]: #drawbacks + + + +- Breaking change for `CARGO_TARGET_DIR` since previously valid settings could become invalid (see "Transition period" section). ## One more option to find the target directory From 81bac72702e64fe03a0cfaf3050e71d7383922e5 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Fri, 19 Apr 2024 23:54:07 +0200 Subject: [PATCH 65/67] feat: clarify the 'Naming' section Thanks epage for the suggestions. --- text/3371-cargo-target-dir-templates.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 449e1f84fe4..f13ea5bba67 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -133,13 +133,12 @@ Templating does not interfere with the resolution order of `CARGO_TARGET_DIR`. F - Through the environment variable: `CARGO_TARGET_DIR="/absolute/path/to/cache/{manifest-path-hash}" cargo build` - Through the command line flag: `cargo build --target-dir "/absolute/path/to/cache/{manifest-path-hash}"` -## Naming +## Definition of `{manifest-path-hash}` -In the example in the previous section, using a templated `CARGO_TARGET_DIR` with `cargo build` produces named subdirectories. The name of those is computed from the full and canonicalized path to the manifest for the workspace, so building `work-project/crate-1` will still use the directory for the whole workspace during a `cargo build` call. +In the example in the previous section, `{manifest-path-hash}` was replaced with a relative path. This relative path is computed from the full and canonicalized path to the manifest for the workspace `Cargo.toml` (or the `script.rs` file directly for cargo-scripts). +By being canonicalized, including resolving of symlinks, symlinked projects will share the same target directory. This is following the prior art from `bazel` and I have not found any complaints about this. -This naming scheme is **not considered stable**: the method will probably not change often but `cargo` offers no guarantee and may change it in any release. Tools that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value each time they are invoked. - -The path used for the naming of the final target directory is the one found *after* symlink resolution: `bazel` does it too and I have not found any complaints about this and it has the distinct advantage of allowing to make a symlink to a project somewhere else on the machine (for example to organise work projects) and avoid duplicating the build directory and all its data (which can be quite heavy). +The hashing and turning the hash into nested directories is **not considered stable**: the method will probably not change often but `cargo` offers no guarantee and may change it in any release. Tools that needs to interact with `cargo`'s target directory should not rely on its value for more than a single invocation of them: they should instead query `cargo metadata` for the actual value each time they are invoked. To prevent collisions by craftings paths, the `` directory will be computed from a hash of the workspace manifest's full path (and possibly other data, for example `bazel` uses its version and the current user too). From 0a38c0767eeae0d6f0bdb433c9207c167e4f3245 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Fri, 19 Apr 2024 23:59:20 +0200 Subject: [PATCH 66/67] nit: forgotten 'an' --- text/3371-cargo-target-dir-templates.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index f13ea5bba67..6f2712c3457 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -49,7 +49,8 @@ Templating introduces one new templating key for `CARGO_TARGET_DIR`, in the same It can be used like this: `CARGO_TARGET_DIR="$HOME/.cache/cargo-target-dirs/{manifest-path-hash}"`. -When compiling `/home/ferris/src/cargo/` with user `ferris`, `manifest-path-hash` would be something like `ab/cd/` and the artifacts would be found in `/home/ferris/.cache/cargo-target-dirs/ab/cd//...`. Note the hash used and the path derived from that for `{manifest-path-hash}` are implementation details and the values here are just example. +When compiling `/home/ferris/src/cargo/` with user `ferris`, `manifest-path-hash` would be something like `ab/cd/` and the artifacts would be found in `/home/ferris/.cache/cargo-target-dirs/ab/cd//...`. +Note the hash used and the path derived from that for `{manifest-path-hash}` are implementation details and the values here are just an example. Below is an example of the behavior with untemplated versus templated forms: From d549cb12889d5a2c4a4dd64b5a8a0e3ee7b49aa0 Mon Sep 17 00:00:00 2001 From: "Alexis (Poliorcetics) Bourget" Date: Sun, 21 Apr 2024 23:35:19 +0200 Subject: [PATCH 67/67] feat: add future possibilities for templating other configs --- text/3371-cargo-target-dir-templates.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/text/3371-cargo-target-dir-templates.md b/text/3371-cargo-target-dir-templates.md index 6f2712c3457..ab79db25a93 100644 --- a/text/3371-cargo-target-dir-templates.md +++ b/text/3371-cargo-target-dir-templates.md @@ -410,6 +410,16 @@ It won't work in the `config.toml` but it will work with the environment variabl It is certainly possible to add at least `{home}`, `{cargo-home}` and something like `{cargo-default-target-dir}` but it can be done in the future without interfering at all with `{manifest-path-hash}`, making it a good option for a future addition without blocking. +## Adding template variables to more config + +Other configuration items in `cargo` could eventually benefit from templating, like `[build.rustflags]` or `[env]`. + +While not the goal of this RFC, the design is such that is should easily extend: `{...}` is not a valid format anywhere currently except in places with paths or where `cargo` directly forwards to either a shell or `rustc`: + +- `rustc` can be made to work with cargo and is also very unlikely to introduce `{...}` in flags because of the large number of sh-like shells that use this format to resolve globs already: calling `rustc` with `{...}` as an argument would become very error prone. +- paths, as already discussed earlier, are also highly unlikely to contain `{...}`, even less the exact keys that `cargo` will expect. +- runners could expect such arguments, but that would only become a problem if cargo introduced a conflicting key and it would need a runner using that pattern, which is again unlikely. + ## Use templated `CARGO_TARGET_DIR` as the default instead of `target` 1. What default value do we use ? This discussion is already happening in [rust-lang/cargo#1734](https://github.com/rust-lang/cargo/issues/1734)) and does not block this RFC (and also came up with `cargo script` and as been discussed at length there, we would use the same default as chosen for it)