Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gitoxide integration: fetch #11448

Merged
merged 1 commit into from
Mar 2, 2023
Merged

Conversation

Byron
Copy link
Member

@Byron Byron commented Dec 1, 2022

This PR is the first step towards resolving #1171.

In order to get there, we integrate gitoxide into cargo in such a way that one can control its usage in nightly via -Zgitoxide or Zgitoxide=<feature>[,featureN].

Planned features are:

  • fetch - all fetches are done with gitxide (this PR)
  • shallow_index - the crates index will be a shallow clone (planned)
  • shallow_deps - git dependencies will be a shallow clone (planned)
  • checkout - plain checkouts with gitoxide (planned)

The above list is a prediction and might change as we understand the requirements better.

Testing and Transitioning

By default, everything stays as is. However, relevant tests can be re-runwith gitoxide using

__CARGO_USE_GITOXIDE_INSTEAD_OF_GIT2=1 cargo test git

There are about 200 tests with 'git' in their name and I plan to enable them one by one. That way the costs for CI stay managable (my first measurement with one test was 2min 30s), while allowing to take one step at a time.

Custom tests shall be added once we realize that more coverage is needed.

That way we should be able to maintain running git2 and gitoxide side by side until we are willing to switch over to gitoxide entirely on stable cargo. Then turning on git2 might be a feature toggle for a while until we finally remove it from the codebase.

Please see the above paragraph as invitation for discussion, it's merely a basis to explore from and improve upon.

Tasks

  • add feature toggle
  • setup test system with one currently successful test
  • implement fetch with gitoxide (MVP)
  • fetch progress
  • detect spurious errors
  • enable as many git tests as possible (and ignore what's not possible)
  • fix all git-related test failures (except for 1: built-in upload-pack, skipped for now)
  • validate that all HTTP handle options that come from cargo specific values are passed to gitoxide
  • a test to validate git2 code can handle crates-index clones created with gitoxide and vice-versa
  • remove patches that enabled gitoxide enabled testing - it's not used anymore
  • remove all TODOs and use crates-index version of git-repository The remaining 2 TODO's are more like questions for the reviewer.
  • run all tests with gitoxide on the fastest platform as another parallel task
  • switch to released version
  • Tasks from first review round
  • create a new gitoxide release and refer to the latest version from crates.io (instead of git-dependency)
  • address 2nd review round comments

Postponed Tasks

I suggest to go breadth-first and implement the most valuable features first, and then aim for a broad replacement of git2. What's left is details and improved compatibility with the git2 implementation that will be required once gitoxide should become the default implementation on stable to complete the transition.

  • built-in support for serving the file protocol (i.e. without using git). Simple cases like clone can probably be supported quickly, fetch needs more work though due to negotiation.
  • SSH name fallbacks via a native (probably libssh (avoid LGPL) libssh2 based) transport. Look at this issue for some history.
  • additional tasks from this tracking issue

Proposed Workflow

I am now using stacked git to keep commits meaningful during development. This will also mean that before reviews I will force-push a lot as changes will be bucketed into their respective commits.

Once review officially begins I will stop force-pushing and create small commits to address review comments. That way it should be easier to understand how things change over time.

Those review-comments can certainly be squashed into one commit before merging.

Please let me know if this is feasible or if there are other ways of working you prefer.

Development notes

  • unrelated: this line refers to an issue that has since been resolved in curl.
  • Additional tasks related to a correct fetch implementation are collected in this tracking issue. These affect how well the HTTP transport can be configured, needs work
  • authentication is quite complex and centred around making SSH connections work. This feature is currently the weakest in gitoxide as it simply uses ssh (the program) and calls it a day. No authentication flows are supported there yet and the goal would be to match git there at least (which it might already do by just calling ssh). Needs investigation. Once en-par with git I think cargo can restart the whole fetch operation to try different user names like before.
    • the built-in ssh-program based transport can now understand permission-denied errors, but the capability isn't used after all since a builtin ssh transport is required.
  • It would be possible to implement git::Progress and just ignore most of the calls, but that's known to be too slow as the implementation assumes a Progress::inc() call is as fast as an atomic increment and makes no attempt to reduce its calls to it.
  • learning about a way to get custom traits in thiserror could make spurious error checks nicer and less error prone during maintenance. It's not a problem though.
  • I am using RUSTFLAGS=--cfg to influence the entire build and unit-tests as environment variables didn't get through to the binary built and run for tests.

Questions

  • The way gitoxide is configured the user has the opportunity to override these values using more specific git options, for example using url specific http settings. This looks like a feature to me, but if it's not gitoxide needs to provide a way to disable applying these overrides. Please let me know what's desired here - my preference is to allow overrides.
  • gitoxide currently opens repositories similar to how git does which respects git specific environment variables. This might be a deviation from how it was before and can be turned off. My preference is to see it as a feature.

Prerequisite PRs

@rustbot
Copy link
Collaborator

rustbot commented Dec 1, 2022

r? @epage

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 1, 2022
@Byron Byron force-pushed the integrate-gitoxide branch from 0c9d0c6 to a305af5 Compare December 1, 2022 18:00
@Byron Byron force-pushed the integrate-gitoxide branch 2 times, most recently from 62af2de to 1759263 Compare December 5, 2022 14:50
@Byron
Copy link
Member Author

Byron commented Dec 5, 2022

Very early results for the MVP of fetch (a lot is missing, but it works already).

  • 78s cargo (stable)
  • 36s cargo (nightly) (2.2x faster)

And for additional reference:

  • 32s git (2.4x faster)
  • 31s gix (2.5x faster)
Measurements
Gix
❯ gix clone --bare https://github.com/rust-lang/crates.io-index  crates-index
 14:36:44 read pack done 299.8MB in 27.80s (10.8MB/s)
 14:36:44  indexing 928260 objects were resolved into 464130 objects during thin-pack resolution
 14:36:44  indexing done 464.1k objects in 27.75s (16.7k objects/s)
 14:36:44 decompressing done 1.6GB in 27.75s (57.0MB/s)
 14:36:47     Resolving done 464.1k objects in 2.54s (183.1k objects/s)
 14:36:47      Decoding done 17.0GB in 2.54s (6.7GB/s)
 14:36:47 writing index file done 13.0MB in 0.02s (748.1MB/s)
 14:36:47  create index file done 464.1k objects in 30.36s (15.3k objects/s)
+refs/heads/*:refs/remotes/origin/*
        a44803046c31473544456b8d9ea089a868da2691 refs/heads/master -> refs/remotes/origin/master [new]

prodash ( main) [$?] took 31s

cargo (stable)
❯ cargo fetch
warning: file found to be present in multiple build targets: /Users/byron/dev/github.com/Byron/prodash/examples/dashboard.rs
    Updating crates.io index

prodash ( main) [$?] took 1m18s
git
❯ git clone --bare https://github.com/rust-lang/crates.io-index  crates-index
Cloning into bare repository 'crates-index'...
remote: Enumerating objects: 464130, done.
remote: Counting objects: 100% (55/55), done.
remote: Compressing objects: 100% (55/55), done.
remote: Total 464130 (delta 29), reused 26 (delta 0), pack-reused 464075
Receiving objects: 100% (464130/464130), 285.87 MiB | 11.45 MiB/s, done.
Resolving deltas: 100% (320065/320065), done.

prodash ( main) [$?] took 32s
cargo (nightly)
❯ /Users/byron/dev/github.com/rust-lang/cargo/target/release/cargo -Z gitoxide=fetch fetch
warning: file found to be present in multiple build targets: /Users/byron/dev/github.com/Byron/prodash/examples/dashboard.rs
    Updating crates.io index

prodash ( main) [$?] took 36s

@Byron Byron force-pushed the integrate-gitoxide branch 18 times, most recently from 2552fec to e8c4201 Compare December 11, 2022 20:46
@Byron Byron force-pushed the integrate-gitoxide branch from e8c4201 to fb11423 Compare December 15, 2022 19:34
@Byron Byron force-pushed the integrate-gitoxide branch 2 times, most recently from b903e2d to 759c264 Compare December 26, 2022 09:03
@bors
Copy link
Contributor

bors commented Dec 29, 2022

☔ The latest upstream changes (presumably #10771) made this pull request unmergeable. Please resolve the merge conflicts.

@Byron Byron force-pushed the integrate-gitoxide branch from 7792d2e to cfffda9 Compare March 2, 2023 11:35
@weihanglo
Copy link
Member

This is soon to be merged. @Byron wrote a decent commit message cfffda9, calling out important stuff to follow up. Let me repost them here:

Limitations

Note that what follows are current shortcomings that will be addressed in future PRs.

  • it's likely that authentication around the ssh protocol will work differently in practice
    as it uses the ssh program.
  • clones from file:// based crates indices will need the git binary to serve the locatl repository.
  • the progress bar shown when fetching doesn't work like the orgiinal, but should already feel 'faster'.

One thing also needs attentions is the behavior change of git credential helper mentioned in #11448 (comment). -Zgitoxide follows what git CLI does, but it is better mentioning the change when calling for testing.

Thank you to everyone involved in this pull request!

@bors r+

@bors
Copy link
Contributor

bors commented Mar 2, 2023

📌 Commit cfffda9 has been approved by weihanglo

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 2, 2023
@bors
Copy link
Contributor

bors commented Mar 2, 2023

⌛ Testing commit cfffda9 with merge 7cba527...

@bors
Copy link
Contributor

bors commented Mar 2, 2023

☀️ Test successful - checks-actions
Approved by: weihanglo
Pushing 7cba527 to master...

@bors bors merged commit 7cba527 into rust-lang:master Mar 2, 2023
@Byron Byron deleted the integrate-gitoxide branch March 2, 2023 13:06
bors added a commit that referenced this pull request Mar 2, 2023
chore(ci): Enforce cargo-deny in CI

With #11448, we are pulling in a wide and deep dependency tree which makes it harder for us to track what we are pulling in over time.

I've been trying out [`cargo-deny`](https://github.com/EmbarkStudios/cargo-deny) on my projects and wanted to explore how useful it might be for cargo.  atm I only have it configured to fail for unexpected licenses.  We can also use its warnings to hunt down and remove duplicated dependencies to speed up our builds.

I did also enable advisories.  We ignore the failure in a way to not block PRs or even show up as failure in PRs as PR authors are not responsible for dealing with these (unless its a new dep) and it can be intimidating as a contributor to see a failure and have no idea how to resolve it (as authors generally assume CI is green and failures are there fault)

I did not go too much further into what all `cargo-deny` can do; there might be more we can leverage.
weihanglo added a commit to weihanglo/rust that referenced this pull request Mar 7, 2023
23 commits in 9880b408a3af50c08fab3dbf4aa2a972df71e951..c1334b059c6dcceab3c10c81413f79bb832c8d9d
2023-02-28 19:39:39 +0000 to 2023-03-07 19:21:50 +0000

- Add `CARGO_PKG_README` (rust-lang/cargo#11645)
- path dependency: fix cargo-util version (rust-lang/cargo#11807)
- Adding display of which target failed to compile (rust-lang/cargo#11636)
- Fix `CARGO_CFG_` vars for configs defined both with and without value (rust-lang/cargo#11790)
- Breaking endless loop on cyclic features in added dependency in cargo-add (rust-lang/cargo#11805)
- Enhance the doc of timing report with graphs (rust-lang/cargo#11798)
- Make `sparse` the default protocol for crates.io (rust-lang/cargo#11791)
- Use sha2 to calculate SHA256 (rust-lang/cargo#11795)
- gitoxide progress bar fixes (rust-lang/cargo#11800)
- Check publish_to_alt_registry publish content (rust-lang/cargo#11799)
- chore: fix missing files in autolabel trigger_files (rust-lang/cargo#11797)
- chore: Update base64 (rust-lang/cargo#11796)
- Fix some doc typos (rust-lang/cargo#11794)
- chore(ci): Enforce cargo-deny in CI (rust-lang/cargo#11761)
- Some cleanup for unstable docs (rust-lang/cargo#11793)
- gitoxide integration: fetch (rust-lang/cargo#11448)
- patch can conflict on not activated packages (rust-lang/cargo#11770)
- fix(toml): Provide a way to show unused manifest keys for dependencies (rust-lang/cargo#11630)
- Improve error for missing crate in --offline mode for sparse index (rust-lang/cargo#11783)
- feat(resolver): `-Zdirect-minimal-versions` (rust-lang/cargo#11688)
- feat: Use test name for dir when running tests (rust-lang/cargo#11738)
- Jobserver cleanup (rust-lang/cargo#11764)
- Fix help string for  "--charset" option of "cargo tree" (rust-lang/cargo#11785)

Note that some 3rd-party licensing allowed list changed due to the
introducion of `gix` dependency
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 8, 2023
Update cargo

25 commits in 9880b408a3af50c08fab3dbf4aa2a972df71e951..7d3033d2e59383fd76193daf9423c3d141972a7d
2023-02-28 19:39:39 +0000 to 2023-03-08 17:05:08 +0000

- Revert "rust-lang/cargo#11738" - Use test name for dir when running tests (rust-lang/cargo#11812)
- Update CHANGELOG for 1.68 backports (rust-lang/cargo#11810)
- Add `CARGO_PKG_README` (rust-lang/cargo#11645)
- path dependency: fix cargo-util version (rust-lang/cargo#11807)
- Adding display of which target failed to compile (rust-lang/cargo#11636)
- Fix `CARGO_CFG_` vars for configs defined both with and without value (rust-lang/cargo#11790)
- Breaking endless loop on cyclic features in added dependency in cargo-add (rust-lang/cargo#11805)
- Enhance the doc of timing report with graphs (rust-lang/cargo#11798)
- Make `sparse` the default protocol for crates.io (rust-lang/cargo#11791)
- Use sha2 to calculate SHA256 (rust-lang/cargo#11795)
- gitoxide progress bar fixes (rust-lang/cargo#11800)
- Check publish_to_alt_registry publish content (rust-lang/cargo#11799)
- chore: fix missing files in autolabel trigger_files (rust-lang/cargo#11797)
- chore: Update base64 (rust-lang/cargo#11796)
- Fix some doc typos (rust-lang/cargo#11794)
- chore(ci): Enforce cargo-deny in CI (rust-lang/cargo#11761)
- Some cleanup for unstable docs (rust-lang/cargo#11793)
- gitoxide integration: fetch (rust-lang/cargo#11448)
- patch can conflict on not activated packages (rust-lang/cargo#11770)
- fix(toml): Provide a way to show unused manifest keys for dependencies (rust-lang/cargo#11630)
- Improve error for missing crate in --offline mode for sparse index (rust-lang/cargo#11783)
- feat(resolver): `-Zdirect-minimal-versions` (rust-lang/cargo#11688)
- feat: Use test name for dir when running tests (rust-lang/cargo#11738)
- Jobserver cleanup (rust-lang/cargo#11764)
- Fix help string for  "--charset" option of "cargo tree" (rust-lang/cargo#11785)

---

~~This update is primarily for making rust-lang/cargo#11630 into 1.69~~ (will file a beta backport then). However, just look into the licenses and dependencies permitted list, it looks a bit unfortunate but inevitable I guess?

r? `@ehuss`
cc `@Muscraft`
@Byron Byron mentioned this pull request Mar 9, 2023
15 tasks
@weihanglo weihanglo added the Z-gitoxide Nightly: gitoxide integration label Mar 9, 2023
@ehuss ehuss added this to the 1.70.0 milestone Mar 9, 2023
saethlin pushed a commit to saethlin/miri that referenced this pull request Mar 11, 2023
Update cargo

25 commits in 9880b408a3af50c08fab3dbf4aa2a972df71e951..7d3033d2e59383fd76193daf9423c3d141972a7d
2023-02-28 19:39:39 +0000 to 2023-03-08 17:05:08 +0000

- Revert "rust-lang/cargo#11738" - Use test name for dir when running tests (rust-lang/cargo#11812)
- Update CHANGELOG for 1.68 backports (rust-lang/cargo#11810)
- Add `CARGO_PKG_README` (rust-lang/cargo#11645)
- path dependency: fix cargo-util version (rust-lang/cargo#11807)
- Adding display of which target failed to compile (rust-lang/cargo#11636)
- Fix `CARGO_CFG_` vars for configs defined both with and without value (rust-lang/cargo#11790)
- Breaking endless loop on cyclic features in added dependency in cargo-add (rust-lang/cargo#11805)
- Enhance the doc of timing report with graphs (rust-lang/cargo#11798)
- Make `sparse` the default protocol for crates.io (rust-lang/cargo#11791)
- Use sha2 to calculate SHA256 (rust-lang/cargo#11795)
- gitoxide progress bar fixes (rust-lang/cargo#11800)
- Check publish_to_alt_registry publish content (rust-lang/cargo#11799)
- chore: fix missing files in autolabel trigger_files (rust-lang/cargo#11797)
- chore: Update base64 (rust-lang/cargo#11796)
- Fix some doc typos (rust-lang/cargo#11794)
- chore(ci): Enforce cargo-deny in CI (rust-lang/cargo#11761)
- Some cleanup for unstable docs (rust-lang/cargo#11793)
- gitoxide integration: fetch (rust-lang/cargo#11448)
- patch can conflict on not activated packages (rust-lang/cargo#11770)
- fix(toml): Provide a way to show unused manifest keys for dependencies (rust-lang/cargo#11630)
- Improve error for missing crate in --offline mode for sparse index (rust-lang/cargo#11783)
- feat(resolver): `-Zdirect-minimal-versions` (rust-lang/cargo#11688)
- feat: Use test name for dir when running tests (rust-lang/cargo#11738)
- Jobserver cleanup (rust-lang/cargo#11764)
- Fix help string for  "--charset" option of "cargo tree" (rust-lang/cargo#11785)

---

~~This update is primarily for making rust-lang/cargo#11630 into 1.69~~ (will file a beta backport then). However, just look into the licenses and dependencies permitted list, it looks a bit unfortunate but inevitable I guess?

r? `@ehuss`
cc `@Muscraft`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-documenting-cargo-itself Area: Cargo's documentation A-git Area: anything dealing with git A-interacts-with-crates.io Area: interaction with registries A-networking Area: networking issues, curl, etc. A-testing-cargo-itself Area: cargo's tests A-unstable Area: nightly unstable support Command-clean S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. Z-gitoxide Nightly: gitoxide integration
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants