Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Index parsing errors are swallowed #14894

Closed
jonhoo opened this issue Dec 5, 2024 · 3 comments · Fixed by #14927
Closed

Index parsing errors are swallowed #14894

jonhoo opened this issue Dec 5, 2024 · 3 comments · Fixed by #14927
Labels
A-dependency-resolution Area: dependency resolution and the resolver C-bug Category: bug S-triage Status: This issue is waiting on initial triage.

Comments

@jonhoo
Copy link
Contributor

jonhoo commented Dec 5, 2024

Problem

When using a private registry, I recently ran into a case where a crate version that is in the index (confirmed by cURLing the registry directly) could not be resolved by Cargo, giving me an error like

error: no matching package named `something` found
location searched: `private-registry` index

I quickly suspected that the index entry was invalid somehow, but Cargo gave no errors or warnings, even with CARGO_LOG=trace. Furthermore, by tweaking http_remote.rs, I found that its query to the registry (with an if-none-match header) returned NotModified, indicating Cargo did have the etag of the latest index file, which did include the crate/version in question. By removing the etag header, I also confirmed that the contents Cargo got from its request contained the version in question.

Long story short, I discovered that Cargo ends up swallowing errors related to invalid index entries. The first time Cargo downloads an index file with an invalid entry, it hits

bail!(
"optional dependency `{dep}` is not included in any feature\n\
Make sure that `dep:{dep}` is included in one of features in the [features] table."
);

However, that gets turned into a tracing::info in

Err(e) => {
// This should only happen when there is an index
// entry from a future version of cargo that this
// version doesn't understand. Hopefully, those future
// versions of cargo correctly set INDEX_V_MAX and
// CURRENT_CACHE_VERSION, otherwise this will skip
// entries in the cache preventing those newer
// versions from reading them (that is, until the
// cache is rebuilt).
tracing::info!(
"failed to parse {:?} registry package: {}",
relative,
e
);
continue;
}

The user is likely to miss this since Cargo tracing logs are off by default, so they will just observe that Cargo claims the version doesn't exist.

Unfortunately, the issue doesn't stop there. If the user then tries to run with CARGO_LOG=trace, they will then see no errors or warnings from Cargo. In fact, the logs will make it seem like the invalid entry doesn't exist at all. There will be no trace of it anywhere. This is because when Cargo parses an index entry, it caches the parsed lines so that it doesn't have to parse them again

cache.versions.push((version.clone(), line));

let cache_bytes = cache.serialize(index_version.as_str());
// Once we have our `cache_bytes` which represents the `Summaries` we're
// about to return, write that back out to disk so future Cargo
// invocations can use it.
cache_manager.put(name, &cache_bytes);

However, when a line can't be parsed, it doesn't cache that line (note the continue where the error gets turned into tracing::info!). It also doesn't leave that line as "unparsed" somewhere. So on subsequent invocations, Cargo grabs the version list from the cache manager to avoid parsing lines again, doesn't observe the (invalid) index entry for the version in question so it doesn't warn, then downloads the file again because it doesn't find the requested version, that download yields and empty response because the etag does match the latest index file, and so Cargo gives up.

So, two requests:

  1. Cargo should be louder when it finds an invalid index entry.
  2. Cargo should repeat the warning about an invalid index entry even if it's in an up-to-date cached version of the index file.

Steps

  1. Inject a new version with an invalid index entry into a registry.
  2. Take a dependency on said version.
  3. Run cargo build.
  4. Observe that Cargo only says "version not found" with no further warnings.
  5. Re-run cargo build with CARGO_LOG=trace.
  6. Observe that there is no information in the logs about the invalid index entry which caused the version not to be found.

Possible Solution(s)

No response

Notes

In my case, the index entry parsing error I was eventually able to extract by disabling the if-none-match logic and then running with CARGO_LOG=trace was:

failed to parse "so/me/something" registry package: optional dependency `uom_0_35` is not included in any feature
Make sure that `dep:uom_0_35` is included in one of features in the [features] table.

I suspect, but am not quite sure, that this may be an erroneous index entry manipulation by the private registry provider we're using. Although it could also be that cargo publish allowed publishing something with a nonsensical optional dependency in the first place?

Version

cargo 1.85.0 (05f54fdc3 2024-12-03)
@jonhoo jonhoo added C-bug Category: bug S-triage Status: This issue is waiting on initial triage. labels Dec 5, 2024
@epage
Copy link
Contributor

epage commented Dec 5, 2024

This is heavily related to #10623.

Ideally we track the error state through Cargo. We've added IndexSummary to better track state. That now tracks newer versions which can help with #10623 (though we should also track the MSRV if present) though we still need to pipe this through up to the user. Also, unsure if that is only on successful parses or also on failed parses which we should also support.

What is distinct about this Issue compared to #10623 is adding an error variant for when the parse fails for a reason other than a newer index version.

Granted, this will only help if we can extract the name and version. If its too corrupted to even do that, we'll need to consider other means of reporting.

@ehuss
Copy link
Contributor

ehuss commented Dec 5, 2024

Thanks for the report! The failures are silenced somewhat on purpose, since this is intended to handle the scenario where an older version doesn't understand the format of newer index entries. We wanted to avoid the situation where older versions would spam warnings on the console whenever they are used. I think this is somewhat related or a duplicate of #7929 where we could have a controlled method of presenting this information. Or maybe similarly to #4309, where there could be a separate command to display that.

@epage
Copy link
Contributor

epage commented Dec 5, 2024

I think there are two separate cases

The work we have planned for #10623 would make it so we have more information available to show for #7929.

@epage epage added the A-dependency-resolution Area: dependency resolution and the resolver label Dec 5, 2024
github-merge-queue bot pushed a commit that referenced this issue Dec 12, 2024
…14923)

### What does this PR try to resolve?

The user likely intended what they said and we should tell them why it
didn't work.
Rejected versions also imply alt versions below them.

Fixes #4260

In changing the errors from alt-versions to rejected-versions, I found
part of the old error message was better and changed the message
introduced in #14897 from matching alt-names to alt-versions.

This includes the test for #14921. The "fix" was done before this
because as it was part of the refactors in prep for this.

### How should we test and review this PR?

### Additional information

Next steps after this are to still produce an
`IndexSummary::Unsupported` even if there are deserialization errors, so
long as we know the name and version which should take care of #10623
and #14894.
epage added a commit to epage/cargo that referenced this issue Dec 12, 2024
While rust-lang#14897 reported packages with an unsupported index schema version,
that only worked if the changes in the schema version did not cause
errors in deserializing `IndexPackage` or in generating a `Summary`.

This extends that change by recoverying on error with a more lax,
incomplete parse of `IndexPackage` which should always generate a valid
`Summary`.

To help with a buggy Index, we also will report as many as we can.
This does not provide a way to report to users or log on cache reads if
the index entry is not at least `{"name": "<string>", "vers": "<semver>"}`.

As a side effect, the index cache will include more "invalid" index
entries.
That should be ok as we will ignore the invalid entry in the cache when
loading it.
Ignoring of invalid entries dates back to rust-lang#6880 when the index cache was
introduced.

Fixes rust-lang#10623
Fixes rust-lang#14894
epage added a commit to epage/cargo that referenced this issue Dec 12, 2024
While rust-lang#14897 reported packages with an unsupported index schema version,
that only worked if the changes in the schema version did not cause
errors in deserializing `IndexPackage` or in generating a `Summary`.

This extends that change by recoverying on error with a more lax,
incomplete parse of `IndexPackage` which should always generate a valid
`Summary`.

To help with a buggy Index, we also will report as many as we can.
This does not provide a way to report to users or log on cache reads if
the index entry is not at least `{"name": "<string>", "vers": "<semver>"}`.

As a side effect, the index cache will include more "invalid" index
entries.
That should be ok as we will ignore the invalid entry in the cache when
loading it.
Ignoring of invalid entries dates back to rust-lang#6880 when the index cache was
introduced.

Fixes rust-lang#10623
Fixes rust-lang#14894
github-merge-queue bot pushed a commit that referenced this issue Dec 13, 2024
### What does this PR try to resolve?

While #14897 reported packages with an unsupported index schema version,
that only worked if the changes in the schema version did not cause
errors in deserializing `IndexPackage` or in generating a `Summary`.

This extends that change by recoverying on error with a more lax,
incomplete parse of `IndexPackage` which should always generate a valid
`Summary`.

To help with a buggy Index, we also will report as many as we can.
This does not provide a way to report to users or log on cache reads if
the index entry is not at least `{"name": "<string>", "vers":
"<semver>"}`.

Fixes #10623
Fixes #14894

### How should we test and review this PR?

My biggest paranoia is some bad interaction with the index cache
including more "invalid" index entries.
That should be ok as we will ignore the invalid entry in the cache when
loading it.
Ignoring of invalid entries dates back to #6880 when the index cache was
introduced.

### Additional information
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-dependency-resolution Area: dependency resolution and the resolver C-bug Category: bug S-triage Status: This issue is waiting on initial triage.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants