[red-knot] Fix bug where module resolution would not be invalidated if an entire package was deleted #12378

AlexWaygood · 2024-07-18T13:00:58Z

Summary

This fixes a bug where module resolution would not be invalidated if an entire package was deleted.

Test Plan

I added a test that fails on main, and passes with this PR.
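
For readers unfamiliar with this class of bug, here is a minimal toy model of how a memoized result can go stale when a file existence check is not recorded as a dependency. It is only a sketch: `ToyDb`, `is_package`, and the `tracked` flag are hypothetical names invented for illustration, and this is neither Salsa nor red-knot's actual resolver code.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical toy database with a single memoized query. Not Salsa.
struct ToyDb {
    /// Paths that currently exist in the toy "file system".
    files: HashSet<String>,
    /// Query key -> (cached result, paths the result depends on).
    memo: HashMap<String, (bool, Vec<String>)>,
}

impl ToyDb {
    fn new(files: &[&str]) -> Self {
        ToyDb {
            files: files.iter().map(|f| f.to_string()).collect(),
            memo: HashMap::new(),
        }
    }

    /// Delete a file and drop every cached result that recorded it as a dependency.
    fn delete(&mut self, path: &str) {
        self.files.remove(path);
        self.memo
            .retain(|_, (_, deps)| !deps.iter().any(|d| d.as_str() == path));
    }

    /// Memoized query: does `dir` contain an `__init__.pyi`?
    /// With `tracked == false` the existence check bypasses dependency
    /// recording, so a later deletion cannot invalidate the cached answer.
    fn is_package(&mut self, dir: &str, tracked: bool) -> bool {
        if let Some((result, _)) = self.memo.get(dir) {
            return *result;
        }
        let init = format!("{dir}/__init__.pyi");
        let result = self.files.contains(&init);
        let deps = if tracked { vec![init] } else { Vec::new() };
        self.memo.insert(dir.to_string(), (result, deps));
        result
    }
}
```

In this model, calling `is_package("pkg", false)`, deleting `pkg/__init__.pyi`, and calling it again still returns `true`, because nothing tied the cached answer to the deleted file. With `tracked` set to `true`, the deletion evicts the cached entry and the second call returns `false`.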

github-actions · 2024-07-18T13:14:20Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

MichaReiser · 2024-07-18T13:51:53Z

crates/red_knot_module_resolver/src/path.rs

-                        FilePathRef::System(path) => resolver.db.system().path_exists(&path.join("__init__.pyi")),
-                        FilePathRef::Vendored(path) => resolver.db.vendored().exists(path.join("__init__.pyi")),
+                        FilePathRef::System(path) => system_path_to_file(resolver.db.upcast(),path.join("__init__.pyi")).is_some(),
+                        FilePathRef::Vendored(path) => vendored_path_to_file(resolver.db.upcast(), path.join("__init__.pyi")).is_some(),


Nit: We can use vendored().exists here because we know that the vendored file system is read-only.

I wondered about this. In what situation should we use vendored_path_to_file()?

In cases where we need a File, for example because we have to pass it to source_text

Right -- but if we know it's immutable, maybe we should just be calling vendored.read_to_string() instead of source_text() to get the source text -- the same argument applies, no? (Not trying to be argumentative, just curious!)

It depends. Not if it is a Python file, because calling source_text then has the benefit that we read the vendored Python file exactly once instead of multiple times (file.read_to_string is not cached).

That's also not the point I made above. We need vendored_path_to_file when calling parsed_module because the function takes a File argument (because it doesn't care if it is a vendored or non vendored file). That's when you need a file.

file.read_to_string is not cached

Nor is vendored.exists() -- each time we call that, we'll be querying the zip archive. Is the logic that we think that this will be inexpensive enough that it's not worth caching?

I'm not sure I understand. This seems to be unrelated to the original question.

it's not worth caching

Probably not. Caching comes at a high cost (memory, locking, lookup). We cache the resolved module, which should prevent us from calling into the vendored system in the first place. But we have to wait for some real-world usage to tell.

This seems to be unrelated to the original question.

It seems related to me, but perhaps this illustrates confusion on my part :-)

I'm trying to understand when and why (in future PRs) I should go via the Files APIs for reading information about files stored in the vendored zip archive, and when it makes sense to just get this information from the VendoredFileSystem. If I understand correctly, the tradeoffs are as follows:

Going via the Files API has the advantage that the query is automatically invalidated when the file changes -- but the file will never change for vendored files, so, unlike with filesystem files, that's not a good reason to go via the Files API.

Going via the Files API has the advantage that the call is automatically cached. This means that if we go via the Files API, we'll end up querying the zip archive less frequently; but that could also cost more than it saves us, due to higher memory allocation/consumption, and because locking and lookups in the cache are also costly.

Going via the Files API gives you a File object that you can pass directly to queries such as ruff_db::parsed::parsed_module. You wouldn't be able to call these queries if you just read the contents of the file as a string using the VendoredFileSystem directly.

Is that an accurate summary? I feel like my question has been answered now, anyway!

I think that's a good summary. I now see how it is related.

Regarding vendored_path_to_file(...).exists: this is actually not cached for files that don't exist. We could cache it, but we currently don't, to avoid tracking Files for vendored paths that don't exist. So vendored().exists and vendored_path_to_file(path).exists have the same cost for files that don't exist.
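
The caching behaviour discussed in this thread can be pictured with a small toy model: a lookup table that remembers paths that resolved to a file but deliberately does not remember misses. This is a sketch under stated assumptions, not the real `Files` or `VendoredFileSystem` implementation; `ToyVendored`, `ToyFiles`, `ToyFile`, and the `lookups` counter are hypothetical names introduced only for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for a read-only vendored archive.
/// `lookups` counts how often the archive itself is queried.
struct ToyVendored {
    entries: HashSet<String>,
    lookups: usize,
}

impl ToyVendored {
    fn new(entries: &[&str]) -> Self {
        ToyVendored {
            entries: entries.iter().map(|e| e.to_string()).collect(),
            lookups: 0,
        }
    }

    /// Uncached existence check: every call hits the archive.
    fn exists(&mut self, path: &str) -> bool {
        self.lookups += 1;
        self.entries.contains(path)
    }
}

/// Hypothetical file handle; queries would take this rather than a path.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ToyFile(usize);

/// Hypothetical table that interns handles for existing paths only.
struct ToyFiles {
    interned: HashMap<String, ToyFile>,
}

impl ToyFiles {
    fn new() -> Self {
        ToyFiles { interned: HashMap::new() }
    }

    /// Returns a handle if the path exists. Hits are remembered;
    /// misses are not, so no entries accumulate for nonexistent paths.
    fn vendored_path_to_file(
        &mut self,
        vendored: &mut ToyVendored,
        path: &str,
    ) -> Option<ToyFile> {
        if let Some(file) = self.interned.get(path) {
            return Some(*file);
        }
        if !vendored.exists(path) {
            return None;
        }
        let file = ToyFile(self.interned.len());
        self.interned.insert(path.to_string(), file);
        Some(file)
    }
}
```

In this model, resolving an existing path twice queries the archive once, while resolving a missing path twice queries it twice, which matches the point that both approaches cost the same for files that don't exist. The trade-off is that the table stays small at the price of repeated archive lookups for misses.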

crates/red_knot_module_resolver/src/resolver.rs

AlexWaygood added the red-knot Multi-file analysis & type inference label Jul 18, 2024

AlexWaygood requested review from carljm and MichaReiser as code owners July 18, 2024 13:00

MichaReiser approved these changes Jul 18, 2024

AlexWaygood added 2 commits July 19, 2024 12:43

[red-knot] Fix bug where module resolution would not be invalidated i…

942abf3

…f an entire package was deleted

Remove the test that doesn't fail on main, address review, add a comment

031effb

AlexWaygood force-pushed the resolver-salsa-invalidation branch from 9430f04 to 031effb Compare July 19, 2024 12:43

AlexWaygood merged commit 5f96f69 into main Jul 19, 2024
20 checks passed

AlexWaygood deleted the resolver-salsa-invalidation branch July 19, 2024 12:53
