Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crashes when GIT_DIR is a relative path, and the common prefix of paths to check includes a directory. #112

Closed
wnoise opened this issue Jan 2, 2021 · 6 comments
Labels
bug Something isn't working maybe invalid? Can't reproduce, or seems already fixed, or need more information question Further information is requested
Milestone

Comments

@wnoise
Copy link

wnoise commented Jan 2, 2021

Summary

Crashes when GIT_DIR is a relative path, and the common prefix of paths to check includes a directory. The latter is common, e.g. src. And of course pre-commit often specifies one or a few files.

This happens with the latest release, 1.2.2.

I found this bug while trying to use darker in a git commit hook.

Many versions of git set GIT_DIR when running hooks, and further set it to a relative path.

I'm using git version 2.17.1, but e.g. pre-commit has some interesting comments about other versions and environment variables as well. GIT_WORK_TREE and GIT_INDEX can also be set, for instance. A fuller set is listed below.

While this was exposed by trying to run it in a hook, darker should be robust to odd configurations.

Example

% GIT_DIR=.git darker --check src
Not a git repository
To compare two paths outside a working tree:
usage: git diff [--no-index] <path> <path>
Traceback (most recent call last):
  File "<virtualenv>/bin/darker", line 8, in <module>
    sys.exit(main())
  File "<virtualenv>/lib/python3.7/site-packages/darker/__main__.py", line 231, in main
    paths, revrange, args.isort, args.lint, black_args
  File "<virtualenv>/lib/python3.7/site-packages/darker/__main__.py", line 64, in format_edited_parts
    changed_files = git_get_modified_files(srcs, revrange, git_root)
  File "<virtualenv>/lib/python3.7/site-packages/darker/git.py", line 174, in git_get_modified_files
    lines = _git_check_output_lines(diff_cmd, cwd)
  File "<virtualenv>/lib/python3.7/site-packages/darker/git.py", line 130, in _git_check_output_lines
    return check_output(cmd, cwd=str(cwd)).decode("utf-8").splitlines()
  File "/usr/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['git', 'diff', '--name-only', '--relative', 'HEAD', '--', '.']' returned non-zero exit status 129.

vs the expected

% darker --check src
%

Analysis and suggested fix.

How it fails

In __main__.format_edited_parts() darker attempts to find a common root for the files specified, by calling darker.utils.get_common_root(). It then assigns this to the misleadingly named git_commit_root (it can
actually be deep within the git repository). This in turn gets passed to darker.git.git_get_modified_files(). That runs a subprocess call to git diff, but changed to that directory, making .git no longer resolve to the actual directory. Naturally, this fails.

I assume this feature is to enable running darker from outside a git repository, but with paths pointing into it. I personally don't see much use for this, and my preferred solution would be to rip it out entirely, rather than trying to do something clever. (If it is actually needed, it clearly should support the case where different files are in different git repositories.)

I do also see that it makes the current set of tests work easier, but this appears to be as much the structure of the tests depending on the implementation as the implementation making it easier to write tests.

Rip it out

The minimal thing to do is ripping out finding this common-prefix directory and just using the current directory.

This is implemented on the wnoise:minimal_disable_directory_changing branch (single commit).

With fewer git-specific hacks than other options, this also leaves it easier in the future to e.g. support other version control systems.

Unfortunately, I don't fully understand the tests, nor do I want to do lots of work that might not be accepted. This leaves the low-level code and tests supporting a feature that is never actually used.

Normalize environment

The easiest thing that preserves this feature seems to be turning all git environment variables that determine location into absolute paths.

These include:

  • GIT_DIR
  • GIT_INDEX
  • GIT_WORK_TREE
  • GIT_OBJECT_DIRECTORY
  • GIT_ALTERNATE_OBJECT_DIRECTORIES
  • GIT_COMMON_DIR

(In this specific case they could obviously be unset, but that'll cause problems in general).

Note that if GIT_DIR is set, but GIT_WORK_TREE is not, then git treats GIT_WORK_TREE as . If you change directories and GIT_DIR is set, you must also set GIT_WORK_TREE -- but this is exactly the problem of finding the repository root. If we assume that git is working as intended, then we can just set it to the absolute path to the current directory.

I have implemented this at:

https://github.com/wnoise/darker/tree/sanitize_git_environment

though I'm not sure whether I have added the code to do this in the right places.

This seems to work well, and was easy to implement, but I don't fully trust it. It tries to be too clever in understanding git's behavior and I don't necessarily expect it to work for e.g. submodules.

It's probably also a good idea to delete all GIT_* variables from the test environment. Again, I didn't feel comfortable altering the tests too much.

Bad ideas

There are other options that may seem tempting, but are definitely bad ideas:

  1. git-revparse has a truly bewildering set of functionality, including the flag --show-toplevel which will print out the repository root (although until git 2.25 it will fail silently when run outside a repository). In theory this should be relatively harmless, but sadly, it fails if GIT_DIR is set and doesn't point to the root. If GIT_WORK_TREE is set, it just reports that. If these are relative, it will break after changing directory.

  2. git has a -C option that is "pretend I was run from that directory". But it actually handles relative paths too, so it breaks in exactly the same way.

@akaihola
Copy link
Owner

akaihola commented Jan 4, 2021

@wnoise thanks for a totally awesome bug report! I'll take a careful look at it later this week – now it's way past my bedtime...

@akaihola
Copy link
Owner

I assume this feature is to enable running darker from outside a git repository, but with paths pointing into it. I personally don't see much use for this

It's hard to anticipate what IDEs may use as their working directory and how they pass paths to Python files to Darker and other tools. That's why I'd like Darker to survive as many different situations as possible.

@akaihola
Copy link
Owner

What about this solution:

GIT_DIR GIT_WORK_TREE cwd action
not set not set any chdir to closest common parent dir containing .git/
is set is set any no chdir
not set is set inside GIT_WORK_TREE no chdir
not set is set outside GIT_WORK_TREE chdir to GIT_WORK_TREE
is set not set any chdir to GIT_DIR/.. if common parent is inside it; otherwise fail

It's annoyingly many cases, though. The two first cases are probably realistically going to occur.

@akaihola akaihola added bug Something isn't working question Further information is requested labels Sep 4, 2021
@akaihola akaihola added this to the 1.3.1 milestone Sep 4, 2021
@akaihola
Copy link
Owner

akaihola commented Sep 9, 2021

Ok so I'm not sure if I understand my own proposal above from 8 months back.

Would it work if we keep the current behavior (chdir to closest common root) unless GIT_DIR is set, in which case no chdir would be done?

@akaihola
Copy link
Owner

akaihola commented Sep 9, 2021

Darker 1.3.0 accidentally changes the behavior: when setting LC_ALL=C for git invocations, it also cleans any other environment variables like GIT_*.

Does this still leave use cases which do require GIT_DIR as broken?

@akaihola akaihola added the maybe invalid? Can't reproduce, or seems already fixed, or need more information label Sep 10, 2021
@akaihola akaihola modified the milestones: 1.3.1, 1.3.2 Sep 15, 2021
@akaihola akaihola modified the milestones: 1.3.2, 1.4.0 Oct 28, 2021
@akaihola
Copy link
Owner

Darker 1.3.0 indeed seems to fix the behavior with @wnoise's original example. I'm thus closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working maybe invalid? Can't reproduce, or seems already fixed, or need more information question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants