Provide more guidance on how to use .git_archival.txt #987

Open
Mr0grog opened this issue Dec 4, 2023 · 12 comments · May be fixed by #1034

Comments

@Mr0grog

Mr0grog commented Dec 4, 2023

I was working on switching a package from Versioneer to setuptools_scm, and wound up scratching my head over warnings about “unprocessed git archival found” when running python -m build .. It took me a while to understand what was going on, and I think some more guidance in the docs about how one should build a package when .git_archival.txt is present would have been helpful.

Specifically, if you add a .git_archival.txt file to a repository as suggested in docs/usage.md and then run python -m build ., you’ll see output like:

* Creating venv isolated environment...
* Installing packages in isolated environment... (setuptools>=64, setuptools_scm>=8)
* Getting build dependencies for sdist...
* Building sdist...
* Building wheel from sdist
* Creating venv isolated environment...
* Installing packages in isolated environment... (setuptools>=64, setuptools_scm>=8)
* Getting build dependencies for wheel...
/private/var/folders/jk/1hv06w454vj4q4rk2gl0zg800000gn/T/build-env-_6jt3k01/lib/python3.12/site-packages/setuptools_scm/git.py:308: UserWarning: git archive did not support describe output
  warnings.warn("git archive did not support describe output")
/private/var/folders/jk/1hv06w454vj4q4rk2gl0zg800000gn/T/build-env-_6jt3k01/lib/python3.12/site-packages/setuptools_scm/git.py:327: UserWarning: unprocessed git archival found (no export subst applied)
  warnings.warn("unprocessed git archival found (no export subst applied)")
ERROR setuptools_scm._file_finders.git listing git files failed - pretending there aren't any
* Installing packages in isolated environment... (wheel)
* Building wheel...
/private/var/folders/jk/1hv06w454vj4q4rk2gl0zg800000gn/T/build-env-_6jt3k01/lib/python3.12/site-packages/setuptools_scm/git.py:308: UserWarning: git archive did not support describe output
  warnings.warn("git archive did not support describe output")
/private/var/folders/jk/1hv06w454vj4q4rk2gl0zg800000gn/T/build-env-_6jt3k01/lib/python3.12/site-packages/setuptools_scm/git.py:327: UserWarning: unprocessed git archival found (no export subst applied)
  warnings.warn("unprocessed git archival found (no export subst applied)")
ERROR setuptools_scm._file_finders.git listing git files failed - pretending there aren't any
Successfully built mypkg-0.1.2.dev0+g8e7dc46.d20231204.tar.gz and mypkg-0.1.2.dev0+g8e7dc46.d20231204-py3-none-any.whl

I’ve cut some of the extra log lines to make the two warnings and errors clearer.

I’m pretty certain what’s happening here is that build creates an sdist, which works fine, but then unpacks that sdist in a temp directory to build the wheel from it, which is not fine. The unpacked sdist is not in a git repo, so setuptools-scm can’t find version info that way. The sdist, which was built from my repo’s working directory, also contains a .git_archival.txt file that is just an unprocessed template, so setuptools-scm warns about that and then fails to get any useful versioning info from it.

That took me a little while to understand, and suggested to me that the right way to build a package when you have a .git_archival.txt file present is to do one of two things:

  1. Ensure the file is not included in the build via MANIFEST.in or some config setting for whatever build backend the project is using, or

  2. Always build from an actual archive. That is, a build process should do something like:

    git archive --output ../build_dir/source_archive.tar <commit-or-tag-to-build>
    cd ../build_dir
    tar -xf source_archive.tar
    rm source_archive.tar
    python -m build .

Do I have that right? If so, I think it would be helpful to add to the docs. I’m happy to make those changes in a PR if my understanding is correct.

(Obviously these specific results are caused by how build does its work, and other front-ends might or might not cause this warning. But I think the problem and solutions above are fairly generic and helpful to clarify. A lot of front-ends or build techniques would still wind up with the unprocessed .git_archival.txt file in an sdist if you don’t take one of the two above steps, and that doesn’t seem good.)
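
For illustration, the second option above could be scripted along these lines. This is only a sketch that mirrors the shell steps: the function name `build_from_archive` and the injectable `run` parameter are hypothetical, and it assumes `git`, `tar`, and the `build` package are available.

```python
import subprocess
import sys
from pathlib import Path


def build_from_archive(ref, build_dir, run=subprocess.run):
    """Export `ref` with `git archive`, unpack it, and build from the unpacked tree."""
    build_dir = Path(build_dir).resolve()
    build_dir.mkdir(parents=True, exist_ok=True)
    archive = build_dir / "source_archive.tar"

    # `git archive` is what applies export-subst to .git_archival.txt.
    run(["git", "archive", "--output", str(archive), ref], check=True)

    # Unpack next to the archive, then drop the archive itself.
    run(["tar", "-xf", archive.name], cwd=build_dir, check=True)
    archive.unlink(missing_ok=True)

    # Build the sdist and wheel from the exported tree, not the checkout.
    run([sys.executable, "-m", "build", "."], cwd=build_dir, check=True)
```

Calling `build_from_archive("<commit-or-tag-to-build>", "../build_dir")` from the repository root would follow the same sequence as the commands above. As in the shell version, the build directory is meant to live outside the repository.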


As a side note, I also found the dual warnings confusing:

/private/var/folders/jk/1hv06w454vj4q4rk2gl0zg800000gn/T/build-env-_6jt3k01/lib/python3.12/site-packages/setuptools_scm/git.py:308: UserWarning: git archive did not support describe output
  warnings.warn("git archive did not support describe output")
/private/var/folders/jk/1hv06w454vj4q4rk2gl0zg800000gn/T/build-env-_6jt3k01/lib/python3.12/site-packages/setuptools_scm/git.py:327: UserWarning: unprocessed git archival found (no export subst applied)
  warnings.warn("unprocessed git archival found (no export subst applied)")

And it wasn’t until I started searching the source here that I realized the first warning is not really correct and is kind of redundant. It would be clearer if only the second warning (“unprocessed git archival found”) was logged and the first (“git archive did not support describe output”) was suppressed in this case.
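
To make the suggestion concrete, here is a simplified sketch of a check that looks for the unprocessed template first and emits only one warning. It is not setuptools_scm's actual code: `parse_archival` and `classify_archival` are hypothetical names, and the sketch assumes the template uses `$Format:...$` placeholders and that a git version without describe support leaves `%(describe` in the output.

```python
import warnings


def parse_archival(text):
    """Parse 'key: value' lines from a .git_archival.txt file into a dict."""
    data = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            data[key.strip()] = value.strip()
    return data


def classify_archival(text):
    """Return 'unprocessed', 'describe-unsupported', or 'ok', warning at most once."""
    data = parse_archival(text)
    node = data.get("node", "")
    describe = data.get("describe-name", "")

    # Check for the untouched template first, so only the clearer warning fires.
    if "$format" in node.lower() or "$format" in describe.lower():
        warnings.warn("unprocessed git archival found (no export subst applied)")
        return "unprocessed"

    # Only a processed file can meaningfully lack describe support.
    if "%(describe" in describe:
        warnings.warn("git archive did not support describe output")
        return "describe-unsupported"

    return "ok"
```

With this ordering, a template file produces only the "unprocessed git archival found" warning.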

@LecrisUT
Contributor

.git_archival.txt is just a fancy way of saving the output of

$ git describe --tags --match=...

If the above command does not create appropriate output of the tags, check what tags you have and their naming schema. I often forget to add an initial v0.0.0 tag on a project which is often the issue I get.

.git_archival.txt is also a fallback; the order of preference is: PKG-INFO (from an sdist on PyPI, for example) -> .git metadata -> .git_archival.txt. The documentation here is misleading; the source code below shows the actual order:

https://github.com/pypa/setuptools_scm/blob/d081257ea39dfae710603796a9e85033256cc012/_own_version_helper.py#L28-L34

@Mr0grog
Author

Mr0grog commented Apr 16, 2024

.git_archival.txt is just a fancy way of saving the output of… .git_archival.txt is also a fallback; the order of preference is: PKG-INFO -> .git metadata -> .git_archival.txt.

Yes, I understand that part. There is no issue with git or tags or anything, and no issue with how setuptools_scm is getting the version info — there are actually no technical issues here at all. As noted in my original post, this is more a documentation issue, and is specifically an issue when you use this package and use the build frontend to do your builds.

To put it more concisely, if you are using this package and follow the directions in the usage docs, running python -m build . (in your git repo, with proper tags) will print several warnings and errors. None of these are fatal or will prevent your package from having correct versioning info, but they certainly seemed concerning to me. (To be fair, I asked other package authors who use this, and they all sort of shrugged and said “yeah, there are a bunch of warnings, but things seem to work fine, so I just ignore them.” So maybe I’m just more sensitive than other people!)

The solution to this is either:

  1. Explicitly keep .git_archival.txt out of your sdist (reasonable since a valid sdist will have PKG-INFO for it to use instead)
  2. OR build from a git archive rather than a git checkout/working directory. (If you think it’s nice to have .git_archival.txt with all the other metadata it has from git in your sdist, even though it’s not needed there. Probably nice for SBOM-related use cases.)

I realize there are other build frontends that might not run afoul of these issues and that this package shouldn’t be too concerned with whatever frontend you are using, but given that build is the one recommended by PyPA, I imagine this error must be one that is common. It seems like it would be good to call this out in the usage/configuration docs.
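
Whichever option a project picks, it can be handy to verify the result. The sketch below scans a built sdist for a `.git_archival.txt` that still contains `$Format:` placeholders. The function name `find_unprocessed_archival` is hypothetical, and the check assumes the template uses `$Format:...$` placeholders.

```python
import tarfile


def find_unprocessed_archival(sdist_path):
    """Return names of .git_archival.txt members that still hold $Format: placeholders."""
    hits = []
    # tarfile detects the compression (e.g. .tar.gz) automatically.
    with tarfile.open(sdist_path) as tar:
        for member in tar.getmembers():
            is_archival = member.name.split("/")[-1] == ".git_archival.txt"
            if member.isfile() and is_archival:
                content = tar.extractfile(member).read().decode("utf-8", "replace")
                if "$Format:" in content:
                    hits.append(member.name)
    return hits
```

An empty list means the sdist either excludes the file or contains a processed copy; a non-empty list points at the template that triggers the warnings.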

@LecrisUT
Contributor

LecrisUT commented Apr 16, 2024

Hmm that is odd, looking at the order above, you should not be getting any messages related to git-archive. Maybe it is an older version of setuptools_scm that is used? Looking at the git-blame though, I see it dating back to 8.0.0. It seems worth investigating why these warnings are being triggered.

  1. Explicitly keep .git_archival.txt out of your sdist (reasonable since a valid sdist will have PKG-INFO for it to use instead)

Should not be necessary since PKG-INFO is first priority. Normal git is next, and in neither case is any (non-debug) message triggered.

The .git_archival.txt is only relevant for packagers and for usage of pip install https://... (maybe also pip install git+https://...?). The normal user should succeed at parse_pkginfo and the developer at git.parse.

To put it more concisely, if you are using this package and follow the directions in the usage docs, running python -m build . (in your git repo, with proper tags) will print several warnings and errors.

Following the steps, the git.parse path should be successful (there is no step to call git archive or to move out of the git repo). There is something fishy there.

@Mr0grog
Author

Mr0grog commented Apr 16, 2024

It might help if I explain what’s happening here step-by-step:

When build builds both an sdist and a wheel (which is what it does by default), it first builds the sdist, and then builds the wheel from the sdist.

  1. build works with the build backend to produce an sdist.

    • The sdist will have the right version info in PKG-INFO, since it gets pulled from the .git metadata. All good.
    • If you run this from a git checkout (not an archive), most backends will copy the template version of .git_archival.txt into your sdist alongside all the normal PKG-INFO files and whatnot. I tested this with both Setuptools and Hatch. Other backends would probably produce a similar result, but I have not tested.
    • No errors or warnings are logged at this point. Everything is fine, except you have an invalid .git_archival.txt file in your sdist, which isn’t technically wrong, but is not great.
  2. build copies the sdist into a temporary directory and builds a wheel from the sdist.

    1. build installs the package and build dependencies from the sdist in a new, isolated environment.
      1. setuptools_scm warns “git archive did not support describe output” because there is a .git_archival.txt file present but it has bad content (since, as noted above, the sdist we are building from has an unprocessed template version of this file, not one that has been filled in).
      2. setuptools_scm warns “unprocessed git archival found (no export subst applied)”. This happens just after the previous warning and for basically the same reasons, but this message is clearer (I think it would be helpful if this message replaced the previous one rather than being printed in addition to it, but that’s not the main issue here).
      3. setuptools_scm prints ERROR setuptools_scm._file_finders.git listing git files failed - pretending there aren't any because of the previous warnings. It falls back to the PKG-INFO in the sdist and building continues fine.
    2. build works with the build backend to build the wheel in the new, isolated environment. The above warnings/errors are printed again (the previous were from package installation, and then it all happens again during build).
    3. The wheel is built fine and all good since the fallback to PKG-INFO works.
  3. build puts the resulting sdist and wheel in your dist directory. They work fine (although the sdist has the template .git_archival.txt file in it; see step 1 above). All done.

There’s not really an issue with the fallback order or anything else. You get working sdists and wheels at the end, and things function. These are just annoying/confusing warnings for most people.

Hopefully that makes it clear how excluding the .git_archival.txt file from sdists solves this problem by avoiding the underlying issues created in step 1. And how building from an archive instead of a checkout would also solve this by causing step 1 to build an sdist with a valid .git_archival.txt file instead of a template.

Does that make sense? It’s been several months since I filed this issue, but I’m pretty sure that’s what was happening when I investigated.

@Mr0grog
Author

Mr0grog commented Apr 16, 2024

(Also possible some changes with ordering make this all work fine now; I just know this is how it worked with whatever was the latest release back in December.)

@LecrisUT
Contributor

OK, I've navigated the code a bit more and realized the snippet I shared is only used by setuptools-scm itself and not by projects. And I guess you are not setting tool.setuptools_scm.version_file. The logic of version detection for actual projects seems to be governed by the entry-point setup. Hopefully I have the relevant code correct now at:
https://github.com/pypa/setuptools_scm/blob/d8d2b8614c6710d0a06a5c22da35d083f9fb5e95/src/setuptools_scm/_integration/setuptools.py#L71 which leads to https://github.com/pypa/setuptools_scm/blob/d8d2b8614c6710d0a06a5c22da35d083f9fb5e95/src/setuptools_scm/_get_version_impl.py#L58-L63

I am failing to navigate further at this time of day. I haven't run build in a while, but I don't remember encountering such error messages. I've always had tool.setuptools_scm.version_file set up, though.

But I agree such messages should not appear, and it's weird that the order is defined in _own_version_helper, but not synchronized outside. I'm just not sure where the root cause would be to better find what should be documented or fixed.

@Mr0grog
Author

Mr0grog commented Apr 16, 2024

And I guess you are not setting tool.setuptools_scm.version_file.

Actually, I am setting that. I dug around and found the old minimal test-case repo I made for this last year and just added it to GitHub: https://github.com/Mr0grog/setuptools_scm_issue_987_demo

Trying it out again, I just noticed that even with .git_archival.txt excluded from the sdist (via uncommenting the exclusion in MANIFEST.in), it still logs the ERROR setuptools_scm._file_finders.git listing git files failed - pretending there aren't any line, which I don’t remember happening before (maybe it did and I didn’t notice before, though).

@Mr0grog
Author

Mr0grog commented Apr 16, 2024

Ah, after messing around a bit with the source locally, it looks like:

  • setuptools starts by calling _get_version_impl._get_version() (interesting that it starts with the underscore-prefixed version of this function, but 🤷 )

  • That calls _get_version_impl.parse_version()

  • That uses the following sources/fallbacks, in order:

    1. _read_pretended_version_for() (gets it from the environment)
    2. parse_scm_version() (calls the handlers for the setuptools_scm.parse_scm entrypoint, which is basically reading direct from git if a .git file is present)
    3. parse_fallback_version() (calls the handlers for the setuptools_scm.parse_scm_fallback entrypoint, which checks the .git_archival.txt file before PKG-INFO)

So really there aren’t any warnings building the sdist because parse_scm_version() succeeds (since you are building from a checkout and .git is present). When building the wheel from the sdist, that fails and falls back to parse_fallback_version(), which reads from .git_archival.txt before PKG-INFO, a totally different order from try_parse in _own_version_helper.py. I’m not sure why the order is different, but the todo comment implies that it shouldn’t be and they just got out of sync.

That explains the warnings, although I’m not sure about the history/correctness/edge cases around the current way things are ordered for the setuptools_scm.parse_scm_fallback and what the implications of changing it would be.
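
The effect of the ordering can be shown with a toy model. This is not setuptools_scm's code: `first_version`, `resolve`, and the reader names are hypothetical stand-ins for the sources listed above, the version string is illustrative, and the scenario assumes a wheel being built from an unpacked sdist (no .git directory, template .git_archival.txt, valid PKG-INFO).

```python
def first_version(sources):
    """Try each (name, reader) in order; return the version and the names tried."""
    tried = []
    for name, reader in sources:
        tried.append(name)
        version = reader()
        if version is not None:
            return version, tried
    return None, tried


# Stand-ins for the real sources in the wheel-from-sdist scenario.
readers = {
    "pretend": lambda: None,           # no override set in the environment
    "scm": lambda: None,               # no .git directory in the unpacked sdist
    "git_archival": lambda: None,      # template file: no version, plus warnings
    "pkg_info": lambda: "0.1.2.dev0",  # illustrative version from PKG-INFO
}

current_order = ["pretend", "scm", "git_archival", "pkg_info"]
swapped_order = ["pretend", "scm", "pkg_info", "git_archival"]


def resolve(order):
    return first_version([(name, readers[name]) for name in order])
```

Both orders end up with the PKG-INFO version, but only the current order consults the template .git_archival.txt on the way, which is where the warnings come from.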


This doesn’t explain the ERROR setuptools_scm._file_finders.git listing git files failed - pretending there aren't any error.

It looks to me like that’s because _file_finders.find_files() (which plugs directly into setuptools) uses _entrypoints.iter_entry_points() instead of discover.iter_matching_entrypoints(), so it looks for git files even if there’s no .git directory present (and similarly, always looks for mercurial files as long as it didn’t find any git files, even if there’s no .hg directory present).

I definitely don’t have enough context to say why it might be that way or if it’s a mistake. But at first glance it looks like just an oversight or typo.
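
As a rough illustration of the difference, the sketch below only selects a file finder when its marker directory exists. It is loosely modeled on the idea described above, not on the library's implementation: `applicable_finders` and the finder callables are hypothetical.

```python
from pathlib import Path


def applicable_finders(root, finders):
    """Return the finders whose marker (e.g. '.git' or '.hg') exists under `root`.

    `finders` maps a marker name to a callable that lists files.
    """
    root = Path(root)
    return [finder for marker, finder in finders.items() if (root / marker).exists()]
```

In an unpacked sdist with no `.git` or `.hg` directory, this returns an empty list, so no finder runs and nothing is left to fail and log an error.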

@Mr0grog
Author

Mr0grog commented Apr 16, 2024

Anyway, it’s probably good to fix the above so errors and warnings aren’t logged, but it also seems odd to me for sdists to have the template version of .git_archival.txt in them, and I think it would still be good to amend the docs to suggest either excluding it from the sdist or building from an archive (excluding is probably simpler to explain, and the docs could stay simpler by just suggesting that).

@LecrisUT
Contributor

Nice, thanks for checking. I kind of understand the preference for .git there. If you do an editable install, you will have a PKG-INFO file, but that will rapidly be outdated. But the order of .git_archival.txt and PKG-INFO doesn't make sense there. I will try to make a PR to invert that order if you don't beat me to it.

Also, the setuptools_scm._file_finders.git message should be changed. Just double-check whether it's used somewhere else; otherwise, gate it behind a have_fallback=False kwarg.

@Mr0grog
Author

Mr0grog commented Apr 17, 2024

If you do an editable install, you will have a PKG-INFO file, but that will rapidly be outdated.

Oh, good point!

the order of .git_archival.txt and PKG-INFO doesn't make sense there. I will try to make a PR to invert that order if you don't beat me to it.

You clearly have a much better sense of the right approaches, impacts of the ordering, and edge cases this project has had to deal with, so I’m happy to leave that to you.

setuptools_scm._file_finders.git message should be changed, just double-check if it's used somewhere else, otherwise gate that by a have_fallback=False kwarg.

I’m not sure I follow this part. If I understand correctly, the file finder stuff is a totally separate entrypoint into setuptools_scm from the version stuff, so it doesn’t have any context about the other fallback things that are happening. I don’t understand what would be used to set a new have_fallback kwarg when applicable.

@LecrisUT LecrisUT linked a pull request Apr 17, 2024 that will close this issue
3 tasks
@LecrisUT
Contributor

I have started work on it in #1034; I'm not sure how much time I will have to work on it this week.
