Add bootstrap install script for flit_core #481
Conversation
We only have fairly rudimentary state tracking, so state from a failure may not get discarded in all circumstances. I'm not really sure that approach is a good fit for our infrastructure, which tends to do fairly strict build/install separation; maintaining entirely separate package build/install infrastructure only for bootstrapping also isn't great from a maintenance point of view. Does it avoid the recursive dependency issues with

Currently I have pep517/flit support brought up somewhat using
whl_fname = build_thyself.build_wheel(td)
whl_path = os.path.join(td, whl_fname)
print("Installing to", dest)
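For context, the rest of the install step in this approach is essentially just unpacking the wheel; here is a minimal stdlib-only sketch (the function name is made up, and `whl_path`/`dest` correspond to the variables in the snippet above):

```python
import os
import zipfile

def install_wheel(whl_path, dest):
    """Unpack a wheel into dest (e.g. a site-packages directory).

    A wheel is a plain zip file, so for a pure-Python package with no
    scripts, extracting it is all an install needs to do.
    """
    os.makedirs(dest, exist_ok=True)
    with zipfile.ZipFile(whl_path) as zf:
        zf.extractall(dest)
```

This deliberately skips record-keeping, script generation and uninstall support, which is fine for a one-shot bootstrap.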
Can we have argparse arguments that split the build and install stages up? It would also be handy to have a --root flag along the lines of ensurepip/distutils/setuptools, since we need to redirect the install location when cross-compiling.
There's nothing to actually do with --root here - that only matters if you install scripts, which this doesn't. For this scenario, you'd just prefix it to the target path, e.g. ${ROOT}/usr/lib/pythonX.Y/site-packages.
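To illustrate the prefixing idea, a sketch of how a --root value could be applied to the default install path, in the spirit of what ensurepip/distutils do (the function name is invented, and POSIX-style absolute paths are assumed):

```python
import os
import sysconfig

def target_site_packages(root=None):
    # "purelib" is the scheme key for where pure-Python packages go,
    # e.g. /usr/lib/python3.X/site-packages
    path = sysconfig.get_path("purelib")
    if root:
        # Emulate a --root flag by re-rooting the absolute path
        # under the staging directory.
        path = os.path.join(root, path.lstrip(os.sep))
    return path
```

With `root="/staging"`, a path like `/usr/lib/python3.X/site-packages` becomes `/staging/usr/lib/python3.X/site-packages`.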
The commands to do this in separate build & install steps already exist 😉 :
# Build
rm -rf dist
python build_dists.py
# Install
unzip dist/*.whl -d /path/to/whatever/site-packages
The only real difference in this new script is that it skips building the sdist. The Python zipfile module also has a CLI if that's easier than using the unzip command.
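For reference, the zipfile CLI mentioned above is invoked as `python -m zipfile -e <archive> <target-dir>`; a small sketch of driving it programmatically (the helper name is made up):

```python
import subprocess
import sys

def extract_with_zipfile_cli(whl_path, dest):
    # Equivalent to: unzip WHL -d DEST, but using only the
    # Python stdlib's zipfile command-line interface.
    subprocess.run(
        [sys.executable, "-m", "zipfile", "-e", whl_path, dest],
        check=True,
    )
```

This avoids a dependency on the external unzip binary, which may not be present in a minimal bootstrap environment.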
OK, fair enough. I won't go anywhere with it unless we find there is a use case where it helps.

For a package like this, you could just have the 'build' step be a no-op, and run this in the 'install' step. There's really no 'build' occurring in the sense that compiled languages, or even JavaScript packers, mean. The source code (plus some metadata) is what gets installed.
This isn't entirely separate, though. It's using existing methods to create a wheel of flit_core, and then just unpacking it to the specified
This one by itself wouldn't, I think, but having a similar script in tomli would. It would be an implementation of the mechanism I suggested on #462: using tomli from source to run
I don't find the Makefile syntax super easy to read, but I think you're using
I think the override there might be to not use sysconfig. As I understand it,
I could easily refactor this to cover more use cases like ours I think. Would that be something upstream-able?
I mean there's 2 stages that I'm seeing here that would be trivial to have flags that split them out.
Hmm, how is pip itself bootstrapped, is that mechanism reusable?
This just seems unnecessarily complex and brittle, IMO we really should just vendor
Yeah...
Ugh, there really should be some standardized way for bootstrapping build tools from source...dealing with these deep recursive dependency trees is very difficult to do in a clean/reliable way otherwise.
Looks like it's the
The sysconfig overrides, I think, are mostly for C extensions. We're using them to trick the various build systems into building C extensions for the target Python; is there a better way to get them to do that? Python tends to be one of the most difficult languages to cross-compile in general; we carry over 30 downstream patches as well.
My thinking was that this is just a tiny script which only a few people will need to use, so I wanted to keep it as simple as possible, with really minimal options. 🤷 If you want the two steps separated... the first one is already there,
Normally, from a wheel: either bundled with a Python installation (ensurepip), or bundled with a script you download (get-pip). Systems that build everything from source probably at present rely on
If you're happy with vendoring, then https://github.com/FFY00/python-bootstrap is essentially that -
Sorry about this. It's kind of a new world for us - until recently, bootstrapping build tools just meant installing setuptools. We're still figuring out how to deal with packaging infrastructure spread across multiple packages.
Not that I know of, and I suspect any better ways will be build-system (backend) specific. Flit can't build C extensions anyway, so no need for trickery here. But as you're separating build and install steps,
I think most people who would need it are going to be running somewhat non-standard setups, and will need extra options rather than minimal ones.
Stuff like the
By vendoring I mean namespace-isolating the package, like pip does, so that the imports don't conflict with the normal
I'm curious which ecosystems are like this for build tools? Are they not allowing packages like
Yeah, I just don't understand why everyone is being so quick to drop backwards compatibility (especially for build tools that need to be bootstrapped) with this being the case. Usually when backwards compatibility is dropped there is a very clear migration path that doesn't involve multiple recursive dependencies or manually unzipping archives into installation folders. I would say the best option would probably be to write a bootstrap

Using existing legacy infrastructure to bootstrap new infrastructure is how we build our toolchains in buildroot, but we kind of depend on there being a clear dependency tree here (our infrastructure is really not great at handling recursive dependencies), which doesn't seem to exist yet for pep517.
Yeah, we have plenty of hacks to work around these sorts of cross-compilation issues, but we at least try to make them reliable/maintainable.
I think the features are mostly there but they are not exposed all that cleanly yet.
Picking up from #462: the way that Python packaging is modelled nowadays is to use wheels as an intermediary for distributing/sharing an installable artifact. It's roughly equivalent to a binary RPM or binary .deb file, if you're familiar with those.
Yes, and that approach is not compatible with the policies of many Linux distributions, because it means that you have a second copy of the same software. They don't want multiple copies of the same software (it makes things like ensuring that the right security patches have been applied difficult). Debian is a reasonably easy-to-reach example. https://pip.pypa.io/en/stable/development/vendoring-policy/ has most of pip's rationale for vendoring stuff, most of which doesn't really apply to flit IMO.
I think it is. And in the spirit of making things consistent, how do you feel about having it be

@jameshilliard I guess a question for you: would it be sufficient if the various projects provided a zip file that you could just unpack in the site-packages directory and have a working installation of those packages?
Again, there's nothing this needs
Linux distros are the ones I'm familiar with - Debian is especially set against vendoring, but I think they all prefer not to. They allow pip, but patch it to un-bundle its dependencies. Pip's maintainers recommend against this, but still describe a semi-supported way to achieve it.
We've been working towards this for several years, I'm afraid. From my perspective, some downstreams are only going to think about it when the old way stops working, so keeping
EDIT: I realised 'simple' might be misunderstood. It might be simple to do, but it's complex in that it involves many more layers of code.
From my understanding, distributions typically either use git or sdists, not wheels, for building RPMs or .debs. For buildroot we do not build binary packages in the traditional sense, as we are a source-only distro without a traditional package manager. However, we pretty much bootstrap all our tools from source (sdists in the case of Python packages) and try to avoid downstream patches/hacks as much as feasible.
Are distros actually de-vendoring
Well they don't seem to devendor
Seems to all apply to some degree from my reading of it, especially the bootstrapping/fragility section since
Current policy for buildroot is that we always build from sdists and never download wheels from PyPI. We have separate build and install stages in our infrastructure, and we track info in the sdists, such as the LICENSE files, which we don't want to install of course. So this is not really what we'd be looking for, and it does not appear to simplify anything from what I can tell, as we could already do this sort of thing manually (it's just not particularly maintainable).
Debian devendor everything, as far as I can see (debundle.patch). Though the version of pip they're packaging is too old to have tomli (it has
I'm not saying it's a hard requirement to make an install possible, just that it would probably improve maintainability/consistency with existing tooling.
Yeah, if that's the case, the path/name/semantics of the bootstrap file should probably be standardized at least, so that we can share bootstrapping infrastructure across build tools. It should maybe be maintained as a separate project that other build tool projects vendor for their releases.
The reason would be in case a distro is depending on some option/feature in setuptools/distutils that the other fallback might not have.
I mean, it's just a simple way to provide multiple migration paths that's known to work reasonably reliably. It's a fairly common pattern for
I mean, avoiding flag-day migrations makes that a lot nicer for downstream distros. Trying to migrate a whole dependency tree simultaneously is rather non-trivial (especially when recursive dependencies are involved); being able to do it incrementally makes testing a lot easier. I'm not saying we shouldn't migrate, just that we could really use a more solid path to do so. I tend to take the view that if you want users to update something, you should avoid breaking backwards compatibility unless absolutely necessary.
Sure...more code is involved but I don't think it would be all that likely to break.
Yeah, they have slower update cycles than most, I guess. I think they are able to avoid this recursive dependency issue since they have binary packages while we don't.
I did an experimental implementation of
Right, that's a very weird way of saying no. :) Anyway, that's basically the policy that I'm pushing back on. I think that you might not want to be doing that for pure-Python projects, especially when bootstrapping. If you trust that source distributions produce something that can be installed, then you can also trust that the published wheels do the right thing. It's usually literally the same file content in the archives, only structured in a manner that it can be installed directly.
Ah, interesting. TIL.
Well, then you and I see things differently. :) The only thing that applies is that vendoring can simplify bootstrapping. And I don't think that's a strong-enough rationale to do this. That is precisely the anti-pattern that I want to avoid here. Regarding fragility, the main point there is "Obviously, when pip can’t run, you can’t use pip to fix pip, so you’re left having to manually resolve dependencies and installing them by hand." which does not apply to flit -- flit is not used to manage packages, only to generate distribution artifacts from them.
Thanks for doing this, and I'm happy to hear that you found it straightforward to do. It is indeed a very maintainable approach upstream (that's literally why I wrote the tooling you've used for doing this). However, the concern is not whether vendoring tomli is something that we can do technically or whether it'd be maintainable (I already knew the answer is yes) -- but rather that we don't think flit should be vendoring anything. Further, the
I didn't create that policy but it doesn't sound like something that we'd want to change. :)
I mean, at a minimum it's going to break our license-tracking tooling to use wheels, I think. But from a maintenance point of view it's inherently dangerous to use wheels for source-based distros, in case one were to accidentally pull in a prebuilt C extension.
I mean, flit is required to install flit-based sdists, which makes it effectively involved in local package management at a minimum in a number of cases. The main issue is that if unvendored
I'm not going to try to reply to everything, because this discussion is sucking up way too much time. But I'll note that I don't see unzipping wheels for a couple of basic tools as a hack, or particularly difficult to maintain. The basic structure of wheels ensures this works, and that's a spec that hasn't changed significantly since it was written 8 years ago. What can we do - besides using setuptools or vendoring packages - to improve matters?

Build: Would it help if

Install: Would it help if flit_core & tomli had an
It's an issue for us since we don't support (by policy) wheels downloaded from PyPI.
I'm not really sure this would help. Maybe it helps if I explain a little more how our build process works (which is a bit non-standard, to say the least), so that it's clearer what the problem is, why circular dependencies are so problematic for us, and what potential solutions may be viable. For us, each separate package build (either for host or target, which are essentially independent builds) is built in isolation first; its entire output tree (the entire host toolchain and target rootfs it modifies) is then rsynced into the isolated build trees of any packages that depend on it (each package essentially has its own copy of a host toolchain and target rootfs tree, based on the combined trees of its dependencies). This process is inherently one-way.
Essentially you can think of our dependency resolution/build process as a directed acyclic graph, where depending on any package automatically pulls in all the dependencies of that package, but a package can never depend on, or even access anything from, a reverse dependency. This is why the suggestion that we use
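To make that model concrete, here is a toy sketch of DAG-based build ordering using the stdlib's graphlib (the package names and the bootstrap ordering are invented for illustration; the real buildroot machinery is obviously far more involved):

```python
from graphlib import TopologicalSorter

# Each package maps to the packages whose output trees must be
# merged into its isolated build tree before it can build.
deps = {
    "host-flit_core": [],
    "host-tomli": ["host-flit_core"],       # hypothetical bootstrap order
    "host-installer": ["host-flit_core"],
    "host-some-package": ["host-tomli", "host-installer"],
}

def build_order(graph):
    # static_order() yields each package only after all of its
    # dependencies; it raises CycleError on a recursive dependency,
    # which is exactly the case this kind of infrastructure can't handle.
    return list(TopologicalSorter(graph).static_order())
```

A cycle anywhere in the graph makes the whole build unschedulable, which is why a clear, acyclic dependency tree for the bootstrap tools matters so much here.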
Not sure, I'm not seeing how the
Can't flit_core and tomli just run |
Doesn't seem like a great option since it's kinda noisy in the commit history.
That means custom handling for each package
It's no more noisy than bumping the version.
It means custom handling for exactly two packages, and that's only for bootstrapping. You can give them internal identifiers if you want and rebuild them using PEP 517 tooling.
I'm seeing at least 7 packages that need to be installed for the minimal
If this is helpful to repackagers, a PR to add a similar script to Tomli is welcome! It would be great if @takluyver either made the PR or reviewed it, however, as this isn't my area of expertise.
@takluyver is this now obsolete?
AFAIU this was superseded by #511.
Yes, I'm kind of angling whether this should be closed.
Yup, superseded by #511 - thanks everyone.
This builds a wheel and unzips it to site-packages, which is my suggestion for how downstream packagers bootstrap packaging tools. This isn't the only possible approach: https://github.com/FFY00/python-bootstrap shows how to have a set of basic tools all importable together for the bootstrapping phase.

This is deliberately lacking install features like removing a previous version, or atomic installation. It's meant to be run in a context where you know it's not already installed, and where a failure means the resulting state will be discarded anyway.
@jameshilliard, does this look like it would help you, following our discussion in #462?
If this approach is useful, it might be reasonable to add similar scripts for a few other pieces of low-level packaging infrastructure - in particular tomli and installer (cc @hukkin, @pradyunsg). This is just a starting point for discussion.