Support for custom build logic in Cargo #1861

Closed
thoughtpolice opened this issue Feb 17, 2012 · 8 comments · May be fixed by devcode1981/rust#6 or MarcelRaschke/rust#6
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@thoughtpolice
Contributor

The metaticket is #1850.

Summary: invariably, whether to keep packages self-contained, write wrappers for C++ code, invoke configuration-based build logic (e.g. I want to enable some transparent compression, provided zlib is on the system), or any number of other things, there is often going to be a need for a more complicated build system than what the crate mechanism provides. In my experience, libraries which do things like foreign bindings are extremely prone to needing this, and requiring arduous - sometimes totally manual - installation on the user's part is incredibly unsatisfactory.

As of right now, Cargo only manually looks for crates in the source directory, followed by hitting them with rustc. I envision a world where, most of the time, this is all that's needed. Cargo should not be responsible for complex code building logic.

However, it most certainly needs support for allowing library creators to defer to a build system to run if the simple case doesn't cut it. The compression example is a real one - in writing Haskell bindings for LevelDB, I found it very nice that I could use autoconf to detect if the snappy compression library was available, and compile the LevelDB source code inline with that assumption if it was.

I like the Homebrew approach: a package can use any build system it wants, but you express the build in Ruby as a 'formula', using system commands. So you can have cmake-based software, autoconf-based software, just something with a makefile, etc. The formula merely invokes system commands and fails if they err. One thing to note here is that Homebrew does expect you to respect the installation prefix normally, so it gives Ruby scripts a 'prefix' variable you can pass on, etc.

I envision something similar for Cargo - just a way to run commands for a build. A nice aspect of this is that Cargo still doesn't really do any building on its own; you just give it commands. If your actual package build is handled in an autoconf-and-makefile style, you can always just build like that during development. The metadata is only needed to inform Cargo how to run a build at installation time.

With the running LevelDB example, let's say I have a package, just called leveldb. It has a file structure like this:

./configure
./Makefile
./leveldb.rc
./leveldb.rs
./build.cargo

Name is arbitrary, but build.cargo contains a list of commands to run - it's basically the trigger for custom configuration logic. In fact, you could just say it's any arbitrary executable script, e.g:

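#!/bin/sh
# Stop at the first failing command so a broken build is never installed.
set -e
# Install into the prefix Cargo passes in the (hypothetical) CARGO_PREFIX variable.
./configure --prefix="$CARGO_PREFIX"
make
make install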

and make would take care of calling gcc and rustc on the crate, etc. This example in particular is kind of half-assed, but you get the gist. Ultimately, I think Cargo should always defer the responsibility of building onto a build system, and not say "you must have a configure script" or "must use a makefile." build.cargo is just some script - you could write it in Perl or ruby/etc. But there are downsides to this approach:

  • Per #1768 (Use msvc on windows, not mingw), rust should eventually work with msvc, and robustly on windows, so people will need a way to specify builds for different platforms, particularly windows. I have no idea what this looks like.
  • It adds the caveat that all libraries must respect installation prefixes, especially since cargo is local by default.

This does allow even the simplest of build systems to work, though. As an aside, this, along with support for something like #612 would most certainly make it very possible to bind a lot of packages, especially when you want autoconf-style feature detection at configuration time.
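
To make the proposal concrete, here is a minimal sketch of what "Cargo runs build.cargo and fails if it errs" could look like. It is written in present-day Rust rather than the 2012 dialect, and it is an illustration only, not how Cargo works. The function name `run_build_script` is hypothetical, and `CARGO_PREFIX` is the hypothetical variable from the example script above.

```rust
use std::path::Path;
use std::process::Command;

/// Run `<package_dir>/build.cargo`, handing it the install prefix through the
/// environment. Any failure to start the script, or a non-zero exit status,
/// is reported as an error so the installation can be aborted.
pub fn run_build_script(package_dir: &Path, prefix: &Path) -> Result<(), String> {
    // Use an absolute path so the script location does not depend on how the
    // platform resolves relative program names against the working directory.
    let dir = package_dir
        .canonicalize()
        .map_err(|e| format!("cannot open {}: {}", package_dir.display(), e))?;
    let script = dir.join("build.cargo");

    let status = Command::new(&script)
        .current_dir(&dir)
        // Hypothetical variable name, taken from the shell example above.
        .env("CARGO_PREFIX", prefix)
        .status()
        .map_err(|e| format!("could not start {}: {}", script.display(), e))?;

    if status.success() {
        Ok(())
    } else {
        Err(format!("{} failed ({})", script.display(), status))
    }
}
```

Nothing here knows or cares whether the script calls autoconf, make, or something else; the only contract is the prefix variable and the exit status.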

So I am personally a fan of this method or something like it. Does anybody have objections to something like this? Thoughts? I'm particularly hazy on saying something as flat out as "build.cargo is executable, just do it", but it is simple and flexible.

@brson
Contributor

brson commented Feb 18, 2012

My impression is that for compiling rust code we kind of want to avoid having cargo delegate to other build systems (cargo should just build the crates). For complex non-rust components I suppose something like this is inevitable. Do you need this much flexibility currently for the leveldb bindings?

@thoughtpolice
Contributor Author

Most of the source code could probably be put into crate attributes with #[cfg] and the hypothetical #[external] a la #1862. The required components are mostly just that I can pass correct options to the C compiler, depending on the platform. That's all doable providing #1862 is implemented, and I could get bindings coming along quickly.

The main thing I would like to support, that this scheme allows, is configuration based logic before the build. LevelDB has optional snappy support for compressing blocks that hit the disk; I would like to be able to detect a snappy.h and libsnappy, and, provided they exist, compile LevelDB with snappy support, and add a link_args attr to the crate, saying "also link with libsnappy when building."

Something like this can easily be supported with the above framework, but it does need a feature like #612 as well (because the configure logic needs to give some extra crate attrs.) It's not required, but it's nice to enable; especially as compression can regularly offset the I/O required for disk writes/reads enough to make a noticeable improvement, in my experience.
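
As a rough sketch of the configuration step described above, the following present-day Rust looks for snappy.h and a libsnappy library in caller-supplied directories and, if both are found, produces the extra compiler flags and link arguments. The names `find_in_dirs`, `detect_snappy` and `SnappyConfig` are made up for illustration, the `-DSNAPPY` define is an assumption about how the LevelDB sources would be switched, and how the link arguments would be fed back into crate attributes (the #612 part) is left open.

```rust
use std::path::PathBuf;

/// Return the first directory in `dirs` that contains a file named `file`.
pub fn find_in_dirs(dirs: &[PathBuf], file: &str) -> Option<PathBuf> {
    dirs.iter().find(|d| d.join(file).is_file()).cloned()
}

/// Extra flags to use when snappy is available.
pub struct SnappyConfig {
    /// Flags for the C++ compiler that builds the LevelDB sources.
    pub cflags: Vec<String>,
    /// What a `link_args` crate attribute would need to carry.
    pub link_args: Vec<String>,
}

/// Enable snappy only when both the header and a library file are present.
pub fn detect_snappy(include_dirs: &[PathBuf], lib_dirs: &[PathBuf]) -> Option<SnappyConfig> {
    let inc = find_in_dirs(include_dirs, "snappy.h")?;
    // Try the usual shared and static library file names in turn.
    let lib = ["libsnappy.so", "libsnappy.dylib", "libsnappy.a"]
        .iter()
        .find_map(|name| find_in_dirs(lib_dirs, name))?;
    Some(SnappyConfig {
        // -DSNAPPY is an assumed define name, used here for illustration.
        cflags: vec![format!("-I{}", inc.display()), "-DSNAPPY".to_string()],
        link_args: vec![format!("-L{}", lib.display()), "-lsnappy".to_string()],
    })
}
```

A caller would pass something like `/usr/include` and `/usr/lib`; when `detect_snappy` returns `None`, the build simply proceeds without compression support.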

Perhaps the scope here can be narrowed, to only encompass configuration logic for Cargo? I will say something like this is rather inevitable IMO, when you want to integrate a more complicated build scheme into Cargo packages - but of course an issue like this is also an issue of cost analysis - it's potentially not worth adding all this if we only have 1 or 2 packages (mine) using part of this infrastructure for like, the next year or however long. And when others come along, maybe it still won't be sufficient for them, or it'll just be terrible. ;)

I'm more than willing to narrow down the scope here. An issue like this brings into question a lot of cargo's responsibilities/expected workflow, so please - tell me what you think. :)

@brson
Contributor

brson commented Feb 22, 2012

I do think we should add the minimum necessary to support current use cases, particularly because I just don't know much about how to properly design a package manager, nor build system (and I'm not sure exactly which cargo is). Maybe we can investigate what npm, cabal etc. do to support foreign code. I completely agree that something is necessary here. I'm a little scared of screwing this up, but I guess we can afford to make some mistakes here.

We need to get graydon's buy-in before expanding cargo in this direction. I believe he has expressed strong opinions about not executing arbitrary build scripts before.

I realize this is extremely non-committal and I'm sorry.

@brson
Contributor

brson commented Feb 22, 2012

I'm ok trying the design you suggest with cargo delegating to a script. There's no reason to hold this up if you need the feature now.

@graydon
Contributor

graydon commented Feb 22, 2012

Hi,

I'm going to ask for a little bit of care and patience finding the right balance on this issue. Not to say that you shouldn't experiment -- please do! -- but I think that finding a "right" point that satisfies the various conflicting tensions in this design space is going to take a few iterations.

With that in mind, my general preferences are:

  • Declarative over procedural. That is, #[cfg] and such as much as we can, until it is really getting silly.
  • Searching for header files is not silly. It's a very sensible thing to do.
  • Querying pkgconfig or its ilk is not silly. It's built for this sort of task.
  • Passing -D defines through to compilers, likewise.
  • In both cases, the more declarative (the less you're just passing opaque strings to shells) the more likely we can adapt to multiple toolchains.
  • A good criterion for "declarative enough" is "all cargo packages build on all 3 tier-1 operating systems, and those that do not, only because some dependent library in question isn't really available on that platform, and we'd have to be teaching cargo to build the entire dependency as well". If we have a bunch of packages where the library requirements can all be met, but cargo fails to build a rust package on that platform (especially windows) due to "bad configuration logic", we're not declarative enough.
  • A definite criterion for "silly" when considering how far to stretch the declarative system is "turing-complete". That is, if a "declarative" scheme provides a way of expressing non-terminating loops or defining callable functions, it's gone too far and should have just bottomed out in a trap door to real procedural code.
  • Note: that trap door should, itself, probably be a rust file, not "some script we try to run via shebang".
  • There is a reason for this!
  • The common denominator for "invoking a command" is actually a lot less useful than you imagine. The set of commands in common between a stock win32 development system (with powershell and visual studio, but not including msys since we want to eventually shed that dependency) and a stock linux system is nearly empty. That's important. It means "invoking a script we ship" will cause the package manager to force msys on everyone. And probably python as well, because seriously who wants to write autoconf in 2012? Now your build-deps have gone up a lot.
  • If you consider the analogy with homebrew further, you'll see what's going on here. Homebrew knows it's on OSX and has ruby, so "invoke a command" will (a) definitely work and (b) probably has access to a ton of useful tools for doing builds already. The only analogous thing we know we're going to have kicking around is a rust compiler with the rust standard library. We start from there. So don't shell out to shell; shell out to "compile this bit of procedural rust logic and run it". Best part, it can call into the same stdlib for sniffing around its environment that cargo was calling into. It's very likely that the user can express "a build process that goes beyond declarative" in plain rust-that-uses-stdlib code, and do so portably. If the user is just desperate for some shell commands, they can put in a call to std::run_command and in so doing make themselves a whole lot less portable. But I don't want it to be the first hammer anyone reaches for. It's a hammer that's got "nonportable extra build dependencies" written on it in fine print.
  • About that fine print: as was pointed out on the mailing list the other day, we're "reinventing" a bunch of pieces that other building and package-management systems already have. We don't want to reinvent too much. We will need, at some point, to do version-range calculation and dependency sorting, both between cargo packages and packages-outside-cargo. But: if you find yourself re-expressing, in cargo-ese, the configuration and build logic for some C package that already has its own configuration and build logic, you're probably making a mistake and have gone too far. Not absolutely certain -- some C code will sometimes call for packaging by copying-and-integrating into a cargo-managed tree -- but that's the boundary where you should be thinking carefully about whether to shell out to apt or yum to get a prebuilt C library in binary form, and/or fail with "sorry, I don't know how to retrieve libfoo on this platform".
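
As an illustration of the points above about querying pkgconfig and failing cleanly when a library cannot be found, here is a small sketch in present-day Rust. It shells out to the real `pkg-config` tool with its `--cflags` and `--libs` options; the function names `pkg_config_flags` and `split_flags` are hypothetical, and taking the tool name as a parameter is just a convenience for platforms where it is installed under another name. Note that this is exactly the kind of "trap door" step that costs portability, since it assumes pkg-config exists on the host.

```rust
use std::process::Command;

/// Split pkg-config output into individual flags. This is a naive split on
/// whitespace and does not handle escaped spaces inside paths.
pub fn split_flags(text: &str) -> Vec<String> {
    text.split_whitespace().map(str::to_string).collect()
}

/// Ask pkg-config for the compiler and linker flags of `library`.
pub fn pkg_config_flags(pkg_config: &str, library: &str) -> Result<Vec<String>, String> {
    let output = Command::new(pkg_config)
        .args(&["--cflags", "--libs", library])
        .output()
        .map_err(|e| format!("sorry, could not run {}: {}", pkg_config, e))?;

    if !output.status.success() {
        // pkg-config exits non-zero when it does not know the library.
        return Err(format!(
            "sorry, I don't know how to retrieve {} on this platform",
            library
        ));
    }
    Ok(split_flags(&String::from_utf8_lossy(&output.stdout)))
}
```

Calling `pkg_config_flags("pkg-config", "zlib")` on a system with zlib's development files installed would return that library's flags; on a system without pkg-config it returns the error instead of guessing.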

You are probably going to block, here, on conditionalizing attribute processing, supporting some kind of read/write or read/rebind access to the dictionary of configuration-time variables, as well as adding some fresh attributes for new tasks. You are not the only one. I am somewhat blocked on it too and @DylanLukes is poking around at it as well. I'd welcome coherent work in this space. We need a "grammar" relating conditionals, variables, and attributes. There's a sketch @DylanLukes wrote over in https://github.com/mozilla/rust/wiki/Proposal-for-attribute-preprocessing and a bug #1242 that talks about this, and he put up a gist the other day as a result of a conversation we had on IRC, but I'm having trouble tracking it down.
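
To give a feel for what a "grammar" relating conditionals, variables, and attributes might look like, here is one possible sketch in present-day Rust. It is not the design from the wiki proposal or from #1242; the `Cond` type, `eval`, and `active_attrs` are invented for illustration. The condition language has no loops and no functions, so it stays on the declarative side of the "turing-complete" line.

```rust
use std::collections::HashMap;

/// A deliberately tiny condition language over configuration-time variables.
pub enum Cond {
    /// True when variable `key` is set to exactly `value`.
    Eq(String, String),
    /// True when variable `key` is set at all.
    Defined(String),
    Not(Box<Cond>),
    All(Vec<Cond>),
    Any(Vec<Cond>),
}

/// Evaluate a condition against the dictionary of configuration variables.
pub fn eval(cond: &Cond, vars: &HashMap<String, String>) -> bool {
    match cond {
        Cond::Eq(key, value) => vars.get(key) == Some(value),
        Cond::Defined(key) => vars.contains_key(key),
        Cond::Not(inner) => !eval(inner, vars),
        Cond::All(conds) => conds.iter().all(|c| eval(c, vars)),
        Cond::Any(conds) => conds.iter().any(|c| eval(c, vars)),
    }
}

/// Keep only the attributes whose condition holds for `vars`.
pub fn active_attrs<'a>(
    attrs: &'a [(Cond, String)],
    vars: &HashMap<String, String>,
) -> Vec<&'a str> {
    attrs
        .iter()
        .filter(|(cond, _)| eval(cond, vars))
        .map(|(_, attr)| attr.as_str())
        .collect()
}
```

With variables such as `target_os = "linux"` and `have_snappy = "1"` (both names are assumptions), an attribute guarded by `All([Eq("target_os", "linux"), Defined("have_snappy")])` would be kept, while one guarded by `Eq("target_os", "win32")` would be dropped.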

Talk some with him, or catch me on IRC tomorrow, I'd like to see more of this stuff come into focus soon. It's important if I keep saying "use attributes!" for there to be some credible system for expressing more complex logic in attributes.

@DylanLukes

That proposal is very out of date... I'll TRY to have a new one with my new design up today sometime between 11:20 and 12:07.

If not, I should manage by the end of the day.

It'll be easier to discuss with something relatively concrete in place.

@z0w0
Contributor

z0w0 commented Feb 16, 2013

This can probably be closed now.

@catamorphism
Contributor

Closing as obsolete.
