RFC: Promote the libc crate from the nursery #1291

Merged
merged 1 commit into from
Oct 29, 2015


alexcrichton
Member

Move the libc crate into the rust-lang organization after applying changes
such as:

  • Remove the internal organization of the crate in favor of just one flat
    namespace at the top of the crate.
  • Set up a large number of CI builders to verify FFI bindings across many
    platforms in an automatic fashion.
  • Define the scope of libc in terms of bindings it will provide for each
    platform.

@alexcrichton alexcrichton added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Sep 23, 2015
@alexcrichton alexcrichton self-assigned this Sep 23, 2015
@petrochenkov
Contributor

How complete is the new libc supposed to be? In the current libc large parts of the ISO C library are missing, and the same is true for POSIX, IIRC. To some degree it can be described as "If some binding was needed for something in rust-lang/* and friends it was included."
I always had the impression that libstdc, libposix, libwinapi etc. (or libglibc, libvisualc, libmusl etc. if we base everything on implementations and not standards) should be separate libraries aimed at completeness and compliance with the relevant standards (as much as actual C implementations allow it) and libc should not exist, or at best should be a facade for some of these crates (but not libwinapi in particular).

Edit: everything else - flat namespacing, size_t changes, testing, removal of winapi looks good to me.

@BurntSushi
Member

👍 to the flat namespace. I don't think I've ever personally found the existing hierarchy too useful. Whenever I need to use libc, it's never "consult the docs" but rather "grep the source." If this were simplified to a flat namespace at the root, then it seems like "consult the docs" would be viable. :-)

With respect to std::os::*::raw, I will mention that I recently ran into a problem where I wanted to call a Windows function (in libc) with a file handle (from std::os::windows), but the underlying c_void types were different. http://doc.rust-lang.org/libc/src/libc/lib.rs.html#199-202 and http://doc.rust-lang.org/src/std/os/raw.rs.html#51-56 --- So perhaps some kind of re-exporting is a good idea?
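The mismatch described above can be reproduced in miniature. The sketch below is not the real libc or std source: `libc_like` and `std_like` are hypothetical stand-in modules that each define their own `c_void`, which is enough for the compiler to treat pointers to them as unrelated types until a cast is applied.

```rust
// Hypothetical stand-ins: two crates that each declare an opaque `c_void`.
#[allow(non_camel_case_types, dead_code)]
mod libc_like {
    #[repr(u8)]
    pub enum c_void {
        __variant1,
        __variant2,
    }
}

#[allow(non_camel_case_types, dead_code)]
mod std_like {
    #[repr(u8)]
    pub enum c_void {
        __variant1,
        __variant2,
    }
}

// A binding that expects the `libc_like` flavour of the pointer.
fn takes_libc_handle(handle: *mut libc_like::c_void) -> usize {
    handle as usize
}

// A plain pointer cast bridges the two: same address, different nominal type.
fn to_libc(handle: *mut std_like::c_void) -> *mut libc_like::c_void {
    handle as *mut libc_like::c_void
}

fn main() {
    let mut value = 0u8;
    let std_handle = &mut value as *mut u8 as *mut std_like::c_void;
    // takes_libc_handle(std_handle); // rejected: mismatched pointer types
    let address = takes_libc_handle(to_libc(std_handle));
    println!("handle address: {:#x}", address);
}
```

Re-exporting one definition from the other would remove the need for `to_libc`, at the cost of coupling the two crates' versions.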

@nagisa
Member

nagisa commented Sep 23, 2015

First of all, rather than testing the bindings, I believe they should be fully auto-generated (similarly to winapi), possibly at crate build time. The only thing that should receive real commits in such case are lists of headers/libraries to look at/link to, and the generator itself. Provided binding generators are correct (which is what you’d test), no API testing is necessary.

I’m strongly in favour of keeping only CRT on windows-msvc, but windows-gnu should expose gnu part as well.

I’m strongly in favour of flat namespace.

I’d like the crate to be named libc-sys or something, to match the conventions.

@alexcrichton
Member Author

@petrochenkov

I intended to address the "what does this library contain" question in the section about libc's scope, so are you thinking that libraries like librt, libm, libpthread, libc, etc are too broad or not specific enough? I also think that it would be fine to say that for some particular platform a specific standard is followed, but for a platform like OSX there may not be a standard to follow in that situation.


@BurntSushi

So perhaps some kind of re-exporting is a good idea?

One worry I've had about this in the past is compatibility in the backwards and forwards direction. For example, let's say we're using a very new libc with a very old libstd. This means that any new std::os::raw types added recently aren't there to be reexported by libc, and libc would need to realize this to be able to compile.

On the flip side if we're using a very old libc with a very new libstd, it's likely that there's a number of types in libstd which aren't reexported by libc because when it was originally written those types didn't exist.

The only real solution I know of to this is to put libc into the main distribution itself, which would need quite a bit of its own discussion to take that direction. To sidestep these problems entirely and simply require a few casts here and there when interoperating with std, I'm slightly in favor of not reexporting for now that I think through it a bit more.


@nagisa

I believe they should be fully auto-generated (similarly to winapi), possibly at crate build time

I'm currently under the impression that it's not viable to generate any of these bindings at build time because that implies something like libclang being installed which would be a pretty unfortunate dependency of the libc crate! Generating bindings ahead of time and checking them is totally fine, however, and there's certainly nothing here to rule that out. I think that automation and testing is still quite valuable for a number of reasons:

  • Allows organizing code while ensuring that it doesn't break existing platforms.
    • Also helps sanity-check any future refactorings
  • Acts as a sanity check for any generated code
  • Catches problems like #[link_name] which auto-generated code likely wouldn't catch.
  • Most bindings in Rust are not auto-generated today, and having a relatively lightweight framework for testing bindings is quite useful in the broader sense.

Also, there's no real downside to just running more tests! (especially because they already exist)
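As a rough illustration of what checking a hand-written binding can mean (this is not the crate's actual test suite), the sketch below declares C's `div` and a mirror of `div_t` by hand, then verifies both the struct size and a call result. `DivResult`, `checked_div`, and `layout_looks_right` are names made up for this example, and the `quot`-then-`rem` field order is an assumption that holds for common C libraries but is not fixed by the C standard.

```rust
use std::mem::size_of;
use std::os::raw::c_int;

// Hand-written mirror of C's `div_t`; the field order is an assumption.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct DivResult {
    quot: c_int,
    rem: c_int,
}

extern "C" {
    // Hand-written declaration of `div` from <stdlib.h>.
    fn div(numer: c_int, denom: c_int) -> DivResult;
}

// Safe wrapper; the caller must pass a non-zero denominator.
fn checked_div(numer: c_int, denom: c_int) -> DivResult {
    assert!(denom != 0, "denominator must be non-zero");
    unsafe { div(numer, denom) }
}

// Cheap layout check: two `int` fields with no padding between them.
fn layout_looks_right() -> bool {
    size_of::<DivResult>() == 2 * size_of::<c_int>()
}

fn main() {
    assert!(layout_looks_right());
    let result = checked_div(7, 2);
    // A wrong field order or signature would show up as a wrong answer here.
    assert_eq!(result.quot * 2 + result.rem, 7);
    println!("{:?}", result);
}
```

A check like this only covers what it calls; comparing against values extracted from the C headers themselves would give broader coverage.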

I’d like the crate to be named libc-sys or something, to match the conventions.

I debated for a bit about perhaps renaming this crate, but it's so widely used and ingrained that I think that ship has sailed at this point. It's also not clear that it would actually follow the existing conventions because foo-sys indicates that it links to and provides the bindings for the native library "foo", but in this case the exact set of native libraries linked to is platform-dependent, so there's not necessarily "one true name" for this crate.

@eefriedman
Contributor

For the -pc-windows-msvc targets, exposing "C runtime" functions has weird implications:

  • Suppose a future version of the Rust standard library doesn't depend on any C runtime on Windows. If the user doesn't want a C runtime at all, how are they supposed to get definitions for types like wchar_t?
  • Assuming Rust is dependent on a particular version of the C runtime, exposing functions like malloc is a trap because other DLLs are likely to have access to a different version of malloc which isn't compatible. On the other hand, if you're statically linking against code written in C, you might end up needing whatever malloc happens to be linked in.
  • The current proposed version of libc exposes functions which take paths as *const c_char... but you never want to use those functions on Windows because of encoding problems.
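To make the encoding concern in the last bullet concrete, here is a small sketch of the two string forms involved. It assumes the usual Windows arrangement in which narrow `*const c_char` functions interpret bytes in the process's active code page while wide-character APIs take NUL-terminated UTF-16; `to_narrow` and `to_wide` are hypothetical helper names, and no Windows function is actually called.

```rust
use std::ffi::CString;

// UTF-8 bytes plus a trailing NUL, as a `*const c_char` function would receive.
fn to_narrow(path: &str) -> CString {
    CString::new(path).expect("path must not contain an interior NUL")
}

// UTF-16 code units plus a trailing NUL, as a wide-character API would receive.
fn to_wide(path: &str) -> Vec<u16> {
    path.encode_utf16().chain(std::iter::once(0)).collect()
}

fn main() {
    let path = "caf\u{e9}.txt";
    let narrow = to_narrow(path);
    let wide = to_wide(path);
    // The non-ASCII character is two bytes in UTF-8 but one UTF-16 unit.
    println!("narrow bytes: {:?}", narrow.as_bytes());
    println!("wide units:   {:?}", wide);
}
```

If the narrow bytes are read as a legacy code page rather than UTF-8, the two-byte sequence for the accented character decodes to the wrong characters, which is the failure mode being pointed out.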

There's probably an argument to be made that libc shouldn't expose any functions at all on Windows. (The crate wouldn't be completely empty because it would contain some types.)


It might be a good idea to explicitly exclude pure math functions from the scope of libc (sin(), abs(), etc.). They aren't necessary, and we'd end up either reimplementing a lot of them in Rust or randomly excluding them on various platforms.

@alexcrichton
Member Author

@eefriedman

The C runtime is so ubiquitous and difficult to change that I would expect any alteration in how it's linked to be a modification to the target triple entirely. I certainly do not want to promise that the standard library will link to a C runtime on all platforms forever; rather, I want there to always be a possibility for the standard library to be built completely independently of libc (e.g. on Unix) or the CRT (on Windows).

That being said, the purpose of this library will just be to say "I'd like a C runtime linked in, please". Although the standard library may not link a C runtime, that'd be the purpose of this crate. In that sense, to answer your points:

If the user doesn't want a C runtime at all, how are they supposed to get definitions for types like wchar_t?

I'm a little confused by this in the sense of if you don't have a C runtime, why would you want wchar_t? That would imply some form of interoperation with FFI code perhaps, which then in turn means you need a C runtime (which seems at conflict with the original desire?).

I would expect, however, that libc would represent "The CRT" on Windows so if you are compiling for a target on Windows that means "std doesn't link to the CRT" it just means that by linking libc you'll be linking in the standard CRT.

Assuming Rust is dependent on a particular version of the C runtime, exposing functions like malloc is a trap because other DLLs are likely to have access to a different version of malloc which isn't compatible. On the other hand, if you're statically linking against code written in C, you might end up needing whatever malloc happens to be linked in.

I'm not sure I understand what the concern is here. If this is a problem, how is it supposed to work in the ideal case? Surely we have to always be able to link to "a CRT" as well as external C code?

The current proposed version of libc exposes functions which take paths as *const c_char... but you never want to use those functions on Windows because of encoding problems.

I disagree that you never want these functions because I can imagine a niche use case where you're not dealing with many Rust types but instead more C types, so it may be easier to call these functions in that case. Regardless, though, this library represents an exact binding to the platform in question, not necessarily an opinionated version of "here's what we think you should call". For example I wouldn't reject a PR to add gets to the Unix bindings.

There's probably an argument to be made that libc shouldn't expose any functions at all on Windows. (The crate wouldn't be completely empty because it would contain some types.)

While I agree that the functions are likely to be rarely used, I don't think we should necessarily actively remove them just because we don't think they should be there. I would expect a use case to eventually arise in one form or another and it's nice to have the bindings already available!

It might be a good idea to explicitly exclude pure math functions from the scope of libc (sin(), abs(), etc.). They aren't necessary, and we'd end up either reimplementing a lot of them in Rust or randomly excluding them on various platforms.

Could you elaborate a bit on this? While I agree that they may not be necessary (because the standard library provides them) I don't see how reimplementing them in Rust would affect this. By linking to libc and using the functions you're getting a guarantee that you're using the libm variants of the functions (or whatever the symbol is in the process image) which may be desired if you want to bypass any custom Rust implementation we may have.

@eefriedman
Contributor

I would expect, however, that libc would represent "The CRT" on Windows so if you are compiling for a target on Windows that means "std doesn't link to the CRT" it just means that by linking libc you'll be linking in the standard CRT.

There is no "standard CRT" on Windows. Every version of Visual Studio ships with a different, binary-incompatible version of the CRT.

I'm a little confused by this in the sense of if you don't have a C runtime, why would you want wchar_t? That would imply some form of interoperation with FFI code perhaps, which then in turn means you need a C runtime (which seems at conflict with the original desire?).

You don't necessarily need a C runtime to do FFI on Windows. Some Windows APIs are defined to take wchar_t or a typedef of it (the winapi crate has a definition), but they're independent of any C runtime.

I'm not sure I understand what the concern is here. If this is a problem, how is it supposed to work in the ideal case?

Ideally, APIs which allocate memory have a companion API to free it. http://blogs.msdn.com/b/oldnewthing/archive/2006/09/15/755966.aspx is a more in-depth description of the issue.
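The "companion API" pattern can be sketched from the Rust side as follows. This is an illustration only: `Buffer`, `buffer_new`, `buffer_len`, and `buffer_free` are hypothetical names, and a real library would also export them with stable symbol names.

```rust
// An opaque type whose memory is owned by this library's allocator.
pub struct Buffer {
    data: Vec<u8>,
}

// Allocate inside the library and hand out a raw pointer.
pub extern "C" fn buffer_new(len: usize) -> *mut Buffer {
    Box::into_raw(Box::new(Buffer { data: vec![0; len] }))
}

// Read-only accessor; `ptr` must come from `buffer_new` and still be live.
pub unsafe extern "C" fn buffer_len(ptr: *const Buffer) -> usize {
    unsafe { (*ptr).data.len() }
}

// Companion free: releases the memory with the same allocator that made it,
// so callers never need to guess which `free` matches.
pub unsafe extern "C" fn buffer_free(ptr: *mut Buffer) {
    if !ptr.is_null() {
        unsafe { drop(Box::from_raw(ptr)) };
    }
}

fn main() {
    let buffer = buffer_new(16);
    let len = unsafe { buffer_len(buffer) };
    println!("allocated {} bytes", len);
    unsafe { buffer_free(buffer) };
}
```

Because allocation and release both happen inside the same module, it does not matter which CRT or allocator the caller was linked against.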

@eefriedman
Contributor

By linking to libc and using the functions you're getting a guarantee that you're using the libm variants of the functions (or whatever the symbol is in the process image) which may be desired if you want to bypass any custom Rust implementation we may have.

This becomes a bit problematic when, for example, abs() on Android or sinf() on Windows is defined in a header, so there is no symbol to link against, so we would have to rewrite the definition in Rust to expose "libc" in the same way it would be visible to C code. There isn't any fundamental reason we can't do that, but it seems like a lot of useless work.

@petrochenkov
Contributor

@alexcrichton

I intended to address the "what does this library contain" question in the section about libc's scope, so are you thinking that libraries like librt, libm, libpthread, libc, etc are too broad or not specific enough?

Not specific enough. When the scope section says "libc, libm, librt, libdl, and libpthread" I assume glibc is implied and the scope is based on implementations and not standards. At the same time Rust libc contains only a tiny part of what glibc provides. Is better (or even complete) coverage of glibc promised or encouraged? Will, for example, PRs adding wide string or locale stuff be accepted or worked on by rust-lang members? Will C99/C11 bindings/types be provided or accepted? Will glibc(musl/VC)-only functions/types/etc. be present in the library? I.e. what is the purpose of the library: to gather some popular stuff from popular implementations, or to provide a more or less complete C layer?

@alexcrichton
Member Author

@eefriedman

In general I'm not understanding your comments with respect to the design of this library, it sounds like you're talking about general "things to worry about" with MSVC which have not a lot of relevance to providing bindings to the CRT?

For example, I don't understand how binary incompatibility of the CRT comes into play here. Surely the signatures of functions like memcmp aren't changing. If Rust is binding a function whose signature changes, that's of course bad, but other than that is there a failure mode that you're concerned about?

I'm also not sure why you're concerned about getting type definitions without a CRT? That sounds like the job of winapi rather than libc. If you're just dealing with Windows functions then you're talking to the CRT, and if you're just using Windows API functions then you can stay within those crates.

And finally, I don't understand how malloc/free are relevant here. It's of course a problem if you malloc in one DLL and then free in another with a different free function than intended, but how is that related to the design of libc? Are you worried specifically about how the CRT is linked in?

It'd be helpful to me to articulate exact failure scenarios you're worried about, and keep in mind that this library isn't going to "just solve all your problems", it's just declarations to functions found on the platform in question!

This becomes a bit problematic when, for example, abs() on Android or sinf() on Windows is defined in a header

This is only really problematic if you want the library to be "cross platform" in the sense that it's always conforming to some standard or another, but the RFC explicitly states that this library is not cross platform and it's quite specific to the platform in question. In that sense there's no problem here. We could provide the inline functions ourselves as well for these platforms.

I also don't understand why you say that writing the inline definition in Rust is "useless work"? What are you trying to reason towards? Not exposing these symbols? Not binding Android at all? We don't necessarily have to seek out and implement all of these definitions, the RFC is just defining the scope of libc to include them so if added it'd be an acceptable change.


@petrochenkov

Currently when this RFC says that group of libraries on Linux, it really means it. There's no implication of glibc or of any particular standard, it's simply whatever's to be found in those libraries on all Linux distributions can be included in this library. We'd certainly accept PRs for wide strings, locales, new types, etc, so long as they're present on Linux in these libraries across all Linux distributions (e.g. following the letter-of-the-law of this RFC in terms of scope).

The purpose of this library, stated in the RFC, is "to provide all of the definitions necessary to easily interoperate with C code". Each platform defines its own scope (and the tier 1 platforms are defined in this RFC), and as long as the function is within that scope it's welcome in libc.

@eefriedman
Contributor

I also don't understand why you say that writing the inline definition in Rust is "useless work"? What are you trying to reason towards? Not exposing these symbols? Not binding Android at all? We don't necessarily have to seek out and implement all of these definitions, the RFC is just defining the scope of libc to include them so if added it'd be an acceptable change.

If you want to draw the line that way, I guess that's fine. It's a good point that we don't actually have to implement all the functions which could theoretically be supported.

For example, I don't understand how binary incompatibility of the CRT comes into play here. Surely the signatures of functions like memcmp aren't changing. If Rust is binding a function whose signature changes, that's of course bad, but other than that is there a failure mode that you're concerned about?

Sure, memcmp probably isn't changing... but for example, in VS2015, the CRT no longer provides a symbol named printf, and the signature of wcstok changed. (See https://msdn.microsoft.com/en-us/library/bb531344.aspx .) So if you really want to bind the whole CRT correctly, it has to be version-sensitive.

I'm also not sure why you're concerned about getting type definitions without a CRT? That sounds like the job of winapi rather than libc. If you're just dealing with Windows functions then you're talking to the CRT, and if you're just using Windows API functions then you can stay within those crates.

It would be preferable if there were a canonical home for wchar_t, although I guess it's not a hard requirement.

And finally, I don't understand how malloc/free are relevant here. It's of course a problem if you malloc in one DLL and then free in another with a different free function than intended, but how is that related to the design of libc? Are you worried specifically about how the CRT is linked in?

Hmm... I'll try to work through various scenarios.

Everything statically linked, built from source, mostly just works... but there are two issues. One, if the user uses functions like printf, where the symbol changes, or a function like wcstok, where the signature changes, libc has to know which CRT you will be linking against. Two, some libraries might prefer to pin a CRT version so that they don't have to deal with unknown runtime behavior changes in a future version of the CRT.

Suppose I'm dealing with a badly written binary DLL, I might need to link dynamically against a particular CRT to get the right versions of malloc() and free(). The user will ensure the correct CRT is linked in... and need to make sure libc knows about it so version-sensitive bits work.

Suppose I'm writing pure Rust code, and I decide to allocate memory using malloc() and free(), and pass it all over the place, because I don't want to deal with whatever equivalents Rust has. I decide to build part of my application as a DLL because I'm building multiple binaries, and statically link against the CRT because I don't know what I'm doing. The program crashes because it's using multiple different versions of malloc() and free().

@alexcrichton
Member Author

@eefriedman

Thanks for the MSDN link! Definitely quite helpful to see what kind of breakage we're talking about. While good to keep in mind, I don't think it affects the design of this library all that much though. In the worst case it's got a build script which dynamically alters the surface area and API of libc based on what MSVCRT version is found locally, but for now we'll probably just stick to the subset which is stable across the versions of VS we support (with automation to back this up).
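For a sense of what such a worst-case build script might look like, here is a minimal sketch. It is not part of the proposal: the `crt_inline_stdio` cfg name and the hard-coded version are made up for illustration, and detecting the installed CRT is left out entirely. The only real mechanism used is Cargo's `cargo:rustc-cfg=` build-script directive.

```rust
// build.rs sketch. Maps a CRT major version to cfg flags that the crate's
// source could use to include or exclude version-sensitive bindings.
// Assumption: Visual Studio 2015 corresponds to CRT major version 14.
fn cfg_flags(crt_major: u32) -> Vec<&'static str> {
    let mut flags = Vec::new();
    if crt_major >= 14 {
        // Hypothetical flag: stdio functions such as `printf` are no longer
        // plain exported symbols, so bindings to them would be gated off.
        flags.push("crt_inline_stdio");
    }
    flags
}

fn main() {
    // Placeholder value; a real script would have to probe the toolchain.
    let detected_major = 14;
    for flag in cfg_flags(detected_major) {
        println!("cargo:rustc-cfg={}", flag);
    }
}
```

The crate source would then guard affected items with `#[cfg(not(crt_inline_stdio))]`, so the public surface changes with the detected CRT.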

With respect to the linkage issues, it also seems somewhat orthogonal to the API issue of libc? Linkage against the CRT is still determined by the standard library really (rust-lang/rust#26258) because it uses some symbols from it, so it'll be awhile before we can tackle this sort of issue. So while this definitely sounds like it could be a problem, it doesn't seem too relevant to the design of libc itself because short of just not providing anything on Windows it's just a problem that has to be dealt with no matter what.

@eefriedman
Contributor

With respect to the linkage issues, it also seems somewhat orthogonal to the API issue of libc?

If we just say by definition that libc picks up symbols from whatever CRT is linked in, and rustc controls which CRT is linked in, yes, there isn't really any design impact. There are other possible designs, though: if libc controls which CRT is linked in, that would probably be exposed through cargo features or something like that. We might be able to put that off until later, though.

@retep998
Member

but it makes no effort to help with being portable by tweaking type definitions and signatures.

I presume this means we won't be using #[link_name] for CRT functions that have slightly different signatures than the normal posix version? Things like _fstat64 -> fstat?

The scope for the Windows version of libc seems reasonable to me. I don't see the loss of Windows bindings as a drawback.

With regards to std::os::raw, winapi currently relies on that for the primitive C types, so as long as std uses the c_void from std::os::raw for the handles it returns through AsRawHandle, then those handles should be usable within winapi as well without any casting. Whether std defines its own c_void or uses the one from the internal libc doesn't matter, so long as the public API for std has a single consistent c_void definition.

@alexcrichton
Member Author

@retep998

To help write "cross platform" code in the sense of trying to avoid unnecessary #[cfg] the library currently uses #[link_name] to rename some common functions. For example android doesn't define signal and OSX needs lots of help on some symbols.

Currently on Windows the leading underscore is removed and things like fstat64 also lose the trailing 64, but I could probably go either way on the losing-the-trailing-64 aspect.
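The renaming mechanism itself is small. In the sketch below the Rust-facing name `parse_c_int` is hypothetical and the linked symbol is C's `atoi`, chosen only because it is available on common platforms; the libc crate's actual renames (such as dropping a leading underscore) would use the same attribute with different names.

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_int};

extern "C" {
    // The Rust code calls `parse_c_int`; the linker resolves it to `atoi`.
    #[link_name = "atoi"]
    fn parse_c_int(text: *const c_char) -> c_int;
}

// Safe wrapper that builds the NUL-terminated string the C side expects.
fn parse(text: &str) -> c_int {
    let c_text = CString::new(text).expect("text must not contain an interior NUL");
    unsafe { parse_c_int(c_text.as_ptr()) }
}

fn main() {
    println!("{}", parse("42"));
}
```

The compiler cannot check that the symbol named in `#[link_name]` has the declared signature, which is one reason such renames benefit from automated verification.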

@mzabaluev
Contributor

I'd like the following approach to organization of the public namespace to be considered:
the APIs that are only present on certain OS families, or differ in their public surface from APIs with the same name on other OSes, should be exported in OS-specific modules following the convention in std, such as os::unix, os::bsd, or os::windows (or top-level modules without the os designator, whichever is thought to be stylistically better).

The purpose is to make the writer of consumer code conscious of the non-universal availability of those APIs, and facilitate management of downstream OS-specific code with #[cfg] directives.

There probably will still be some per-target variation, such as "holes" where particular functions aren't available on certain targets despite being available elsewhere in the wider family. But that kind of organization would help manage most of the differences.

@mzabaluev
Contributor

I expect that manual editing of target-specific definitions will leave room for error, even with the general intent to back it up with tests. To validate the data type definitions it's necessary to test correct functionality of some C API functions that use the data types, which may not be feasible for all APIs without imposing particular requirements on the test environment, such as filesystem, configuration of the network, DNS etc. Testing structure fields, bit flags, or enum constants may need a lot of care to isolate the effect of each definition.

A more practical approach in the long term would be to generate the FFI definitions per target from the system C headers, and only edit the public reexports manually. The generated code for all supported targets would still be checked into git, but there should be tools to update it for a specific target when needed.

@mzabaluev
Contributor

@eefriedman

Suppose I'm writing pure Rust code, and I decide to allocate memory using malloc() and free(), and pass it all over the place, because I don't want to deal with whatever equivalents Rust has.

That is a bit contradictory: if you are writing pure Rust code, surely you'd be using Rust allocators? :)
But it looks like there is a requirement in there: the linkage of libc should "do the right thing" on Windows. In my understanding, that means all Rust crates linking to libc, built by one particular configuration of Rust compiler, Cargo, and the MSVC libraries, would share a single MSVC RT DLL implementation, modulo the SxS lookup issues.

If some Rust crates have different requirements for linking the CRT, they probably should do it explicitly; there could be a MSVC-specific crate to facilitate that.

There's probably an argument to be made that libc shouldn't expose any functions at all on Windows.

The line may be drawn before functions that deal with state managed by the CRT, like malloc/free, <stdio.h>, and the POSIX-lookalike stuff that works with userspace "file descriptors" (does anyone use them, anyway?). Stateless functions, on the other hand, are fine.

@eefriedman
Contributor

That is a bit contradictory: if you are writing pure Rust code, surely you'd be using Rust allocators? :)
But it looks like there is a requirement in there: the linkage of libc should "do the right thing" on Windows. In my understanding, that means all Rust crates linking to libc, built by one particular configuration of Rust compiler, Cargo, and the MSVC libraries, would share a single MSVC RT DLL implementation, modulo the SxS lookup issues.

Yes... if you dynamically link against the CRT, and don't override the linker settings, it all just works out. If you statically link against the CRT, or try to mix different versions of the CRT, you can run into trouble.

@retep998
Member

Keep in mind that Rust's allocator doesn't use the CRT allocator, even if jemalloc is disabled. The only always safe option is to have the C library provide functions to free the things that it allocates.

@alexcrichton
Member Author

@mzabaluev

the APIs that are only present on certain OS families, or differ in their public surface from APIs with the same name on other OSes, should be exported in OS-specific modules following the convention in std, such as os::unix, os::bsd, or os::windows (or top-level modules without the os designator, whichever is thought to be stylistically better).

Thanks for bringing this up! I talked about this a bit in the drawbacks and alternative sections, but it's good to dive in here with more detail to see what it would look like. This is sort of what the library does today (only with standards, not surfaces kinda), so there's at least some precedent with this. I definitely agree that this fits Rust's "conventional platform-specific functionality" pattern better where the main surface area is entirely cross platform where modules then contain the specific functionality per-platform (e.g. std::os).

In considering this, however, I found that it may not end up solving some of the points in the motivation section. For example:

  • When adding an API it may not always be immediately clear what module the API should go into. There's probably a pretty good handle of the minimum module (e.g. the exact OS) that it should reside in, but there may be a question of lifting it up to the bsd or unix modules for example and whether that's possible.
  • Looking for an API in libc may be a little more difficult because it may not always be obvious which module it resides in. Search in the documentation can definitely help quite a bit with this, but it's an extra step to take.
  • Let's say we break up modules in terms of something like "unix" and "bsd-like". It may be difficult to say that all future "bsd-like" platforms will include the exact set of all APIs in the current "bsd-like" module. This may mean that adding another platform ends up being a breaking change in the future or it deviates from the structure by having a hole in that module for the new platform.

Overall I ended up personally concluding that this form of organization was one where the cons outweighed the pros, but I'm curious how you think about these topics?

I expect that manual editing of target-specific definitions will leave room for error, even with the general intent to back it up with tests

Yeah I expect that we may have to issue updates or fixes to APIs, and the hope is that with the automated testing in place we're at least "as covered as possible" and can hopefully prevent issuing a new major version of libc (which I imagine would be quite painful to upgrade). I'd certainly be on board with some tests, but I wouldn't necessarily want to require a test-per-API.

A more practical approach in the long term would be to generate the FFI definitions per target from the system C headers

I totally agree! @nagisa mentioned this earlier, and this RFC certainly doesn't preclude the approach of auto-generating FFI definitions and committing them in that form. That kind of one-time operation would be quite useful and could be easily verified with the testing infrastructure set up as well.


This is also a bit of a maintenance burden on the standard library itself as it
means that all the bindings it uses must move to `src/libstd/sys/windows/c.rs`
in the immediate future.
Member

Although with a bit of build system finagling (like say, making Rust use Cargo as its build system), you could make it use winapi for those things. 😉

@mzabaluev
Contributor

@alexcrichton, I agree that it's unlikely that #[cfg] work on a name-to-name basis can be completely eliminated. My worry is that with a flat namespace, there are no speed bumps against writing non-portable code. The current segmentation, though, is not very practical for use: it is organized in terms of standard layers which hardly anybody wants to remember in their daily work.

I'm thinking of segmenting the non-portable names along OS families correlated to the target configuration, where anything in os::unix is only available with #[cfg(unix)], os::linux brings in items that are definitely Linux-specific (rather than some outliers in widely adopted Unix APIs that Linux supports while some other OSes don't), and so on. The definitions pertaining to the C standards do not need to be namespaced (the hunch is that all modern compilers support C99, and should be expected to support C11 in case we need something from there; it's also possible to implement "polyfills" in the libc crate itself). This way, there will still be some per-target variation in available names, but the bulk of non-portable names will be clearly identified. It's also arguably easy to decide where a new item should belong, based on the principle outlined above.
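A minimal sketch of the layout being proposed, assuming a hypothetical `os::unix` module gated on `#[cfg(unix)]` and a single binding (`getpid`, with `pid_t` taken to be `i32`, which is an assumption about the target). The helper `current_pid` is also made up; it shows how consumer code is pushed to acknowledge the platform split.

```rust
mod os {
    // Everything in here exists only when compiling for a Unix-like target.
    #[cfg(unix)]
    pub mod unix {
        extern "C" {
            pub fn getpid() -> i32;
        }
    }
}

// Consumer code has to spell out `os::unix`, which acts as the speed bump.
#[cfg(unix)]
fn current_pid() -> Option<i32> {
    Some(unsafe { os::unix::getpid() })
}

// On other targets the module does not exist, so a fallback is required.
#[cfg(not(unix))]
fn current_pid() -> Option<i32> {
    None
}

fn main() {
    match current_pid() {
        Some(pid) => println!("running as process {}", pid),
        None => println!("no Unix process id on this target"),
    }
}
```

With a flat namespace the same binding would sit at the crate root behind the same `#[cfg]`, and nothing in the call site's path would hint that it is Unix-only.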

@alexcrichton
Member Author

@mzabaluev

I definitely agree that a flat namespace provides no speed bump against writing non-portable code. I don't think, however, that exposing e.g. libc::unix would necessarily be a "silver bullet" for writing portable code, specifically where argument types and the like are concerned (e.g. function signatures are tweaked slightly among platforms for functions of the same name). It'd certainly improve the portability situation, however!

You may want to take a look at the current organization of the libc crate for a concrete idea of what this sort of organization would look like. Right now src/unix/mod.rs basically has a skeleton of what would be the contents of libc::unix if it existed. It's missing constants and types and such which are available on all platforms but have different definitions, but the general idea is there.

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Sep 30, 2015
This lint warning was originally intended to help against misuse of the old Rust
`int` and `uint` types in FFI bindings where the Rust `int` was not equal to the
C `int`. This confusion no longer exists (as Rust's types are now `isize` and
`usize`), and as a result the need for this lint has become much less over time.

Additionally, starting with [the RFC for libc][rfc] it's likely that `isize` and
`usize` will be quite common in FFI bindings (e.g. they're the definition of
`size_t` and `ssize_t` on many platforms).

[rfc]: rust-lang/rfcs#1291

This commit disables these lints to instead consider `isize` and `usize` valid
types to have in FFI signatures.
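
As an illustration of the kind of signature this affects, here is a minimal sketch of a binding that uses `usize` where the C prototype says `size_t`. It assumes a target where the two types coincide; the wrapper name `c_strlen` is invented for the example.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

extern "C" {
    // C prototype: size_t strlen(const char *s);
    // `usize` stands in for `size_t` (assumed to match on this target).
    fn strlen(s: *const c_char) -> usize;
}

/// Safe wrapper: `CStr` guarantees a NUL-terminated buffer.
pub fn c_strlen(s: &CStr) -> usize {
    unsafe { strlen(s.as_ptr()) }
}
```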
@alexcrichton alexcrichton added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised. label Oct 15, 2015
alexcrichton added a commit to brson/rust that referenced this pull request Oct 16, 2015
@nagisa
Member

nagisa commented Oct 17, 2015

@alexcrichton for the auto-generated bindings, on at least the Unixes, most of the relevant documentation can be pulled from the manual pages, or the bindings can simply link to said manual pages.

e.g. something like

Binding to [dup2](http://linux.die.net/man/2/dup2).

Available on Linux, OS X, FreeBSD etc.

Manual pages usually already contain useful availability information (e.g.

dup(), dup2(): SVr4, 4.3BSD, POSIX.1-2001.

dup3() is Linux-specific.

)
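
Attached to an actual binding, that documentation might look roughly like the sketch below. The doc text is taken from the example above; the helper `dup2_onto_self` is invented here purely so the binding can be exercised, and it relies on `dup2` returning the descriptor unchanged when both arguments are the same valid descriptor.

```rust
use std::os::raw::c_int;

extern "C" {
    /// Binding to [dup2](http://linux.die.net/man/2/dup2).
    ///
    /// Available on Linux, OS X, FreeBSD etc.
    ///
    /// Conforming to: SVr4, 4.3BSD, POSIX.1-2001.
    pub fn dup2(oldfd: c_int, newfd: c_int) -> c_int;
}

/// Illustrative helper: duplicating a valid descriptor onto itself is a
/// no-op that returns the descriptor; an invalid one yields -1.
pub fn dup2_onto_self(fd: c_int) -> c_int {
    unsafe { dup2(fd, fd) }
}
```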

Lists of supported platforms can be generated pretty reliably over a few generations by making the generators communicate (e.g. by emitting lists of bound functions along with the bindings themselves). This would help somewhat with the fact that documentation pages are usually generated on Linux only (and hence are Linux-specific).

@retep998
Member

I am currently opposed to the line #[link(name = "msvcrt")] in libc. It should be rustc's job to choose which CRT to link in, so that someone can choose to statically link the CRT instead if they wanted. Right now it is forcing the hand of people using libc, especially since the libc that is used by std does the same.
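
For readers unfamiliar with the attribute, here is a minimal sketch of how a `#[link]` attribute on an `extern` block pins the library choice for every downstream user. The `cfg_attr` guard and the wrapper `c_abs` are added for illustration and are not the crate's actual source.

```rust
use std::os::raw::c_int;

// On MSVC targets this requests the dynamic CRT import library by name;
// any crate depending on this one inherits that choice.
// On other targets the attribute is compiled out.
#[cfg_attr(target_env = "msvc", link(name = "msvcrt"))]
extern "C" {
    fn abs(x: c_int) -> c_int;
}

/// Illustrative wrapper around the C `abs` function.
pub fn c_abs(x: c_int) -> c_int {
    unsafe { abs(x) }
}
```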

@alexcrichton
Member Author

@nagisa

That sounds promising to me! It'd actually be quite nice if we could just auto-generate all the Linux (and perhaps other platforms') bindings at once and be done with it!


@retep998

As I've mentioned earlier, the question of linking in the CRT I believe is a moot point with respect to this RFC. I certainly agree there are improvements to be made, but it doesn't have much to do with the API of liblibc itself.

@cuviper
Member

cuviper commented Oct 19, 2015

Can you clarify the relationship between the cross-platform bit of the RFC:

One question that typically comes up with this sort of purpose is whether the crate is "cross platform" in the sense that it basically just works across the platforms it supports. The libc crate, however, is not intended to be cross platform but rather the opposite, an exact binding to the platform in question. In essence, the libc crate is targeted as a "replacement for #include in Rust" for traditional system header files, but it makes no effort to help with portability by tweaking type definitions and signatures.

and this comment:

Currently on Windows the leading underscore is removed and things like fstat64 also lose the trailing 64, but I could probably go either way on the losing-the-trailing-64 aspect.

Well fstat is actually an interesting case, because Windows has 6 variants of different bitsizes! The chosen _fstat64 is indeed the best in terms of having steady 64-bit components though.

And as you know, on the internals list I brought up Linux LFS and the effects of _FILE_OFFSET_BITS, so even a "replacement for #include in Rust" still has to make decisions about this kind of definition. I still owe you patches for this, but I hesitate a little to see where this RFC lands.

I think the right answer is for Rust to choose the "best" variant, with 64-bit off_t of course. I'm not aware of other variations so broad, but if they arise I think it should be policy for the libc to just expose the "best" one. But this is also a wrinkle in the idea of trying to auto-generate bindings.

@alexcrichton
Member Author

@cuviper

You certainly bring up some excellent points! I agree that for now the "best" variant is probably the one that should be bound by liblibc, and we can perhaps add bindings in the future explicitly to older or different functions. I think one metric could be to compile a C program referencing the APIs in question with the "best set of #define directives" in play and then take a look at the object file and reference those symbol names (and corresponding structures). Incidentally, this is why we end up with crazy function names on OSX.

Does that make sense? Do you think it should be fleshed out a little more?
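
To sketch what binding the "best" symbol could look like: the Rust-side name stays plain while `link_name` points at the symbol a C compiler would pick with the large-file defines in play. This assumes a glibc Linux target, where the 64-bit variant is exported as `lseek64` and `SEEK_END` is 2; on other targets the sketch falls back to plain `lseek` and assumes a 64-bit `off_t`. The helper `file_len_via_lseek` is invented for the example. The same mechanism is what produces OS X names such as `fstat$INODE64`.

```rust
use std::os::raw::c_int;

extern "C" {
    // Bind the variant with a 64-bit offset under the ordinary name.
    #[cfg_attr(
        all(target_os = "linux", target_env = "gnu"),
        link_name = "lseek64"
    )]
    fn lseek(fd: c_int, offset: i64, whence: c_int) -> i64;
}

// Value of SEEK_END on Linux (assumed for this sketch).
const SEEK_END: c_int = 2;

/// Illustrative helper: seeks to the end of the file and returns the
/// resulting offset, which is the file's length for regular files.
pub fn file_len_via_lseek(fd: c_int) -> i64 {
    unsafe { lseek(fd, 0, SEEK_END) }
}
```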

@cuviper
Member

cuviper commented Oct 19, 2015

Yes, I'm happy with that approach, thanks!

@mzabaluev
Contributor

@retep998:

I am currently opposed to the line #[link(name = "msvcrt")] in libc. It should be rustc's job to choose which CRT to link in

I'm not familiar with how rustc operates on Windows, but unless output executables and DLLs are all linked with the CRT, libc is the go-to crate to request a CRT to be linked, while hiding the actual linkage details from all other crates. This works similarly to -sys crates for other foreign libraries. On Linux, there is a similar effect of getting a particular library DSO name baked in at link time, which is not directly controlled by the compiler options (nor is it with the C toolchain, where processing an -l flag typically hits a symbolic link to some particular version of the shared library).

Perhaps the only place where the CRT DLL name can visibly leak into the Rust build system is the links key for Cargo, so special care needs to be taken there. The nursery project does not currently have links, and I think this is OK, as libc is not an ordinary -sys crate.

@briansmith

libc is the go-to crate to request a CRT to be linked

Consider a C/C++ program, which has already made a decision of which CRT to link. Now, a Rust library is added to the program, which depends on the libc crate. Ideally, libc needs to use whatever version of the CRT the C/C++ program had already decided to use, instead of having libc force the program to make its CRT choice based on libc's choice.

@mzabaluev
Contributor

libc needs to use whatever version of the CRT the C/C++ program had already decided to use

I assume, in a future revision of the crate, that can be selected at build time by means of configuration features. This detail should not affect the Rust source of crates using libc, it only needs to show up in the build files. I think this is good enough to stabilize the libc crate now and add linkage features later.

One possible case where this could break is when the stable API has an implicit dependency on the choice of the CRT version or linkage variant. We probably can set the cutoff on the oldest supported CRT based on what symbols libc exports currently, which should be an old enough version of Visual C++ to satisfy the majority of users. Other than that, is anyone aware of any msvcrt variance that leaks through the libc crate? Here, I'm not talking about unintended weirdness such as having multiple mallocs in the program due to improper linkage.

@briansmith

Documenting equivalence seems like a fine idea to me, and I'd also be down for some stylistic advice with the caveat of auto-generated or auto-verified bindings!

As you can see above, I've filed a pull request that documents the equivalence and suggests using the Rust names.

You mentioned caveats regarding auto-generated or auto-verified bindings, but I don't think any caveat is necessary. It is trivial for binding generators and verifiers to map between Rust and C names, and such programs should also prefer to use the Rust names.

@steveklabnik
Member

Digging through old Rust issues, I found rust-lang/rust#17547 which is about libc.

@alexcrichton
Member Author

The libs team discussed this during triage yesterday, and the decision was to merge, so I will do so. Thanks for the discussion everyone!

@alexcrichton alexcrichton merged commit 415125f into rust-lang:master Oct 29, 2015
bors added a commit to rust-lang/rust that referenced this pull request Nov 10, 2015
This commit replaces the in-tree liblibc with the [external clone](https://github.com/rust-lang-nursery/libc) which has now evolved beyond the in-tree version in light of its [recent redesign](rust-lang/rfcs#1291).

The primary changes here are:

* `src/liblibc/lib.rs` was deleted
* `src/liblibc` is now a submodule pointing at the external repository
* `src/libstd/sys/unix/{c.rs,sync.rs}` were both deleted having all bindings folded into the external liblibc.
* Many ad-hoc `extern` blocks in the standard library were removed in favor of bindings now being in the external liblibc.
* Many functions/types were added to `src/libstd/sys/windows/c.rs`, and the scattered definitions throughout the standard library were consolidated here.

At the API level this commit is **not a breaking change**, although it is only very lightly tested on the *BSD variants and is probably going to break almost all of their builds! Follow-up commits to liblibc should in theory be all that's necessary to get the build working on the *BSDs again.
@Centril Centril added A-nursery Proposals relating to the rust-lang-nursery. A-libc Proposals relating to the libc crate. labels Nov 23, 2018