Implicit HeaderMaps #3712

Closed
wants to merge 3 commits into from

jerrymarino
Contributor

@jerrymarino jerrymarino commented Sep 9, 2017

Request for feedback on an implementation of C++ HeaderMaps in Bazel.

A HeaderMap is a data structure that allows the compiler to look up included
headers in constant time.

Traditionally, the compiler has to parse a large string of -iquote includes,
and then search these directories for a given header. This is slow for many
reasons.

The protocol of HeaderMap is implemented within compilers. Please find the
Lexer implementation in Clang.
https://clang.llvm.org/doxygen/HeaderMapTypes_8h.html
https://clang.llvm.org/doxygen/HeaderMap_8cpp_source.html
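
To make the constant-time claim concrete, here is a minimal Python sketch of the lookup scheme, based on a reading of the Clang sources linked above: an open-addressed hash table with a power-of-two bucket count, an additive case-insensitive hash, and linear probing. It models the table in memory only and does not read or write the binary .hmap format. The names `hash_key` and `HeaderMapTable` are hypothetical, and the hash details are an assumption that should be checked against HeaderMap.cpp.

```python
def hash_key(key):
    """Additive, case-insensitive hash: each lowercased byte times 13."""
    return sum(byte * 13 for byte in key.lower().encode("utf-8")) & 0xFFFFFFFF


class HeaderMapTable:
    """In-memory stand-in for a header map (hypothetical name)."""

    def __init__(self, entries):
        # Power-of-two bucket count, at least twice the entry count, so a
        # probe sequence always reaches an empty bucket.
        num_buckets = 2
        while num_buckets < 2 * len(entries):
            num_buckets *= 2
        self.buckets = [None] * num_buckets
        for key, path in entries.items():
            self._insert(key, path)

    def _probe(self, key):
        """Yield bucket indices starting at the hashed slot, wrapping around."""
        mask = len(self.buckets) - 1
        index = hash_key(key) & mask
        while True:
            yield index
            index = (index + 1) & mask

    def _insert(self, key, path):
        for index in self._probe(key):
            if self.buckets[index] is None:
                self.buckets[index] = (key, path)
                return
            if self.buckets[index][0].lower() == key.lower():
                return  # keep the first mapping for a key

    def lookup(self, include):
        for index in self._probe(include):
            bucket = self.buckets[index]
            if bucket is None:
                return None  # empty bucket ends the search: not mapped
            if bucket[0].lower() == include.lower():
                return bucket[1]
```

The cost of a lookup depends on the length of the probe sequence, not on the number of include directories, which is where the speedup over a directory-by-directory search would come from.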

Use case:

I'm seeing a massive increase in build performance by using this. It cut my
clean build time in half.

Performance data:

Build time before HeaderMap:

Target //Pinterest/iOS/App:PinterestDevelopment up-to-date:
bazel-bin/Pinterest/iOS/App/PinterestDevelopment.ipa
____Elapsed time: 373.588s, Critical Path: 18.86s

Build time after header maps on the entire project:

Target //Pinterest/iOS/App:PinterestDevelopment up-to-date:
bazel-bin/Pinterest/iOS/App/PinterestDevelopment.ipa
____Elapsed time: 188.971s, Critical Path: 17.11s

Additionally, this solves the problem of namespaced headers, which are used
throughout CocoaPods. Using a namespace makes includes clearer, since it is
easier for the user to tell where a header comes from.

Implementation:

At the ObjC level, headermaps are created with a namespace of the given target.
In objc_library it is possible for the user to override the value of the
namespace via the new attribute, header_namespace.

By using 2 headermaps the header searches are most efficient: a headermap for the
current target, and a header map with namespaced includes.

Users can include headers from ObjC targets in the convention of
Namespace/Header.h. Projects that don't use namespacing should see benefits as
well: includes of the form Header.h will be read from the headermap.

HeaderMapInfo contains all of the transitive info for dependent header maps,
and is merged together into a single map. This yields much better performance
than multiple headermaps.

This is my first PR to the Bazel repo, so any suggestions or feedback are greatly
appreciated!

@bazel-io
Member

bazel-io commented Sep 9, 2017

Can one of the admins verify this patch?

@philwo
Member

philwo commented Sep 11, 2017

@lberki @ulfjack Could you please take a look at this? Maybe Manuel would also be interested in this?

@lberki
Contributor

lberki commented Sep 11, 2017

/cc @mhlopko

First the fundamentals:

  • How are these files passed to Clang? On the command line?
  • What happens if two header maps conflict or if what the header maps say is inconsistent with what the outcome header search would be?
  • How would this work with non-Clang compilers, e.g. gcc or MSVC?

Since these are important enough questions, I took the liberty of only skimming over the change, and there were two things that came to mind:

  • You seem to pass a specific string-to-string map to the new rule. It looks like it's easy to override anything the C++ compiler might want to include (e.g. <stdio.h>), which looks like a bit too much rope. Maybe a better approach is to generate these from the hdrs= declarations somehow (handwave-handwave)
  • It's true that having external utilities is a bit of an overhead, but what you get out of it is that you can change them without waiting for the next Bazel release. So all things considered, I'd prefer as much functionality as possible to be outside of the source of Bazel. In my experience, that makes for much less frustration.

@jerrymarino
Contributor Author

Thanks for the feedback @lberki!

How are these files passed to Clang? On the command line?

First, a HeaderMap is included in the hdrs attribute, which makes it a dependency of a given rule. A user passes the file to a compiler via include options: in Bazel via the includes and copts attributes. Additionally, it needs an include opt for the current directory -I .

Example:

header_map(
    name = "SomeHeaderMap",
    map = {
        "Header.h": "Some/Header.h"
    }
)

objc_library(
    name = "Some",
    hdrs = [
        "Some/Header.h",

        # Depend on and propagate the HeaderMap
        ":SomeHeaderMap"
    ],

    # include genfiles -> SomeHeaderMap.hmap
    includes = [
        "SomeHeaderMap.hmap",

        # Needed for HeaderMaps
        "-I", "."
    ]
)

What happens if two header maps conflict or if what the header maps say is inconsistent with what the outcome header search would be?

The compiler resolves conflicts internally. At a higher level, it is similar to a user adding conflicting include paths: say, "Header.h" exists on disk in 2 different directories and both directories are included with -isystem. As with include order, there is a precedence order to HeaderMap includes.

How would this work with non-Clang compilers, e.g. gcc or MSVC?

HeaderMaps are supported on Apple's GCC Fork.

I'm not familiar with MSVC, but it looks like on WinObjc, they have support for HeaderMaps.

You seem to pass a specific string-to-string map to the new rule. It looks like it's easy to override anything the C++ compiler might want to include (e.g. <stdio.h>), which looks like a bit too much rope. Maybe a better approach is to generate these from the hdrs= declarations somehow (handwave-handwave)

I definitely think it should generate them based on hdrs ( perhaps in a similar fashion as modulemaps are ). For me, this would have netted a 2 minute decrease in clean build times, without any complexity on my end! I can look into adding support for this, but it may be a larger change.

For some of my use cases, a custom map is required: I need to specify a custom namespace for headers. The namespace is not exposed to bazel objc_library rule in any way.

For example, in CocoaPods, headers are included from a given Pod with a namespace.

<AsyncDisplayKit/Header.h>

And on the file system:

external/*Texture*/Arbitrary/Path/To/Header.h

The string AsyncDisplayKit doesn't necessarily exist at the rule or file system level, but other CocoaPods and consumers import the header with that namespace.

So for generating maps for CocoaPods or other Apple style libs, I think manually specifying namespaced headers is useful. Doing so at the BUILD file level seems less obtrusive and has the added benefit of being more flexible for other use cases.

It's true that having external utilities is a bit of an overhead, but what you get out of it is that you can change them without waiting for the next Bazel release. So all things considered, I'd prefer as much functionality as possible to be outside of the source of Bazel. In my experience, that makes for much less frustration.

Waiting for official Bazel releases isn't a big issue for me. At this point, I'm using custom builds for this and other changes.

@r4nt
Contributor

r4nt commented Sep 12, 2017

I agree this should be implemented very similarly to module maps - that is, automatically generated from the build rules. For namespacing, it sounds like we could add a hdrs_ns attribute that would set the mapped-to namespace for a particular library?

@lberki
Contributor

lberki commented Sep 12, 2017

How does the compiler find the header map? As far as I can tell, this change doesn't change its command line. What happens if multiple header maps are supplied? Is there a precedence between them? If so, what is it?

I agree with @r4nt -- it looks like it's better to compute header maps from information already given to Bazel so that one doesn't run the risk of providing inconsistent information in the header_map build rule and in e.g. hdrs= attributes.

How stable are header maps? The links you provided are to Clang internals, which doesn't fill me with confidence that it will remain the same for a reasonable amount of time.

@d0k

d0k commented Sep 12, 2017

The header map format hasn't changed in the last 10 years and probably has been stable for a long time before that. However, it is undocumented.

It's essentially a hashmap to accelerate file system queries and thus replaces directory lookups with a path string lookup. Header maps are specified with -Ifoo.hmap on the command line and act pretty much the same as specifying a directory. You can have multiple of those flags or mix them with normal -Idir and they will be queried in command line order.
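
The command-line-order behavior described here can be sketched as follows. This is an illustration only, not Clang's implementation: a Python dict stands in for a -Ifoo.hmap entry, a string stands in for a plain -Idir entry, and `resolve_include` is a hypothetical name.

```python
import os
import tempfile


def resolve_include(name, search_path):
    """Return the first hit for `name`, walking entries in command-line order.

    Each entry is either a dict (standing in for a header map) or a
    directory path string (standing in for a plain include directory).
    """
    for entry in search_path:
        if isinstance(entry, dict):
            # Header map: a single key lookup, no file system access.
            if name in entry:
                return entry[name]
        else:
            # Directory: one file system query per directory searched.
            candidate = os.path.join(entry, name)
            if os.path.isfile(candidate):
                return candidate
    return None
```

Swapping the order of a header map and a directory that both provide the same header changes which one is found, which is the same precedence rule that applies to ordinary include directories.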

Automatically generating header maps to accelerate Clang build times (GCC and MSVC are unlikely to ever implement this) seems worth investigating. Getting the granularity right for that might be tricky, though: you cannot represent include order within a single header map, but making many small header maps destroys the potential compile time speedup.

@thomasvl
Member

Drive by - Why would you want this? Last time I looked, header maps (which were created by Xcode) were keyed by the header name only. So if you have foo.h in two directories, a header map couldn't model it correctly, and even though you might have the proper includes set up to control the search order, the compiler could end up using the wrong one because of what "won" in the header map.
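
The collision concern can be checked mechanically before a map keyed by header name is written. A minimal sketch, assuming headers are given as slash-separated paths; `find_basename_collisions` is a hypothetical helper, not part of the patch:

```python
import posixpath
from collections import defaultdict


def find_basename_collisions(headers):
    """Group header paths by file name and return the names that are ambiguous."""
    by_name = defaultdict(list)
    for path in headers:
        by_name[posixpath.basename(path)].append(path)
    # A name-keyed map can only hold one path per name, so any name with
    # more than one path would silently lose all but one of them.
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

A generator could fail the build, or fall back to directory includes, whenever this returns a non-empty result.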

@jerrymarino
Contributor Author

Thank you kindly for the consideration and feedback!

As @d0k mentioned, HeaderMaps have been around for a while. They have been default enabled in production Xcode under Clang and before that Apple GCC.

For my iOS app, the traditional approach of header lookup does not scale. I've got a large target and directory of ~1700 headers, a few small internal deps, and ~50 external deps ( CocoaPods ). Total, it puts me at ~4500 headers and respective directories.

The poor performance even at this scale is due to the nature of the compiler's header search algorithm. The complexity of linear searching over included directories is the issue here. Build times without this are a non-starter even on the best Apple hardware.

Anecdotally, as an experiment, I replaced HeaderMaps with regular includes in Xcode and saw a 3x slowdown in Xcode's clean build time.

Ultimately, I think having a few maps at most for each compiler invocation ( and no other includes ) will be the fastest.

For a given target, I should be able to implicitly derive a headermap map based on hdrs which will boost performance across the board. The ideal headermap implementation for transitive headers is more nuanced and needs more consideration and testing.

I'm going to test a few different implementations to see how much build time I can regain. An implementation that just works would be ideal if it's possible.

@r4nt
Contributor

r4nt commented Sep 13, 2017

One question is whether we should have a header map for compiling each target; that way, we have
a) a single header map -> max performance
b) full control over how the header map is structured according to the dependencies
c) creating the header map is an extra step that will take time in O(transitive deps), which is the downside
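
The trade-off in a) to c) can be sketched as a traversal that builds one merged map per compiled target. This is an illustration under assumptions, not Bazel's implementation: `deps` and `entries` are plain dicts standing in for the build graph and each target's own header entries, and targets visited earlier win on conflicting keys.

```python
def merged_header_map(target, deps, entries):
    """Build a single header map for `target` from its transitive deps.

    deps:    dict of target -> list of direct dependencies
    entries: dict of target -> dict of include key -> header path
    """
    merged = {}
    visited = set()
    stack = [target]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        for key, path in entries.get(current, {}).items():
            # setdefault keeps the first mapping seen, so the target's own
            # headers take precedence over headers from its dependencies.
            merged.setdefault(key, path)
        # Reverse so dependencies are visited in their declared order.
        stack.extend(reversed(deps.get(current, [])))
    return merged
```

Each target is visited once, so the work is linear in the size of the transitive closure, which matches the O(transitive deps) cost noted in c).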

@jerrymarino
Contributor Author

So I've tested out a few different approaches to implementing this in Bazel and their performance implications.

Header maps are created internally in Bazel and implicitly.

ObjC Namespace heuristic

Namespaced keys allow users to include headers from ObjC targets in the convention of <Namespace/Header.h>. This is idiomatic for header inclusions in ObjC development and CocoaPods. ( #306 )

By default headermaps are created with a namespace based on the name of a given target.

In objc_library it is possible for the user to override the value of the namespace via the new attribute, header_namespace.

Dependency headers and namespaces are added transitively with a TransitiveInfoProvider.

From the BUILD file perspective, this is easier than having to create a header_map manually. It wouldn't be possible to achieve this level of performance or clean API with a Skylark based rule.

Performance: Cuts clean build time in half

By using 2 headermaps the header searches are most efficient: a small -iquote included headermap for the current target, and an -I included headermap. Additionally, Bazel processes smaller compiler invocations now: all of the includes I previously needed are reduced to 2 includes.
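
The two-include layout described above can be sketched as the flags a compile action would receive. This is a sketch under assumptions: the flag order follows the description in this thread (including the patch comment that the working directory include must come after the header map files), the .hmap paths are hypothetical, and `header_search_flags` is not a function from the patch.

```python
def header_search_flags(internal_hmap, namespaced_hmap):
    """Return the include flags for one compile action using two header maps."""
    return [
        # Small map for the current target, searched for quote includes.
        "-iquote", internal_hmap,
        # Map with namespaced includes for the target and its deps.
        "-I", namespaced_hmap,
        # Header map values are relative to the exec root, so the working
        # directory is added last, after both header maps.
        "-I", ".",
    ]
```

For example, `header_search_flags("x/lib_internal.hmap", "x/lib.hmap")` yields six arguments in place of one include flag per header directory.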

Clean build time on base commit:

Target //Pinterest/iOS/App:PinterestDevelopment up-to-date:
bazel-bin/Pinterest/iOS/App/PinterestDevelopment.ipa
____Elapsed time: 373.588s, Critical Path: 18.86s

Clean build time after implicit header maps:

Target //Pinterest/iOS/App:PinterestDevelopment up-to-date:
bazel-bin/Pinterest/iOS/App/PinterestDevelopment.ipa
____Elapsed time: 188.971s, Critical Path: 17.11s

Overall, I think it's a good improvement 🎉

@dmishe
Contributor

dmishe commented Sep 16, 2017

<> usually are used for system header imports. I understand that header maps gave access to user imports using system syntax, but the latest Xcode seems to do away with this practice:

The traditional header map which was generated when the “Always Search User Paths”
(ALWAYS_SEARCH_USER_PATHS) setting was YES is not supported by the new build system.
Instead, projects should set ALWAYS_SEARCH_USER_PATHS to NO and should migrate to
using modern header include syntax:

Use quote-style include ("foo.h") for project headers, and reserve angle-bracket include
(<foo.h>) for system headers.

I think this is a good change, and makes your imports clear. Can header maps be used with "" includes instead?

@jerrymarino
Contributor Author

jerrymarino commented Sep 18, 2017

@dmishe thanks for the feedback. On this branch, the header search routine will find imports of the form "Namespace/Header.h" from the implicit header maps. Existing imports of "Header.h" from deps will work too. I think you may see some speedups just by using this branch on existing repos.

@jerrymarino jerrymarino changed the title RFC: [Performance] C++ HeaderMap rule Implicit HeaderMaps Sep 19, 2017
@jerrymarino
Contributor Author

I've updated the patch based on the above discussions: added a couple tests and removed the original rule. The implicit header map logic handles all use cases I could think of.

I also wrapped it behind an option, experimental_enable_implicit_headermaps, which defaults to off. It shouldn't affect builds that don't set it.

cc @mhlopko @lberki @ulfjack Is there any other feedback that you have for this?

" hdrs = ['headerb.h'],",
")");

assertThat(getConfiguredTarget("//x:objc")).isNotNull();
Contributor Author

I didn't see a way to actually build //x:objc here, but it would be good to have it running end to end!

Contributor

The Java tests are not integration tests. You'd need to write an integration test to do an actual build.

Contributor Author

@ulfjack thanks for the feedback. I'm not too familiar with the integration testing system yet but will look into this.

Member

Well the easiest would be to take inspiration from our existing integration tests, e.g. https://github.com/bazelbuild/bazel/blob/master/src/test/shell/integration/cpp_test.sh

@ulfjack ulfjack requested a review from hlopko September 25, 2017 11:05
@ulfjack ulfjack removed their assignment Sep 25, 2017
@jerrymarino
Contributor Author

@mhlopko friendly ping! Would you kindly take another look at this?

If there is anything you all think can be done here to get this into a mergeable state, I'm happy to discuss 👍

Member

@hlopko hlopko left a comment

I didn't have time to finish the review today, so sending at least partial review out. One more note - please do not make it objc specific, it's useful for c++ too. And as you'll see in the comments, pls integrate it with include_prefix and strip_include_prefix.

headerMapsBuilder.add(internalHeaderMap);

String namespace;
if (ruleContext.attributes().has("header_namespace")) {
Member

Could we use include_prefix and strip_include_prefix for this? So we don't need another attribute.

Member

And for "this" I mean header_namespace attribute.

Contributor Author

I could take a stab at making it work with the existing include_prefix and strip_include_prefix attributes in cc_library too.

Nonetheless, I think the behavior of header_namespace is useful for objc_library, especially for ObjC that was written under Xcode's conventions.

The current implementation of objc_library -> header_namespace has quite different abilities than cc_library -> include_prefix / cc_library -> strip_include_prefix. Basically, header_namespace puts every single header under a flat namespace regardless of that header's location on disk.

Even if objc_library gained the include_prefix and strip_include_prefix attributes, I don't think it'd be trivial to set up includes for the following repo:

LibA
    BUILD
    A1
        SRC1.c
LibB
    BUILD
    C1
        HDR1.h
    C2
        HDR2.h
        D1
            HDR3.h
    .. continues for tens of levels of headers

File LibA/BUILD

objc_library(
    name = "LibA",
    srcs = ["A1/SRC1.c"],
    deps = ["//LibB:LibB"],
)

File LibA/A1/SRC1.c

// The header file resides at LibB/C2/D1/HDR3.h
#import "B/HDR3.h"

File LibB/BUILD

objc_library(
    name = "LibB",
    hdrs = ["C1/HDR1.h", "C2/HDR2.h", "C2/D1/HDR3.h"],

    # Every single header is available under namespace "B"
    # i.e. #include "B/HDR3.h"
    header_namespace = "B"
)

When we compile LibA/A1/SRC1.c we are passed the following header map:

    B/HDR1.h -> LibB/C1/HDR1.h
    B/HDR2.h -> LibB/C2/HDR2.h
    B/HDR3.h -> LibB/C2/D1/HDR3.h
    ... more

So the include in SRC1.c is valid:

// The header file resides at LibB/C2/D1/HDR3.h
#include "B/HDR3.h"

In CocoaPods.org and other iOS development, this include style is widely used.

Given the usage of includes in iOS development ( i.e. in Texture ) and Xcode's include behavior I think it makes sense for objc_library to have the header_namespace attribute.

Contributor

I think I argued for something like header_namespace back in the day; I do think you bring good arguments why it provides an easier user experience. Given that, it might make sense to also support this for C++, as well as supporting it both with and without header maps.

Contributor Author

Thanks for the feedback @r4nt! For me, in addition to the perf benefits, this feature is key to making Xcode based source compatible with Bazel without too much effort.

If this can be upstreamed, then I should be able to add header_namespace to the C++ library rule as well.

Ideally, getting header_namespace to work without header maps could be broken out into a separate PR. I'm not exactly sure what that would entail, but it seems doable.

Does that sound like a good plan @mhlopko?

Member

Is the "flattening" behavior of header_namespace really something people rely on? It sounds confusing and error prone to me because of header name collisions. In order to implement this without header maps (because people might want to use header_namespace even when they don't want header maps) we would have to pass -Idir for every dir in the target. I need more convincing that this is a good idea @jerrymarino @r4nt :) If we remove the "flattening" requirement, then header_namespace behaves exactly like include_prefix. Is there any way to collect more data on how often people rely on the "flattening"?

Contributor

If we want to go into the direction of allowing OS code to work together without changes, it's fundamentally similar to the include_prefix and strip_include_prefix, while being simpler to understand and providing more flexibility, so I'd say it's a clear win.

ImmutableList.Builder<Artifact> headerMapsBuilder = ImmutableList.builder();
String targetName = ruleContext.getTarget().getName();

HeaderMapInfo.Builder internalHeaderMapInfo = new HeaderMapInfo.Builder();
Member

Also include virtual headers for include_prefix/strip_include_prefix

depHeaderMapInfo.addHeaders(publicTextualHeaders);

// Merge all of the header map info from deps. The headers within a given
// target have precedence over over dep headers ( See
Member

SuperNit: extra space between ( and See.

Contributor Author

Will Fix

// the working directory ( i.e. exec root ) in this form
// and it must be after including the header map files
contextBuilder.addIncludeDir(PathFragment.create("."));
ImmutableList headerMaps = headerMapsBuilder.build();
Member

SuperNit: Reformat pls. In general, Bazel follows google java style guide (https://google.github.io/styleguide/javaguide.html). I'll take care of that when importing, but to increase readability pls try to adhere to column limit: 100, extra points for https://google.github.io/styleguide/javaguide.html#s4.5.1-line-wrapping-where-to-break :) But I'm nitting.

Contributor Author

Cool, thanks for the tip. I'll try to make this follow the style guide closely :)

public final class ClangHeaderMap {
// Logical representation of a bucket.
// The actual data is stored in the string pool.
private class HMapBucket {
Member

Nit: Let's make this comment a javadoc.

Member

And consider renaming to HeaderMapBucket?

Contributor Author

Sure, I'll rename it to HeaderMapBucket

private static final String GUID = "4f407081-1951-40c1-befc-d6b4daff5de3";

// C++ header map of the current target
private final Map <String, String> headerMap;
Member

Make ImmutableMap

Contributor Author

Good catch, will do!

) {
super(
owner,
ImmutableList.<Artifact>builder().build(),
Member

replace with ImmutableList.of()

Contributor Author

Done

owner,
ImmutableList.<Artifact>builder().build(),
output,
/*makeExecutable=*/ false);
Member

Kudos for using param comments! :) Could you add spaces around makeExecutable=?

Contributor Author

Sure, will do!

}

@Override
public DeterministicWriter newDeterministicWriter(ActionExecutionContext ctx) {
Member

No need to use short names here for ctx and b. I'd personally go with headerMap instead of hmap, but that's subjective taste.

Contributor Author

Thanks for the suggestion, Sir. I'll write out the actual names.

for(Map.Entry<String, String> entry: headerMap.entrySet()){
String key = entry.getKey();
String path = entry.getValue();
f.addString(key + "->" + path);
Member

No need to have this -> in the key.

Contributor Author

Will Remove!

@jerrymarino
Contributor Author

@mhlopko thanks so much for going through this and for the feedback, I greatly appreciate your time! I responded with comments inline. Most of it is straightforward enough.

I still think there's good value in having header_namespace in objc_library - see the long-winded rant on why inline.

For cc_library, I can look into supporting the include_prefix and strip_include_prefix attributes in HeaderMaps. I definitely want this to work for other types of repos than Apple based ones :)

What do you think?

@r4nt
Contributor

r4nt commented Sep 28, 2017

My take is that header maps should only be a performance trick, and that we should always have a way to at least make it compile with compilers not supporting them, otherwise we generate a confusing user experience.

The way we'd do that is very similar to what cc_inc_library does - we'll create a directory symlinking in all the headers and put a -I in for that directory.
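
The symlink fallback could look roughly like the sketch below. It is an illustration of the idea only, not how cc_inc_library is implemented: `build_symlink_tree` is a hypothetical helper that takes the same key-to-path mapping a header map would hold and materializes it as a directory tree.

```python
import os
import tempfile


def build_symlink_tree(out_dir, mapping):
    """Create out_dir/<include key> symlinks and return the matching -I flags.

    mapping: dict of include key (e.g. "B/HDR3.h") -> real header path
    """
    for key, real_path in mapping.items():
        link = os.path.join(out_dir, key)
        os.makedirs(os.path.dirname(link), exist_ok=True)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link from a previous run
        os.symlink(os.path.abspath(real_path), link)
    # A single include directory reproduces every key in the mapping.
    return ["-I", out_dir]
```

Because the tree is an ordinary directory, compilers without header map support resolve the same include strings, at the cost of one symlink per header.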

@jerrymarino
Contributor Author

Ok, so I've freed some cycles to work on this and build perf again 🎉

So far I've:

  • rebased master
  • split out header_namespace into flatten_virtual_headers and include_prefix
  • fixed a few misc issues

Overall, I can't agree more this rendition is better than header_namespace.

I'd like to see this merged because it is making Bazel faster and easier to use for iOS/OSX development - especially for big apps migrated from Xcode, like Pinterest.

Is it possible to support the symlink'd version ( non Apple gcc / Clang ) as a followup PR? Doing this incrementally would get it into Bazel sooner and prevent more bitrot. If we've got a plan to move forward, I'll work through the rest of @mhlopko 's comments ASAP. What do you think @ulfjack @mhlopko @r4nt?

@lfpino
Contributor

lfpino commented Mar 27, 2018

Hi @ulfjack @mhlopko @r4nt, Bazel sheriff here, what's the status of this pull request? It hasn't been updated for more than a week so it'd be good to understand if it's still ongoing or safe to close. Thanks.

@ob
Contributor

ob commented May 22, 2018

I'm also wondering what the plans for this PR are... note that the rebased version is already almost 1,000 commits behind the latest Bazel release.

@ob
Contributor

ob commented May 25, 2018

One problem that I just noticed with this PR (after rebasing on master and playing with it) is that it doesn't seem to add a way to get at the header map artifact from the objcProvider so that, for example, if a swift_library is made to depend on an objc_library that has flattened its header maps, there is no way from the apple_rules for swift to add the header map to swiftc's invocation.

ob pushed a commit to ob/bazel that referenced this pull request Jul 12, 2018
This is a squashed version of PR bazelbuild#3712
bazelbuild#3712
by Jerry Marino <[email protected]>

It adds an objc_library option called flatten_virtual_headers
that modifies the behavior of include_prefix such that instead
of prepending the path to the header's actual path, the headers
are mapped to the include_prefix namespace. E.g.

  src/headers/foo/bar.h src/headers/bar/baz.h

with the options:

  include_prefix = "FOO"
  flatten_virtual_headers = True

would let you include bar.h and baz.h as

  #include <FOO/bar.h>
  #include <FOO/baz.h>
ob added a commit to ob/bazel that referenced this pull request Jul 12, 2018
@ob ob mentioned this pull request Jul 12, 2018
@ob
Contributor

ob commented Jul 13, 2018

I have some bandwidth to shepherd this through (I created #5587 with a rebased version on top of master and feedback from here incorporated). I don't know where the best place to follow the discussion is, though.

I'm trying to clean up the code and tighten a few corners, and one of the things that I noticed is that the current version of the code adds both private and public headers to the header maps.

For example, this:

objc_library(
  name='trans_dep',
  srcs=['trans.m', 'trans_private.h', 'deep/trans/trans_private_deep.h'],
  hdrs=['trans.h', 'deep/trans/trans_deep.h'],
  include_prefix='trans',
  flatten_virtual_headers=True,
)

Will produce two header maps, trans_dep.hmap, and trans_dep_internal.hmap. I think the intent is that the _internal.hmap file has the non-namespaced includes and the other one is the public one, however, what I see is:

$ hmap print bazel-out/ios_x86_64-fastbuild/genfiles/test/trans_dep.hmap 
trans.h -> test/trans.h
trans/trans.h -> test/trans.h
trans/trans_deep.h -> test/deep/trans/trans_deep.h
trans/trans_private.h -> test/trans_private.h
trans/trans_private_deep.h -> test/deep/trans/trans_private_deep.h
trans_deep.h -> test/deep/trans/trans_deep.h
trans_private.h -> test/trans_private.h
trans_private_deep.h -> test/deep/trans/trans_private_deep.h
$ hmap print bazel-out/ios_x86_64-fastbuild/genfiles/test/trans_dep_internal.hmap
trans.h -> test/trans.h
trans_deep.h -> test/deep/trans/trans_deep.h
trans_private.h -> test/trans_private.h
trans_private_deep.h -> test/deep/trans/trans_private_deep.h

So it seems that the private headers are being exposed in both header maps. Doing this import:

#import <trans/trans_private.h>

in a dependency fails when used with a sandbox (because even though the header is mapped in the header map file, bazel doesn't copy it to the sandbox). However, if I run the build with --spawn_strategy=standalone to disable the sandbox, it finds the header.

I think we want the _internal.hmap file to have both private and public headers and the regular .hmap file to only have public headers. That way the current target can include both header maps and just propagate the .hmap file to its dependents.
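The proposed split can be sketched in Python. This is only an illustration under stated assumptions, not the PR's Java implementation: the function name build_header_maps is hypothetical, private headers are assumed to be the .h files listed in srcs, the public map is assumed to carry only namespaced keys, and the internal map is assumed to carry flattened keys for both public and private headers.

```python
import posixpath


def build_header_maps(package, srcs, hdrs, include_prefix):
    """Return (public, internal) dicts mapping include keys to workspace paths.

    Hypothetical helper: the key layout is an assumption made for illustration.
    """
    public_headers = list(hdrs)
    # Assumption: any .h file listed in srcs is a private header.
    private_headers = [src for src in srcs if src.endswith(".h")]

    def flat_entries(headers, prefix=None):
        entries = {}
        for header in headers:
            # Flatten the directory structure down to the file name.
            name = posixpath.basename(header)
            key = posixpath.join(prefix, name) if prefix else name
            entries[key] = posixpath.join(package, header)
        return entries

    # Public map: namespaced keys, public headers only (propagated to dependents).
    public = flat_entries(public_headers, include_prefix)
    # Internal map: flat keys, public and private headers (current target only).
    internal = flat_entries(public_headers + private_headers)
    return public, internal


public, internal = build_header_maps(
    package="test",
    srcs=["trans.m", "trans_private.h", "deep/trans/trans_private_deep.h"],
    hdrs=["trans.h", "deep/trans/trans_deep.h"],
    include_prefix="trans",
)
for key in sorted(public):
    print("public:  ", key, "->", public[key])
for key in sorted(internal):
    print("internal:", key, "->", internal[key])
```

With this layout, trans/trans_private.h never appears in the map that dependents see, so an import of a private header from another target would fail the same way with or without the sandbox.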

(Meta note: do we want to continue the conversation here or in #5587? I created a new PR since I can't change the code in this one and I'm actively hacking on it).

@r4nt
Contributor

r4nt commented Jul 16, 2018

@djasper

@ob
Contributor

ob commented Jul 17, 2018

@r4nt I would hold off on the review. I'm finding it hard to wire in the Swift support because of the way HeaderMaps are implemented here. Namely, I think they should be closer to how ModuleMaps are implemented (i.e. remove HeaderMapInfoProvider.java and HeaderMapInfo.java and roll that functionality into a new file CppHeaderMap.java, then move some or most of ClangHeaderMap.java into HeaderMapAction.java). The purpose of this refactoring would be to carry around the Artifact so that we can expose it in ObjcInfoProvider, where the Swift rules written in Skylark can pick it up.

@r4nt
Contributor

r4nt commented Jul 17, 2018

Looped in djasper as he has worked on making C++ header detection faster (we still need to open source include scanning). Once we have include scanning, header maps should nicely fall out, making C++ compiles potentially quite a bit faster.

@ob
Contributor

ob commented Jul 17, 2018

Hmm, interesting. Would it make sense to continue working on this PR or would you suggest waiting for @djasper's work to be open sourced?

Note that I'm interested in two aspects of this PR, one is the performance gain of using header maps, but the other is the ability to tweak the header namespace. There is a convention in the iOS apps world that headers from the same target can be imported as:

#import "header.h"

And headers from dependencies are imported as

#import <dependency/header.h>

Regardless of where they are located in the file system. The combination of the include_prefix and flatten_virtual_headers flags in this PR make that possible.
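The convention can be pictured as an ordered lookup across header maps. The following is a minimal Python sketch, assuming a target consults its own non-namespaced map first and then the namespaced maps of its dependencies. The function name resolve_import and the example paths are hypothetical, and the compiler's real lookup has details that are not modelled here.

```python
def resolve_import(name, header_maps):
    """Return the path for an import key, searching the maps in order.

    Hypothetical helper: each header map is modelled as a plain dict.
    """
    for header_map in header_maps:
        if name in header_map:
            return header_map[name]
    return None


# Hypothetical maps: the current target's own headers and one dependency.
own_map = {"header.h": "app/src/ui/header.h"}
dependency_map = {"dependency/header.h": "libs/dependency/include/public/header.h"}
search_order = [own_map, dependency_map]

# #import "header.h" resolves through the target's own map.
print(resolve_import("header.h", search_order))
# #import <dependency/header.h> resolves through the dependency's namespaced map.
print(resolve_import("dependency/header.h", search_order))
```

Because the keys are independent of the paths they map to, either header can move on disk without any import statement changing.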

@r4nt
Contributor

r4nt commented Jul 18, 2018

Oh, I wasn't suggesting this is gated on anything, just that djasper might want to help make sure this works well with how we envision the C++ rules working.

@ob
Contributor

ob commented Jul 18, 2018

Ok, my main worry is that @djasper's work could be rewriting CppHelper.java or CcCompilationHelper.java or some other large change like that which will make my hacking in this space pointless.

I'm currently modeling header maps pretty much like module maps, cleaning up interfaces and simplifying the classes. It'll probably take me another week or so to have something to show.

Is there any place where I can get a sense of what @djasper is doing or what his vision for C++ is?

@jerrymarino
Contributor Author

@ob another, pragmatic suggestion is to pull this logic out into a skylark rule and try to add headermaps support via objc_library wrapping macros 😉 Such a rule might be easy to hook up to swift library as well.

@djasper

djasper commented Jul 19, 2018

I have not planned any significant change in any of the regions mentioned. I am very much focussing on the execution phase, i.e. providing the compiler with the right inputs efficiently, improving input discovery and .d file scanning.

@ob
Contributor

ob commented Jul 19, 2018

@jerrymarino that's an interesting idea. I'm not sure how that would manage to propagate the header maps through the objective-c dependencies that don't use the wrapper macro though... it seems to me that as long as objc_library is part of Bazel, the change needs to be done in Bazel.

@djasper Thanks for confirming what you're doing. I'll plow ahead with my changes then (I'm just repackaging the code and cleaning up some corner cases). I'll post back once I'm done.

@jerrymarino
Contributor Author

@ob with some basic skylark, I think it is possible.

I'm testing out a few ideas:

  1. Merge headermap files transitively.
  2. Mirror the dependency graph with providers and add namespaces into the provider.

A theoretical skylark rule has a usability tradeoff since it can't automatically work as the current PR does.
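The first idea, merging header maps transitively, might look roughly like the Python sketch below. It is not the Skylark rule itself: the function name merge_transitive, the dict-based dependency graph, and the choice that the first mapping seen for a key wins are all assumptions made for illustration.

```python
def merge_transitive(target, deps, public_maps):
    """Merge the public header maps of a target and everything it depends on.

    Hypothetical helper: deps maps a target name to its direct dependencies,
    and public_maps maps a target name to a dict of include key -> path.
    """
    merged = {}
    seen = set()
    stack = [target]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # Shared dependencies are only visited once.
        seen.add(current)
        for key, path in public_maps.get(current, {}).items():
            # Assumption: the first mapping encountered for a key is kept.
            merged.setdefault(key, path)
        stack.extend(deps.get(current, []))
    return merged


deps = {"app": ["lib"], "lib": ["trans_dep"], "trans_dep": []}
public_maps = {
    "lib": {"lib/lib.h": "test/lib.h"},
    "trans_dep": {"trans/trans.h": "test/trans.h"},
}
merged = merge_transitive("app", deps, public_maps)
for key in sorted(merged):
    print(key, "->", merged[key])
```

A rule built this way would hand each compile action one merged map, at the cost of regenerating that map whenever any transitive dependency changes its public headers.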

@hlopko hlopko added team-Rules-CPP Issues for C++ rules and removed category: rules > C++ labels Oct 11, 2018
@hlopko hlopko added the P3 We're not considering working on this, but happy to review a PR. (No assignee) label Feb 19, 2019
@hlopko
Member

hlopko commented Mar 12, 2019

Hi all,
I'm wondering if this PR still has a chance of being implemented. What do you think? Should we just close it and maybe revisit in rules_cc once the rules are fully there?

@ob
Contributor

ob commented Mar 12, 2019

I closed mine: #5587, and #5954. Should I start working on rules_cc?

@hlopko
Member

hlopko commented Mar 13, 2019

Well, there is not much to work on in rules_cc yet (so far only macros delegating to native rules). But eventually, yeah :)

@hlopko
Member

hlopko commented Mar 13, 2019

Ok, I'll close this one as well.

@hlopko hlopko closed this Mar 13, 2019
Labels
cla: yes P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Rules-CPP Issues for C++ rules type: feature request