-
Notifications
You must be signed in to change notification settings - Fork 132
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for changes to source-build to support parallelism #3608
Comments
Do we also need to ensure that any compiler servers that are started by the build of one repo are isolated from other repo-builds and don't accidentally share their state? |
Possibly. That needs investigation. |
That's a good point. We've run into issues in the past with that: #3233 |
I think we need to follow up on the expected behavior of the compiler servers with the roslyn/msbuild team. Since the repo invocations are separate processes altogether (we invoke a new build.sh), I would not expect them to accidentally share state. If they did, that would be unwanted behavior in many cases (e.g. you run a build with two different .NET SDKs in serial). |
Note that we do have the option of turning off the compiler server if we want. |
If there are issues running compiler server in parallel builds please put the bug in dotnet/roslyn not here. The compiler server fully supports running in parallel and any bugs when it does not function there is something I would really like to look into. |
Alright, we have the Jared stamp. Good to know that these issues have been resolved. |
I looked at the original issue where you ran into parallelism problems. The symptoms on the PR are virtually identical to what we see in dotnet/runtime#85082. That is one of those "the error makes no sense" type of issues. The repro is very sporadic and hence never been able to catch a machine doing it so we could debug. If you have a reliable repro it would be nice. Note: it's likely with newer compilers (RC1 and later) this issue wouldn't repro. Even the "this can't happen but what if it did" scenarios that produce the behavior were eliminated with a recent compiler change. If it still repros then I'd be even more interested because there is a misconception somewhere that needs to be rectified. |
I can setup a private branch with our workaround removed to see if it still repros. |
Describe the Problem
Parallelism should provide serious gains for source-build's build. In a typical build, the end to end is around 50 mins. With parallelism enabled, I got < 30.
Describe the Solution
The are a few problems with enabling parallelism:
How to define the dependency graph correctly
This one is trickier than expected. What we should strive for is that each component project identifies only those dependencies that it directly depends on, and does so completely. Ideally, this information would be generated programmatically off of what the repo depends on in its Version.Details.xml file, as this is the source of information we use to determine what versions should be overridden.
Theoretically, for every component, we could identify the set of
SourceBuild
dependencies in the Version.Details.xml, and use those as theRepositoryReference
items. However, this graph would be both incomplete in some areas, and have too many edges in others. The main problem with this approach is that the SourceBuild elements are often centered around usage in repo-level source build, which is slightly different.All said though, info from traversing the graph is very close to the desired build order. Just need to add a few new edges and remove some existing ones. I propose the following approach:
Approach
SourceBuild
marked dependencies.AdditionalRepositoryReferences
andRemoveRepositoryReferences
AdditionalRepositoryReferences
is a set of additionalRepositoryReference
that should be added to the existing dependencies. For sdk.proj, this might include nuget.client, adding to the existing set. For dotnet.proj, this would be installer and source-build-packages, adding to an empty set because there are no sources.RemoveRepositoryReferences
is a set ofRepositoryReference
that should be removed from the generated set. This would include dependencies that cause circular references.The set of repository references can be calculated as:
RepositoryReferences = <Set of eng/Version.Details.xml SourceBuild dependencies, if available> + AdditionalRepositoryReferences - RemoveRepositoryReferences
The set of final RepositoryReference elements is built before the current project is built.
How to ensure correctness
If the final generated graph has cycles, MSBuild will detect these cycles during evaluation and the build will fail. Ensuring that there are enough edges is slightly more difficult.
The key to ensuring there are enough edges and they are in the right places is to ensure that outputs of a repository only go to separated locations not used for inputs and input locations are not shared between repos. A similar approach to what ProdCon v1 did would work here.
RepositoryReferences
set for that component are combined together into a unique feed (including for non-NuGet). To save space, symlinks or hardlinks could theoretically be used.Because the input version of a package is determined from the previously source-built packages + the input package feed, which would now be unique per component, if an edge is missing, the incorrect input version would be used. This would then usually result in a poison failure (taken from previously source-built), or a prebuilt (no edge at all).
Caveat
There is one case that this approach would not catch. A missing edge (e.g. to arcade) that would not cause a poison failure but could cause unwanted build behavior. In that case, the previously source-built would be used instead of the live arcade.
To fix this, we could go one step further and separate the previously source-built artifacts by component, and then apply the same methodology of restricting the inputs to a given component build only to declared dependencies. In this case, the declared repo references would need to avoid trimming away cycles in the graph (e.g. arcade would need to depend on arcade, sdk and runtime).
I don't think this step is necessary, unless it becomes clear that the number of edges that must be added or removed from the graph to ensure correctness is large (and thus easy to get wrong). My initial investigations suggest it is not, and only a few components need to alter their deps.
T-Shirt Size: Medium
The text was updated successfully, but these errors were encountered: