Enable multiple exporters #2760
Conversation
Force-pushed 9d597fe to 97804ac
Have not reviewed this yet, but making the returned metadata an array of objects with exporter types does not seem right. I think all exporters would just add their keys to the metadata. Exporters should be careful not to include keys that another exporter might use for a different value.
The reasoning behind an array is the freedom it provides regarding the use of multiple instances of the same exporter(s). But if the mental model of the export is the use of unique exporters - e.g. that no exporter can reuse keys / exporters should be careful about overwriting existing keys - then I understand that a single map is a more appealing and simpler approach.
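To make the single-map model concrete, here is a minimal Go sketch of how per-exporter responses could be folded into the one map returned to the client; the mergeExporterResponses helper and its responses argument are illustrative, not actual buildkit code.

```go
// Illustrative sketch only: fold each exporter's response into the single map
// that ends up in SolveResponse.ExporterResponse. Later exporters overwrite
// earlier keys, which is why exporters must not reuse keys for different values.
func mergeExporterResponses(responses []map[string]string) map[string]string {
	merged := make(map[string]string)
	for _, resp := range responses {
		for k, v := range resp {
			merged[k] = v
		}
	}
	return merged
}
```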
Force-pushed 6d899f4 to 083e4c5
@tonistiigi I updated the implementation to pack multiple exporter responses into a single map. |
@a-palchikov Check the CI results. |
Force-pushed 664635e to 1a24044
@tonistiigi Let me know if there's anything I need to do on my side.
Some tests still look red in CI.
api/services/control/control.proto (Outdated)
message SolveResponse {
	map<string, string> ExporterResponse = 1;
	ExporterResponse ExporterResponse = 1;
Is this backward compatible at the proto level?
Even if it is, I'll go back to the previous message layout - there is no longer a need for this.
api/services/control/control.proto (Outdated)
map<string, string> Response = 1;
// Name identifies the exporter. It is empty
// for the common exporter metadata (e.g. cache exporter)
string Name = 2;
Afaics this is unused?
Yes - will remove.
client/client_test.go (Outdated)
@@ -1893,6 +1895,74 @@ func testUser(t *testing.T, sb integration.Sandbox) {
	checkAllReleasable(t, c, sb, true)
}

func testPreventsMultipleExportWithSameExporter(t *testing.T, sb integration.Sandbox) {
	integration.SkipIfDockerd(t, sb, "multiple exporters")
Multiple local exporters should be allowed in dockerd.
client/client_test.go (Outdated)
}

func testMultipleExporters(t *testing.T, sb integration.Sandbox) {
	integration.SkipIfDockerd(t, sb, "multiple exporters")
Skip should not be required here, but the image exporter name needs to change to "moby" when running under dockerd. E.g. look at testBuildInfoExporter().
Attrs: map[string]string{
	"name": strings.Join([]string{target1, target2}, ","),
},
},
Add tar exporters here as well for good measure.
Oh, just realized that this could be a bigger issue with the multiple file-sender implementations like the local and tar exporters. There needs to be some sort of fan-out for the multiple file exporters at the gRPC level.
solver/llbsolver/solver.go (Outdated)
exporterResponse, err = e.Export(ctx, inp, j.SessionID)
return err
if err := inBuilderContext(ctx, j, exp.Name(), "", func(ctx context.Context, _ session.Group) error {
	resp, err := exp.Export(ctx, inp, j.SessionID)
I wonder if these should run in parallel if multiple are set.
I will give it a try and see if it can be made to run in parallel. Hopefully there's no mutable state to share.
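For illustration, here is a hedged sketch of what running the exporters in parallel could look like using errgroup (already a buildkit dependency); the exportAll helper and the function-valued exports are stand-ins for the real exporter instances in solver.go, not the code this PR adds.

```go
import (
	"context"

	"golang.org/x/sync/errgroup"
)

// Hypothetical sketch: run each configured export concurrently and collect
// the per-exporter responses. Each goroutine writes only to its own slice
// slot, so no extra locking is needed as long as exporters share no state.
func exportAll(ctx context.Context, exports []func(context.Context) (map[string]string, error)) ([]map[string]string, error) {
	resps := make([]map[string]string, len(exports))
	eg, ctx := errgroup.WithContext(ctx)
	for i, export := range exports {
		i, export := i, export // capture loop variables for the goroutine
		eg.Go(func() error {
			resp, err := export(ctx)
			if err != nil {
				return err
			}
			resps[i] = resp
			return nil
		})
	}
	if err := eg.Wait(); err != nil {
		return nil, err
	}
	return resps, nil
}
```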
Force-pushed 1a24044 to 44e035f
My recent changes had another undesired ripple effect - the propagation of compression to the inline cache exporter. I traced the changes to this PR and now have a question regarding this behavior in the presence of multiple exporters. The behavior could be extended as follows:
Has support for explicit compression configuration for the inline cache exporter been considered? (Sorry if I'm out of the loop here.)
I know this was possibly a valid choice with just one export target, but now that we want multiple exporters at a time, I think it's valid to assume they might have different compression requirements.
Force-pushed 44e035f to 1496c15
Inline cache export should not define a compression because it shouldn't create any new layers. Its job is to match the compression of the exported image, so there is always a 1-to-1 match between an image exporter and an inline cache exporter. If there are multiple image exporters with multiple compressions, then the inline cache also needs to run multiple times, with each result connected to the correct image export. Did this appear from making the exporters run in parallel, or is it just a new aspect you didn't think about before? If the former, we could leave that to a follow-up. FYI @ktock
No, it was something I did not think about before. Thanks for explaining - I get it now. Also sorry it is taking this much time - I have actually started looking into a way to make multiple client exporters (file and directory) work together after realizing this and thinking about your suggestion. Currently, it is not possible to run two or more local exporters (i.e. by changing the destination) or write multiple tarballs due to the limitation of a single
Force-pushed 1496c15 to 384c0ee
Force-pushed 384c0ee to 60ff7c7
var exporters []exporter
ids := make(map[string]int)
for _, exp := range opt.Exports {
	if id, ok := ids[exp.Type]; !ok {
A naive way to auto-id the exporters. Potentially the id could also be provided from configuration.
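As an illustration of the naive scheme described here, a hedged sketch; the exact id format (type name plus numeric suffix) is an assumption for illustration, not necessarily what the PR uses.

```go
import "fmt"

// Hypothetical sketch: the first exporter of a given type keeps its plain type
// name; further instances of the same type get a numeric suffix so their
// responses and progress entries can be told apart.
func exporterIDs(types []string) []string {
	ids := make([]string, 0, len(types))
	seen := make(map[string]int)
	for _, t := range types {
		n := seen[t]
		seen[t] = n + 1
		if n == 0 {
			ids = append(ids, t)
		} else {
			ids = append(ids, fmt.Sprintf("%s.%d", t, n))
		}
	}
	return ids
}
```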
}

func (e *imageExporterInstance) ExportImage(ctx context.Context, src exporter.Source, inlineCache exptypes.InlineCache, sessionID string) (map[string]string, error) {
	meta := make(map[string][]byte)
Copy of the metadata as the exporters are executed concurrently.
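The idea behind that copy, as a minimal hedged sketch (the helper name is illustrative):

```go
// Hypothetical sketch: give each exporter its own copy of the source metadata
// so concurrently running exporters never mutate a shared map. Only the map
// itself is copied; the value byte slices are still shared and must be treated
// as read-only.
func copyMetadata(src map[string][]byte) map[string][]byte {
	meta := make(map[string][]byte, len(src))
	for k, v := range src {
		meta[k] = v
	}
	return meta
}
```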
@@ -113,10 +124,29 @@ func syncTargetDiffCopy(ds grpc.ServerStream, dest string) error {
	return true
}
}(),
}))
})
for _, dest := range dests[1:] {
This runs regardless of the previous error, as fsutil.Receive seems to always fail with context.Canceled (even on master), and I did not look into this just yet.
session/filesync/filesync.go (Outdated)
}

func CopyToCaller(ctx context.Context, fs fsutil.FS, c session.Caller, progress func(int, bool)) error {
	method := session.MethodURL(_FSSend_serviceDesc.ServiceName, "diffcopy")
I chose to add another gRPC service to sync a directory for the local exporter, since it was impossible (well, complicated) to make both the file exporters and the local exporter run on the same protocol.
I think there may already be a precedent for this with the FileSync service, but I might be wrong.
We shouldn't need another service here.
We can do what we use for multiplexing input directories, using the gRPC header. So in the filesend handler, we should get the header value (and use something similar to the DirSource interface).
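A hedged sketch of that header-based multiplexing idea; the "destination-id" key and the helper are assumptions for illustration, not the header buildkit actually uses.

```go
import (
	"context"

	"google.golang.org/grpc/metadata"
)

// Hypothetical sketch: in the filesend handler, pick the output destination
// from a gRPC metadata header instead of adding a new service, mirroring how
// input directories are multiplexed.
func destinationFromContext(ctx context.Context) string {
	md, ok := metadata.FromIncomingContext(ctx)
	if !ok {
		return "" // no header: fall back to the single default destination
	}
	if vals := md.Get("destination-id"); len(vals) > 0 {
		return vals[0] // caller-selected output directory / exporter instance
	}
	return ""
}
```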
I also left the metadata conflict resolution up for discussion and potentially another PR.
Force-pushed 60ff7c7 to 0f4c73d
Needs rebase
Force-pushed 0f4c73d to a8ec866
Force-pushed 1fea6d7 to 27a1987
Force-pushed 27a1987 to ae8e22d
@a-palchikov sorry for the massively delayed review.
I've had a glance through, and I think the protocol changes make sense - I've just got a couple of notes around the local exporting; I think we can simplify some stuff there.
It's been a while - I'm happy to help carry this; otherwise, I'll definitely stay on top of reviews and feedback for this. Really sorry for the delays again ❤️
exporter/containerimage/export.go (Outdated)
src := *inp
src.Metadata = meta
If we're making a copy, we should overwrite inp with the copy so we don't accidentally use it later.
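Based on the quoted diff, a hedged sketch of that pattern; the withMetadata helper is illustrative, and exporter.Source is assumed to be the input type carrying the metadata map.

```go
// Hypothetical sketch: shallow-copy the exporter input, attach the metadata
// copy, and return the copy so the caller overwrites its reference and cannot
// accidentally keep using the stale original.
func withMetadata(inp *exporter.Source, meta map[string][]byte) *exporter.Source {
	src := *inp
	src.Metadata = meta
	return &src
}
```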
Done. I also updated the inputs to be values instead of references to give a stronger cue that they're not supposed to relay values back to the caller.
frontend/dockerui/build.go (Outdated)
r.mu.Unlock()
}

type mutableResult struct {
Hm, this is a good point. I'd rather we rework the Result type though (making all of the public types into private ones), instead of having wrapper types around it.
}
return inBuilderContext(ctx, job, exp.Name(), job.SessionID+"-export", func(ctx context.Context, _ session.Group) (err error) {
	var dref exporter.DescriptorReference
	resps[i], dref, err = imageExporter.ExportImage(ctx, inp, *inlineCache, sessionID)
We can avoid the interface juggling by passing a function that runs the runInlineCacheExporter function. Exporters that don't support inline cache just don't need to call that function.
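A hedged sketch of the callback approach being suggested; the InlineCacheFunc type and exportImage function are illustrative stand-ins, not the actual buildkit API.

```go
import "context"

// Hypothetical sketch: the solver hands each exporter a function that produces
// the inline cache; exporters that don't support inline cache never call it,
// so the cache is only computed when an image-like exporter asks for it.
type InlineCacheFunc func(ctx context.Context) ([]byte, error)

func exportImage(ctx context.Context, inlineCache InlineCacheFunc) error {
	if inlineCache != nil {
		cache, err := inlineCache(ctx)
		if err != nil {
			return err
		}
		_ = cache // attach to the exported image (omitted in this sketch)
	}
	// ... rest of the export ...
	return nil
}
```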
The interface helps to determine whether it's necessary to run the inline cache exporter in the first place - if that were a simple function, or placed into Source as additional metadata before the corresponding exporter's Export is entered, such an optimization wouldn't be possible. I'm OK with dropping it if it really simplifies things a lot in your view.
On the Result type and the need to handle concurrent updates: I see the appeal of a single thread-safe type for simplicity, but I also don't see how to do this with minimal changes - the change would be massive, touching a lot of places due to the API changes; e.g. instead of allowing direct attribute access, it would need getters/setters to exercise greater control (and that concerns the metadata and all attributes).
This also makes me wonder whether there is a strong case for a two-type hierarchy - a simple Result that is not safe for concurrent use (and suits the majority of use cases), plus a thread-safe wrapper for the cases where there is a legitimate need to populate the result from multiple goroutines. So far, I could only find a single place where that was necessary. Please let me know if I'm wrong here.
With that said, can we also reach some sort of middle ground and defer this refactoring to a dedicated issue?
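For illustration, a hedged sketch of the two-type idea (types simplified to just the metadata map; not the actual solver/result API):

```go
import "sync"

// Hypothetical sketch: a plain result that is not safe for concurrent use,
// plus a thin wrapper that adds locking for the rare call sites that populate
// a result from multiple goroutines.
type Result struct {
	Metadata map[string][]byte
}

type SyncResult struct {
	mu  sync.Mutex
	res *Result
}

func (s *SyncResult) AddMeta(k string, v []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.res.Metadata == nil {
		s.res.Metadata = map[string][]byte{}
	}
	s.res.Metadata[k] = v
}
```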
Attrs: exp.Attrs,
})
} else if localExporter == nil {
	// TODO(dima): different options per local exporter?
I think if we do my suggestion below for multiplexing between multiple output directories, this issue should go away? We then shouldn't need to do the de-duplication.
@jedevc No worries - I'll look into your suggestions, thanks!
Do you already know whether this feature will land in a release or in a preview release?
I'm working on the review comments, albeit slowly. Sorry for the delay.
Force-pushed ae8e22d to 5555bd5
In order to support multiple local exporters, a new gRPC API has been added to distinguish the origin of the message packet. Also, I felt that the solver/result package was slightly confusing with its use of a mutex to protect some metadata access patterns while not providing enough protection against potential data races. Instead, where necessary to protect against data races, the result values are explicitly duplicated (including the metadata).
Signed-off-by: fahed dorgaa <[email protected]>
Signed-off-by: a-palchikov <[email protected]>
…h the implementations Signed-off-by: a-palchikov <[email protected]>
remains as to what to do to possibly different options per local exporter since it's currently replicated on the client but the attestation files are generated on the server. Signed-off-by: a-palchikov <[email protected]>
Signed-off-by: a-palchikov <[email protected]>
Signed-off-by: a-palchikov <[email protected]>
* assign the input source in exporters to avoid re-using the stale reference
* remove the FSSend gRPC service - instead rely on gRPC metadata to communicate the functional protocol
Signed-off-by: a-palchikov <[email protected]>
thread-safe Result. Address review comments:
* Re-use the input variable in exporters to avoid accidental re-use of stale values
Signed-off-by: a-palchikov <[email protected]>
Force-pushed 5555bd5 to d1192f4
No worries, thanks a lot for the feedback!
There's actually a replacement PR for this - I'll close this one to avoid confusion.
fixes #1555
This is a follow-up on #1788.
Multiple exporters
This PR enables buildkit to output to multiple destinations.
No attempt was made at this point to avoid metadata collisions that multiple exporter instances might run into.