Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ppc64le support #28363

Closed
wants to merge 6 commits into from
Closed

ppc64le support #28363

wants to merge 6 commits into from

Conversation

sumitd2
Copy link

@sumitd2 sumitd2 commented Jul 12, 2023

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

Signed-off-by: Sumit Dubey <[email protected]>
@repokitteh-read-only
Copy link

Hi @sumitd2, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #28363 was opened by sumitd2.

see: more, trace.

@repokitteh-read-only repokitteh-read-only bot added the deps Approval required for changes to Envoy's external dependencies label Jul 12, 2023
@repokitteh-read-only
Copy link

CC @envoyproxy/dependency-shepherds: Your approval is needed for changes made to (bazel/.*repos.*\.bzl)|(bazel/dependency_imports\.bzl)|(api/bazel/.*\.bzl)|(.*/requirements\.txt)|(.*\.patch).
envoyproxy/dependency-shepherds assignee is @htuch

🐱

Caused by: #28363 was opened by sumitd2.

see: more, trace.

@sumitd2
Copy link
Author

sumitd2 commented Jul 12, 2023

cc: @clnperez

@yanavlasov
Copy link
Contributor

@sumitd2 looks like BUILD files needs to be re-formatted: https://dev.azure.com/cncf/envoy/_build/results?buildId=142353&view=logs&j=c7de00aa-b0fb-52ba-707b-da6afd918eda&t=2ec0901d-43fb-521a-88ec-473706b56eab&l=275

@yanavlasov
Copy link
Contributor

/wait

Signed-off-by: Sumit Dubey <[email protected]>
"amd64": "@platforms//cpu:x86_64",
"arm": "@platforms//cpu:arm",
"arm64": "@platforms//cpu:aarch64",
- "ppc64": "@platforms//cpu:ppc",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we upstream and we have to resort to patches?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch Two issues are still open with the rules_go community, hence the delay:
bazel-contrib/rules_go#3626
bazel-contrib/rules_go#3531

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also making some progress there. apparently the big and little endian cpu types were using the same platform constraint. this was just merged: bazelbuild/platforms#64

Copy link
Author

@sumitd2 sumitd2 Nov 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also making some progress there. apparently the big and little endian cpu types were using the same platform constraint. this was just merged: bazelbuild/platforms#64

@yanavlasov @htuch We applied a patch to bazel 6.3.2 (updated bazelbuild/platforms version to 0.0.8) and rules_go (added "ppc64le": "@platforms//cpu:ppc64le"), then tried to build envoy, but somehow the cpu addition is not taking effect. This method worked with another project called bazelbuild/buildtools. Can you tell us where the envoy build is picking up the bazelbuild/platforms version? cc: @seth-priya @clnperez

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can refer to what I changed several years ago to get envoy to build for Power: https://github.com/envoyproxy/envoy/blob/main/bazel/BUILD
though I'm not sure if some of that has been reworked since then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @keith @phlax have better coverage these days on these kinds of Bazel issues.

#define OPENSSL_32_BIT
#elif defined(__myriad2__)
#define OPENSSL_32_BIT
+#elif defined(_ARCH_PPC64)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you upstream this to BoringSSL? Thanks.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @htuch, we have decided to build, test and upstream these packages.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch The boringssl community isn't keen, so we will have to keep this patch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we may be able to just get back into base.h. we'd be an "unofficial" architecture but could build. we just wouldn't have assembly initially.

@htuch
Copy link
Member

htuch commented Jul 13, 2023

OK, waiting for the update after upstreaming.
/wait

@sumitd2
Copy link
Author

sumitd2 commented Jul 18, 2023

cc: @clnperez

@clnperez
Copy link
Contributor

clnperez commented Aug 2, 2023

actually -- it looks like ppc64le support was removed from boringssl b/c google (i'm assuming that's the "we" in the commit message) no longer has a use for it: google/boringssl@7d2338d. @htuch do you have any insight into that? or suggestions?

@htuch
Copy link
Member

htuch commented Aug 3, 2023

@davidben

@agl
Copy link
Contributor

agl commented Aug 3, 2023

it looks like ppc64le support was removed from boringssl b/c google (i'm assuming that's the "we" in the commit message) no longer has a use for it: google/boringssl@7d2338d.

We (BoringSSL) try to be clear about what is supported and what isn't. Otherwise, a laissez-faire approach leads to expectations developing that things that happen to work will continue to work, and much upset if that ever ceases to be the case.

We no longer have access to any ppc64le machines, and they are no longer available in our build & test environments, thus support was dropped.

@clnperez
Copy link
Contributor

We no longer have access to any ppc64le machines, and they are no longer available in our build & test environments, thus support was dropped.

Thanks @agl. LMK if there's a better forum to dicuss this, but I'd love to get boringssl support for ppc64le back. I wasn't sure from reading commit messages here and there if you guys only keep support for platforms you are actively using for something. If all it takes is us giving you guys some VMs, I can get you that.

@htuch
Copy link
Member

htuch commented Sep 15, 2023

If BoringSSL will not support, we can't provide production support either, including coverage with https://github.com/envoyproxy/envoy/security/policy. We also don't have CI ourselves covering this.

@phlax @envoyproxy/dependency-shepherds @envoyproxy/senior-maintainers thoughts on how to handle this case?

@moderation
Copy link
Contributor

Not sure if it is a formal position (if it isn't it should be), however I believe there is agreement that we can't support platforms for which there is no CI. This is in alignment with the BoringSSL folks. Over the years the PPC interest has ebbed and flowed and without a solid generally available CI supporting the platform, and a team interested in maintaining the platform code, it isn't feasible to support.

@phlax
Copy link
Member

phlax commented Sep 17, 2023

If BoringSSL will not support, we can't provide production support either

i think i would agree

that said - im aware that a lot of effort has gone into lining up the pieces to add support here so would be good to find some kinda way forward

even if we get everything resolved in terms of dependencies etc im not sure we can support running as part of our normal ci pipeline, which would be required to add as part of production publishing (fwiw im not against adding this support but it would be a big commitment and would need general maintainer agreement)

im aware there is some separate CI that gets triggered (and currently fails) - im definitely not against fixing up whatever is required to make that work again - on the basis that its non-blocking and essentially informational

@ggreenway
Copy link
Contributor

If BoringSSL doesn't support it, I'm against adding Envoy CI for it; that would create the impression that this is supported. Unless we want to create a build with no TLS/QUIC/crypto for PPC, but I'm not sure if that's useful.

@phlax
Copy link
Member

phlax commented Sep 18, 2023

fwiw there is also the envoy-openssl project which i notice has been active recently - perhaps that would be a route

@clnperez
Copy link
Contributor

thanks for the discussion everyone. we're trying to figure out what we can realistically do.

im aware there is some separate CI that gets triggered (and currently fails) - im definitely not against fixing up whatever is required to make that work again - on the basis that its non-blocking and essentially informational

that's what we're shooting for right now. i'd like to find a way to chat with someone from boringssl about getting them hardware (also a way to be more involved vs just tossing hardware over the fence). boringssl support going away has, over the past few months, cropped up in a few different projects.

Copy link
Contributor

@clnperez clnperez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good news on two fronts

#define OPENSSL_32_BIT
#elif defined(__myriad2__)
#define OPENSSL_32_BIT
+#elif defined(_ARCH_PPC64)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we may be able to just get back into base.h. we'd be an "unofficial" architecture but could build. we just wouldn't have assembly initially.

"amd64": "@platforms//cpu:x86_64",
"arm": "@platforms//cpu:arm",
"arm64": "@platforms//cpu:aarch64",
- "ppc64": "@platforms//cpu:ppc",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also making some progress there. apparently the big and little endian cpu types were using the same platform constraint. this was just merged: bazelbuild/platforms#64

@clnperez
Copy link
Contributor

@sumitd2 due to the new cpu added, we'll need to update this to ppc64le instead of just ppc.

@sumitd2
Copy link
Author

sumitd2 commented Oct 25, 2023

@sumitd2 due to the new cpu added, we'll need to update this to ppc64le instead of just ppc.

@clnperez The change (bazelbuild/platforms version update) has to first go into bazelbuild/bazel and a new version be tagged. Then it has to go into bazelbuild/rules_go and a new version be tagged. Then envoy (this PR) will be updated with both these new versions, and ppc be changed to ppc64le as you mentioned. I am waiting for the internal code contribution approval for bazelbuild/bazel to start the process. cc: @seth-priya

@sumitd2
Copy link
Author

sumitd2 commented Dec 20, 2023

fwiw there is also the envoy-openssl project which i notice has been active recently - perhaps that would be a route

@phlax @clnperez I dont think we can use envoy-openssl because of these patches:
https://github.com/envoyproxy/envoy-openssl/blob/d4035e86c6fff83a9712c25e4e3fa17033c64632/WORKSPACE#L30
So basically, in order to save ourselves one boringssl patch (as included in this PR), we will have to maintain ten of them. Let me know if i am missing something.
cc: @seth-priya

@clnperez
Copy link
Contributor

clnperez commented Jan 3, 2024

@sumitd2 ah, yes. Now that I'm digging there are many patches there we'd have to maintain. Some are not straight-forward. But the Google folks have said they don't want to deal with anyone assuming they themselves support an architecture they don't have in their CI platform.

Adam Langley said the point of base.h is to make it easy for others to patch in support as needed, and make the above very clear. So, @phlax -- could we reconsider the base.h patch here?

im aware there is some separate CI that gets triggered (and currently fails) - im definitely not against fixing up whatever is required to make that work again - on the basis that its non-blocking and essentially informational

That's what I've been working to get running again. I own it, and @sumitd2 is going to help keep it working. It runs in a VM at Oregon State University using an image I build and push (but am looking into improving that process to make it less on me).

@phlax
Copy link
Member

phlax commented Jan 3, 2024

So, @phlax -- could we reconsider the base.h patch here?

probably more a question for @ggreenway @htuch

@davidben
Copy link
Contributor

davidben commented Jan 3, 2024

Adam Langley said the point of base.h is to make it easy for others to patch in support as needed, and make the above very clear.

Are you referring to this comment?
https://boringssl.googlesource.com/boringssl/+/c1c7f0929f18fabc74004242d3e4ce03f54511d0/include/openssl/target.h#58

That's not what the comment says. The comment says the point of this list (which has since moved from base.h, so a patch would need to be rebased when updating BoringSSL) is to make it clear what we do and do not support. ppc64le is in the latter category.

It then talks about patching, but it is not saying the point is to be easy to patch. Rather, it is saying that, as a practical matter, if you want to use some configuration we do not support, you must be willing to carry local patches, with all the overhead that entails. E.g. you may run into issues that only impact ppc64le. (When we did support it, we had to carry a workaround for some ppc64le-only transactional memory bug that triggered on old kernel + old glibc! Time-traveling bugs are quite a nightmare to track down!)

If you are willing to play that game, then patching the list is trivial.

@htuch
Copy link
Member

htuch commented Jan 4, 2024

It then talks about patching, but it is not saying the point is to be easy to patch. Rather, it is saying that, as a practical matter, if you want to use some configuration we do not support, you must be willing to carry local patches, with all the overhead that entails. E.g. you may run into issues that only impact ppc64le. (When we did support it, we had to carry a workaround for some ppc64le-only transactional memory bug that triggered on old kernel + old glibc! Time-traveling bugs are quite a nightmare to track down!)

If you are willing to play that game, then patching the list is trivial.

I think I can safely say Envoy as a project does not have the appetite to develop, debug or maintain patches of the nature mentioned by @davidben. I think a prerequisite for us to support any architecture is BoringSSL support.

If we want to carry support for PPC with big red warnings pointing to the "use at your own risk, security critical portions of the build are unmaintained", exclude support for PPC build issues from Envoy maintainers and explicitly rule our supporting PPC in our security policy, I guess we can take the patch. But, I'm not a fan, and tend to agree with @ggreenway.

@mattklein123
Copy link
Member

If we want to carry support for PPC with big red warnings pointing to the "use at your own risk, security critical portions of the build are unmaintained", exclude support for PPC build issues from Envoy maintainers and explicitly rule our supporting PPC in our security policy, I guess we can take the patch. But, I'm not a fan, and tend to agree with @ggreenway.

+1 IMO this is a non-starter. We have been down this path before and it doesn't end well.

I would recommend maintaining a fork. We are happy to link to it.

@htuch
Copy link
Member

htuch commented Jan 17, 2024

https://nvd.nist.gov/vuln/detail/CVE-2023-6129 for a timely PPC specific bug that affects OpenSSL.

@alyssawilk
Copy link
Contributor

@htuch PTAL?

@clnperez
Copy link
Contributor

clnperez commented Feb 7, 2024

Are you referring to this comment?

no. an offline conversation

I would recommend maintaining a fork. We are happy to link to it.

We do have the maistra proxy fork: https://github.com/maistra/proxy, but no pure envoy fork with this patch. Maybe that's something we can take a look at.

I do appreciate all the input here. We (some s390x and ppc64le folks at IBM) will have to have some chats.

@htuch
Copy link
Member

htuch commented Feb 7, 2024

Thanks all. At this point I'm going to close this PR as it looks like we can't find a path forward to merge it without BoringSSL support. @clnperez feel free to open a PR for our landing Markdown / docs as appropriate to link to any PPC artefacts.

@htuch htuch closed this Feb 7, 2024
@clnperez
Copy link
Contributor

@htuch -- what're your thoughts on something like a bazel workaround that will build envoy for power/z with openssl, but boringssl for everything else? and of course we get the CI back (where this all started), and we'd own and watch that CI.

@clnperez
Copy link
Contributor

Actually that isn't something bazel could do. Probably no way to conditionally use one or the other here aside from the fork that already exists

@phlax
Copy link
Member

phlax commented Feb 16, 2024

bazel can build differently according to arch - using a flag is also a possibility

@ggreenway
Copy link
Contributor

Bazel could do that (see the FIPS build for an example), however I don't think openssl and boringssl are API-compatible, at least in some of the specific ways we're using them. Hence the openssl-fork of envoy, which is what you'd need to use to get openssl to work.

@phlax
Copy link
Member

phlax commented Feb 16, 2024

@clnperez if you need any help setting up an envoyproxy/envoy-ppc64le repo or similar i can assist

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deps Approval required for changes to Envoy's external dependencies
Projects
None yet
Development

Successfully merging this pull request may close these issues.