Inconsistent page-size on arm64 #735
At least for gold, this was fixed in the linker to default to 64kB, and it looks like ld has defaulted to max-page-size=64kB since the code was originally added. Furthermore, other bugs similar to this and this reference the gold bug but otherwise don't seem to have a problem with 64kB pages. Feels like we're missing something here.
@AGSaidi yes, I tend to agree, having played a bit more now. I have a native Ubuntu Bionic arm64 host here, and it seems to create 64k-aligned binaries both natively and when building inside a manylinux2014_aarch64 container. I'm basing this on
However, people are clearly managing to get incorrectly built binaries out somehow. It looks like bad cffi wheels (https://foss.heptapod.net/pypy/cffi/-/issues/463) might have been generated in https://github.com/python-cffi/cffi-travis-wheel via https://github.com/matthew-brett/multibuild ? The bcrypt build that also works around it, https://github.com/pyca/bcrypt/blob/master/.github/workflows/wheel-builder.yml#L101, appears to run the container pretty plainly under qemu.
I'm looking into this, and a smallish project that exposes the issue is the cffi project. If we just do the following:
We see the last 2 LOAD sections carry the 4kB alignment; all they contain is the .gnu.hash and .dynstr structures. But interestingly enough, if we pull out what the
.gnu.hash and .dynstr are in the first LOAD entry, at 64kB alignment. Something else is happening to relocate these entries. Maybe this error is in setuptools?
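For anyone who wants to repeat this check without readelf, here is a minimal sketch that reads the `p_align` field of every PT_LOAD program header. It assumes a 64-bit little-endian ELF file (which is what aarch64 Linux shared objects normally are); the function name `load_alignments` is made up for illustration and is not part of auditwheel or patchelf.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def load_alignments(data: bytes) -> list:
    """Return p_align for each PT_LOAD header of a 64-bit little-endian ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    aligns = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr,
        # p_filesz, p_memsz, p_align.
        fields = struct.unpack_from("<IIQQQQQQ", data, off)
        if fields[0] == PT_LOAD:
            aligns.append(fields[7])
    return aligns


if __name__ == "__main__":
    import sys

    # Usage: python check_align.py some_extension.so
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, [hex(a) for a in load_alignments(f.read())])
```

A library in the state described above would show a mix of `0x10000` and `0x1000` entries, while a correctly aligned one would show only `0x10000`.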
Hmmm, no, not setuptools, it appears the
It looks like the version of patchelf used in the current version of manylinux2014_aarch64 is slightly too old. It is the culprit: it rewrites the ELF library on Arm64 and misaligns the LOAD sections. If I hacked auditwheel/patcher.py to force the
So the issue is coming from manylinux2014_aarch64 and its version of patchelf being slightly too old. The fix is in NixOS/patchelf@0470d69
This is a duplicate of issue pypa/auditwheel#251. From the original discussion on the numpy issue, there is a claim that patchelf is the wrong tool to solve this, and that a better idea would be to build Python itself with an appropriate flag; that flag would then be passed on to any C extension module built with distutils.
Ahh, I misunderstood. The compile is correct, but then auditwheel calls patchelf, which breaks the shared objects?
Correct. The compile is fine.
The fix @geoffreyblake referenced above hasn't made it into a release yet. One option would be to go back to carrying a patch for this
I pinged upstream, but yes, a PR to add the patch might fix the problem until they release.
@geoffreyblake great analysis!
@mattip carrying a patch, or just checking out a known good hash should solve this unless they can cut a 0.12 release quickly. Building a new container to try out and see if that fixes things.
Pushed a PR for review that solves this issue by carrying a patch for patchelf: #739
patchelf released 0.12 a little while ago
I've created #740 to upgrade patchelf.
I think #741 closes this. Thanks to all who put the effort into pinpointing the problem and solving it.
Closing per last comments.
Hello,
tl;dr Debuntu has a 4k page-size and CentOS 7/8 has a 64k page size, so aarch64 manylinux wheels built on the former fail on the latter with alignment issues.
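To make the failure mode concrete, here is a small sketch of the check involved. It assumes the dynamic loader refuses a loadable segment whose alignment is not a multiple of the running kernel's page size; `loadable_on` is a name invented for this illustration.

```python
import os


def loadable_on(page_size: int, segment_align: int) -> bool:
    """True if a segment alignment is a multiple of the given page size."""
    return segment_align % page_size == 0


# Page size of the host this runs on: 4096 on a 4k kernel, 65536 on a 64k one.
host_page_size = os.sysconf("SC_PAGE_SIZE")
print("host page size:", host_page_size)

# A 4k-aligned segment is fine on a 4k host but not on a 64k host,
# while a 64k-aligned segment is fine on both.
print(loadable_on(0x1000, 0x1000), loadable_on(0x10000, 0x1000))
print(loadable_on(0x1000, 0x10000), loadable_on(0x10000, 0x10000))
```

Under that assumption, aligning to 64k is the safe choice because it satisfies both kinds of host.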
There seem to have been a number of places that have noticed this
and a few other issues pop up too.
Now I understand that, to the letter of PEP 599, the build system should be CentOS 7; but in this case the page size relates to the host system, so running the CentOS 7 manylinux2014_aarch64 container is going to give you a different page size depending on whether you run it on Debuntu or CentOS (or, of course, some other distro).
This is compounded by (AFAIK) Travis CI only providing Ubuntu images (https://docs.travis-ci.com/user/reference/overview/ ?), making it difficult for people to get a 64k environment.
Individual builds can work around this with a linker flag to set 64k alignment (e.g. pyca/bcrypt@8d35d8a). But that has to be ported into each build script, and most people will never realise.
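As a rough sketch of that kind of per-build workaround, the snippet below appends a max-page-size option to `LDFLAGS` before a build runs. It assumes a GNU-style linker driven through the compiler (hence the `-Wl,` prefix) and is not copied from the bcrypt commit; the helper name `with_64k_alignment` is hypothetical.

```python
import os

# Assumption: the linker accepts "-z max-page-size=<bytes>"; 0x10000 is 64 KiB.
PAGE_SIZE_FLAG = "-Wl,-z,max-page-size=0x10000"


def with_64k_alignment(ldflags: str) -> str:
    """Return ldflags with the 64k max-page-size option appended exactly once."""
    parts = ldflags.split()
    if PAGE_SIZE_FLAG not in parts:
        parts.append(PAGE_SIZE_FLAG)
    return " ".join(parts)


# Set it for any build step started from this process afterwards.
os.environ["LDFLAGS"] = with_64k_alignment(os.environ.get("LDFLAGS", ""))
print(os.environ["LDFLAGS"])
```

This only affects how the extension is linked, so each project still has to opt in individually.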
I wonder what solutions people might have for this? Some I can think of
It would be good to come up with something by default, because doing the obvious thing of grabbing the manylinux2014_aarch64 container and building wheels on Debuntu (i.e. Travis, practically) creates wheels that don't actually work on the reference platform, so that seems suboptimal.