Support ARM64 architecture at the job description and runtime #11063

Closed
amaltaro opened this issue Mar 31, 2022 · 7 comments

@amaltaro
Contributor

Impact of the new feature
WMAgent

Is your feature request related to a problem? Please describe.
Given that we have CMSSW builds for ARM64 architecture, we should make sure this architecture would be supported in central production workflows as well.

Describe the solution you'd like
We need to ensure that aarch64 architecture is supported at the job description AND job runtime. This involves:

  • the submit_py3.sh job wrapper
  • required libraries need to be available in CVMFS (even if they don't come from our COMP stack). We must have at least python3 (3.8 would be best) and the future Python library.
  • SimpleCondorPlugin needs to properly define the Target architecture

and any other place that I might be still missing.

Describe alternatives you've considered
None

Additional context
Some discussions with Shahzad started in this PR: #11051

@amaltaro
Contributor Author

@smuzaffar here is the new ticket to discuss the required developments to support ARM64 jobs in central production.

My first question to you is, do you know what is the result of uname -m in an ARM64 node? Would it return the exact string we have in the ScramArch?

@smuzaffar

uname -m for arm64 is aarch64, and yes, it is the exact same string used in the ScramArch
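
To illustrate the point above, here is a minimal sketch of how a job wrapper could compare the node architecture against the ScramArch. The helper names (scram_arch_cpu, node_matches) are hypothetical and not part of WMCore; the sketch assumes a ScramArch has the three-field form <os>_<cpu>_<compiler>, and that x86 ScramArchs spell the CPU as amd64 while uname -m reports x86_64.

```python
import platform


def scram_arch_cpu(scram_arch):
    """Return the CPU field of a ScramArch such as 'slc7_aarch64_gcc820'."""
    parts = scram_arch.split("_")
    if len(parts) != 3:
        raise ValueError("unexpected ScramArch format: %r" % scram_arch)
    return parts[1]


def node_matches(scram_arch, machine=None):
    """Check whether the node architecture matches the ScramArch CPU field."""
    # platform.machine() reports the same value as `uname -m` on Linux
    machine = machine or platform.machine()
    cpu = scram_arch_cpu(scram_arch)
    # Assumption: only x86 needs an alias; aarch64 is spelled identically
    aliases = {"amd64": "x86_64"}
    return aliases.get(cpu, cpu) == machine
```

On an ARM64 node, node_matches("cc8_aarch64_gcc9") would compare "aarch64" with "aarch64" directly, with no alias needed.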

@hufnagel
Member

hufnagel commented Apr 1, 2022

+1 on this since we'll be trying to commission an ARM based HPC in the US ~soonish

@khurtado khurtado self-assigned this Apr 7, 2022
@khurtado
Contributor

khurtado commented Apr 29, 2022

@amaltaro Do we need to support ARM64 under slc6?

From what I see, we seem to have everything in place for rhel7 and rhel8.

Describe the solution you'd like

  • the submit_py3.sh job wrapper

This should already match the rhel7/8 python3 and py3-future.

[khurtado@lxplus723 /cvmfs/cms.cern.ch/COMP]$ ls -ld *rhel*aarch64*
lrwxrwxrwx. 1 cvmfs cvmfs 19 Apr 11 10:02 rhel7_aarch64 -> slc7_aarch64_gcc820
lrwxrwxrwx. 1 cvmfs cvmfs 16 Apr 11 10:02 rhel8_aarch64 -> cc8_aarch64_gcc9

  • required libraries need to be available in CVMFS (even if they don't come from our COMP stack). We must have at least python3 (3.8 would be best) and the future Python library.

Same here. We have python 3.8.2 and py3-future 0.18.2

[khurtado@lxplus723 /cvmfs/cms.cern.ch/COMP]$ ls rhel*aarch64/external/*/
rhel7_aarch64/external/py2-future/:
0.18.2

rhel7_aarch64/external/py3-future/:
0.18.2

rhel7_aarch64/external/python3/:
3.8.2

rhel8_aarch64/external/py3-future/:
0.18.2

rhel8_aarch64/external/python3/:
3.8.2
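
The same check can be scripted. Below is a minimal sketch that lists the versions found under <comp_root>/<os_arch>/external/ for each required package; find_runtime_deps is a hypothetical helper, and the directory layout is assumed from the listing above.

```python
from pathlib import Path


def find_runtime_deps(comp_root, os_arch, packages=("python3", "py3-future")):
    """Map each package to the sorted version directories found for it."""
    external = Path(comp_root) / os_arch / "external"
    found = {}
    for pkg in packages:
        pkg_dir = external / pkg
        # An empty list means the package is missing for this OS/arch
        if pkg_dir.is_dir():
            found[pkg] = sorted(p.name for p in pkg_dir.iterdir())
        else:
            found[pkg] = []
    return found
```

For example, find_runtime_deps("/cvmfs/cms.cern.ch/COMP", "rhel8_aarch64") should report the python3 and py3-future versions shown in the listing, provided CVMFS is mounted.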

  • SimpleCondorPlugin needs to properly define the Target architecture

Also here. For the plugin, we set the Target in the code below:

requiredArch = self.scramArchtoRequiredArch(job.get('scramArch'))
if not requiredArch:  # only Cleanup jobs should not have ScramArch defined
    ad['Requirements'] = '(TARGET.Arch =!= Undefined)'
else:
    ad['Requirements'] = '(TARGET.Arch =?= "{}")'.format(requiredArch)

which relies on the method below that already considers aarch64.

if len(requiredArchs) == 1:
    return requiredArchs.pop()
elif "X86_64" in requiredArchs:
    return "X86_64"
elif "ppc64le" in requiredArchs:
    return "ppc64le"
elif "aarch64" in requiredArchs:
    return "aarch64"
else:  # should never get here!
    return defaultArch
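
Putting the two excerpts together, a self-contained sketch of the flow might look like the following. This is not the WMCore implementation: the CPU_TO_CONDOR_ARCH mapping and the function names are assumptions made for illustration, and only the priority order (X86_64, then ppc64le, then aarch64) and the two Requirements expressions are taken from the code quoted above.

```python
# Hypothetical mapping from the ScramArch CPU field to the HTCondor Arch value
CPU_TO_CONDOR_ARCH = {"amd64": "X86_64", "aarch64": "aarch64", "ppc64le": "ppc64le"}


def required_arch(scram_archs, default_arch="X86_64"):
    """Pick one HTCondor Arch for a ScramArch string or a list of them."""
    if not scram_archs:
        return None  # e.g. Cleanup jobs without a ScramArch
    if isinstance(scram_archs, str):
        scram_archs = [scram_archs]
    archs = set()
    for scram_arch in scram_archs:
        fields = scram_arch.split("_")
        cpu = fields[1] if len(fields) == 3 else None
        archs.add(CPU_TO_CONDOR_ARCH.get(cpu, default_arch))
    if len(archs) == 1:
        return archs.pop()
    # Same priority order as the excerpt above
    for candidate in ("X86_64", "ppc64le", "aarch64"):
        if candidate in archs:
            return candidate
    return default_arch


def requirements(scram_archs):
    """Build the Requirements expression for the job ClassAd."""
    arch = required_arch(scram_archs)
    if not arch:
        return '(TARGET.Arch =!= Undefined)'
    return '(TARGET.Arch =?= "{}")'.format(arch)
```

With this sketch, requirements("slc7_aarch64_gcc820") yields (TARGET.Arch =?= "aarch64"), so the job would only match slots that advertise Arch as aarch64.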

Other than slc6, are we missing anything?

@smuzaffar

No, slc6/ARM or slc6/Power support is not needed; we do not have any CMSSW release for these archs.

@amaltaro
Contributor Author

amaltaro commented May 2, 2022

@khurtado Kenyi, I understand that there is no further work required and that the last changes provided with this PR: #11077 completed the support for this architecture (and PowerPC). Is that correct? If so, shall we close it?

Just a note regarding your comment on the BasePlugin, it looks like you just identified the place that we need to change (plus SimpleCondorPlugin) in order to address this ticket: #10674

@khurtado
Contributor

khurtado commented May 2, 2022

@smuzaffar Awesome, thanks!
@amaltaro Unless I'm missing anything, yes, I think we are good and we can close this ticket.
