Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for IAM token #10939

Merged
merged 1 commit into from
May 11, 2022
Merged

Add support for IAM token #10939

merged 1 commit into from
May 11, 2022

Conversation

vkuznet
Copy link
Contributor

@vkuznet vkuznet commented Dec 22, 2021

Fixes #10940

This PR provides support to include IAM token to HTTP requests. The token can be obtained via separate process, e.g. via curl, create-iam-token.sh script or oidc-agent, and then can be used by any HTTP client. We added appropriate Authorization HTTP header to pycurl_manager. The token itself can be provided either via file name or environment variable.

For more information please consult Tokens in WMCore wiki page.

Status

ready

Description

We provide a patch to pycurl manager to setup appropriate HTTP header if token is provided via configuration or environment. And, TokenManager module which provides different ways to obtain IAM token, either via readToken function or TokenManager class.

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

Here is PR for wmcore/wmagent specs: cms-sw/cmsdist#7647

External dependencies / deployment changes

None

@cmsdmwmbot
Copy link

Jenkins results:

  • Python2 Unit tests: failed
    • 9 new failures
    • 1 tests no longer failing
    • 1 tests added
  • Python3 Unit tests: failed
    • 8 new failures
    • 1 tests added
    • 1 changes in unstable tests
  • Python2 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 59 comments to review
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/12709/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented Feb 16, 2022

@amaltaro , @todor-ivanov could you please review this PR. The changes are trivial and if merged we can start testing usage of WMCore services with token based authentication.

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valentin, in addition to the comments made along the code, I have a more general question/comment on how these tokens will be used, like:

  • once we load a token in the RequestHandler object, I think we should keep using it instead of making the os.environ call for every HTTP call that goes through pycurl_manager.
  • with that said, we would need to have a mechanism to also update our token because it will eventually expire, right?
  • adding support to reading this token from a file might be a good idea

Last but not least, the pycurl_manager module also contains some functions. We need to make sure getdata function is covered with this implementation.

In terms of other HTTP modules, we use this one for uploading a root file - from within the worker node - to the DQM Gui:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/Services/HTTPS/HTTPSAuthHandler.py
you might want to have a look at it as well.

@@ -61,6 +61,14 @@
from Utils.PortForward import portForward, PortForward


def get_token(env=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valentin, I believe at some point we will want to have different operations for tokens (like, renewing, checking validity, etc) and maybe even different technologies. Right?

In that case, I would suggest to create a new module under the Utils package and have all the token-based implementation in there.

@@ -242,6 +257,7 @@ def set_opts(self, curl, url, params, headers,
if verbose:
curl.setopt(pycurl.VERBOSE, True)
curl.setopt(pycurl.DEBUGFUNCTION, self.debug)
self.request_headers = headers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this attribute is not used anywhere but here in the __init__ method, it looks like this line could be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the request headers can be required in different clients, e.g. DBSClient, and should be set accodringly. This is why it was set here. The are initially declared in ctor and set appropriate for every HTTP request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's using this class, then it should already have the headers set (L239), no? Which DBSClient library are you referring to here?
Maybe to verify whether it's really needed or not, you could comment this out and push it in and see what jenkins output we get.

@vkuznet
Copy link
Contributor Author

vkuznet commented Feb 21, 2022

@amaltaro thanks for comments. We're working on setting up never expired token, similar to how we handle proxy for x509. Once new service will be in place, the token will be updated on regular basis and placed in a file. The token validity is defined by IAM provider and as far as I saw it is valid for 1hour. Therefore, I'll adjust code to do the following:

  • add token manager to provide valid token
    • it will read token from default env/file
    • it will check validity of the token
    • if necessary it will re-read token again (if token is expired)
  • initialize token manager at ctor

I'll put new code in Utils as you suggested and then use it here and pycurl manager.

@vkuznet vkuznet self-assigned this Feb 21, 2022
@vkuznet vkuznet requested a review from amaltaro February 23, 2022 14:49
@vkuznet
Copy link
Contributor Author

vkuznet commented Feb 23, 2022

@amaltaro , I provided a first draft of TokenManager which so far can only read and validate token. Later, we may discuss and add new functionality how to obtain new token in TokenManager. Please bare this in your review. I explored several python libraries and decided to stick to pyjwt recommended on jwt.io. This library provides ability to inspect token and does not have much dependencies. This implies that it should be added to requirements of WMCore.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 67 new failures
    • 589 tests deleted
    • 1 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: failed
    • 1 errors and warnings that should be fixed
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/12812/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented Feb 23, 2022

I fixed unit tests, but we should decide how to perform unit test which require valid token. I put necessary comments into the codebase to comeback to this once we'll implement procedure to obtain token programmatically.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 306 new failures
    • 262 tests deleted
    • 6 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: failed
    • 1 errors and warnings that should be fixed
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/12813/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented Feb 23, 2022

@amaltaro , @todor-ivanov I am not sure I understand the following pylint error [E1601, line 72 in tokenData: print statement used](). According to quick Google search it is false positive, see here

@amaltaro
Copy link
Contributor

@vkuznet I haven't looked into this PR, but please replace the print statement by the logging library. This is an effort that we made a few years ago and kept the print statement only where we didn't manage to get a logger object working well.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 306 new failures
    • 262 tests deleted
    • 7 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: failed
    • 1 errors and warnings that should be fixed
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/12815/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 306 new failures
    • 262 tests deleted
    • 8 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/12821/artifact/artifacts/PullRequestReport.html

@amaltaro
Copy link
Contributor

@vkuznet Valentin, could you please work on the cmsdist specs to get this new python library in the stack? Once you have that working, please link it to the first description of this PR. I haven't looked into these details, but I believe the massive failures are coming from the import error for the new library.

@vkuznet
Copy link
Contributor Author

vkuznet commented Feb 24, 2022

@amaltaro I made necessary changes to replace print with logger. I checked that all pylint issues are gone. But the jenkins tests are failing due to the following:

...
  File "/build/cmsbld/jenkins/workspace/DMWM-WMCorePy3-PR-unittests/SLICE/0/label/cms-dmwm-cc7/code/src/python/WMCore/Services/Requests.py", line 80, in __init__
    self.reqmgr = RequestHandler()
name 'RequestHandler' is not defined

It seems to me that in Jenkins there is an issue with environment since RequestHandler is defined in WMCore/Services/Requests.py as following:

try:
    from WMCore.Services.pycurl_manager import RequestHandler, ResponseHeader
except ImportError:
    pass

Therefore, it seems to me that there is an issue with loading pycurl_manager which is part of this PR. Since there is not any printout in import error section we don't know what is the actual problem. Locally, I can easily load pycurl_manager and its pylint scores 10/10. If you want I can change Requests.py to add import error printout to understand the issue.

@vkuznet vkuznet force-pushed the iam-token branch 2 times, most recently from ff0f581 to deddb15 Compare February 24, 2022 18:40
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: failed
    • 306 new failures
    • 262 tests deleted
    • 7 changes in unstable tests
  • Python3 Pylint check: succeeded
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/12823/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented Feb 24, 2022

@amaltaro , please be specific which spec file you're talking about? I adjusted code base to handle missing jwt import, hopefully this will fix failed unit tests. I think we should add separate PR for requirements.txt and the spec, but I need to know which spec(s) do you use these days.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 8 tests added
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 87 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 34 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13175/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 8 tests added
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 87 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 34 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13184/artifact/artifacts/PullRequestReport.html

@vkuznet vkuznet requested a review from amaltaro May 10, 2022 19:11
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 8 tests added
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 87 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 34 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13185/artifact/artifacts/PullRequestReport.html

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Valentin, I had another look at this and it's looking good now, almost ready to go.
I resolved a few comments that you have already worked on but were not automatically closed by GH, and left 2 or 3 comments that are worth it.

Once you have those changes in place, feel free to rebase and squash these commits. If it's not too late, please keep the test changes in a separate commit.

src/python/Utils/TokenManager.py Show resolved Hide resolved
src/python/Utils/TokenManager.py Outdated Show resolved Hide resolved
src/python/Utils/TokenManager.py Show resolved Hide resolved
It caches token along with expiration timestamp.
By default the env variable to use is IAM_TOKEN.
:param name: string representing either file or env where we should read token from
:param url: IAM provider URL, by default it is https://cms-auth.web.cern.ch/jwk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding this comment, please actually consider the one made in the __init__ method signature.

src/python/Utils/TokenManager.py Show resolved Hide resolved
requirements.txt Outdated Show resolved Hide resolved
@amaltaro
Copy link
Contributor

@vkuznet Valentin, can you also please give a brief summary of what is provided in this PR and what kind of token management it allows now? It'd be useful to link the WMCore token wiki as well: https://github.com/dmwm/WMCore/wiki/Tokens-in-WMCore

FYI @belforte @mapellidario this change is likely going in in the next release and it depends on a new python 3rd-party library called: PyJWT, which has been already integrated to cmsdist repo, but our wmcore/crab specs still need to have this dependency added to the Requires list.

Last but not least, these changes - especially to pycurl_manager - should be backwards compatible, so even if we are not using tokens and don't have PyJWT library, nothing should break. Valentin can confirm this though.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 8 tests added
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 87 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 34 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13190/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented May 11, 2022

@amaltaro , thanks for suggestions I adjusted the codebase accordingly. Regarding PR description. I updated it too. But this PR does not provide any token management, we only provide methods (either via function or class) to read token, but we do not manage tokens. Pointer to wiki has been added.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 8 tests added
    • 1 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 87 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 34 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13192/artifact/artifacts/PullRequestReport.html

@amaltaro
Copy link
Contributor

Thank you, Valentin. Yes, by management I meant that we have some utilities to load in a token and check expiration time. But I agree this is only a small set of managing tokens.

If you are ready with these changes, please:

  1. squash the commits accordingly
  2. remove the no longer relevant GH labels
  3. update the status In development to ready or done.
  4. and fire a final request review.

@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 8 tests added
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 87 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 34 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13194/artifact/artifacts/PullRequestReport.html

@vkuznet
Copy link
Contributor Author

vkuznet commented May 11, 2022

test this please

@vkuznet vkuznet requested a review from amaltaro May 11, 2022 14:18
@cmsdmwmbot
Copy link

Jenkins results:

  • Python3 Unit tests: succeeded
    • 8 tests added
    • 2 changes in unstable tests
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 3 warnings
    • 87 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 34 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/13195/artifact/artifacts/PullRequestReport.html

Copy link
Contributor

@amaltaro amaltaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Valentin.

@amaltaro amaltaro merged commit 3e1eaf0 into dmwm:master May 11, 2022
@belforte
Copy link
Member

thanks for keeping us in the loop. Appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for TOKEN based authentication
6 participants