Upstream revert revert auth token fix #5407

Merged
eapolinario merged 15 commits into master from upstream-revert-revert-auth-token-fix on May 31, 2024

pmahindrakar-oss (Contributor) commented May 22, 2024

Why are the changes needed?

Fixes race condition in token auth cache #5387

What changes were proposed in this pull request?

Fixes the racy behavior in the auth token cache using mutexes and condition variables.

In a nutshell, the change does the following:

  • All parallel token requesters try to acquire a lock.
  • One requester gets the lock; the rest are waitlisted and wait for a notification from the requester that acquired the lock.
  • The requester that acquired the lock refreshes the token and saves it to the token cache.
  • It then notifies the waitlisted requesters, which read the saved token from the cache and proceed with their requests.
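
The steps above can be sketched in Go with `sync.Mutex` and `sync.Cond`. This is a minimal illustration under stated assumptions, not the code merged in this PR: the type `tokenCache`, the helpers `refreshOnce` and `runParallel`, and the literal `fresh-token` are hypothetical names, and the sketch tracks the in-flight refresh with a `refreshing` flag guarded by the mutex rather than the try-lock calls that appear in the test logs.

```go
package main

import (
	"fmt"
	"sync"
)

// tokenCache is a hypothetical in-memory cache used only for illustration.
type tokenCache struct {
	mu         sync.Mutex
	cond       *sync.Cond
	token      string
	refreshing bool // true while one requester is fetching a new token
	refreshes  int  // number of refreshes performed, for inspection
}

func newTokenCache() *tokenCache {
	c := &tokenCache{}
	c.cond = sync.NewCond(&c.mu)
	return c
}

// refreshOnce is called by a requester whose token `stale` was rejected.
// Exactly one caller runs fetch; the others wait and reuse its result.
func (c *tokenCache) refreshOnce(stale string, fetch func() string) string {
	c.mu.Lock()
	defer c.mu.Unlock()

	// Another requester already replaced the stale token: just read it.
	if c.token != stale {
		return c.token
	}
	// A refresh is in flight: join the waitlist until it is broadcast.
	if c.refreshing {
		for c.refreshing {
			c.cond.Wait() // releases mu while waiting, reacquires on wake-up
		}
		return c.token
	}

	// This requester becomes the refresher.
	c.refreshing = true
	c.mu.Unlock() // do not hold the lock during the (slow) fetch
	tok := fetch()
	c.mu.Lock()

	c.token = tok
	c.refreshing = false
	c.refreshes++
	c.cond.Broadcast() // a single notification wakes every waiter
	return c.token
}

// runParallel starts n requesters that all hold the same stale (empty) token.
func runParallel(n int) ([]string, int) {
	c := newTokenCache()
	fetch := func() string { return "fresh-token" } // stand-in for a real token exchange

	tokens := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			tokens[i] = c.refreshOnce("", fetch)
		}(i)
	}
	wg.Wait()
	return tokens, c.refreshes
}

func main() {
	tokens, refreshes := runParallel(20)
	fmt.Println("refreshes:", refreshes, "first token:", tokens[0])
}
```

Waiting in a loop on the `refreshing` flag means a requester that arrives after the broadcast never blocks: it sees the updated token and returns immediately, so a missed notification cannot leave it stuck.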

The test plan below summarizes how this was tested

How was this patch tested?

This was tested locally using an integration test, run with 20 parallel routines:

go test -v -tags=integration -run '^TestAPISuite$/^Test_RandomNameWorkflow$'
  • All routines get an unauthenticated error on their first attempt with the in-memory cached token provider.
cat t | grep "If it's an unauthenticated error, we will attempt to establish an authenticated context" | wc -l
      20
  • All routines try to lock.
cat t | grep "try lock" | wc -l
      20
  • One of the routines acquires the lock.
cat t | grep "Locked : true" | wc -l
       1
  • The rest fail to lock.
cat t | grep "Locked : false" | wc -l
      19
  • The ones that fail to lock go into a waiting state, i.e. they are added to the condition variable's waitlist.
cat t | grep "Waiting" | grep -v "Coming" | wc -l
      19

Example log line, for reference:

{"json":{"src":"token_cache_inmemory.go:77"},"level":"info","msg":"Coming out of Waiting","ts":"2024-05-22T23:52:47-07:00"}
  • The routine that holds the lock saves the token.
cat t | grep "Saved" | wc -l
       1
  • It then broadcasts a single notification that wakes all the waiters.
cat t | grep "Broadcasted" | wc -l
       1
  • The waiters then come out of waiting and go on to read the token.
cat t | grep "Coming out of Waiting" | wc -l
      19

Log file for reference

t.txt

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link


codecov bot commented May 22, 2024

Codecov Report

Attention: Patch coverage is 34.61538% with 51 lines in your changes missing coverage. Please review.

Project coverage is 61.07%. Comparing base (1e61f4e) to head (595de20).
Report is 135 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| flytectl/pkg/pkce/token_cache_keyring.go | 6.25% | 30 Missing ⚠️ |
| flyteidl/clients/go/admin/auth_interceptor.go | 48.27% | 10 Missing and 5 partials ⚠️ |
| ...admin/tokenorchestrator/base_token_orchestrator.go | 55.55% | 2 Missing and 2 partials ⚠️ |
| flyteidl/clients/go/admin/token_source_provider.go | 33.33% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5407      +/-   ##
==========================================
- Coverage   61.09%   61.07%   -0.03%     
==========================================
  Files         793      793              
  Lines       51169    51226      +57     
==========================================
+ Hits        31264    31288      +24     
- Misses      17033    17062      +29     
- Partials     2872     2876       +4     
| Flag | Coverage Δ |
| --- | --- |
| unittests-datacatalog | 69.31% <ø> (ø) |
| unittests-flyteadmin | 58.90% <ø> (+0.04%) ⬆️ |
| unittests-flytecopilot | 17.79% <ø> (ø) |
| unittests-flytectl | 67.97% <16.66%> (-0.38%) ⬇️ |
| unittests-flyteidl | 79.04% <50.00%> (-0.26%) ⬇️ |
| unittests-flyteplugins | 61.94% <ø> (ø) |
| unittests-flytepropeller | 57.32% <ø> (ø) |
| unittests-flytestdlib | 65.82% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.


@pmahindrakar-oss pmahindrakar-oss force-pushed the upstream-revert-revert-auth-token-fix branch 2 times, most recently from b806d5f to 0c14c31 Compare May 23, 2024 07:18
pmahindrakar-oss and others added 13 commits May 31, 2024 15:58
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: Erwin de Haan <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: Jason Parraga <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: Chi-Sheng Liu <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: Samhita Alla <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>
@pmahindrakar-oss pmahindrakar-oss force-pushed the upstream-revert-revert-auth-token-fix branch from 10655a4 to 144cf30 Compare May 31, 2024 22:58
eapolinario previously approved these changes May 31, 2024
eapolinario previously approved these changes May 31, 2024
Signed-off-by: pmahindrakar-oss <[email protected]>
@eapolinario eapolinario enabled auto-merge (squash) May 31, 2024 23:37
@eapolinario eapolinario merged commit f08eb47 into master May 31, 2024
50 of 53 checks passed
@eapolinario eapolinario deleted the upstream-revert-revert-auth-token-fix branch May 31, 2024 23:52
robert-ulbrich-mercedes-benz pushed a commit to robert-ulbrich-mercedes-benz/flyte that referenced this pull request Jul 2, 2024
* Revert "Revert "Ensure token is refreshed on Unauthenticated (flyteorg#5388)" (flyteorg#5404)"

This reverts commit 7d2f0d0.

Signed-off-by: pmahindrakar-oss <[email protected]>

* Using same mutex for condition variable

Signed-off-by: pmahindrakar-oss <[email protected]>

* Lock the locker in the wait to adher to cond.Wait() semantics

Signed-off-by: pmahindrakar-oss <[email protected]>

* comments

Signed-off-by: pmahindrakar-oss <[email protected]>

* using noop locker as waitlist add is atomic operation

Signed-off-by: pmahindrakar-oss <[email protected]>

* Replace Azure AD OIDC URL with correct one (flyteorg#4075)

Signed-off-by: Erwin de Haan <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>

* Update the example Dockerfile to run on k8s (flyteorg#5412)

Signed-off-by: Jason Parraga <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>

* docs(kubeflow): Fix kubeflow webhook error (flyteorg#5410)

Signed-off-by: Chi-Sheng Liu <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>

* update flytekit version to 1.12.1b2 in monodocs requirements (flyteorg#5411)

Signed-off-by: Samhita Alla <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>

* Add supported task types to agent service config and rename (flyteorg#5402)

Signed-off-by: Jason Parraga <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>

* update lock file (flyteorg#5416)

Signed-off-by: Samhita Alla <[email protected]>
Signed-off-by: pmahindrakar-oss <[email protected]>

* [monorepo] Fix flytectl install script (flyteorg#5405)

Signed-off-by: pmahindrakar-oss <[email protected]>

* bring in changes for flytecl keyring from PR flytectl/pull/488

Signed-off-by: pmahindrakar-oss <[email protected]>

* typo fix

Signed-off-by: pmahindrakar-oss <[email protected]>

---------

Signed-off-by: pmahindrakar-oss <[email protected]>
Signed-off-by: Erwin de Haan <[email protected]>
Signed-off-by: Jason Parraga <[email protected]>
Signed-off-by: Chi-Sheng Liu <[email protected]>
Signed-off-by: Samhita Alla <[email protected]>
Co-authored-by: Erwin de Haan <[email protected]>
Co-authored-by: Jason Parraga <[email protected]>
Co-authored-by: Chi-Sheng Liu <[email protected]>
Co-authored-by: Samhita Alla <[email protected]>
Co-authored-by: Eduardo Apolinario <[email protected]>
6 participants