E2E tests failing on Boskos #534

Closed
dibyom opened this issue Aug 19, 2020 · 7 comments · Fixed by #535
Labels
area/boskos (Issues or PRs related to code in /boskos), area/test-infra (Issues or PRs related to the testing infrastructure)

Comments

@dibyom
Member

dibyom commented Aug 19, 2020

The tests are all failing with:

==============================================
==== CREATING TEST CLUSTER IN US-CENTRAL1 ====
==============================================
2020/08/19 08:23:54 util.go:142: Please use kubetest --gcp-service-account=/etc/test-account/service-account.json (instead of deprecated GOOGLE_APPLICATION_CREDENTIALS=/etc/test-account/service-account.json)
2020/08/19 08:23:54 main.go:725: --gcp-project is missing, trying to fetch a project from boskos.
(for local runs please set --gcp-project to your dev project)
2020/08/19 08:23:54 main.go:737: provider gke, will acquire project type gke-project from boskos
2020/08/19 08:28:54 main.go:319: Something went wrong: failed to prepare test environment: --provider=gke boskos failed to acquire project: resources not found

This error message seems similar to #186

Example PR: tektoncd/triggers#720

Prow dashboard: https://prow.tekton.dev/?job=*integration*&state=failure

@dibyom
Member Author

dibyom commented Aug 19, 2020

Looking at Boskos logs:

 "failed to clean up project tekton-prow-10, error info: Activated service account credentials for: [[email protected]]
ERROR: (gcloud.logging.sinks.delete) PERMISSION_DENIED: The caller does not have permission
ERROR: (gcloud.logging.sinks.delete) PERMISSION_DENIED: The caller does not have permission
Error try to delete resources sinks: CalledProcessError()
Error try to delete resources sinks: CalledProcessError()
ERROR: (gcloud.container.clusters.list) ResponseError: code=404, message=Not Found.
ERROR: (gcloud.container.clusters.list) ResponseError: code=403, message=Kubernetes Engine API (Staging2) has not been used in project 574248271492 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/staging2-container.sandbox.googleapis.com/overview?project=574248271492 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.
[=== Start Janitor on project 'tekton-prow-10' ===]
[=== Activating service_account /etc/test-account/service-account.json ===]
[=== Finish Janitor on project 'tekton-prow-10' with status 2 ===]

@dibyom
Member Author

dibyom commented Aug 19, 2020

failed to clean up project tekton-prow-7, error info: Activated service account credentials for: [[email protected]]
ERROR: (gcloud.logging.sinks.delete) PERMISSION_DENIED: The caller does not have permission
ERROR: (gcloud.logging.sinks.delete) PERMISSION_DENIED: The caller does not have permission
Error try to delete resources sinks: CalledProcessError()
Error try to delete resources sinks: CalledProcessError()
ERROR: (gcloud.container.clusters.list) ResponseError: code=404, message=Not Found.
[=== Start Janitor on project 'tekton-prow-7' ===]
[=== Activating service_account /etc/test-account/service-account.json ===]
[=== Finish Janitor on project 'tekton-prow-7' with status 2 ===]
" 
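To see at a glance which gcloud calls the janitor is tripping over, the output can be tallied by command and error kind. The following is a rough sketch, not part of Boskos: the function name `tally_gcloud_errors` and the regular expression are illustrative assumptions, and the sample text is an excerpt of the tekton-prow-7 log quoted in this thread.

```python
import re
from collections import Counter

# Excerpt copied from the janitor output quoted in this thread.
JANITOR_LOG = """\
ERROR: (gcloud.logging.sinks.delete) PERMISSION_DENIED: The caller does not have permission
ERROR: (gcloud.logging.sinks.delete) PERMISSION_DENIED: The caller does not have permission
Error try to delete resources sinks: CalledProcessError()
Error try to delete resources sinks: CalledProcessError()
ERROR: (gcloud.container.clusters.list) ResponseError: code=404, message=Not Found.
[=== Finish Janitor on project 'tekton-prow-7' with status 2 ===]
"""

# Hypothetical pattern: matches lines shaped like
# "ERROR: (gcloud.logging.sinks.delete) PERMISSION_DENIED: ..."
ERROR_LINE = re.compile(r"^ERROR: \((gcloud\.[\w.]+)\) (\w+)")


def tally_gcloud_errors(log_text):
    """Count (gcloud command, error kind) pairs found in janitor output."""
    counts = Counter()
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    for (command, kind), n in tally_gcloud_errors(JANITOR_LOG).most_common():
        print(f"{n}x {command}: {kind}")
```

On this excerpt the sink deletion failure shows up twice and the cluster listing failure once, which matches the pattern visible in both project logs.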

@dibyom dibyom changed the title from "Triggers E2E tests failing on Boskos" to "Tekton E2E tests failing on Boskos" Aug 19, 2020
@dibyom
Member Author

dibyom commented Aug 19, 2020

Looks like this is not limited to Triggers either:
https://prow.tekton.dev/?repo=tektoncd%2Fpipeline&job=pull-tekton-pipeline-integration-tests

@dibyom
Member Author

dibyom commented Aug 19, 2020

I enabled the API named in the first Boskos error message by following the link it gives:
https://console.developers.google.com/apis/api/staging2-container.sandbox.googleapis.com/overview?project=574248271492

@dibyom dibyom changed the title from "Tekton E2E tests failing on Boskos" to "E2E tests failing on Boskos" Aug 19, 2020
@dibyom dibyom transferred this issue from tektoncd/triggers Aug 19, 2020
@dibyom dibyom added the area/boskos and area/test-infra labels Aug 19, 2020
@dibyom
Member Author

dibyom commented Aug 19, 2020

I found kubernetes/test-infra#18897 and kubernetes/test-infra#18897 which seem to be related.

I ran gcloud services enable serviceusage.googleapis.com --project=tekton-releases following kubernetes/test-infra#18897 (comment)

@dibyom
Member Author

dibyom commented Aug 19, 2020

I'm going to update the boskos image versions to the latest ones

kubectl -n test-pods port-forward service/boskos 7000:80

Current metrics:

curl 'localhost:7000/metric?type=gke-project'
{"type":"gke-project","current":{"cleaning":15},"owner":{"Janitor":15}}

https://github.com/kubernetes/k8s.io/pull/1161/files

@dibyom
Member Author

dibyom commented Aug 19, 2020

New image + manually enabling serviceusage seems to have helped:

{"type":"gke-project","current":{"busy":2,"cleaning":12,"free":1},"owner":{"":1,"Janitor":12,"pull-tekton-pipeline-integration-tests":2}}
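For comparing the two metric snapshots, a small script can reduce each payload to per-state counts. This is only a sketch: the helper names `summarize` and `can_acquire` are made up for illustration, and it assumes a job can only acquire a project that is in the `free` state. Both payloads are copied from the curl output in this thread.

```python
import json

# Payloads copied from the /metric?type=gke-project responses in this thread.
BEFORE = '{"type":"gke-project","current":{"cleaning":15},"owner":{"Janitor":15}}'
AFTER = (
    '{"type":"gke-project","current":{"busy":2,"cleaning":12,"free":1},'
    '"owner":{"":1,"Janitor":12,"pull-tekton-pipeline-integration-tests":2}}'
)


def summarize(payload):
    """Reduce a metric payload to total, free, busy, and cleaning counts."""
    states = json.loads(payload).get("current", {})
    return {
        "total": sum(states.values()),
        "free": states.get("free", 0),
        "busy": states.get("busy", 0),
        "cleaning": states.get("cleaning", 0),
    }


def can_acquire(payload):
    """Assumption: a job can get a project only if at least one is free."""
    return summarize(payload)["free"] > 0


if __name__ == "__main__":
    for label, payload in (("before", BEFORE), ("after", AFTER)):
        print(label, summarize(payload), "acquirable:", can_acquire(payload))
```

In the first snapshot all 15 projects are stuck in `cleaning`, which is consistent with the "resources not found" failure after the five-minute wait in the kubetest log.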

dibyom added a commit to dibyom/plumbing that referenced this issue Aug 19, 2020
Fixes tektoncd#534

Signed-off-by: Dibyo Mukherjee <[email protected]>
@dibyom dibyom self-assigned this Aug 19, 2020
dibyom added a commit to dibyom/plumbing that referenced this issue Aug 20, 2020
The current boskos janitor was failing to clean up projects which seems to be
due to GCP adding two new logging sinks that cannot be deleted. The new boskos
image fixes this. See kubernetes-sigs/boskos#37 for more details.

Fixes tektoncd#534

Signed-off-by: Dibyo Mukherjee <[email protected]>
tekton-robot pushed a commit that referenced this issue Aug 20, 2020
The current boskos janitor was failing to clean up projects which seems to be
due to GCP adding two new logging sinks that cannot be deleted. The new boskos
image fixes this. See kubernetes-sigs/boskos#37 for more details.

Fixes #534

Signed-off-by: Dibyo Mukherjee <[email protected]>
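As an illustration of the idea in the commit message, a cleanup pass could skip sinks it is known to be unable to delete instead of failing on them. This is a hedged sketch and not the actual Boskos patch. It assumes the two sinks in question are the GCP-managed `_Required` and `_Default` sinks (the thread does not name them), and `deletable_sinks` is a hypothetical helper.

```python
# Assumption: the two undeletable sinks are GCP's managed "_Required" and
# "_Default" sinks. The thread itself does not name them.
PROTECTED_SINKS = frozenset({"_Required", "_Default"})


def deletable_sinks(sink_names):
    """Return the sink names a cleanup pass should attempt to delete.

    Accepts short names ("_Default") or full resource names
    ("projects/<project>/sinks/_Default") and compares the last path segment.
    """
    return [
        name
        for name in sink_names
        if name.rsplit("/", 1)[-1] not in PROTECTED_SINKS
    ]


if __name__ == "__main__":
    found = ["_Required", "_Default", "my-export-sink"]
    print(deletable_sinks(found))
```

Filtering before calling the delete command keeps a permission error on a managed sink from causing the whole janitor run to exit with a non-zero status.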