Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Github Actions CI workflow #3339

Merged
merged 4 commits into from
Sep 4, 2019
Merged

Introduce Github Actions CI workflow #3339

merged 4 commits into from
Sep 4, 2019

Conversation

siggy
Copy link
Member

@siggy siggy commented Aug 28, 2019

The existing Travis CI setup requires additional integrations and
permissions with Github, and also lacks some flexibility around job
dependency management.

Introduce a new CI workflow based on Github Actions. This initial
workflow performs the same CI work that Travis does, and will iniitially
run in parallel:

  • Go unit tests
  • JS unit tests
  • Go lint
  • Validate Go deps
  • Integration tests (deep, upgrade, helm)

Signed-off-by: Andrew Seigner [email protected]

@siggy siggy self-assigned this Aug 28, 2019
@siggy siggy force-pushed the siggy/action-time branch from 1e213cc to 6bd874c Compare August 28, 2019 19:51
Copy link
Member

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siggy This looks awesome! To confirm: we'll continue to rely on Travis for artifact publishing until we change that process?

@siggy
Copy link
Member Author

siggy commented Aug 28, 2019

@olix0r Correct re: artifact publishing. For the moment Travis will continue to handle that. I'd like to get Github Actions in place just for pull request CI, then later take on artifact publishing and release automation.

Copy link
Contributor

@grampelberg grampelberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really just a bunch of questions, this looks great. Super easy to understand what's going on and I love the kind/remote docker implementation.

- name: Checkout code
uses: actions/checkout@v1
- name: ENV
run: env | sort
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this leak secrets?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The actions UI obscures them. Also that line was for debugging, I've removed it.

steps:
- name: Checkout code
uses: actions/checkout@v1
- name: Yarn setup
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to get rid of this step, would we need to use an image with yarn already installed? Build cache it? Copy files from a tarball?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah any of those approaches would work. We run this command today in Travis, so mostly looking to match that setup before optimizing.

run: |
export PATH="$HOME/.yarn/bin:$PATH"
export NODE_ENV=test
bin/web
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this just came over from the existing tooling, but should it be more explicit what's being done? Maybe bin/web setup && bin/web test? I don't think a build is required and it might speed things up some.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's correct this is cribbed from Travis. I'm totally down to optimize further, but for this PR I mostly want to get it into the pipeline as-is.

DOCKER_HOST_PRIVATE_KEY: ${{ secrets.DOCKER_HOST_PRIVATE_KEY }}
run: |
mkdir -p ~/.ssh/
echo "$DOCKER_HOST_PRIVATE_KEY" > ~/.ssh/id_rsa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This'll put the full key in the logs, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I was using a Makefile and that auto-expanded this for reasons. Ignore me!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, steps within a job shares the same workspace filesystem. If we ever need to add -x, we can use shell to write the secrets to files on the filesystem, and refer them (e.g. scp -i) in subsequent steps.

DOCKER_HOST_PRIVATE_KEY: ${{ secrets.DOCKER_HOST_PRIVATE_KEY }}
run: |
mkdir -p ~/.ssh/
echo "$DOCKER_HOST_PRIVATE_KEY" > ~/.ssh/id_rsa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

env:
DOCKER_ADDRESS: ${{ secrets.DOCKER_ADDRESS }}
DOCKER_HOST_PRIVATE_KEY: ${{ secrets.DOCKER_HOST_PRIVATE_KEY }}
run: |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to make this step more common? Is there a sandbox shared between all the steps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's unfortunate we're doing this Docker SSH setup step across 4 different jobs. Each job runs in a separate VM so not great shared option.

From a performance perspective, this step only takes 1 second.

From a maintenance perspective, I agree it's less than ideal. We solved this in Travis using YAML aliases, but this is not supported in Github Actions.

run: |
TAG="$(CI_FORCE_CLEAN=1 bin/root-tag)"
export KIND_CLUSTER=github-$TAG-${{ matrix.integration_test }}
bin/kind delete cluster --name=$KIND_CLUSTER
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the kind cluster doesn't exist, does this succeed? Or, does it even matter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the cluster doesn't exist, this command will fail. This is probably fine because if we've gotten this far and the cluster does not exist, a previous job must have failed.

@l5d-bot
Copy link
Collaborator

l5d-bot commented Aug 28, 2019

Integration test results for 57503f8: success 🎉
Log output: https://gist.github.com/3e6d41f156c88efe0149478d6343bb61

env:
GITCOOKIE_SH: ${{ secrets.GITCOOKIE_SH }}
run: |
echo "$GITCOOKIE_SH" | bash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also had this in Travis, but I've never known what it is for...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question! It's to mitigate rate-limiting when pulling go dependencies:
golang/go#12933 (comment)

The good news is we should be able to remove this when Go 1.13 lands, as Google will run a module mirror:
https://proxy.golang.org/

kind_setup:
strategy:
matrix:
integration_test: [deep, upgrade, helm]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😎

Copy link
Member

@alpeb alpeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks awesome 🥇

A few ideas about dependencies:

  • Can we have go_unit_tests and js_unit_tests depend on the success of validate_go_deps and go_lint (these being fast, it shouldn't augment too much the overall time)?
  • docker_build could also depend on go_unit_tests and js_unit_tests .

On the other hand, do you know how this interacts with forks?

@siggy
Copy link
Member Author

siggy commented Aug 29, 2019

@alpeb Re: Dependencies between jobs, I'd like to avoid introducing dependencies unless a job actually depends on a previous job. The goal is to minimize the total time for all jobs to complete, and also make clear what jobs depends on other jobs. While validate_go_deps takes only ~8s, go_lint takes ~4m, and go_unit_tests takes ~5m.

Fortunately the Github PR and Actions UIs provide feedback when individual jobs fail, prior to the full workflow completing, so the user gets feedback asap.

For reference, right now the critical path is:
[docker_build | kind_setup] -> kind_integration -> kind_cleanup

.github/workflows/workflow.yml Outdated Show resolved Hide resolved
.github/workflows/workflow.yml Show resolved Hide resolved
DOCKER_ADDRESS: ${{ secrets.DOCKER_ADDRESS }}
DOCKER_HOST: ssh://github@${{ secrets.DOCKER_ADDRESS }}
GITCOOKIE_SH: ${{ secrets.GITCOOKIE_SH }}
run: |
Copy link
Contributor

@ihcsim ihcsim Aug 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about the error reporting in run. So if e.g., the scp command fail, does run terminate immediately? And does the UI show which shell command fail? Or do we need -e? Setting -x is probably not a good idea here, because of the secret env var.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fail fast is on by default:
https://help.github.com/en/articles/workflow-syntax-for-github-actions#jobsjob_idstepsrunshell

I believe the UI highlights which command fails.

DOCKER_HOST_PRIVATE_KEY: ${{ secrets.DOCKER_HOST_PRIVATE_KEY }}
run: |
mkdir -p ~/.ssh/
echo "$DOCKER_HOST_PRIVATE_KEY" > ~/.ssh/id_rsa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, steps within a job shares the same workspace filesystem. If we ever need to add -x, we can use shell to write the secrets to files on the filesystem, and refer them (e.g. scp -i) in subsequent steps.

@siggy
Copy link
Member Author

siggy commented Aug 29, 2019

@ihcsim Agree good to highlight the same workspace filesystem feature between steps in a job. We are taking advantage of this feature in the Docker SSH setup, but the more, the better.

Copy link
Contributor

@ihcsim ihcsim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@siggy
Copy link
Member Author

siggy commented Aug 29, 2019

Filed related issues in the actions/checkout repo and Github Community forums:
actions/checkout#27
https://github.521000.bestmunity/t5/GitHub-API-Development-and/Github-Actions-Inconsistent-repo-checkouts-across-jobs/td-p/30258

Sort of relates to this issue in actions/checkout:
actions/checkout#15

@siggy siggy force-pushed the siggy/action-time branch 2 times, most recently from 172a3e7 to e859815 Compare August 29, 2019 22:14
The existing Travis CI setup requires additional integrations and
permissions with Github, and also lacks some flexibility around job
dependency management.

Introduce a new CI workflow based on Github Actions. This initial
workflow performs the same CI work that Travis does, and will iniitially
run in parallel:
- Go unit tests
- JS unit tests
- Go lint
- Validate Go deps
- Integration tests (deep, upgrade, helm)

Signed-off-by: Andrew Seigner <[email protected]>
Signed-off-by: Andrew Seigner <[email protected]>
Signed-off-by: Andrew Seigner <[email protected]>
@siggy siggy force-pushed the siggy/action-time branch from e859815 to d3c0efe Compare September 3, 2019 23:24
@siggy siggy merged commit 4f71b52 into master Sep 4, 2019
@siggy siggy deleted the siggy/action-time branch September 4, 2019 00:11
siggy added a commit that referenced this pull request Sep 4, 2019
PR #3339 introduced a GitHub Actions CI workflow.

Add a badge to the top of README.md to report status of the CI workflow.

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Sep 4, 2019
PR #3339 introduced a GitHub Actions CI workflow. Booting 6 clusters
simultaneously (3x Github Actions + 3x Travis) exhibits some transient
failures.

Retry kind cluster creation once
Retry log reading from integration k8s clusters once
Add kind cluster creation debug logging
Add a badge to the top of README.md to report status of the CI workflow.

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Sep 4, 2019
PR #3339 introduced a GitHub Actions CI workflow. Booting 6 clusters
simultaneously (3x Github Actions + 3x Travis) exhibits some transient
failures.

Implement fixes in GitHub Actions and integration tests to address kind
cluster creation and testing:
- Retry kind cluster creation once.
- Retry log reading from integration k8s clusters once.
- Add kind cluster creation debug logging.
- Add a GitHub Actions status badge to the top of README.md

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Sep 4, 2019
PR #3339 introduced a GitHub Actions CI workflow. Booting 6 clusters
simultaneously (3x Github Actions + 3x Travis) exhibits some transient
failures.

Implement fixes in GitHub Actions and integration tests to address kind
cluster creation and testing:
- Retry kind cluster creation once.
- Retry log reading from integration k8s clusters once.
- Add kind cluster creation debug logging.
- Add a GitHub Actions status badge to the top of `README.md`.

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Sep 4, 2019
PR #3339 introduced a GitHub Actions CI workflow. Booting 6 clusters
simultaneously (3x Github Actions + 3x Travis) exhibits some transient
failures.

Implement fixes in GitHub Actions and integration tests to address kind
cluster creation and testing:
- Retry kind cluster creation once.
- Retry log reading from integration k8s clusters once.
- Add kind cluster creation debug logging.
- Add a GitHub Actions status badge to the top of `README.md`.

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Sep 4, 2019
PR #3339 introduced a GitHub Actions CI workflow. Booting 6 clusters
simultaneously (3x Github Actions + 3x Travis) exhibits some transient
failures.

Implement fixes in GitHub Actions and integration tests to address kind
cluster creation and testing:
- Retry kind cluster creation once.
- Retry log reading from integration k8s clusters once.
- Add kind cluster creation debug logging.
- Add a GitHub Actions status badge to the top of `README.md`.

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Sep 4, 2019
PR #3339 introduced a GitHub Actions CI workflow. Booting 6 clusters
simultaneously (3x Github Actions + 3x Travis) exhibits some transient
failures.

Implement fixes in GitHub Actions and integration tests to address kind
cluster creation and testing:
- Retry kind cluster creation once.
- Retry log reading from integration k8s clusters once.
- Add kind cluster creation debug logging.
- Add a GitHub Actions status badge to top of `README.md`.

Signed-off-by: Andrew Seigner <[email protected]>
siggy added a commit that referenced this pull request Sep 4, 2019
PR #3339 introduced a GitHub Actions CI workflow. Booting 6 clusters
simultaneously (3x Github Actions + 3x Travis) exhibits some transient
failures.

Implement fixes in GitHub Actions and integration tests to address kind
cluster creation and testing:
- Retry kind cluster creation once.
- Retry log reading from integration k8s clusters once.
- Add kind cluster creation debug logging.
- Add a GitHub Actions status badge to top of `README.md`.

Signed-off-by: Andrew Seigner <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants