edx-platform deployment pipeline/monitoring training #366

Open
2 tasks
davidjoy opened this issue Jul 21, 2023 · 5 comments

davidjoy commented Jul 21, 2023

A/C

  • Proposal for how to split edx-platform pipeline ownership to be distributed to engineers and engineering managers
    • Distribute said proposal

We want other squads to be able to help maintain and monitor the edx-platform deployment pipeline in order to reduce our on-call burden and upskill our colleagues.

This probably looks like:

  • Training sessions on how the deployment pipeline works
  • Making sure runbooks are up to date and socialized
  • Creating a new on-call rotation that loops in engineers from other teams over time
  • Likely some work to re-think what split-ownership of edx-platform means
@davidjoy davidjoy converted this from a draft issue Jul 21, 2023
@jmbowman jmbowman moved this to Prioritized in Arch-BOM Jul 24, 2023
@rgraber rgraber moved this from Prioritized to Groomed in Arch-BOM Aug 3, 2023
@robrap robrap moved this from Groomed to On-Call in Arch-BOM Aug 3, 2023
Contributor

robrap commented Aug 15, 2023

Additional thoughts:

  1. Who determines if the pipeline is fast enough and would invest in speeding it up?
  2. Who determines if the pipeline is stable enough and who would invest in improving flakiness?
  3. How do we determine common issues? Ideally this would be automated, and would not rely on various teams following some manual process for every failure to catalog issues.

Note that these issues are already bad with a single team and will become worse with split ownership, so it would be best if we could improve them for everyone before we split.

Contributor

robrap commented Aug 15, 2023

It's becoming clear to me that our current runbook has some deficiencies:

  1. Should the first step simply be to rerun any failed stage, given how many flaky issues we see?
  2. At what point do we ticket a flaky issue?
  3. How do we gather data on each flaky issue to know in what order they should be addressed? Which are happening the most in recent history?
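
One possible starting point for the data-gathering question, as a minimal sketch only: if each failed stage run were recorded along with whether a rerun passed, recent flaky failures could be tallied and ranked. The `StageFailure` record, its fields, and the 30-day default window are all assumptions made for illustration; nothing here reflects how the pipeline actually reports failures.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class StageFailure:
    """One failed pipeline stage run (hypothetical record shape)."""
    stage: str             # hypothetical stage name
    signature: str         # short normalized error text used to group failures
    failed_at: datetime    # when the stage failed
    passed_on_rerun: bool  # True suggests flakiness rather than a real break


def rank_flaky_issues(failures, now, window_days=30):
    """Count flaky failures per (stage, signature) in the recent window.

    Returns a list of ((stage, signature), count) pairs, most frequent first.
    Failures that did not pass on rerun are treated as real breaks and skipped.
    """
    cutoff = now - timedelta(days=window_days)
    counts = Counter(
        (f.stage, f.signature)
        for f in failures
        if f.passed_on_rerun and f.failed_at >= cutoff
    )
    return counts.most_common()
```

The ranked output would suggest an order in which to address flaky issues, and a count threshold on it could be one way to decide when an issue gets ticketed.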

@robrap robrap moved this from On-Call to In Progress in Arch-BOM Aug 15, 2023
@robrap robrap self-assigned this Aug 15, 2023
@robrap robrap moved this from In Progress to In Code Review in Arch-BOM Aug 17, 2023
Contributor

robrap commented Aug 17, 2023

Waiting on feedback or a Parking Lot discussion around maintenance.

Contributor

robrap commented Aug 21, 2023

After discussion in Parking Lot, we wondered:

  1. Do we need to improve stability of the pipeline before even trying this?
  2. Is moving to ArgoCD a prerequisite, so that the pipeline becomes more like others and other teams need less training on a different pipeline?
  3. Is this really worth having split ownership from a 2U perspective?

I'm marking this blocked for now, but we may end up closing it.

If we close this ticket, we should:

  • Archive proposal in Confluence.

@jmbowman: Should this remain blocked, or should we close it?

@robrap robrap moved this from In Code Review to Blocked in Arch-BOM Aug 21, 2023
Contributor

robrap commented Aug 24, 2023

We are moving this to the backlog until edx-platform is containerized and we've moved to ArgoCD; we can reassess then.

@jristau1984 jristau1984 moved this from Blocked to Backlog in Arch-BOM Jul 2, 2024
Projects
Status: Backlog
Development

No branches or pull requests

3 participants