...

  • Tech Radar - further assessment discussions from Tuesday’s Arch Study Group: Tech Radar

  • Health/monitoring of CI/CD pipelines - impact on squads? +++

    • https://openedx.atlassian.net/wiki/spaces/ENG/pages/1789526017/Operating%2BReviews

    • edx-platform

      • CI to master: ~40 min

      • merge → stage → e2e → prod: ~90 min

        • blocked on previous deploy

    • monitoring CD

      • Splunk - though it could be unreliable

    • impact on squads

      • e-commerce: CI is long here as well (~40 min), but CD is much better; CD is triggered manually after ensuring e2e tests pass

      • ORA: not a pain-point right now

      • edx-platform: couldn’t give stakeholders a good estimate because of uncertainty around rollbacks, etc.

      • Actual impact on a team: no code merges after 3pm on weekdays, no code merges after 12pm on Fridays

      • Note:

        • Since CI/CD takes about 2 hours, we can’t reasonably fix forward. We need to roll back, which also rolls back all other concurrent changes.

        • Separate paths and testing considerations for infrastructure-level changes versus feature changes

          • feature changes: can use toggles to decouple release from enablement/monitoring in prod

          • infrastructure changes: can use canary releases for controlled testing - kept separate from the normal pipeline so that the pipeline keeps flowing

  • Django Signals + best practices! +

  • ChangeLogs - what are they? ++
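
The note above about using toggles to decouple release from enablement can be illustrated with a minimal sketch. This is not the toggle mechanism used by edx-platform or any other service mentioned here; the `TOGGLES` store, the toggle name `new_checkout_flow`, and the helper functions are all hypothetical and exist only to show the idea: code ships dark through the normal pipeline and is switched on (optionally for a percentage of users) without another deploy.

```python
import hashlib

# Hypothetical in-memory toggle store (an assumption for illustration);
# a real service would load these values from configuration or a database.
TOGGLES = {
    "new_checkout_flow": {"enabled": True, "rollout_percent": 10},
}


def bucket_for(user_id: str, toggle_name: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given toggle."""
    key = f"{toggle_name}:{user_id}".encode("utf-8")
    return int(hashlib.sha256(key).hexdigest(), 16) % 100


def is_enabled(toggle_name: str, user_id: str, toggles: dict = TOGGLES) -> bool:
    """Return True if the toggle is on for this user.

    Unknown or disabled toggles are treated as off, so merged code stays
    dark in production until the toggle is explicitly enabled.
    """
    config = toggles.get(toggle_name)
    if config is None or not config.get("enabled", False):
        return False
    # Users whose bucket falls below the rollout percentage see the feature.
    return bucket_for(user_id, toggle_name) < config.get("rollout_percent", 100)
```

With this shape, a problem found in production can be handled by setting `enabled` to `False` rather than waiting on a roughly 2-hour rollback, and because buckets are derived from a hash, a given user gets a consistent answer across requests.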

Backlog of Questions/Discussions

...