Topics
Tech Radar - further assessment discussions from Tuesday’s Arch Study Group: Tech Radar
Health/monitoring of CI / CD pipelines - impact on squads? +++
https://openedx.atlassian.net/wiki/spaces/ENG/pages/1789526017/Operating%2BReviews
edx-platform
CI to master: ~40 min
merge → stage → e2e → prod: ~90 min
blocked on previous deploy
monitoring CD
Splunk, though it can be unreliable
impact on squads
e-commerce: CI is long here as well (~40 min); CD is much better, though, since it is run manually after ensuring e2e tests pass
ORA: not a pain-point right now
edx-platform: couldn’t give stakeholders a good estimate because of uncertainty around rollbacks, etc.
Actual impact on a team: no code merges after 3pm on weekdays, no code merges after 12pm on Fridays
Note:
Since CI/CD takes about 2 hours (~40 min CI plus ~90 min merge-to-prod), we can’t reasonably fix forward. We need to roll back, along with all other concurrent changes.
Separate paths and testing considerations for infrastructure-level changes versus feature changes
feature changes: can use toggles to decouple release from enablement/monitoring in prod
infrastructure changes: can use canary releases for controlled testing, kept separate from the normal pipeline so that the pipeline keeps flowing
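A minimal sketch of the feature-toggle idea above: the new code path ships to prod switched off, and enabling it later is a data change rather than a deploy. The toggle store, the toggle name, and the function names are hypothetical and exist only for illustration. A real setup would presumably read toggles from a database or config service, and canary releases are not covered here.

```python
# Hypothetical in-memory toggle store (an assumption for illustration).
# In practice this would live somewhere that can change without a deploy.
TOGGLES = {"new_grading_ui": False}


def is_enabled(name):
    """Return the toggle state; unknown toggles default to off."""
    return TOGGLES.get(name, False)


def render_grading_page():
    """Pick the code path at request time, based on the toggle."""
    if is_enabled("new_grading_ui"):
        return "new grading UI"
    return "legacy grading UI"
```

With this shape, a problem found after enablement can be handled by flipping the toggle back off, which avoids waiting on the ~2-hour pipeline and avoids rolling back other teams' changes.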
Django Signals + best practices! +
https://discuss.openedx.org/t/hooks-based-extension-of-open-edx/2867
Event propagation (e.g., User Profile changed)
Django Signals → All other monolith apps and plugins
Message Bus → All other IDAs & Websocket service & external services (via xAPI/Caliper)
Websocket notification → MFEs
MFE React Hooks → Frontend components
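The first two hops of the chain above can be sketched roughly as follows. This is a plain-Python stand-in so that it runs without Django: the `connect`/`send` helpers, the event name, and both receivers are hypothetical. Inside the monolith, the dispatcher role would be played by Django Signals (`django.dispatch.Signal`), and the list used here as a "message bus" would be a real broker.

```python
from collections import defaultdict

# Hypothetical in-process dispatcher standing in for Django Signals.
_receivers = defaultdict(list)


def connect(event_name, receiver):
    """Register a receiver to run whenever event_name is sent."""
    _receivers[event_name].append(receiver)


def send(event_name, **payload):
    """Call every receiver registered for event_name, in registration order."""
    return [receiver(**payload) for receiver in _receivers[event_name]]


# Hypothetical stand-ins: state owned by another monolith app, and the
# message bus that other IDAs and services would consume from.
profile_cache = {}
message_bus = []


def refresh_profile_cache(user_id, changed_fields):
    """In-process receiver: another monolith app reacting to the event."""
    profile_cache[user_id] = sorted(changed_fields)


def forward_to_message_bus(user_id, changed_fields):
    """In-process receiver that republishes the event for out-of-process consumers."""
    message_bus.append({
        "type": "user_profile_changed",
        "user_id": user_id,
        "changed_fields": sorted(changed_fields),
    })


connect("user_profile_changed", refresh_profile_cache)
connect("user_profile_changed", forward_to_message_bus)

# The code that changes the profile only sends the event; it does not
# need to know which receivers exist.
send("user_profile_changed", user_id=42, changed_fields=["name"])
```

The point of the sketch is the decoupling: the sender does not reference its receivers, and one of the in-process receivers is what bridges the event onto the bus for everything outside the monolith.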
ChangeLogs - what are they? ++
https://xkcd.com/1296/ ← example of why we don’t just use the git commit history
Backlog of Questions/Discussions
This section lists a backlog of previously proposed topics that haven’t yet been discussed.
Recent MFE docs
Changelogs
If you are doing them, what would make it easier?
If you aren’t doing them, what would help get you started?