...
Tech Radar - further assessment discussions from Tuesday’s Arch Study Group: Tech Radar
Health/monitoring of CI / CD pipelines - impact on squads? +++
https://openedx.atlassian.net/wiki/spaces/ENG/pages/1789526017/Operating%2BReviews
edx-platform
CI to master: ~40 min
merge → stage → e2e → prod: ~90 min
blocked on previous deploy
monitoring CD
Splunk, though it can be unreliable
impact on squads
e-commerce: CI is long here as well (~40 min); CD is much better, though it is triggered manually after confirming e2e tests pass
ORA: not a pain-point right now
edx-platform: couldn’t give stakeholders a good estimate because of uncertainty around rollbacks, etc.
Actual impact on a team: no code merges after 3pm on weekdays, no code merges after 12pm on Fridays
Note:
Since CI/CD takes about 2 hours, we can’t reasonably fix forward. We need to roll back, which also rolls back all other concurrent changes.
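As a rough check on the "about 2 hours" figure, the sketch below adds up the edx-platform estimates from these notes (~40 min CI, ~90 min from merge to prod). The function name and the example date are illustrative only, and it assumes the deploy is not blocked on a previous one.

```python
from datetime import datetime, timedelta

# Estimates taken from the notes above; both are approximate.
CI_MINUTES = 40   # CI to master
CD_MINUTES = 90   # merge -> stage -> e2e -> prod


def pipeline_eta(ci_start: datetime) -> tuple[datetime, datetime]:
    """Return (estimated merge time, estimated prod time) for a fix.

    Assumes the pipeline is idle, i.e. not blocked on a previous deploy.
    """
    merge_time = ci_start + timedelta(minutes=CI_MINUTES)
    prod_time = merge_time + timedelta(minutes=CD_MINUTES)
    return merge_time, prod_time


# A fix pushed at 15:00 would merge around 15:40 and reach prod around 17:10.
merge_at, prod_at = pipeline_eta(datetime(2024, 1, 8, 15, 0))
```

Under these estimates a fix-forward takes 130 minutes end to end, which is why a rollback is the faster recovery path.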
Separate paths and testing considerations for infrastructure-level changes versus feature changes
feature changes: can use toggles to decouple release from enablement/monitoring in prod
infrastructure changes: can use canary releases for controlled testing, kept separate from the normal pipeline so that the normal pipeline keeps flowing
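The two approaches above can be sketched roughly as follows. This is a minimal illustration, not how edx-platform implements either one: the in-memory `TOGGLES` store, the toggle name, and the hash-based canary bucketing are all assumptions made for the example.

```python
import hashlib

# Hypothetical in-memory toggle store. A real deployment would read this from
# a database or config service so it can change without a deploy.
TOGGLES = {"new_profile_page": False}


def is_enabled(name: str) -> bool:
    """Return the toggle state, defaulting to off for unknown toggles."""
    return TOGGLES.get(name, False)


def render_profile() -> str:
    # Both code paths are released to prod; the toggle decides which one runs.
    if is_enabled("new_profile_page"):
        return "new profile page"
    return "old profile page"


def routes_to_canary(request_id: str, canary_percent: int) -> bool:
    """Deterministically send roughly canary_percent% of requests to the canary."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in the range 0..99
    return bucket < canary_percent
```

With the toggle off, the new code is deployed but dormant, so enabling it later does not require another trip through the pipeline. The canary check is deterministic per request ID, so the same caller keeps hitting the same side while the percentage is held constant.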
Django Signals + best practices! +
https://discuss.openedx.org/t/hooks-based-extension-of-open-edx/2867
Event propagation (e.g., User Profile changed)
Django Signals → All other monolith apps and plugins
Message Bus → All other IDAs & Websocket service & external services (via xAPI/Caliper)
Websocket notification → MFEs
MFE React Hooks → Frontend components
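The propagation chain above can be sketched in plain Python. The `Dispatcher` class is a stand-in for each transport (Django's real API for the first hop is `django.dispatch.Signal` with `connect` and `send`); the event name, the receivers, and the bridges between layers are hypothetical, since the notes do not say how an event moves from one layer to the next.

```python
from collections import defaultdict

EVENT = "user_profile_changed"  # hypothetical event name


class Dispatcher:
    """Tiny publish/subscribe helper standing in for one transport layer."""

    def __init__(self):
        self._receivers = defaultdict(list)

    def connect(self, event, receiver):
        self._receivers[event].append(receiver)

    def send(self, event, **payload):
        # Call receivers in the order they were connected.
        return [receiver(**payload) for receiver in self._receivers[event]]


def build_chain():
    """Wire signals -> message bus -> websocket and return (signals, log)."""
    log = []
    signals, bus, websocket = Dispatcher(), Dispatcher(), Dispatcher()

    # Django Signals -> other monolith apps and plugins
    signals.connect(EVENT, lambda user_id: log.append(f"plugin: user {user_id}"))
    # Assumed bridge: a signal receiver republishes onto the message bus
    signals.connect(EVENT, lambda user_id: bus.send(EVENT, user_id=user_id))

    # Message Bus -> other IDAs and the websocket service
    bus.connect(EVENT, lambda user_id: log.append(f"ida: user {user_id}"))
    bus.connect(EVENT, lambda user_id: websocket.send(EVENT, user_id=user_id))

    # Websocket notification -> MFEs (React hooks would update components here)
    websocket.connect(EVENT, lambda user_id: log.append(f"mfe: user {user_id}"))
    return signals, log


signals, log = build_chain()
signals.send(EVENT, user_id=42)
```

A single `send` at the monolith fans out through every layer, and each layer only knows about the next one. That decoupling is the point of the design: plugins, IDAs, and MFEs can subscribe without the code that changes the profile knowing about them.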
ChangeLogs - what are they? ++
https://xkcd.com/1296/ ← example of why we don’t just use the git commit history
Backlog of Questions/Discussions
...
Recent MFE docs
Changelogs
What/where is it?
If you are doing them, what would make it easier?
If you aren’t doing them, what would help get you started?