...
Prefix your topic with your intention so we are clear on what outcome you are seeking from the discussion. Examples:
[inform] You are simply seeking to inform the group of this item. You may field clarifying questions from the group, but you are not seeking further discussion at this time.
[ideation] You are seeking divergent and wide perspectives from this group. In this brainstorming mode, all ideas are accepted, without critical analysis.
It may be helpful to clarify whether you’d like to ideate on the problem space or the solution space.
[analysis] You are asking the group to help you poke holes in your idea/topic/plan/etc.
[quest] You are seeking information/responses to a question you have.
2023-12-20
(Dave) New Relic -> DataDog status
Related: Application Performance Monitoring
Contract expires around June
edx-platform should have an OpenTelemetry-compatible layer for this so people can plug in their own APM solutions.
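A minimal sketch of what a pluggable instrumentation layer could look like, so that application code does not call a specific APM vendor directly. Everything here is an assumption for illustration: the names `TelemetryBackend`, `InMemoryBackend`, `set_backend`, and `traced` are hypothetical, and this is not the OpenTelemetry API itself. A real implementation would more likely delegate to the OpenTelemetry SDK than define its own protocol.

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator, List, Protocol


class TelemetryBackend(Protocol):
    """Hypothetical interface that each APM integration would implement."""

    def start_span(self, name: str, attributes: Dict[str, Any]) -> Any: ...

    def end_span(self, span: Any) -> None: ...


class InMemoryBackend:
    """Stand-in backend that records finished spans in a list (useful in tests)."""

    def __init__(self) -> None:
        self.finished: List[Dict[str, Any]] = []

    def start_span(self, name: str, attributes: Dict[str, Any]) -> Dict[str, Any]:
        return {"name": name, "attributes": dict(attributes)}

    def end_span(self, span: Dict[str, Any]) -> None:
        self.finished.append(span)


# Module-level backend; an operator would swap this out via configuration.
_backend: TelemetryBackend = InMemoryBackend()


def set_backend(backend: TelemetryBackend) -> None:
    """Replace the active backend (for example with a vendor-specific adapter)."""
    global _backend
    _backend = backend


@contextmanager
def traced(name: str, **attributes: Any) -> Iterator[Any]:
    """Wrap a block of work in a span, whichever backend is active."""
    span = _backend.start_span(name, attributes)
    try:
        yield span
    finally:
        # The span is always closed, even if the wrapped code raises.
        _backend.end_span(span)
```

Application code would then write `with traced("render_courseware", course_id=...):` and stay unaware of which APM product receives the data.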
[inform] (Kelly) I made a terrible diagram of edx-platform: https://lucid.app/lucidchart/fb870610-f8b4-4b7e-a509-1b871f81c54b/edit?beaconFlowId=8DBD553E85CDC9EE&invitationId=inv_e964873a-34ba-4bea-bc36-2fbe304edf40&page=9J6X4Q5XLMLH#
Related: https://openedx.slack.com/archives/C0497NQCLBT/p1702908527116319 and https://github.com/openedx/docs.openedx.org/issues/449
Within 2U, some of this stuff probably should be owned by Service Experience but isn’t yet
[Ned] Divergence Strategies: https://2u-internal.atlassian.net/wiki/spaces/ENG/pages/730005583/Divergence
Xavier’s doc: https://docs.google.com/document/d/1YyRxBrgIVoxwdcQLTWMyfUdFnxcRaqLJD1kTxBbIUn8/edit
2023-12-13
[quest] (Dave): How’s the MySQL 8.0 switchover going?
Scheduled for 2am tonight
[quest] (Dave): What’s a good way to roll out database connection encoding changes to Studio for http://edx.org ? (configuration repo help)
Dave to make issue in configuration repo (?) to track this.
This may be overridable in edx-internal (which would allow for faster rollback)
Jeremy will see if we want to turn on Issues there
High-level, external-safe discussion of recent 2U staff meetings
[ideation] (Jeremy) 2U executive management has been talking a lot about the “Open edX maintenance burden”. Where are teams feeling this, and how much time is it taking? What parts of it would we actually be comfortable handing over to other organizations to handle?
Fixing bugs?
Merging dependency upgrade PRs?
Big framework upgrades?
Roadmap decisions?
Reviewing changes from outside the core owning/maintaining team?
Deprecating stuff that’s no longer useful?
Building extension points so optional features can be added without being added for everyone?
[quest] (Jeremy) Are there more things like the Insights stack that are effectively only used by 2U, and should be deprecated as far as Open edX is concerned?
And how should we proactively identify things like this moving forward?
(Andy) As of about a year ago, there was at least one other Insights user
[quest] How deep are architecture vendor commitments? What failover features are there?
...
[analysis] (Jeremy) Docker Desktop replacement
Wiki page with some analysis: Container Runtime Comparison
Arch-BOM ticket for continuing investigation: https://github.com/edx/edx-arch-experiments/issues/93
[quest] (Ned) Why are we OK with a 2-hour deployment pipeline?
(Andy) It’s worse than that, it’s a 2-hour nondeterministic pipeline
(Phil) From GoCD edxapp statistics: 45+20+1+1+15+(2+3+7)=94 minutes
The 45 minutes (half the duration) is building the AMI
Each number is the average duration of a pipeline step.
The numbers in parentheses are pipeline steps that happen after the build is available on prod.
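The arithmetic above can be checked with a short sketch. The durations are the averages quoted from the GoCD edxapp statistics; the step labels are hypothetical, since the notes only identify the 45-minute step as the AMI build, and the ordering is assumed to follow the formula as written.

```python
# Average step durations in minutes, as quoted in the notes.
# Labels other than "build_ami" are placeholders, not real pipeline names.
PRE_PROD_STEPS = [
    ("build_ami", 45),
    ("step_2", 20),
    ("step_3", 1),
    ("step_4", 1),
    ("step_5", 15),
]
# Steps written in parentheses in the notes: they run after the build is on prod.
POST_PROD_STEPS = [("post_1", 2), ("post_2", 3), ("post_3", 7)]


def total_minutes(steps):
    """Sum the average duration of a list of (label, minutes) steps."""
    return sum(minutes for _, minutes in steps)


time_to_prod = total_minutes(PRE_PROD_STEPS)
full_pipeline = time_to_prod + total_minutes(POST_PROD_STEPS)
ami_share = dict(PRE_PROD_STEPS)["build_ami"] / full_pipeline

print(f"Time until the build is on prod: {time_to_prod} min")
print(f"Full pipeline: {full_pipeline} min")
print(f"AMI build share of full pipeline: {ami_share:.0%}")
```

Under these figures, the build reaches prod after 82 minutes on average, the full pipeline takes 94 minutes, and the AMI build accounts for roughly 48% of the total.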
We don’t have a good basis for comparison, even between different pipelines within 2U
(Jeremy) We don’t want to continue using GoCD in the long term, which leads to debate on the value of optimizing it (vs. doing work to switch to Argo CD instead)
(Jeremy) There are several parallel efforts to reduce edx-platform build time, but working that way means it takes a long time before any of them delivers value. Maybe we should focus our efforts better so that value is delivered incrementally?
[quest] (Alex D.) Any patterns that folks like for data replication between services?
[we talked about things]
Architecture Manifesto: see the points about eventual consistency
[quest] (Ned) What kinds of informal education are useful for developers?
High-level block diagram (context/container levels from C4)
Architectural onboarding has fallen by the wayside
How code is organized (mono-repo and otherwise)
Migrations, what they are and how they go wrong
Celery
Tour of a new IDA Makefile
What counts as “core”
[quest] (Adam) How do we get better at either smoke tests or health checks so that we can make big changes to infrastructure more confidently and detect problems before we ship bugs to prod?
...
[ideation] (Dave O) Proposal: Make MinIO a part of the default Tutor / Devstack install, and let Django apps and services assume an S3-like interface instead of having to accommodate any django-storages backend–i.e. drop support for storing that data directly on the filesystem.
OEP
DEPR
How to migrate folks away
Check on Swift usage/compatibility
Seems better than localstack, which was way too big for this use case
Look into reliability (link)
Don’t force MinIO
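As a rough illustration of the "S3-like interface" idea, here is a sketch of a Django settings fragment that points file storage at an S3-compatible endpoint. It assumes Django 4.2+ (the `STORAGES` setting) and django-storages 1.14+ (`storages.backends.s3.S3Storage`). The endpoint URL, bucket name, and environment variable names are placeholders, not anything agreed in the discussion. Because the endpoint is just a URL, the same settings could target MinIO, AWS S3, or another compatible service, which fits the "don't force MinIO" point.

```python
import os

# Placeholder default: a MinIO server listening locally on its usual port.
S3_ENDPOINT_URL = os.environ.get("S3_ENDPOINT_URL", "http://localhost:9000")

STORAGES = {
    "default": {
        # S3-compatible backend from django-storages (assumed version 1.14+).
        "BACKEND": "storages.backends.s3.S3Storage",
        "OPTIONS": {
            "endpoint_url": S3_ENDPOINT_URL,
            # Hypothetical bucket name and credential variables.
            "bucket_name": os.environ.get("S3_BUCKET_NAME", "openedx-dev"),
            "access_key": os.environ.get("S3_ACCESS_KEY", ""),
            "secret_key": os.environ.get("S3_SECRET_KEY", ""),
        },
    },
    "staticfiles": {
        "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
    },
}
```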
[quest] (Jeremy) Should we configure and enable https://github.com/actions/dependency-review-action ?
Have Arbi-BOM try it out, talk to Axim about shared config if it goes well
Quick demo of how to find Dependency Review on PRs
[quest] (Hilary) Does core functionality belong in the platform or should an IDA be used if the functionality could be considered a complete service?
Ownership is much simpler for a separate service, leading 2U to often prefer this
Some operators of smaller sites struggle to manage multiple services, and prefer everything critical to be in edx-platform
Some things make sense as libraries or plugins that get installed into edx-platform
https://2u-internal.atlassian.net/wiki/spaces/ENG/pages/26017856/Directory+Of+edX+Sites (2U internal)
...