Arch Hours: 2021
Meeting Expectations
Why?
Provide an opportunity for generative discussion and ideas.
Foster camaraderie through technical curiosity and geekdom.
Who?
Open to all edX-ers and Arbisoft-ers
What?
At times, these informal discussions result in follow-up action and beneficial change in our technology or in our organization. While this is not a decision-making body, these serendipitous discussions spark ideas that may result in ADRs/OEPs and tickets on team backlogs.
At times, it serves as a form of informal office hours to ask live technical questions of the archeological collective.
At times, we have pre-planned deep-dive topics that folks propose to gather wide input or to answer questions.
At times, we have hosted special guests (internal and external to edX) on specialized topics.
When?
Not lunch hour in ET timezone: With Covid remote work, "Arch Lunch" has evolved into “Arch Hour” in order to accommodate various home/life situations during lunch time.
How? Live Co-Editing
To circumvent Confluence’s limitations with the maximum number of concurrent editors:
during the hour together, we capture topics and take notes at https://docs.google.com/document/d/18TmQf3GllPDfjR7WKiMIhR2eqsbwPi1h3Ojdb6yDYCY/edit.
after the hour, we move those notes to this page.
Why not just stick with keeping the notes in the Google doc?
Google docs are not as discoverable.
Google docs don’t notify observers of future edits.
Google doc comments don’t notify all observers.
How? Structure
Please enter your proposed topics for discussion.
When we use Lean Coffee Style (link1, link2), we vote on which topics the group wants to discuss and time-box the discussion to 10 or 15 min → 5 min (if re-voted) → 5 min (if re-voted).
Prefix your topic with your intention so we are clear on what outcome you are striving for from the discussion. Examples:
[inform] You are simply seeking to inform the group of this item. You may field clarifying questions from the group on your inform, but you are not seeking further discussion at this time.
[ideation] You are seeking divergent and wide perspectives from this group. In this brainstorming mode, all ideas are accepted, without critical analysis.
It may be helpful to clarify whether you’d like to ideate on the problem space or the solution space.
[analysis] You are asking the group to help you poke holes in your idea/topic/plan/etc.
[quest] You are seeking information/responses to a question you have.
2021-12-22
[ideation] (David J) What do we need to change in Tutor to make it minimally useful for edX developers to start using?
[quest] (David J) What is paver used for today? Corollary, what is paver?
Paver is a python library for creating tasks to be executed from the command line
Runs unit tests, building static assets, maintenance tasks, in edx-platform. Never got used anywhere else.
Still technically maintained. Started moving away from it and running the underlying commands directly.
We did not build paver.
There are two parts to this:
We didn't make paver, it's someone else's thing.
We did write a lot of code to run things under paver. It may look like we made it because we wrote so much on top of it.
Chain of thought: we need a task runner; Makefiles are the default, but they are messy and shell-based; we want to do something in Python, so let's use paver. Wait, this is complicated, let's go back to using Makefiles.
Makefiles and paver give you a way of saying "this command needs these other commands first"
Django management commands don't do that.
The appeal was complex dependent-task management ("Only run this thing once even if several commands need it")
Problem with Make: needs to be implemented as shell scripts
Paver lets you use python files instead of shell scripts.
When the project was first starting out, we wanted anyone to be able to run it anywhere. A lot of initial goodwill - as the reality of that set in, we decided supporting windows is hard and not valuable for the amount of investment we'd have to do to keep it working. Usage of paver stemmed out of that, but then getting back out became too big for any team to take on.
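To make the "only run this thing once even if several commands need it" appeal concrete, here is a minimal sketch of a dependency-aware task runner in plain Python. It is not Paver's API and not how edx-platform does it; the `TaskRunner` class and the task names are hypothetical, chosen only to illustrate the idea that both Makefiles and paver provide.

```python
from typing import Callable, Dict, List, Set, Tuple


class TaskRunner:
    """Tiny dependency-aware task runner (illustrative only; not Paver's API)."""

    def __init__(self) -> None:
        self._tasks: Dict[str, Callable[[], None]] = {}
        self._needs: Dict[str, List[str]] = {}
        self._done: Set[str] = set()
        self.log: List[str] = []  # order in which tasks actually ran

    def task(self, *needs: str):
        """Register a function as a task that requires `needs` to run first."""
        def decorator(func: Callable[[], None]) -> Callable[[], None]:
            self._tasks[func.__name__] = func
            self._needs[func.__name__] = list(needs)
            return func
        return decorator

    def run(self, name: str, _stack: Tuple[str, ...] = ()) -> None:
        """Run `name` after its prerequisites, skipping anything already done."""
        if name in self._done:
            return
        if name in _stack:
            raise ValueError("dependency cycle: " + " -> ".join(_stack + (name,)))
        for dep in self._needs[name]:
            self.run(dep, _stack + (name,))
        self._tasks[name]()
        self._done.add(name)
        self.log.append(name)


runner = TaskRunner()


@runner.task()
def install_prereqs() -> None:
    pass  # hypothetical: install Python/JS requirements


@runner.task("install_prereqs")
def build_assets() -> None:
    pass  # hypothetical: compile static assets


@runner.task("install_prereqs")
def run_tests() -> None:
    pass  # hypothetical: run the unit tests


@runner.task("build_assets", "run_tests")
def ci() -> None:
    pass  # hypothetical: umbrella task


runner.run("ci")
print(runner.log)
```

Even though both `build_assets` and `run_tests` need `install_prereqs`, it runs a single time. Django management commands have no equivalent of the `needs` declaration, which is the gap the notes above describe.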
[quest] The Enterprise team is looking to make better use of feature toggles for more flexible (per-customer or per-subscription) toggling of features. A lot of these are done via config models right now. I wanted to get a base understanding of best practices in this area (reading edx-toggles gave me words like: Waffle, Django setting, SettingToggle). Also, I want to brainstorm on how frontends can best leverage these settings. Prior art? link to edx-toggles
2021-12-01
[quest/ideation] (Diana) - Devstack data project - who has a good use case that we should tackle?
Have a prototype that will load data into devstack
Arch-BOM were hunting around for a team with a good use case
Reach out to #arch-bom if you have a use case
[Ideation] (Justin Lapierre) How to set up a course/configuration in Devstack for QA in a reliable, reproducible way
[ideation] (David J) As the balance of edX devs to core contributors shifts in the future, how do we think that affects the risk profile of our CI/CD pipelines deploying off of master, and what might need to change? :eyes:
"Support not changing things until there's a problem"
We may not be the only people deploying from master - what if _we_ break someone else's systems?
SWG perspective - how do we do security releases reliably?
How do we know we're in a more risky situation?
Added complexity of pre-empting this problem is very expensive, so wait on the RCAs
"Eventual trouble is probably inevitable": we don't know what the problem will be, but we may be able to game out how we might deal with it. There's no code change we can't roll back (putting aside malicious stuff). Data migrations are scary because they can be 100% destructive and irreversible, and/or wildly unperformant.
We have point-in-time recovery to the minute (!!!)
Does GitHub have a canonical solution to what edx-platform-private solves today?
Jinder said Feanil or others looked into this early on and it wasn’t ready.
May still be hard for GoCD to deploy off of private forks.
Idea: "migration files require an extra approval" via CODEOWNERS
[inform] (Adam B) - Spreading this info more widely: If you want to add users with granular permissions to ecommerce you can now do so through app-permissions, e.g. like this
Also if you want to manage users for other services via code, Matt Hughes has offered to test out these setup docs on registrar.
[ideation/analysis] (David J) In spirit, over time edX employees could be thought of as core contributors themselves. The CC program has a time commitment (20 hrs/month). Could edX engineers/others be CCs? What would that look like for product delivery teams?
Ecommerce collaborative push?
Can we borrow from other communities?
Communities
Flywheel to support community which then helps us
We focused on drag, but what about focusing on benefit?
What features does edX/2U want?
Goal: Reducing ambiguity around expectations
Can we be a bit more formal around finding what folks are interested in and hooking them up with the right WG/people?
2021-11-24
[analysis] (Nathan Sprenkle) Internally-routed XBlock handler calls
I wrote a function for calling XBlock handlers from inside edx-platform for a backend-for-frontend application. This is half [inform] and half [let me know if this is a “Really Bad Idea or Not™”].
See github edx/edx-platform/lms/djangoapps/ora_staff_grader/utils.py#L43
Ask OpenCraft about other ways/options for doing it?
[Ned] Meta-question: are there any more questions or discussion following on from the Eng All-Hands?
Should we be planning for more repo changes programmatically? How will 2U people be able to make changes to things like branch permissions?
Kyle is working on it, he might use https://registry.terraform.io/providers/integrations/github/latest
For now, people who have admin will keep admin.
What’s happening to BD projects?
Some are moving to TCRIL, others are staying with edX
[question] (Jeremy) If you could nominate 1-2 things to improve the experience for new Open edX developers, what would they be? (We’ll likely have many such people soon.)
[dkh] We should pick up the thread of the onboarding work Feanil recently did
[djoy] Better/updated seed data for freshly provisioned devstacks/services
[djoy] push button environment creation and simple installation/setup instructions and documentation
[adam] Make things more consistent (devstack vs sandboxes vs prod, bokchoy vs cypress)
[andy] tests that work outside of a devstack shell
[andy] devstack as cattle not as pet
[andy] meta improvement: measure time lost to devstack to motivate investment
[djoy] is there a third party libraries-so-popular-they're-standards development environment stack that we could adopt so that someone else maintains our development stack for us, effectively. (this is clearly not a small thing)
[Ned] Can TCRIL participate here?
Probably, let’s enable that
[Adam] question: Is there anything stopping us from moving https://build.testeng.edx.org/job/edx-e2e-tests/ to the new tools-automation cluster?
Sub question, if someone changes something in a repo that will break an e2e test how can we make it faster to fix the test?
David Joy: Maybe a GHA?
Diana: Or maybe we can just remove the tests that trip us up
David Joy: e2e tests should be few and far between, so we should only use them for critical path tests.
2021-11-03
++[analysis] (Dave O): Extracting a low-level learning core out of edx-platform and into a new repo.
(Original post/thread).
Motivation:
Create a smaller/simpler dependency to build extensions on top of (instead of edx-platform). This means smaller, more stable APIs that can add incremental value, instead of the backloaded benefits of removing stuff from edx-platform.
Promote innovation by making it easier to create different experiences on top of Open edX (like LabXchange).
Advance the Studio/LMS split (these would be the core of the LMS).
Some potential apps: publishing, navigation, policy, composition (what’s in a Unit for this user?), scheduling, partitioning (what users are in what groups for various tests).
Strategy for dealing with tricky extractions: Core data models and frameworks exist in new repo. Plugins are implemented in edx-platform. Examples:
For navigation, an outline processor framework exists in the new repo, but an EnrollmentOutlineProcessor exists in edx-platform, keeping knowledge of enrollments out of the new core.
For partitioning, the data models to store user/partition mappings live in the new core, but actual partition bucketing logic remains implemented in edx-platform.
Strategy for data migration: Start with content data, that can be rebuilt/backfilled from Modulestore. We do this kind of thing all the time already.
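The "core framework in the new repo, plugins in edx-platform" strategy for navigation could look roughly like the sketch below. All class and function names other than `EnrollmentOutlineProcessor` are hypothetical, and the visibility rule (unenrolled users see only the first section) is made up purely for illustration; this is not the existing learning_sequences code.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Set


# --- Would live in the new core repo (hypothetical names) ---

@dataclass(frozen=True)
class Section:
    key: str
    title: str


class OutlineProcessor:
    """Framework hook: report which sections a given user should not see."""

    def inaccessible_keys(self, user_id: int, sections: Sequence[Section]) -> Set[str]:
        return set()


def build_outline(
    user_id: int,
    sections: Sequence[Section],
    processors: Iterable[OutlineProcessor],
) -> List[Section]:
    """Apply every processor and keep the sections none of them hid."""
    hidden: Set[str] = set()
    for processor in processors:
        hidden |= processor.inaccessible_keys(user_id, sections)
    return [s for s in sections if s.key not in hidden]


# --- Would stay in edx-platform, where enrollment data is known ---

class EnrollmentOutlineProcessor(OutlineProcessor):
    """Made-up rule for illustration: unenrolled users see only the first section."""

    def __init__(self, enrolled_user_ids: Iterable[int]) -> None:
        self._enrolled = set(enrolled_user_ids)

    def inaccessible_keys(self, user_id: int, sections: Sequence[Section]) -> Set[str]:
        if user_id in self._enrolled:
            return set()
        return {s.key for s in sections[1:]}


sections = [Section("s1", "Intro"), Section("s2", "Week 1"), Section("s3", "Week 2")]
processors = [EnrollmentOutlineProcessor(enrolled_user_ids=[42])]

enrolled_view = build_outline(42, sections, processors)
anonymous_view = build_outline(7, sections, processors)
print([s.key for s in enrolled_view], [s.key for s in anonymous_view])
```

The point of the split is that `build_outline` and `OutlineProcessor` never import anything about enrollments; the core only knows that some processors exist.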
Feedback:
Robert: In general, like the idea. Where to start: serving particular needs, like course overview-like data, verify that it serves the needs. Interfaces vs. implementations: do we need both? E.g. for course overviews, does it just need to move or will work need to be done–define an interface for a mocked version?
Jeremy: Any idea of things currently installed in edx-platform that directly call edx-platform–what are they doing?
Course overviews
Things that reach into modulestore for lack of better APIs that we should probably create
Scheduling
Jeremy: might it be worth identifying APIs in edx-platform that are the main code called by many other parts of the platform, so extracting it could allow all those to also be extracted?
There may be some cases where we move enough of the core of such an app (models, etc.) to a separate repo, but leave all the tangled implementation details in place to minimize the up-front work needed.
+ [quest] (Jeremy) Docker Desktop changes - how does this influence build vs. buy decisions?
Docker is starting to charge for Docker Desktop usage for orgs like ours
There are possible alternatives like Minikube, but we haven’t really evaluated how well they would work for us
Quite possibly worth edX paying for this, not as clear for Open edX at large
Jeremy discussed this with Régis; we agreed it needs to be discussed/resolved, but didn’t come up with any immediate answers
BTR working group at large hasn’t discussed this yet
Does moving to Codespaces or something similar change this?
Costs ~$21/user/month for 50+ user orgs
Open edX will probably support multiple options in the future, but edX is likely to pick a default for its own developers
Probably just paying the license fee for now, to free up resources for acquisition-related stuff
Has anyone tried Tutor or Codespaces?
Tutor: Running Open edX — Tutor documentation , written by Régis Behmo
Codespaces, etc.: sounds cool, nobody’s had time to look at it very closely
Quicker to get started (and to restart!)
Internet speed: are codespaces better or worse than downloading docker images for poor internet speed?
Codespaces are latency, docker images are bandwidth
Doesn’t work at all offline, but that may not really be an issue for most people these days
2021-10-06
[discuss] [Jeremy] mypy: where are we, where are we going
We’re running this for edx-platform in an optional GitHub Action check
We don’t have many annotations yet
There are tools that would let us add a lot pretty quickly by analyzing our test suite, etc.
But we probably won’t do this just yet, maybe in a couple of months
Feanil is somewhat excited about this as a way to catch potential problems early
Available for people who are excited about it, but any broader push for adoption will wait for at least a couple of months for other projects to settle down
It’s a tool to enforce best practices on interfaces, but we haven’t necessarily declared such best practices yet
If you find a good starting guide, consider adding to https://openedx.atlassian.net/wiki/pages/createpage.action?spaceKey=ENG&title=Learning%20Resources
[quest](Feanil) What are people’s expectations of arch-hour?
Awkward silence
Updates on what other people are concerned about or coming down the pipeline
Figure out how architecture is handled at edX
Cross-team interaction
Insights into engineering task prioritization
Feedback on what architectural challenges people are facing
[quick question (hopefully)]: Is anyone familiar with the QTI spec? As in worked with it enough for me to ask some questions regarding structure/capabilities?
Not really, it seems
[inform] (Dave O) GitHub - fanout/django-grip: Django GRIP library looks like a potentially useful alternative to channels
[question] (Jeremy B) Does anybody feel it’s worth investigating Django alternatives like FastAPI yet?
(Dave O) Feels like the data layer is the bigger performance problem
[discuss] (Dave O) Possible next steps to resolve database performance issues
It’s hard to optimize this locally without awareness of the overall context of a given request
We may need to more often create custom APIs rather than extending existing ones with new (possibly performance impacting) data
Monitoring is key, especially to catch regressions
People don’t have a good sense on thresholds for action required to improve performance
2021-09-01
++++[ideation] (Jeremy) How should we handle dependencies that are little used/maintained by others? Should we be more actively attempting to switch to alternatives or remove usage?
(djoy) What happened to the idea of having a comprehensive index of the packages we rely on, their licenses, etc? I recall we had a push to figure that out a while back.
(nim) There is a tradeoff today since owning teams are not feeling the pains of managing and upgrading their dependencies. There’s an opportunity to surface this back to the teams or find a better way to help teams be accountable.
+++[quest] (djoy) Curious about any updates on the hooks extension framework, event bus, splitting LMS and studio, etc. Things happening!?
Hooks - API for Django App Backend Plugins
Building on top of Django Signals
Focused on providing APIs for backend plugins
Targeting completion before Maple is released
Recent PR in edx-platform to be reviewed adds initial events that follow this pattern
Frontend Plugin Framework
Initial focus - iFrame-based plugins in MFEs
Next step
Find a use case: Could be Discussions in Learning MFE
How does this dovetail with LTI?
How do we define and document a plugin API?
Starting with a fundamental requirement of having a security sandbox around plugins - this means iframes.
Explored plugins via module federation
Proved complex
Impossible to put a sandbox around the code.
For plugins specifically, doesn’t have many uses that iframe plugins can’t support
Event Bus - Inter-service Communications
Working on POC
Draft OEP in progress
Robert, ChrisP, and Feanil as mini-team on this.
Is there any database-level replication work happening as part of the proof-of-concept?
O’Reilly book proposes that the event stream itself is the source of truth.
Which O’Reilly book are we talkin’ about? I see a few related to event driven architecture, I think.
What’s an initial use case the POC may look at?
Had started looking at Grades events as a possibility.
What are the use cases the POC is trying to solve?
Note: #event-bus slack channel recently created.
xAPI/Caliper
Edly is targeting Maple release cut for v1 of this.
Intended to be a foundation for integrating data across EdTech organizational boundaries.
Boundaries
Splitting LMS and Studio
Arch-BOM making changes to have Studio login through OAuth
Content Theme Architecture
++[ned] wiki space ownership? Good idea, or great idea?
2021-07-28
(5) ***** (discuss) [Dave O] Iframes, Chrome 92, and what we’re going to do about it.
Interim solution for half a year, then will permanently break
See TNL-8559
Potential solutions include making React more usable in our iframe applications, and custom-building the things we are trying to access from outside the iframe (ORA is running into this because we use window.confirm)
Need to backport the temp fix to Lilac?
Known Issues
ORA runs into this in `window.confirm`
LTI launch, open-in-new-window
Probably custom instructor code in various courses
Decision point: double-down on iFrames OR pivot to another way to embed JS components
Perspective: Google Chrome’s move is aligned with making iFrame technology more secure, which we can read as a signal that the industry will continue to advance in the direction of iFrames
Opportunity to seek further input
See where w3c groups are heading
Connect with IMS’ LTI working group
Embracing iFrames in our platform today
Learner MFE
LTI Plugins
Experience Plugins (XPs)
Next steps
DavidJ will continue to explore creating a structured interface for XPs.
T&L will implement a short-gap solution that addresses the issue until December.
Possible approaches
Message passing approach
Backwards compatibility: Have a querystring flag that triggers the XBlock render code to add a snippet of JS that overrides things like window.alert() with a version that doesn’t try to take over the whole window, but instead launches a modal-like-thing for just the frame.
TNL will look at what instructor code might be affected after the patch goes out.
(4) **** (discuss) [Ben W] Implications of community engagement performance requirement on arch topics/work?
In other words, “Are there things currently not being covered by a working group that COULD be, in order to better make use of this org move?”
Frontend engineering working group formation starting up - with BenW, DavidJ, AdamS, and Nim - to tackle frontend technical direction and tasks.
Potential groups:
Data management
Wasn’t there an internal data guild that was started sometime in the last year?
****** Documentation
* Breaking up monolith
Eventing standards and cleanup
Plugin Authoring
Courseware
**** i18n
Does this fall under Frontend WG?
It’s about getting translations done, so I don’t think so
And the tech includes backend code also
* Testing best practices & technology & tools
Today’s Engineering Groups
* DEPR (internal -> become inclusive)
RCAs (internal)
eSRE (internal)
** Security WG (internal)
Built-Test-Release WG (making Open edX Named Releases efficient and effective)
** FedX Working Group (internal -> becoming inclusive)
Fun editorial note: I think this should become the “frontend working group” instead of “fedX working group” - “fedX” is frontend at edX. This group hopes to be more than that by inviting the community in. -djoy
Today’s other (non-engineering) groups
Open edX Marketing WG
Paragon Working Group (arguably design-led)
ACTION ITEMS
Ned Batchelder Add this to the Doc Hackathon Ideas sheet - to formulate WG charter, etc. [Ned]
Publicize wider (maybe in slack) for i18n, Documentation
Nimisha Asthagiri Add as agenda to reconnect on status in future Arch Hour
David Joy (collaborators?) - Spruce up our working group page - to update status, expectations, ideas, etc.
External: Working Groups
Is there an internal page?
(4) **** (discuss)[Ned, Usama] We seem to have two ways to use common-constraints.txt (copy into repo, or don’t). Is there common understanding about what to do?
Found the reason for it. Updating the Django constraint locally was conflicting with the Django2.3 global constraint, because newer pip-tools raises an error when there are multiple constraints for the same package instead of letting the later one overwrite the earlier one.
Tested the approach to pull common_constraint on local, remove django constraint from it and update constraint locally: build: Download the common constraint locally. by awais786 · Pull Request #107 · openedx-unsupported/openedxstats .
Move ahead with this approach.
Now, deciding whether to drop the common django constraint and only have a local django constraint in all the packages and services?
2021-07-14
(7) ******* [Question, Ned] tags vs GitHub releases to make Python lib releases?
https://edx-internal.slack.com/archives/CG7FM3BLY/p1617366608199600
Jeremy will write a BOM ticket to switch these actions over to run on tag creation instead. Teams can still optionally use GitHub Releases if desired.
(5) ***** [Quest, BenW] How do we make the case to the organization that Paragon is a valuable thing to be maintained and have stable ownership?
There is confusion over who needs to make the call that we need at least 50% of someone’s time to do this
Feanil will help try to get the right people talking to each other
Part of this involves writing down the arguments for having such an ownership role
(3) *** [Question, Ned] What prep do people need for the “Doc Love” hackathon?
List of docs that should be written or need updating?
Doc personas?
Docs in confluence: what extensions are available, which might be useful? Get IT approval early.
(2) ** [Ideation, DaveO] Lifting out a subset of edx-platform to make extension development easier.
2021-07-07
(1) *[Inform] Doc hackathon planning is underway. Want to help? (I guess that’s a question)
(Discussed) [inform] (David & Nimisha) Tech Radar Workshop next steps and ongoing work
Radar has been created but needs descriptions and ring placement (trial, adopt, etc.).
We’ve categorized (by quadrant) and simplified the blips down to a set which feels correct for our first iteration. The next steps are to write up descriptions for each blip and decide what rings they go into. More on this soon!
(5) ***** [quest] (Matt T): What are the biggest challenges our open source community fights that we can fix before we become the open source community?
(Ned) I’m interested in the emphasis on “become” :)
(Adam Blackwell) In the very niche repo of configuration, community members have to fight to get things added that we don’t use. The general ~2-10 repo month PR flow is write a PR, wait a while for a review, merge it, revert it if it breaks http://edx.org things, then put a new PR with it feature gated, then merge it.
One way to fix some of these issues might be to containerize things and use docker images that inherit from other images?
(Adam Blackwell) I’m curious about what http://edx.org specific code is or isn’t in a separate React frontend or Django service
(djoy) Many MFEs have hard coded URLs to edx.org-specific support articles, or i18n strings with edX-specific entities (MicroMasters, http://edx.org , etc.) hard coded into them. We also have places where we’ve coded in particular third party providers that we use that others don’t, such as Cybersource. It’s more a whole laundry list of small things, often.
Nimisha:
I see at least three groups of problems:
1 - Adding new features without modifying the core.
2 - Customizing existing features without modifying the core.
3 - Deployment and operations.
Multi-tenancy
Managing thousands of sites
Upgrades every 6 months
Enabling/disabling features
(Adam) What is one example of an SRE/Deployment challenge for Open edX?
Nimisha: Relating to deployments, e.g. Tahoe
Adam: Q: Is https://github.com/appsembler/tahoe/blob/master/docs/index.md / https://www.appsembler.com/tahoe/ a whitelabel/multi-tenancy tool?
Feanil: Another challenge is which things are able to be turned off or on via things like toggles and feature flags.
[ideation] (Dave O): Would it make sense to see if we can lift out a small, relatively sane subset of edx-platform to make writing extensions easier (and have edx-platform that)? Things like learning_sequences, user partitioning, scheduling in one place?
Could we do this by way of an import linter that people could implement (e.g., list all the things they're allowed to import from edx-platform)?
I was actually hoping to make it so that this is a new thing that never imports from edx-platform. :-P But maybe as a stepping stone?
Right, I’m thinking the import linter, lets us test it out and see which parts actually need to be pulled out? Maybe it’s already obvious though.
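As a rough idea of what such an import linter could check, here is a sketch using only Python's standard `ast` module. It is not an existing tool; the `PLATFORM_ROOTS` and `ALLOWED` values are assumptions for illustration, and the module names in the sample are only examples of the shape of an edx-platform import.

```python
import ast
from typing import List, Sequence

# Assumption: top-level packages that count as "edx-platform" for this check.
PLATFORM_ROOTS = ("lms", "cms", "openedx", "common")

# Assumption: the allowlist of edx-platform modules extensions may import.
ALLOWED = ("openedx.core.djangoapps.content.learning_sequences.api",)


def _matches(module: str, prefixes: Sequence[str]) -> bool:
    """True if `module` is one of `prefixes` or a submodule of one of them."""
    return any(module == p or module.startswith(p + ".") for p in prefixes)


def disallowed_imports(source: str) -> List[str]:
    """Return edx-platform modules imported by `source` that are not allowlisted."""
    found: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            if _matches(module, PLATFORM_ROOTS) and not _matches(module, ALLOWED):
                found.append(module)
    return found


sample = """
import json
from openedx.core.djangoapps.content.learning_sequences.api import get_course_outline
from lms.djangoapps.grades.models import PersistentCourseGrade
import common.djangoapps.student.models
"""

violations = disallowed_imports(sample)
print(violations)
```

Running a check like this over existing extensions would produce the list of platform modules they actually depend on, which is the "see which parts actually need to be pulled out" experiment described above.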
2021-06-30
(3) *** [inform] (Jeremy) Upgrades in preparation for Maple:
Django
Elasticsearch
MongoDB
Node.js
BTW: Support Windows: https://docs.google.com/spreadsheets/d/11DheEtMDGrbA9hsUvZ2SEd4Cc8CaC4mAfoV8SVaLBGI/edit#gid=195838733
(6) ****** [ideation] (Nim) Documentation strategy - in prep for hackathon and follow-ups from discussion in #institutional_knowledge
OEP-19 Principles
Distinguish between temporal versus permanent information
Co-location (docs and code)
Versioning
Ongoing Maintenance
Open questions
1 - Discoverability
Idea: hierarchy of docs
An independent search tool that searches across all things.
We’ve investigated this in hackathons, with Elastic Enterprise Search being a leading contender at the time, but IT approval is tricky due to the security/privacy ramifications
We have a Google Custom Search, but it’s a bit wonky: https://docs.edx.org/search
Only searches public docs, and misses some of those if we forget to add them to the widget’s configuration
It is still not publicized from docs, which makes it difficult to learn from.
Centralizing our docs is probably best if pan-search tool isn’t feasible. Thinking about docs as a tree structure, there should ideally only be 1 “top node”, as in, you go to confluence, and everything is there (this doesn’t include code comments obvi). Searching across several “top nodes” (google docs, confluence, and so on) hurts discovery ← opinion piece, but *shrug*
One root only makes sense if you have one kind of audience?
2 - Non-technical docs ? Follow the same standards as technical docs or something different?
I think we cannot ask other functions to follow Eng doc processes unless there is a clear reason for them to switch away from what they’re doing. Why would product managers start writing rST?
The Spec/Approach process seems to be working well for product -- if we could get those in Confluence (for discoverability), that would be a win.
3 - Tools: Confluence, Google, GitHub, Jira
Confluence/Google/Jira: Use these tools strictly for docs that are internally-oriented, temporal, or both.
Many of our doc decisions are intended to work around Confluence limitations (poor search, limited concurrent editing, etc.); may be worth considering a different wiki system
I think there’s some general guidance about “when to use which tool” but it might be useful to give clearer guidance and alignment so we have less fractured docs.
Every squad maintains an up-to-date homepage with links to their own relevant internal/temporal docs
4 - Access: Open edX versus edX.org, Org-specific-structure access (Security, Squad-specific, etc)
I think it could also be a valuable exercise to user-map, figure out who uses our documentation and what they need out of it.
E.g. Open edX, edX internal developers, students, course authors. These are already sort of broken up but having a really strong idea of what they need and where they go to access it could really help clear up the “charter” for each piece of documentation.
We currently allow http://edx.org specific notes in Open edX docs that others may learn from. Maybe we could not only make this explicitly allowed in the OEP, but acknowledge that it doesn’t have to be limited to http://edx.org . Maybe we have Core Orgs?
Idea: for http://edx.org specific technical decisions that would have been an OEP, create a separate closed-sourced repo for http://edx.org OEP-wide decisions.
This way, the other benefits of GitHub can apply.
5 - Ownership of documentation as a framework
Each doc should have a stated owning team
Each team should have a list of all docs they own
There should be guidelines on what doc ownership entails
Change Management
Adoption
by other functions as well (Product, UX, etc) since cross-functional docs also need decisions.
ACTION Nathan, Ned, Feanil - propose a “just-enough” doc strategy for us before the summer’s hackathon so we have an aligned direction to drive during the hackathon.
(5) ***** [quest] (Kyle) Do we foresee edX being on a “level field” with other community members (OpenCraft, RacoonGang, et al) in terms of Open edX core committer rights? Specifically, do we think edX employees and, say, OpenCraft employees will have the same requirements to become CCs?
Nim: One of the principles from the Core Committer program:
Hold an equal bar for both edX engineers and community engineers, in the long-term. In the future, for instance, edX engineers might earn merge rights just as other contributors to the platform.
Nim: For extensions to the core, owning squads as admins of their own repos, can make autonomous decisions on merge rights.
ACTION : Follow up on the B(oundary) part of BEES.
Kyle
Dave O
David Joy (may end up focused on one of the E’s)
Nim
Adam Blackwell if it relates to SRE work or just to learn more about boundaries
2021-06-23
[discuss](Brian) Are we taking steps to remove our dependency on MongoDB?
Not aggressively, but yes
modulestore to S3 and something something?
Have almost gotten permission to get rid of last courses using Old Mongo
Static assets - we could move this to S3/django-storages
Active versions index - Braden working on a PR to move to a Django model
Course structure definitions - we could move this to S3/django-storages
forums to alternative forum software
Would be difficult to get this all done in time for Maple, probably not worth the push
[discuss](Jeremy) Are there any new features people are particularly looking forward to in newer versions of Django/Elasticsearch/Mongo/Node? Or are people generally not keeping up with these release notes?
This seems like a nice thing to keep in a knowledge base style wiki
Dave would like to see some more targeted usage of Python type checking, building on Regis’s mypy work in edx-platform
Jeremy is curious if new-style Django URL configuration would help our regex-related startup performance issues in edx-platform
[discuss](Jeff) How do we want to evolve automated a11y testing?
We last upgraded axe-core 18 months ago
We haven’t updated our set of a11y CI tests in quite a while
Our tools are pretty out of date at this point, there are new ones available
Let’s turn off the a11y tests in Jenkins and GitHub Actions
Jeff will work on new tooling that works for him, and we’ll see if it makes sense to add it to CI
Jeff will create 2 ADRs:
One for turning off existing a11y job
Second one for what we decide to do instead once we’ve decided
2021-06-16
****[quest] (Jeremy) How painful are these ‘make upgrade’ pull requests? Should they be auto-merged?
Today: they are automatically created, not automatically merged nor deployed.
Similar situation exists for Renovate upgrade PRs for frontend repos (e.g., could be configured to automerge patches).
We should probably auto-merge ones that pass tests
Maybe only during Cambridge working hours?
If you want careful control over a particular dependency’s version, consider keeping a constraint on it
Should have the option to pause automated upgrades during sensitive times (major development project, etc.)
Changelog review is slightly automated, but not as easy as it should be
Maybe an allowlist to control which packages are allowed to be upgraded without review?
Assume SemVer and only auto-merge non-major version upgrades?
Should we try to get Renovate to work with our Python dependency management?
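The auto-merge ideas above (passing tests, working hours only, a pause switch, an allowlist, SemVer non-major only) could be combined into one decision function. This is a minimal sketch, not an existing tool: `UpgradePolicy`, its fields, and the 9:00 to 17:00 weekday window are all hypothetical, and `now` is assumed to already be in the team's local time.

```python
from dataclasses import dataclass, field
from datetime import datetime, time
from typing import Set, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


@dataclass
class UpgradePolicy:
    # Hypothetical knobs, one per idea raised in the discussion.
    allowlist: Set[str] = field(default_factory=set)  # empty set = any package
    paused: bool = False  # flip on during sensitive times
    window_start: time = time(9, 0)  # assumed working hours, local time
    window_end: time = time(17, 0)

    def should_auto_merge(self, package: str, old: str, new: str,
                          tests_passed: bool, now: datetime) -> bool:
        """Return True only if every condition from the notes holds."""
        if self.paused or not tests_passed:
            return False
        if self.allowlist and package not in self.allowlist:
            return False
        # Weekdays only (Monday=0 ... Friday=4), inside the working window.
        in_window = self.window_start <= now.time() < self.window_end
        if now.weekday() >= 5 or not in_window:
            return False
        # Assuming SemVer: a changed major component signals a breaking release.
        return parse_version(new)[0] == parse_version(old)[0]
```

A package that needs careful control would simply be left off the allowlist (or kept under a constraint, as noted above), so it always goes through manual review.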
***[ideation] (Beggs) How can we build a better experience/automation for UX to validate design work done in PRs and give signoff?
Two primary options
Spin up an entire devstack/sandbox environment for MFE PRs
Create mock data for MFEs that can be used to allow the MFE to run without any of its dependencies, then deploy it to S3/Netlify and put a link to the environment on the PR.
The group preferred this option, since it could be done on a case-by-case basis in MFEs that need it, and doesn’t rely on us working through the hard problems of spinning up a “complete” environment on every PR. Deploying an MFE takes 5 minutes, deploying a complete sandbox could take an hour.
** [ideation](Feanil) How can we keep docs up-to-date/evergreen?
Hypothesis: Confluence has a lot of stuff that people are afraid to remove/change because very few people feel like they have sufficient context.
Idea: Extend ownership to confluence?
Should we look again into alternatives to Confluence?
Much of why we use Google Docs is due to limitations in Confluence (concurrent editors, etc.)
We need a custom search widget (which can’t find private pages) partially because Confluence search is terrible
OEP-19 (developer docs) is out of date in some respects; mentions that we use Google Docs, but doesn’t recommend it for anything even though we actually do recommend it for some things
2021-06-02
*** [inform] (David and Nimisha): Tech Radar workshop next Arch Hour on June 9!
More info here: https://openedx.atlassian.net/wiki/spaces/AC/pages/2844786770
Formal announcement and invite coming soon!
**** [quest] (Jeremy) Upgrade assistance menu options for squads (for Django 3.2 upgrade, etc.)
We’ll do it ourselves
Do the work, we’ll review the PRs
Do it for us, we don’t need to review the PRs
Does this include deployments?
Other?
resourcing: centralized team versus distributing across teams
Centralized team enables efficiency
+ develop a center of excellence and cognitive load
+ can identify and develop automation/tooling
Individual teams are better if the upgrade requires team-specific domain knowledge
Django 3.2 - can take advantage of new ways of doing async
Would require removing Django 2 support to leverage this.
Idea: present this at all-hands once available for usage.
*** [discuss] (ned) should we continue automated “make upgrade” for libraries?
What if the XBlock library doesn’t pin the test requirements in its own library?
Proposal: let XBlock unpin the test requirements; edx-platform will use the latest version when the XBlock is eventually updated in edx-platform.
Note: global constraints file exists
***[discuss] (nim) eventing ADR - versioning, in particular
https://github.com/eduNEXT/openedx-events is for in-process core platform events.
If Django App Plugins or other backend extensions have their own event APIs, they would publish them in their own app. Maybe within api.py, or a separate events.py, or something else. @Matt Tuchfarber (Deactivated) can propose in his Django App OEP.
Versioning of our events
Let’s be consistent with versioning, as described in OEP-41.
Major version embedded in the name.
Minor version included in the event payload.
Note that for xAPI/Caliper events, the name is from the specs. But their payloads would include version numbers.
We would follow this for frontend events as well. @David Joy (Deactivated) @Adam Stankiewicz
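The versioning convention above (major version in the event name, minor version in the payload) might look like the following sketch. The helper names, the `minor_version` payload key, and the `org.example...` event name are illustrative assumptions, not the openedx-events API or the exact field names from OEP-41.

```python
import re

# A trailing ".v<major>" suffix carries the major version in the name.
EVENT_NAME_RE = re.compile(r"^(?P<base>[a-z0-9_.]+)\.v(?P<major>\d+)$")


def build_event(base_name, major, minor, data):
    """Return (name, payload): major version in the name, minor in the payload."""
    name = f"{base_name}.v{major}"
    # "minor_version" is a hypothetical key chosen for this sketch.
    payload = {"minor_version": minor, "data": dict(data)}
    return name, payload


def major_version(event_name):
    """Extract the major version embedded in an event name."""
    match = EVENT_NAME_RE.match(event_name)
    if match is None:
        raise ValueError(f"event name has no major version suffix: {event_name!r}")
    return int(match.group("major"))
```

With this split, consumers subscribe to a name and so to a major version, while backwards-compatible additions only bump the minor number inside the payload.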
** [adam bl, ideation] How might we enable continuous deployment 24/7?
Would it help to do so by improving our on call runbooks?
Need more info: What is the current CD status of each of our services?
Types of services
MFEs
They deploy to stage upon merge, but need a manual click to push to prod.
Microservices
Teams independently make their own decisions.
Some IDAs deploy to stage automatically, but manual process to release to production.
Monolith
Until recently, we alerted recent-mergers.
2021-05-19
[inform] (Nim/David) Prep for working session(s) at Arch Hour for Tech Radar.
Tentative date: June 9th
++ [quest] (Jeremy) Has anyone worked with an open source license compliance tool before?
Currently, enhancing repo dashboard in order to answer a few immediate questions
But, looking for long-term solutions
+++ [ideation] (David) More documentation generated from code comments? How much of this do we do today? Are there areas of the code that would benefit from more of it?
Examples that have worked well with JSDocs:
We do this in some of our Python projects using Read the Docs:
Example: https://bok-choy.readthedocs.io/en/latest/api_reference.html
We used to do this for edx-platform, but it was very hard to maintain; have to actually install all dependencies on the RTD build worker
https://courses.edx.org/api-docs/ is generated from a mix of decorators and docstrings
Uses edx-api-doc-tools
Code annotations: https://code-annotations.readthedocs.io/en/latest/contrib/how_to/documenting_django_settings.html
Enables pulling
Example output: https://edx.readthedocs.io/projects/edx-platform-technical/en/latest/index.html
https://github.com/oreillymedia/sbo-sphinx (Jeremy wrote this at a previous job for combining Python and JS API docs in a single project)
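For anyone unfamiliar with the idea, generating docs from code comments boils down to reading signatures and docstrings and rendering them as text. A toy sketch using only the standard library follows; tools such as Sphinx autodoc do this far more thoroughly, and `render_reference` and `enroll` are made-up names for illustration.

```python
import inspect


def render_reference(functions):
    """Render a plain-text API reference from signatures and docstrings."""
    lines = []
    for func in functions:
        signature = inspect.signature(func)
        # getdoc() strips the docstring's leading indentation for us.
        doc = inspect.getdoc(func) or "(undocumented)"
        lines.append(f"{func.__name__}{signature}")
        lines.extend(f"    {line}" for line in doc.splitlines())
        lines.append("")
    return "\n".join(lines)


def enroll(user_id, course_key):
    """Enroll a user in a course.

    Returns True when a new enrollment was created.
    """
    return True


if __name__ == "__main__":
    print(render_reference([enroll]))
```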
+++ [quest] (Nim) Book club on Building Event-driven microservices. - at Arch Hour - every N weeks?
✓✓✓✓✓✓ ✓✔︎
Logistics
Asynchronously - read on your time
Lightning talk style presentation at the top of the hour
Can be co-presented if you choose
Rotation-basis / signup basis
Frequency - every other week
Running Doc for notes
Can pre-populate during the week
Eng-all hands next week - to publicize - Kyle or Nim
Access: Login with MIT email address -> SSO -> access
Starting date June 16th - every other week
[quest] (David) Looking for a facilitator for Monthly Arch Standup still!
+ [quest] (Feanil) A clear journey through our docs. What is the right place to start for a new person?
In looking at our onboarding, we currently provide a lot of links to various docs.
Where is the Start Here?
[Nim] In a past attempt, we were thinking: https://docs.edx.org/
[Nim] And for internal engineering onboarding: https://openedx.atlassian.net/wiki/pages/createpage.action?spaceKey=ENG&title=Developer%20Onboarding
[ideation] (Feanil) How do we make documentation easier to consume and maintain continuously?
2021-05-12
[Quest] (Ben W) How to ensure good UX and UI processes now that we are merging the functions into Product Designers?
Why
The current state of ownership responsibilities across themes/squads/etc was messy for UI/UX folx.
We were already seeing UI and UX looking at the holistic process.
Rollout
There will be an initial slow-down since people will be developing skills.
The designer will need to clarify and communicate that this slow-down will be there.
Stacey and AdamBu are developing an education program to upskill the team.
Process
Onus is on the Designer to ensure they are getting the proper reviews.
Organizational structure - theme-specific allocations.
[inform] (Nim/David) Prep for working session(s) at Arch Hour for Tech Radar.
Tracking dependencies of repos over time - Tidelift provides this support.
2021-05-05
[inform/request] (DaveO): I’m working on an ADR to remove modulestore usage from all LMS apps except courseware. If your LMS app needs structural/content data from modulestore to do its job (as opposed to course keys or course config settings), can you please describe your use case?
The performance of modulestore queries can really vary, seems best to just not use it directly whenever possible
[Nim] FYI on Cale’s type-checking PR, which he hoped would eventually help us find callers to Modulestore: https://github.com/edx/edx-platform/pull/26985
++++++ [inform/ideation] (Jeremy): Suggestions/feedback on candidate projects for Arbi-BOM: https://openedx.atlassian.net/wiki/spaces/AT/pages/2689925443
++++[ideation] (djoy) How can we make our developer docs more obviously part of our process? Do people like readthedocs? Is there an amazing alternative out there? Can we make it easier to contribute? To find what you’re looking for? Have you contributed? I haven’t! When you did, why did you do it, and how did you know to?
Readthedocs++, can do markdown, RST, etc.
edx-documentation repo organization is not intuitive to developers
focused on end user and broader community documentation
OEPs on developer documentation… still need to get things to match the new pattern
Notes transferred here: https://openedx.atlassian.net/wiki/spaces/AC/pages/2725348094
++++[quest/ideate] (Awais) How can we make sandboxes more useful for different services? (adding data for ecom, credentials or discovery)
[Feanil] I think this is a worthwhile investment but there is a question of knowing what to add.
Arch-BOM tried to do this from Dev Data because teams don’t have time to prioritize this work at the moment.
Is there a gap here because the teams with the domain knowledge don’t feel as much pain?
Currently it looks like the team just accepts the sunk cost even if they do have the domain knowledge.
2021-04-28
++ [quest] [Feanil/Jazib] How mature is our k8s stuff, should existing work be moving to it?
+++++++++ [quest] [djoy] What do we think of the idea of non-python backends here at edX?
I was mulling this over this morning - not talking about anything specific or looking for permission.
Why? Skipping for today.
Possibility of using node for specific *types* of services (MFE translators)
Pros:
Best tool for the job may not be Python
Mitigate risk of Python and/or Django stagnating relative to alternatives
May allow use cases that Python/Django aren’t very appropriate for
Allows frontend developers to manage their comms layer without stack-changing
without touching the django and risking mucking up the data models
Allows runtime access to data without mixing technologies.
Cons:
We already have a lot of Python expertise
Potential lack of operational experience for new languages and frameworks
Want to avoid “we did it this way because it's what our team knows better”
Poster child: forums in Ruby
Django has an ORM, accessing data created by that ORM via other ORMs can be risky/error prone.
Note: many of these apply mainly if we are mixing paradigms
Proposal (Ben W):
Django can be used for data model creation, management and manipulation.
Node can be used for MFE data translation access.
tl;dr from Jeremy: We’re not open to new backends in general because there are very real costs to supporting diverse backend technologies, but we’re open to considering specific proposals if there are good business reasons why Python/Django isn’t an appropriate choice (performance, existing implementation using front-end technology, etc).
++++++ [Ned; but I can’t be there until later] How can we use the repo-health information to improve the health of repos? For example: some repos have no openedx.yaml file.
How do we decide which things are most important to make consistent?
Once we’ve decided what to make consistent, how do we do that in the face of competing priorities on specific squads?
How actively is this data being utilized currently?
Is the current presentation of the data working, or do we need a better interface for it?
[Ned] I’m using the csv
Comment: currently very wide (which makes it less approachable)
There have been convos about integrating with Snowflake and neo4j.
Maybe we want pluggable output generators, similar to how we have pluggable check implementations? [Ned: is this over-engineering considering a csv is available?]
Frontend Checks
Is it using renovate?
Is it deployed to npm? [Ned: I am planning to add this check soon]
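To make the "pluggable checks, pluggable outputs" idea concrete, here is a small sketch: each check is a function over a local repo checkout, and CSV is just one output generator over the resulting rows. This is not how the repo-health tooling is actually implemented; the function names are hypothetical, and looking only for a top-level `renovate.json` is a simplification of how Renovate can be configured.

```python
import csv
import io
from pathlib import Path


def check_openedx_yaml(repo_dir):
    """True if the checkout has an openedx.yaml file at its root."""
    return (Path(repo_dir) / "openedx.yaml").is_file()


def check_renovate(repo_dir):
    """True if a top-level renovate.json exists (simplified assumption)."""
    return (Path(repo_dir) / "renovate.json").is_file()


# Registry of checks: adding a check means adding one entry here.
CHECKS = {
    "has_openedx_yaml": check_openedx_yaml,
    "uses_renovate": check_renovate,
}


def run_checks(repo_dirs, checks=CHECKS):
    """Run every check against every repo; return one dict per repo."""
    rows = []
    for repo_dir in repo_dirs:
        row = {"repo": Path(repo_dir).name}
        for name, check in checks.items():
            row[name] = check(repo_dir)
        rows.append(row)
    return rows


def to_csv(rows):
    """One possible output generator: render the rows as CSV text."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Because `run_checks` returns plain dicts, a Snowflake or neo4j exporter could sit next to `to_csv` without touching the checks, though as Ned notes that may be over-engineering while the csv is good enough.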
+++[quest] [Robert/Hassan] How do people feel about using the new annotation tooling to annotate custom attributes? Would you find it useful to have a consolidated doc of their purpose?
2021-04-15
[ideate & vote] Deep topics for future Arch hour
Votes -> Focused study group, or pre-announced deep-dive discussions
Topics
(6) ++++++ Eventing data technology and usage best practices (segment, new relic, etc)
(5) +++++ QA Process
writing test plans
working with external resources
@Justin Lapierre Share squad huddle notes and pilot
(3) +++ Design-pattern/code-organizational concerns
Applying DDD design patterns to specific problems: could be a case study
Alex Dusenbery volunteers
(2) ++ Making sandboxes useful?
@Justin Lapierre @OpenCraft - They have an automated system for creating these.
(2) ++ Testing
Testing xblocks
How we use Cypress
(1) + Raising knowledge gaps
How does <old thing> work?
Current practices on specific topics, e.g. caching
(1) + Travis to GitHub Actions
(1) + Working session
FYI: OEP Review (2020)
+ edX Tech Radar
How do I get us/my team to do x?
Ownership question
Experimentation
apis
practices
workflows
Build/deployment concerns
Data-Flow concerns
How to reduce pain of maintaining Ansible?
Frontend and Experiment API
MFE configuration
[ideate] Bite-sized tickets: how to create/maintain them for onboarding contributors?
To give people a way to start with the project
To get useful small things done
Maybe a tutorial + sample project is better in some cases?
Types of work
DEPR
Toggle-related
Other DEPR migration projects
Django 2.2 -> 3.2 upgrade
Documentation
2021-04-09
Arch Hour
Why keep Arch Hour?
Opportunity for cross-team discussion
https://openedx.atlassian.net/wiki/spaces/AC/pages/1800470616/Proposal+Arch+Weekly+DRAFT+WIP
Next increment
Timing
Wed at 11am
Frequency
Weekly
Structure
Lean Coffee
Topic - open forum - not necessarily a presentation
Topic - study - prepped, possibly as a workshop
Working session
Community participation
Invited guests to pre planned long form topics.
Coming in for led or working-sessions
Functional distinctions:
List investigation/ topic-generation/curation
Group-interrogation when no-one feels quite expert enough to lead
Led topic dive
Topics for these meetings
Raising knowledge gaps
How does <old thing> work?
Current practices on specific topics, e.g. caching
How do I get us/my team to do x?
Ownership question
QA
writing test plans
working with external resources
Experimentation
apis
practices
workflows
Testing
Build/deployment concerns
Data-Flow concerns
Design-pattern/organizational concerns
Making sandboxes useful?
How to reduce pain of maintaining Ansible?
Frontend and Experiment API
FYI: OEP Review (2020)
Testing continued ++++
Ben Notes:
QA testers are less expensive than Senior Devs, and more specialized at the task
The types of QA that are helpful here may exclude “exploratory” testing, which is more focused on “no bugs at all” and slows things down much more
QA engineers are people whose whole focus, every day, is how to validate the work, and make sure it is validate-able.
QA are better/practiced/trained at building test plans that actually validate a product with a minimum amount of active testing.
Notes from Arch Hour with a QA Architect
Things we could work on if there is a perceived high ROI:
When to use https://docs.pact.io
making e2e tests faster/better
teaching how to add e2e tests
How to test ORA?
Why test ORA -> So we have fewer RCAs like this one
2021-03-25
Note from Jeremy: I copied these over from the meeting notes after the meeting finished, but my Zoom connection was unreliable for most of the meeting so I’m not sure all these topics were actually discussed. Please edit if you spot any discrepancies.
(5) ***** [JJ, ideation?] Ideas on how to tell people to deploy their changes to production
Do Continuous Deployment? So push to master will just lead to deployment
Team has resisted this, they would like someone to be actively monitoring when changes go out
Developers not on the owning team don’t really know how to do this monitoring
Make the GoCD pipeline do it
Slack: “Please push your changes to production”
What repos have/don’t have CD enabled?
To my knowledge, none of the MFEs are auto-deployed to production. They are auto-deployed to stage, though. -djoy
It seems like only edxapp and prospectus have CD enabled. But many developers work primarily in these repos, so other deployment patterns can catch them by surprise.
(4) **** [JJ, discussion?] How do people manage/”own” repos that other teams contribute to?
(3) ***[djoy] How can we discover - without scaring off - the people still at the company who understand our legacy frontend code (like, but not at all limited to, comp theming)?
Or how do we invest in getting enough knowledge of the old stuff to thoughtfully get rid of or improve it.
(2) **[Ned; open-ended discussion, no agenda!] Thoughts about Open edX theme?
Can we get rid of comprehensive theming?
Maybe?
But probably not.
(2) **[Ned; opinion poll] renaming master branches to main
2021-03-18
(5) ***** [question, Matt T] Are we continuing to support multi-site? If I’m writing new code, should I make it site aware, are we planning on ripping it out, somewhere in between?
Matt is going to kick off the DEPR process for multi-site and see what the community thinks
Note: (a single dev in) the community is somewhat actively supporting multi-site; see recent PR in this area
EduNext, which hosts 1000+ sites, doesn’t use SiteConfiguration anymore.
We’d like to move the community off of SiteConfiguration. Have them use EduNext’s design instead: runtime override of Django Settings.
Here’s a peek