Arch Hours: 2021
Meeting Expectations
Why?
Provide an opportunity for generative discussion and ideas.
Foster camaraderie through technical curiosity and geekdom.
Who?
Open to all edX-ers and Arbisoft-ers
What?
At times, these informal discussions result in follow-up action and beneficial change in our technology or in our organization. While this is not a decision-making body, these serendipitous discussions spark ideas that may result in ADRs/OEPs and tickets on team backlogs.
At times, it serves as a form of informal office hours to ask live technical questions of the archeological collective.
At times, we have pre-planned deep-dive topics that folks propose to gather wide input or to answer questions.
At times, we have hosted special guests (internal and external to edX) on specialized topics.
When?
Not lunch hour in ET timezone: With Covid remote work, "Arch Lunch" has evolved into “Arch Hour” in order to accommodate various home/life situations during lunch time.
How? Live Co-Editing
To circumvent Confluence’s limitations with the maximum number of concurrent editors:
during the hour together, we capture topics and take notes at https://docs.google.com/document/d/18TmQf3GllPDfjR7WKiMIhR2eqsbwPi1h3Ojdb6yDYCY/edit.
after the hour, we move those notes to this page.
Why not just stick with keeping the notes in the Google doc?
Google docs are not as discoverable.
Google docs don’t notify observers of future edits.
Google doc comments don’t notify all observers.
How? Structure
Please enter your proposed topics for discussion.
When we use Lean Coffee Style (link1, link2), we vote on which topics the group wants to discuss and time-box the discussion to 10 or 15 min → 5 min (if re-voted) → 5 min (if re-voted).
Prefix your topic with your intention so we are clear on what outcome you are striving for from the discussion. Examples:
[inform] You are simply seeking to inform the group of this item. You may field clarifying questions from the group on your inform, but are not seeking further discussion at this time.
[ideation] You are seeking divergent and wide perspectives from this group. In this brainstorming mode, all ideas are accepted, without critical analysis.
It may be helpful to clarify whether you’d like to ideate on the problem space or the solution space.
[analysis] You are asking the group to help you poke holes in your idea/topic/plan/etc.
[quest] You are seeking information/responses to a question you have.
2021-12-22
[ideation] (David J) What do we need to change in Tutor to make it minimally useful for edX developers to start using?
[quest] (David J) What is paver used for today? Corollary, what is paver?
Paver is a python library for creating tasks to be executed from the command line
Used in edx-platform to run unit tests, build static assets, and perform maintenance tasks. Never got used anywhere else.
Still technically maintained. Started moving away from it and running the underlying commands directly.
We did not build paver.
There are two parts of this:
We didn't make paver, it's someone else's thing.
We did write a lot of code to run things under paver. It may look like we made it because we wrote so much on top of it.
Chain of thought: we need a task runner; Makefiles are the default, but messy and shell-based; we want to do something in Python, so let's use paver. Wait, this is complicated, let's go back to using Makefiles.
Makefiles and paver give you a way of saying "this command needs these other commands first"
Django management commands don't do that.
The appeal was complex dependent-task management ("Only run this thing once even if several commands need it")
Problem with Make: needs to be implemented as shell scripts
Paver lets you use python files instead of shell scripts.
When the project was first starting out, we wanted anyone to be able to run it anywhere. A lot of initial goodwill - as the reality of that set in, we decided supporting windows is hard and not valuable for the amount of investment we'd have to do to keep it working. Usage of paver stemmed out of that, but then getting back out became too big for any team to take on.
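To illustrate the dependent-task appeal mentioned above ("this command needs these other commands first", and "only run this thing once even if several commands need it"), here is a minimal sketch in plain Python. It is not paver's API or Make's behavior; the `task` decorator, the `run` function, and the task names are all hypothetical, and there is no cycle detection.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical registry: task name -> (names of tasks it needs, function).
TASKS: Dict[str, Tuple[Tuple[str, ...], Callable[[], None]]] = {}
LOG: List[str] = []  # Records the order in which tasks actually ran.


def task(*needs: str) -> Callable[[Callable[[], None]], Callable[[], None]]:
    """Register a function as a task that needs other tasks to run first."""
    def register(func: Callable[[], None]) -> Callable[[], None]:
        TASKS[func.__name__] = (needs, func)
        return func
    return register


def run(name: str, done: Set[str]) -> None:
    """Run a task after its dependencies, skipping anything already done."""
    if name in done:
        return
    needs, func = TASKS[name]
    for dependency in needs:
        run(dependency, done)
    func()
    done.add(name)


@task()
def install_requirements() -> None:
    LOG.append("install_requirements")


@task("install_requirements")
def build_assets() -> None:
    LOG.append("build_assets")


@task("install_requirements", "build_assets")
def run_tests() -> None:
    LOG.append("run_tests")


if __name__ == "__main__":
    run("run_tests", set())
    print(LOG)
```

Even though both `build_assets` and `run_tests` need `install_requirements`, it runs only once per invocation because the `done` set is shared across the recursive calls. Django management commands have no equivalent of this dependency declaration.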
[quest] Enterprise team is looking to best leverage feature toggle capabilities to do more flexible (per customer or per subscription) toggles of features. A lot of these are done via config models right now. I wanted to get a base understanding of what best practices are in this area (edx-toggles read gave me words like: Waffle, Django setting, SettingToggle). Also, want to brainstorm on how frontends can best leverage these settings. Prior art? link to edx-toggles
2021-12-01
[quest/ideation] (Diana) - Devstack data project - who has a good use case that we should tackle?
Have a prototype that will load data into devstack
Arch-BOM were hunting around for a team with a good use case
Reach out to #arch-bom if you have a use case
[Ideation] (Justin Lapierre) How to set up a course/configuration in Devstack for QA in a reliable, reproducible way
[ideation] (David J) As the balance of edX devs to core contributors shifts in the future, how do we think that affects the risk profile of our CI/CD pipelines deploying off of master, and what might need to change? :eyes:
"Support not changing things until there's a problem"
We may not be the only people deploying from master - what if _we_ break someone else's systems?
SWG perspective - how do we do security releases reliably?
How do we know we're in a more risky situation?
Added complexity of pre-empting this problem is very expensive, so wait on the RCAs
"Eventual trouble is probably inevitable": we don't know what the problem will be, but we may be able to game out how we might deal with it. There's no code change we can't roll back (putting aside malicious stuff). Data migrations are scary because they can be 100% destructive and irreversible, and/or wildly unperformant.
We have point-in-time recovery to the minute (!!!)
Does GitHub have a canonical solution to what edx-platform-private solves today?
Jinder said Feanil or others looked into this early on and it wasn’t ready.
May still be hard for GoCD to deploy off of private forks.
Idea: "migration files require an extra approval" via CODEOWNERS
[inform] (Adam B) - Spreading this info more widely: If you want to add users with granular permissions to ecommerce you can now do so through app-permissions, e.g. like this
Also if you want to manage users for other services via code, Matt Hughes has offered to test out these setup docs on registrar.
[ideation/analysis] (David J) In spirit, over time edX employees could be thought of as core contributors themselves. The CC program has a time commitment (20 hrs/month). Could edX engineers/others be CCs? What would that look like for product delivery teams?
Ecommerce collaborative push?
Can we borrow from other communities?
Communities
Flywheel to support community which then helps us
We focused on drag, but what about focusing on benefit?
What features does edX/2U want?
Goal: Reducing ambiguity around expectations
Can we be a bit more formal around finding what folks are interested in and hooking them up with the right WG/people?
2021-11-24
[analysis] (Nathan Sprenkle) Internally-routed XBlock handler calls
I wrote a function for calling XBlock handlers from inside edx-platform for a backend-for-frontend application. This is half [inform] and half [let me know if this is a “Really Bad Idea or Not™”].
See github edx/edx-platform/lms/djangoapps/ora_staff_grader/utils.py#L43
Ask OpenCraft about other ways/options for doing it?
[Ned] Meta-question: are there any more questions or discussion following on from the Eng All-Hands?
Should we be planning for more repo changes programmatically? How will 2U people be able to make changes to things like branch permissions?
Kyle is working on it, he might use https://registry.terraform.io/providers/integrations/github/latest
For now, people who have admin will keep admin.
What’s happening to BD projects?
Some are moving to TCRIL, others are staying with edX
[question] (Jeremy) If you could nominate 1-2 things to improve the experience for new Open edX developers, what would they be? (We’ll likely have many such people soon.)
[dkh] We should pick up the thread of the onboarding work Feanil recently did
[djoy] Better/updated seed data for freshly provisioned devstacks/services
[djoy] push button environment creation and simple installation/setup instructions and documentation
[adam] Make things more consistent (devstack vs sandboxes vs prod, bokchoy vs cypress)
[andy] tests that work outside of a devstack shell
[andy] devstack as cattle not as pet
[andy] meta improvement: measure time lost to devstack to motivate investment
[djoy] is there a third party libraries-so-popular-they're-standards development environment stack that we could adopt so that someone else maintains our development stack for us, effectively. (this is clearly not a small thing)
[Ned] Can TCRIL participate here?
Probably, let’s enable that
[Adam] question: Is there anything stopping us from moving https://build.testeng.edx.org/job/edx-e2e-tests/ to the new tools-automation cluster?
Sub question, if someone changes something in a repo that will break an e2e test how can we make it faster to fix the test?
David Joy: Maybe a GHA?
Diana: Or maybe we can just remove the tests that trip us up
David Joy: e2e tests should be few and far between, so we should only use them for critical path tests.
2021-11-03
++[analysis] (Dave O): Extracting a low-level learning core out of edx-platform and into a new repo.
(Original post/thread).
Motivation:
Create a smaller/simpler dependency to build extensions on top of (instead of edx-platform). This means smaller, more stable APIs that can add incremental value, instead of the backloaded benefits of removing stuff from edx-platform.
Promote innovation by making it easier to create different experiences on top of Open edX (like LabXchange).
Advance the Studio/LMS split (these would be the core of the LMS).
Some potential apps: publishing, navigation, policy, composition (what’s in a Unit for this user?), scheduling, partitioning (what users are in what groups for various tests).
Strategy for dealing with tricky extractions: Core data models and frameworks exist in new repo. Plugins are implemented in edx-platform. Examples:
For navigation, an outline processor framework exists in the new repo, but an EnrollmentOutlineProcessor exists in edx-platform, keeping knowledge of enrollments out of the new core.
For partitioning, the data models to store user/partition mappings live in the new core, but actual partition bucketing logic remains implemented in edx-platform.
Strategy for data migration: Start with content data, that can be rebuilt/backfilled from Modulestore. We do this kind of thing all the time already.
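The navigation example above (framework in the new core, plugin in edx-platform) could look roughly like the sketch below. This is an assumption-laden illustration, not the actual edx-platform code: the `Section` type, the `keys_to_remove` method, the `build_outline` function, and the constructor arguments are all hypothetical. Only the name `EnrollmentOutlineProcessor` comes from the notes.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Section:
    """Hypothetical minimal outline item that the core would know about."""
    key: str
    title: str


class OutlineProcessor:
    """Framework interface that would live in the new core repo (hypothetical)."""

    def keys_to_remove(self, user_id: int, sections: List[Section]) -> Set[str]:
        """Return keys of sections this processor wants hidden for the user."""
        return set()


def build_outline(
    user_id: int,
    sections: List[Section],
    processors: Iterable[OutlineProcessor],
) -> List[Section]:
    """Core logic: apply every processor without knowing what each one checks."""
    hidden: Set[str] = set()
    for processor in processors:
        hidden |= processor.keys_to_remove(user_id, sections)
    return [section for section in sections if section.key not in hidden]


class EnrollmentOutlineProcessor(OutlineProcessor):
    """Plugin that would stay in edx-platform; only it knows about enrollments."""

    def __init__(self, enrolled_user_ids: Set[int], public_keys: Set[str]) -> None:
        # Stand-ins for real enrollment lookups (assumed for this sketch).
        self.enrolled_user_ids = enrolled_user_ids
        self.public_keys = public_keys

    def keys_to_remove(self, user_id: int, sections: List[Section]) -> Set[str]:
        if user_id in self.enrolled_user_ids:
            return set()
        return {s.key for s in sections if s.key not in self.public_keys}
```

The point of the split is that `build_outline` and `OutlineProcessor` never import anything about enrollments, so the core stays small while edx-platform keeps the tangled details.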
Feedback:
Robert: In general, like the idea. Where to start: serving particular needs, like course overview-like data, verify that it serves the needs. Interfaces vs. implementations: do we need both? E.g. for course overviews, does it just need to move or will work need to be done–define an interface for a mocked version?
Jeremy: Any idea of things currently installed in edx-platform that directly call edx-platform–what are they doing?
Course overviews
Things that reach into modulestore for lack of better APIs that we should probably create
Scheduling
Jeremy: might it be worth identifying APIs in edx-platform that are the main code called by many other parts of the platform, so extracting it could allow all those to also be extracted?
There may be some cases where we move enough of the core of such an app (models, etc.) to a separate repo, but leave all the tangled implementation details in place to minimize the up-front work needed.
+ [quest] (Jeremy) Docker Desktop changes - how does this influence build vs. buy decisions?
Docker is starting to charge for Docker Desktop usage for orgs like ours
There are possible alternatives like Minikube, but we haven’t really evaluated how well they would work for us
Quite possibly worth edX paying for this, not as clear for Open edX at large
Jeremy discussed this with Régis; we agreed it needs to be discussed/resolved, but didn’t come up with any immediate answers
BTR working group at large hasn’t discussed this yet
Does moving to Codespaces or something similar change this?
Costs ~$21/user/month for 50+ user orgs
Open edX will probably support multiple options in the future, but edX is likely to pick a default for its own developers
Probably just paying the license fee for now, to free up resources for acquisition-related stuff
Has anyone tried Tutor or Codespaces?
Tutor: https://docs.tutor.overhang.io/run.html , written by Régis Behmo
Codespaces, etc.: sounds cool, nobody’s had time to look at it very closely
Quicker to get started (and to restart!)
Internet speed: are codespaces better or worse than downloading docker images for poor internet speed?
Codespaces are sensitive to latency; Docker image downloads are sensitive to bandwidth
Doesn’t work at all offline, but that may not really be an issue for most people these days
2021-10-06
[discuss] [Jeremy] mypy: where are we, where are we going
We’re running this for edx-platform in an optional GitHub Action check
We don’t have many annotations yet
There are tools that would let us add a lot pretty quickly by analyzing our test suite, etc.
But we probably won’t do this just yet, maybe in a couple of months
Feanil is somewhat excited about this as a way to catch potential problems early
Available for people who are excited about it, but any broader push for adoption will wait for at least a couple of months for other projects to settle down
It’s a tool to enforce best practices on interfaces, but we haven’t necessarily declared such best practices yet
If you find a good starting guide, consider adding to https://openedx.atlassian.net/wiki/pages/createpage.action?spaceKey=ENG&title=Learning%20Resources
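For readers who have not used type annotations yet, here is a small illustrative sketch of the kind of interface annotation a checker like mypy can verify. The function names and data are hypothetical and not taken from edx-platform.

```python
from typing import Dict, Optional


def get_display_name(user_id: int, profiles: Dict[int, str]) -> Optional[str]:
    """Return the display name for a user, or None if there is no profile."""
    return profiles.get(user_id)


def greeting(user_id: int, profiles: Dict[int, str]) -> str:
    """Build a greeting, handling the case where the user has no profile."""
    name = get_display_name(user_id, profiles)
    # Without this check, a type checker such as mypy would flag the string
    # concatenation below, because `name` may be None.
    if name is None:
        return "Hello, learner"
    return "Hello, " + name
```

The annotations document the interface (what goes in, what can come out) and let the checker catch the missing-None case before the code runs, which is the "catch potential problems early" benefit noted above.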
[quest](Feanil) What are people’s expectations of arch-hour?
Awkward silence
Updates on what other people are concerned about or coming down the pipeline
Figure out how architecture is handled at edX
Cross-team interaction
Insights into engineering task prioritization
Feedback on what architectural challenges people are facing
[quick question (hopefully)]: Is anyone familiar with the QTI spec? As in worked with it enough for me to ask some questions regarding structure/capabilities?
Not really, it seems
[inform] (Dave O) https://github.com/fanout/django-grip looks like a potentially useful alternative to channels
[question] (Jeremy B) Does anybody feel it’s worth investigating Django alternatives like FastAPI yet?
(Dave O) Feels like the data layer is the bigger performance problem
[discuss] (Dave O) Possible next steps to resolve database performance issues
It’s hard to optimize this locally without awareness of the overall context of a given request
We may need to more often create custom APIs rather than extending existing ones with new (possibly performance impacting) data
Monitoring is key, especially to catch regressions
People don’t have a good sense on thresholds for action required to improve performance
2021-09-01
++++[ideation] (Jeremy) How should we handle dependencies that are little used/maintained by others? Should we be more actively attempting to switch to alternatives or remove usage?
(djoy) What happened to the idea of having a comprehensive index of the packages we rely on, their licenses, etc? I recall we had a push to figure that out a while back.
(nim) There is a tradeoff today since owning teams are not feeling the pains of managing and upgrading their dependencies. There’s an opportunity to surface this back to the teams or find a better way to help teams be accountable.
+++[quest] (djoy) Curious about any updates on the hooks extension framework, event bus, splitting LMS and studio, etc. Things happening!?
Hooks - API for Django App Backend Plugins
Building on top of Django Signals
Focused on providing APIs for backend plugins
Targeting completion before Maple is released
Recent PR in edx-platform to be reviewed adds initial events that follow this pattern
Frontend Plugin Framework
Initial focus - iFrame-based plugins in MFEs
Next step
Find a use case: Could be Discussions in Learning MFE
How does this dovetail with LTI?
How do we define and document a plugin API?
Starting with a fundamental requirement of having a security sandbox around plugins - this means iframes.
Explored plugins via module federation
Proved complex
Impossible to put a sandbox around the code.
For plugins specifically, doesn’t have many uses that iframe plugins can’t support
Event Bus - Inter-service Communications
Working on POC
Draft OEP in progress
Robert, ChrisP, and Feanil as mini-team on this.
Is there any database-level replication work happening as part of the proof-of-concept?
O’Reilly book proposes that the event stream itself is the source of truth.
Which O’Reilly book are we talkin’ about? I see a few related to event driven architecture, I think.
What’s an initial use case the POC may look at?
Had started looking at Grades events as a possibility.
What are the use cases the POC is trying to solve?
Note: #event-bus slack channel recently created.
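As a rough illustration of the "event stream itself is the source of truth" idea raised above, the sketch below rebuilds current state purely by replaying events. It assumes a Grades-style event, since that was floated as a possible POC use case; the `GradeEvent` fields and the `replay` function are hypothetical and say nothing about what the POC or the OEP will actually define.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class GradeEvent:
    """Hypothetical event: a learner's grade in a course changed."""
    user_id: int
    course_key: str
    percent: float


def replay(events: Iterable[GradeEvent]) -> Dict[Tuple[int, str], float]:
    """Derive the current grade table by applying events in order.

    No separate database is consulted: the ordered event log is the only
    input, so any consumer can rebuild the same state from it.
    """
    state: Dict[Tuple[int, str], float] = {}
    for event in events:
        state[(event.user_id, event.course_key)] = event.percent
    return state
```

Replaying a prefix of the log yields the state as of that point in time, which is one reason this model can reduce the need for database-level replication between services.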
xAPI/Caliper
Edly is targeting Maple release cut for v1 of this.
Intended to be a foundation for integrating data across EdTech organizational boundaries.
Boundaries
Splitting LMS and Studio
Arch-BOM making changes to have Studio login through OAuth
Content Theme Architecture
++[ned] wiki space ownership? Good idea, or great idea?
2021-07-28
(5) ***** (discuss) [Dave O] Iframes, Chrome 92, and what we’re going to do about it.
Interim solution for half a year, then will permanently break
See TNL-8559
Potential solutions include making React more usable in our iframe applications, and custom-building the things we are trying to access from outside the iframe (ORA is running into this because we use window.confirm)
Need to backport the temp fix to Lilac?
Known Issues
ORA runs into this in `window.confirm`
LTI launch, open-in-new-window
Probably custom instructor code in various courses
Decision point: double-down on iFrames OR pivot to another way to embed JS components
Perspective: Google Chrome’s move is aligned with making iFrame technology more secure, which we can read as a signal that the industry will continue to advance in the direction of iFrames
Opportunity to seek further input
See where w3c groups are heading
Connect with IMS’ LTI working group
Embracing iFrames in our platform today
Learner MFE
LTI Plugins
Experience Plugins (XPs)
Next steps
DavidJ will continue to explore creating a structured interface for XPs.
T&L will implement a stop-gap solution that addresses the issue until December.
Possible approaches
Message passing approach
Backwards compatibility: Have a querystring flag that triggers the XBlock render code to add a snippet of JS that overrides things like window.alert() with a version that doesn’t try to take over the whole window, but instead launches a modal-like-thing for just the frame.
TNL will look at what instructor code might be affected after the patch goes out.
(4) **** (discuss) [Ben W] Implications of community engagement performance requirement on arch topics/work?
In other words, “Are there things currently not being covered by a working group that COULD be, in order to better make use of this org move?”
Frontend engineering working group formation starting up - with BenW, DavidJ, AdamS, and Nim - to tackle frontend technical direction and tasks.
Potential groups:
Data management
Wasn’t there an internal data guild that was started sometime in the last year?
****** Documentation
* Breaking up monolith
Eventing standards and cleanup
Plugin Authoring
Courseware
**** i18n
Does this fall under Frontend WG?
It’s about getting translations done, so I don’t think so
And the tech includes backend code also
* Testing best practices & technology & tools
Today’s Engineering Groups
* DEPR (internal -> become inclusive)
RCAs (internal)
eSRE (internal)
** Security WG (internal)
Built-Test-Release WG (making Open edX Named Releases efficient and effective)
** FedX Working Group (internal -> becoming inclusive)
Fun editorial note: I think this should become the “frontend working group” instead of “fedX working group” - “fedX” is frontend at edX. This group hopes to be more than that by inviting the community in. -djoy
Today’s other (non-engineering) groups
Open edX Marketing WG
Paragon Working Group (arguably design-led)
ACTION ITEMS
Ned Batchelder Add this to the Doc Hackathon Ideas sheet - to formulate WG charter, etc. [Ned]
Publicize wider (maybe in slack) for i18n, Documentation
Nimisha Asthagiri Add as agenda to reconnect on status in future Arch Hour
David Joy (collaborators?) - Spruce up our working group page - to update status, expectations, ideas, etc.
External: https://openedx.atlassian.net/wiki/spaces/COMM/pages/46793351
Is there an internal page?
(4) **** (discuss)[Ned, Usama] We seem to have two ways to use common-constraints.txt (copy into repo, or don’t). Is there common understanding about what to do?
Found the reason for it: updating the Django constraint locally conflicted with the global Django2.3 constraint, because newer pip-tools raises an error when there are multiple constraints for the same package instead of overwriting the previous constraint.
Tested the approach of pulling common_constraint locally, removing the Django constraint from it, and updating the constraint locally: https://github.com/edx/openedxstats/pull/107 .
Move ahead with this approach.
Now deciding whether to drop the common Django constraint and only have a local Django constraint in all the packages and services.
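The "pull the common constraints locally and remove the Django constraint" step could be sketched as below. This is an illustration only, not the code from the linked PR: the `drop_constraint` helper and the sample constraint lines are hypothetical, and the matching is deliberately simple (it does not normalize `-` versus `_` in package names).

```python
import re


def drop_constraint(common_text: str, package: str) -> str:
    """Return constraints text with any line constraining `package` removed.

    A line matches only when the package name is followed by a version
    operator or the end of the line, so removing "Django" leaves
    "django-storages" untouched.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(package) + r"\s*(?:[<>=!~]|$)",
        re.IGNORECASE,
    )
    kept = [line for line in common_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Hypothetical sample content; real files will differ.
    common = "# common constraints\nDjango<2.3\ndjango-storages<1.9\n"
    print(drop_constraint(common, "Django"), end="")
```

After this step, a repo can add its own Django line to its local constraints without pip-tools seeing two competing constraints for the same package.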