Arch Hours: 2021

Meeting Expectations

Why?

  • Provide an opportunity for generative discussion and ideas.

  • Foster camaraderie through technical curiosity and geekdom.

Who?

  • Open to all edX-ers and Arbisoft-ers

What?

  • At times, these informal discussions result in follow-up action and beneficial change in our technology or in our organization. While this is not a decision-making body, these serendipitous discussions spark ideas that may result in ADRs/OEPs and tickets on team backlogs.

  • At times, it serves as a form of informal office hours to ask live technical questions of the archeological collective.

  • At times, we have pre-planned deep-dive topics that folks propose to gather wide input or to answer questions.

  • At times, we have hosted special guests (internal and external to edX) on specialized topics.

When?

  • Not lunch hour in ET timezone: With Covid remote work, "Arch Lunch" has evolved into “Arch Hour” in order to accommodate various home/life situations during lunch time.

How? Live Co-Editing

To circumvent Confluence’s limitations with the maximum number of concurrent editors:

Why not just stick with keeping the notes in the Google doc?

  • Google docs are not as discoverable.

  • Google docs don’t notify observers of future edits.

  • Google doc comments don’t notify all observers.

How? Structure

Please enter your proposed topics for discussion.
When we use Lean Coffee Style (link1, link2), we vote on which topics the group wants to discuss and time-box the discussion to 10 or 15 minutes → 5 minutes (if re-voted) → 5 minutes (if re-voted).

Prefix your topic with your intention so we are clear on what outcome you are striving for from the discussion. Examples:

  • [inform] You are simply seeking to inform the group of this item. You may field clarifying questions from the group on your inform, but are not seeking further discussion at this time.

  • [ideation] You are seeking divergent and wide perspectives from this group. In this brainstorming mode, all ideas are accepted, without critical analysis.

    • It may be helpful to clarify whether you’d like to ideate on the problem space or the solution space.

  • [analysis] You are asking the group to help you poke holes in your idea/topic/plan/etc.

  • [quest] You are seeking information/responses to a question you have.

2021-12-22

  • [ideation] (David J) What do we need to change in Tutor to make it minimally useful for edX developers to start using?

  • [quest] (David J) What is paver used for today? Corollary, what is paver?

    • Paver is a python library for creating tasks to be executed from the command line

    • Used in edx-platform to run unit tests, build static assets, and run maintenance tasks.  Never got used anywhere else.

    • Still technically maintained.  Started moving away from it and running the underlying commands directly.  

    • We did not build paver.

      • There are two parts to this:

        • We didn't make paver, it's someone else's thing.  

        • We did write a lot of code to run things under paver.  It may look like we made it because we wrote so much on top of it.

    • Chain of thought: we need a task runner, Makefiles are the default but messy and shell based, we want to do something in Python, let's use paver.  Wait, this is complicated, let's go back to using Makefiles.

    • Makefiles and paver give you a way of saying "this command needs these other commands first"

      • Django management commands don't do that.  

      • The appeal was complex dependent-task management ("Only run this thing once even if several commands need it")

      • Problem with Make: needs to be implemented as shell scripts

    • Paver lets you use python files instead of shell scripts.

    • When the project was first starting out, we wanted anyone to be able to run it anywhere.  A lot of initial goodwill - as the reality of that set in, we decided supporting windows is hard and not valuable for the amount of investment we'd have to do to keep it working.  Usage of paver stemmed out of that, but then getting back out became too big for any team to take on.
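As a rough illustration of the "only run this thing once even if several commands need it" appeal noted above, here is a minimal pure-Python sketch of run-once dependent tasks. It does not use paver itself; the `task` decorator, the `run` helper, and the task names are all hypothetical, and there is no cycle detection.

```python
# Hypothetical registry: task name -> (function, list of prerequisite task names).
TASKS = {}


def task(*needs):
    """Register a function as a task that requires `needs` to run first."""
    def decorator(func):
        TASKS[func.__name__] = (func, list(needs))
        return func
    return decorator


def run(name, done=None, log=None):
    """Run `name` after its prerequisites, executing each task at most once."""
    done = set() if done is None else done
    log = [] if log is None else log
    if name in done:
        return log  # Already ran as someone else's prerequisite.
    func, needs = TASKS[name]
    for dep in needs:
        run(dep, done, log)
    func()
    done.add(name)
    log.append(name)
    return log


@task()
def install_prereqs():
    print("installing prerequisites")


@task("install_prereqs")
def build_assets():
    print("building static assets")


@task("install_prereqs", "build_assets")
def run_tests():
    print("running unit tests")
```

Calling `run("run_tests")` executes `install_prereqs` a single time even though two tasks declare it as a prerequisite, which is the behavior plain Django management commands do not give you.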

  • [quest] Enterprise team is looking to best leverage feature toggles to do more flexible (per customer or per subscription) toggling of features. A lot of these are done via config models right now. I wanted to get a base understanding of what best practices are in this area (reading edx-toggles gave me words like: Waffle, Django setting, SettingToggle). Also, want to brainstorm on how frontends can best leverage these settings. Prior art? link to edx-toggles

2021-12-01

  • [quest/ideation] (Diana) - Devstack data project - who has a good use case that we should tackle?

    • Have a prototype that will load data into devstack

    • Arch-BOM were hunting around for a team with a good use case

    • Reach out to #arch-bom if you have a use case

  • [Ideation] (Justin Lapierre) How to set up a course/configuration in Devstack for QA in a reliable, reproducible way 

  • [ideation] (David J) As the balance of edX devs to core contributors shifts in the future, how do we think that affects the risk profile of our CI/CD pipelines deploying off of master, and what might need to change? :eyes:

    • "Support not changing things until there's a problem"

    • We may not be the only people deploying from master - what if _we_ break someone else's systems?

    • SWG perspective - how do we do security releases reliably?

    • How do we know we're in a more risky situation?

      • Added complexity of pre-empting this problem is very expensive, so wait on the RCAs

    • "Eventual trouble is probably inevitable": we don't know what the problem will be, but we may be able to game out how we might deal with it.  There's no code change we can't roll back (putting aside malicious stuff).  Data migrations are scary because they can be 100% destructive and irreversible, and/or wildly unperformant.

      • We have point-in-time recovery to the minute (!!!)

    • Does GitHub have a canonical solution to what edx-platform-private solves today?

      • "Temporary private forks"

        • Jinder said Feanil or others looked into this early on and it wasn’t ready.

        • May still be hard for GoCD to deploy off of private forks.

    • Idea: "migration files require an extra approval" via CODEOWNERS
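To make the "extra approval for migration files" idea concrete, here is a small sketch of the rule itself, written as a check that a CI job could run. It is not CODEOWNERS syntax; the function names, the approval counts, and the example paths are assumptions for illustration only.

```python
from pathlib import PurePosixPath


def touches_migrations(changed_paths):
    """Return the changed Python files that live under a `migrations/` directory."""
    flagged = []
    for path in changed_paths:
        parent_dirs = PurePosixPath(path).parts[:-1]
        if "migrations" in parent_dirs and path.endswith(".py"):
            flagged.append(path)
    return flagged


def required_approvals(changed_paths, base=1, extra_for_migrations=1):
    """Return how many approvals a change needs under the proposed rule."""
    if touches_migrations(changed_paths):
        return base + extra_for_migrations
    return base
```

For a pull request that changes a file such as `some_app/migrations/0002_example.py` (a made-up path), `required_approvals` would return 2 instead of 1.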

  • [inform] (Adam B) - Spreading this info more widely: If you want to add users with granular permissions to ecommerce you can now do so through app-permissions, e.g. like this

  • [ideation/analysis] (David J) In spirit, over time edX employees could be thought of as core contributors themselves.  The CC program has a time commitment (20 hrs/month).  Could edX engineers/others be CCs?  What would that look like for product delivery teams?

2021-11-24

  • [analysis] (Nathan Sprenkle) Internally-routed XBlock handler calls

  • [Ned] Meta-question: are there any more questions or discussion following on from the Eng All-Hands?

    • Should we be planning for more repo changes programmatically?  How will 2U people be able to make changes to things like branch permissions?

    • What’s happening to BD projects?

      • Some are moving to TCRIL, others are staying with edX

  • [question] (Jeremy) If you could nominate 1-2 things to improve the experience for new Open edX developers, what would they be?  (We’ll likely have many such people soon.)

    • [dkh] We should pick up the thread of the onboarding work Feanil recently did

    • [djoy] Better/updated seed data for freshly provisioned devstacks/services

    • [djoy] push button environment creation and simple installation/setup instructions and documentation

    • [adam] Make things more consistent (devstack vs sandboxes vs prod, bokchoy vs cypress)

    • [andy] tests that work outside of a devstack shell

    • [andy] devstack as cattle not as pet

    • [andy] meta improvement: measure time lost to devstack to motivate investment

    • [djoy] Is there a third-party development environment stack, built on libraries so popular they're effectively standards, that we could adopt so that someone else maintains our development stack for us? (This is clearly not a small thing.)

  • [Ned] Can TCRIL participate here?

    • Probably, let’s enable that

  • [Adam] question: Is there anything stopping us from moving https://build.testeng.edx.org/job/edx-e2e-tests/ to the new tools-automation cluster?

    • Sub question, if someone changes something in a repo that will break an e2e test how can we make it faster to fix the test?

      • David Joy: Maybe a GHA?

      • Diana: Or maybe we can just remove the tests that trip us up

      • David Joy: e2e tests should be few and far between, so we should only use them for critical path tests.

2021-11-03

  • ++[analysis] (Dave O): Extracting a low-level learning core out of edx-platform and into a new repo.

    • (Original post/thread).

    • Motivation:

      • Create a smaller/simpler dependency to build extensions on top of (instead of edx-platform). This means smaller, more stable APIs that can add incremental value, instead of the backloaded benefits of removing stuff from edx-platform.

      • Promote innovation by making it easier to create different experiences on top of Open edX (like LabXchange).

      • Advance the Studio/LMS split (these would be the core of the LMS).

    • Some potential apps: publishing, navigation, policy, composition (what’s in a Unit for this user?), scheduling, partitioning (what users are in what groups for various tests).

    • Strategy for dealing with tricky extractions: Core data models and frameworks exist in new repo. Plugins are implemented in edx-platform. Examples:

      • For navigation, an outline processor framework exists in the new repo, but an EnrollmentOutlineProcessor exists in edx-platform, keeping knowledge of enrollments out of the new core.

      • For partitioning, the data models to store user/partition mappings live in the new core, but actual partition bucketing logic remains implemented in edx-platform.

    • Strategy for data migration: Start with content data, that can be rebuilt/backfilled from Modulestore. We do this kind of thing all the time already.

    • Feedback:

      • Robert: In general, like the idea. Where to start: serving particular needs, like course overview-like data, verify that it serves the needs. Interfaces vs. implementations: do we need both? E.g. for course overviews, does it just need to move or will work need to be done–define an interface for a mocked version?

      • Jeremy: Any idea of things currently installed in edx-platform that directly call edx-platform–what are they doing?

        • Course overviews

        • Things that reach into modulestore for lack of better APIs that we should probably create

        • Scheduling

      • Jeremy: might it be worth identifying APIs in edx-platform that are the main code called by many other parts of the platform, so extracting it could allow all those to also be extracted?

        • There may be some cases where we move enough of the core of such an app (models, etc.) to a separate repo, but leave all the tangled implementation details in place to minimize the up-front work needed.
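The "framework in the new repo, plugins in edx-platform" strategy for navigation could look roughly like the sketch below. The class and method names are hypothetical and much simpler than any real outline processor code; only the name EnrollmentOutlineProcessor comes from the notes above.

```python
from abc import ABC, abstractmethod


class OutlineProcessor(ABC):
    """Framework interface that would live in the new core repo (hypothetical API)."""

    @abstractmethod
    def inaccessible_sequences(self, user, outline):
        """Return the set of sequence ids this user must not see."""


def visible_sequences(user, outline, processors):
    """Core logic: apply every processor without knowing anything about enrollments."""
    hidden = set()
    for processor in processors:
        hidden |= processor.inaccessible_sequences(user, outline)
    return [seq for seq in outline if seq not in hidden]


class EnrollmentOutlineProcessor(OutlineProcessor):
    """Plugin that would stay in edx-platform, where enrollment data lives."""

    def __init__(self, enrolled_users):
        self.enrolled_users = set(enrolled_users)

    def inaccessible_sequences(self, user, outline):
        # Unenrolled users see nothing; enrolled users are not restricted here.
        return set() if user in self.enrolled_users else set(outline)
```

The point of the split is that `visible_sequences` and `OutlineProcessor` never import anything enrollment-related, so the core stays small while edx-platform supplies the behavior.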

  • + [quest] (Jeremy) Docker Desktop changes - how does this influence build vs. buy decisions?

    • Docker is starting to charge for Docker Desktop usage for orgs like ours

    • There are possible alternatives like Minikube, but we haven’t really evaluated how well they would work for us

    • Quite possibly worth edX paying for this, not as clear for Open edX at large

    • Jeremy discussed this with Régis; we agreed it needs to be discussed/resolved, but didn’t come up with any immediate answers

    • BTR working group at large hasn’t discussed this yet

    • Does moving to Codespaces or something similar change this

    • Costs ~$21/user/month for 50+ user orgs

    • Open edX will probably support multiple options in the future, but edX is likely to pick a default for its own developers

    • Probably just paying the license fee for now, to free up resources for acquisition-related stuff

  •  Has anyone tried Tutor or Codespaces?

2021-10-06

  • [discuss] [Jeremy] mypy: where are we, where are we going

    • We’re running this for edx-platform in an optional GitHub Action check

    • We don’t have many annotations yet

    • There are tools that would let us add a lot pretty quickly by analyzing our test suite, etc.

      • But we probably won’t do this just yet, maybe in a couple of months

    • Feanil is somewhat excited about this as a way to catch potential problems early

    • Available for people who are excited about it, but any broader push for adoption will wait for at least a couple of months for other projects to settle down

    • It’s a tool to enforce best practices on interfaces, but we haven’t necessarily declared such best practices yet

    • If you find a good starting guide, consider adding to https://openedx.atlassian.net/wiki/pages/createpage.action?spaceKey=ENG&title=Learning%20Resources  
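For anyone who has not used annotations yet, here is a tiny made-up example of the kind of interface problem mypy catches early. The function names and data are hypothetical and not taken from edx-platform.

```python
from typing import Dict, Optional


def course_display_name(course_names: Dict[str, str], course_key: str) -> Optional[str]:
    """Look up a display name; the return type says the key may be missing."""
    return course_names.get(course_key)


def heading(course_names: Dict[str, str], course_key: str) -> str:
    """Build an upper-case heading for a course."""
    name = course_display_name(course_names, course_key)
    # Without this check, mypy reports that None has no attribute "upper".
    if name is None:
        return "UNKNOWN COURSE"
    return name.upper()
```

The annotation on `course_display_name` documents the interface, and the type checker then forces every caller to handle the missing-course case.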

  • [quest](Feanil) What are people’s expectations of arch-hour?

    • Awkward silence

    • Updates on what other people are concerned about or coming down the pipeline

    • Figure out how architecture is handled at edX

    • Cross-team interaction

    • Insights into engineering task prioritization

    • Feedback on what architectural challenges people are facing

  • [quick question (hopefully)]: Is anyone familiar with the QTI spec? As in worked with it enough for me to ask some questions regarding structure/capabilities?

    • Not really, it seems

  • [inform] (Dave O) GitHub - fanout/django-grip: Django GRIP library looks like a potentially useful alternative to channels

  • [question] (Jeremy B) Does anybody feel it’s worth investigating Django alternatives like FastAPI yet?

    • (Dave O) Feels like the data layer is the bigger performance problem

  • [discuss] (Dave O) Possible next steps to resolve database performance issues

    • It’s hard to optimize this locally without awareness of the overall context of a given request

    • We may need to more often create custom APIs rather than extending existing ones with new (possibly performance impacting) data

    • Monitoring is key, especially to catch regressions

    • People don’t have a good sense on thresholds for action required to improve performance

2021-09-01

2021-07-28

  • (5) ***** (discuss) [Dave O] Iframes, Chrome 92, and what we’re going to do about it.

    • Interim solution for half a year, then will permanently break

    • Potential solutions include making React more usable in our iframe applications and custom-building the things we are trying to access from outside the iframe (ORA is running into this because we use window.confirm)

    • Need to backport the temp fix to Lilac?

    • Known Issues

      • ORA runs into this in `window.confirm`

      • LTI launch, open-in-new-window

      • Probably custom instructor code in various courses

    • Decision point: double-down on iFrames OR pivot to another way to embed JS components

      • Perspective: Google Chrome’s move is aligned with making iFrame technology more secure, which we can read as a signal that the industry will continue to advance in the direction of iFrames

      • Opportunity to seek further input

        • See where w3c groups are heading

        • Connect with IMS’ LTI working group

      • Embracing iFrames in our platform today

        • Learner MFE

        • LTI Plugins

        • Experience Plugins (XPs)

    • Next steps

      • DavidJ will continue to explore creating a structured interface for XPs.

      • T&L will implement a stop-gap solution that addresses the issue until December.

      • Possible approaches

        • Message passing approach

        • Backwards compatibility: Have a querystring flag that triggers the XBlock render code to add a snippet of JS that overrides things like window.alert() with a version that doesn’t try to take over the whole window, but instead launches a modal-like-thing for just the frame.

          • TNL will look at what instructor code might be affected after the patch goes out.

  • (4) **** (discuss) [Ben W] Implications of community engagement performance requirement on arch topics/work?

    • In other words, “Are there things currently not being covered by a working group that COULD be, in order to better make use of this org move?”

    • Frontend engineering working group formation starting up - with BenW, DavidJ, AdamS, and Nim - to tackle frontend technical direction and tasks. 

    • Potential groups:

      • Data management

        • Wasn’t there an internal data guild that was started sometime in the last year?

      • ****** Documentation

      • * Breaking up monolith

      • Eventing standards and cleanup

      • Plugin Authoring

      • Courseware

      • **** i18n

        • Does this fall under Frontend WG?

          • It’s about getting translations done, so I don’t think so

          • And the tech includes backend code also

      • * Testing best practices & technology & tools

    • Today’s Engineering Groups

      • * DEPR (internal -> become inclusive)

      • RCAs (internal)

      • eSRE (internal)

      • ** Security WG (internal)

      • Built-Test-Release WG (making Open edX Named Releases efficient and effective)

      • ** FedX Working Group (internal -> becoming inclusive)

        • Fun editorial note: I think this should become the “frontend working group” instead of “fedX working group” - “fedX” is frontend at edX.  This group hopes to be more than that by inviting the community in. -djoy

    • Today’s other (non-engineering) groups

      • Open edX Marketing WG

      • Paragon Working Group (arguably design-led)

    • ACTION ITEMS

      • Ned Batchelder Add this to the Doc Hackathon Ideas sheet - to formulate WG charter, etc. [Ned]

      • Publicize wider (maybe in slack) for i18n, Documentation

      • Nimisha Asthagiri Add as agenda to reconnect on status in future Arch Hour 

      • David Joy (collaborators?) - Spruce up our working group page - to update status, expectations, ideas, etc.

  • (4) **** (discuss)[Ned, Usama] We seem to have two ways to use common-constraints.txt (copy into repo, or don’t). Is there common understanding about what to do?

    • Found the reason for it: updating the Django constraint locally conflicted with the Django2.3 global constraint, because of new pip-tools behavior that raises errors if there are multiple constraints instead of overwriting the previous constraint.

    • Tested the approach to pull common_constraint on local, remove django constraint from it and update constraint locally: build: Download the common constraint locally. by awais786 · Pull Request #107 · openedx-unsupported/openedxstats

      • Move ahead with this approach. 

    • Now, deciding whether to drop the common django constraint and only have a local django constraint in all the packages and services?
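The tested approach (download the common constraints, remove the Django constraint, then constrain Django locally) could be sketched as below. This is not the code from the pull request mentioned above; the function name and the example pins are invented, and the matching is deliberately simple (it does not handle extras or environment markers).

```python
import re


def drop_constraint(constraints_text, package):
    """Return the constraints text without any line that constrains `package`."""
    # Match the package name at the start of a line, followed by a version
    # operator or the end of the line, so "django-storages" is not mistaken for "Django".
    pattern = re.compile(rf"^\s*{re.escape(package)}\s*([<>=!~]|$)", re.IGNORECASE)
    kept = [line for line in constraints_text.splitlines() if not pattern.match(line)]
    return "\n".join(kept) + "\n"


# Invented example content standing in for a downloaded common constraints file.
common = "# common constraints\nDjango<3.0\ndjango-storages<1.9\n"
local = drop_constraint(common, "Django") + "Django<3.3\n"
```

After this step only one Django constraint remains in the combined file, which avoids the multiple-constraints error described above.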

2021-07-14

  • (7) ******* [Question, Ned] tags vs GitHub releases to make Python lib releases?

  • (5) ***** [Quest, BenW] How do we make the case to the organization that Paragon is a valuable thing to be maintained and have stable ownership?

    • There is confusion over who needs to make the call that we need at least 50% of someone’s time to do this

    • Feanil will help try to get the right people talking to each other

      • Part of this involves writing down the arguments for having such an ownership role

  • (3) *** [Question, Ned] What prep do people need for the “Doc Love” hackathon?

    • List of docs that should be written or need updating?

    • Doc personas?

    • Docs in confluence: what extensions are available, which might be useful? Get IT approval early.

  • (2) ** [Ideation, DaveO] Lifting out a subset of edx-platform to make extension development easier.

2021-07-07

  • (1) *[Inform] Doc hackathon planning is underway. Want to help? (I guess that’s a question)

  • (Discussed) [inform] (David & Nimisha) Tech Radar Workshop next steps and ongoing work

    • Radar has been created but needs descriptions and ring placement (trial, adopt, etc).

    • We’ve categorized (by quadrant) and simplified the blips down to a set which feels correct for our first iteration.  The next steps are to write up descriptions for each blip and decide what rings they go into.  More on this soon!

  • (5) ***** [quest] (Matt T): What are the biggest challenges our open source community fights that we can fix before we become the open source community?

    • (Ned) I’m interested in the emphasis on “become” :)

    • (Adam Blackwell) In the very niche repo of configuration, community members have to fight to get things added that we don’t use.  The general ~2-10 repo month PR flow is write a PR, wait a while for a review, merge it, revert it if it breaks http://edx.org things, then put a new PR with it feature gated, then merge it.

      • One way to fix some of these issues might be to containerize things and use docker images that inherit from other images?

    • (Adam Blackwell) I’m curious about what http://edx.org specific code is or isn’t in a separate React frontend or Django service

      • (djoy) Many MFEs have hard coded URLs to edx.org-specific support articles, or i18n strings with edX-specific entities (MicroMasters, http://edx.org , etc.) hard coded into them.  We also have places where we’ve coded in particular third party providers that we use that others don’t, such as Cybersource.  It’s more a whole laundry list of small things, often.

    • Nimisha:

      • I see at least three groups of problems:

        • 1 - Adding new features without modifying the core.

        • 2 - Customizing existing features without modifying the core.

        • 3 - Deployment and operations.

          • Multi-tenancy

          • Managing thousands of sites

          • Upgrades every 6 months

          • Enabling/disabling features

        • (Adam) What is one example of an SRE/Deployment challenge for Open edX?

  • [ideation] (Dave O): Would it make sense to see if we can lift out a small, relatively sane subset of edx-platform to make writing extensions easier (and have edx-platform use that)? Things like learning_sequences, user partitioning, scheduling in one place?

    • Could we do this by way of an import linter that people could adopt (e.g., list all the things they're allowed to import from edx-platform)?

      • I was actually hoping to make it so that this is a new thing that never imports from edx-platform. :-P But maybe as a stepping stone?

        • Right, I’m thinking the import linter lets us test it out and see which parts actually need to be pulled out?  Maybe it’s already obvious though.
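A minimal sketch of the import-linter idea, assuming an allowlist of edx-platform modules that extensions may import. The allowlist entry, the list of top-level package names, and the function name are illustrative assumptions, and relative imports are ignored.

```python
import ast

# Hypothetical allowlist: the only edx-platform modules an extension may import.
ALLOWED_PREFIXES = ("openedx.core.djangoapps.content.learning_sequences.api",)
# Assumed top-level packages that count as "edx-platform" for this check.
PLATFORM_ROOTS = ("lms", "cms", "openedx", "common", "xmodule")


def disallowed_imports(source):
    """Return edx-platform module names imported by `source` that are not allowlisted."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            in_platform = name.split(".")[0] in PLATFORM_ROOTS
            allowed = any(name == p or name.startswith(p + ".") for p in ALLOWED_PREFIXES)
            if in_platform and not allowed:
                found.append(name)
    return found
```

Running this over an extension's source files would show which parts of edx-platform are actually being reached into, which is the "stepping stone" use discussed above.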

2021-06-30

  • (3) *** [inform] (Jeremy) Upgrades in preparation for Maple:

  • (6) ****** [ideation] (Nim) Documentation strategy - in prep for hackathon and follow-ups from discussion in #institutional_knowledge

    • OEP-19 Principles

      • Distinguish between temporal versus permanent information

      • Co-location (docs and code)

      • Versioning

      • Ongoing Maintenance

    • Open questions

      • 1 - Discoverability

        • Idea: hierarchy of docs

        •  An independent search tool that searches across all things.

          • We’ve investigated this in hackathons, with Elastic Enterprise Search being a leading contender at the time, but IT approval is tricky due to the security/privacy ramifications

          • We have a Google Custom Search, but it’s a bit wonky: https://docs.edx.org/search  

            • Only searches public docs, and misses some of those if we forget to add them to the widget’s configuration

            • It is still not publicized from docs, which makes it difficult to learn from.

        • Centralizing our docs is probably best if pan-search tool isn’t feasible. Thinking about docs as a tree structure, there should ideally only be 1 “top node”, as in, you go to confluence, and everything is there (this doesn’t include code comments obvi). Searching across several “top nodes” (google docs, confluence, and so on) hurts discovery  ← opinion piece, but *shrug*

          • One root only makes sense if you have one kind of audience?

      • 2 - Non-technical docs ?  Follow the same standards as technical docs or something different?

        • I think we cannot ask other functions to follow Eng doc processes unless there is a clear reason for them to switch away from what they’re doing. Why would product managers start writing rST?

        • The Spec/Approach process seems to be working well for product -- if we could get those in Confluence (for discoverability) then that would be a win.

      • 3 - Tools: Confluence, Google, GitHub, Jira

        • Confluence/Google/Jira: Use these tools strictly for docs that are internally-oriented, temporal, or both.

        • Many of our doc decisions are intended to work around Confluence limitations (poor search, limited concurrent editing, etc.); may be worth considering a different wiki system

        • I think there’s some general guidance about “when to use which tool” but it might be useful to give clearer guidance and alignment so we have less fractured docs. 

        • Every squad maintains an up-to-date homepage with links to their own relevant internal/temporal docs

      • 4 - Access: Open edX versus edX.org, Org-specific-structure access (Security, Squad-specific, etc)

        • I think it could also be a valuable exercise to user-map, figure out who uses our documentation and what they need out of it.

          • E.g. Open edX, edX internal developers, students, course authors. These are already sort of broken up but having a really strong idea of what they need and where they go to access it could really help clear up the “charter” for each piece of documentation.

        • We currently allow http://edx.org specific notes in Open edX docs that others may learn from. Maybe we could not only make this explicitly allowed in the OEP, but acknowledge that it doesn’t have to be limited to http://edx.org . Maybe we have Core Orgs?

        • Idea: for http://edx.org specific technical decisions that would have been an OEP, create a separate closed-sourced repo for http://edx.org OEP-wide decisions.

          • This way, the other benefits of GitHub can apply.

      • 5 - Ownership of documentation as a framework

        • Each doc should have a stated owning team

        • Each team should have a list of all docs they own

        • There should be guidelines on what doc ownership entails

    • Change Management

      • Adoption

        • by other functions as well (Product, UX, etc) since cross-functional docs also need decisions.

    • ACTION Nathan, Ned, Feanil - propose a “just-enough” doc strategy for us before the summer’s hackathon so we have an aligned direction to drive during the hackathon.

  • (5) ***** [quest] (Kyle) Do we foresee edX being on a “level field” with other community members (OpenCraft, Raccoon Gang, et al) in terms of Open edX core committer rights? Specifically, do we think edX employees and, say, OpenCraft employees will have the same requirements to become CCs?

    • Nim: One of the principles from the Core Committer program:

      • Hold an equal bar for both edX engineers and community engineers, in the long-term. In the future, for instance, edX engineers might earn merge rights just as other contributors to the platform.

    • Nim: For extensions to the core, owning squads as admins of their own repos, can make autonomous decisions on merge rights.

    •  ACTION : Follow up on the B(oundary) part of BEES.

      • Kyle

      • Dave O

      • David Joy (may end up focused on one of the E’s)

      • Nim 

      • Adam Blackwell if it relates to SRE work or just to learn more about boundaries

2021-06-23

  • [discuss](Brian) Are we taking steps to remove our dependency on MongoDB?

    • Not aggressively, but yes

      • modulestore to S3 and something something?

        • Have almost gotten permission to get rid of last courses using Old Mongo

        • Static assets - we could move this to S3/django-storages

        • Active versions index - Braden working on a PR to move to a Django model

        • Course structure definitions - we could move this to S3/django-storages

      • forums to alternative forum software

      • Would be difficult to get this all done in time for Maple, probably not worth the push

  • [discuss](Jeremy) Are there any new features people are particularly looking forward to in newer versions of Django/Elasticsearch/Mongo/Node?  Or are people generally not keeping up with these release notes?

    • This seems like a nice thing to keep in a knowledge base style wiki

    • Dave would like to see some more targeted usage of Python type checking, building on Regis’s mypy work in edx-platform

    • Jeremy is curious if new-style Django URL configuration would help our regex-related startup performance issues in edx-platform

  • [discuss](Jeff) How do we want to evolve automated a11y testing?

    • We last upgraded axe-core 18 months ago

    • We haven’t updated our set of a11y CI tests in quite a while

    • Our tools are pretty out of date at this point, there are new ones available

    • Let’s turn off the a11y tests in Jenkins and GitHub Actions

    • Jeff will work on new tooling that works for him, and we’ll see if it makes sense to add it to CI

    • Jeff will create 2 ADRs:

      • One for turning off existing a11y job

      • Second one for what we decide to do instead once we’ve decided

2021-06-16

  • ****[quest] (Jeremy) How painful are these ‘make upgrade’ pull requests? Should they be auto-merged?

    • Today: they are automatically created, not automatically merged nor deployed.

    • Similar situation exists for Renovate upgrade PRs for frontend repos (e.g., could be configured to automerge patches).

    • We should probably auto-merge ones that pass tests

      • Maybe only during Cambridge working hours?

      • If you want careful control over a particular dependency’s version, consider keeping a constraint on it

      • Should have the option to pause automated upgrades during sensitive times (major development project, etc.)

    • Changelog review is slightly automated, but not as easy as it should be

    • Maybe an allowlist to control which packages are allowed to be upgraded without review?

    • Assume SemVer and only auto-merge non-major version upgrades?

    • Should we try to get Renovate to work with our Python dependency management?
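The auto-merge ideas floated above (tests must pass, optional allowlist, SemVer non-major only) can be combined into one small policy function. This is only a sketch of the proposal, not an existing tool; the function names and the example packages and versions are assumptions.

```python
def major(version):
    """Return the major component of a SemVer-style version string."""
    return int(version.split(".")[0])


def should_auto_merge(package, old, new, tests_passed, allowlist=None):
    """Decide whether an upgrade of one package may merge without human review.

    `allowlist` is an optional set of package names; None means every package is eligible.
    """
    if not tests_passed:
        return False
    if allowlist is not None and package not in allowlist:
        return False
    # Assume SemVer: only non-major upgrades are considered safe to auto-merge.
    return major(new) == major(old)
```

A pause switch for sensitive times or a working-hours check could be added as further early returns in the same function.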

  • ***[ideation] (Beggs) How can we build a better experience/automation for UX to validate design work done in PRs and give signoff

    • Two primary options

      • Spin up an entire devstack/sandbox environment for MFE PRs

      • Create mock data for MFEs that can be used to allow the MFE to run without any of its dependencies, then deploy it to S3/Netlify and put a link to the environment on the PR.

        • The group preferred this option, since it could be done on a case-by-case basis in MFEs that need it, and doesn’t rely on us working through the hard problems of spinning up a “complete” environment on every PR.  Deploying an MFE takes 5 minutes, deploying a complete sandbox could take an hour. 

  • ** [ideation](Feanil) How can we keep docs up-to-date/evergreen?

    • Hypothesis: Confluence has a lot of stuff that people are afraid to remove/change because very few people feel like they have sufficient context.

    • Idea: Extend ownership to confluence?

    • Should we look again into alternatives to Confluence?

      • Much of why we use Google Docs is due to limitations in Confluence (concurrent editors, etc.)

      • We need a custom search widget (which can’t find private pages) partially because Confluence search is terrible

    • OEP-19 (developer docs) is out of date in some respects; mentions that we use Google Docs, but doesn’t recommend it for anything even though we actually do recommend it for some things

2021-06-02

  • *** [inform] (David and Nimisha): Tech Radar workshop next Arch Hour on June 9!

  • *** * [quest] (Jeremy) Upgrade

    • assistance menu options for squads (for Django 3.2 upgrade, etc.)

      • We’ll do it ourselves

      • Do the work, we’ll review the PRs

      • Do it for us, we don’t need to review the PRs

        • Does this include deployments?

      • Other?

    • resourcing: centralized team versus distributing across teams

      • Centralized team enables efficiency

        • + develop a center of excellence and reduce cognitive load

        • + can identify and develop automation/tooling

      • Individual teams are better if the upgrade requires team-specific domain knowledge

    • Django 3.2 - can take advantage of new ways of doing async

      • Would require removing Django 2 support to leverage this.

      • Idea: present this at all-hands once available for usage.

  • *** [discuss] (ned) should we continue automated “make upgrade” for libraries?

    • What if the XBlock library doesn’t pin the test requirements in its own library?

    • Proposal: let XBlock unpin the test requirements; edx-platform will use the latest version when the XBlock is eventually updated in edx-platform.

    • Note: global constraints file exists

  • ***[discuss] (nim) eventing ADR - versioning, in particular 

    • https://github.com/eduNEXT/openedx-events is for in-process core platform events.

    • If Django App Plugins or other backend extensions have their own event APIs, they would publish them in their own app. Maybe within api.py or a separate events.py or something else. @Matt Tuchfarber (Deactivated) can propose in his Django App OEP.

    • Versioning of our events

      • Let’s be consistent with versioning, as described in OEP-41.

        • Major version embedded in the name.

        • Minor version included in the event payload.

      • Note that for xAPI/Caliper events, the name is from the specs. But their payloads would include version numbers.

      • We would follow this for frontend events as well. @David Joy (Deactivated) @Adam Stankiewicz
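The versioning convention above (major version in the event name, minor version in the payload) might look roughly like this. It is a sketch under assumptions: the `.v1` suffix, the `minor_version` key, the example event name, and the compatibility rule are all illustrative and are not taken from OEP-41 or openedx-events.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """Illustrative event; field names and naming scheme are assumptions."""

    base_name: str
    major: int
    minor: int
    data: dict = field(default_factory=dict)

    @property
    def name(self):
        # Major version is embedded in the name, so a breaking change
        # produces a distinct event that old consumers never receive.
        return f"{self.base_name}.v{self.major}"

    def payload(self):
        # Minor version travels with the data, since minor changes are
        # assumed to be additive and backwards compatible.
        return {"minor_version": self.minor, **self.data}


def is_compatible(event, subscribed_name, built_against_minor):
    """Assumed rule: same versioned name, and payload minor at least as new."""
    return event.name == subscribed_name and event.minor >= built_against_minor
```

For example, `Event("org.example.enrollment.changed", 1, 2, {"user_id": 7})` would be published as `org.example.enrollment.changed.v1` with `minor_version` 2 in its payload.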

  • ** [adam bl, ideation] How might we enable continuous deployment 24/7?

    • Would it help to do so by improving our on call runbooks?

    • Need more info: What is the current CD status of each of our services? 

    • Types of services

      • MFEs

        • They deploy to stage upon merge, but need a manual click to push to prod.

      • Microservices

        • Teams independently make their own decisions.

        • Some IDAs deploy to stage automatically, but manual process to release to production.

      • Monolith

        • Until recently, we alerted recent-mergers.

2021-05-19

2021-05-12

  • [Quest] (Ben W) How to ensure good UX and UI processes now that we are merging the functions into Product Designers?

    • Why

      • The current state of ownership responsibilities across themes/squads/etc was messy for UI/UX folx.

      • We were already seeing UI and UX looking at the holistic process.

    • Rollout

      • There will be an initial slow-down since people will be developing skills.

        • The designer will need to clarify and communicate that this slow-down will be there.

      • Stacey and AdamBu are developing an education program to upskill the team.

    • Process

      • Onus is on the Designer to ensure they are getting the proper reviews.

      • Organizational structure - theme-specific allocations.

  • [inform] (Nim/David) Prep for working session(s) at Arch Hour for Tech Radar.

  • Tracking dependencies of repos over time - Tidelift provides this support.

2021-05-05

  • [inform/request] (DaveO): I’m working on an ADR to remove modulestore usage from all LMS apps except courseware. If your LMS app needs structural/content data from modulestore to do its job (as opposed to course keys or course config settings), can you please describe your use case?

    • The performance of modulestore queries can really vary, so it seems best not to use it directly whenever possible

    • [Nim] FYI on Cale’s type-checking PR, which he hoped would eventually help us find callers to Modulestore: https://github.com/edx/edx-platform/pull/26985

  • ++++++ [inform/ideation] (Jeremy): Suggestions/feedback on candidate projects for Arbi-BOM: https://openedx.atlassian.net/wiki/spaces/AT/pages/2689925443

  • ++++[ideation] (djoy) How can we make our developer docs more obviously part of our process?  Do people like readthedocs? Is there an amazing alternative out there? Can we make it easier to contribute?  To find what you’re looking for?  Have you contributed?  I haven’t!  When you did, why did you do it, and how did you know to?

    • Readthedocs++, can do markdown, RST, etc.  

    • edx-documentation repo organization is not intuitive to developers

      • focused on end user and broader community documentation

    • OEPs on developer documentation… still need to get things to match the new pattern

    • Notes transferred here: https://openedx.atlassian.net/wiki/spaces/AC/pages/2725348094

  • ++++[quest/ideate] (Awais) How can we make sandboxes more useful for different services (adding data for ecom, credentials, or discovery)?

    • [Feanil] I think this is a worthwhile investment but there is a question of knowing what to add.

    • Arch-BOM tried to do this from Dev Data because teams don’t have time to prioritize this work at the moment.

    • Is there a gap here because the teams with the domain knowledge don’t feel as much pain?

      • Currently it looks like the team just accepts the sunk cost even if they do have the domain knowledge.

2021-04-28

  • ++ [quest] [Feanil/Jazib] How mature is our k8s stuff, should existing work be moving to it?

  • +++++++++ [quest] [djoy] What do we think of the idea of non-python backends here at edX? 

    • I was mulling this over this morning - not talking about anything specific or looking for permission.

      • Why? Skipping for today.

      • Possibility of using Node for specific *types* of services (MFE translators)

      • Pros:

        • Best tool for the job may not be Python

        • Mitigate risk of Python and/or Django stagnating relative to alternatives

        • May allow use cases that Python/Django aren’t very appropriate for

        • Allows frontend developers to manage their comms layer without stack-changing

          • without touching Django and risking mucking up the data models

        • Allows runtime access to data without mixing technologies.

      • Cons: 

        • We already have a lot of Python expertise

        • Potential lack of operational experience for new languages and frameworks

        • Want to avoid “we did it this way because it's what our team knows better”

        • Poster child: forums in Ruby

        • Django has an ORM, accessing data created by that ORM via other ORMs can be risky/error prone.

        • Note: many of these apply mainly if we are mixing paradigms

      •  Proposal (Ben W):

        • Django can be used for data model creation, management and manipulation.

        • Node can be used for MFE data translation access.

      • tl;dr from Jeremy: We’re not open to new backends in general because there are very real costs to supporting diverse backend technologies, but we’re open to considering specific proposals if there are good business reasons why Python/Django isn’t an appropriate choice (performance, existing implementation using front-end technology, etc).

  • ++++++ [Ned; but I can’t be there until later] How can we use the repo-health information to improve the health of repos?  For example: some repos have no openedx.yaml file.

    • How do we decide which things are most important to make consistent?

    • Once we’ve decided what to make consistent, how do we do that in the face of competing priorities on specific squads?

    • How actively is this data being utilized currently?

    • Is the current presentation of the data working, or do we need a better interface for it?

      • [Ned] I’m using the csv

      • Comment: currently very wide (which makes it less approachable)

      • There have been convos about integrating with Snowflake and neo4j.

      • Maybe we want pluggable output generators, similar to how we have pluggable check implementations? [Ned: is this over-engineering considering a csv is available?]

    • Frontend Checks

      • Is it using renovate?

      • Is it deployed to npm? [Ned: I am planning to add this check soon]
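The "pluggable output generators" idea raised above could be as small as a registry of formatter functions. This sketch assumes the repo-health results are available as a list of dicts; `OUTPUT_GENERATORS`, `register_output`, and `render` are hypothetical names, not the existing repo-health API.

```python
import csv
import io
import json

# Hypothetical registry; not part of the existing repo-health tooling.
OUTPUT_GENERATORS = {}


def register_output(name):
    """Decorator that registers a formatter under a format name."""
    def decorator(func):
        OUTPUT_GENERATORS[name] = func
        return func
    return decorator


@register_output("csv")
def to_csv(rows):
    # Use the union of keys so repos missing a check still get a column.
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


@register_output("json")
def to_json(rows):
    return json.dumps(rows, indent=2, sort_keys=True)


def render(rows, fmt):
    """Look up the formatter for fmt and apply it to the check results."""
    return OUTPUT_GENERATORS[fmt](rows)
```

As Ned's comment suggests, this may be more machinery than is needed while the csv alone serves its users; the registry only pays off once a second consumer (a dashboard, Snowflake, and so on) actually exists.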

  • +++[quest] [Robert/Hassan] How do people feel about using the new annotation tooling to annotate custom attributes? Would you find it useful to have a consolidated doc of their purpose? 

2021-04-15

  • [ideate & vote] Deep topics for future Arch hour

    • Votes -> Focused study group, or pre-announced deep-dive discussions

    • Topics

      • (6) ++++++ Eventing data technology and usage best practices (segment, new relic, etc) 

      • (5) +++++ QA Process

        • writing test plans

        • working with external resources

        • @Justin Lapierre Share squad huddle notes and pilot

      • (3) +++ Design-pattern/code-organizational concerns

        • Applying DDD design patterns to specific problems: could be a case study

        • Alex Dusenbery volunteers

      • (2) ++ Making sandboxes useful?

        • @Justin Lapierre @OpenCraft - They have an automated system for creating these.

      • (2) ++ Testing

        • Testing xblocks

        • How we use Cypress

      • (1) + Raising knowledge gaps

        • How does <old thing> work?

        • Current practices on specific topics, e.g. caching

      • (1) +Travis to Github Actions

      • (1) +Working session

      • How do I get us/my team to do x?

      • Ownership question

      • Experimentation

        • apis

        • practices

        • workflows

      • Build/deployment concerns

      • Data-Flow concerns

      • How to reduce pain of maintaining Ansible?  

      • Frontend and Experiment API

      • MFE configuration

  • [ideate] Bite-sized tickets: how to create/maintain them for onboarding contributors?

    • To give people a way to start with the project

    • To get useful small things done

    • Maybe a tutorial + sample project is better in some cases?

    • Types of work

      • DEPR

        • Toggle-related

        • Other DEPR migration projects

      • Django 2.2 -> 3.2 upgrade

      • Documentation

2021-04-09

  • Arch Hour

    • Why keep Arch Hour?

      • Opportunity for cross-team discussion

    • https://openedx.atlassian.net/wiki/spaces/AC/pages/1800470616/Proposal+Arch+Weekly+DRAFT+WIP

    • Next increment

      • Timing

        • Wed at 11am

      • Frequency

        • Weekly

      • Structure

        • Lean Coffee

        • Topic - open forum - not necessarily a presentation

        • Topic - study - prepped, possibly as a workshop

        • Working session

      • Community participation

        • Invited guests to pre-planned long-form topics.

        • Coming in for led or working-sessions

    • Functional distinctions:

      • List investigation/topic-generation/curation

      • Group-interrogation when no-one feels quite expert enough to lead

      • Led topic dive

    • Topics for these meetings

      • Raising knowledge gaps

        • How does <old thing> work?

        • Current practices on specific topics, e.g. caching

      • How do I get us/my team to do x?

      • Ownership question

      • QA

        • writing test plans

        • working with external resources

      • Experimentation

        • apis

        • practices

        • workflows

      • Testing

      • Build/deployment concerns

      • Data-Flow concerns

      • Design-pattern/organizational concerns

      • Making sandboxes useful?

      • How to reduce pain of maintaining Ansible?  

      • Frontend and Experiment API

    • FYI: OEP Review (2020)

  • Testing continued ++++

    • Ben Notes:

      • QA testers are less expensive than Senior Devs, and more specialized at the task

      • The types of QA that are helpful here may exclude “exploratory” testing, which is more focused on “no bugs at all” and slows things down much more

      • QA engineers are people whose whole focus, every day, is how to validate the work, and make sure it is validate-able.

      • QA are better/practiced/trained at building test plans that actually validate a product with a minimum amount of active testing.

    • Notes from Arch Hour with a QA Architect

    • Things we could work on if there is a perceived high ROI:

    • How to test ORA?

      • Why test ORA -> So we have fewer RCAs like this one

2021-03-25

Note from Jeremy: I copied these over from the meeting notes after the meeting finished, but my Zoom connection was unreliable for most of the meeting so I’m not sure all these topics were actually discussed. Please edit if you spot any discrepancies.

  • (5) ***** [JJ, ideation?] Ideas on how to tell people to deploy their changes to production

    • Do Continuous Deployment? So push to master will just lead to deployment

      • Team has resisted this, they would like someone to be actively monitoring when changes go out

      • Developers not on the owning team don’t really know how to do this monitoring

    • Make the GoCD pipeline do it

      • How do I get said robot?

        • Auto-deploy: Change this line: to “success”

        • Tell devs: Add a slack notification in the ‘armed’ stage here

    • Slack: “Please push your changes to production”

    • What repos have/don’t have CD enabled?

      • To my knowledge, none of the MFEs are auto-deployed to production.  They are auto-deployed to stage, though. -djoy

      • It seems like only edxapp and prospectus have CD enabled.  But many developers work primarily in these repos, so other deployment patterns can catch them by surprise.

  • (4) **** [JJ, discussion?] How do people manage/”own” repos that other teams contribute to?

  • (3) ***[djoy] How can we discover - without scaring off - the people still at the company who understand our legacy frontend code (like, but not at all limited to, comp theming)?

    • Or how do we invest in getting enough knowledge of the old stuff to thoughtfully get rid of or improve it.

  • (2) **[Ned; open-ended discussion, no agenda!] Thoughts about Open edX theme?

    • Can we get rid of comprehensive theming?

      • Maybe?

      • But probably not.

  • (2) **[Ned; opinion poll] renaming master branches to main

2021-03-18

  • (5) ***** [question, Matt T] Are we continuing to support multi-site? If I’m writing new code, should I make it site aware, are we planning on ripping it out, somewhere in between?

    • Matt is going to kick off the DEPR process for multi-site and see what the community thinks

    • Note: (a single dev in) the community is somewhat actively supporting multi-site; see recent PR in this area

    • EduNext, which hosts 1000+ sites, doesn’t use SiteConfiguration anymore.

    • We’d like to move the community off of SiteConfiguration. Have them use EduNext’s design instead: runtime override of Django Settings.

      • Here’s a peek