[BD-04] Project Retrospectives

2021-03-19: Phase 1 Retro

What went well?

  • Code readability has improved

  • XModules were all converted!

  • It came under budget!

  • Relatively small number of dependencies between conversions, so some parallelization was possible before getting blocked on reviews.

  • There were no bugs or issues during the conversions that got shipped to production.

    • Even with many tests not running! (To be clear, those tests had been running until the final rebase before the merge.)

      • Dave: OMG this gave me a heart attack, but this was definitely a test infra issue on our end.

  • Progress toward completion was easier to understand for this project compared to other blended efforts, especially when we made the Conversion Tracker. It’s all green now!

  • Ideal sort of Blended project to be a reviewer while working on other projects. Minimal overhead; easy to jump in and review for an hour or two in between other tasks.

  • Very low meeting overhead (for me, at least) -Kyle

    • What were the meetings/touchpoints? Sync & async stuff

  • The PRs themselves were really easy to review - they always included great context & testing instructions.

    • The MRO charts were also really helpful, FWIW

What could be improved?

  • Centralized Platform Knowledge & Code Complexity: This part of the system is understood by few people (they’re on this call!) which limits our ability to distribute work.

    • By simplifying this part of the platform we are helping this issue, but just wanted to flag this risk of an area of concentrated platform context.

  • Project Duration & Centralized Expertise: The project took longer to get done than initially estimated. This is mostly because of being bottlenecked on one person.

    • one person → is that Usman or Dave/Kyle? Usman.

      • Usman - was doing almost all of the dev work

    • A few reasons: this area of the codebase is really tricky; changes have lots of implications across the codebase; it is hard to have a good sense of what can go wrong. All of this constrained how we could distribute the work.

    • Getting it done right >> getting it done 2 months earlier

      • The platform complexity of these changes made production issues a reasonably large risk. (which is why lack of bugs is such a big win - see other section)

    • Why delayed?

      • Trouble finding time to work on BD-04 (getting pulled into other things) - the work is hard to do in smaller chunks and needs dedicated heads-down time

      • Deep work requires deep work time.

  • Project Decision / Task Tracking Options: Lightweight PR + Slack conversation project tracking may mean action items / tasks / decisions are harder to track?

    • other projects with heavier coordination also have meeting notes / task lists

    • tricky balance here since speed is more important than complete docs perhaps?

    • minimum threshold: shared Slack channel

      • regular meetings & meeting notes are more heavyweight, likely unneeded for this type of project

      • Were there any synchronous meetings? Possibly, we can’t quite remember

        • Not a strong need for requirements gathering

        • Usman had a strong sense of what needed to be done

  • Sometimes technical questions/decision points fell through the cracks and Usman had to gently poke Dave to get answers.

    • Were we using ADRs to make these decisions or some other method?

      • They were not really ADR-worthy: they concerned individual XModules and low-level details.

  • Test Infrastructure: Anything to say about test infrastructure?

    • Anything that would ensure Dave DOESN’T have a heart attack (see note above) :)

    • Catching that common/lib/xmodule tests stopped running would have been really good. I think this is on Jeremy’s mind

      • Sarina: yes, I’ve discussed this with Jeremy

    • Going to need to rely on it more as we do Phase 2

      • Phase 2 changes are likely to cause major breakages, so we expect to see more obvious failures than when converting an individual XModule

      • Dave not as concerned about this

  • Shepherding edx-platform merges (b/c of continuous deployment) is always a minor point of friction – this is true for any OSPR and for the core committer program as a whole.

    • In practice though, I think this was smoother than most because nothing is critically blocking.

      • agreed

  • PR Review Cycle Times: Sometimes it took some time for PR reviews to be done. Perhaps more capacity would have helped?

    • +1. T&L has had a lot on its plate and blended review always lags noticeably on busy weeks

    • But also, very few people at edX feel comfortable reviewing this stuff.

    • Usman: Phase 2 will be done differently. It is much larger than Phase 1.

      • Get 3 other devs onboarded

      • Usman to support Dave/Kyle with reviews

      • Work is more chunkable - feels lower risk than Phase 1, easier to have other people do the dev work

What did we achieve?

  • Helped demonstrate the benefits of blended development and core committer programs toward continuous upgrade/improvement efforts.

  • Showed that we could do a major core platform refactoring through blended without major production issues.

  • Converted key content infrastructure areas toward modernizing the guts of Open edX’s content delivery core.

  • edx-platform is (at least) a bit more comprehensible than it used to be, whether or not you’re working in the guts of courseware. No more explaining to new devs why course objects are called “CourseDescriptors”.

Decisions

  • For Phase 2: Valuable to be able to describe why we’re doing things; project tracker/milestones are important to communicate out - helps us celebrate progress, and share out w/ other people in the org.

  • For Phase 2: Are there ways to measure code complexity to show value?

    • Length of inheritance chain?

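The inheritance-chain idea could be prototyped with Python's method resolution order (MRO). The following is only a minimal sketch: `mro_depth` and the example classes are hypothetical stand-ins, not real edx-platform classes, and MRO length is just one possible proxy for complexity.

```python
import inspect


def mro_depth(cls):
    """Count the classes in cls's MRO, excluding the implicit `object` base."""
    return sum(1 for base in inspect.getmro(cls) if base is not object)


# Hypothetical stand-ins for a block before and after conversion.
class ExampleBase:
    pass


class ExampleMixin:
    pass


class ExampleLegacyBlock(ExampleMixin, ExampleBase):
    pass


class ExampleConvertedBlock(ExampleBase):
    pass


if __name__ == "__main__":
    # Print the chain length for each stand-in class.
    for cls in (ExampleLegacyBlock, ExampleConvertedBlock):
        print(f"{cls.__name__}: {mro_depth(cls)}")
```

Recording this number per class before and after a change might give a simple trend to share alongside the project tracker.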

Action Items

@Kyle McCormick to make a shared Slack channel for the project (edx-internal: #external-openedx-bd-04-xmodule-conversion ; openedx: #bd-04-xmodule-conversion).
@Usman Khalid to provide a list of everyone involved in the project to @Dave Ormsbee

Anything else you want to talk about? (parking lot)

  • What is the name of the project now?

    • Phase 2 will be a separate BD project (so no longer BD-04)

    • Opportunity to name the next phase of the project in a way that conveys more value/outcome

Key Take-Aways

  • Infra cleanup works really well as a Blended project

    • Consider future projects: Phase 2, Old Mongo deprecation

  • Async conversations via email, PRs, and shared Slack channels worked well for this project - very little synchronous conversation

    • Low meeting overhead left more time to get project work done

  • Working in an area that few understand (internally as well as externally) can lead to delays - bottlenecks are likely, and we’re highly constrained on how work can be distributed.

  • The common/lib/xmodule tests silently not running could have spelled disaster for the project (fortunately everything was OK) - test infrastructure needs to be robust for intensive projects like this
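
One possible guard against tests silently dropping out of a run, as happened with common/lib/xmodule, is to compare the number of collected tests against a recorded baseline. This is a rough sketch under stated assumptions: `count_is_plausible`, the baseline figure, and the 10% tolerance are all hypothetical, and how the collected count is obtained from the test runner is left out.

```python
def count_is_plausible(collected, baseline, tolerance=0.10):
    """Return True if `collected` is no more than `tolerance` below `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if not 0 <= tolerance < 1:
        raise ValueError("tolerance must be in [0, 1)")
    return collected >= baseline * (1 - tolerance)


if __name__ == "__main__":
    baseline = 1000  # hypothetical count recorded from a known-good run
    for collected in (980, 0):
        status = "ok" if count_is_plausible(collected, baseline) else "suspicious drop"
        print(f"collected={collected}: {status}")
```

A check like this would not explain why tests disappeared, but it could turn a silent gap into a visible failure before a merge.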