Architecture Decisions for eventual OEPs

This document captures various architectural decisions that are proposed and strongly favored in recent architectural discussions.  They are temporarily captured here until they have been formally documented in an OEP.

1. XBlocks can depend on Django (Decided) 

Summary

XBlocks can use standard Django features, such as Django models and Django i18n services, without depending on custom XBlock runtimes to provide these services.

History

Motivation

Although this proposal is mentioned in OEP-12, it has not yet been widely announced or formally captured as a decision.  Essentially, we have decided to allow XBlock developers to depend on Django directly rather than requiring all Django services and dependencies to be exported via XBlock Runtime services.  This lets them use standard Django functionality and capabilities rather than having us implement home-grown solutions and wrappers in XBlock runtimes, APIs, etc.

2. Eliminating LMS' dependency on the Modulestore (In Progress)

Summary

The Modulestore's responsibility should be scoped as a read-write storage layer for course structures for Studio only.  Other services, such as LMS, should use alternative read-optimized storage for course content.

History

  • Course MetaData. Summer of 2015: edX Mobile team proposes and implements Course Overviews, a read-optimized LMS view/cache of Course Metadata that is stored in SQL and synchronized with the Modulestore on every Course Publish.
    • On edx.org, some Course MetaData fields (e.g., language, marketing_url) are synchronized from the Course Catalog as well - by a daily cron job.
    • The following features currently use Course Overviews:
      • Courses API (used by Mobile and other features)
      • Course Dashboard
  • Course Structure. Fall of 2015: edX Mobile team designs and implements Block Transformers, a pluggable framework for pre-computed, de-normalized, read-optimized cache of Course Graphs to be used for fast-reads in the LMS; also re-computed on every Course Publish.
    • The following features currently use Block Transformers:
      • Course Blocks API (used by Mobile and other features)
      • Course Outline
  • Course Settings. 11/30/2017: At an edX arch lunch, attendees (Nimisha Asthagiri, Dave Ormsbee, Calen Pennington, Douglas Hall, JesseZ, Brittney Exline, et al.) decided the following on accessing Course Settings in the LMS.
    • Features that have their own course-wide settings can implement their own Django tables to store their LMS view/cache of their settings.  These features would synchronize their settings' values between the Modulestore and the LMS on every Course Publish.  We will pilot this approach with the Dynamic Pacing feature's setting for "highlights_enabled_for_messaging".
    • At this time, it's unclear whether settings related to "Course Management" even need to be part of the OLX.  If they don't, features can just store these values in their own SQL table as the Single Source of Truth.  For now, the working assumption is that these settings need to be colocated with the Course content so they can be transported from one Open edX instance to another (via OLX export/import).  As a result, the Modulestore will continue to be the Single Source of Truth for the time being.  Eventually, we may have separate exports for "Course Authored Content" versus "Course Management Data".
    • Alternatives considered:
      • We considered an alternative approach of having a common shared table for all feature settings.  However, that approach violates the SOLID principle of Interface Segregation.
      • We also considered adding feature-specific course settings to the existing Course Overviews table.  However, we chose to keep Course Overviews' responsibility focused on Course MetaData that's needed for the Course Dashboard.  This maintains the SOLID principle of Single Responsibility.
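
The per-feature settings approach described above can be sketched roughly as follows. This is a minimal illustration, not the actual edx-platform implementation: it uses the standard library's sqlite3 in place of a Django model, a plain dict in place of the Modulestore, and hypothetical names throughout (MODULESTORE, highlights_settings, on_course_publish, highlights_enabled, and the example course key). The point it shows is that the Modulestore stays the Single Source of Truth, while the LMS reads only its own table, which is refreshed on Course Publish.

```python
import sqlite3

# Hypothetical stand-in for the Modulestore: course key -> authored settings.
MODULESTORE = {
    "course-v1:edX+DemoX+2017": {"highlights_enabled_for_messaging": True},
}


def create_feature_table(conn):
    """Create the feature's own table (a Django model in a real feature)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS highlights_settings ("
        " course_key TEXT PRIMARY KEY,"
        " highlights_enabled INTEGER NOT NULL)"
    )


def on_course_publish(conn, course_key):
    """Hypothetical publish handler: copy the authored value into the LMS view."""
    authored = MODULESTORE[course_key]
    enabled = bool(authored.get("highlights_enabled_for_messaging", False))
    conn.execute(
        "INSERT OR REPLACE INTO highlights_settings"
        " (course_key, highlights_enabled) VALUES (?, ?)",
        (course_key, int(enabled)),
    )
    conn.commit()


def highlights_enabled(conn, course_key):
    """LMS read path: consults only the SQL view, never the Modulestore."""
    row = conn.execute(
        "SELECT highlights_enabled FROM highlights_settings WHERE course_key = ?",
        (course_key,),
    ).fetchone()
    return bool(row[0]) if row else False
```

In this sketch, a change made in the stand-in Modulestore becomes visible to the LMS read path only after the next call to on_course_publish, which mirrors the synchronize-on-publish behavior described above.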

Motivation

  • Performance
    • The Modulestore infrastructure is designed for versioning and read-write access, not optimized for read access. When retrieving course content from the Modulestore, the LMS faces unacceptable latency from (1) accessing data from Mongo, (2) instantiating XBlocks, and (3) traversing the course graph when resolving field inheritance.
  • Reliability
    • In edx.org's experience thus far, the Mongo database isn't as reliable as the MySQL database.

3. Services Isolation, especially in Blocking User-facing requests (In Progress)

Summary

Follow the Reactive Manifesto principles and have inter-service dependencies be asynchronous, without user-facing blocking calls from one service to another.

History

  • March 2017: Mobile team adds marketing_url to the LMS' local Course Overviews table, where it is asynchronously synched on a daily basis with the Course Catalog service. This design pattern is embraced after a production issue in which the LMS hangs while waiting on the Course Catalog.
  • November 2017: Dynamic Pacing team adds language to the Course Overviews table (also synched daily with the Catalog service) instead of making blocking calls to the Catalog service.

Motivation

  • Resiliency (via Isolation). By embracing the Message-driven and Asynchronous recommendations in the Reactive Manifesto, we keep our microservices decoupled and simplify our overall system (less need for Bulkheads and Circuit breakers).
  • Performance (via Local Views/Caches). Each microservice maintains its own data as it needs it - transforming it into whatever optimized structure/value that it needs.
  • Maintainability and Availability (via Clear Synchronization Points). Each microservice connection can have its own anti-corruption layer to validate and transform received data at a single point for all features within the microservice. Since this happens within a background process, previously validated and persisted synched-data can continue to be used in case of interface breakage - erring on the side of Eventual Consistency over Blocking Accuracy.
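
The local-view pattern with a single synchronization point might look roughly like the sketch below. It is an illustration under stated assumptions, not the LMS code: CatalogFields, validate, and LocalCatalogView are hypothetical names, the remote Catalog call is represented by a caller-supplied fetch function, and an in-memory dict stands in for the Course Overviews table. A real job would also log the failure rather than silently returning False.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class CatalogFields:
    marketing_url: str
    language: str


def validate(raw: dict) -> CatalogFields:
    """Anti-corruption layer: reject payloads that do not match expectations."""
    url = raw.get("marketing_url")
    language = raw.get("language")
    if not isinstance(url, str) or not url.startswith(("http://", "https://")):
        raise ValueError("invalid marketing_url")
    if not isinstance(language, str) or not language:
        raise ValueError("invalid language")
    return CatalogFields(marketing_url=url, language=language)


class LocalCatalogView:
    """Hypothetical local view of Catalog data owned by the consuming service."""

    def __init__(self) -> None:
        self._rows: Dict[str, CatalogFields] = {}

    def sync(self, course_key: str, fetch: Callable[[str], dict]) -> bool:
        """Background job: return True if the local row was refreshed."""
        try:
            fields = validate(fetch(course_key))
        except Exception:
            # Remote failure or bad payload: keep the previously validated row.
            return False
        self._rows[course_key] = fields
        return True

    def get(self, course_key: str) -> Optional[CatalogFields]:
        """User-facing read: local only, no call to the remote service."""
        return self._rows.get(course_key)
```

Because get never calls the remote service, a Catalog outage or an interface change affects only the background sync, and user-facing requests continue to be served from the last validated data.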

Alternatives

  • One design alternative that is currently implemented by some features is to make blocking calls to the Catalog service from the LMS, but with timeouts and ephemeral caches in place. However, ephemeral caches should be treated as exactly that - ephemeral - and cannot be guaranteed to always have the data needed to complete the request.  Also, automatically generated caches need to scale as the number of permutations of API request parameters increases.
  • User-facing features should be resilient to failures and assume that the service they depend on is down at the time of user requests.  A close read of the Reactive Manifesto is suggested.

Exceptions

  • Features that really need Consistency/Accuracy over Availability may need to make blocking calls to other Microservices.  For example, a purchasing workflow that needs accurate product and pricing information may need to access the latest data from distributed services - and cannot rely on a daily batch synchronization process.

4. Break Monolith by Features, not by Studio/LMS split (Evaluating)

Summary

Follow the principles of Domain Driven Design and break services by core domain concepts rather than front-end separations.

History

Motivation (TODO)

  • Component / Feature cohesion
  • Eliminate Deployment dependencies
  • Studio and LMS become mostly Frontend Views

5. Plugins and SOLID Principles: edx-platform becomes a Plugin Platform (Decided)

Summary

To reverse the monolithic evolution of the edx-platform, individual apps need to plug in to the platform and follow the S, O, and D of the SOLID principles (Single Responsibility, Open/Closed, and Dependency Inversion), as described in Django App Plugin.  Additionally, to support the varying feature requirements and experimentation of the Open edX community (and within edx.org), a plugin framework provides flexibility that keeps the edX core a protected and stable platform while still making it a welcoming enabler.

History

Motivation

6. Automated Communication Engine (In Progress)

Summary

Use edX ACE as the notifications framework for automatically sending messages to users.

History

  • Fall 2014: Design of edx-notifications for McKinsey
  • Sep 2016: Design of Notifications: System diagram, with learnings from above
  • Fall 2017: RET team implements ACE, based on above 

Motivation

Provide a common, extensible, and scalable messaging framework, used by all edX features, that handles personalization, translations, policies, and configurability.

7. OLX Data Format and Versioning (Todo)

8. Ubiquitous Language (In Progress)

  • CatalogCourse
  • CourseRun
  • Content-Provider Organization
  • User-Provider Organization
  • Credit-Accessor Organization

9. Testing Best Practice (Todo)

10. API Best Practice (Todo)

11. Domain-driven Design Principles

  • Build vs Buy → Core, Supporting, Generic
  • Location of code (which IDA? which app?) → Bounded Context Responsibilities and Boundaries
  • API as a Product → Domain models
  • Inter-service interfaces → Business use cases
  • Inter-service communication and relationships → Bounded Context life cycles versus Entity-based Services 

12. Celery usage

  • celery-utils capabilities - LoggingTask and PersistOnFailureTask
  • Using kwargs instead of positional args
  • Use Celery for real-time synchronization, with a management command as a fallback for Celery failures
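
The list above does not give a reason for preferring kwargs. One plausible reason, offered here as an assumption, is that queued task messages can outlive the code that produced them, so a signature change can silently misbind positional arguments. The sketch below does not use Celery itself: enqueue and run are hypothetical stand-ins for a broker and a worker, and send_reminder is a hypothetical task.

```python
import json


def enqueue(args=(), kwargs=None):
    """Hypothetical stand-in for a broker: serialize the call as a JSON message."""
    return json.dumps({"args": list(args), "kwargs": kwargs or {}})


def run(task, message):
    """Hypothetical stand-in for a worker: deserialize the message and call the task."""
    payload = json.loads(message)
    return task(*payload["args"], **payload["kwargs"])


# Hypothetical task. Its original signature was (user_id, course_key);
# a later release inserted `language` as the second parameter.
def send_reminder(user_id, language="en", course_key=None):
    return {"user_id": user_id, "language": language, "course_key": course_key}


COURSE = "course-v1:edX+DemoX+2017"

# Both messages were queued by callers written against the original signature.
positional_message = enqueue(args=(42, COURSE))
keyword_message = enqueue(kwargs={"user_id": 42, "course_key": COURSE})
```

Running the keyword message against the new signature still binds course_key correctly and picks up the default language, whereas the positional message binds the course key to the language parameter and leaves course_key as None.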

13. Authorization (In Progress)