Info

This is the most current Blockstore design document, but many details continue to be refined in conversations on the Issues page of the Blockstore repo. Some of the topics still under debate are:

  • How granular is a Bundle in different use cases (e.g. a single problem, an entire unit, the outline of an entire course)?
  • Exactly what files get placed where inside a Bundle?
  • What does import/export look like for courses and content libraries?


This is the design document for Blockstore, a system for authoring, discovering, and reusing educational content. Development is being funded by Harvard LabXchange and the Amgen Foundation, with significant in-kind contributions from edX.

Abstract

All lesson content in the Open edX platform is currently stored in the modulestore, which requires that all content be organized into “courses,” each of which is a directed acyclic graph (DAG) of XBlocks/XModules (or into “libraries,” which are implemented in the same way as courses but have a shallower graph and support a limited set of content types).

This proposal outlines a design for a new service that stores content for the Open edX platform, called “Blockstore.” Blockstore is meant to be a lower-level service than the modulestore, and it is designed around the concept of storing small, reusable pieces of content, rather than large, fixed content structures such as courses. In other systems and academic contexts, these are often called “learning objects,” and Blockstore is thus a type of Learning Object Repository (LOR). For Open edX, Blockstore is designed to facilitate a much greater level of content re-use than is currently possible, enable new adaptive learning features, and enable delivery of learning content in new ways (not just large traditional courses).

Motivation

At its heart, edx-platform's current modulestore works with large, static course structures. Various dynamic courseware features such as A/B tests, cohorts, and randomized problem banks work around this by copying every piece of content that might be displayed to any user and then selectively showing a subset of it using access checks. When you use a randomized problem bank in a sequence, the system is in fact copying the entire content library into that sequence.

This poses a number of problems:

  • It creates very large data structures, degrading courseware performance. Many common courseware interactions noticeably slow down as the amount of content in a course increases.
  • The underlying structure is static, so the ordering of elements is fixed, making adaptive learning sequences extremely cumbersome to implement. Course teams have heroically worked around this using LTI hacks, using Open edX as both an LTI provider and consumer in chained LTI launches (sequences with one unit that acts as an LTI consumer to an adaptive engine interface that then becomes an LTI consumer for individual problems in the original course).
  • Course content is largely duplicated for every run, making it cumbersome to manage across multiple runs, especially if those runs are on different instances of Open edX, as is the case with some partners.
  • Trying to work around these limitations and maintain performance has significantly complicated the codebase and slowed feature development. Content Libraries are far less powerful than they were intended to be because of the large infrastructure changes that would have been required to execute the original vision.

General Themes / Concepts

...

  • Blockstore exposes Bundle-level metadata as a .blockstore directory.
    • This folder is fully virtual: nothing actually exists there on S3, and it is optionally materialized on export.
  • Metadata is exported as JSON files.
  • All Links to other Content Bundles will be of the form links/{alias}.
    • The Link mapping is stored in the .blockstore/info.json file on export.
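
As a rough illustration of how an exported Bundle's Link mapping might be consumed, the sketch below resolves a links/{alias}/... path against the contents of .blockstore/info.json. The JSON layout shown (a top-level "links" object keyed by alias, with "bundle_uuid" and "version" fields) and the function name resolve_link are assumptions made for this example only; this document does not specify the actual info.json schema.

```python
import json

# Hypothetical contents of .blockstore/info.json. The real schema is not
# defined in this document; the field names here are illustrative only.
INFO_JSON = """
{
  "links": {
    "intro_lib": {
      "bundle_uuid": "00000000-0000-0000-0000-000000000001",
      "version": 3
    }
  }
}
"""


def resolve_link(path, info):
    """Split a 'links/{alias}/...' path into (link target, remaining path).

    Returns None when the path is not inside the links/ namespace.
    Raises KeyError when the alias is not declared in the Link mapping.
    """
    parts = path.split("/")
    if len(parts) < 2 or parts[0] != "links" or not parts[1]:
        return None
    alias = parts[1]
    target = info["links"][alias]      # assumed mapping: alias -> target Bundle
    remainder = "/".join(parts[2:])    # path inside the linked Bundle
    return target, remainder


info = json.loads(INFO_JSON)
target, rest = resolve_link("links/intro_lib/problems/p1.xml", info)
print(target["bundle_uuid"], target["version"], rest)
```

Keeping the alias as the only thing that appears in content paths means a Link can be repointed to a different Bundle or version by editing the mapping alone, without touching the files that reference it.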

...