The proposals:

The Blockstore proposals are fairly long, so this doc tries to summarize some of the principal differences.


Content Primitives

  • Identified by UUID.
  • Versioned numerically (1, 2, 3, etc.)
  • Tagging metadata is stored outside of the core Blockstore.
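The shared primitive can be sketched roughly as follows. This is only an illustration: the class name `ContentPrimitive`, the field `content_id`, and the in-memory version list are assumptions made for the example, not names taken from any of the proposals.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ContentPrimitive:
    """Hypothetical content item: a stable UUID plus a numeric version history."""

    content_id: uuid.UUID = field(default_factory=uuid.uuid4)
    _versions: list = field(default_factory=list)  # index 0 holds version 1

    def save(self, data: bytes) -> int:
        """Store a new version and return its number (1, 2, 3, ...)."""
        self._versions.append(data)
        return len(self._versions)

    def get(self, version: int) -> bytes:
        """Return the data stored for a given version number."""
        return self._versions[version - 1]


unit = ContentPrimitive()
first = unit.save(b"<vertical>draft</vertical>")
second = unit.save(b"<vertical>published</vertical>")
```

Tags are deliberately absent from the class, since all proposals keep tagging metadata outside the core Blockstore.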

Differences: Granularity and Versioning

Original Proposal and Database Proposal

  • Files/Assets are tracked individually.
  • Units are tracked individually.

Original Proposal

  • In addition to per-Unit and per-File tracking, ContentSets (groups of Links) are also versioned.

File Proposal

  • ContentBundles are versioned as a whole, not individual assets inside them.
  • Depending on the intended usage, a ContentBundle could be a single video, a Unit, or an entire Sequence.

Differences: OLX vs. Assets

Original and Database Proposals

  • Separate models for Unit (i.e. a Studio Unit) and Files/Assets (e.g. images, PDFs, video files)
  • Unit OLX content is stored in the database.
  • Files/Assets live in an object store like S3, and are pointed to by rows in the database.
  • All metadata about Units and Assets is stored in the database.
  • Assets used by a Unit are tied together in the database using Links.
  • Advantages
    • Access to OLX has better latency guarantees, particularly for multi-gets.
    • Transactions make it easier to guarantee atomic operations involving many Units/Links/etc.
    • Able to track usage at a fine granularity (e.g. what are all the places this exact version of this image is used?) without requiring external indexing like Elasticsearch.
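A minimal sketch of how the database-backed layout might look, using SQLite purely for illustration. The table and column names, the placeholder IDs such as `unit-a` and `img-1`, and the helper `units_using` are all hypothetical; the proposals do not prescribe a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE unit_version (
        unit_uuid TEXT, version INTEGER, olx TEXT,
        PRIMARY KEY (unit_uuid, version)
    );
    CREATE TABLE file_version (
        file_uuid TEXT, version INTEGER, storage_key TEXT,
        PRIMARY KEY (file_uuid, version)
    );
    CREATE TABLE link (
        unit_uuid TEXT, unit_version INTEGER,
        file_uuid TEXT, file_version INTEGER
    );
""")

# One transaction covers the Units, the File rows, and the Links between them.
# storage_key stands in for the location of the bytes in the object store.
with conn:
    conn.execute("INSERT INTO file_version VALUES ('img-1', 1, 'assets/img-1/1.png')")
    conn.execute("INSERT INTO file_version VALUES ('img-1', 2, 'assets/img-1/2.png')")
    conn.execute("INSERT INTO unit_version VALUES ('unit-a', 1, '<vertical/>')")
    conn.execute("INSERT INTO unit_version VALUES ('unit-b', 3, '<vertical/>')")
    conn.execute("INSERT INTO link VALUES ('unit-a', 1, 'img-1', 1)")
    conn.execute("INSERT INTO link VALUES ('unit-b', 3, 'img-1', 2)")


def units_using(conn, file_uuid, file_version):
    """List (unit_uuid, unit_version) pairs that link to one exact file version."""
    rows = conn.execute(
        "SELECT unit_uuid, unit_version FROM link "
        "WHERE file_uuid = ? AND file_version = ? ORDER BY unit_uuid",
        (file_uuid, file_version),
    )
    return rows.fetchall()
```

Because Links are ordinary rows, the "where is this exact version used?" question becomes a single indexed query rather than a job for an external search index.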

File Proposal

  • All content is stored in ContentBundles, each of which is like a small directory of files.
  • The OLX for a Unit would go into an XML file in a ContentBundle.
  • All Bundle content is stored in an S3-like object store.
  • Metadata about what content constitutes a particular version is in the object store, not the database.
  • Assets used by the Unit would go into the same ContentBundle.
  • Advantages
    • Units are more self-contained.
    • Easier to adapt for use cases outside of Open edX, since ContentBundles don't assume an OLX/Assets divide.
    • Easier to associate bundles of related Assets, like a Video's various encodings, subtitles, thumbnails, etc.
    • Cheaper storage.
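One way to picture a ContentBundle is as a per-version manifest stored alongside the files it names. The sketch below uses a plain dict as a stand-in for an S3-like object store. The key layout, the JSON manifest, and the content-hash blob keys are assumptions chosen for the example and are not specified by the File Proposal.

```python
import hashlib
import json

object_store = {}  # stand-in for an S3-like store: key -> bytes


def put_bundle_version(bundle_uuid, version, files):
    """Write every file, then a manifest describing this version of the bundle."""
    manifest = {}
    for path, data in files.items():
        # Assumption: blobs are keyed by content hash, so unchanged files are shared.
        key = "blobs/" + hashlib.sha256(data).hexdigest()
        object_store[key] = data
        manifest[path] = key
    manifest_key = f"bundles/{bundle_uuid}/versions/{version}.json"
    object_store[manifest_key] = json.dumps(manifest, sort_keys=True).encode()
    return manifest_key


def read_file(bundle_uuid, version, path):
    """Resolve a path inside one bundle version through that version's manifest."""
    manifest = json.loads(object_store[f"bundles/{bundle_uuid}/versions/{version}.json"])
    return object_store[manifest[path]]


# The Unit's OLX and the assets it uses live in the same bundle.
put_bundle_version("unit-bundle", 1, {"unit.xml": b"<vertical v='1'/>", "diagram.png": b"PNG"})
put_bundle_version("unit-bundle", 2, {"unit.xml": b"<vertical v='2'/>", "diagram.png": b"PNG"})
```

Note that the version metadata lives entirely in the object store here; no database row is needed to know what version 2 of the bundle contains.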

Differences: Modeling Sequences and Courses

Original Proposal

  • ContentSets are collections of Links that point to Units, Files, or other ContentSets.
  • Statically defined Sequences and Courses are composed using ContentSets.

Database Proposal

  • Sequences are out of scope – Blockstore's job is to provide fast access to the Units for a separate Compositor service.

File Proposal

  • A statically defined Sequence is modeled as a single ContentBundle, and versioned as a whole.
  • A Course would be a ContentBundle with a root OLX file defining the chapters and a set of Links to Sequences.

Links

  • Links are versioned in all proposals.
  • Conceptually like symlinks.

Differences: Scope of Usage

Original Proposal

  • Links are used to tie together Units and Files.
  • ContentSets tie together Units with each other, as well as with Files and other ContentSets.
  • Units, Files, and ContentSets are all considered "Linkables", and share a common interface that includes version history, tags, and draft status.
  • Links are stored in the database.

Database Proposal

  • Links are used to tie together Units and Files only.
  • Links are stored in the database.

File Proposal

  • Links are used a lot less, because Units and Sequences typically contain their own assets within the same ContentBundle.
  • A shallow, versionless representation of Links exists in the database for notification purposes, but full Link information is stored in the object store.
    • This is for scaling and performance reasons when dealing with large numbers of links and extended dependencies.
    • This makes it much harder to find out which things are using a specific Version of a given piece of content unless we index separately with something like Elasticsearch.

Differences: Garbage Collection

Original and Database Proposals

  • Use Links in the database to garbage collect content that is outdated and is no longer being referenced.
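Link-based garbage collection could work as a reachability pass over the Links. The sketch below is an assumption about how that might be done: the function `collect_garbage`, the choice of "current versions" as roots, and the `(id, version)` tuples are illustrative only.

```python
def collect_garbage(all_items, links, roots):
    """Return the items that cannot be reached from any root by following Links.

    all_items: iterable of (id, version) tuples for every stored version.
    links:     dict mapping an item to the items it links to.
    roots:     items that must be kept (assumed here: the current versions).
    """
    reachable = set()
    stack = list(roots)
    while stack:
        item = stack.pop()
        if item in reachable:
            continue
        reachable.add(item)
        stack.extend(links.get(item, ()))
    return set(all_items) - reachable


all_items = {("unit-a", 1), ("unit-a", 2), ("img-1", 1), ("img-1", 2)}
links = {
    ("unit-a", 1): [("img-1", 1)],
    ("unit-a", 2): [("img-1", 2)],
}
# Only version 2 of the Unit is current, so version 1 and its image are collectable.
garbage = collect_garbage(all_items, links, roots=[("unit-a", 2)])
```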

File Proposal

  • Don't garbage collect.
    • Versioned OLX content is relatively small compared to the size of other assets stored in the object store.
    • It's not clear how we'd know what was being used in a multi-site distributed sharing arrangement.

Search & Tagging

None of the proposals really addresses this, but all of them assume that there will be an external system (either a plugin or separate service) that uses Elasticsearch as a backend.

Collections

  • All proposals center ownership, permissions, and licensing around the concept of a Collection.
  • Each piece of content belongs to exactly one Collection.
  • Examples:
    • single course (possibly multiple runs)
    • problem bank
    • library of videos created by a video team
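The "exactly one Collection" rule can be illustrated with a small registry. The class `CollectionRegistry` and its methods are hypothetical and exist only to show the constraint; none of the proposals defines this interface.

```python
class CollectionRegistry:
    """Hypothetical registry enforcing that content belongs to exactly one Collection."""

    def __init__(self):
        self._owner = {}  # content id -> collection name

    def add(self, collection, content_id):
        """Assign content to a Collection, rejecting a second, different owner."""
        existing = self._owner.get(content_id)
        if existing is not None and existing != collection:
            raise ValueError(f"{content_id} already belongs to {existing}")
        self._owner[content_id] = collection

    def collection_of(self, content_id):
        """Return the single Collection that owns this content."""
        return self._owner[content_id]


registry = CollectionRegistry()
registry.add("problem-bank", "problem-1")
registry.add("video-library", "video-1")
```

Ownership, permissions, and licensing checks would then hang off the Collection returned by `collection_of`, rather than off each piece of content.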

Differences: Collection Versioning

Original and Database Proposals

  • Collections point to versioned content, and are also versioned as a whole.

File Proposal

  • Collections point to versioned content, but the Collection itself is not versioned.

Neither of these stances is fundamental to the designs.

Questions for Discussion

  1. What are the use cases for Collection-level versioning?
    1. Licensing version ranges for the Collection.


