
The proposals:

The Blockstore proposals are fairly long, so this doc tries to summarize some of the principal differences.


Collections

  • All proposals center ownership, permissions, and licensing around the concept of a Collection.
  • Each piece of content belongs to exactly one Collection.
  • Examples:
    • single course (possibly multiple runs)
    • problem bank
    • library of videos created by a video team

Differences: Collection Versioning

Original and Database Proposals

  • Collections point to versioned content, and are also versioned as a whole.

File Proposal

  • Collections point to versioned content, but the Collection itself is not versioned.

Content Primitives

  • Identified by UUID.
  • Versioned numerically (1, 2, 3, etc.)
  • Tagging metadata is stored outside of the core Blockstore.
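As a rough illustration of the shared identity scheme, here is a minimal sketch of a content primitive that is identified by a UUID and versioned with consecutive integers. The class name `ContentPrimitive` and its methods are hypothetical and do not come from any of the proposals.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ContentPrimitive:
    """Hypothetical model: one piece of content with numbered versions."""

    # Stable identity that never changes across versions.
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    # versions[0] holds version 1, versions[1] holds version 2, and so on.
    versions: list = field(default_factory=list)

    def add_version(self, data: bytes) -> int:
        """Store new data and return its version number (1-based)."""
        self.versions.append(data)
        return len(self.versions)

    def get(self, version: int) -> bytes:
        """Return the data stored for a specific version number."""
        if not 1 <= version <= len(self.versions):
            raise KeyError(f"{self.id} has no version {version}")
        return self.versions[version - 1]
```

Tags are deliberately absent from the sketch, since all proposals keep tagging metadata outside the core Blockstore.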

Differences: OLX vs. Assets

Original and Database Proposals

  • Separate models for Unit (i.e. a Studio Unit) and Files/Assets (e.g. images, PDFs, video files)
  • Unit OLX content is stored in the database.
  • Files/Assets live in an object store like S3, and are pointed to by rows in the database.
  • All metadata about Units and Assets is stored in the database.
  • Assets used by a Unit are tied together in the database using Links.
  • Advantages
    • Access to OLX has better latency guarantees, particularly for multi-gets.
    • Transactions make it easier to guarantee atomic operations involving many Units/Links/etc.
    • Able to track usage at a fine granularity (e.g. what are all the places this exact version of this image is used?) without requiring external indexing like Elasticsearch.
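The fine-grained usage tracking mentioned above could look something like the following sketch, which uses an in-memory SQLite database. The table names, column names, and sample rows are all assumptions made for illustration; the proposals do not specify a schema.

```python
import sqlite3

# Hypothetical schema: versioned Units and Assets, tied together by Links.
SCHEMA = """
CREATE TABLE unit_version (
    unit_id TEXT NOT NULL,
    version INTEGER NOT NULL,
    olx TEXT NOT NULL,
    PRIMARY KEY (unit_id, version)
);
CREATE TABLE asset_version (
    asset_id TEXT NOT NULL,
    version INTEGER NOT NULL,
    object_key TEXT NOT NULL,  -- where the file lives in the object store
    PRIMARY KEY (asset_id, version)
);
CREATE TABLE link (
    unit_id TEXT NOT NULL,
    unit_version INTEGER NOT NULL,
    asset_id TEXT NOT NULL,
    asset_version INTEGER NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    with conn:  # a single transaction covers Units, Assets, and Links
        conn.executemany(
            "INSERT INTO asset_version VALUES (?, ?, ?)",
            [("logo", 1, "assets/logo/1.png"), ("logo", 2, "assets/logo/2.png")],
        )
        conn.executemany(
            "INSERT INTO unit_version VALUES (?, ?, ?)",
            [("unit-a", 1, "<vertical/>"), ("unit-a", 2, "<vertical/>"),
             ("unit-b", 1, "<vertical/>")],
        )
        conn.executemany(
            "INSERT INTO link VALUES (?, ?, ?, ?)",
            [("unit-a", 1, "logo", 1), ("unit-a", 2, "logo", 2),
             ("unit-b", 1, "logo", 2)],
        )
    return conn


def units_using(conn, asset_id, asset_version):
    """List every (unit_id, unit_version) that links to one exact asset version."""
    rows = conn.execute(
        "SELECT unit_id, unit_version FROM link "
        "WHERE asset_id = ? AND asset_version = ? "
        "ORDER BY unit_id, unit_version",
        (asset_id, asset_version),
    )
    return rows.fetchall()
```

Because the Links live in the same database as the content metadata, the "where is this exact version used?" question is a plain indexed query rather than a job for an external search index.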

File Proposal

  • All content is stored in ContentBundles, each of which is like a small directory of files.
  • The OLX for a Unit would go into an XML file in a ContentBundle.
  • All Bundle content is stored in an S3-like object store.
  • Metadata about what content constitutes a particular version is in the object store, not the database.
  • Assets used by the Unit would go into the same ContentBundle.
  • Advantages
    • Units are more self-contained.
    • Easier to adapt for use cases outside of Open edX, since ContentBundles don't assume an OLX/Assets divide.
    • Easier to associate bundles of related Assets, like a Video's various encodings, subtitles, thumbnails, etc.
    • Cheaper storage.
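To make the File Proposal more concrete, here is a minimal sketch of publishing a ContentBundle version into an object store. Everything specific in it is an assumption: the in-memory `InMemoryObjectStore` stands in for S3, and the key layout, the JSON manifest format, and the use of content hashes for file storage are illustrative choices, not details from the proposal.

```python
import hashlib
import json


class InMemoryObjectStore:
    """Stand-in for an S3-like object store (hypothetical interface)."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def publish_bundle_version(store, bundle_id, version, files):
    """Write every file plus a manifest describing this bundle version.

    `files` maps a path inside the bundle to its bytes. The manifest, which
    records what content makes up the version, is itself an object in the
    store rather than a row in a database.
    """
    manifest = {}
    for path, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        store.put(f"blobs/{digest}", data)  # identical files are stored once
        manifest[path] = digest
    manifest_key = f"bundles/{bundle_id}/versions/{version}.json"
    store.put(manifest_key, json.dumps(manifest, sort_keys=True).encode())
    return manifest


def read_bundle_file(store, bundle_id, version, path):
    """Fetch one file as it existed in a given bundle version."""
    manifest_key = f"bundles/{bundle_id}/versions/{version}.json"
    manifest = json.loads(store.get(manifest_key))
    return store.get(f"blobs/{manifest[path]}")
```

In this sketch the Unit's OLX file and the assets it uses are published together, so reading an old bundle version returns the OLX and assets exactly as they were at that point.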

Differences: Granularity and Versioning

Original Proposal and Database Proposal

  • Files/Assets are tracked individually.
  • Units are tracked individually.

Original Proposal

  • In addition to per-Unit and per-File tracking, ContentSets (a group of Links) are also versioned.

File Proposal

  • ContentBundles are versioned as a whole; the individual assets inside them are not versioned separately.
  • Depending on the intended usage, a ContentBundle could be a single video, a Unit, or an entire Sequence.

Differences: Modeling Sequences and Courses

Original Proposal

  • ContentSets are collections of Links that point to Units, Files, or other ContentSets.
  • Statically defined Sequences and Courses are composed using ContentSets.

Database Proposal

  • Sequences are out of scope – Blockstore's job is to provide fast access to the Units for a separate Compositor service.

File Proposal

  • A statically defined Sequence is modeled as a single ContentBundle, and versioned as a whole.
  • A Course would be a ContentBundle with a root OLX file defining the chapters and a set of Links to Sequences.

Links

  • Links are versioned in all proposals.
  • Conceptually like symlinks.

Differences: Scope of Usage

Original Proposal

  • Links are used to tie together Units and Files.
  • ContentSets tie together Units with each other, as well as with Files and other ContentSets.
  • Units, Files, and ContentSets are all considered "Linkables", and share a common interface that includes version history, tags, and draft status.
  • Links are stored in the database.

Database Proposal

  • Links are used to tie together Units and Files only.
  • Links are stored in the database.

File Proposal

  • Links are used far less often, because Units and Sequences typically contain their own assets within the same ContentBundle.
  • A shallow, versionless representation of Links exists in the database for notification purposes, but full Link information is stored in the object store.
    • This is for scaling and performance reasons when dealing with large numbers of links and extended dependencies.
    • The trade-off is that it becomes much harder to find out which things are using a specific version of a given piece of content, unless we index separately with something like Elasticsearch.

Differences: Garbage Collection

Original and Database Proposals

  • Use Links in the database to garbage collect content that is outdated and is no longer being referenced.
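A minimal sketch of how Link-based garbage collection might decide what to remove is shown below. The rule it applies (a version is collectable if it is not the latest version of its content and no Link points at it) is one interpretation of "outdated and no longer being referenced", and the function name and data shapes are hypothetical.

```python
def collectable_versions(versions, links):
    """Return the (content_id, version) pairs that could be garbage collected.

    versions: dict mapping content_id -> list of existing version numbers.
    links:    iterable of (target_id, target_version) pairs, one per Link.
    """
    referenced = set(links)
    garbage = []
    for content_id, numbers in versions.items():
        latest = max(numbers)
        for number in numbers:
            is_outdated = number != latest
            is_referenced = (content_id, number) in referenced
            if is_outdated and not is_referenced:
                garbage.append((content_id, number))
    return sorted(garbage)
```

For example, with three versions of an image where only version 2 is still linked, version 1 is collectable, version 2 is kept because it is referenced, and version 3 is kept because it is the latest.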

File Proposal

  • Don't garbage collect.
    • Versioned OLX content is relatively small compared to the size of other assets stored in the object store.
    • It's not clear how we'd know what was being used in a multi-site distributed sharing arrangement.

Search & Tagging

None of the proposals really addresses this, but all of them assume that there will be an external system (either a plugin or a separate service) that uses Elasticsearch as a backend.



