This is a work-in-progress artifact of these two tickets:

Once general consensus is reached, I’ll move some of the technical decisions to edx-platform ADRs (architectural decision records).

Overarching architectural goals

In tandem with bringing significant improvements to the content authoring workflow, we believe BD-14 presents an opportunity to advance the technical state of content authoring and core. To achieve this improvement, we have a few top-level architectural goals in mind, in no particular order:

  1. Store library content in Blockstore. This has been a goal since the outset of the project; it was part of the original specification. Just as LabXChange does today, we will store the master copy of v2 library content in Blockstore.

  2. Do NOT store library content in Modulestore. This may seem obvious given the first goal, but it’s not. Since courses are currently stored in Modulestore, we need some way of referencing library-based content from the context of Modulestore-backed content. One way of doing this would be to copy all course-referenced library content from Blockstore into Modulestore. While technically an option, we would like to avoid doing this, because (i) it is hugely inefficient space-wise, and (ii) Modulestore is fraught with complexity and performance issues, to which we are trying to avoid coupling BD-14 as much as possible.

  3. DO store library content customizations with the course in Modulestore. Although we want library content to live firmly in Blockstore, we still want any customizations that course authors make to referenced content to live within the course structure/definition. This helps us establish and maintain a distinction between (a) original library content and (b) course-local tweaks that authors may make, a distinction which we believe is important for both technical and user-facing coherence.

  4. Build new frontend features using the micro-frontend (MFE) framework. This is also a stated goal of the original BD-14 pitch. Wherever feasible, we will build all new frontend features using edX’s React-based micro-frontend framework, as opposed to building within the legacy Django-templated frontend upon which most of Studio is currently implemented. We have already begun this process: the Library Authoring MFE has been created and is currently deployed for preview in the staging environment.

  5. Continue allowing for library content to be exposed via LTI.

Product terminology

Bold terms are already used in production. Underlined terms are new with BD-14 and thus could still be revised.

| Term | Definition | Notes |
| --- | --- | --- |
| Content Library | A collection of reusable components. | |
| Legacy Content Library | Our user-facing term for v1 (modulestore-backed) libraries, once BD-14 lands. | |
| Problem Library | A v2 (blockstore-backed) library that may only contain problem components. | |
| Video Library | A v2 library that may only contain video components. | |
| Complex Library | A v2 library that may contain any mixture of component types. | |
| Library Content Reference | A single location in a course in which component(s) from a library are included. | In BD-14, we will build a “library reference block” to implement this operation. |
| Library content referencing | The act of using a Library Content Reference to include one or more components from a library into a course. | I chose this term in favor of “inclusion” or “sourcing” because I think it is more accessible and less ambiguous. Open to other opinions here. More radical idea: call it a “launch” as a callback to LTI. -Kyle |
| Library content customization | The act of making edits to components included via a Library Reference. The edits are local to the Reference within the course; they are not reflected back in the source library. | “customization” is my suggestion, happy to use a different term instead. |
| Library Version | An immutable snapshot of a library, as published at a certain point in time. | Accompanied by a version name and comment, perhaps? |
| Library publishing | The act of releasing all changes to a v2 library since its last publish, creating a new Library Version. | v1 libraries do not support the idea of “publishing”. |
| Course | A run of a course, which is authorable in Studio and hosted in LMS. | More precisely, this is called a “course run” in order to differentiate it from a “catalog course” (which is advertised on the marketing site). |
| Course publishing | The act of pushing out all edits to a course made in Studio such that they manifest in the LMS. Until a course is published, the edits only manifest in Studio. | Some edits must be explicitly published using the “Publish” button. However, some edits, especially structural or course-wide changes, automatically trigger a publish. |
| Component | A single piece of course content. Examples include HTML, a discussion, a video, an ORA, a problem (note that a problem may contain multiple responses), and advanced components. | Every component is an XBlock. |
| Advanced Component | Components other than the core five (HTML, discussion, video, problem, ORA), some of which are authored by external providers, and many of which are not fully supported. Must be explicitly enabled in advanced settings. | |
| Unit | A series of zero or more components, displayed on a single page. | Under the hood these are called “verticals” sometimes. In the Learning MFE, the contents of a unit are rendered together within an iframe. |

Background and technical terminology

Libraries, blocks, courses, and contexts

A content library (or just “library”) is a collection of reusable content. Each reusable piece of content is an XBlock usage, aka a block. These blocks are the same pieces that make up courses; by storing them in libraries, though, they can be authored, versioned, and referenced independent of any course-authoring workflow.

At first, this document will focus on libraries made of component-level blocks; that is, blocks that are individual bites of content, such as problems, videos, HTML, and advanced components (polls, ORA, etc). Future iterations of this document may consider libraries that contain structural blocks, such as units, sequences, and sections.

On edx.org, content from libraries is primarily intended to be reused in courses. With the advent of LabXChange, though, the Open edX platform has also begun serving learning content from a labxchange pathway, which is “a short collection of XBlocks that a student works through in a linear sequence”. Excitingly, these pathways actually store all their content in v2 libraries, although the content inclusion mechanism and authoring interface differ from what we will be building for edx.org courses.

So, we have generalized courses into the idea of learning contexts, or just “contexts”. Courses, labxchange pathways, and even libraries themselves are types of contexts, and all benefit from the ability to consume reusable content. So, although BD-14 product messaging sometimes presents libraries as collections of content for re-use across courses, in reality content libraries are a collection of blocks authored for re-use across learning contexts.

Storage backends

Libraries exist for use on edx.org today, backed by the split-mongo modulestore (aka “split-mongo”), our MongoDB-backed, versioning, immutable-definition content storage system. We refer to this generation of the content library feature as version 1 (“v1”). For a dive into how v1 libraries are implemented, check out Dave’s v1 library writeup.

All active edx.org courses are also stored in split-mongo. In this document, we will call these courses v1 courses. There also exist courses in the deprecated old-mongo modulestore (which one may call v0 courses), but since old-mongo cannot store libraries nor courses referencing library content, we will not talk about it further in this document.

BD-14 aims to replace v1 libraries with the version 2 (“v2”) implementation, backed instead by blockstore, our SQL- and Amazon S3-backed, versioning, immutable-definition content storage system, which is:

The content in v2 libraries will need to be usable by split-mongo-backed courses, although the technical design of v2 libraries will keep in mind our desire to eventually move all course content to blockstore, which we’ll speculatively refer to as v2 courses. This should be easy, since v2 libraries already serve content to labxchange pathways, which are blockstore-backed.

Keys

Content stored in the Open edX platform is referenced by a variety of types of opaque keys, which are generally semi-human-readable, URL-safe, immutable, and stable string identifiers. The “opaque” adjective describes that each key should generally be treated as indivisible, allowing us to change keys' structure over time without breaking assumptions. For example, URL parsers should not assume that learning context keys begin with course-v1:, because that URL may one day need to handle keys prefixed with lx-pathway: or lib: instead.
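As a minimal sketch of what treating keys as opaque can look like in a URL handler, the pattern below captures the whole path segment instead of matching a known prefix. The `/contexts/<key>/` route and the function name are hypothetical, chosen only for illustration; they are not existing platform URLs or APIs.

```python
import re
from typing import Optional

# Hypothetical route: /contexts/<learning context key>/
# The key is captured as "any run of non-slash characters", so the pattern
# makes no assumption about prefixes such as "course-v1:".
CONTEXT_PATH = re.compile(r"^/contexts/(?P<context_key>[^/]+)/?$")


def context_key_from_path(path: str) -> Optional[str]:
    """Return the learning context key embedded in `path`, or None."""
    match = CONTEXT_PATH.match(path)
    return match.group("context_key") if match else None
```

Because the key is never split or inspected here, the same handler keeps working if new key types are introduced later.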

Different types of keys refer to different entities and have different structures, as shown below. Key parts written in CAPITALS are variables that would be substituted with specific content information.

Implementation questions

This section is organized in a question-answer format, even though many of the questions are already answered.

To see which questions are still open, search for the text “TBD” (to be determined). Sections marked “TODO” are things I’m still working on fleshing out.

Questions with “High” urgency need to be answered before closing out the v3.1 milestone. For questions with “Medium” or “Low” urgency, we can safely hold off on answering them, as long as we have them in the back of our mind.

Authoring V2 libraries

Should component editors be rendered in a secure sandbox?

Answer: TBD by https://openedx.atlassian.net/browse/TNL-7458 .

Should the library authors be able to name and/or comment their version updates?

Context: Unlike in v1 libraries, when a library author changes a v2 library, they will need to explicitly push a “Publish Library” button to push out the changes, which will release a new version of the library. This raises the question: would we like to enable authors to annotate these published versions?

Urgency: Medium - this affects how we think about versioning, but our stance could be revised later.

Answer:

If library authors are able to name/comment updates, should they also be required to?

Said another way: Assuming the answer to the above question is “yes”… should we go one step further and mandate that a version name and/or comment is included?

Pros:

Cons:

Urgency: Low - this is a UX decision that we could make later.

Answer:

What types of content can be stored in libraries? Are there different types of libraries?

The original pitch for BD-14 described three library types:

The idea is that a library author may choose to mark their library as Problem-only or Video-only, and thus may see a richer library-editing experience, as the library editor would be tailored to just problems or videos. Otherwise, the Complex library editor would be shown, which allows for generic editing of a collection of blocks.

Some questions:

Urgency: Medium - We are assuming “yes” on this feature since it was in the old spec and is part of the prototype, but we could reverse the decision later without too much fallout.

Answer:

When should course teams make updates in the library vs. in the course?

Consider: Course teams are often, but not always, going to have permission to write to the Library.

Notes:

Urgency: High - I think this guides how we design and build the entire project.

Answer: TBD

Referencing V2 library content from V1 courses

What are all the different use cases for referencing library content in courses?

Urgency: High - It’s important to know whether there are any use cases we haven’t thought of.

Answer:

  1. I want to reference a single specific block from a library into a course.

  2. I want to reference multiple specific blocks from a library into a course.

  3. I want to randomly reference a single block from a library into a course.

  4. I want to randomly reference a number of blocks from a library into a course.

  5. I want to randomly reference a single block of a certain type from a library into a course.

  6. I want to randomly reference a number of blocks of a certain type from a library into a course.

  7. I want to randomly reference a single block from a specific subset of a library into a course.

  8. I want to randomly reference a number of blocks from a specific subset of a library into a course.
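One way to see how the eight cases relate is that they all reduce to "filter a pool, then optionally sample from it". The sketch below is only an illustration of that observation; `LibraryBlock`, `select_blocks`, and all parameter names are hypothetical and do not correspond to existing platform code.

```python
import random
from dataclasses import dataclass
from typing import Collection, List, Optional, Sequence


@dataclass(frozen=True)
class LibraryBlock:
    """Hypothetical stand-in for a block stored in a library."""
    block_id: str
    block_type: str


def select_blocks(
    library: Sequence[LibraryBlock],
    *,
    block_ids: Optional[Collection[str]] = None,  # cases 1-2 and 7-8
    block_type: Optional[str] = None,             # cases 5-6
    count: Optional[int] = None,                  # cases 3-8 (random draw)
    rng: Optional[random.Random] = None,
) -> List[LibraryBlock]:
    """Filter the library down to a pool, then optionally sample from it."""
    pool = [
        block
        for block in library
        if (block_ids is None or block.block_id in block_ids)
        and (block_type is None or block.block_type == block_type)
    ]
    if count is None:
        # Cases 1-2: specific blocks, returned in library order.
        return pool
    rng = rng or random.Random()
    # Never ask for more blocks than the pool holds.
    return rng.sample(pool, min(count, len(pool)))
```

Under this framing, cases 1 and 2 pass only `block_ids`, cases 3 and 4 pass only `count`, cases 5 and 6 add `block_type`, and cases 7 and 8 combine `block_ids` with `count`.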

Notes:

Which referencing use cases are we building for in BD-14, and in which order?

(“Case #X” references the cases defined directly above)

Notes / more specific questions:

Urgency: High - This is important in determining how we sequence features.

Answer: TBD

What is the workflow for referencing library content in courses?

Urgency: High - Knowing the general flow here is important in order to implement the UX and backend.

Answer: For all 8 use cases, the interface for library content referencing will look something like this:

  1. Course author chooses to add a Library Content Reference.

    1. Author chooses a library.

    2. Author chooses a version of that library (?)

      • It’s an open question of whether we want this step. We dive into this in a later question.

    3. Author chooses a pool of blocks (?)

      • It’s an open question of whether we need this step. It depends on the use cases we want to build for (listed in a previous question).

    4. If multiple chosen: Author specifies randomization or ordering.

      1. Randomize: yes/no.

      2. If yes: how many blocks should be randomly selected from pool?

      3. If no: what order should the blocks be shown in?

  2. Course author may customize included block(s) by editing them.

How is referencing implemented?

There are two major approaches to implementing library content referencing that we’ve talked about:

Approach 1: LTI

This idea was spurred by Dave’s thinking on the topic. In his words:

What if there’s no deep relationship between libraries and courses or overrides at all? What if content lives in content libraries, and it’s included into a course via an XBlock that is like a slightly more sophisticated LTI block with a set of parameters (that represents the override)? We could make that a part of the LTI provider functionality for Open edX–the POST parameters to it would include some subset of overrides that the system knows how to apply. We can then apply that (stupidly) at the OLX level before the XBlock runtime even sees it.

Succinctly, a course would be an LTI consumer, a library would be an LTI provider, and a library reference would just be an LTI launch. We would use the LTI 1.3 specification, the most recently released and supported version of LTI, along with LTI Advantage, a collection of three services that extend LTI 1.3.

The lti_consumer XBlock already supports LTI 1.3 (with LTI Advantage) consumption. Furthermore, edx-platform V2 libraries are already LTI 1.3 providers, as implemented by OpenCraft for a Harvard(?) initiative. Using these two existing features, I managed to simulate a very basic version of library content referencing on Stage.

Of course, content referencing needs to support more than just a simple reference to a single block. I believe the advanced capabilities of referencing could be implemented using the LTI Deep Linking service, one of the services included in LTI Advantage. From the specification:

The IMS Learning Tools Interoperability® (LTI) Deep Linking specification allows a Platform to more easily integrate content gathered from an external Tool. Using the Deep Linking message defined in this specification, Platform users can launch to a URI specified by an external Tool, then select specific content appropriate for their use, and receive a URI that other platform users can use at a later time for launches directly to that specific content.

A course author will need to be able to configure a library reference by selecting particular blocks, choosing randomization parameters, and applying customizations. Following the Deep Linking spec, the interface for this would be exposed by the library as part of the deep linking step. After the author configures the library reference via said interface, the library would return a deep link, which would serve as the block’s Launch URL. When a learner’s browser accesses the deep link, the library would return the set of blocks as configured by the course author.

Approach 2: In-Platform

The library referencing flow would be implemented within the Studio and/or LMS processes of a single Open edX instance, as it is for V1 libraries. Unlike V1 libraries, though, library content would be stored and rendered outside of modulestore and the modulestore-based XBlock runtimes (which is also true of the LTI implementation).

We would achieve this by introducing a unit_compositor subsystem. Like learning_sequences, the subsystem would be populated by CMS upon course publish. It would store a read-optimized form of:

We would then update the LmsXBlockRuntime (called “CombinedSystem” until BD-13 is done) to use the unit_compositor as its backing store for units. When a unit is requested for a particular user, the unit_compositor would:

  1. Load the unit’s child blocks from modulestore.

  2. Replace each library reference block with its corresponding library block definitions, each overridden with any course-author-specified customizations, and each given a usage key that composes the library reference block's usage information with the library block’s definition key.

  3. Return the list of blocks wrapped under a VerticalBlock, with the same usage key as the original unit, for the LmsXBlockRuntime to render.

Example:

Something authored in Studio and saved to modulestore like this:

Vertical(  # our original unit.
  usage_key="block-v1:edX+UseLib+1+type@vertical+block@1",
  children=[
    HtmlBlock(  # a typical HTML block.
      usage_key="block-v1:edX+UseLib+1+type@html+block@2",
      ...,
    ),
    RandomizedLibraryReferenceBlock(
      usage_key="block-v1:edX+UseLib+1+type@random_lib_ref+block@3",
      # reference from version 5 of library 'someProblems' by 'orgX'.
      library_key="lib:orgX:someProblems:v5",
      # grab 2 random problems.
      count=2,
      # cap attempts on each problem to 3,
      field_overrides={
        "max_attempts": 3,
      },
      # for problem block 'xyz' in particular, set a custom display name.
      block_field_overrides={
        # in the key below, 'lb:' indicates 'library block (definition)'
        "lb:orgX:someProblems:problem:xyz": {
          "display_name": "Custom problem title for XYZ!"
        },
      },
    ),
    VideoBlock(  # a typical Video block.
      usage_key="block-v1:edX+UseLib+1+type@video+block@4",
      ...,
    ),
  ],
)

Would come out of the unit_compositor like this:

Vertical(  # unit, post-processing by unit compositor
  usage_key="block-v1:edX+UseLib+1+type@vertical+block@1",
  children=[
    HtmlBlock(  # same old HTML block.
      usage_key="block-v1:edX+UseLib+1+type@html+block@2",
      ...,
    ),
    ProblemBlock(  # problem block, from the library!
      # in the usage key, 'lb-ref:' indicates 'library block reference'.
      # note that the usage key combines information from both the library
      # block definition key and from the library reference block usage key.
      usage_key="lb-ref:someProblems:problem:abc:edX+UseLib+1+type@random_lib_ref+block@3",
      max_attempts=3,
      display_name="Library-defined problem title for ABC",
      ...,
    ),
    ProblemBlock(  # another problem block, from the library!
      usage_key="lb-ref:someProblems:problem:xyz:edX+UseLib+1+type@random_lib_ref+block@3",
      max_attempts=3,
      display_name="Custom problem title for XYZ!",
      ...,
    ),
    VideoBlock(  # same old Video block.
      usage_key="block-v1:edX+UseLib+1+type@video+block@4",
      ...,
    ),
  ],
)
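The three unit_compositor steps and the before/after example above can be sketched with plain dictionaries. This is a rough illustration under several assumptions: blocks are modeled as dicts rather than XBlocks, the function names are hypothetical, the per-learner seeding strategy is invented for the sketch, and the `lb-ref:` key shape simply mirrors the example above rather than a settled key format.

```python
import random
from typing import Any, Dict, List

Block = Dict[str, Any]


def make_ref_usage_key(definition_key: str, ref_usage_key: str) -> str:
    """Combine a library block definition key with a reference block usage key.

    Mirrors the illustrative 'lb-ref:' shape from the example above.
    """
    # 'lb:orgX:someProblems:problem:xyz' -> 'someProblems:problem:xyz'
    _prefix, _org, block_part = definition_key.split(":", 2)
    # 'block-v1:edX+UseLib+1+type@...' -> 'edX+UseLib+1+type@...'
    usage_part = ref_usage_key.split(":", 1)[1]
    return f"lb-ref:{block_part}:{usage_part}"


def compose_unit(
    unit: Block, libraries: Dict[str, Dict[str, Block]], user_id: int
) -> Block:
    """Expand each randomized library reference in `unit` into library blocks."""
    children: List[Block] = []
    for child in unit["children"]:
        if child["type"] != "random_lib_ref":
            children.append(child)  # ordinary blocks pass through untouched
            continue
        library = libraries[child["library_key"]]
        # Assumption: seed by learner and reference block so the draw is stable.
        rng = random.Random(f"{user_id}:{child['usage_key']}")
        for definition_key in rng.sample(sorted(library), child["count"]):
            block = dict(library[definition_key])  # library-defined fields
            # Reference-wide overrides first, then per-block overrides.
            block.update(child.get("field_overrides", {}))
            block.update(
                child.get("block_field_overrides", {}).get(definition_key, {})
            )
            block["usage_key"] = make_ref_usage_key(
                definition_key, child["usage_key"]
            )
            children.append(block)
    # The composed unit keeps the usage key of the original unit.
    return {**unit, "children": children}
```

Applying per-block overrides after reference-wide overrides is a design choice made for this sketch; the document does not specify a precedence order.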

Approach Comparison

In favor of LTI:

In favor of In-Platform:

Decision

Urgency: High - This affects many parts of the technical implementation.

Answer: We will take the In-Platform approach, keeping in mind as we build that LTI library referencing capabilities may be something we want in the future.

We arrived at a rough consensus on this in the T&L’s 2021-10-20 Geek Time meeting. The gist was that:

When a new version of a library is published, what happens to courses that reference that library’s content?

Context: There are two apparent options (with a third option of “let the user choose either”):

Urgency: High - This affects many parts of the technical implementation.

Answer: TBD

Which versions of libraries can a course author create references to?

The simplest answer would be “latest only” – that is, when a course author creates a new library reference within a course, they can only select content from the latest version of their available libraries. For what it’s worth, this is how V1 libraries work.

More complex answers to this question could be “versions published in the past X months are available” or “any version of the library ever published is fair game.”
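The candidate answers above can be compared side by side as policies over a list of published versions. The following is a sketch only; `LibraryVersion`, `selectable_versions`, and the policy names are hypothetical and exist purely to make the options concrete.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class LibraryVersion:
    """Hypothetical record of one published library version."""
    number: int
    published_at: datetime


def selectable_versions(
    versions: Sequence[LibraryVersion],
    *,
    policy: str,
    now: datetime,
    max_age: Optional[timedelta] = None,
) -> List[LibraryVersion]:
    """Return the versions an author may create new references to."""
    ordered = sorted(versions, key=lambda version: version.number)
    if not ordered:
        return []
    if policy == "latest_only":
        return [ordered[-1]]
    if policy == "recent":
        if max_age is None:
            raise ValueError("the 'recent' policy requires max_age")
        return [v for v in ordered if now - v.published_at <= max_age]
    if policy == "any":
        return ordered
    raise ValueError(f"unknown policy: {policy!r}")
```

Whichever policy is exposed to authors, the lookup of an already-referenced version would be a separate code path, since existing references to older versions must keep resolving.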

Urgency: Low - We could assume “latest only” and then get more flexible later if the product requires it. On the backend, we will still be building something that allows for reference to any library version, since that’s required in order to support courses with reference to old library versions.

Answer: TBD

How can a course author customize library-referenced content?

Urgency: High - Different answers to this question will lead us down different implementation paths for library referencing.

Answer:

Implementation notes:

Shall courses in one Open edX instance be able to reference libraries in another instance?

Notes:

Urgency: High - If the answer here is a “hard no” then we will want to reconsider the decision to use LTI.

Answer: TBD

Can library content be referenced across multiple units, so as to split up an assignment into pages but avoid duplicates?

One use case is “select a dozen or so components out of this library for this assessment”, where we don’t want to repeat any components. Today, doing that requires that you reference all the components into a single unit, which is not a great UX for the learner (slow to load, lots of scrolling). If you were to spread them across multiple units, you’d risk having repeat components appear, since component uniqueness is only guaranteed within a reference block, and each reference block belongs to a single unit.

With v2 libraries, we could implement something to spread referenced content across several units, guaranteeing component uniqueness across all of them. We would need to establish a new term for the concept of “a series of library references across units, which together should all yield a unique set of components.”

Under the hood, we’d instrument this by creating a key for each instance of this new concept, and passing the key to each usage of the library reference block. The LTI provider would be sure not to return repeat components for successive launches with the same key.
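A minimal sketch of the shared-key idea follows, assuming in-memory state and hypothetical names (`ReferenceGroupSelector`, `group_key`, `reference_key`). A real implementation would need to persist assignments per learner; this only illustrates how a shared key lets successive reference blocks avoid repeats.

```python
import random
from typing import Dict, Iterable, List, Tuple


class ReferenceGroupSelector:
    """Deal out non-repeating components to references that share a group key."""

    def __init__(self, pool: Iterable[str]) -> None:
        self._pool = sorted(pool)
        # (group_key, user_id) -> {reference_key: chosen component ids}
        self._assigned: Dict[Tuple[str, int], Dict[str, List[str]]] = {}

    def select(
        self, group_key: str, user_id: int, reference_key: str, count: int
    ) -> List[str]:
        assigned = self._assigned.setdefault((group_key, user_id), {})
        if reference_key in assigned:
            # Re-rendering the same reference returns the same components.
            return assigned[reference_key]
        taken = {block for blocks in assigned.values() for block in blocks}
        remaining = [block for block in self._pool if block not in taken]
        rng = random.Random(f"{group_key}:{user_id}:{reference_key}")
        chosen = rng.sample(remaining, min(count, len(remaining)))
        assigned[reference_key] = chosen
        return chosen
```

For example, three units that each request four components from a twelve-component pool under the same group key would together show all twelve components exactly once to a given learner.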

Specific Questions:

  1. Would we want to implement this? (We wouldn’t have to do it right away - could punt the decision until later).

  2. If we implemented this, what would we call this new concept?

    1. A “reference group” or “reference context”? (too vague?)

    2. An “invocation”? (too technical?)

    3. A “randomized assignment”? (suggestive of more than it actually is?)

    4. An “activity”? (too overloaded with other use of this term?)

Urgency: Low - We could implement this later.

Answer: TBD

Configuring V2 libraries

Which course authors and/or courses can reference content from which libraries?

TODO

How do we manage and enforce library permissions?

TODO

How does access control work across Open edX instances?

TODO

Export and Import

The export/import section is still under construction.

How are V2 libraries exported and imported?

What does the exported file format look like?

Answer: Didn’t get a chance to dive into this.

What happens to V2-library-referenced content when a course is exported and imported?

Context:

In theory, course authors should be able to export their course to OLX (contained within a .tar.gz file) from one Studio instance, import it into another Studio instance, and it should Just Work. In practice, courses sometimes rely on entities with differing lifecycles. For example, video files are not exported along with courses; course authors must handle video hosting separately, ensuring that their videos are reachable from both their source and destination course.

This complexity applies to libraries too. Since content libraries exist as separate entities, it is not a given that their contents would be included in a course export. We could imagine two scenarios at different extremes:

  1. Referenced library blocks are exported with the course. They are exported as ordinary blocks would be, so their relationship with the original library is severed. When the course is imported into Studio, either the same instance or a different one, it does not matter whether the original library is present, since the blocks are ordinary course blocks. In this scenario, it is unclear how randomized library references would be treated.

  2. Referenced library blocks are not exported with a course. The relationships between the course and the library are reflected in the exported OLX. When the course is imported back into Studio, the library blocks will not render unless the proper version of the same library exists in the instance.

For what it’s worth: V1 libraries take an approach that is between the extremes. Library content is copied into courses when it’s referenced, so when a course is exported, the library content comes with it. However, the relationship with the library is also maintained, so that if the library is present upon course import, then the library blocks will be linked to it. However, two big drawbacks exist in the current system:

Here’s a matrix of the approaches we could take:

Columns (a) to (c): decreasing library centralization.

Rows (1) to (3): decreasing size of course exports.

(a) Library only exists on original instance, so course-library relationship is severed when exporting to other instances.

(b) Library versions are centrally managed by original instance, but replicated on other instances.

(c) Library versions managed in a distributed fashion across instances, identifying versions by git-like hashes.

(1) Entire library is bundled with the course export.

(1a) nonsensical

(1b) The original library is brought to the new instance along with the course, although the library does not become part of the instance’s list of libraries available for authoring. Instead, the library maintains a connection with the original instance, and can receive updates from the library on that instance. In this sense, the library copy is a replicate of the “centrally” managed library.

Pro: Libraries can be used across instances, while still maintaining simplicity of central management.

Con: Library bundling increases export size.

Challenge: Would need to figure out how to stop remote library keys (from original instance) from clashing with local library keys (on new instance).

Challenge: We’d need to figure out the replication strategy. Push-by-original-instance? Or pull-by-new-instance? And how do we keep track of the URL of the original instance?

(1c)

Specific version(s) of each library can be chosen for export. Multiple versions of the same library can be imported into an Open edX instance without disrupting one another, and importing a library at a version does not necessarily mark it as the “newest”. Library versions are uniquely identified and referenced by a hash of their contents, avoiding conflicts that may arise between human-friendly version names/numbers across instances.

If the required version of the required library exists on an Open edX instance, then the course automatically uses it. Otherwise, library blocks in the course gracefully degrade to rendering hidden blocks.

Example story: https://lucid.app/lucidchart/4950383b-a703-4bd8-b08e-f94a6f144c1a/edit?invitationId=inv_73bdba67-80b6-45d3-9c7b-bf2e5d8e0ac3&page=Mm5j-6d_dnBg# (in the “Import/Export Story” tab, start at the very top and read downwards).

(2) Necessary pieces of library are bundled with course export.

(2a) Essentially, library blocks turn into ordinary blocks when a course is exported.

Pro: Courses will work out-of-the-box when exported and imported. We avoid wrestling with cross-instance library versioning altogether.

Con: Use of a library is confined to a single instance.

Challenge: Would need to develop a non-library version of randomized content banks.

(2b) Same pros and cons of 1b, except:

  • remove Con: Library bundling increases export size.

  • new Challenge: Including just the necessary library blocks in the course export.

(2c) Same pros and cons of 1c, except:

  • remove Con: Library bundling increases export size.

  • new Challenge: Including just the necessary library blocks in the course export.

(3) Library is excluded from course export.

(3a) nonsensical

(3b) Same pros and cons of 1b, except:

  • remove Con: Library bundling increases export size.

  • New Con: Course doesn’t work out of the box until first replication-from-original-instance occurs.

(3c) Same pros and cons of 1c, except:

  • remove Con: Library bundling increases export size.

  • New Con: Course doesn’t work out of the box. User must import correct library version into instance.
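For the (c) column, where library versions are identified by git-like hashes of their contents, one possible way to derive such an identifier is sketched below. The use of SHA-256, the input shape (a mapping from file path to bytes), and the function name are all assumptions made for illustration; the document does not specify a hashing scheme.

```python
import hashlib
from typing import Mapping


def library_version_hash(files: Mapping[str, bytes]) -> str:
    """Return a hex digest identifying a library version by its contents.

    Paths are visited in sorted order so that the result does not depend on
    the order in which files were added to the mapping.
    """
    digest = hashlib.sha256()
    for path in sorted(files):
        content_digest = hashlib.sha256(files[path]).hexdigest()
        # Hash the path together with the digest of the file's content.
        digest.update(f"{path}\0{content_digest}\n".encode("utf-8"))
    return digest.hexdigest()
```

Two instances holding byte-identical library contents would compute the same identifier, which is what lets a course match its required library version without relying on human-friendly version names.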

Decision

Urgency: High

Answer: Approach 1b.

Some additional decisions & implications:

What does an OLX export with V2 library content look like?

Urgency: High

Answer, Iteration 1:

./                         # Root directory of an exported .tar.gz for a course.
  course/                  #   As before, the ./course/ folder contains:
    assets/                #     (1) an assets folder,
    chapter/               #     (2) one folder for
    .../                   #         each
    vertical/              #         block
    video/                 #         type, and
    course.xml             #     (3) the root course.xml file.
  libraries/               #   New folder! Contains all referenced libraries, one XML file per library.
    Greendale_VideoLib.xml #     Given a library with the key `lib:ORG:LIBSLUG`, filename is `ORG_LIBSLUG.xml`.
    Greendale_ChemLib.xml  #     <- For example, this defines 'lib:Greendale:ChemLib'.
    CityClg_mathlib.xml    #     XML files contain just metadata, including bundle UUIDs, but no block content.
  bundles/                 #   New folder! Contains definitions for bundles that were referenced in ./libraries/.
    a578e5...5b181d/       #     In the blockstore-course future, ./course/ would also reference ./bundles/.
    259fa0...71f87f/       #     The idea is that all content lives in ./bundles/ and other folders are metadata.
    d8b24f...2640c8/       #     Bundle folder names are the full 32-char, lowercase, un-hyphenated bundle UUID.
      bundle.xml           #       bundle.xml contains title, slug, version #, and snapshot hash.
      files/               #       Finally, files/ contains the OLX serialization of the bundle contents.
        problem/           #
        .../               #
        video/             #

Dave pointed out that the structure of the bundles/ directory would not be very human-friendly, even if it reflects the blockstore data model.

Answer, Iteration 2:

./                         # Root directory of an exported .tar.gz for a course.
  course/                  #   As before, the ./course/ folder contains:
    assets/                #     (1) an assets folder,
    chapter/               #     (2) one folder for
    ...                    #         each
    vertical/              #         block
    video/                 #         type, and
    course.xml             #     (3) the root course.xml file.
  libraries/               #   New folder! Contains all referenced libraries and their content.
    edx.org/               #     Folder is structured by domain,
       Greendale/          #       then organization,
         VideoLib/         #         then library slug,
           v3/             #           and finally library version, mirroring the library key structure.
             ...           #
         ChemLib/          #         Each library contains an OLX serialization of its contents,
           v2/             #
             meta.xml      #           along with a "meta.xml" file to hold version #, snapshot hash, etc.
             problem/      #
             ...           #
             video/        #
       CityCollege/        #
         mathlib/          #
           v12/            #
             ...           #
           v13/            #
             ...           #
    labxchange.org/        #
      ...                  #
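
As a rough sketch of the Iteration 2 layout, the export folder for one library version could be derived from the components of its key. The function name `library_export_dir` and its parameters are assumptions made for illustration, not an existing API.

```python
from pathlib import PurePosixPath


def library_export_dir(domain: str, org: str, slug: str, version: int) -> PurePosixPath:
    """Build libraries/<domain>/<org>/<slug>/v<version>, relative to the export root.

    Hypothetical helper mirroring the directory listing above.
    """
    return PurePosixPath("libraries") / domain / org / slug / f"v{version}"


# The meta.xml file sits directly inside the version folder.
chemlib_dir = library_export_dir("edx.org", "Greendale", "ChemLib", 2)
chemlib_meta = chemlib_dir / "meta.xml"
```

Because the version is part of the path, two versions of the same library (such as mathlib v12 and v13 above) can sit side by side in one export.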

Referencing V2 library content from V2 courses

V2 (blockstore-backed) courses don’t exist yet, so we don’t need to worry about implementing this as part of BD-14. Rather, these questions are here so that we keep in mind the eventual migration to V2 courses, avoiding decisions during BD-14 that would complicate that process.

How might V2 library content referencing differ in V2 courses (as opposed to V1 courses)?

We didn’t end up getting a chance to dive into this, but it’s worth keeping in mind.

Archive

Misc notes and stuff

Approach 1: Distributed version control

This is the first approach that came to my mind, after thinking about how to preserve course-library relationships across instances while preventing library version conflicts.

  • Library blocks don’t export with a course’s OLX. However, library content customizations do export with the OLX (since they are saved on the LibraryReferenceBlock).

  • Libraries can be exported. Notably, specific version(s) of each library can be chosen for export. Multiple versions of the same library can be imported into an Open edX instance without disrupting one another, and importing a library at a version does not necessarily mark it as the “newest”. Library versions are uniquely identified and referenced by a hash of their contents, avoiding conflicts that may arise between human-friendly version names/numbers across instances.

  • If the required version of the required library exists on an Open edX instance, then the course automatically uses it. Otherwise, library blocks in the course gracefully degrade to rendering hidden blocks.

Example story: https://lucid.app/lucidchart/4950383b-a703-4bd8-b08e-f94a6f144c1a/edit?invitationId=inv_73bdba67-80b6-45d3-9c7b-bf2e5d8e0ac3&page=Mm5j-6d_dnBg# (in the “Import/Export Story” tab, start at the very top and read downwards).
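
A minimal sketch of the two mechanisms in Approach 1: identifying a library version by a hash of its contents, and degrading gracefully when that version is not installed. SHA-256, the function names, and the in-memory dict standing in for an instance's library storage are all assumptions for illustration; the notes above do not specify any of them.

```python
import hashlib
from typing import Dict, Optional


def content_hash(files: Dict[str, bytes]) -> str:
    """Hash a library version's files (path -> bytes), independent of dict order."""
    digest = hashlib.sha256()  # Assumption: the actual hash function is undecided.
    for path in sorted(files):
        # Length-prefix each field so that path/content boundaries are unambiguous.
        for field in (path.encode("utf-8"), files[path]):
            digest.update(len(field).to_bytes(8, "big"))
            digest.update(field)
    return digest.hexdigest()


def find_library_version(
    installed: Dict[str, Dict[str, bytes]], required_hash: str
) -> Optional[Dict[str, bytes]]:
    """Return the installed library version with this hash, or None if absent.

    On None, the caller would render the library blocks as hidden blocks.
    """
    return installed.get(required_hash)
```

Since the identifier depends only on the content, two instances that import the same library version arrive at the same identifier without coordinating version numbers.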

Approach 2: Centralized version control

Dave Ormsbee, in reaction to Approach 1:

It might be useful to export [library blocks], even if it doesn’t import back on the other side. There are a number of tools that do content analysis of exports, and if you’re using content from multiple libraries, it’d be convenient to have that in the same export file.

….

I’m leery of making that step [(exporting at any version)] into full blown distributed version control. A question: Can we get away with only having one primary source of a library, and allow instances to essentially make read-replicas?

So for instance, a particular library could exist on edx.org and have some set of users who can modify it, and a simple linear history. But we can configure something so that Edge keeps an updated read-replica, so that it should theoretically always have that library’s data available.

This could be low level functionality that gets built at the blockstore layer as well. Some kind of namespaced replication so that Edge’s blockstore still has edx.org / {library_uuid} . Right now, the OLX export for a library assumes that the library belongs on whatever instance the course is in, but we could explicitly reference a fully qualified domain.

Essentially, instead of relying on library content hashes to uniquely identify libraries across instances, we’d somehow annotate and/or namespace libraries with their origin Open edX instance. Furthermore, libraries would be bundled with course exports.
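
To make the namespacing idea concrete, here is a small sketch in which each library reference carries its origin instance and a replica is read-only everywhere except at its origin. The class names, the in-memory storage, and the domain `edge.edx.org` are hypothetical; the notes above only describe the concept.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class NamespacedLibraryRef:
    """A library identified by its origin instance plus its UUID."""
    origin_domain: str  # e.g. "edx.org"
    library_uuid: str


class ReplicaStore:
    """Toy stand-in for one instance's library storage (not a Blockstore API)."""

    def __init__(self, local_domain: str) -> None:
        self.local_domain = local_domain
        self._libraries: Dict[NamespacedLibraryRef, dict] = {}

    def put(self, ref: NamespacedLibraryRef, data: dict) -> None:
        self._libraries[ref] = data

    def get(self, ref: NamespacedLibraryRef) -> Optional[dict]:
        return self._libraries.get(ref)

    def is_writable(self, ref: NamespacedLibraryRef) -> bool:
        # Only the origin instance may modify a library; others hold read-replicas.
        return ref.origin_domain == self.local_domain
```

With this shape, a store on Edge can hold `edx.org / {library_uuid}` alongside a locally authored library that happens to share the same UUID, because the origin domain is part of the lookup key.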

TODO: write more

  • Jennifer’s opinion

    • content should be edited in lib

    • settings should be edited in course

    • distinction is sometimes fuzzy–e.g. content is sometimes corrected within the context of a course today (minor changes to text)

  • fields that are ambiguous content v setting?

    • display_name?

      • renaming after import isn’t usually done in v1 libraries

      • confusion around problem titles coming from libraries sometimes

        • Names in libraries often have different meaning: e.g. 1a, 1b, 1c. vs. what it would be in the course (“Problem 1: {helpful description}”).

    • video start time

      • not used much currently?

      • would need to verify with CourseGraph

  • in units with library content, is there other stuff?

    • often a preamble HTML followed by random problems

    • mixture of HTML+video+problems in library very uncommon, since specific-block selection has been broken for a while

    • why: a single LTI launch for the entire unit would be cool

      • save ourselves an iframe layer

  • stress test

    • 30 blocks pulled in from a library of 50 in a single Unit

      • this is because it’s the only way right now to make sure that content doesn’t repeat.

        • interesting data model implications. Have a saved state for this inclusion of this library, but allow it to be launched across multiple pages?

  • LMS interactions

    • Can one LTI-launched thing map to multiple scores? This would be nice for grading purposes, but may be hard to retrofit.

      • Reactive use case: Need to correct and re-score one problem from the possible selection of the library.

      • Grades information.

      • Deep linking here?

    • Does it make sense for there to be two inclusion modes: one with single and one for a whole unit?

    • Settings tend to be reused across a whole subsection, or even a whole type, e.g. “when using it as a HW, always give them five attempts”.

      • Rare case: We find that one problem is too hard and adjust it to allow for more attempts.

Customization

  • TERMINOLOGY: Often configuration is reused. If you are selecting five problems from a library to put in an exam, you likely want to set the same policy to all of them (max attempts, don’t show the answers after it’s due, etc.) Should this be a top level Thing in the system, and what should it be called?

    • ???? = “the configuration/overrides applied to some content from the library by the course”

      • “Settings” = what we call it today. Does this still make sense?

      • Jennifer: Things that vary from context to context: Difficulty (attempts, randomization, etc.) → more or less rigorous. We do tend to lump a bunch of 'stuff' under the same umbrella of settings, and it's fair to call out that some are best set at the course/subsection/section level (release date, graded, visible-to-track) and others make more sense to keep closer to the library block (display name, attempts, show correctness) but that can vary depending on where they appear in the course.

      • Connor: “Problem Complexity Settings”?

      • Chimuanya: “Problem Configuration”?

        • C: Can we use a more general name that’s useful in both areas

      • Monica: “Library Block Settings? Library Block Controls?”

      • Marco: “Course Defaults - Videos, Problems, HTML, whatever else.”

      • Jeremy: What are we configuring vs overriding?

      • What about content vs. settings? Is everything overridable?

        • Can content overrides persist? Valuable but hard?

      • Monica: One thing that came up when doing UX was the concept of a preview, so they could more easily see diffs on update.

      • Chimuanya: Valuable to elevate the policy rules to multiple problems.

      • Marco: Separate course defaults (e.g. videos are downloadable) from content libraries usage (extra step that these library blocks differ in their defaults, do you want to apply it?).

        • Chimuanya: So course level setting, not an override at this point. If you import 5 problems into your course, you’re given a warning that they don’t match your current defaults. Do we need that kind of pop-up? Open question.

      • Jennifer: Settings discussion feels like the tension between getting a problem to behave as desired and efficiency of setting up the course. Suggestion: One of the most significant settings, “graded”, is at the subsection level. If we wanted to take a stab at minimizing clicking, having “Assignment Settings” might help.

      • Monica: Partner interview: would love to set settings based on the kind of assignment it’s a part of.

      • Marco: Feels that it would be better to have course-level rules. Worried about having anything anywhere in Studio.

      • Jeremy: Differentiation between problems inside the libraries and the courses that run those problems. Are there blocks that live in libraries, and do they have settings associated, and are there use cases for overriding and those overrides come at different levels (course, subsection, block). Aggregated use cases.

      • Are there settings that only exist at the course level and not at the library level?

      • Chimuanya: Rigor-related settings do make sense at the library level. But when bringing that content in context, there are course-level settings.

      • Chimuanya: Sequence-level / course-level inherited overrides always win when bringing in library content.

      • Marco: As long as authors can see where the rules apply.

      • Course would hold overrides

      • Monica: Few thoughts for now or another time:

        • What about new course shells with no specified settings, just defaults in place? What happens when someone pulls in a library into a randomized content block in a course that's just being created? Are we assuming design is happening explicitly in the course always and only?

        • Use case re: problem settings that came up occasionally when at Stanford was instructors who wanted to have XYZ settings for their problems during the course, and then, when the course ended and was available in archive mode, they wanted to apply different settings (ex: show answer settings stricter in the live course, more relaxed so things could be there as a resource after the course ended).

      • Jennifer: Seeking clarification, possibly tabled for next time: does this decision framework need to be tested against "what if libraries scale up to units/subsections/larger pieces of content"? Is there more value to keeping library settings if there are more complex pieces of content or reuse?

      • Monica: Researchers at Stanford using Stanford’s Open edX instance created content libraries and then used them to flesh out multiple courses delegated out to different universities they were working with. Since these were for research projects, being able to control the settings, or at least define things within the content library, was important to the researchers.

      • Marco: Content no matter where it is authored is a complete representation of its editor + settings content, whether library or course.

      • Jennifer: It feels like one of the efficiencies libraries can bring is ‘bulk editing’ the settings of all library content, and I don’t want to conflate the value of bulk editing with the value of having decisions about settings pre-made at the library versus course level.

  • Would the above policy Thing be mapped at the sequence level? Reused across sequences? (Everything has to be overridden individually today.)

  • What parameters are always provided by courses and never by libraries?

  • What parameters are always provided by libraries and never overridden by courses?

  • What parameters may have defaults in libraries but potentially overrides in courses?

    • Note: Some fields are used differently in libraries and courses–e.g. the display name might be “1a”, “1b”, “1c”, etc. in a library, but need to be more descriptive in a course.

  • Would we ever include more than one type of thing from the library in the same Unit?

    • Tech side: Is library customization mostly stateless, or is it configured in the library?

  • We are strongly leaning toward approach #2. Why?

    • Portability: You have references to something that can be centrally managed.

    • Simplicity: There’s a lot less deep magic going on. Things belong in a library. They’re referenced in an LTI launch situation, where parameters are specified by the course and applied by the LTI provider at runtime.

    • Platform extensibility: It’s pretty crazy powerful for content re-use if you can pick anything in a library, set up your parameters, and launch it LTI-style from anywhere. edX.org could become a central repository for content used by Edge and MITx and the like.

  • However, we realize that using LTI means that library-referenced content would be served from within an IFrame. IFrames are known to negatively impact frontend performance. Furthermore, we already encapsulate each learning unit’s content within an IFrame, so this would be a nested IFrame.
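
The override precedence floated in the Customization notes above (library blocks carry defaults, the course holds the overrides, and inherited sequence-level or course-level overrides win) could be sketched roughly as follows. The function name, the setting names, and the ordering among the course-side layers are assumptions; the discussion did not settle them.

```python
from collections import ChainMap
from typing import Any, Dict, Mapping


def effective_settings(
    library_defaults: Mapping[str, Any], *course_layers: Mapping[str, Any]
) -> Dict[str, Any]:
    """Merge settings so that course-side layers win over library defaults.

    course_layers are given from highest to lowest precedence (assumed order,
    e.g. sequence-level overrides first, then course-level defaults).
    """
    # ChainMap looks keys up in its first mapping first, so put the library last.
    return dict(ChainMap(*course_layers, library_defaults))


library_defaults = {"display_name": "1a", "max_attempts": 1}
sequence_overrides = {"max_attempts": 5}        # "as a HW, always give five attempts"
course_defaults = {"show_correctness": "past_due", "max_attempts": 3}

settings = effective_settings(library_defaults, sequence_overrides, course_defaults)
```

Here `settings["max_attempts"]` comes from the sequence-level layer, while `display_name` falls through to the library default because no course-side layer sets it.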