[BD-14] Library use cases & implementation discovery

This is a work-in-progress artifact of these two tickets:

Once general consensus is reached, I’ll move some of the technical decisions to edx-platform ADRs (architectural decision records).

 

Overarching architectural goals

In tandem with bringing significant improvements to the content authoring workflow, we believe BD-14 presents an opportunity to advance the technical state of content authoring and core. To achieve this improvement, we have a few top-level architectural goals in mind, in no particular order:

  1. Store library content in Blockstore. This has been a goal since the outset of the project – it is part of the original specification. Just as LabXChange does today, we will store the master copy of v2 library content in Blockstore.

    • Why Blockstore? The decision to move away from Modulestore and towards Blockstore has been thoroughly hashed out at several levels, and Blockstore Design shares some of the justification. LabXChange is currently successfully proving out Blockstore’s utility in a production environment. Happy to add more details here if helpful.

  2. Do NOT store library content in Modulestore. This may seem obvious given the first goal, but it’s not. Since courses are currently stored in Modulestore, we need some way of referencing library-based content from the context of Modulestore-backed content. One way of doing this would be to copy all course-referenced library content from Blockstore into Modulestore. While technically an option, we would like to avoid doing this, because (i) it is hugely inefficient space-wise, and (ii) Modulestore is fraught with complexity and performance issues, to which we are trying to avoid coupling BD-14 as much as possible.

  3. DO store library content customizations with the course in Modulestore. Although we want library content to live firmly in Blockstore, we still want any customizations that course authors make to referenced content to live within the course structure/definition. This helps us establish and maintain a distinction between (a) original library content and (b) course-local tweaks that authors may make, a distinction which we believe is important for both technical and user-facing coherence.

    • An intentional implication of this is that library content customizations get exported along with course content, whereas the original library content does not. Furthermore, re-runs of a course will copy any library content customizations into the new course run, with further customization of library content in the original run not affecting the new run, and vice versa. These behaviors mirror the intended behavior of v1 libraries, but the v2 implementation should achieve them with more clarity and fewer edge-cases.

  4. Build new frontend features using the micro-frontend (MFE) framework. This is also a stated goal of the original BD-14 pitch. Wherever feasible, we will build all new frontend features using edX’s React-based micro-frontend framework, as opposed to building within the legacy Django-templated frontend upon which most of Studio is currently implemented. We have already begun this process: the Library Authoring MFE has been created and is currently deployed for preview in the staging environment.

    • One area where this will get interesting is the editor rewrites. It is not yet clear whether or not we can leverage the MFE framework in the editors.

  5. Continue allowing for library content to be exposed via LTI.

    • There is currently an LTI Provider implementation in the content_libraries app. It is not used on edX.org, but is presumably used by other instances (some OpenCraft-hosted ones?).

    • We believe that enabling library content to be exposed via LTI is a big value-add for V2 libraries, making Open edX more powerful in any setting where different LMSs, CMSs, or SISs are being integrated.

    • Although BD-14 will not focus on developing this aspect of content_libraries, we would like to leave the door open for future improvements and usage of the LTI Provider functionality. Phrased another way, we shouldn’t “build against” the LTI provider implementation.

    • This is an amendment to an older goal that we dropped (see “How is referencing implemented?” below for context):

      • Leverage LTI as the mechanism for referencing library content from courses. LTI (learning tools interoperability) is an established standard for sharing learning content between LMSs. Viewing Open edX courses as “LTI consumers” and v2 Open edX libraries as “LTI providers”, we believe that the “referencing” part of BD-14 can be essentially implemented as a thin layer on top of the LTI specification. The Open edX platform is already capable of consuming and providing LTI v1.3 content, giving us a head start, although the existing “provider” functionality may need some enhancements to meet BD-14’s requirements.

Product terminology

Bold terms are already used in production. Underlined terms are new with BD-14 and thus could still be revised.

Term

Definition

Notes

Content Library

A collection of reusable components.

 

Legacy Content Library

Our user-facing term for v1 (modulestore-backed) libraries, once BD-14 lands.

 

Problem Library

A v2 (blockstore-backed) library that may only contain problem components.

 

Video Library

A v2 library that may only contain video components.

 

Complex Library

A v2 library that may contain any mixture of component types.

 

Library Content Reference

A single location in a course in which component(s) from a library are included.

In BD-14, we will build a “library reference block” to implement this operation.

Library content referencing

The act of using a Library Content Reference to include one or more components from a library into a course.

I chose this term in favor of “inclusion” or “sourcing” because I think it is more accessible and less ambiguous. Open to other opinions here. More radical idea: call it a “launch” as a callback to LTI. -Kyle

Library content customization

The act of making edits to components included via a Library Reference. The edits are local to the Reference within the course; they are not reflected back in the source library.

“customization” is my suggestion, happy to use a different term instead.

Library Version

An immutable snapshot of a library, as published at a certain point in time.

Accompanied by a version name and comment, perhaps?

Library publishing

The act of releasing all changes to a v2 library since its last publish, creating a new Library Version.

v1 libraries do not support the idea of “publishing”.

Course

A run of a course, which is authorable in Studio and hosted in LMS.

More precisely, this is called a “course run” in order to differentiate it from a “catalog course” (which is advertised on the marketing site).

Course publishing

The act of pushing out all edits to a course made in Studio such that they manifest in the LMS. Until a course is published, the edits only manifest in Studio.

Some edits must be explicitly published using the “Publish” button. However, some edits, especially structural or course-wide changes, automatically trigger a publish.

Component

A single piece of course content. Examples include HTML, a discussion, a video, an ORA, a problem (note that a problem may contain multiple responses), and advanced components.

Every component is an XBlock.

Advanced Component

Components other than the core five (HTML, discussion, video, problem, ORA), some of which are authored by external providers, and many of which are not fully supported. Must be explicitly enabled in advanced settings.

 

Unit

A series of zero or more components, displayed on a single page.

Under the hood these are called “verticals” sometimes. In the Learning MFE, the contents of a unit are rendered together within an iframe.

Background and technical terminology

Libraries, blocks, courses, and contexts

A content library (or just “library”) is a collection of reusable content. Each reusable piece of content is an XBlock usage, aka a block. These blocks are the same pieces that make up courses; by storing them in libraries, though, they can be authored, versioned, and referenced independent of any course-authoring workflow.

At first, this document will focus on libraries made of component-level blocks; that is, blocks that are individual bites of content, such as problems, videos, HTML, and advanced components (polls, ORA, etc). Future iterations of this document may consider libraries that contain structural blocks, such as units, sequences, and sections.

On edx.org, content from libraries is primarily intended to be reused in courses. With the advent of LabXChange, though, the Open edX platform has also begun serving learning content from a labxchange pathway, which is “a short collection of XBlocks that a student works through in a linear sequence”. Excitingly, these pathways actually store all their content in v2 libraries, although the content inclusion mechanism and authoring interface are different from what we will be building for edx.org courses.

So, we have generalized courses into the idea of learning contexts, or just “contexts”. Courses, labxchange pathways, and even libraries themselves are types of contexts, and all benefit from the ability to consume reusable content. So, although BD-14 product messaging sometimes presents libraries as collections of content for re-use across courses, in reality content libraries are a collection of blocks authored for re-use across learning contexts.

Storage backends

Libraries exist for use on edx.org today, backed by the split-mongo modulestore (aka “split-mongo”), our MongoDB-backed, versioning, immutable-definition content storage system. We refer to this generation of the content library feature as version 1 (“v1”). For a dive into how v1 libraries are implemented, check out Dave’s v1 library writeup.

All active edx.org courses are also stored in split-mongo. In this document, we will call these courses v1 courses. There also exist courses in the deprecated old-mongo modulestore (which one may call v0 courses), but since old-mongo cannot store libraries nor courses referencing library content, we will not talk about it further in this document.

BD-14 aims to replace v1 libraries with the version 2 (“v2”) implementation, backed instead by blockstore, our SQL- and Amazon S3-backed, versioning, immutable-definition content storage system, which is:

  • not dependent on edx-platform

  • significantly simpler than either modulestore, and

  • designed ground-up with content reuse in mind.

The content in v2 libraries will need to be usable by split-mongo-backed courses, although the technical design of v2 libraries will keep in mind our desire to eventually move all course content to blockstore, which we’ll speculatively refer to as v2 courses. This should be easy, since v2 libraries already serve content to labxchange pathways, which are blockstore-backed.

Keys

Content stored in the Open edX platform is referenced by a variety of types of opaque keys, which are generally semi-human-readable, URL-safe, immutable, and stable string identifiers. The adjective “opaque” means that each key should generally be treated as indivisible, allowing us to change keys' structure over time without breaking assumptions. For example, URL parsers should not assume that learning context keys begin with course-v1:, because that URL may one day need to handle keys prefixed with lx-pathway: or lib: instead.

Different types of keys refer to different entities and have different structures, as shown below. Key parts written in CAPITALS are variables that would be substituted with specific content information.

  • context keys identify learning contexts.

    • course-v1:ORG+COURSE+RUN is a v1 course key.

    • library-v1:ORG+LIBRARY is a v1 library key.

    • lib:ORG:SLUG is a v2 library key.

    • lx-pathway:PATHWAY_UUID is a labxchange pathway key.

    • Note: Some of these keys can also contain version info (+version@abc123...), but this version info should be stripped out before use in LMS.

  • definition keys point to a block’s content in the data storage backend.

    • def-v1:DEFINITION_ID+type@BLOCK_TYPE is a v1 block definition id, identifying a block definition within split-mongo modulestore.

    • bundle-olx:BUNDLE_UUID:VERSION:BLOCK_TYPE:OLX_PATH is a bundle definition key, identifying a block definition within a blockstore OLX bundle.

    • i4x://ORG/COURSE/RUN/SLASH_SEPARATED_BLOCK_IDS is a v0 block location, which served as both a usage and definition key for old-mongo course blocks.

  • usage keys locate the usage of a defined block in a learning context (definition + context = usage).

    • block-v1:ORG+COURSE+RUN+type@BLOCK_TYPE+block@BLOCK_ID is a v1 block usage key, identifying the usage of a block in a v1 course.

    • lb:ORG:SLUG:BLOCK_TYPE:USAGE_ID is a v2 library block usage key, identifying the usage of a block in a v2 library. Usages of blocks in libraries are not intended for learner consumption, but instead for referencing from other contexts. However, library block usages can be interacted with directly by authors as they preview their work – user state is stored ephemerally.

    • lx-pb:PATHWAY_UUID:BLOCK_TYPE:USAGE_ID and lx-pb:PATHWAY_UUID:BLOCK_TYPE:USAGE_ID:CHILD_USAGE_ID are labxchange pathway usage keys. The former format identifies a top-level block usage within the pathway, whereas the latter format identifies a nested block.

    • i4x://ORG/COURSE/RUN/SLASH_SEPARATED_BLOCK_IDS is a v0 block location, which served as both a usage and definition key for old-mongo course blocks.
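To make the “opaque” idea concrete, here is a minimal sketch that dispatches on a context key's namespace prefix instead of assuming every key starts with course-v1:. It uses only the prefixes listed above. The function name context_key_kind and the returned labels are hypothetical, chosen for illustration; real code would presumably go through the platform's key classes rather than splitting strings by hand.

```python
# Hypothetical sketch: classify a learning context key by its namespace prefix.
# The labels are illustrative only; they are not part of any platform API.
CONTEXT_KEY_KINDS = {
    "course-v1": "v1 course",
    "library-v1": "v1 library",
    "lib": "v2 library",
    "lx-pathway": "labxchange pathway",
}


def context_key_kind(key: str) -> str:
    """Return a label for the kind of learning context that `key` identifies."""
    # Only the namespace before the first colon is inspected; the remainder
    # of the key is treated as opaque and never parsed here.
    namespace, separator, _rest = key.partition(":")
    if not separator or namespace not in CONTEXT_KEY_KINDS:
        raise ValueError(f"Unrecognized context key: {key!r}")
    return CONTEXT_KEY_KINDS[namespace]
```

A caller that branches on the returned label keeps working when a new kind of context is added to the table, which is the point of not hard-coding course-v1: anywhere.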

Implementation questions

This section is organized in a question-answer format, even though many of the questions are already answered.

To see which questions are still open, search for the text “TBD” (to be determined). Sections marked “TODO” are things I’m still working on fleshing out.

Questions with “High” urgency need to be answered before closing out the v3.1 milestone. For questions with “Medium” or “Low” urgency, we can safely hold off on answering them, as long as we have them in the back of our mind.

Authoring V2 libraries

Should component editors be rendered in a secure sandbox?

Answer: TBD by https://openedx.atlassian.net/browse/TNL-7458 .

Should the library authors be able to name and/or comment their version updates?

Context: Unlike in v1 libraries, when a library author changes a v2 library, they will need to explicitly push a “Publish Library” button to push out the changes, which will release a new version of the library. This raises the question: would we like to enable authors to annotate these published versions?

  • Example version names: “v1”, “v1.5”, “v2”, “v2021-Fall”.

  • Example version comments: “Added new cell question”, “Change wording of question 1”, “Remove misleading answer from question 3”.

Urgency: Medium - this affects how we think about versioning, but our stance could be revised later.

Answer:

  • Yes, we believe this is a great feature that will unlock additional value for course teams and course department collaboration.

If library authors are able to name/comment updates, should they also be required to?

Said another way: Assuming the answer to the above question is “yes”… should we go one step further and mandate that a version name and/or comment is included?

Pros:

  • Interface consistency - all library versions will have names.

  • May encourage collaboration in authoring by nudging library authors to externalize what they’re changing.

Cons:

  • Library authors may use garbage (or even worse: misleading) names just to fill the requirements.

  • For libraries only ever authored and used by an individual, this could just become an unhelpful point of friction.

Urgency: Low - this is a UX decision that we could make later.

Answer:

  • Require a version name.

  • Version comment is optional.
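As a rough illustration of this answer, the sketch below models a publish operation that insists on a version name and treats the comment as optional. Everything here is an assumption made for the example: the LibraryVersion record, the publish_library function, and the choice to number versions sequentially are not taken from the blockstore or content_libraries code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)  # frozen: a published version is an immutable snapshot
class LibraryVersion:
    number: int
    name: str
    comment: Optional[str] = None


def publish_library(history: List[LibraryVersion], name: str,
                    comment: Optional[str] = None) -> LibraryVersion:
    """Append a new version to `history`, requiring a non-blank name."""
    if not name.strip():
        raise ValueError("A version name is required to publish.")
    version = LibraryVersion(
        number=len(history) + 1,   # assumption: versions are numbered from 1
        name=name.strip(),
        comment=comment or None,   # an empty comment is stored as "no comment"
    )
    history.append(version)
    return version
```

Rejecting blank names does nothing about the “garbage name” concern listed under Cons; that would still have to be addressed through UX rather than validation.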

What types of content can be stored in libraries? Are there different types of libraries?

The original pitch for BD-14 described three library types:

  • Problem libraries: Can only contain problems.

  • Video libraries: Can only contain videos.

  • Complex libraries: Can contain a mix of block types.

The idea is that a library author may choose to mark their library as Problem-only or Video-only, and thus may see a richer library-editing experience, as the library editor would be tailored to just problems or videos. Otherwise, the Complex library editor would be shown, which allows for generic editing of a collection of blocks.

Some questions:

  • Do we still like this distinction?

  • For Problem libraries, are we OK with “problems” meaning “CAPA problems”? This would exclude Drag n' Drop and all other advanced problem types.

  • For Complex libraries, do they contain a mix of component block types? Or, should libraries be able to contain structural blocks like Units, Subsections, and/or Sections?

Urgency: Medium - We are assuming “yes” on this feature since it was in the old spec and is part of the prototype, but we could reverse the decision later without too much fallout.

Answer:

  • Yes, three library types (problem, video, and complex).

  • Yes, it’s okay that problem libraries will only have CAPA problems.

  • Structural blocks in complex libraries: still TBD
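One way to picture the three library types is as a small table of allowed block types, sketched below. The names LibraryType, ALLOWED_BLOCK_TYPES, and can_add_block are hypothetical, and the block type strings "problem" and "video" are assumed to be the identifiers for CAPA problems and videos. The sketch deliberately takes no position on structural blocks in complex libraries, since that question is still open.

```python
from enum import Enum


class LibraryType(Enum):
    PROBLEM = "problem"
    VIDEO = "video"
    COMPLEX = "complex"


# None means "no restriction on block type" in this sketch.
ALLOWED_BLOCK_TYPES = {
    LibraryType.PROBLEM: frozenset({"problem"}),  # CAPA problems only
    LibraryType.VIDEO: frozenset({"video"}),
    LibraryType.COMPLEX: None,
}


def can_add_block(library_type: LibraryType, block_type: str) -> bool:
    """Return True if a block of `block_type` may be added to the library."""
    allowed = ALLOWED_BLOCK_TYPES[library_type]
    return allowed is None or block_type in allowed
```

Under this table, an advanced problem type is rejected by a Problem library but accepted by a Complex one, which matches the answer that Problem libraries hold CAPA problems only.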

When should course teams make updates in the library vs. in the course?

Consider: Course teams are often, but not always, going to have permission to write to the Library.

Notes:

  • The opinion that Partner Support shares with partners today is: The library is often a source of truth for multiple courses. So, edits should only be made back to the library when they are meant to benefit all users of the library. For example, fixing a typo should be done in the library. Changes that are meant for the benefit of the course should remain as course-level customizations.

Urgency: High - I think this guides how we design and build the entire project.

Answer: TBD

Referencing V2 library content from V1 courses

What are all the different use cases for referencing library content in courses?

Urgency: High - It’s important to know whether there are any use cases we haven’t thought of.

Answer:

  1. I want to reference a single specific block from a library into a course.

  2. I want to reference multiple specific blocks from a library into a course.

  3. I want to randomly reference a single block from a library into a course.

  4. I want to randomly reference a number of blocks from a library into a course.

  5. I want to randomly reference a single block of a certain type from a library into a course.

  6. I want to randomly reference a number of blocks of a certain type from a library into a course.

  7. I want to randomly reference a single block from a specific subset of a library into a course.

  8. I want to randomly reference a number of blocks from a specific subset of a library into a course.

Notes:

  • From a technical perspective, Case #8 encapsulates all eight use cases. For example, Case #1 could be thought of as a special case of Case #8, where the “number” is 1 and the “specific subset” is a single specific block. Nonetheless, we are differentiating these cases because they may be best served by different user interfaces, and may each be of different importance levels.

  • V1 libraries implement Cases #4 and #6 (and therefore, indirectly, can handle Cases #3 and #5).
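The claim that Case #8 subsumes the other cases can be sketched as a single selection function, where the other cases are just particular argument choices. The function select_blocks and its parameters are hypothetical, and blocks are represented by plain string IDs that are assumed to be unique within a library.

```python
import random
from typing import Iterable, List, Optional, Sequence


def select_blocks(library_blocks: Sequence[str],
                  subset: Optional[Iterable[str]] = None,
                  count: Optional[int] = None,
                  rng: Optional[random.Random] = None) -> List[str]:
    """Pick `count` blocks at random from `subset` of the library (Case #8).

    With subset=None the whole library is the pool; with count=None every
    block in the pool is returned. Results keep the library's own ordering.
    """
    wanted = set(library_blocks) if subset is None else set(subset)
    pool = [block for block in library_blocks if block in wanted]
    if count is None:
        count = len(pool)
    if count > len(pool):
        raise ValueError("Cannot select more blocks than the pool contains.")
    rng = rng or random.Random()
    chosen = set(rng.sample(pool, count))
    return [block for block in pool if block in chosen]
```

Case #1 is then select_blocks(blocks, subset=["c"], count=1), and Case #2 is a call with a multi-block subset and no count. Cases #5 and #6 would first build the subset by filtering on block type, which this sketch leaves to the caller.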

Which referencing use cases are we building for in BD-14, and in which order?

(“Case #X” references the cases defined directly above)

Notes / more specific questions:

  • I think it would be simplest to start by building Case #1.

  • I recall hearing recently that we aim to eventually implement Case #6, reaching parity with V1 libraries. Is that true?

  • Will the block-selecting experience for Case #1 and Case #6 be separate?

  • Would we want to eventually implement Cases #2 or #8? (essentially, adding support for specifying a subset of a library)

Urgency: High - This is important in determining how we sequence features.

Answer: TBD

What is the workflow for referencing library content in courses?

Urgency: High - Knowing the general flow here is important in order to implement the UX and backend.

Answer: For all 8 use cases, the interface for library content referencing will look something like this:

  1. Course author chooses to add a Library Content Reference.

    1. Author chooses a library.

    2. Author chooses a version of that library (?)

      • It’s an open question of whether we want this step. We dive into this in a later question.

    3. Author chooses a pool of blocks (?)

      • It’s an open question of whether we need this step. It depends on the use cases we want to build for (listed in a previous question).

    4. If multiple chosen: Author specifies randomization or ordering.

      1. Randomize: yes/no.

      2. If yes: how many blocks should be randomly selected from pool?

      3. If no: what order should the blocks be shown in?

  2. Course author may customize included block(s) by editing them.

    • There is a lot to unpack in this step. We dive deeper into it in a later question.
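The choices an author makes in step 1 could be captured in a small settings record, sketched below with a check that mirrors the “if multiple chosen” branching. The class LibraryReferenceSettings, its field names, and the problems method are all hypothetical. The version field is included only because step 1.2 mentions it, and it stays optional because that step is still an open question.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class LibraryReferenceSettings:
    library_key: str
    pool: Tuple[str, ...]                    # blocks the author chose (step 1.3)
    version: Optional[str] = None            # step 1.2, still an open question
    randomize: bool = False                  # step 1.4.1
    count: Optional[int] = None              # step 1.4.2, used when randomizing
    order: Optional[Tuple[str, ...]] = None  # step 1.4.3, used otherwise

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the settings are usable."""
        issues = []
        if not self.pool:
            issues.append("pool must contain at least one block")
        if len(self.pool) > 1:
            if self.randomize:
                if self.count is None or not 1 <= self.count <= len(self.pool):
                    issues.append("count must be between 1 and the pool size")
            elif self.order is None or sorted(self.order) != sorted(self.pool):
                issues.append("order must list every block in the pool exactly once")
        return issues
```

A single-block reference needs neither count nor order, so it validates with the defaults; the extra fields only matter once the pool has more than one block.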

How is referencing implemented?

There are two major approaches to implementing library content referencing that we’ve talked about:

Approach 1: LTI

This idea was spurred by Dave’s thinking on the topic. In his words:

What if there’s no deep relationship between libraries and courses or overrides at all? What if content lives in content libraries, and it’s included into a course via an XBlock that is like a slightly more sophisticated LTI block with a set of parameters (that represents the override)? We could make that a part of the LTI provider functionality for Open edX–the POST parameters to it would include some subset of overrides that the system knows how to apply. We can then apply that (stupidly) at the OLX level before the XBlock runtime even sees it.

Succinctly, a course would be an LTI-consumer, a library would be an LTI-provider, and a library reference would just be an LTI launch. We would use the LTI 1.3 specification, the most recently released and supported version of LTI, along with LTI Advantage, a collection of three services that extend LTI 1.3.

The lti_consumer XBlock already supports LTI 1.3 (with LTI Advantage) consumption. Furthermore, edx-platform V2 libraries are already LTI 1.3 providers, as implemented by OpenCraft for a Harvard(?) initiative. Using these two existing features, I managed to simulate a very basic version of library content referencing on Stage.

Of course, content referencing needs to support more than just simple reference of a singular block. I believe the advanced capabilities of referencing could be implemented using LTI Deep Linking service, one of the services included in LTI Advantage. From the specification:

The IMS Learning Tools Interoperability® (LTI) Deep Linking specification allows a Platform to more easily integrate content gathered from an external Tool. Using the Deep Linking message defined in this specification, Platform users can launch to a URI specified by an external Tool, then select specific content appropriate for their use, and receive a URI that other platform users can use at a later time for launches directly to that specific content.

A course author will need to be able to configure a library reference by selecting particular blocks, choosing randomization parameters, and applying customizations. Following the Deep Linking spec, the interface for this would be exposed by the library as part of the deep linking step. After the author configures the library reference via said interface, the library would return a deep link, which would serve as the block’s Launch URL. When a learner’s browser accesses the deep link, the library would return the set of blocks as configured by the course author.

Approach 2: In-Platform

The library referencing flow would be implemented within the Studio and/or LMS processes of a single Open edX instance, as it is for V1 libraries. Unlike with V1 libraries, though, library content would be stored and rendered outside of modulestore and the modulestore-based XBlock runtimes (which is also true of the LTI implementation).

We would achieve this by introducing a unit_compositor subsystem. Like learning_sequences, the subsystem would be populated by CMS upon course publish. It would store a read-optimized form of:

  • Metadata and child-block lists for all course Units, and

  • Definitions of library blocks.

We would then update the LmsXBlockRuntime (called “CombinedSystem” until BD-13 is done) to use the unit_compositor as its backing store for units. When a unit is requested for a particular user, the unit_compositor would:

  1. Load the unit’s child blocks from modulestore.

  2. Replace each library reference block with its corresponding library block definitions, each overridden with any course-author-specified customizations, and each given a usage key that composes the library reference block's usage information with the library block’s definition key.

  3. Return the list of blocks wrapped under a VerticalBlock, with the same usage key as the original unit, for the LmsXBlockRuntime to render.

Example:

Something authored in Studio and saved to modulestore like this:

    Vertical(
        # our original unit.
        usage_key="block-v1:edX+UseLib+1+type@vertical+block@1",
        children=[
            HtmlBlock(
                # a typical HTML block.
                usage_key="block-v1:edX+UseLib+1+type@html+block@2",
                ...,
            ),
            RandomizedLibraryReferenceBlock(
                usage_key="block-v1:edX+UseLib+1+type@random_lib_ref+block@3",
                # reference from version 5 of library 'someProblems' by 'orgX'.
                library_key="lib:orgX:someProblems:v5",
                # grab 2 random problems.
                count=2,
                # cap attempts on each problem to 3.
                field_overrides={
                    "max_attempts": 3,
                },
                # for problem block 'xyz' in particular, set a custom display name.
                block_field_overrides={
                    # in the key below, 'lb:' indicates 'library block (definition)'
                    "lb:orgX:someProblems:problem:xyz": {
                        "display_name": "Custom problem title for XYZ!"
                    },
                },
            ),
            VideoBlock(
                # a typical Video block.
                usage_key="block-v1:edX+UseLib+1+type@video+block@4",
                ...,
            ),
        ],
    )

Would come out of the unit_compositor like this:

    Vertical(
        # unit, post-processing by unit compositor
        usage_key="block-v1:edX+UseLib+1+type@vertical+block@1",
        children=[
            HtmlBlock(
                # same old HTML block.
                usage_key="block-v1:edX+UseLib+1+type@html+block@2",
                ...,
            ),
            ProblemBlock(
                # problem block, from the library!
                # in the usage key, 'lb-ref:' indicates 'library block reference'.
                # note that the usage key combines information from both the library
                # block definition key and from the library reference block usage key.
                usage_key="lb-ref:someProblems:problem:abc:edX+UseLib+1+type@random_lib_ref+block@3",
                max_attempts=3,
                display_name="Library-defined problem title for ABC",
                ...,
            ),
            ProblemBlock(
                # another problem block, from the library!
                usage_key="lb-ref:someProblems:problem:xyz:edX+UseLib+1+type@random_lib_ref+block@3",
                max_attempts=3,
                display_name="Custom problem title for XYZ!",
                ...,
            ),
            VideoBlock(
                # same old Video block.
                usage_key="block-v1:edX+UseLib+1+type@video+block@4",
                ...,
            ),
        ],
    )
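The transformation between those two forms might look roughly like the following runnable sketch, which covers step 2 of the unit_compositor flow using plain dictionaries. Everything in it is an assumption made for illustration: the compose_unit function, the dictionary shapes, the selections argument (which stands in for whatever per-user randomization is eventually used), and the exact layout of the composed lb-ref: usage key, which here keeps the org segment from the definition key.

```python
from typing import Any, Dict, List

Block = Dict[str, Any]


def compose_unit(children: List[Block],
                 library_blocks: Dict[str, Dict[str, Any]],
                 selections: Dict[str, List[str]]) -> List[Block]:
    """Replace each library reference block with its customized library blocks.

    children: the unit's child blocks, as loaded from modulestore.
    library_blocks: maps library block definition keys to their fields.
    selections: maps a reference block's usage key to the definition keys
        already chosen for the current user.
    """
    composed: List[Block] = []
    for child in children:
        if child["type"] != "random_lib_ref":
            composed.append(child)  # ordinary course blocks pass through untouched
            continue
        # Everything after 'block-v1:' identifies the reference within the course.
        reference_suffix = child["usage_key"].split(":", 1)[1]
        for definition_key in selections[child["usage_key"]]:
            fields = dict(library_blocks[definition_key])
            # Reference-wide overrides first, then per-block overrides on top.
            fields.update(child.get("field_overrides", {}))
            fields.update(child.get("block_field_overrides", {}).get(definition_key, {}))
            block_path = definition_key.split(":", 1)[1]  # drop the 'lb:' prefix
            composed.append({
                "type": definition_key.split(":")[3],
                "usage_key": f"lb-ref:{block_path}:{reference_suffix}",
                "fields": fields,
            })
    return composed
```

Step 3, wrapping the result in a VerticalBlock that reuses the original unit's usage key, is omitted here. Applying per-block overrides last is what lets the custom display name for block 'xyz' win over both the library's value and any reference-wide override.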

Approach Comparison

In favor of LTI:

  • Cross-instance referencing is immediately supported with LTI.

    • @Sergiy Movchan “we at RaccoonGang used this approach [LTI] for ALOSI (adaptive question banks) experiment. LTI worked like a charm. and what’s more important - the content library could be a separate Open edX instance”

  • Library content is usable in any other LTI-consuming LMS.

  • With LTI, we “drink our own champagne”. That is, by building a feature on top of our own LTI 1.3 consumer/provider implementations, we are directly experiencing the joys and pains of our LTI offering, incentivizing and empowering us to maintain and improve those implementations for our benefit and for others'. This is the same reason we build some of our training materials as edX courses. This is also referred to in the industry as “eating your own dogfood”.

  • The LTI approach is already partially implemented. edx-platform is a certified LTI 1.3 and LTI Advantage content consumer. edx-platform is an LTI 1.3 Provider. Although both of those implementations (especially the provider) may need more development, we would start in a place where content referencing is already implemented at a basic level.

In favor of In-Platform:

  • In-Platform pushes forward work on unit composition. This is something we’ve identified as a critical piece of the de-modulestore-ification of LMS and de-coupling of LMS and CMS. While both the LTI and In-Platform approach would require some sort of LMS-side representation of blockstore data, I think the In-Platform approach would require us to really begin to tackle the unit_compositor idea, which could be a positive in the long-term.

  • In-Platform allows referenced content to be presented as separate blocks. With the LTI approach, each library reference is surfaced to the LMS as one block of the lti_consumer type, even if multiple library blocks are contained within the reference. With the In-Platform approach, we can surface the library blocks to LMS as direct children of the Unit, thus taking advantage of LMS features that use blocks as the atoms of courseware (problem grade report, gradebook, etc).

  • In-Platform leaves the door open for structural libraries. The Learning team has indicated to Product that reusing “structural” blocks (units, sequences, chapters) may be something that we eventually want to do. The In-Platform implementation would make V2 libraries much more suitable to this use case.

  • I have some uncertainty around LTI specifics. I am not an expert on LTI. Although I think it’s a good fit for this project, I can only predict that LTI 1.3 & LTI Advantage will mesh well with our goals for the product. I am not familiar enough to say with 100% certainty that there exists an elegant mapping between LTI and edX’s vision for V2 libraries.

    • What happens if/when there is a mismatch between LTI and BD-14’s goals?

      • I fully expect this to happen a couple times, causing minor slowdowns while we hash out the way forward. If this happens a lot, though, then it could impact the timeline of the project.

    • How does this compare to the In-Platform approach?

      • Of course, there’s still uncertainty of the exact implementation we’d use if we eschewed LTI. The big difference, though, is that we have a large degree of control over edx-platform. If there’s a mismatch between our goals and edx-platform, we can change edx-platform. If there’s a mismatch between our goals and LTI, we must either create a bridge between the two, or change our goals.

        • Hypothetical example, just to make things more concrete: Imagine we find that it’s only possible to return one grade from an LTI launch (fwiw, I don’t think this is true). We’d either have to (a) provide a method on the side for Open edX library content to send multiple grades back to the course, or (b) we’d have to accept that library references can only return one grade each, even if they contain multiple blocks.

  • There are LTI performance concerns. LTI launches take place in an IFrame and require additional API calls. This IFrame would be nested within the Unit IFrame that already exists in the Learning MFE. So, we would be accepting a frontend performance hit by going with this approach.

    • How much? It is really hard to scientifically say, but currently, I would estimate that:

      • if it takes X seconds for typical content in the Unit IFrame to render, then

      • it takes ~2*X seconds for content within a nested (non-LTI) IFrame within the Unit IFrame to render, and

      • it takes ~3*X seconds for content included via an IFramed LTI 1.3 launch of a library block to render.

    • This could be mitigated during BD-14 development by:

      • Serving LTI 1.3-provided content from LMS. The current implementation serves from Studio, which has undesirable performance as well as architectural implications.

      • Optimizing the chromeless xblock view (which serves xblocks from the LMS), thus reducing the weight of the additional IFrame.

      • Optimizing the LTI 1.3 provider endpoints, thus decreasing the time between (i) the moment the Unit initiates the LTI launch and (ii) the moment the LTI content is rendered to the inner IFrame.

    • And it could be mitigated in the long-term by:

      • Removing the requirement of having an IFrame around all Unit content, allowing the LTI IFrame to exist at the top-level.

Decision

Urgency: High - This affects many parts of the technical implementation.

Answer: We will take the In-Platform approach, keeping in mind as we build that LTI library referencing capabilities may be something we want in the future.

We arrived at a rough consensus on this in the T&L’s 2021-10-20 Geek Time meeting. The gist was that:

  • The benefits of the LTI approach centered around a long-term architectural strategy of making it easier for Open edX instances to share content with one another and with other LMSs. This strategy is not what edX is looking for in BD-14 right now.

  • This makes the performance challenges and implementation uncertainty around the LTI approach hard to justify.

  • The In-Platform approach, on the other hand, leaves open the door for an interesting future use of libraries (structure re-use) while still pushing forward an architectural initiative (LMS unit composition).

  • Thus, we will go with the In-Platform approach for content referencing, with three caveats:

    • Even though we’re not going the LTI route, we should still build the In-Platform approach with strong separation between the library content consumer and provider sides of the system.

    • Post 2U/edX merger, we should take another look at how important interoperability is for edX’s business, and consider whether a stronger LTI story for libraries is worth doing.

    • The semantics of library import/export and course-with-library-content import/export should be nailed down ASAP, since they are trickier with the In-Platform approach.

When a new version of a library is published, what happens to courses that reference that library’s content?

Context: There are two apparent options (with a third option of “let the user choose either”):

  • Live updates. Library references always pull from the latest published version of a library.

    • This option runs counter to a Studio philosophy we’ve had: The course should not change without author action. eg:

      • if a block is deleted from a library, it would be deleted from courses that link to it, which would be bad.

      • if a block is modified in a way that clashes with the author’s customizations, that could be bad.

    • However, there would be some benefits:

      • when the updates are things the author wants in their course (either b/c they’re trivial, or the course author themself made the library edits), then this removes a potentially annoying+confusing “version-bump” step from the workflow.

  • Voluntary updates: The library reference is pinned to a specific library version. Course authors would be shown the option of updating to a later library version if one exists.

    • (This is roughly how V1 libraries work – except the only option is to “update to latest”)

    • What does the interface look like for updating?

      • Do we show them what has changed?

      • If blocks they are using would be deleted upon update, do we message this?

    • Can the course author only ever update to the latest version of the library, or can they update to any version? Can they downgrade to older versions? This is related to another product question, “Which versions of libraries can a course author create references to?”.

    • An implication of this system is that we need to retain every version of every library that is referenced by a published course.
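The "voluntary updates" option can be sketched as follows. This is only an illustration under stated assumptions: `LibraryReference`, `available_update`, and `versions_to_retain` are hypothetical names, and library versions are assumed to be increasing integers. Nothing here reflects an existing edx-platform API.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class LibraryReference:
    """Hypothetical course-side record of a reference to library content."""
    library_key: str     # e.g. "lib:MITx:Problems1"
    pinned_version: int  # assumption: versions are increasing integers


def available_update(ref: LibraryReference, published: List[int]) -> Optional[int]:
    """Return the latest published version newer than the pin, or None."""
    newer = [v for v in published if v > ref.pinned_version]
    return max(newer) if newer else None


def versions_to_retain(refs: List[LibraryReference]) -> Set[Tuple[str, int]]:
    """Every (library, version) pair pinned by some course must be kept."""
    return {(r.library_key, r.pinned_version) for r in refs}
```

The course only changes when the author accepts the version returned by `available_update`, which keeps the "no change without author action" philosophy intact. `versions_to_retain` makes the storage implication explicit: a library version can only be discarded once no reference pins it.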

Urgency: High - This affects many parts of the technical implementation.

Answer: TBD

Which versions of libraries can a course author create references to?

The simplest answer would be “latest only” – that is, when a course author creates a new library reference within a course, they can only select content from the latest version of their available libraries. For what it’s worth, this is how V1 libraries work.

  • Comment from @Chimuanya Okoro: I believe partners/course teams would take issue with this limitation. I’d like to confirm with key stakeholders; however, I could imagine that course teams would want to access an older version of a library while they experiment with a new video or problem.

More complex answers to this question could be “versions published in the past X months are available” or “any version of the library ever published is fair game.”

Urgency: Low - We could assume “latest only” and then get more flexible later if the product requires it. On the backend, we will still be building something that allows for references to any library version, since that’s required in order to support courses with references to old library versions.

Answer: TBD

How can a course author customize library-referenced content?

Urgency: High - Different answers to this question will lead us down different implementation paths for library referencing.

Answer:

  • After referencing component(s) from a library, course authors can apply course-local customizations to the components by opening the components' respective editing interfaces and making changes, as they would for a course-native component. These changes are not reflected back in the original library.

  • When updating to a newer version of a library, any customizations that the course author made will be applied on top of the library content updates.

  • Course authors will be able to clear (specific?) (all?) customizations to components, causing them to fall back to their library-defined values.
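A minimal sketch of these semantics, assuming that library-defined field values and course-local customizations can each be modeled as a flat dictionary. The function names are hypothetical and are not part of any existing API.

```python
from typing import Any, Dict, Iterable, Optional


def effective_fields(library_fields: Dict[str, Any],
                     overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Apply course-local overrides on top of the library-defined values."""
    merged = dict(library_fields)  # copy, so the library values stay untouched
    merged.update(overrides)       # course-local customizations win
    return merged


def clear_overrides(overrides: Dict[str, Any],
                    field_names: Optional[Iterable[str]] = None) -> Dict[str, Any]:
    """Drop the named overrides, or all of them if no names are given."""
    if field_names is None:
        return {}
    drop = set(field_names)
    return {k: v for k, v in overrides.items() if k not in drop}
```

Because the overrides are stored separately from the library content, updating to a newer library version only swaps `library_fields`. The same overrides are then re-applied, and clearing an override makes the field fall back to whatever the pinned library version defines.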

Implementation notes:

  • We want the data flow to work like this:

    • User hits “edit” on a library-referenced component.

    • Studio sends the course-local customization details to the library provider, launching the component editor via LTI.

      • The library provider renders the editor using the content it knows (from Blockstore) plus the customizations (sent via Studio, acting as the LTI consumer).

    • The user makes their edits.

    • Upon hitting “Save”, the library provider sends the customizations (along with any changes) back to Studio, which saves them to the enclosing library reference block.

  • Could this all be done via the LTI deep linking flow?

  • TODO

Shall courses in one Open edX instance be able to reference libraries in another instance?

Notes:

  • Kyle and Dave’s recommendation is that the answer to this question be emphatically yes – courses on, say, edX.org should be technically capable of referencing content on Edge or even some external Open edX instance. This capability is one of the major selling points of using LTI as the referencing mechanism instead of baking referencing into the platform. If our answer here was “no” in all cases, we may want to re-evaluate using LTI.

  • Of course, we would likely still want to build in some access control mechanisms in order to ensure that:

    • library authors can limit which courses/authors can access their content, and/or

    • Open edX instance operators can limit which external instances content can be used from.

  • The details of the access control are explored below.

Urgency: High - If the answer here is a “hard no” then we will want to reconsider the decision to use LTI.

Answer: TBD

Can library content be referenced across multiple units, as to split up an assignment into pages but avoid duplicates?

One use case is “select a dozen or so components out of this library for this assessment”, where we don’t want to repeat any components. Today, doing that requires that you reference all the components into a single unit, which is not a great UX for the learner (slow to load, lots of scrolling). If you were to spread them across multiple units, you’d risk having repeat components appear, since component uniqueness is only guaranteed within a reference block, and each reference block belongs to a single unit.

With v2 libraries, we could implement something to spread referenced content across several units, guaranteeing component uniqueness across all of them. We would need to establish a new term for the concept of “a series of library references across units, which together should all yield a unique set of components.”

Under the hood, we’d implement this by creating a key for each instance of this new concept, and passing the key to each usage of the library reference block. The LTI provider would be sure not to return repeat components for successive launches with the same key.
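The shared-key idea can be sketched as follows. This is a minimal in-memory illustration: `UniqueSelector` is a hypothetical name, the per-learner state is an assumption (the text above does not say where selection state lives), and a real provider would persist this state rather than hold it in a dictionary.

```python
import random
from typing import Dict, List, Optional, Set, Tuple


class UniqueSelector:
    """Hands out components so that none repeats for the same (key, learner)."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self._used: Dict[Tuple[str, str], Set[str]] = {}
        self._rng = random.Random(seed)

    def select(self, group_key: str, learner_id: str,
               pool: List[str], count: int) -> List[str]:
        """Pick `count` components from `pool` not yet used under this key."""
        used = self._used.setdefault((group_key, learner_id), set())
        candidates = [c for c in pool if c not in used]
        if count > len(candidates):
            raise ValueError("not enough unused components left in the pool")
        chosen = self._rng.sample(candidates, count)
        used.update(chosen)
        return chosen
```

Each library reference block in the series would call `select` with the same `group_key`, so a twelve-problem assessment could be split across three units of four problems each without repeats. References that use different keys remain independent of one another.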

Specific Questions:

  1. Would we want to implement this? (We wouldn’t have to do it right away - could punt the decision until later).

  2. If we implemented this, what would we call this new concept?

    1. A “reference group” or “reference context”? (too vague?)

    2. An “invocation”? (too technical?)

    3. A “randomized assignment”? (suggestive of more than it actually is?)

    4. An “activity”? (too overloaded with other use of this term?)

Urgency: Low - We could implement this later.

Answer: TBD

Configuring V2 libraries

Which course authors and/or courses can reference content from which libraries?

TODO

How do we manage and enforce library permissions?

TODO

How does access control work across Open edX instances?

TODO

Export and Import

The export/import section is still under construction.

How are V2 libraries exported and imported?

What does the exported file format look like?

Answer: Didn’t get a chance to dive into this.

What happens to V2-library-referenced content when a course is exported and imported?

Context:

In theory, course authors should be able to export their course to OLX (contained within a .tar.gz file) from one Studio instance, import it into another Studio instance, and it should Just Work. In practice, courses sometimes rely on entities with differing lifecycles. For example, video files are not exported along with courses; course authors must handle video hosting separately, ensuring that their videos are reachable from both their source and destination course.

This complexity applies to libraries too. Since content libraries exist as separate entities, it is not a given that their contents would be included in a course export. We could imagine two scenarios at different extremes:

  1. Referenced library blocks are exported with the course. They are exported as ordinary blocks would be, so their relationship with the original library is severed. When the course is imported into Studio, either the same instance or a different one, it does not matter whether the original library is present, since the blocks are ordinary course blocks. In this scenario, it is unclear how randomized library references would be treated.

  2. Referenced library blocks are not exported with a course. The relationships between the course and the library are reflected in the exported OLX. When the course is imported back into Studio, the library blocks will not render unless the proper version of the same library exists in the instance.

For what it’s worth: V1 libraries take an approach that is between the extremes. Library content is copied into courses when it’s referenced, so when a course is exported, the library content comes with it. The relationship with the library is also maintained, so that if the library is present upon course import, then the library blocks will be linked to it. However, two big drawbacks exist in the current system:

  • Every block from every referenced library is copied in, whether or not it’s necessary.

  • Library-set default values for settings of referenced library blocks are not exported.

    • Example of how this can cause issues:

      1. reference a problem from a v1 library in a course.

      2. don’t customize the problem’s title within the course; ie, keep the library-set default title.

      3. export the course.

      4. import the course into another instance without the original library.

      5. the imported problem doesn’t have a title.

    • (I think this is how it works; haven’t verified the above scenario yet. But I know import/export definitely acts weird for v1 libraries. -Kyle)

Here’s a matrix of the approaches we could take:

Decreasing library centralization

 

Decreasing size of course exports

(a) Library only exists on original instance, so course-library relationship is severed when exporting to other instances.

(b) Library versions are centrally managed by original instance, but replicated on other instances.

(c) Library versions managed in a distributed fashion across instances, identifying versions by git-like hashes.

(1) Entire library is bundled with the course export.

 

(1a) nonsensical

(1b) The original library is brought to the new instance along with the course, although the library does not become part of the instance’s list of libraries available for authoring. Instead, the library maintains a connection with the original instance, and can receive updates from the library on that instance. In this sense, the library copy is a replica of the “centrally” managed library.

Pro: Libraries can be used across instances, while still maintaining simplicity of central management.

Con: Library bundling increases export size.

Challenge: Would need to figure out how to stop remote library keys (from original instance) from clashing with local library keys (on new instance).

Challenge: We’d need to figure out the replication strategy. Push-by-original-instance? Or pull-by-new-instance? And how would we keep track of the URL of the original instance?

(1c)

Specific version(s) of each library can be chosen for export. Multiple versions of the same library can be imported into an Open edX instance without disrupting one another, and importing a library at a version does not necessarily mark it as the “newest”. Library versions are uniquely identified and referenced by a hash of their contents, avoiding conflicts that may arise between human-friendly version names/numbers across instances.

If the required version of the required library exists on an Open edX instance, then the course automatically uses it. Otherwise, library blocks in the course gracefully degrade to rendering hidden blocks.

Example story: (in the “Import/Export Story” tab, start at the very top and read downwards.)

 

(2) Necessary pieces of library are bundled with course export.

 

(2a) Essentially, library blocks turn into ordinary blocks when a course is exported.

Pro: Courses will work out-of-the-box when exported and imported. We avoid wrestling with cross-instance library versioning altogether.

Con: Use of a library is confined to a single instance.

Challenge: Would need to develop a non-library version of randomized content banks.

(2b) Same pros and cons of 1b, except:

  • remove Con: Library bundling increases export size.

  • new Challenge: Including just the necessary library blocks in the course export.

(2c) Same pros and cons of 1c, except:

  • remove Con: Library bundling increases export size.

  • new Challenge: Including just the necessary library blocks in the course export.

(3) Library is excluded from course export.

 

(3a) nonsensical

(3b) Same pros and cons of 1b, except:

  • remove Con: Library bundling increases export size.

  • New Con: Course doesn’t work out-of-the-box until first replication-from-original-instance occurs.

(3c) Same pros and cons of 1c, except:

  • remove Con: Library bundling increases export size.

  • New Con: Course doesn’t work out-of-the-box. User must import correct library version into instance.

Decision

Urgency: High

Answer: Approach 1b.

Some additional decisions & implications:

  • To support this approach, library keys ought to be namespaced by “domain” (a string identifying the Open edX instance, if not the literal domain of the instance). For example, instead of library keys like lib:MITx:Problems1, you’d have lib:edx.org:MITx:Problems1 and lib:labxchange.org:LabXChange:lib-abcdef123.

  • Course authors in an Open edX instance ought to be able to “subscribe” to libraries on other instances, an action that would require an “access key” for non-public libraries. For example, if a hypothetical user of studio.labxchange.org wanted to use the library lib:edx.org:HarvardX:ScienceLib, they would need to request an access token from a ScienceLib staff user, and enter it into LabXChange’s Studio instance in order to make the external library available to LX authors.
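A minimal sketch of what domain-namespaced library keys could look like in code. `NamespacedLibraryKey` and `parse_library_key` are hypothetical names, not the existing opaque-keys classes. Treating a three-part key as belonging to the local instance is my assumption about how backwards compatibility might work.

```python
from typing import NamedTuple


class NamespacedLibraryKey(NamedTuple):
    """Hypothetical library key that carries its origin instance."""
    domain: str  # string identifying the Open edX instance, e.g. "edx.org"
    org: str
    slug: str

    def __str__(self) -> str:
        return f"lib:{self.domain}:{self.org}:{self.slug}"


def parse_library_key(key: str, local_domain: str) -> NamespacedLibraryKey:
    """Parse 'lib:DOMAIN:ORG:SLUG', or legacy 'lib:ORG:SLUG' as a local key."""
    parts = key.split(":")
    if parts[0] != "lib" or len(parts) not in (3, 4) or not all(parts):
        raise ValueError(f"not a library key: {key!r}")
    if len(parts) == 3:
        # Assumption: un-namespaced keys refer to the local instance.
        _, org, slug = parts
        return NamespacedLibraryKey(local_domain, org, slug)
    _, domain, org, slug = parts
    return NamespacedLibraryKey(domain, org, slug)
```

With this shape, `lib:edx.org:MITx:Problems1` and a local `lib:MITx:Problems1` on edx.org resolve to the same key, while a library with the same org and slug on another instance does not clash. One limitation of the sketch: a "domain" that itself contains a colon (for example a host with a port) would need escaping or a different separator.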

What does an OLX export with V2 library content look like?

Urgency: High

Answer, Iteration 1:

./                              # Root directory of an exported .tar.gz for a course.
    course/                     # As before, the ./course/ folder contains:
        assets/                 #   (1) an assets folder,
        chapter/                #   (2) one folder for
        .../                    #       each
        vertical/               #       block
        video/                  #       type, and
        course.xml              #   (3) the root course.xml file.
    libraries/                  # New folder! Contains all referenced libraries, one XML file per library.
        Greendale_VideoLib.xml  # Given a library with the key `lib:ORG:LIBSLUG`, filename is `ORG_LIBSLUG.xml`.
        Greendale_ChemLib.xml   # <- For example, this defines 'lib:Greendale:ChemLib'.
        CityClg_mathlib.xml     # XML files contain just metadata, including bundle UUIDs, but no block content.
    bundles/                    # New folder! Contains definitions for bundles that were referenced in ./libraries/.
        a578e5...5b181d/        # In the blockstore-course future, ./course/ would also reference ./bundles/.
        259fa0...71f87f/        # The idea is that all content lives in ./bundles/ and other folders are metadata.
        d8b24f...2640c8/        # Bundle folder names are the full 32-char, lowercase, un-hyphenated bundle UUID.
            bundle.xml          # bundle.xml contains title, slug, version #, and snapshot hash.
            files/              # Finally, files/ contains the OLX serialization of the bundle contents.
                problem/        #
                .../            #
                video/          #

Dave pointed out that the structure of the bundles/ directory would not be very human-friendly, even if it reflects the blockstore data model.
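The two naming rules in the Iteration 1 layout (library XML filenames and bundle folder names) can be sketched as below. The helper names are hypothetical; only the `ORG_LIBSLUG.xml` pattern and the 32-character, lowercase, un-hyphenated UUID rule come from the layout itself.

```python
import uuid
from pathlib import PurePosixPath


def library_xml_path(library_key: str) -> PurePosixPath:
    """Map 'lib:ORG:LIBSLUG' to 'libraries/ORG_LIBSLUG.xml'."""
    prefix, org, slug = library_key.split(":")  # raises ValueError if malformed
    if prefix != "lib":
        raise ValueError(f"not a library key: {library_key!r}")
    return PurePosixPath("libraries") / f"{org}_{slug}.xml"


def bundle_dir_path(bundle_uuid: uuid.UUID) -> PurePosixPath:
    """Map a bundle UUID to 'bundles/<32 lowercase hex chars>'."""
    # UUID.hex is the 32-character lowercase form without hyphens.
    return PurePosixPath("bundles") / bundle_uuid.hex
```

For example, `library_xml_path("lib:Greendale:ChemLib")` yields `libraries/Greendale_ChemLib.xml`, matching the example in the layout. The sketch does not address orgs or slugs that contain underscores, which could make two different keys map to the same filename.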

Answer, Iteration 2:

Referencing V2 library content from V2 courses

V2 (blockstore-backed) courses don’t exist yet, so we don’t need to worry about implementing this as part of BD-14. Rather, these questions are here so that we keep in mind the eventual migration to V2 courses, avoiding decisions during BD-14 that would complicate that process.

How might V2 library content referencing differ in V2 courses (as opposed to V1 courses)?

We didn’t end up getting a chance to dive into this, but it’s worth keeping in mind.

Archive

Misc notes and stuff

Approach 1: Distributed version control

This is the first approach that came to my mind, after thinking about how to preserve course-library relationships across instances while avoiding library version conflicts from arising.

  • Library blocks don’t export with a course’s OLX. However, library content customizations do export with the OLX (since they are saved on the LibraryReferenceBlock).

  • Libraries can be exported. Notably, specific version(s) of each library can be chosen for export. Multiple versions of the same library can be imported into an Open edX instance without disrupting one another, and importing a library at a version does not necessarily mark it as the “newest”. Library versions are uniquely identified and referenced by a hash of their contents, avoiding conflicts that may arise between human-friendly version names/numbers across instances.

  • If the required version of the required library exists on an Open edX instance, then the course automatically uses it. Otherwise, library blocks in the course gracefully degrade to rendering hidden blocks.
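To make "identified by a hash of their contents" concrete, here is a minimal sketch. It assumes a library version can be represented as a mapping from file path to file bytes, and it picks SHA-256 arbitrarily. Neither choice is specified above, and `library_version_hash` is a hypothetical name.

```python
import hashlib
from typing import Dict


def library_version_hash(files: Dict[str, bytes]) -> str:
    """Return a hex digest that depends only on the paths and their contents."""
    digest = hashlib.sha256()
    for path in sorted(files):  # sort so the result is independent of ordering
        data = files[path]
        path_bytes = path.encode("utf-8")
        # Length-prefix each part so adjacent fields cannot blur together.
        digest.update(len(path_bytes).to_bytes(8, "big"))
        digest.update(path_bytes)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Two instances that hold byte-identical copies of a library version compute the same identifier without coordinating, and any edit produces a different one. That is the property that would let multiple versions be imported side by side without naming conflicts.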

Example story: (in the “Import/Export Story” tab, start at the very top and read downwards.)

Approach 2: Centralized version control

@Dave Ormsbee , in reaction to Approach 1:

It might be useful to export [library blocks], even if it doesn’t import back on the other side. There are a number of tools that do content analysis of exports, and if you’re using content from multiple libraries, it’d be convenient to have that in the same export file.

….

I’m leery of making that step [(exporting at any version)] into full blown distributed version control. A question: Can we get away with only having one primary source of a library, and allow instances to essentially make read-replicas?

So for instance, a particular library could exist on edx.org and have some set of users who can modify it, and a simple linear history. But we can configure something so that Edge keeps an updated read-replica, so that it should theoretically always have that library’s data available.

This could be low level functionality that gets built at the blockstore layer as well. Some kind of namespaced replication so that Edge’s blockstore still has edx.org / {library_uuid} . Right now, the OLX export for a library assumes that the library belongs on whatever instance the course is in, but we could explicitly reference a fully qualified domain.

Essentially, instead of relying on library content hashes to uniquely identify libraries across instances, we’d somehow annotate and/or namespace libraries with their origin Open edX instance. Furthermore, libraries would be bundled with course exports.

TODO: write more

  • jennifer’s opinion

    • content should be edited in lib

    • settings should be edited in course

    • distinction is sometimes fuzzy–e.g. content is sometimes corrected within the context of a course today (minor changes to text)

  • fields that are ambiguous content v setting?

    • display_name?

      • renaming after import isn’t usually done in v1 libraries

      • confusion around problem titles coming from libraries sometimes

        • Names in libraries often have different meaning: e.g. 1a, 1b, 1c. vs. what it would be in the course (“Problem 1: {helpful description}”).

    • video start time

      • not used much currently?

      • would need to verify with CourseGraph

  • in units with library content, is there other stuff?

    • often a preamble HTML followed by random problems

    • mixture of HTML+video+problems in library very uncommon, since specific-block selection has been broken for a while

    • why: a single LTI launch for the entire unit would be cool

      • save ourselves an iframe layer

  • stress test

    • 30 blocks pulled in from a library of 50 in a single Unit

      • this is because it’s the only way right now to make sure that content doesn’t repeat.

        • interesting data model implications. Have a saved state for this inclusion of this library, but allow it to be launched across multiple pages?

  • LMS interactions

    • Can one LTI-launched thing map to multiple scores? This would be nice for grading purposes, but may be hard to retrofit.

      • Reactive use case: Need to correct and re-score one problem from the possible selection of the library.

      • Grades information.

      • Deep linking here?

    • Does it make sense for there to be two inclusion modes: one with single and one for a whole unit?

    • Settings tend to be re-used across a whole subsection, or even type “e.g. when using it as a HW, always give them five attempts”

      • Rare case: We find that one problem is too hard and adjust it to allow for more attempts.

Customization

  • TERMINOLOGY: Often configuration is reused. If you are selecting five problems from a library to put in an exam, you likely want to set the same policy to all of them (max attempts, don’t show the answers after it’s due, etc.) Should this be a top level Thing in the system, and what should it be called?

    • ???? = “the configuration/overrides applied to some content from the library by the course”

      • “Settings” = what we call it today. Does this still make sense?

      • Jennifer: Things that vary from context to context: Difficulty (attempts, randomization, etc.) → more or less rigorous. We do tend to lump a bunch of 'stuff' under the same umbrella of settings, and it's fair to call out that some are best set at the course/subsection/section level (release date, graded, visible-to-track) and others make more sense to keep closer to the library block (display name, attempts, show correctness) but that can vary depending on where they appear in the course.

      • Connor: “Problem Complexity Settings”?

      • Chimuanya: “Problem Configuration”?

        • C: Can we use a more general name that’s useful in both areas

      • Monica: “Library Block Settings? Library Block Controls?”

      • Marco: “Course Defaults - Videos, Problems, HTML, whatever else.”

      • Jeremy: What are we configuring vs overriding?

      • What about content vs. settings? Is everything overridable?

        • Can content overrides persist? Valuable but hard?

      • Monica: One thing that came up when doing UX was the concept of a preview, so they could more easily see diffs on update.

      • Chimuanya: Valuable to elevate the policy rules to multiple problems.

      • Marco: Separate course defaults (e.g. videos are downloadable) from content libraries usage (extra step that these library blocks differ in their defaults, do you want to apply it?).

        • Chimuanya: So course level setting, not an override at this point. If you import 5 problems into your course, you’re given a warning that they don’t match your current defaults. Do we need that kind of pop-up? Open question.

      • Jennifer: Settings discussion feels like the tension between getting a problem to behave as desired and efficiency of setting up the course. Suggestion: One of the most significant settings “graded” is at the subsection level. If we wanted to take a stab at minimizing clicking, having “Assignment Settings” might help.

      • Monica: Partner interview: would love to set settings based on the kind of assignment it’s a part of.

      • Marco: Feels that it would be better to have course-level rules. Worried about having anything anywhere in Studio.

      • Jeremy: Differentiation between problems inside the libraries and the courses that run those problems. Are there blocks that live in libraries, and do they have settings associated, and are there use cases for overriding and those overrides come at different levels (course, subsection, block). Aggregated use cases.

      • Are there settings that only exist at the course level and not at the library level?

      • Chimuanya: Rigor related settings do make sense at the library level. But when bringing that content in context, there are course level settings.

      • Chimuanya: Sequence level / Course level inherited overrides always wins when bringing in content library.

      • Marco: As long as authors can see where the rules apply.

      • Course would hold overrides

      • Monica: Few thoughts for now or another time:

        • What about new course shells with no specified settings, just defaults in place? What happens when someone pulls in a library into a randomized content block in a course that's just being created? Are we assuming design is happening explicitly in the course always and only?

        • Use case re: problem settings that came up occasionally when at Stanford was instructors who wanted to have XYZ settings for their problems during the course, and then, when the course ended and was available in archive mode, they wanted to apply different settings (ex: show answer settings stricter in the live course, more relaxed so things could be there as a resource after the course ended).

      • Jennifer: Seeking clarification, possibly tabled for next time: does this decision framework need to be tested against "what if libraries scale up to units/subsections/larger pieces of content"? Is there more value to keeping library settings if there are more complex pieces of content or reuse?

      • Monica: @Monica Diaz : [Can you please put in your Stanford example of researchers needing content library defaults here] Researchers at Stanford using Stanford’s Open edX instance created content libraries and then used them to flesh out multiple courses delegated out to different universities they were working with. Since these were for research projects, being able to control the settings - or at least define things within the content library - was important to the researchers.

      • Marco: Content no matter where it is authored is a complete representation of its editor + settings content, whether library or course.

      • Jennifer: It feels like one of the efficiencies libraries can bring is ‘bulk editing’ the settings of all library content, and I don’t want to conflate the value of bulk editing with the value of having decisions about settings pre-made at the library versus course level.

  • Would the above policy Thing be mapped at the sequence level? Reused across sequences? (Everything has to be overridden individually today.)

  • What parameters are always provided by courses and never by libraries?

  • What parameters are always provided by libraries and never overridden by courses?

  • What parameters may have defaults in libraries but potentially overrides in courses?

    • Note: Some fields are used differently in libraries and courses–e.g. the display name in a library might be “1a”, “1b”, “1c”, etc., but need to be more descriptive in a course.

  • Would we ever include more than one type of thing from the library in the same Unit?

    • Tech side: Is library customization mostly stateless, or is it configured in the library?

  • We are strongly leaning toward approach #2. Why?

    • Portability: You have references to something that can be centrally managed.

    • Simplicity: There’s a lot less deep magic going on. Things belong in a library. They’re referenced in an LTI launch situation, where parameters are specified by the course and applied by the LTI provider at runtime.

    • Platform extensibility: It’s pretty crazy powerful for content re-use if you can pick anything in a library, set up your parameters, and launch it LTI-style from anywhere. edX.org could become a central repository for content used by Edge and MITx and the like.

  • However, we realize that using LTI means that library-referenced content would be served from within an IFrame. IFrames are known to negatively impact frontend performance. Furthermore, we already encapsulate each learning unit’s content within an IFrame, so this would be a nested IFrame.

    • We need to do some brief discovery to figure out:

      1. How noticeable is the performance difference between a piece of content placed directly in a unit vs. that same content hosted via an LTI launch IFrame?

      2. How common it is to use multiple V1 library reference blocks per page. This will give us an idea of whether we will be generally increasing the number of IFrames per page by 1, or by multiple.

        • We will use coursegraph (data was last updated Oct 2020) to determine this.