Feature Overview

A content library is a collection of course content (XBlocks) that can be used in one or more courses.

...

The original description/goals set out before any development began are at https://openedx.atlassian.net/wiki/display/SOL/Content+Libraries and the original implementation epic was SOL-119.

Architecture

Content libraries are not implemented in any one part of the code, but there are some key pieces that together comprise the feature:

  1. Split Modulestore, which was modified to support content libraries. Instead of just storing "course" structures, Split Modulestore now has the concept of a "course-like" structure, which is either a course or a library.
    • Courses and libraries are implemented similarly, each with a directed acyclic graph (DAG) of XBlocks and a history of all changes made. The graph of blocks for both courses and libraries is stored in the "structures" MongoDB collection.
    • Split's "active_versions" MongoDB collection stores a list of all course-like objects (courses and libraries). Each one has an ("org", "course", "run") triplet, which is the unique ID of that course-like object. In that triplet, "org" is the organization that published the course-like object, and "course" stores the object's short ID, which for a library is the library's ID code; e.g. if the library's ID is "library-v1:UniversityX+LIB100", then "org" is "UniversityX" and "course" is "LIB100". Since libraries do not have the concept of "runs", the "run" value of a library is always set to "library". The other difference between courses and libraries in this collection is that courses have a "version" object that contains "draft-branch" and/or "published-branch" entries (which point to the current version of the course DAG in the "structures" collection), whereas libraries only have a "library" entry (which points to the current version of the library DAG in the same "structures" collection).

      [Screenshot: an "active_versions" collection, showing MongoDB documents for both a library and a course]
       
    • Any XBlock's Scope.content field values will be stored in split's "definitions" collection. A single definition document may be shared among any uses of that same XBlock in libraries and courses.
    • Documents in the split modulestore are considered immutable; any changes to content in a course or library result in a new version of the definition and/or structure object being created, and the active_versions record is then updated to point to the new versions. (This is just a note - this aspect of the modulestore was not changed when content libraries were added.)
       
  2. The Library Root XBlock, a simple structural XBlock analogous to the "course" XBlock. This XBlock is the root of each content library. It is found in the platform code at common/lib/xmodule/xmodule/library_root_xblock.py. This block is just a container and does not have any user-visible functionality.
     
  3. "LibraryLocator" and "LibraryUsageLocator" opaque keys, which are identifiers used to uniquely identify libraries and the XBlocks that are contained within a content library - https://github.com/edx/opaque-keys/pull/46
     
  4. Studio support for editing content libraries - https://github.com/edx/edx-platform/pull/6046
     
  5. Studio Import/Export code, which allows exporting and importing content libraries as XML - https://github.com/edx/edx-platform/pull/6846
     
  6. The Randomized Content Block, an XModule that is used to show content from a library to students. In the code, the block is called the Library Content Module because the original intention was for it to support two "modes": randomized content and manually selected content. Since the latter mode was never implemented, the module's display name was changed to "Randomized Content Block."
     
  7. LibraryToolsService, an XBlock runtime service that provides some functionality that the Randomized Content Block (LibraryContentModule) needs in order to function, such as:
    • list_available_libraries(): used in the Randomized Content Block settings UI to allow the user to select which content library to draw content from.
    • get_library_version(): given a library ID, returns the current version number of that library.  Used to determine when new/updated content has been added to a library.
    • update_children(): described below in "How Components Are Copied Into the Course"
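
As a rough illustration of the "course-like" records described in item 1, the sketch below derives the ("org", "course", "run") triplet from a library ID string and lists the branch entries each kind of record carries. It is plain Python, not the real opaque-keys or modulestore API: the helper name library_triplet, the dictionary names, and the placeholder structure IDs are assumptions made for illustration only.

```python
def library_triplet(library_id):
    """Split a library ID such as 'library-v1:UniversityX+LIB100' into the
    (org, course, run) triplet described above.

    Hypothetical helper for illustration; not part of opaque-keys.
    """
    prefix, _, rest = library_id.partition(":")
    if prefix != "library-v1" or rest.count("+") != 1:
        raise ValueError("not a library-v1 ID: %r" % (library_id,))
    org, library_code = rest.split("+")
    # Libraries have no runs, so "run" is always the literal "library".
    return org, library_code, "library"


triplet = library_triplet("library-v1:UniversityX+LIB100")

# Branch entries per kind of course-like object. The values are
# placeholders; real records point at documents in "structures".
library_branches = {"library": "<structure id>"}
course_branches = {
    "draft-branch": "<structure id>",
    "published-branch": "<structure id>",
}
```

Here triplet is ("UniversityX", "LIB100", "library"), matching the example in the text.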

How Components Are Copied Into the Course

When users want to insert content from a library into a course, they first need to add a Randomized Content Block to the course, then edit that block's settings to select a library to source content from. Authors can also specify settings such as how many components from the library to show to each learner, and what types of components to select from the library (e.g. select only multiple choice problems).

...

Updating a Randomized Content Block will also delete any child components which were deleted in the new version of the library.
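
The deletion behavior on update can be pictured with a small sketch. It assumes, purely for illustration, that each copied child can be looked up by the ID of the library block it came from. The function name sync_children and the dict-based data model are hypothetical; this is not the LibraryToolsService.update_children() implementation.

```python
def sync_children(current_children, library_blocks):
    """Return (updated_children, removed_ids) after an update.

    current_children: {library_block_id: copied child data} in the course
    library_blocks:   {library_block_id: block data} in the latest library

    Hypothetical data model, for illustration only.
    """
    # Every block in the latest library version gets a fresh copy.
    updated = {block_id: dict(block) for block_id, block in library_blocks.items()}
    # Children whose source block was deleted from the library are not
    # carried over, which is what removes them from the course.
    removed = sorted(set(current_children) - set(library_blocks))
    return updated, removed


children = {"block-a": {"title": "A (old)"}, "block-b": {"title": "B"}}
library = {"block-a": {"title": "A (new)"}, "block-c": {"title": "C"}}
updated, removed = sync_children(children, library)
```

In this example "block-b" was deleted from the library, so it is reported in removed and is absent from updated, while "block-c" appears as a new child.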

Settings Overrides

When a Randomized Content Block is present in a course, authors can use the "View" button to view the child components sourced from the library:

...

The UI currently allows authors to modify Scope.content field values of components sourced from a library as well. Such Scope.content field changes only affect the block as seen in that particular usage (that place in that course), and do not affect the original library component nor other courses/LibraryContentModules that use the same component. Additionally, changes to any Scope.content fields will be lost when the Randomized Content Block is updated (when it replaces its children with the latest versions sourced from the library). See "Future direction" below for how this could be improved.
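
A minimal sketch of this rule, assuming each block is modeled as a plain dict with separate "content" and "settings" groups. The function names, the "settings_overrides" key, and the example field names are hypothetical; treating library settings as defaults with course overrides layered on top is also an assumption, and the real logic lives in the XBlock field scopes and the modulestore.

```python
def refresh_child(library_block, course_child):
    """Rebuild one course child from the latest library version.

    library_block: {"content": {...}, "settings": {...}}
    course_child:  {"content": {...}, "settings_overrides": {...}}
    """
    return {
        # Scope.content always comes from the library, so any edits made
        # to these fields in the course are discarded on update.
        "content": dict(library_block["content"]),
        # Scope.settings edits are course-specific overrides and survive.
        "settings_overrides": dict(course_child["settings_overrides"]),
    }


def effective_settings(library_block, course_child):
    """Library settings with the course overrides applied on top."""
    return {**library_block["settings"], **course_child["settings_overrides"]}


library_block = {
    "content": {"data": "<problem>v2</problem>"},
    "settings": {"weight": 1, "max_attempts": 3},
}
course_child = {
    "content": {"data": "<problem>edited in course</problem>"},
    "settings_overrides": {"weight": 5},
}
refreshed = refresh_child(library_block, course_child)
settings = effective_settings(library_block, refreshed)
```

After the refresh, the course-side edit to the content field is gone, while the overridden "weight" is still in effect.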

How Components are Randomized

When a learner views a Randomized Content Block in a course, the LMS calls the Randomized Content Block's get_child_descriptors() method, which is responsible for determining which subset of components to show to that particular learner. Recall that all the possible blocks from the library have been copied into the course and exist as children of the Randomized Content Block; this means that get_child_descriptors() is responsible for "filtering" the children, so that only N children will be shown to the learner, where N is usually 1 but can be customized by the course author. The IDs of the blocks that were randomly selected for each learner are saved into the Randomized Content Block's "selected" field. For details on how this selection is made and what happens if the library is updated, or the Randomized Content Block settings are changed in a way that affects the selection, refer to the source code of make_selection(), which is well-commented.
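
The filtering idea can be sketched as follows. This is a simplified illustration, not the actual make_selection() code: the function name, its signature, and the policy for trimming extra selections are assumptions, and max_count is assumed to be non-negative.

```python
import random


def make_selection_sketch(selected, children, max_count, rng=random):
    """Pick up to max_count children for one learner.

    selected:  block IDs previously saved for this learner
    children:  all child block IDs currently under the block
    max_count: N, the number of children to show (assumed >= 0)
    Returns (new_selected, added, removed).
    """
    valid = set(children)
    # Keep earlier picks that still exist, so the learner's view is stable.
    kept = [block_id for block_id in selected if block_id in valid]
    # If N was lowered, trim the extras (here: simply from the end).
    kept = kept[:max_count]
    # Top up with random picks from the children not yet selected.
    pool = [block_id for block_id in children if block_id not in kept]
    needed = min(max_count - len(kept), len(pool))
    added = rng.sample(pool, needed)
    new_selected = kept + added
    removed = [block_id for block_id in selected if block_id not in new_selected]
    return new_selected, added, removed


rng = random.Random(0)
first, added, removed = make_selection_sketch(
    ["a", "gone"], ["a", "b", "c"], 2, rng
)
again = make_selection_sketch(first, ["a", "b", "c"], 2, rng)
```

Calling the function a second time with the saved selection returns the same selection with nothing added or removed, which is the property that keeps a learner's assigned content stable between visits.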

Tracking Log Events

Tracking log events are emitted whenever a particular student is randomly assigned content from a content library, as well as any time that selection had to be changed (e.g. if a block was deleted). These events are documented at http://edx.readthedocs.io/projects/devdata/en/latest/internal_data_formats/tracking_logs.html#library-interaction-events

Future direction and technical debt

  1. Libraries currently cannot store assets (e.g. images), which is a big limitation. (See also GridFS Replacement.)
  2. When editing a component that was sourced from a library, authors cannot tell which fields are Scope.content fields (changes will be lost when updating the parent Randomized Content Block with the latest version of the library) and which fields are Scope.settings fields (changes are considered course-specific overrides and are preserved when updating the Randomized Content Block). The Scope.content fields should be disabled and authors should be prevented from editing them within the course.
  3. We need a new XBlock like the randomized content module, but allowing manual selection of one or more components from the library, instead of random selection. This was part of the original plan but was cut from the MVP.
  4. We need a way to tag content in the library (e.g. align with a taxonomy) and then have the randomized content block only draw problems that match certain criteria (difficulty, topic, etc.). This can also be the basis of adaptive learning features.
  5. Currently, content libraries support a very limited subset of XBlocks. More types of XBlocks should be tested and enabled for use in content libraries.
  6. The Library Content XModule (Randomized Content Block) should be converted to an XBlock. This is currently not possible because it depends on the following methods which are part of the XModule API but not the XBlock API:
    • get_child_descriptors()
    • has_dynamic_children()
    • get_content_titles()
  7. Libraries do not currently have a draft/published workflow, though the basic support for that exists in the split modulestore, analogous to courses.
  8. Libraries can support nested structures and can hold chapters, sections, units, etc. However, the Studio UI does not provide a way to do this. It could be interesting to explore use cases where authors have access to a library of course units or chapters and can build new courses by combining existing units or entire chapters that are sourced from a library.
  9. Enable search and filtering of content library content in the Studio UI and the Randomized Content Block UI.
  10. If a course contains two randomized content blocks that each select one problem from the same library, there is a chance the student will have to do the same problem twice. It would be cool to prevent such duplicate random selections from happening, but is likely not worth the trouble.
  11. Brian Wilson suggested: Eventually, "research and course design teams may wish to be able to have access to scores, aggregated on a single [library component] across courses."
    • This may be easier to implement once the Robust Grades work is completed.
    • The tracking log events emitted by any blocks that were sourced from a content library already emit a "context" section that includes the original_usage_key and original_usage_version fields, which are required to identify library components across courses (see documentation).
  12. Authors should be able to move/copy a component from a course page into a content library.
  13. Support for external content libraries could allow a central repository of content, used by multiple Open edX instances.
  14. There seems to be a bug in XModuleMixin's location.setter: it sets def_id and usage_id to the same UsageKey value, but def_id should not be a UsageKey and should not be the same as usage_id in general. The code should use .runtime.id_reader to get the definition key. It's unclear why this bug isn't causing more problems, or if fixing it will cause any issues - this needs investigation.
  15. Jira issue TNL-5947 (openedx.atlassian.net)
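
For item 11, a minimal sketch of how tracking log events might be grouped by the library component they originated from. Only the "context" section and its original_usage_key and original_usage_version fields come from the description above; the function name, the rest of the event shape, and the placeholder key strings are assumptions.

```python
from collections import defaultdict


def group_by_library_component(events):
    """Group tracking events by (original_usage_key, original_usage_version).

    Events without an original_usage_key in their context are skipped,
    on the assumption that they did not come from a library component.
    """
    groups = defaultdict(list)
    for event in events:
        context = event.get("context", {})
        key = context.get("original_usage_key")
        if key is None:
            continue
        groups[(key, context.get("original_usage_version"))].append(event)
    return dict(groups)


# Placeholder events from two different courses (hypothetical values).
events = [
    {"course": "course-1", "context": {"original_usage_key": "lib-block-x",
                                       "original_usage_version": "v1"}},
    {"course": "course-2", "context": {"original_usage_key": "lib-block-x",
                                       "original_usage_version": "v1"}},
    {"course": "course-2", "context": {}},
]
grouped = group_by_library_component(events)
```

Both library-sourced events end up under the same key even though they come from different courses, which is the grouping a cross-course score aggregation would need as its first step.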