Collections Spec

In https://openedx.atlassian.net/wiki/spaces/OEPM/pages/4100358166 , we listed a number of questions around Collections, and reached the decision that we would have one generic Collection type. This page is for drilling down into the details of what the use cases and requirements are.

 

Definitions

Item

Any author-able, publishable entity in Libraries. This will include Component, Unit, Subsection, Section, and possibly others in the future. Items are individually versioned.

Collection

A grouping of any combination of Items in a Library. Collections can be created for pedagogical or administrative/workflow reasons. They can range in size from very small (a few Items) to very large (100K+ Items). The same Item may appear in multiple Collections.

Note: A Collection is not a Unit, Subsection or Section. Units, Subsections and Sections are defined as precise sequences of components/content, where the sequencing is intentional and delivers a specific learning outcome. Collections are non-sequential and may be much larger and contain much more content than a typical course unit, subsection or section. Furthermore, Collections are not versioned and cannot be published, whereas Units, Subsections and Sections are versioned and can be published.
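
The distinction between Items and Collections can be sketched as a small data model. This is only an illustration under assumed names (`Item`, `Collection`, `item_keys` and the sample keys are hypothetical, not an agreed schema): Items carry a version, while a Collection is an unversioned, unordered set of Item references.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    """An individually versioned, publishable entity (e.g. a Component)."""
    key: str
    item_type: str  # hypothetical values: "component", "unit", ...
    version: int = 1


@dataclass
class Collection:
    """An unordered grouping of Items; it has no version or publish state."""
    title: str
    description: str = ""
    # A set of Item keys: no sequencing, no duplicates, no nested Collections.
    item_keys: set[str] = field(default_factory=set)

    def add(self, item: Item) -> None:
        self.item_keys.add(item.key)


video = Item(key="video-1", item_type="component")
howto = Collection(title="How-to videos")
evergreen = Collection(title="Evergreen content")
howto.add(video)
evergreen.add(video)  # the same Item may appear in multiple Collections
```

Using a set rather than a list reflects the statement that Collections are non-sequential.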

Problem Bank [Item Pool? Content Pool?]

A block type that will be added to the course and is responsible for showing a random subset of Items to a learner. These Items will ultimately come from a Library, either by selecting a particular Collection and/or some set of tags.

MVP Limitations - Randomized pools can only contain components, with the likely use case being problem blocks.

Problem blocks are defined as scorable types, as listed here: https://openedx.atlassian.net/wiki/pages/resumedraft.action?draftId=4113006597
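
As a rough illustration of "showing a random subset of Items to a learner", the sketch below draws a fixed number of Items from a pool. The function name, its parameters, and the choice to seed the selection per learner and per block (so the same learner sees the same subset on every visit) are assumptions made for this example, not decided behavior.

```python
import random


def pick_for_learner(item_keys, count, learner_id, block_id):
    """Return a repeatable pseudo-random subset of item_keys for one learner."""
    # Seeding with a string makes the draw repeatable across page loads.
    rng = random.Random(f"{block_id}:{learner_id}")
    # Sort first so the result does not depend on set iteration order.
    pool = sorted(item_keys)
    return rng.sample(pool, min(count, len(pool)))


pool = {f"problem-{n}" for n in range(10)}
first = pick_for_learner(pool, 3, learner_id="learner-42", block_id="bank-1")
again = pick_for_learner(pool, 3, learner_id="learner-42", block_id="bank-1")
```

Here `pool` stands in for the Item keys resolved from a Collection and/or a tag selection in the Library.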

[Future state question: Do we need a concept for something that sits between a component and a unit, i.e., two components that need to be paired sequentially?]

Key user stories

As a content author, I don’t want to have to create a new Library every time I need to create a different pool of content for randomization.

  • This is a departure from v1 library behavior.

  • Relaunched libraries are intended to contain many more Items than v1 libraries, as well as Collections within Libraries.

  • Implies a migration path where a v1 library maps to a Collection within a new library experience, rather than a 1:1 mapping of v1 to new style libraries.

As a content author, I want to be able to create subsets of content within a Library with any media type (videos, text blocks, problem blocks, or any combination thereof) as a means of organizing my content, and without the expectation that I would ever need to randomize from it.

Example: I want to organize a subset of videos on the same topic so that it’s easier for me to find them later when I want to use them.

  • Collections may be created for pedagogical, administrative, or workflow reasons.

  • Collections may hold any combination of types–there are no meaningful restrictions on them.

As a content author, I don’t want to lose the ability to create a subset of content (problems) to randomize from.

I still use randomized problem sets for my exams and assignments. But I need an easier way to create these subsets and use them in one or multiple courses, because the current workflow is cumbersome.

Non-Use Cases

These are things we explicitly want to discourage, because we plan to implement better ways to do them.

  • Modeling Units as Collections

    • In earlier iterations, we planned to have an “option to randomize or not randomize the set”, but doing so could encourage people to treat Collections as Units and

    • We plan to have Units be explicitly author-able things in Libraries. We don’t want Collections used for this purpose because it would add a lot of noise.

Requirements

  • Authors can create subsets of components for the sake of grouping like components together as a method of content organization and management. 

    • For example, I want to create a subset of evergreen videos about how to do peer reviews.

  • Authors can create as many Collections in a Library as they wish.

  • Authors can add as many components to a collection as they wish.

  • Components can live in multiple collections.

  • Collections can contain mixed media types (for example, a collection may contain video components, text blocks and problems all within the same collection).

  • Collections may be sorted by:

    • Title

    • Last modified datetime of their contents.

  • Search/sort/filter within a collection

    • Basic keyword search within a collection

    • Basic sort and filter

      • Sort alphabetical

      • Filter by tag

    • @Jenna Makowski notes: “Really need to weigh the immediate user need against feasibility and eng effort, especially for MVP, esp since Collections will be assumed fairly small at the outset. Alternative barebones MVP approach is to auto-populate lists of components in collections alphabetically.”

    • [Key here is whether changes/updates to a collection in a Library need to auto-sync to item pools already being used in a course]

  • [FUTURE STATE] When Libraries support units, subsections and sections, collections may also contain any combination of components, units, sections and subsections. Note that a single unit or a single section is not considered a Collection. Rather, a Collection is comprised of multiple sections, multiple units, etc.
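
The two sort orders listed for Collections (title, and last modified datetime of their contents) could look like the following. This is a minimal sketch over in-memory dictionaries; the field names and sample data are hypothetical, and treating an empty Collection as the oldest is an assumption.

```python
from datetime import datetime

collections = [
    {"title": "Peer review videos",
     "item_modified": [datetime(2024, 3, 1), datetime(2024, 5, 9)]},
    {"title": "Algebra problems",
     "item_modified": [datetime(2024, 6, 2)]},
    {"title": "Empty drafts",
     "item_modified": []},
]


def last_modified(collection):
    """Most recent modification time among the Collection's Items."""
    # Assumption: an empty Collection sorts as the oldest possible value.
    return max(collection["item_modified"], default=datetime.min)


# Sort by title, ignoring case.
by_title = sorted(collections, key=lambda c: c["title"].casefold())
# Sort by most recently modified contents first.
by_recent = sorted(collections, key=last_modified, reverse=True)
```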

How authors create Collections

  • When authors create new components in a Library or edit them, they can choose to “add to a new Collection” or “add to a pre-existing Collection”.

  • Authors can create a collection and then add components to it. For example: I know I need a collection of all my ‘how-to videos’, so I go to my library > create new collection > title it > save. Now I can go ahead and add content to the existing collection.

  • Authors can use tag queries to create Collections, based on tags that have already been added to components. For example, create a new Collection from, or add to an existing Collection, all components that have been tagged with “algebra”, “easy” and “multiple choice”. [live updates/sync option]

    • [Does adding tags to new content create an auto-sync for that content to be added to relevant collections? Would be a complex implementation]

  • Authors can give Collections titles and brief descriptions. These can be edited later.

  • Authors can add tags to Collections, in the same way that they can add tags to components.
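
The tag-query path above can be illustrated with a small matching function. It is a sketch under two assumptions: a component must carry all of the requested tags (AND semantics), and the result is a one-time snapshot, since live sync is still an open question. The function name and sample data are hypothetical.

```python
def items_matching_tags(item_tags, required_tags):
    """Return the keys of Items that carry every tag in required_tags."""
    required = set(required_tags)
    return {key for key, tags in item_tags.items() if required <= set(tags)}


item_tags = {
    "problem-1": {"algebra", "easy", "multiple choice"},
    "problem-2": {"algebra", "hard", "multiple choice"},
    "video-1": {"algebra", "easy"},
}

# Snapshot of matching Items at the time the query runs.
matching = items_matching_tags(item_tags, ["algebra", "easy", "multiple choice"])

# Either create a new Collection from the result or add to an existing one.
existing_collection = {"problem-9"}
existing_collection |= matching
```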

How Collections function within Libraries

  • Authors can view all of their Collections in one place within the Library.

  • Authors can search for Collections in the same way they search for content. Collections turn up in search results, and authors can refine search results by collection.

    • Search results indicate whether the result is a component or a collection.

    • E.g., I conduct a free-text search for “algebra” and 239 results display, including individual components and collections. I can further refine my search to “collections only”.

  • OUT OF SCOPE FOR NOW: Searching or refining searches within a collection. The assumption is that Collections will be small enough that complex search functionality within a Collection probably isn’t necessary.
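
For the Library-wide search described above (not search within a Collection, which is out of scope), the sketch below shows results that carry a type label and an optional "collections only" refinement. The `SearchResult` type, the `kind` values, and the naive substring match are assumptions for illustration; they do not describe the actual search backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    kind: str   # hypothetical labels: "component" or "collection"
    title: str


def search(index, text, kind=None):
    """Case-insensitive title match, optionally restricted to one result kind."""
    needle = text.casefold()
    return [
        result for result in index
        if needle in result.title.casefold()
        and (kind is None or result.kind == kind)
    ]


index = [
    SearchResult("component", "Algebra warm-up problem"),
    SearchResult("collection", "Algebra problem sets"),
    SearchResult("component", "Geometry video"),
]

everything = search(index, "algebra")                          # both kinds
collections_only = search(index, "algebra", kind="collection")  # refined
```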

Limitations

  • A Collection can have up to 100,000 ( ? ) Items.

  • There can be at most 10,000 ( ? ) Collections in a Library.

  • A single Item may belong to at most 100 Collections.

    • This is largely to make sure that we can quickly do updates to Collections indexing that involve their contents, e.g. “sort Collections by their last modified Item”

  • Items have no manually selected ordering within a Collection.

  • Collections can overlap. An Item can belong to multiple Collections.

  • Collections do not nest. Collections are not Items. A Collection can be a superset of another Collection, but a Collection cannot contain another Collection as one of its elements.

  • Collections are not publishable entities themselves.

    • They contain things that can be published (like Components), but there is no “draft” vs. “published” version of a Collection. Changes to Collection metadata like its name, description, or contents are saved immediately.
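
The numeric limits above could be enforced at write time, roughly as follows. This is a sketch only: the class and exception names are hypothetical, the two limits marked "( ? )" are still tentative, and Collections are modeled as plain sets of Item keys so that they cannot nest and have no ordering.

```python
MAX_ITEMS_PER_COLLECTION = 100_000    # tentative, marked "( ? )" above
MAX_COLLECTIONS_PER_LIBRARY = 10_000  # tentative, marked "( ? )" above
MAX_COLLECTIONS_PER_ITEM = 100


class LimitExceeded(Exception):
    """Raised when an operation would break one of the limits."""


class Library:
    def __init__(self):
        self.collections = {}  # collection title -> set of Item keys
        self.memberships = {}  # Item key -> set of collection titles

    def create_collection(self, title):
        if title in self.collections:
            return
        if len(self.collections) >= MAX_COLLECTIONS_PER_LIBRARY:
            raise LimitExceeded("too many Collections in this Library")
        self.collections[title] = set()

    def add_item(self, title, item_key):
        items = self.collections[title]
        if item_key in items:
            return  # already a member; nothing to do
        if len(items) >= MAX_ITEMS_PER_COLLECTION:
            raise LimitExceeded("Collection is full")
        member_of = self.memberships.setdefault(item_key, set())
        if len(member_of) >= MAX_COLLECTIONS_PER_ITEM:
            raise LimitExceeded("Item is already in too many Collections")
        items.add(item_key)
        member_of.add(title)  # saved immediately; no draft/published state


library = Library()
library.create_collection("Algebra")
library.add_item("Algebra", "problem-1")
```

Keeping the reverse `memberships` map is one way to make the per-Item cap cheap to check; it is a design choice for this example, not a stated requirement.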

Open Questions

What’s the user story for needing to track changes to collections?

None of these are required for MVP.

Current assumptions:

  • Log what was added to and removed from a Collection.

  • Log who made those changes.

  • Log when the changes were made.

  • Log when edits are made.

  • Log when there are published updates to the Items in a Collection.
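
If these assumptions hold, each logged change might be captured as a record like the one below. This is a sketch with hypothetical field names and action labels; nothing here is required for MVP, and the user story is still open.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class CollectionChange:
    collection: str   # which Collection changed
    action: str       # hypothetical labels, see usage below
    actor: str        # who made the change
    at: datetime      # when the change was made
    item_key: Optional[str] = None  # the affected Item, if any


def record(log, collection, action, actor, item_key=None):
    """Append one change entry, timestamped in UTC, and return it."""
    entry = CollectionChange(
        collection=collection,
        action=action,
        actor=actor,
        at=datetime.now(timezone.utc),
        item_key=item_key,
    )
    log.append(entry)
    return entry


log = []
record(log, "How-to videos", "item_added", "author-1", item_key="video-1")
record(log, "How-to videos", "metadata_edited", "author-2")
record(log, "How-to videos", "item_removed", "author-1", item_key="video-1")
```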