Extracting Course Scheduling and Discovery-related Data from ModuleStore

This is very much a collection of thoughts prompted by some recent discussions, and is not a plan of action or even a coherent proposal at this point.

Context

Right now, there’s bidirectional synchronization of Studio and Discovery course metadata. This is a source of confusion and operational pain:

  • We’ve had bugs related to infinite update cycles.

  • It’s not always clear what changed the dates and why.

  • Everyone hates that the default start date is in 2030.

    • It’s only another seven years before this becomes an even bigger problem.

  • Even though scheduling data lives in ModuleStore, changes to it take effect immediately (i.e. they’re not tied to publishing).

  • We don’t currently have an authoritative object to attach to in edx-platform when we want “the abstract notion of a course”.

Goals

  • Streamline course creation

  • Allow for in-platform management of scheduling information and discovery metadata

  • Allow for external management (e.g. Salesforce integration, catalog service, etc.)

Current State

Within edx-platform the flow of course scheduling goes: Studio ModuleStore → Course Overview → LMS

Outside of edx-platform, there is a bidirectional relationship between Studio ModuleStore ↔︎ course-discovery. The publisher tries to keep these in sync.

Proposal 1: Upstream Course Provisioning

  • Remove most of “Schedule & Details” from ModuleStore and store it in one or more new apps in edx-platform.

  • New “Schedule & Details” app(s):

    • Not tied to the Studio publish cycle.

    • Expose a REST API so that they can be manipulated upstream.

    • Map “courses” and “course runs” to content/modulestore courses.

    • Separate apps:

      • CORE: Course scheduling information (start, end, pacing, maybe enrollment start/end)

      • CORE: Catalog course/run mapping to ModuleStore, tying multiple runs together, etc. This includes course title.

        • Pre-reqs relationships?

      • Entrance exam?

      • Licensing?

      • Course introduction, language, images, details, and effort sound like a separate, possibly pluggable, discovery app that lives downstream of content authoring (because it will want tagging data that’s not in provisioning).

  • Is the Provisioning area the place where we also make decisions about what LTI keys are available?

  • Things like what forum to use seem like they’d be a Course Admin decision.

  • CourseOverview information should be derived from this new data source as a backwards compatibility measure.

  • Course content stores only relative dates (e.g. start + 1 week).

  • New system lives upstream of both Studio and LMS courseware.

  • This would require substantial changes to edx-when as well, for LMS-materialized dates.

  • Transition phase would involve Studio ignoring modulestore settings for these on a course by course basis.

  • Would impact git-authored courses the most.

    • Alternate import/export mechanism that sets things in the REST API?

  • We’ve talked before about separating content authoring from course run administration in the interface; this would add one more step further back, at the start.

    • Course Provisioning → Content Authoring → Course Administration

  • Q: How much of the Course → CourseRun relationship do we expose in a new Studio interface?
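
As a rough illustration of the “content stores only relative dates” idea above, the following sketch resolves a content-level offset against a run-level schedule owned by a provisioning app. Every name here (`RunSchedule`, `RelativeDate`, `resolve`) is hypothetical and not an existing edx-platform or edx-when API; the only assumptions are that a run has a start date and that content expresses dates as offsets from it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RunSchedule:
    """Hypothetical provisioning-side record; lives outside ModuleStore."""
    course_run_key: str
    start: datetime


@dataclass(frozen=True)
class RelativeDate:
    """Hypothetical content-side date: an offset from the run start."""
    block_id: str
    offset: timedelta


def resolve(schedule: RunSchedule, rel: RelativeDate) -> datetime:
    """Materialize an absolute date from the run start plus the offset."""
    return schedule.start + rel.offset


# The same content (one relative date) reused by two runs.
week_one_due = RelativeDate(block_id="week-1-homework", offset=timedelta(weeks=1))

fall_run = RunSchedule(
    course_run_key="course-v1:DemoX+Demo101+2024_Fall",
    start=datetime(2024, 9, 2, tzinfo=timezone.utc),
)
spring_run = RunSchedule(
    course_run_key="course-v1:DemoX+Demo101+2025_Spring",
    start=datetime(2025, 1, 13, tzinfo=timezone.utc),
)

fall_due = resolve(fall_run, week_one_due)
spring_due = resolve(spring_run, week_one_due)
```

The point of the split is that changing a run's start date upstream shifts every materialized date without touching (or republishing) the content itself.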

Follow-on Thoughts

Provisioning is its own family of apps (a DDD subdomain, even?) that would include:

  • Course → Course Run relationships.

  • The canonical representation of both of those concepts, which you would make foreign keys against when referring to Course Runs themselves rather than their content.

  • User/role provisioning (including a lot of what is enrollment today).

But it doesn’t include other marketing-related details, like description, logo, professor bios and the like.
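
To make the “canonical representation” idea a little more concrete, here is a minimal sketch using plain dataclasses rather than Django models. The class and field names (`Course`, `CourseRun`, `content_key`, `ProvisioningRegistry`) are assumptions made for illustration only; in edx-platform these would presumably be database models that other apps point foreign keys at.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Course:
    """Hypothetical canonical 'abstract notion of a course'."""
    org: str
    number: str
    title: str


@dataclass(frozen=True)
class CourseRun:
    """Hypothetical canonical run; other apps would key against this."""
    course: Course
    run: str
    content_key: str  # assumed pointer to the ModuleStore course holding content


class ProvisioningRegistry:
    """In-memory stand-in for the tables a provisioning app might own."""

    def __init__(self) -> None:
        self._runs: Dict[Course, List[CourseRun]] = {}

    def add_run(self, course_run: CourseRun) -> None:
        # Group runs under their parent course, preserving insertion order.
        self._runs.setdefault(course_run.course, []).append(course_run)

    def runs_for(self, course: Course) -> List[CourseRun]:
        # Return a copy so callers cannot mutate the registry's state.
        return list(self._runs.get(course, []))


demo = Course(org="DemoX", number="Demo101", title="Demonstration Course")
registry = ProvisioningRegistry()
registry.add_run(CourseRun(demo, "2024_T1", "course-v1:DemoX+Demo101+2024_T1"))
registry.add_run(CourseRun(demo, "2024_T2", "course-v1:DemoX+Demo101+2024_T2"))
```

Note that the title lives on the course record rather than in content, and nothing here carries marketing details such as descriptions or bios.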

WIP (this is really half-baked) High Level Picture

Provisioning

  • Users

  • Organizations

  • Creation/existence of:

    • Courses

    • Course Runs

    • Libraries

  • Access control

    • Enrollment

  • High-level scheduling: start dates, beta access, enrollment allowed, etc.

  • App/plugin allocation/configuration (e.g. LTI credentials?)

Authoring

  • Content authoring: problems, videos, HTML (in libraries, course runs)

  • Mapping of the LearningPackage contents to Courses, Runs, Libraries.

Course Admin

  • Relative scheduling (week 1, week 2, etc.)

  • Student management (grades, extensions, moderation, etc.)

  • Grading policy management.

Learning

  • LMS student interaction

Discovery

  • Description

  • Tagging

  • Search

Replication

If the provisioning apps were a separate repo that could be installed into any service, then we could have it work in one of two modes: primary or replica. We could then have message-bus-driven replication, so that every service would receive updates to course run, user, and role data.
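
A toy sketch of the primary/replica idea follows. It uses a synchronous in-memory bus as a stand-in for a real message bus, and all names (`InMemoryBus`, `CourseRunStore`, the event shape) are hypothetical. It assumes that only the primary accepts writes and that replicas apply whatever events they receive.

```python
from typing import Callable, Dict, List

Event = Dict[str, str]


class InMemoryBus:
    """Stand-in for a real message bus; delivers events synchronously."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, handler: Callable[[Event], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: Event) -> None:
        for handler in self._subscribers:
            handler(event)


class CourseRunStore:
    """Hypothetical provisioning store that runs as a primary or a replica."""

    def __init__(self, mode: str, bus: InMemoryBus) -> None:
        if mode not in ("primary", "replica"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.runs: Dict[str, Dict[str, str]] = {}
        self._bus = bus
        if mode == "replica":
            # Replicas only learn about changes via the bus.
            bus.subscribe(self._apply)

    def upsert(self, key: str, data: Dict[str, str]) -> None:
        """Write locally and broadcast; only allowed on the primary."""
        if self.mode != "primary":
            raise PermissionError("replicas are read-only")
        self.runs[key] = dict(data)
        self._bus.publish({"key": key, **data})

    def _apply(self, event: Event) -> None:
        """Apply a replicated change to the local copy."""
        data = dict(event)
        key = data.pop("key")
        self.runs[key] = data


bus = InMemoryBus()
primary = CourseRunStore("primary", bus)
lms_replica = CourseRunStore("replica", bus)
primary.upsert("course-v1:DemoX+Demo101+2024_T1", {"start": "2024-09-02"})
```

A real deployment would also need to handle ordering, retries, and initial backfill for new replicas, none of which this sketch attempts.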

Appendix

Salesforce Object Manager

Field List

Field Editor