Extracting Course Scheduling and Discovery-related Data from ModuleStore

This is very much a collection of thoughts prompted by some recent discussions, and is not a plan of action or even a coherent proposal at this point.

Context

Right now, Studio and Discovery course metadata are synchronized bidirectionally. This is a source of confusion and operational pain:

  • We’ve had bugs related to infinite update cycles.

  • It’s not always clear what changed the dates and why.

  • Everyone hates that the default start date is in 2030.

    • It’s only another seven years before this becomes an even bigger problem.

  • Even though scheduling data lives in ModuleStore, changes to it take effect immediately (i.e. they’re not tied to publishing).

  • We don’t currently have an authoritative model in edx-platform to attach data to when we want “the abstract notion of a course.”

Goals

  • Streamline course creation

  • Allow for in-platform management of scheduling information and discovery metadata

  • Allow for external management (e.g. Salesforce integration, catalog service, etc.)

Current State

Within edx-platform the flow of course scheduling goes: Studio ModuleStore → Course Overview → LMS

Outside of edx-platform, there is a bidirectional relationship between Studio ModuleStore ↔︎ course-discovery. The publisher tries to keep these in sync.

Proposal 1: Upstream Course Provisioning

  • Remove most of “Schedule & Details” from ModuleStore and store it in one or more new apps in edx-platform.

  • New “Schedule & Details” app(s):

    • Not tied to the Studio publish cycle.

    • Expose a REST API so that they can be manipulated upstream.

    • Map “courses” and “course runs” to content/modulestore courses

    • Separate apps:

      • CORE: Course scheduling information (start, end, pacing, maybe enrollment start/end)

      • CORE: Catalog course/run mapping to ModuleStore, tying multiple runs together, etc. This includes course title.

        • Pre-reqs relationships?

      • Entrance exam?

      • Licensing?

      • Course introduction, language, images, details, and effort sound like a separate discovery app that could be pluggable but lives downstream of content authoring (because it’ll want tagging data that’s not in provisioning).

  • Is the Provisioning area the place where we also make decisions about what LTI keys are available?

  • Things like what forum to use seem like they’d be a Course Admin decision.

  • CourseOverview information should be derived from this new data source as a backwards compatibility measure.

  • Course content stores only relative dates (e.g. start + 1 week)

  • New system lives upstream of both Studio and LMS courseware.

  • This would require substantial changes to edx-when as well, for LMS-materialized dates.

  • The transition phase would involve Studio ignoring ModuleStore settings for these on a course-by-course basis.

  • Would impact git-authored courses the most.

    • Alternate import/export mechanism that sets things in the REST API?

  • We’ve talked before about separating content authoring from course run administration in the interface, but provisioning would be a further step, placed before both.

    • Course Provisioning → Content Authoring → Course Administration

  • Q: How much of the Course → CourseRun relationship do we expose in a new Studio interface?
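To make the two CORE apps in the list above a little more concrete, here is a minimal sketch of what the provisioning-side data and the backwards-compatible CourseOverview derivation might look like. Everything here is an assumption for illustration: the class names (`CatalogCourse`, `CourseRunSchedule`, `CourseRun`), the helper `to_overview_fields`, and the choice to allow an unset start date are hypothetical and are not part of edx-platform. Plain dataclasses stand in for what would presumably be Django models.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class CatalogCourse:
    """Hypothetical: the abstract course that ties multiple runs together."""
    org: str
    number: str
    title: str  # the course title lives here, not in ModuleStore


@dataclass(frozen=True)
class CourseRunSchedule:
    """Hypothetical: scheduling data held outside ModuleStore."""
    start: Optional[datetime]  # assumption: None means "not scheduled yet"
    end: Optional[datetime]
    self_paced: bool = False
    enrollment_start: Optional[datetime] = None
    enrollment_end: Optional[datetime] = None


@dataclass(frozen=True)
class CourseRun:
    """Hypothetical: the thing other apps would make foreign keys against."""
    course: CatalogCourse
    run: str
    schedule: CourseRunSchedule

    @property
    def course_key(self) -> str:
        # Maps the provisioning record to the content course key.
        return f"course-v1:{self.course.org}+{self.course.number}+{self.run}"


def to_overview_fields(run: CourseRun) -> dict:
    """Derive CourseOverview-style fields from provisioning data.

    The dict keys are illustrative; the point is that downstream code keeps
    reading the same shape while the source of truth moves upstream.
    """
    return {
        "id": run.course_key,
        "display_name": run.course.title,
        "start": run.schedule.start,
        "end": run.schedule.end,
        "self_paced": run.schedule.self_paced,
        "enrollment_start": run.schedule.enrollment_start,
        "enrollment_end": run.schedule.enrollment_end,
    }
```

In this sketch nothing depends on the Studio publish cycle: a REST API could update a `CourseRunSchedule` directly, and the overview fields would be re-derived from it.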

Follow-on Thoughts

Provisioning is its own family of apps (a DDD subdomain, even?) that would include:

  • Course → Course Run relationships.

  • The canonical representation of both of those concepts, that you would make foreign keys against when you are talking about Course Runs themselves and not their content.

  • User/role provisioning (including a lot of what is enrollment today).

But it doesn’t include other marketing-related details, like description, logo, professor bios and the like.

WIP (this is really half-baked) High Level Picture

Provisioning

  • Users

  • Organizations

  • Creation/existence of:

    • Courses

    • Course Runs

    • Libraries

  • Access control

    • Enrollment

  • High-level scheduling: start dates, beta access, enrollment allowed, etc.

  • App/plugin allocation/configuration (e.g. LTI credentials?)

Authoring

  • Content authoring: problems, videos, and HTML in libraries and course runs

  • Mapping of the LearningPackage contents to Courses, Runs, Libraries.

Course Admin

  • Relative scheduling (week 1, week 2, etc.)

  • Student management (grades, extensions, moderation, etc.)

  • Grading policy management.
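The relative scheduling idea (content stores only offsets such as "start + 1 week", while the run's absolute start date comes from provisioning) could be sketched roughly as follows. The function `materialize_dates` and the `SECTION_OFFSETS` data are hypothetical names used for illustration; this is not how edx-when works today.

```python
from datetime import datetime, timedelta
from typing import Dict


def materialize_dates(
    run_start: datetime, offsets: Dict[str, timedelta]
) -> Dict[str, datetime]:
    """Turn content-relative offsets into absolute dates for one course run."""
    return {item_id: run_start + offset for item_id, offset in offsets.items()}


# Hypothetical content-side data: offsets from the run's start, no absolute dates.
SECTION_OFFSETS = {
    "week_1": timedelta(weeks=0),
    "week_2": timedelta(weeks=1),
    "week_3": timedelta(weeks=2),
}
```

Because the content holds no absolute dates, the same offsets can be reused for every run of a course, and changing a run's start date upstream shifts all materialized dates without a publish.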

Learning

  • LMS student interaction

Discovery

  • Description

  • Tagging

  • Search

Replication

If the provisioning apps were a separate repo that could be installed into any service, then each installation could work in one of two modes: primary or replica. A message-bus-driven replication mode would then let every service receive updates to course runs, users, and role data.
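As a rough illustration of the primary/replica idea, the sketch below uses an in-memory list as a stand-in for the message bus. The `ProvisioningStore` class, its method names, and the per-record version counter are all assumptions made for this example; a real implementation would sit on top of whatever event bus and models are actually chosen.

```python
from typing import Callable, Dict, Optional


class ProvisioningStore:
    """Hypothetical store that runs as either the primary or a replica."""

    def __init__(self, mode: str, publish: Optional[Callable[[dict], None]] = None):
        if mode not in ("primary", "replica"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self._publish = publish  # stand-in for "send to the message bus"
        self._runs: Dict[str, dict] = {}

    def update_run(self, course_key: str, **fields) -> dict:
        """Write path: only allowed on the primary."""
        if self.mode != "primary":
            raise PermissionError("replicas only change via replicated events")
        current = self._runs.get(course_key, {"version": 0})
        record = {
            **current,
            **fields,
            "course_key": course_key,
            "version": current["version"] + 1,
        }
        self._runs[course_key] = record
        if self._publish is not None:
            self._publish(dict(record))  # publish a copy as the event payload
        return record

    def apply_event(self, event: dict) -> bool:
        """Replica path: apply an event unless it is stale or a duplicate."""
        if self.mode != "replica":
            raise PermissionError("only replicas apply replicated events")
        current = self._runs.get(event["course_key"])
        if current is not None and current["version"] >= event["version"]:
            return False  # already have this version or a newer one
        self._runs[event["course_key"]] = dict(event)
        return True

    def get_run(self, course_key: str) -> Optional[dict]:
        return self._runs.get(course_key)
```

The version check is one simple way to make event handling idempotent, so redelivered or out-of-order messages do not overwrite newer data. Since writes only happen on the primary, updates flow in one direction, unlike the bidirectional sync described in the Context section.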

Appendix

[Screenshot: Salesforce Object Manager]

[Screenshot: Field List]

[Screenshot: Field Editor]

Dave Ormsbee (Axim)
May 23, 2023

Do you mean having some kind of freeform key/value store where sites can store arbitrary metadata against the CourseOverview? I suspect that we’d want to re-use tagging for that use case (i.e. be able to tag the abstract notions of a Course/CourseRun in addition to the specific contents).

Edward Zarecor
May 23, 2023

Yeah, that’s what I have in mind. I think this is a pattern frequently seen in platforms such as Salesforce and Jira. They ship with core objects, but allow them to be extended without altering the software. I’ll add a screenshot of the Salesforce Object Manager below. I think this could be represented via tagging, though we’d have to consider whether we’re supporting types for values.

Dave Ormsbee (Axim)
May 23, 2023

Looking at the screenshots, that appears a lot more sophisticated than simple tagging, and implies a whole separate app for this kind of supplemental data.

Edward Zarecor
May 23, 2023

I think Salesforce is the most complex example. There’s probably a good enough version that’s much simpler, but perhaps still more complex than tags. As a use case, I’d state it as: “As an organization, I want to enrich the core Open edX models to support my specific needs in a way that doesn’t complicate upgrades.”