Extracting Course Scheduling and Discovery-related Data from ModuleStore

This is very much a collection of thoughts prompted by some recent discussions, and is not a plan of action or even a coherent proposal at this point.

Context

Right now, Studio and Discovery course metadata are synchronized bidirectionally. This is a source of confusion and operational pain:

  • We’ve had bugs related to infinite update cycles.

  • It’s not always clear what changed the dates and why.

  • Everyone hates that the default start date is in 2030.

    • It’s only another seven years before this becomes an even bigger problem.

  • Even though scheduling data lives in modulestore, scheduling changes take effect immediately (i.e. they’re not tied to publishing).

  • We don’t currently have an authoritative thing to attach to in edx-platform when we want “the abstract notion of a course”.

Goals

  • Streamline course creation

  • Allow for in-platform management of scheduling information and discovery metadata

  • Allow for external management (e.g. Salesforce integration, catalog service, etc.)

Current State

Within edx-platform the flow of course scheduling goes: Studio ModuleStore → Course Overview → LMS

Outside of edx-platform, there is a bidirectional relationship between Studio ModuleStore ↔︎ course-discovery. The publisher tries to keep these in sync.

Proposal 1: Upstream Course Provisioning

  • Remove most of “Schedule & Details” from ModuleStore and store in one or more new apps in edx-platform.

  • New “Schedule & Details” app(s):

    • Not tied to the Studio publish cycle.

    • Expose a REST API so that they can be manipulated upstream.

    • Map “courses” and “course runs” to content/modulestore courses

    • Separate apps:

      • CORE: Course scheduling information (start, end, pacing, maybe enrollment start/end)

      • CORE: Catalog course/run mapping to ModuleStore, tying multiple runs together, etc. This includes course title.

        • Pre-reqs relationships?

      • Entrance exam?

      • Licensing?

      • Course introduction, language, images, details, and effort sound like a separate discovery app. It could be pluggable, but it would live downstream of content authoring, because it’ll want tagging data that’s not in provisioning.

  • Is the Provisioning area the place where we also make decisions about what LTI keys are available?

  • Things like what forum to use seem like they’d be a Course Admin decision.

  • CourseOverview information should be derived from this new data source as a backwards compatibility measure.

  • Course content stores only relative dates (e.g. start + 1 week)

  • New system lives upstream of both Studio and LMS courseware.

  • This would require substantial changes to edx-when as well, for LMS-materialized dates.

  • Transition phase would involve Studio ignoring modulestore settings for these on a course by course basis.

  • Would impact git-authored courses the most.

    • Alternate import/export mechanism that sets things in the REST API?

  • We’ve talked before about separating content authoring from course run administration in the interface. Provisioning would be one step further back, at the start:

    • Course Provisioning → Content Authoring → Course Administration

  • Q: How much of the Course → CourseRun relationship do we expose in a new Studio interface?
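
To make the “course content stores only relative dates” idea a bit more concrete, here is a minimal sketch, assuming the new provisioning data holds the absolute run dates and content holds only offsets. The names `RunSchedule`, `RelativeDate`, and `resolve` are hypothetical and are not existing edx-platform or edx-when APIs; the course keys are made-up examples.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RunSchedule:
    """Hypothetical provisioning-side record: absolute dates for one course run."""
    run_key: str
    start: datetime
    end: datetime


@dataclass(frozen=True)
class RelativeDate:
    """Hypothetical content-side record: an offset from the run start."""
    block_id: str
    offset: timedelta


def resolve(schedule: RunSchedule, rel: RelativeDate) -> datetime:
    """Materialize an absolute date from the run start plus the content offset."""
    return schedule.start + rel.offset


# The same content (start + 1 week) reused by two runs with different start dates.
homework_due = RelativeDate(block_id="week2_homework", offset=timedelta(weeks=1))

run_2024 = RunSchedule(
    run_key="course-v1:DemoX+Demo101+2024",
    start=datetime(2024, 1, 8, tzinfo=timezone.utc),
    end=datetime(2024, 3, 4, tzinfo=timezone.utc),
)
run_2025 = RunSchedule(
    run_key="course-v1:DemoX+Demo101+2025",
    start=datetime(2025, 1, 6, tzinfo=timezone.utc),
    end=datetime(2025, 3, 3, tzinfo=timezone.utc),
)

due_2024 = resolve(run_2024, homework_due)
due_2025 = resolve(run_2025, homework_due)
```

In this sketch a re-run only needs a new `RunSchedule`; the content itself is untouched, which is the property that would let scheduling live upstream of Studio.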

Follow-on Thoughts

Provisioning is its own family of apps (maybe even a DDD subdomain?) that would include:

  • Course → Course Run relationships.

  • The canonical representation of both of those concepts, which you would make foreign keys against when you are talking about Course Runs themselves and not their content.

  • User/role provisioning (including a lot of what is enrollment today).

But it doesn’t include other marketing-related details, like description, logo, professor bios and the like.
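
As a rough illustration of “the canonical representation that you would make foreign keys against”, here is a small sketch using plain dataclasses rather than Django models. Everything in it (`Course`, `CourseRun`, `ProvisioningRegistry`, the field names, the example keys) is a hypothetical stand-in, not an existing edx-platform model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Course:
    """Hypothetical canonical catalog course (the abstract course, not its content)."""
    course_id: int
    org: str
    title: str


@dataclass(frozen=True)
class CourseRun:
    """Hypothetical canonical run: points at its parent course and at its content."""
    run_id: int
    course_id: int    # plays the role of a foreign key to Course
    content_key: str  # the ModuleStore course key that holds the content


@dataclass
class ProvisioningRegistry:
    """In-memory stand-in for the tables a provisioning app might own."""
    courses: Dict[int, Course] = field(default_factory=dict)
    runs: Dict[int, CourseRun] = field(default_factory=dict)

    def add_course(self, course: Course) -> None:
        self.courses[course.course_id] = course

    def add_run(self, run: CourseRun) -> None:
        # Enforce the "foreign key": a run must belong to a known course.
        if run.course_id not in self.courses:
            raise KeyError(f"unknown course_id {run.course_id}")
        self.runs[run.run_id] = run

    def runs_for(self, course_id: int) -> List[CourseRun]:
        return [r for r in self.runs.values() if r.course_id == course_id]


registry = ProvisioningRegistry()
registry.add_course(Course(course_id=1, org="DemoX", title="Demo Course"))
registry.add_run(CourseRun(run_id=10, course_id=1, content_key="course-v1:DemoX+Demo101+2023"))
registry.add_run(CourseRun(run_id=11, course_id=1, content_key="course-v1:DemoX+Demo101+2024"))
```

Other apps (enrollment, scheduling, discovery) would then reference `run_id` or `course_id` instead of reaching into content to find out which course they are talking about.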

WIP (this is really half-baked) High Level Picture

Provisioning

  • Users

  • Organizations

  • Creation/existence of:

    • Courses

    • Course Runs

    • Libraries

  • Access control

    • Enrollment

  • High-level scheduling: start dates, beta access, enrollment allowed, etc.

  • App/plugin allocation/configuration (e.g. LTI credentials?)

Authoring

  • Content authoring: problems, videos, and HTML in libraries and course runs.

  • Mapping of the LearningPackage contents to Courses, Runs, Libraries.

Course Admin

  • Relative scheduling (week 1, week 2, etc.)

  • Student management (grades, extensions, moderation, etc.)

  • Grading policy management.

Learning

  • LMS student interaction

Discovery

  • Description

  • Tagging

  • Search

Replication

If the provisioning apps were a separate repo that could be installed into any service, then each installation could work in one of two modes: primary or replica. A message-bus-driven replication mode would then let every service receive updates to course runs, users, and role data.
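
A very rough sketch of the primary/replica idea, assuming versioned update events and a plain Python list standing in for the message bus. `RunUpdated`, `ProvisioningStore`, and their methods are hypothetical names for illustration only, and the versioning scheme is an assumption rather than anything decided.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RunUpdated:
    """Hypothetical event published whenever a course run's schedule changes."""
    run_key: str
    version: int
    start: str  # ISO 8601 timestamp


class ProvisioningStore:
    """Hypothetical store that runs in either 'primary' or 'replica' mode."""

    def __init__(self, mode: str, bus: Optional[List[RunUpdated]] = None):
        if mode not in ("primary", "replica"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.bus = bus if bus is not None else []  # a list stands in for the message bus
        self.runs: Dict[str, RunUpdated] = {}

    def set_start(self, run_key: str, start: str) -> RunUpdated:
        """Primary only: record the change and publish it to the bus."""
        if self.mode != "primary":
            raise PermissionError("replicas are read-only")
        previous = self.runs.get(run_key)
        version = previous.version + 1 if previous else 1
        event = RunUpdated(run_key=run_key, version=version, start=start)
        self.runs[run_key] = event
        self.bus.append(event)
        return event

    def apply(self, event: RunUpdated) -> bool:
        """Replica side: apply an event only if it is newer than what we hold."""
        current = self.runs.get(event.run_key)
        if current is not None and current.version >= event.version:
            return False  # stale or duplicate delivery, ignore it
        self.runs[event.run_key] = event
        return True


bus: List[RunUpdated] = []
primary = ProvisioningStore("primary", bus)
replica = ProvisioningStore("replica")

key = "course-v1:DemoX+Demo101+2024"
primary.set_start(key, "2024-01-08T00:00:00Z")
primary.set_start(key, "2024-01-15T00:00:00Z")

# Deliver every event twice to show that duplicate delivery is harmless.
applied = [replica.apply(event) for event in bus + bus]
```

Because updates flow one way and replicas never publish, this shape avoids the kind of update cycle that bidirectional synchronization can produce.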

Appendix

Salesforce Object Manager

Field List

Field Editor

Jenna Makowski
May 23, 2023

Yes. 100%, from a product/Studio user (author, course team, etc) perspective.

This is the product proposal as outlined in the Studio Home Redesign: Product Brief:

  • Separate the UI workflows: 1) creating a new course and 2) creating a course run/re-run. This is guided by the idea that the personas responsible for courses are the instructional designers and SMEs, while the personas responsible for course runs are the course teams, instructors, and moderators. Sometimes they overlap, but many times they don't.

  • Disentangle the metadata types into two distinct categories of metadata that align with those two workflows: course metadata and course run metadata

    • Course metadata: Static fields that describe the course content or course experience and do not change from run to run, for example effort, learning objectives. These are currently positioned as discovery fields but should be owned by the content SMEs.

    • Course run metadata: Dynamic fields that change for each run, for example logistics data like course dates, enrollment dates.

 

Dave Ormsbee (Axim)
May 23, 2023

I had some questions in terms of how we’re thinking of Courses vs. Course Runs, as it relates to how we might tease this data apart.

A number of the older edX courses were originally direct adaptations of semester-long courses, but were then split up into two, three, or four separate courses, so “6.002x” became “6.002.1x”, “6.002.2x”, and “6.002.3x”. The content was pretty much identical, but just split out into more manageable chunks.

  1. Would you see these as entirely separate courses or different runs of the same course?

  2. If they’re entirely different courses, is there any sort of relationship between them? If so, what is that relationship called?

  3. CCX course content can vary wildly in scope from its parent, since the schedule can be completely changed and parts of the course hidden. Is each CCX a separate course, or a different type of course run? Or something else?