Libraries/Learning Core Arch Sync Notes
@Kyle McCormick @Dave Ormsbee (Axim)
- 1.1 Jan 13, 2026
- 1.2 Jan 6, 2026
- 2 Dec 16, 2025
- 3 Nov 25, 2025
- 4 Oct 21, 2025
- 5 Oct 7, 2025
- 6 Aug 19, 2025
- 7 Jul 29, 2025
- 8 Jul 22, 2025
- 9 Jun 17, 2025
- 10 Jun 10, 2025
- 10.1.1 Outline (45 min talk)
- 10.2 Outline Strawman:
- 11 Jun 3, 2025
- 11.1 May 15, 2025
- 12 2025-05-15
- 13 2025-05-06
- 14 2025-04-29
- 15 2025-04-02
- 16 2025-03-05
- 17 2025-02-05
- 18 2025-01-09
- 19 2024-12-18
- 20 Old Notes
- 20.1 Talk Proposal
- 20.1.1 Title
- 20.1.2 Description (<500 words)
- 20.1.3 Type
- 20.1.4 Target Audience
- 20.1.5 Proposal
- 20.1.6 Rough Talk Outline
- 20.1.7 Additional Notes
- 20.2 2024-11-20
- 20.3 2024-11-13
Jan 13, 2026
“ContainerType” (https://github.com/openedx/openedx-learning/issues/412)
edx-platform has developed the idea of each container having a single “ContainerType” – currently Unit, Subsection, or Section. https://github.com/openedx/edx-platform/blob/e6deac0cf12226c0b8d744ad17395373cfe0de03/openedx/core/djangoapps/content_libraries/api/container_metadata.py#L42
Do we want to actually support a Container having multiple “types”? E.g. can a Unit also be a ____ ?
If so, how should edx-platform change?
If not, can we codify the single-type restriction somehow?
Can we pull the idea of ContainerTypes into learning core?
Assumptions we could make:
Weaker: Every container has one type
In favor
Stronger: Every container is: Unit, Subsection, Section, (OutlineRoot)
Not in favor
Data model options
Put an actual field on the model for container_type?
we have this for components already. we also have the ComponentType model.
Somewhere we need to store the mapping of classes to OLX tags
How to register?
for components, it’s done thru xblock, and there’s a deterministic mapping between OLX tags and component_type names (blah ↔︎ xblock.v1:blah)
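A minimal sketch of the deterministic tag ↔︎ component_type mapping described above. The `xblock.v1` namespace comes from the notes; the function names are hypothetical and are not part of any existing Learning Core API.

```python
XBLOCK_NAMESPACE = "xblock.v1"  # namespace string taken from the note above


def component_type_for_olx_tag(olx_tag: str) -> str:
    """Map an OLX tag such as 'problem' to 'xblock.v1:problem' (hypothetical helper)."""
    return f"{XBLOCK_NAMESPACE}:{olx_tag}"


def olx_tag_for_component_type(component_type: str) -> str:
    """Inverse mapping; rejects names outside the xblock.v1 namespace."""
    namespace, sep, name = component_type.partition(":")
    if sep != ":" or namespace != XBLOCK_NAMESPACE or not name:
        raise ValueError(f"not an {XBLOCK_NAMESPACE} component type: {component_type!r}")
    return name
```

Because the mapping is a pure string transform, no registry lookup is needed for components. Containers would need something else, since no such naming rule exists for them yet.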
Case study: “Assessment”
What data does the Assessment have?
Assessment
(student scores)
AssessmentVersion
proctoring / timing info
a PubEnt that is the assessment’s content
Or a Container?
3 options:
Assessment → Container
Container → Assessment
?
this got interesting – see recording
Implementation
Similar to ComponentType, create a ContainerType table
Add container_type field and make a migration to backpopulate
Update Django admin (optional but nice)
Rather than testing children stuff on every single concrete builtin container, register a fake container type and run all the core Container tests on that. This is necessary now since we are disallowing “naked” Containers
class FakeContainerSubtype(Container) ← register it as fakecontainer
Also: https://github.com/openedx/openedx-learning/issues/308 may need adjusting
Remove all the now-unnecessary logic in edx-platform’s modulestore_migrator and content_libraries apps now that container_type is better codified in Learning Core
including select_related queries, like this: https://github.com/openedx/edx-platform/blob/e6deac0cf12226c0b8d744ad17395373cfe0de03/cms/djangoapps/modulestore_migrator/api/read_api.py#L51-L55
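The registration idea from the implementation list could look roughly like the sketch below. It uses plain Python classes as stand-ins for the Django models. `FakeContainerSubtype` and the `fakecontainer` code come from the notes; the decorator, the registry dict, and the `type_code` attribute are assumptions made for illustration.

```python
from typing import Dict, Type


class Container:
    """Stand-in for the Learning Core Container model (not the real Django model)."""

    type_code: str = ""  # empty means a "naked" Container, which is disallowed

    def __init__(self, children=None):
        if not self.type_code:
            raise TypeError("naked Containers are not allowed; use a registered subtype")
        self.children = list(children or [])


# Hypothetical registry: one type code per Container subclass.
_CONTAINER_TYPES: Dict[str, Type[Container]] = {}


def register_container_type(type_code: str):
    """Class decorator that assigns exactly one type code to a Container subclass."""
    def decorator(cls: Type[Container]) -> Type[Container]:
        if type_code in _CONTAINER_TYPES:
            raise ValueError(f"container type {type_code!r} is already registered")
        cls.type_code = type_code
        _CONTAINER_TYPES[type_code] = cls
        return cls
    return decorator


def get_container_class(type_code: str) -> Type[Container]:
    """Look up the class registered for a type code."""
    return _CONTAINER_TYPES[type_code]


@register_container_type("fakecontainer")
class FakeContainerSubtype(Container):
    """Test-only type used to exercise generic Container behaviour."""
```

With this shape, the core Container tests can run against `FakeContainerSubtype` alone, and each builtin type (Unit, Subsection, Section) only needs tests for its own rules.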
Jan 6, 2026
Proposal: https://github.com/openedx/openedx-learning/pull/454
We already require release-to-release migration
Investigate squashed migrations for going across app and bootstrapping
authoring.subapps instead of applets?
Taxonomies as PublishableEntities
Braden: Could we feasibly store it as a blob and not do the full versioning, or keep just the current draft/published versions to allow foreign keys to content?
Sam wants to implement a feature for bulk publishing--does our data model support this?
Dave thinks this is doable. Will look into making that query.
Dec 16, 2025
https://openedx.atlassian.net/wiki/spaces/OEPM/pages/5381390339
two layers of pluggability / two axes
granularity of content (course run as a node, unit as a node)
flexibility of completion criteria (all the things, some subset of things)
mary from unicon was talking about different kinds of criteria
v1: a fairly powerful datamodel with ands and ors on nodes
unicon proposal focuses on tags
tags map to competencies
can aggregate tags together
“you completed 3/5 things that represent some concept, which then rolls up into some greater concept”
we could bake everything on that notion
have competencies as the lowest common denominator type thing
it would be good to keep these axes independent of one another
we have to support CBEs, but it should also be agnostic
you should be able to say “complete these 6 courses” without involving competencies
“X is in the pathway”, without encoding how X is part of the pathway’s completion
braden: what is CBE in this context?
dave: evaluate what you know rather than what you do
being able to assess what you understand
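One way to picture "ands and ors on nodes" plus the "3/5 things" rollup is a small criteria tree. This is a sketch under assumptions: the class and function names are invented here, and it only models completion criteria, not competencies or tags. Leaf keys are opaque strings, so the granularity axis (course run vs. unit) stays independent of the criteria axis.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Item:
    """Leaf node: one piece of content at any granularity (course run, unit, ...)."""
    key: str


@dataclass(frozen=True)
class AtLeast:
    """Satisfied when at least `required` of `children` are satisfied."""
    required: int
    children: Tuple["Criterion", ...]


Criterion = Union[Item, AtLeast]


def all_of(*children: Criterion) -> AtLeast:
    """Logical AND: every child must be satisfied."""
    return AtLeast(len(children), children)


def any_of(*children: Criterion) -> AtLeast:
    """Logical OR: one satisfied child is enough."""
    return AtLeast(1, children)


def is_satisfied(criterion: Criterion, completed: FrozenSet[str]) -> bool:
    """Evaluate a criteria tree against the set of completed item keys."""
    if isinstance(criterion, Item):
        return criterion.key in completed
    met = sum(is_satisfied(child, completed) for child in criterion.children)
    return met >= criterion.required
```

For example, "complete these 6 courses" is `all_of(...)` over six `Item` nodes, with no competencies involved, while "3 of these 5 things" is `AtLeast(3, ...)` nested inside a larger tree.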
Nov 25, 2025
Kyle: Probably able to get FC money to push this along, but haven’t gotten that yet.
Dave: Is it worth considering alternatives to Pydantic for schema validation?
No.
Dave: Thought was that validation mechanism would be converting to JSON.
Multiple layers of validation--e.g. containers and components vs. more specific things that need plugins.
Plugin Examples:
Annotating items with notes/discussion. Could be limited to Studio (author notes).
PublishableEntity → messages
Frontend: Tab entry
Workflow
(see how far we can go with just tags)
Does this replace Asides?
AI Content Generation
Individual XBlocks
VideoBlock
ProblemBlock
Is it okay if we have field data that is redundant with OLX?
Braden: Okay, as long as source of truth is clear.
@Dave Ormsbee (Axim) to look at how to discourage direct model writes.
Braden: Let’s make sure there’s a solid justification for making this, i.e. we could just make a JSON blob of metadata for plugins to play with.
High risk areas / things that need a lot of deliberation:
https://github.com/openedx/openedx-learning/issues/435
Braden: Would be good to identify plugin/application use cases where they could use purely Learning Core as a library and do something useful without having to use edx-platform. e.g. analyze content and produce a report.
Kyle: For Verawood/LC 1.0, focus on edx-platform plugins, not necessarily LC plugins.
Should we have OLX parsing for containers in LC?
Braden: Are we going to keep components separate from XBlock?
Kyle: XBlock as a layer over Component, but move as much of the common stuff into Component.
Leave XBlock as an escape hatch
Would be nice if the core “Component” API can handle basic things like display_name, scores (max_score, is_scorable), (tags?), even parsing them from the OLX if needed but never invoking/requiring xblock plugins (for specific xblock types). This should be doable since the outer OLX tag is very consistent with how it handles these mixin attributes, even though the inner content can vary wildly among different xblock types.
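A rough sketch of reading common fields from the outer OLX tag without loading any XBlock plugin. `display_name` is a standard OLX attribute. Treating `max_score` as an attribute on the outer tag is an assumption made for illustration, and the dataclass and function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommonFields:
    olx_tag: str
    display_name: Optional[str]
    max_score: Optional[float]  # assumed attribute name, for illustration only


def parse_common_fields(olx: str) -> CommonFields:
    """Read only the outer tag's attributes; the inner content is never interpreted."""
    root = ET.fromstring(olx)
    raw_max = root.get("max_score")
    return CommonFields(
        olx_tag=root.tag,
        display_name=root.get("display_name"),
        max_score=float(raw_max) if raw_max is not None else None,
    )
```

The inner markup is parsed as XML but otherwise ignored, which matches the observation that the outer tag is consistent even when the inner content varies widely between XBlock types.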
Oct 21, 2025
python_lib.zip in v2 libs
ideas
ability to load a python_lib.zip in a component in the Advanced section
conclusion: this would cause pain for import/export/backup/restore. Too hacky to support long term, don’t do this.
what is a code library?
a component? no, we want it to belong to other components (problems)
an asset? no, we’d like versioning
maybe something that works with zip archives, which we can store cheaply. (want to avoid the high overhead of having many versioned files given how we store version-name mappings)
backport?
yes
Important note: v1 libraries don’t copy python_lib.zip either. We might be able to get away with creating a new Resource type that works how we want within libraries, but relies on authors to upload the python_lib.zip file to their course. (Or maybe we let them select one python_lib.zip from their course to sync over?) It would let us side-step the “every library problem could have a different version of a python_lib.zip file to store” issue.
Oct 7, 2025
Reviewed and updated https://github.com/openedx/openedx-learning/issues/353
Aug 19, 2025
Efficient Container Hierarchy Traversal · Issue #360 · openedx/openedx-learning
Braden: general agreement on table structure and
Discussion of the justification for outline root being special (fixed location, expectations, ordering, simple select of everything in a course). Breaks assumption that parent entity is n+1.
Special, nullable root field.
Why component vs. entity?
tighter focus
ability to
But higher levels, could be containers/container versions instead. more correct, db can verify
Schema: special fields for the leaf node and the root node, container references for intermediate nodes
(will need to join against xblock field data for inheritance calculation)
Dynamic children
Braden: If we have randomized content, where does it get represented?
Kyle: Where I’ve been leaning lately is that these Selectors exist outside the hierarchy and have metadata that affects how you’d turn an authored hierarchy into a user’s hierarchy.
Braden’s concerns: Are we going to have randomized content blocks with 10K+ children? Will it break this? If people naively look at this API, they might see these children and mistake them. How do we keep API users from misinterpreting this?
Can we make pointers to DAG’d problems, something lightweight?
Followup: talk to product/UX about DAGs in courses.
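To make the "Selectors exist outside the hierarchy" idea concrete, here is a sketch in which the authored hierarchy is a plain mapping and a separate selector object decides which children a given user sees. Everything here is hypothetical (the `RandomSelector` name, its fields, and the seeding scheme); it is not a description of any agreed design.

```python
import hashlib
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RandomSelector:
    """Hypothetical selector: show `count` of a container's authored children."""
    container_key: str
    count: int


def children_for_user(
    authored_children: Dict[str, List[str]],
    selector: RandomSelector,
    user_id: int,
) -> List[str]:
    """Turn one container's authored children into a single user's children."""
    pool = authored_children[selector.container_key]
    # Seed from user and container so the same user always gets the same picks.
    seed_bytes = hashlib.sha256(f"{user_id}:{selector.container_key}".encode()).digest()
    rng = random.Random(int.from_bytes(seed_bytes[:8], "big"))
    chosen = set(rng.sample(pool, min(selector.count, len(pool))))
    # Preserve the authored order in the result.
    return [child for child in pool if child in chosen]
```

Because the selector lives beside the hierarchy rather than inside it, the authored parent-child data stays static, and the per-user result can still be compiled into ordinary parent-child relationships when needed.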
Jul 29, 2025
Dynamic Children
Do we try to unify the user partition functionality with dynamic child selection, or keep them separate?
Do we build a separate model for more efficiently storing hierarchy for rapid traversal?
(We really didn’t take notes well this session.)
Jul 22, 2025
CCX in learning core
Decision: Move CCX to openedx/ so it can be shared by LMS and CMS. But make sure that LMS can’t write to modulestore via the CCX wrapper.
Implications for permissions requirements?
Future
CCXs are PEs within a LearningPackage, the same LearningPackage of the CourseRun that they are based on
Pearson (along with others) want to have a more flexible level of customization.
MIT’s CCX use case is more restrictive, intentional limitations of customization to scheduling and hiding particular content.
It’s possible that we can implement the more flexible use case while having a list of customizable things, and then the MIT use case can be covered by disabling certain allowed customizations.
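The "list of customizable things" approach could be sketched as an allowlist with per-deployment opt-outs. The customization names below are invented for illustration (only scheduling and hiding content are mentioned in the notes), and the functions are hypothetical.

```python
from typing import FrozenSet, Iterable

# Hypothetical customization names; only scheduling and hiding content come from the notes.
ALL_CUSTOMIZATIONS: FrozenSet[str] = frozenset(
    {"schedule", "hide_content", "edit_content", "grading"}
)


def allowed_customizations(disabled: Iterable[str] = ()) -> FrozenSet[str]:
    """Start from the full list and remove whatever a deployment disables."""
    disabled_set = frozenset(disabled)
    unknown = disabled_set - ALL_CUSTOMIZATIONS
    if unknown:
        raise ValueError(f"unknown customizations: {sorted(unknown)}")
    return ALL_CUSTOMIZATIONS - disabled_set


def check_override(kind: str, allowed: FrozenSet[str]) -> None:
    """Raise if a CCX tries to use a customization that is not allowed."""
    if kind not in allowed:
        raise PermissionError(f"customization {kind!r} is disabled for this CCX")
```

Under these assumptions, the restrictive use case is `allowed_customizations(disabled=["edit_content", "grading"])`, while the flexible use case disables nothing.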
Retro: Lessons Learned from the Prototype
Learning Core → ModuleStore Shim
It looks workable.
We should store Course Usage Keys separately from the PublishableEntity. I first thought this was a compromise, but I’ve come around to the idea that it makes perfect sense, since those usage keys are very much an XBlock runtime concern, and the XBlock runtime app can control that mapping and the constraints on it.
The definition doc envelopes are easy enough to generate on the fly.
We do need the PublishLog and proper side-effect tracking in order to provide a real value for subtree_edited_on, because that’s used for caching and other comparisons.
There are structure doc fields that aren’t necessary to fully preserve (e.g. what version initially created this thing), and would only be used for historical comparisons to other structure docs that won’t exist.
We do need to create branch awareness for preview purposes (this is more a reminder to myself).
We can get into a weird state if we let Modulestore try to edit courses that are being shimmed because the structure writes are thrown away.
We should check our structure caching with CCX to make sure it’ll work correctly (it’s a bit broken in split today).
Whatever we do for dynamic children needs to be able to compile out into parent-child relationships for the purposes of the shim.
Maybe we hide the supporting/weird blocks? Some of these look like they’re just bugs, so it’s probably worth figuring out what’s going on and maybe remove their usage in the course editing code.
Import of course data
We should batch search indexing (on the DraftChangeSets?)
Search indexing in general needs a bunch of improvements right now
While using migrator, it’s possible to get into a corrupt data state that is irrecoverable. Example: having a section and adding its container versions without adding related section versions. May be other places where we have app level constraints that aren’t reflected by database constraints.
Many ways for data to become inconsistent
Want clean separation on platform vs. learning core concerns. Not always clear when to put stuff in one place or another.
Django Admin is super-helpful when you build it out.
Modulestore Migrator mixes the libraries API, the XBlock API, and the Learning Core APIs. It currently has to go through the libraries API so that we don’t miss upstream library behavior (like indexing), but we should make it safe to use the authoring API directly so that this bubbles up.
Events need to get into Learning Core. Hard to make sure it’s consistent otherwise.
What is a plugin that you could test with just Learning Core?
Discussions, where you need references to content and configuration that’s in content, but it has its own data and views.
ProblemBlock if there were a minimal xblock runtime
ORA2, sophisticated models
VideoBlock, being able to add VAL-like data
Would be great to have a minimal XBlock runtime that is used by edx-platform as well as xblock-sdk envs.
We don’t have a good story for Asides support.
Formally Deprecate XBlock after we figure out a good plan for dynamic children.
Need to measure data accumulation / pruning needs.
Are there other big unknowns we need to figure out aside from dynamic children?
userpartitions (part of selectors effort?)
Catalog course? (Not in authoring, anyway)
Mostly have static assets figured out?
Other, not-really-XBlock things:
Grading Policy
Scheduling?
Re-runs. e.g., whether we redo the versioning data model to allow for re-runs to reuse more componentversions. Or whether we add something lighter-weight. “branches”?
Keys, uuids, pks
Is there a faster path to support courses in Learning Core keeping the existing Studio UI exactly as-is?
Would it be worth it?
Jun 17, 2025
Kyle:
Migration tool and Django admin for outline roots
Braden
Mostly been finishing up other projects behind schedule
Will work on slide
Dave
Pushing basic structures to LC->Split shim works, but defs still coming from MongoDB right now.
Jun 10, 2025
Dave
Got Piotr started on the rendered JSON
Hack for Learning Core course key mapping: run starts with “LC”
Kyle
Import API:
feat!: modulestore_migrator by kdmccormick · Pull Request #36873 · openedx/edx-platform (focusing on this over dynamic content)
For integration:
Prototype models for Courses and Outline Roots in Learning Core by bradenmacdonald · Pull Request #316 · openedx/openedx-learning
What we submitted: https://sessionize.com/app/speaker/session/799693
Original Text of submission
The new Libraries experience introduced in Sumac stores content using Learning Core–a new, more efficient, and more extensible successor to the MongoDB-backed ModuleStore backend currently used for courses and legacy libraries. Learning Core offers tremendous benefits to operators and developers alike, but we must migrate our course content in order to fully realize those benefits.
We will explain these benefits in detail, propose a migration process, and explore the longer term implications.
Primarily, we want to communicate the benefits for site operators who undertake this migration. Secondarily, we want to touch on some nuances of the migration that developers may be interested in, providing them the knowledge and resources to learn more outside of the talk.
Short-term benefits (pre migration of courses) include a stable plugin API for library content authoring. We hope to include a reference plugin that enhances the library authoring experience in some way– for example, a plugin that displays version history for all library components. We would like to explain how this new plugin API dovetails with other existing Open edX extension points like Events, Filters, Slots, and XBlock.
Medium-term benefits (immediately post migration of courses) include: removal of MongoDB as platform dependency, a stable plugin API for learning components and assets, better Files & Uploads experience for course authors including versioning and searching, better content inspection and querying for administrators, more efficient serving of assets, reduced storage needs, and reduced memory overhead.
Long-term benefits include: stable plugin APIs for authoring units and sequences, user partitioning, and other learner-content interactions; ability to offer enrollable content outside of the traditional 3-level Open edX course hierarchy; massively simplified edx-platform maintenance; and better unit test data for edx-platform developers and plugin developers.
We also want to discuss backwards compatibility. We expect to be fully backwards-compatible with all Studio content and most-if-not-all OLX-authored content, with some caveats where compatibility is at odds with content security. For Open edX plugins which access ModuleStore today, we expect some of them to continue working, and others to break; we will go into more detail on the distinction between those two categories.
Outline (45 min talk)
Outline Brainstorming
Kyle: What would it be like to run LMS without MongoDB?
(Demo)
Here’s why you can’t do that in Teak
Talk about migration.
Braden: A lot of folks have ModuleStore understanding, vaguely know LC. Main point would be migration process, timeline
Kyle: Why is this additive, not just subtractive. Powerful plugin API.
Braden: Example of properly integrating video information into libraries and not the mess we have with VAL today.
Kyle: Would be nice to add some data to content just to show we can do it.
Call to action?
Kyle: Upgrade
Braden: Once it’s stable enough, would love to have a Learning Core course that’s editable from Libraries and a part of the dev experience.
Dave: anything they can do to prepare their content for this?
Selling it
removing mongodb obvi
cost savings?
ztraboo’s post on gridfs - good case study on how much s3 would save operators
one less piece of infra
show off the improved data model?
can we spruce the admin interface up more? → @Kyle McCormick
libraries UI lets you see raw olx
extensibility in the future
Much easier to show/access history
Have one xblock that extends the model
talk about shimming and how that’ll smooth transition
Outline Strawman:
(Dave: A starting point for conversations about our talk outline. I don’t feel like it flows together very well at the moment, but let’s talk about it in the next session.)
What would it mean to run LMS without MongoDB?
What’s in our way?
Motivation
Cost reduction.
Platform simplification.
Transactions (good and bad (celery)).
Granular, extensible data model.
Concrete example here would be great, e.g. video or problemblock information, contrast with VAL.
We can show screenshots of existing Django admin functionality
How will this work?
Goals: Transition quickly while preserving backwards compatibility.
The LMS ModuleStore read-shim/compilation step.
Porting Studio to be able to write Courses in Learning Core.
Files and Uploads and where they’re stored.
Gradual porting of other systems to bypass ModuleStore, e.g. grades, course blocks API, outlines, CSM.
What changes will there be to the authoring experience?
Course-centric editing is not going away, though it may not be exactly the same as the current course experience.
We want to make it easier to bridge different levels/types of content, e.g. small courses,
Timeline? Call to action?
Other ideas:
Break this up by target persona? Students aren’t intended to notice any difference during the transition, but we can separately map out Course Authors, Developers, and Ops folks?
Jun 3, 2025
May 15, 2025
Talking git-ification:
We could add a version_num in a join table with the PE if necessary. Also depends on whether we want to restart history on re-run or not.
Cleanup gets harder because no direct fkey to PublishableEntity
Won’t happen automatically, but as long as there’s a join table for entity version ↔︎ entities. So look for versions that aren’t referenced (have cascade deletion for the entity)
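The cleanup idea above (versions with no direct foreign key to an entity, found by looking for rows the join table no longer references) can be sketched with SQLite. Table and column names other than `version_num` are invented for illustration, and the real models would be Django models rather than raw SQL.

```python
import sqlite3
from typing import List


def make_schema() -> sqlite3.Connection:
    """Create hypothetical entity, version, and join tables in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for ON DELETE CASCADE
    conn.executescript("""
        CREATE TABLE entity (id INTEGER PRIMARY KEY);
        CREATE TABLE entity_version (id INTEGER PRIMARY KEY);
        -- Join table: which entity uses which version, and under what number.
        CREATE TABLE entity_version_link (
            entity_id INTEGER NOT NULL REFERENCES entity(id) ON DELETE CASCADE,
            version_id INTEGER NOT NULL REFERENCES entity_version(id),
            version_num INTEGER NOT NULL,
            UNIQUE (entity_id, version_num)
        );
    """)
    return conn


def unreferenced_version_ids(conn: sqlite3.Connection) -> List[int]:
    """Versions that no entity links to any more; these are cleanup candidates."""
    rows = conn.execute("""
        SELECT v.id
        FROM entity_version AS v
        LEFT JOIN entity_version_link AS l ON l.version_id = v.id
        WHERE l.version_id IS NULL
        ORDER BY v.id
    """)
    return [row[0] for row in rows]
```

Deleting an entity cascades to its link rows only. Any version shared with another entity (for example a re-run) stays referenced, and only the versions left without links show up in the cleanup query.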
Where should the CatalogCourse and Course live? Separate provisioning repo? Separate package? Want to be able to provision it before content exists potentially.
Kyle: openedx-learning should stop short of any catalog understanding, separating content from how people find and get access to content.
Dave went over Explicitly modeling publishing dependencies · Issue #317 · openedx/openedx-learning
2025-05-15
@Braden MacDonald opened a PR for OutlineRoot
Why did we want to put CourseRun in edx-platform?
Dave: dependencies we’d need to pull into learning core to have CourseRun there…
cohorts
grading policy
etc