What is (and is not) modulestore?

In the process of trying to extract the modulestore (and "closely related code") from edx-platform, I have repeatedly run into trouble defining the scope of work in a way that made sense to everyone following it. I think several of my problems boiled down to two rookie mistakes: not knowing a) what, exactly, I was moving and b) who the customer for the move is and what they want to gain from it.

My initial understanding was that "modulestore" was everything in edx-platform/common/lib/xmodule/xmodule/modulestore. Once I understood more about how it worked, and especially how it was being tested, I bumped the target code up to everything in edx-platform/common/lib/xmodule/. Neither of these was precisely correct. Not really knowing what conceptually constituted "modulestore", I've flailed a fair amount, attempting to remove as much as possible on the advice of folks who have a lot more experience here than I do. I think in the end modulestore has grown beyond its original design and taken on some functionality it probably shouldn't have. Here is my best attempt at defining what modulestore is, what it does, and how it does it. Hopefully this will inform decisions about what it should do, and therefore which code might move to a new repo and which should stay.

What is modulestore?

  • Modulestore is a Python package that stores course XModule and XBlock data for edx-platform.
  • Much of the code for this lives at edx-platform/common/lib/xmodule/xmodule/modulestore, but there is also a fair amount of tightly bound code at the edx-platform/common/lib/xmodule/ level and there are code-level dependencies in a few other locations inside edx-platform.
  • For our purposes here, "modulestore" is the code in the modulestore directory as well as its tightly bound dependencies in common/lib/xmodule/.

What does modulestore do?

  • Stores course-related XModule and XBlock data in three different backends
    • OLX (deprecated, but still used for import/export and testing)
    • Old mongo (deprecated, but still in use)
    • Split mongo
  • Caches that data
  • Presents a key/value store interface to that data
  • Implements a draft / published versioning system for courses
  • Provides facilities for migrating course data from one modulestore backend to another
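To make the "key/value store" and "draft / published" bullets a bit more concrete, here is a minimal sketch of the idea. Everything in it (ToyModuleStore, Branch, the method names, the string block ids) is hypothetical and invented for illustration; it is not the edx-platform API and it ignores caching, backends, and course structure entirely.

```python
from enum import Enum


class Branch(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"


class ToyModuleStore:
    """Hypothetical stand-in for illustration; not the real modulestore API."""

    def __init__(self):
        # One dict per branch, keyed by a block identifier string.
        self._data = {Branch.DRAFT: {}, Branch.PUBLISHED: {}}

    def set(self, block_id, fields):
        # Edits always land on the draft branch.
        self._data[Branch.DRAFT][block_id] = dict(fields)

    def get(self, block_id, branch=Branch.DRAFT):
        # Raises KeyError if the block does not exist on that branch.
        return self._data[branch][block_id]

    def publish(self, block_id):
        # Copy the current draft over the published version.
        draft = self._data[Branch.DRAFT][block_id]
        self._data[Branch.PUBLISHED][block_id] = dict(draft)

    def has_changes(self, block_id):
        # True when the draft differs from what learners would see.
        draft = self._data[Branch.DRAFT].get(block_id)
        published = self._data[Branch.PUBLISHED].get(block_id)
        return draft != published
```

The point of the sketch is only that callers see a simple get/set interface while the store decides which version (draft or published) a read or write touches.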

Other things...?

  • contentstore / assetstore allow storing static assets in split Mongo; their code is tightly bound with modulestore's
  • The "raw" module is the default class type in modulestore. Does it make sense for modulestore to own that?
  • OLX import / export?
    • For this to work, modulestore needs access to all XModules / XBlocks referenced in the OLX. Does that create a dependency on edx-platform?
  • "Bundled" functionality?
    • inheritance?
    • sequence module?
    • course module?
    • tabs?
    • error handling?

Detailed breakdown of files as I get to them...

Functionality that is in edx-platform/common/lib/xmodule/xmodule/modulestore:

  • mongo/* ("old mongo")
    • Stores xmodules as a single document along with definitions and children
    • Has draft/published functionality
    • Implements a KVS
    • Implements a caching descriptor system
    • Implements bulk ops to prevent the cache from being cleared repeatedly during a publish
    • Implements a modulestore which:
      • Implements inheritance at the Mongo level
      • Implements draft / publish at the Mongo level
      • Allows CRUD access to courses, asset metadata, module data, etc.
    • Implements some Mongo management
  • split_mongo/* ("split mongo")
    • All of the same things as old mongo, but in the split methodology. Plus:
    • Implements a lazy loader for xblocks
    • Implements a cached id manager for split mongo
    • Implements a "write only dirty" pattern
    • Stores library data as well
  • xml.py / xml_importer.py / xml_exporter.py
    • Handles importing / exporting courses from OLX on disk
  • A handful of other utilities / helpers:
    • Django signals helper
    • Versioning helper for draft modulestores
    • edit_info mixin
    • custom exceptions
    • additional inheritance code
    • MixedModuleStore for aggregating across different modulestores, including across old and split instances
    • Helper functions for dealing with settings
    • Custom field types for mongoengine
    • Search helpers
    • Helper for migrating xml or old mongo modulestore instances to split
    • A handful of other utils

That functionality is what I would personally deem "modulestore": a pretty flexible way of storing module data for course trees.
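As one example of the kind of storage-level concern in the list above, here is a rough sketch of what a "write only dirty" pattern generally looks like. This is my reading of the name, not a description of the actual split mongo code; DirtyTrackingStore and its methods are hypothetical, and a plain dict stands in for the Mongo backend.

```python
class DirtyTrackingStore:
    """Hypothetical illustration of "write only dirty"; not edx-platform code."""

    def __init__(self, backend):
        self._backend = backend  # any dict-like object standing in for Mongo
        self._cache = {}         # values read or written during this session
        self._dirty = set()      # keys modified since the last flush

    def get(self, key):
        # Reads populate the cache but never mark anything dirty.
        if key not in self._cache:
            self._cache[key] = self._backend[key]
        return self._cache[key]

    def set(self, key, value):
        # Writes go to the cache and are remembered for the next flush.
        self._cache[key] = value
        self._dirty.add(key)

    def flush(self):
        # Persist only the keys that actually changed; return them for inspection.
        written = sorted(self._dirty)
        for key in written:
            self._backend[key] = self._cache[key]
        self._dirty.clear()
        return written
```

With a backend of {"a": 1, "b": 2}, reading "a" and setting "b" means flush() writes only "b"; a second flush() writes nothing.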

Other things that live in edx-platform/common/lib/xmodule/ and edx-platform/common/lib/xmodule/xmodule:

  • assetstore
  • contentstore
  • bunches of test files
  • templates, css, and js for xmodules / xblocks in those dirs
  • user partitioning code
  • some random util methods
  • many things called "modules", only some of which are xmodules:
    • video_module
    • many things dealing with annotation
    • backcompat_module
      • Smooths the transition from old XML modulestore formats to newer ones
    • capa stuff
    • XModuleMixin and CourseOverview shared util code
    • conditional_module
    • course_metadata_utils
    • course_module
    • editing_module
    • edx_notes
    • error_module
    • 38 more files....

Places where modulestore (edx-platform/common/lib/xmodule/xmodule/modulestore) calls up to "higher level" code:

  • mongo/base.py
    • from xmodule.assetstore import AssetMetadata, CourseAssetsFromStorage
      from xmodule.course_module import CourseSummary
      from xmodule.error_module import ErrorDescriptor
      from xmodule.errortracker import null_error_tracker, exc_info_to_str
      from xmodule.exceptions import HeartbeatFailure
      from xmodule.mako_module import MakoDescriptorSystem
      from xmodule.mongo_utils import connect_to_mongodb, create_collection_index
      from xmodule.partitions.partitions_service import PartitionService
      from xmodule.services import SettingsService
  • split_mongo/caching_descriptor_system.py
    • from xmodule.library_tools import LibraryToolsService
      from xmodule.mako_module import MakoDescriptorSystem
      from xmodule.error_module import ErrorDescriptor
      from xmodule.errortracker import exc_info_to_str
      from xmodule.x_module import XModuleMixin
  • split_mongo/mongo_connection.py
    • from xmodule.exceptions import HeartbeatFailure
      from xmodule.mongo_utils import connect_to_mongodb, create_collection_index
  • split_mongo/split.py
    • from xmodule.course_module import CourseSummary
      from xmodule.errortracker import null_error_tracker
      from ..exceptions import ItemNotFoundError
      from xmodule.partitions.partitions_service import PartitionService
      from xmodule.error_module import ErrorDescriptor
      from xmodule.assetstore import AssetMetadata
  • split_mongo/split_draft.py
    • from xmodule.exceptions import InvalidVersionError
  • __init__.py
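
A list like the one above can be produced mechanically rather than by hand. The following is a minimal sketch using Python's standard ast module; the function name upward_imports and its parameters are my own invention. It assumes that relative imports (such as "from ..exceptions import ItemNotFoundError") stay inside the modulestore package and can be skipped, which would be wrong for a relative import deep enough to climb out of it.

```python
import ast


def upward_imports(source, own_package="xmodule.modulestore", parent="xmodule"):
    """Return sorted module names imported from `parent` but outside `own_package`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import. Assumption: those stay inside
            # the package, so they are not counted as "calling up".
            if node.level or node.module is None:
                continue
            names = [node.module]
        elif isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        else:
            continue
        for name in names:
            in_parent = name == parent or name.startswith(parent + ".")
            in_own = name == own_package or name.startswith(own_package + ".")
            if in_parent and not in_own:
                found.add(name)
    return sorted(found)


# Sample input modeled on the imports listed above.
sample = """
from xmodule.exceptions import HeartbeatFailure
from xmodule.mongo_utils import connect_to_mongodb, create_collection_index
from xmodule.modulestore.exceptions import ItemNotFoundError
from ..exceptions import ItemNotFoundError
import logging
"""

print(upward_imports(sample))  # ['xmodule.exceptions', 'xmodule.mongo_utils']
```

Running this over each .py file under the modulestore directory would give a per-file inventory of the dependencies that would have to move, be duplicated, or be inverted before the code could live in its own repo.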