Migrating to Persistent Grading

Background

A learner’s grades are calculated by aggregating scores across individual XBlock usages. Originally, grades were recalculated from scores on demand, every single time they were needed. This caused serious performance limitations around grade display, editing, and reporting.

Starting in Hawthorn, persistent grading could be turned on with the PersistentGradesEnabledFlag. When it was enabled, grades were persisted in dedicated MySQL tables and were only recalculated from scores when necessary. This improved performance and unlocked the development of in-demand features like the writable gradebook, but the system was still opt-in for each Open edX instance. When an instance operator enabled persistent grades, they also needed to run the backfill (described below) to populate the new tables.

Starting in Olive, persistent grades are enabled by default, and there is no option to disable them. This simplifies LMS feature development, but it does require that all operators who have not already enabled persistent grades run the backfill (described below) as part of their Olive upgrade. The backfill can be run before or after upgrading to Olive.

The Persistent Grades Backfill

Here's how we did it for edx.org. Note that you could do things somewhat differently; for example, you could backfill grades for a subset of courses at a time (using the --courses flag) instead of for all your courses at once.

The compute_grades management command is the entry point for backfilling persistent course grades. It works against a list of course ids, generating Celery tasks over chunks of course enrollments. For example, when backfilling a single course with a chunk size of 100, the first generated task creates persistent grade records for enrollments 0-99 of the course, the next task handles enrollments 100-199, and so on. This chunking helps spread load. In addition, the order of the generated tasks is randomized, so when backfilling more than one course, tasks for different courses are interspersed. This gives a more even load among task workers, reduces spikes on workers, and makes the overall time to completion more predictable.
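The chunk-and-shuffle idea described above can be sketched in a few lines of Python. This is only an illustration, not the actual compute_grades implementation: the function name plan_backfill_tasks, the (course id, offset, batch size) tuple shape, and the example course ids are all hypothetical, and the enrollment counts are passed in directly rather than read from the database.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical description of one backfill task:
# (course_id, offset of the first enrollment, number of enrollments).
TaskSpec = Tuple[str, int, int]


def plan_backfill_tasks(
    enrollment_counts: Dict[str, int],
    batch_size: int = 100,
    rng: Optional[random.Random] = None,
) -> List[TaskSpec]:
    """Split each course's enrollments into chunks, then shuffle the chunks."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    tasks: List[TaskSpec] = []
    for course_id, count in enrollment_counts.items():
        # Offsets 0, batch_size, 2 * batch_size, ... cover every enrollment.
        for offset in range(0, count, batch_size):
            tasks.append((course_id, offset, batch_size))

    # Shuffling intersperses chunks from different courses, so one large
    # course does not occupy the workers in a single contiguous burst.
    (rng or random).shuffle(tasks)
    return tasks


if __name__ == "__main__":
    # Made-up course ids and enrollment counts, for illustration only.
    planned = plan_backfill_tasks(
        {"course-v1:DemoX+A+2024": 250, "course-v1:DemoX+B+2024": 120}
    )
    for course_id, offset, size in planned:
        print(f"{course_id}: enrollments {offset}-{offset + size - 1}")
```

With these made-up numbers, the first course yields three chunks and the second yields two, and the five chunks come out in random order. The last chunk of a course may cover fewer enrollments than the batch size.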

The ComputeGradesSetting configuration model stores the list of course ids that the compute_grades management command backfills. It also stores the chunk size (i.e. the number of enrollments) for each backfill Celery task.

Running the Backfill

  • In your LMS instance, go to /admin/grades/computegradessetting/ and create a new instance of this model, specifying the per-task enrollment batch size (defaults to 100) and the course ids to backfill.

  • Now it's time to run your backfill job. Note that this could take quite a while to complete, so we suggest using something like Jenkins to actually invoke the command.

  • Run the command from an LMS installation as follows:

    python manage.py lms compute_grades -v1 --settings=production --from_settings --routing_key=edx.lms.core.grades_backfill



  • The --from_settings flag (spelled with an underscore, as in the command above) specifies that the command should use the batch size and course id list from the latest ComputeGradesSetting configuration model.
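If the command is invoked from a job script (for example, one run by Jenkins), the argument list can be assembled programmatically. The sketch below is a hypothetical helper, not part of edx-platform. It only uses flags that appear on this page, and it assumes that --courses accepts one or more course ids as separate arguments; verify this against your release before relying on it.

```python
from typing import List, Optional, Sequence


def build_compute_grades_command(
    courses: Optional[Sequence[str]] = None,
    settings: str = "production",
    routing_key: str = "edx.lms.core.grades_backfill",
) -> List[str]:
    """Build the argv list for a compute_grades backfill run.

    With no courses given, the command reads the batch size and course
    list from the latest ComputeGradesSetting (--from_settings).
    """
    cmd = [
        "python", "manage.py", "lms", "compute_grades",
        "-v1", f"--settings={settings}",
    ]
    if courses:
        # Assumption: --courses takes the course ids as separate arguments.
        cmd.append("--courses")
        cmd.extend(courses)
    else:
        cmd.append("--from_settings")
    cmd.append(f"--routing_key={routing_key}")
    return cmd


if __name__ == "__main__":
    # The course id below is made up for illustration.
    print(" ".join(build_compute_grades_command()))
    print(" ".join(build_compute_grades_command(["course-v1:DemoX+A+2024"])))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the LMS installation directory, so that a failed run also fails the surrounding job.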

Old Configuration Details

These are the toggles that controlled persistent grading from Hawthorn to Nutmeg. They have been removed in Olive.

  • Enabling persistent grades via PersistentGradesEnabledFlag in Django admin:

    • <LMS_ROOT>/admin/grades/persistentgradesenabledflag/ - when configuring for all Courses

    • <LMS_ROOT>/admin/grades/coursepersistentgradesflag/ - only if you need to configure for specific Courses (e.g., for staged rollout purposes)

    • Persistent grading must be enabled in order to run the backfill.

  • How to treat missing grades:

    • grades.assume_zero_grade_if_absent waffle switch at <LMS_ROOT>/admin/waffle/switch/