Robust Grades Design


This document provides motivation, background, and design for creating a more robust and reliable grading subsystem in the edX platform.

Motivation

As edX further invests in creating a viable platform for teaching and learning credit-worthy courses and degrees, a robust grading framework is essential.  As noted in Grading @ duke.edu, accurate grading and prevention of cheating in a MOOC are necessary for making its "MOOC Credit" credible.  This document describes the next steps that the edX Teaching & Learning team plans to take on the path towards more reliable grades.

As we step back and design grading from the perspective of reliability and robustness, we need to ensure that we address the short-term issues faced by our users, including course teams, partners, and learners.  The Inventory of grade issues document gives a rough idea of the types of grading reliability issues that edX has faced over the last two years. This design attempts to address the issues tagged in that document as:

  • Hidden Content - Course teams want to hide subsections to prevent cheating, without affecting grades.
  • Changed Content - Edits to course content should not affect learners' grades, certificate status, credit eligibility, etc.
  • Certificates - Final course grades seen on learners' certificates should be accurate and reliable.

Executive Summary

The following table summarizes our plan for addressing the high-level use cases and grading-related issues identified to date.

| Use Case | Feature Design that satisfies the use case |
|---|---|
| Hide Content to prevent cheating | BlockContentViewPermission |
| Change of Content doesn't affect past grades | SavedSubsectionGrade; CourseAuthorManualRescoring; Course Version Locking (no longer needed) |
| Certificates retain accurate grades | PersistentCourseGrade |
| In exceptional cases, grades can be overridden | GradeOverrides |
| Event trail enables debugging of grade issues | Eventing/Logging of Grading Changes |

Background

Please see Grades Background.

Note: For a published description of the final architecture see Grades Architecture.

Requirements

Deferred goals

These are things we would like to do, but will defer until after the initial rollout of these changes.

  • Create a public API for accessing grading information.
  • Create a separate grading service, with grading-related logic pulled out of the monolith.

Future goals

These are things we have considered, but have no plans to pursue at the moment.

  • Modifying overall course grade computation policy to account for:
    • showing a cumulative percentage based on completion so far while discarding uncompleted assignments
      • e.g., If a learner gets a 100% on the 1st homework out of a potential 5 homeworks (yet to be released), the learner doesn't confusingly see only a 20% grade on homework.
    • requiring a minimum grade for each assignment type
      • e.g., A course author can specify a passing threshold on each exam, requiring each exam to be at least 60%, regardless of what the cumulative exam grade percentage is.
  • Consolidating/generalizing ORA's handling of content and grade versions with what's available for other problem blocks.
  • Allowing learners to view older locked-down versions of the course after content changes, per Course Version Locking.

Use Cases

Robust Grades

The importance of grades is increasing as we grow the number of courses offered for credit on the edX platform - edX, learners, and course staff want to have reliable grades that all parties can feel confident are accurate.

  • Course Team Use Cases: 

    • I want to be able to determine with confidence whether a learner has demonstrated mastery of the material in a course.

    • I want to modify content without having any impact on grades or certificates (unless specifically desired).

    • I want to hide assignments to maintain integrity and prevent cheating.

  • Learner Use Cases:

    • While I work through my course, I want to understand my grades and my progress towards completion.

    • I want to know that any time and anywhere I see my grade on edX.org, it is consistent and accurate.

    • I want to always have a record of my achievements in a course, so that I can share with potential employers or schools.

  • edX Staff Use Cases:

    • I want to feel confident that the grades we share with partners are truly accurate.

    • In Support, I want to be able to address any questions a learner has regarding their grades confidently.

Grade Exceptions

Like courses on campus, course authors want the ability to provide exceptions to individual learners. Course authors also want to maintain a record of when and why the exception was allowed, to ensure integrity.

  • Course Team Use Cases:

    • I want to provide an individual learner a grade override on an assignment for extenuating circumstances  

    • I want to provide an individual learner an extension on an assignment for extenuating circumstances

Scenarios

The scenarios in this section describe the expected behavior whenever an instructor changes something in an active course after learners have already submitted answers to problems.  These scenarios are not supported on the edX platform today (July, 2016), but will be supported once the changes explained in SavedSubsectionGrade and CourseAuthorManualConflictHandling are implemented.

Starting Condition

For the following scenarios, consider the following starting state:

  • A Subsection has two problems (P1, P2).  For each problem:
    • Maximum score (M) is 4.
    • Weight (W) is 5.
    • Total possible score (MW), Maximum * Weight, is 20.
  • A learner has attempted the two problems in the subsection, and gotten the following scores:
    • P1: Learner's Raw score (R) is 2, so Total score (T) is 10, since T = R*W.
    • P2: Learner's Raw score (R) is 0, so Total score (T) is 0.
  • The learner's Aggregated Subsection score (SA) is therefore (10 / 40), which is (sum of all Ts / sum of all MWs).
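
The arithmetic of this starting state can be checked with a few lines of Python. This is only a minimal sketch: the `Problem` class and the `subsection_score` helper are hypothetical names introduced for illustration and are not part of the edX platform.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    """Hypothetical container for one problem's scoring inputs."""
    raw_score: int  # R: the learner's raw score
    max_score: int  # M: the maximum raw score
    weight: int     # W: the problem's weight

    @property
    def total(self) -> int:
        """T = R * W"""
        return self.raw_score * self.weight

    @property
    def possible(self) -> int:
        """MW = M * W"""
        return self.max_score * self.weight


def subsection_score(problems):
    """SA as a pair: (sum of all Ts, sum of all MWs)."""
    earned = sum(p.total for p in problems)
    possible = sum(p.possible for p in problems)
    return earned, possible


p1 = Problem(raw_score=2, max_score=4, weight=5)
p2 = Problem(raw_score=0, max_score=4, weight=5)
earned, possible = subsection_score([p1, p2])
print(f"{earned} / {possible}")  # 10 / 40
```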

(Figure: visual depiction of the starting state; not reproduced here.)

Scenario 1: Problem Content Edited

In this scenario, the instructor edits the content of the first problem (P1) after the learner has already submitted a response to the problem.  The instructor decides whether or not to rescore the problem, and if so whether or not to rescore only if the learner "gains" (score is improved) from the rescore.

Scenario 2: Problem Weight Edited

In these two scenarios, the instructor edits the weight of the second problem (P2) after the learner has already submitted a response to the problem.  The weight is either increased or decreased.  Increasing the weight of P2 negatively affects the learner's score since the learner had gotten 0 points on it.  So the decision of "rescore if gain only" matters.  In contrast, decreasing the weight of P2 has a positive effect on the learner's score and so the "rescore if gain only" decision doesn't matter.
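
To make the two cases concrete, the following sketch recomputes the subsection score from the Starting Condition after each edit. The new weights (10 and 2) are assumed values chosen only for illustration, and `aggregate`, `percent`, and `rescore_if_gain` are hypothetical helper names, not platform functions.

```python
def aggregate(problems):
    """Return (earned, possible) for a list of (raw, maximum, weight) tuples."""
    earned = sum(raw * weight for raw, _maximum, weight in problems)
    possible = sum(maximum * weight for _raw, maximum, weight in problems)
    return earned, possible


def percent(score):
    """Convert an (earned, possible) pair to a fraction; 0.0 if nothing is possible."""
    earned, possible = score
    return earned / possible if possible else 0.0


def rescore_if_gain(saved, recomputed):
    """Keep the recomputed score only when it improves the learner's percentage."""
    return recomputed if percent(recomputed) > percent(saved) else saved


saved = aggregate([(2, 4, 5), (0, 4, 5)])       # starting state: (10, 40)

# Assumed edit: P2's weight is increased from 5 to 10.
increased = aggregate([(2, 4, 5), (0, 4, 10)])  # (10, 60)
# Assumed edit: P2's weight is decreased from 5 to 2.
decreased = aggregate([(2, 4, 5), (0, 4, 2)])   # (10, 28)

print(rescore_if_gain(saved, increased))  # (10, 40): the saved grade is kept
print(rescore_if_gain(saved, decreased))  # (10, 28): the improved grade is saved
```

With the increased weight, a plain rescore would lower the learner from 25% to about 17%, which is why the "rescore if gain only" choice matters in that case.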

Scenario 3: Problem Added

In this scenario, an instructor adds a problem (P3) after the learner has already submitted problems in the containing subsection.  Since adding a problem with a non-zero weight will always negatively affect any pre-recorded grades for the containing subsection, "rescore if gain only" will always revert back to the previously recorded grade.

Scenario 4: Problem Removed

In these scenarios, an instructor removes a problem that a learner has already submitted.  If the instructor chooses to rescore, the learner's grade may be negatively or positively impacted depending on which problem is removed.
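
Scenarios 3 and 4 can be worked through with the values from the Starting Condition. In this sketch the added problem P3 is assumed to have the same maximum (4) and weight (5) as the others and to be unattempted by the learner; `aggregate` is a hypothetical helper, not a platform function.

```python
def aggregate(problems):
    """Return (earned, possible) for a dict of name -> (raw, maximum, weight)."""
    earned = sum(raw * weight for raw, _maximum, weight in problems.values())
    possible = sum(maximum * weight for _raw, maximum, weight in problems.values())
    return earned, possible


start = {"P1": (2, 4, 5), "P2": (0, 4, 5)}

# Scenario 3: P3 is added and has not been attempted (raw score 0).
with_p3 = dict(start, P3=(0, 4, 5))

# Scenario 4: one of the already submitted problems is removed.
without_p1 = {name: p for name, p in start.items() if name != "P1"}
without_p2 = {name: p for name, p in start.items() if name != "P2"}

print(aggregate(start))       # (10, 40) -> 25%
print(aggregate(with_p3))     # (10, 60) -> lower than 25%
print(aggregate(without_p1))  # (0, 20)  -> 0%, a negative impact
print(aggregate(without_p2))  # (10, 20) -> 50%, a positive impact
```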

Scenario 5: Grading Policy changes

When an instructor makes any changes related to the course's grading policy, learners' grades previously persisted for a subsection are not automatically affected.  Consider each of the following cases separately:

  • Policy settings that impact the aggregated course grade but don't impact the aggregated subsection grade, since they are used "above" the subsection grade computation layer:
    • Grade Range policy changes
    • Assignment Type policy: Weights and Allowable drops
    • A subsection's designation to an assignment type
    • A subsection's setting of whether it is 'graded'
  • Problem settings that affect a problem's score.  Instructors can change these settings and then optionally decide whether or not to rescore the problem.  If they choose to rescore, the containing subsection's grade is updated as a side effect of updating the problem's score.  Otherwise, all saved subsection grades remain as they are.
    • A problem's weight
    • A problem's external grader configuration
    • A problem's individual grading policy - as currently supported by ORA's assessment configuration

Design

We expect that the following set of changes will collectively make the grading subsystem more robust and reliable going forward.

Block Content View Permission

Distinguish between

  1. hiding a block's content from the learner's current view (block is still available for grading and appears in course outline), and
  2. hiding the entire block so the learner has no access to it at all (block is not available anywhere in the courseware for the learner).

See further description in TNL-4896.

Persistent Subsection Grade

  • Let subsections be the granularity of saving grades for the following reasons:
    • Subsections are the granularity for managing grading policies, hiding content, specifying due dates, etc.
    • Subsections appear (with annotations for assignment type, graded, etc) in the left-hand navigation bar in the LMS.
    • Progress in the course is visualized and broken down by subsections.
    • Note: There is an existing Open edX fork that considers verticals, instead of subsections, as the granularity for grades.  So when implementing, look into parameterizing the component level at which grades are persisted.
  • Save checkpoints of Subsection Grades in a SQL table (not necessarily using the Submission API).
    • Save a subsection's grade whether or not the subsection is marked as "graded".
      • This way, grade persistence would be agnostic to "graded" designation changes.
      • Besides, progress information is displayed for ungraded subsections as well (as "practice" problems).
    • Save the total_weighted_raw_score and the total_weighted_max_score for the subsection as separate fields.
      • This way, each learner can have different maximum values if the maximum changed in subsequent edits of the course.
    • Each record would have the following columns: 
      • creation timestamp, edit timestamp, course edit version, user_id, course_id, content_id, total_raw_score, total_max_score
  • Update the subsection grade whenever a score of any problem within the subsection changes
    • There is no reason to wait for all problems in the subsection to be graded before persisting the subsection grade.
    • A problem score may change when:
      • A learner submits the problem.
      • An async grader grades the problem and updates the score (e.g., external python grader or ORA).
      • An instructor manually rescores a component, after
        • changing a weight of the problem
        • editing the problem, etc.
  • Version the saved subsection grade by also saving the edit version number of the course at the time the subsection grade was computed.  This edit version number is useful when debugging since each learner's saved grade will now be based on different edits of the course.

Data Model Notes

Note: See Grades Data Model (for published description).

  • For storage scalability, 
    • we will have a mutable table of saved subsection grades, where each row represents a user's grade for a subsection in a course.
    • we will not generalize this table to store other types of aggregate information - at least not for now until the data model has been more tested in production. 
  • For performance,
    • we will create indices for common query patterns, such as an index_together entry for (user_id, course_id).

| column | purpose | additional info |
|---|---|---|
| Identifiers: id, course_id, user_id, usage_key | uniqueness and identification | (course_id, user_id, usage_key) together form a unique identifier for each row in the database. Additionally, the id field is an automatically generated primary key for the table. The usage_key is the opaque key identifying the location of the Subsection. |
| Timestamps: created, modified, subtree_edited_timestamp | debugging & recovery | The modified timestamp allows us to find and recover from issues that were active during a limited timeframe. For example, find and reset all grades that were computed while bug X was alive in production. The subtree_edited_timestamp allows us to determine whether any content within the subsection has changed since the grade was computed. |
| course_version | debugging & recovery | Allows us to immediately find and retrieve the exact version of the course that was active when the grade was computed. |
| visible_blocks | grading visualization; debugging & recovery | Record the block-ids that the user had access to when the grade was computed. Use this information to populate the information on the progress page so the saved subsection grade is always consistent with the list of shown problems on the progress page. The list of problems currently available in the course may be different at the current time, but this data allows us to retrieve the blocks that were used in computing the grade. |
| total_weighted_raw_score (earned_all), total_weighted_max_score (possible_all), total_graded_raw_score (earned_graded), total_graded_max_score (possible_graded) | grading computation | Record the total score and the total maximum score in separate columns so the saved grades can be aggregated further to compute higher-level grades. |
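
As a rough illustration of this data model, the sketch below builds an equivalent table in an in-memory SQLite database using Python's standard `sqlite3` module. The table name, index name, column types, and the `save_subsection_grade` helper are assumptions made for this example; the actual platform schema is defined elsewhere (see Grades Data Model). Only the column names, the uniqueness rule, and the (user_id, course_id) index come from the notes above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE subsection_grade (
    id                       INTEGER PRIMARY KEY AUTOINCREMENT,
    course_id                TEXT    NOT NULL,
    user_id                  INTEGER NOT NULL,
    usage_key                TEXT    NOT NULL,
    created                  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    modified                 TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP,
    subtree_edited_timestamp TEXT,
    course_version           TEXT,
    visible_blocks           TEXT,
    earned_all               REAL    NOT NULL,
    possible_all             REAL    NOT NULL,
    earned_graded            REAL    NOT NULL,
    possible_graded          REAL    NOT NULL,
    UNIQUE (course_id, user_id, usage_key)
);
CREATE INDEX idx_subsection_grade_user_course
    ON subsection_grade (user_id, course_id);
"""


def save_subsection_grade(conn, course_id, user_id, usage_key, earned, possible):
    """Update the learner's existing row in place, or insert one if none exists.

    For brevity this sketch treats the subsection as graded, so the
    "all" and "graded" totals are given the same values.
    """
    key = (course_id, user_id, usage_key)
    cursor = conn.execute(
        """UPDATE subsection_grade
           SET earned_all = ?, possible_all = ?,
               earned_graded = ?, possible_graded = ?,
               modified = CURRENT_TIMESTAMP
           WHERE course_id = ? AND user_id = ? AND usage_key = ?""",
        (earned, possible, earned, possible) + key,
    )
    if cursor.rowcount == 0:
        conn.execute(
            """INSERT INTO subsection_grade
               (course_id, user_id, usage_key,
                earned_all, possible_all, earned_graded, possible_graded)
               VALUES (?, ?, ?, ?, ?, ?, ?)""",
            key + (earned, possible, earned, possible),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Placeholder identifiers, for illustration only.
save_subsection_grade(conn, "example-course", 42, "example-subsection", 10, 40)
save_subsection_grade(conn, "example-course", 42, "example-subsection", 10, 28)

rows = conn.execute(
    "SELECT earned_all, possible_all FROM subsection_grade"
).fetchall()
print(len(rows))  # 1: the table is mutable, so the second save updated the row
```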

Persistent Course Grade

  • Why persist?
    • Grade Override - In order to allow course teams to override (and have a final say on) a learner's course grade, we need a table to store the learner's grade.
    • Performance - Although this may be less necessary now that the grading infrastructure is faster and subsection grades will be persisted, persisting the course-level grade as well means that a request for a learner's course grade can be satisfied with a quick database lookup.
  • Decouple persistent course-level grades and component-level in-progress grades.
    • The PersistentSubsectionGrade section provides details on saving and updating grades for sub-hierarchical components within a course. 
    • The data model for those grades is specific to component-level grading information and generally consistent throughout all components.
    • The data model for Course-level grades, on the other hand, is dictated by a configurable course-level grading policy.  At this time, the platform is configured to use only a single type of grading policy (WeightedSubsectionsGrader), which provides a course-level grade percentage and letter grade.
  • Decouple persistent course grades and certificates.
    • Certificates are dependent on the final course grade - but only to determine whether the grade is a passing grade.  Beyond that, certificates do not depend on the exact grade value and grade value changes.
    • In the future, if we ever support multiple types of certificates that vary by grade, we will need to annotate the certificate record with the certificate type.  However, it still does not need to depend on the exact value of the grade.
    • Currently, the certificates_generatedcertificate table has a column for the learner's course grade.  Once this work is implemented, the grade column in that certificates table will no longer be used.
  • Save the course grade in a different SQL table from the certificates table and from the subsection-grades table.
    • Include columns for the course_id, course_version, subtree_edited_on, user_id, and the grade percentage.
    • Should the course letter grade also be persisted?
      • Why yes:
        • Performance: All information related to the user's course grade will be persisted with no need for dynamic computation.
        • Robustness: The persisted letter grade will be the value computed at the same time the percentage is computed.  So it would be unaffected by any subsequent change to the course grading policy.
      • Why no:
        • Coupling: By storing secondary fields obtained from the grading policy, the data model is coupled to the artifacts of the configured grading policy.  If in the future we decide to support other grading policies with additional course-grade values (for example, a grade on social activity or a curve-based grade), then the schema for the course-grades table will need to be updated.  On the other hand, in order to support overriding of all course-level grade data, it seems we would need to update the schema anyway.
      • Given the above, we should go ahead and persist the letter grade as well.
      • Overriding course-grade: Since we have multiple course-grade fields (percentage and letter) that can be overridden, the grade override feature can intelligently allow overriding just an individual field, both fields, or just the percentage and have the letter grade automatically recomputed.
  • Update the persisted course grade automatically (and asynchronously) whenever any of the subsection grades in the course is updated.
  • Notify all subscribers:
    • When a course grade is updated/saved. 
    • When an updated course grade exceeds the passing threshold.  Interested listeners would include:
      • Credit eligibility to check whether the user is now credit eligible per course policy.
      • Gating code to check whether new course pre-requisites are now satisfied.
      • Certificate generation code if course policy allows automatic generation of certificates once the course is passed.
    • Note: it's an implementation detail whether the above signals are implemented as separate signal-types or as a single signal-type with a distinguishing field.
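
The notification flow above could look roughly like the following sketch. It uses a tiny in-process publish/subscribe registry rather than any real signal framework; the signal names, the `subscribe` and `publish_course_grade` functions, and the choice to treat a grade at or above the threshold as passing are all assumptions made for illustration. It follows the "separate signal-types" variant mentioned in the note.

```python
from collections import defaultdict

# Hypothetical signal names.
COURSE_GRADE_UPDATED = "course_grade_updated"
COURSE_GRADE_PASSED = "course_grade_passed"

_subscribers = defaultdict(list)


def subscribe(signal_name, handler):
    """Register a handler to be called with the signal's keyword payload."""
    _subscribers[signal_name].append(handler)


def _send(signal_name, **payload):
    for handler in _subscribers[signal_name]:
        handler(**payload)


def publish_course_grade(user_id, course_id, percent, passing_threshold):
    """Notify listeners that a course grade was saved, and whether it passes."""
    payload = {"user_id": user_id, "course_id": course_id, "percent": percent}
    _send(COURSE_GRADE_UPDATED, **payload)
    # Assumption: a grade at or above the threshold counts as passing.
    if percent >= passing_threshold:
        _send(COURSE_GRADE_PASSED, **payload)


received = []
subscribe(COURSE_GRADE_UPDATED, lambda **p: received.append(("updated", p["percent"])))
# A credit-eligibility, gating, or certificate listener would subscribe here.
subscribe(COURSE_GRADE_PASSED, lambda **p: received.append(("passed", p["percent"])))

publish_course_grade(user_id=1, course_id="example-course", percent=0.40, passing_threshold=0.50)
publish_course_grade(user_id=1, course_id="example-course", percent=0.70, passing_threshold=0.50)
print(received)
# [('updated', 0.4), ('updated', 0.7), ('passed', 0.7)]
```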

Data Model Notes

Note: See Grades Data Model (for published description).

| column | additional info |
|---|---|
| Identifiers: id, course_id, user_id | (course_id, user_id) together form a unique identifier for each row in the database. Additionally, the id field is an automatically generated primary key for the table. |
| Timestamps: created, modified, course_edited_timestamp | |
| course_version | Allows us to immediately find and retrieve the exact version of the course that was active when the grade was computed. |
| grading_policy_hash | A SHA-1 digest of the grading policy allows us to detect and update grades whenever a course's grading policy changes. |
| percent_grade, letter_grade | Records percent and letter course grades for the user. |
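
One way to derive a stable grading_policy_hash is sketched below. The source only states that the column holds a SHA-1 digest of the grading policy; serializing the policy as canonical JSON first is an assumption of this example, and the policy dictionary shown is a made-up illustration rather than the platform's actual policy format.

```python
import hashlib
import json


def grading_policy_hash(policy: dict) -> str:
    """Return a SHA-1 hex digest of a canonical JSON form of the policy."""
    # sort_keys makes the digest independent of dictionary key order.
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


# Hypothetical policy contents, for illustration only.
policy = {
    "grade_cutoffs": {"Pass": 0.5},
    "graders": [{"type": "Homework", "weight": 0.4, "drop_count": 1},
                {"type": "Exam", "weight": 0.6, "drop_count": 0}],
}
reordered = {"graders": policy["graders"], "grade_cutoffs": policy["grade_cutoffs"]}
changed = dict(policy, grade_cutoffs={"Pass": 0.6})

saved_hash = grading_policy_hash(policy)
print(saved_hash == grading_policy_hash(reordered))  # True: same policy
print(saved_hash == grading_policy_hash(changed))    # False: policy changed
```

Comparing the stored digest with the digest of the current policy is then enough to tell whether a persisted course grade was computed under an outdated policy.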

Certificate Generation and Availability

  • Currently, certificates are generated (manually, I believe) at the following times:
    • By course teams (via edX support staff) in an instructor-paced course
      • with knobs to generate new certificates, regenerate existing ones, update white-list of certificate recipients, etc.
    • By a learner in a self-paced course
  • Currently, certificates become available at the following times, depending on the certificates_display_behavior course policy:
    • When the course ends.
    • As soon as a certificate is generated.
  • In the future, we may also automatically generate certificates as soon as a passing grade is achieved.  This can be supported by the notification trigger from the course-grade layer as described above.
  • But since course-grade-values will be decoupled from certificate generation (except for notification when the grade exceeds the passing threshold), the timing of certificate generation will no longer have any implication on grades.

Transcripts and Course Finality

  • Once we support transcripts, we may need a concrete definition of "course finality" and "final course grades".  At that time, we'll need to take into consideration course end dates, self-paced courses, and pending graders.
  • Until then, persistent course grades remain decoupled from a future notion of "course finality".

Course Author Manual Rescoring

Have course authors make a conscious decision when they change content/policy that affects learners' grades.

Course Authors can decide to

  1. Retain already computed subsection grades, if any, as they are.

  2. Rescore to recompute already computed subsection grades. (In the future, we may support automatically notifying students when this happens.)

  3. Rescore to update-if-gain only those subsection grades that have improved scores after the rescore.  That is, keep the score that is higher.

For the initial rollout of this, course authors can use the already existing UI in the Instructor Dashboard to rescore a problem's grade.  Nothing is needed for option #1 since it's a no-op.  Option #2 exists already as the existing "rescore" feature.  Option #3 would need to be implemented.
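
The three choices can be summarized as a small decision function. This is a sketch only: `RescoreOption` and `resolve_subsection_grade` are hypothetical names, and grades are represented as (earned, possible) pairs purely for illustration.

```python
from enum import Enum


class RescoreOption(Enum):
    RETAIN = 1           # option 1: keep the saved grade as it is
    RESCORE = 2          # option 2: always use the recomputed grade
    RESCORE_IF_GAIN = 3  # option 3: use the recomputed grade only if higher


def _percent(score):
    earned, possible = score
    return earned / possible if possible else 0.0


def resolve_subsection_grade(option, saved, recomputed):
    """Return the (earned, possible) pair to persist for the chosen option."""
    if option is RescoreOption.RETAIN:
        return saved
    if option is RescoreOption.RESCORE:
        return recomputed
    if option is RescoreOption.RESCORE_IF_GAIN:
        return recomputed if _percent(recomputed) > _percent(saved) else saved
    raise ValueError(f"unknown option: {option!r}")


saved, recomputed = (10, 40), (10, 60)  # the recomputed grade is lower
for option in RescoreOption:
    print(option.name, resolve_subsection_grade(option, saved, recomputed))
# RETAIN (10, 40)
# RESCORE (10, 60)
# RESCORE_IF_GAIN (10, 40)
```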

Note: #1 and #3 can potentially benefit from improved handling of course versions for a better user experience.  Learners who already completed problems that are now removed may want to see their previous answers and the content of the removed block, per Course Version Locking.

Grade Overrides

Implement UI for overriding any of the following to a specific value:

  1. a final course grade
  2. a subsection grade
  3. a problem's score (implemented in EDUCATOR-165)

For final course and subsection grades, course teams will need to ask edX staff to manually override a learner's grade directly in the database.

Eventing/Logging of Grading Changes

  • Track/Log whenever a change that affects a learner's grade occurs.
    • Add log entries (to be read via Splunk) for all grading-related events.  
    • Add tracking events (to go to tracking logs) for only those cases that add value to researchers.
      • TBD - add link to the Grading Event Design wiki here.
  • Events should include
    • course state: the edit version number of the course in each event.
    • student state: a compressed list of all the block-ids that the user had access to when computing the grade.
      • for scalability reasons, this information need not go into the tracking logs.
      • but since it may be useful for debugging purposes, we can add this information to log statements.
  • Fire an event for each of the following:
    • A grading policy is changed (as listed in GradingPolicy).
    • A new value for a subsection grade is calculated and saved.
    • A problem score is saved.
      • A problem is rescored.
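
A log entry carrying the course state and the compressed student state might be assembled as in the sketch below. The field names, the event name, and the use of zlib plus base64 for the "compressed list" of block-ids are assumptions for illustration; the actual event design is left to the Grading Event Design document.

```python
import base64
import json
import zlib


def compress_block_ids(block_ids):
    """Pack a list of block-ids into a compact ASCII string for log statements."""
    raw = json.dumps(block_ids).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def decompress_block_ids(blob):
    """Reverse compress_block_ids, e.g. when debugging from a log line."""
    return json.loads(zlib.decompress(base64.b64decode(blob)).decode("utf-8"))


def build_log_entry(user_id, course_id, course_version, block_ids, earned, possible):
    """Assemble a hypothetical 'subsection grade saved' log entry."""
    return {
        "event": "subsection_grade_saved",  # hypothetical event name
        "user_id": user_id,
        "course_id": course_id,
        "course_version": course_version,   # course state
        "visible_blocks": compress_block_ids(block_ids),  # student state
        "earned": earned,
        "possible": possible,
    }


blocks = ["example-block-p1", "example-block-p2"]  # placeholder ids
entry = build_log_entry(1, "example-course", "example-version", blocks, 10, 40)
print(decompress_block_ids(entry["visible_blocks"]) == blocks)  # True
```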

Course Structure Assumptions

This grading system is built upon some assumptions about the structure of a course. Note that this only applies to graded content, and does not touch upon videos, HTML blocks, etc. as they are unrelated to grading.

  • Every problem is either a leaf node or a container node within a containing subsection.
  • Every problem is the child of at least one subsection, with any non-negative number of nodes between itself and this ancestor, including zero.
    • That is, a problem may be contained directly within a subsection, even though the current practice on edx.org is to use verticals to organize the problems within a subsection.
  • Every subsection is the direct child of at least one chapter.
  • Every chapter is the direct child of a course.

What does (and doesn't) trigger regrades

| Actions that cause a regrade of a single learner | Actions that cause a regrade of all learners | Actions that do NOT cause a regrade |
|---|---|---|
| Learner submits a problem | Edit the course's Grading Policy in Studio's Grading page | Create new course content at any level |
| Learner's group access changes: cohort affiliation changes; enrollment track changes | Change a subsection's assignment type (e.g., ungraded → homework) | Delete course content at any level |
| | | Move course content at any level |
| | | Change course content visibility to learners |
| | | Change problem weight |
| | | Edit the content of a problem in any way |

Turning on Persistent Grades

To turn on persistent grading in an Open edX instance:

  1. Go to /admin/grades/persistentgradesenabledflag/ of your site's Django admin
  2. Click on "Add a Persistent grades enabled flag"
  3. To turn on the feature for all courses make sure to check both "enabled" and "enabled for all courses."
    1. If you want to turn on the setting for only a specific course, you will need to create a "Course persistent grades flags" entry in /admin/grades/coursepersistentgradesflag/ for the course and ensure that your main "Persistent grades enabled flag" has "enabled" set to True, but "enabled for all courses" set to False.
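
The way the two kinds of flags combine, as described in the steps above, can be expressed as a small decision rule. This sketch is not the platform's implementation; the function name and its arguments are hypothetical stand-ins for the values entered in Django admin.

```python
def persistent_grades_enabled(enabled, enabled_for_all_courses,
                              course_flags, course_id):
    """Decide whether persistent grades apply to a course.

    enabled / enabled_for_all_courses: the two checkboxes on the main
    "Persistent grades enabled flag".
    course_flags: mapping of course_id -> bool, standing in for the
    "Course persistent grades flags" entries.
    """
    if not enabled:
        return False
    if enabled_for_all_courses:
        return True
    return course_flags.get(course_id, False)


flags = {"example-course-a": True}  # placeholder course id
print(persistent_grades_enabled(True, True, {}, "example-course-b"))      # True
print(persistent_grades_enabled(True, False, flags, "example-course-a"))  # True
print(persistent_grades_enabled(True, False, flags, "example-course-b"))  # False
print(persistent_grades_enabled(False, True, flags, "example-course-a"))  # False
```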

There is a waffle switch that can be used to enhance grades performance, but should be enabled only after any Backfilling process is completed:

  • grades.assume_zero_if_absent: if there is no grade in the database for a given user, assumes a score of zero for that user in the course or subsection in question. This is a performance optimization, but should be used only if persistent grades have been calculated for all users on the instance. Otherwise, the platform will fail to calculate on-the-fly grades for users whose grades are missing from the database.
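
The effect of the switch can be pictured with the following sketch, which shows why it is safe only after backfilling. The `read_grade` function, the dictionary of saved grades, and the `compute_on_the_fly` callback are hypothetical and only model the behavior described above.

```python
def read_grade(saved_grades, key, assume_zero_if_absent, compute_on_the_fly):
    """Return the grade for key, where key is e.g. (user_id, course_id)."""
    if key in saved_grades:
        return saved_grades[key]
    if assume_zero_if_absent:
        # Fast path: skip the expensive computation and report zero.
        return 0.0
    return compute_on_the_fly(key)


saved_grades = {(1, "example-course"): 0.25}


def compute_on_the_fly(key):
    # Stand-in for the real, slower grade computation.
    return 0.8


missing = (2, "example-course")  # this learner's grade was never backfilled
print(read_grade(saved_grades, missing, False, compute_on_the_fly))  # 0.8
print(read_grade(saved_grades, missing, True, compute_on_the_fly))   # 0.0
```

In the second call the learner's real grade (0.8 in this example) is never computed, which is the failure mode to expect if the switch is enabled before all grades have been persisted.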