Completion API

Author / Context

This proposal has been prepared by OpenCraft, on behalf of one of our clients.

Goals and Scope

Goals of this proposal are:

  1. To provide the data required to allow learners to see how much of the course they have completed (which XBlocks they have completed).

  2. To provide a scalable mechanism for computing and reporting completion for all users and courses on both small and huge (edx.org scale) Open edX instances. This includes generating completion reports for instructors/partners that provide per-student completion breakdowns within a course.

  3. To provide an API that can be used [on mobile and desktop] to retrieve a user’s completion of a course, even when courses are modified or are different per learner (adaptive content, cohorts, etc.).

  4. Completion stats must be consistent between mobile and desktop.

  5. To support offline interaction with [supported] XBlocks on mobile (e.g. watching videos offline), and include an API that native mobile apps can use to later synchronize the user’s completion with the Open edX server.

  6. To support leaderboards within courses or cohorts (allow computing which learners have completed the most of the course). This is most likely only useful for small courses where there is enough diversity in the ranking.

  7. To dovetail with the new Open edX persistent grades architecture.

  8. To allow future platform improvements like locked content that unlocks based on completion/consumption of ungraded content. (e.g. lock most of the courseware until the user watches a video).

    1. Example: The conditional module (new Studio UI).

    2. Example: the "Entrance Exam" feature which works at the section level. The learners must complete one section before being able to access the following sections.

  9. To have a solution that can be implemented and deployed quickly.

Non-goals of this proposal are:

  1. To propose, plan, or implement any UI in Open edX for displaying completion. (That will come as a second phase, but OpenCraft has no plans to build it, so it is likely up to the community or edX to take that on.)

Design Details / Questions

Some of these are open questions - comments are welcome!

What is “completion”?

  • "Completion" means how much of the course content the learner has completed (as a percentage and also a breakdown of specifically which parts are completed so far).

  • We mean whether or not a learner has completed a course component (not merely viewed the component).

    • e.g. for videos you must watch (at least part of) the video, or for problems you must submit an answer. (n.b. Insights defines “complete” as having watched up to “either 30 seconds from the end or at the 95% complete mark, whichever means that more of the video has elapsed.”)

    • Only for presentation/ungraded XBlocks like “HTML” does “completed” mean “viewed”, since there is no action to take to “complete” the XBlock. (n.b. Per Shelby, edX “previously considered either just accessed, or viewed for ~X number of seconds (particularly for units that are text only - to account for just navigating quickly through a subsection).”)

    • e.g. for activities that require a file upload, you must upload a document.

  • If a user submits a problem that allows only one attempt, and they get a failing score (0%), is that problem considered “complete”? Yes.

  • If a user submits a problem that allows unlimited attempts, and they get a failing score, is that problem considered “complete”? Yes.
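
To illustrate the Insights definition of a "complete" video quoted above, the check could be sketched as follows. This is a minimal sketch, not Insights or Open edX code: the function name video_is_complete and its parameters are hypothetical, and the 30-second and 95% defaults are taken from the quoted text.

```python
def video_is_complete(position_s: float, duration_s: float,
                      tail_s: float = 30.0, fraction: float = 0.95) -> bool:
    """Return True once playback has reached the later of the two marks.

    Hypothetical helper: the names and defaults are assumptions for illustration.
    """
    if duration_s <= 0:
        return False
    # "Whichever means that more of the video has elapsed" is read here as
    # taking the later of the two positions.
    threshold = max(duration_s - tail_s, fraction * duration_s)
    return position_s >= threshold


# Short video (60 s): the 95% mark (about 57 s) is later than 30 s from the end.
print(video_is_complete(56.0, 60.0))      # False
print(video_is_complete(58.0, 60.0))      # True
# Long video (3600 s): 30 s from the end (3570 s) is later than the 95% mark.
print(video_is_complete(3500.0, 3600.0))  # False
print(video_is_complete(3580.0, 3600.0))  # True
```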

Does anything in Open edX currently track completion?

  • Not really - see “Prior Art” below.

  • We can’t use the presence of a grade, because some XBlocks like videos or HTML blocks do not emit grades.

  • We can’t use the presence of StudentModule etc. because those can be created for actions other than using/completing an XBlock (like opening the progress page) and reflect viewing an XBlock, not completing one.

Is there any value in storing “viewed” state separately from “completed”?

  • Probably not. (Note that specific XBlocks like video will still do their own tracking of how much of the video is viewed, but this proposal neither affects nor generalizes that.)

Should completion be binary?

  • For future flexibility, we will represent completion for each completable block as a fractional float value in the range [0.0, 1.0], but the current implementation will treat completion at the completable block level as binary (only submitting values of 0.0 and 1.0).  The aggregation tables will sum the values provided by the completable blocks.

Does completion ever decrease? (Do users ever “uncomplete” a block?)

  • Yes, we need to support decreasing completion (“uncompleting” a block) for these reasons:

    • Blocks can have their attempts reset and would need to be marked as incomplete again.

    • To implement something like the DoneXBlock (“Check this box if you’ve completed the offline activity”) which must be checked to be counted as "completed": we need to “undo” the completed state if the user unchecks the box.

    • For any XBlock with a file upload feature, we would need to mark it as incomplete if the user deletes the file.

    • We should handle the situation where an XBlock is deleted/hidden from students, and that should update the completion appropriately.

    • The current API will handle automated completion at this time.  There are possible use cases for allowing a learner to override the automated completion in the UI, but that would be a separate layer added when and if it is called for.

What Blocks should be excluded from completion calculations?

  • Inline Discussion XBlocks

  • Blocks meant only for instructors, hidden course sections, etc.

  • Blocks should be able to be marked as excluded from completion calculations (e.g. child XBlocks of problem builder are often non-visual and must be excluded); our client here also has this requirement for other block types.

Should completion be cached at the subsection, section, and/or course levels?

  • Yes - this is necessary for scalable leaderboards, which is an explicit goal here.


Prior Art / Implementations Currently In Use

Open edX

Open edX does not currently track completion. It has the concept of a course grade, where

Grade = completion × score

(summed over all XBlocks, then divided by the number of blocks; we are ignoring weighting here)

e.g. in a course with two assignments, if a user completes one with a score of 75% and has not even attempted the second one yet, their “grade” is

Grade = ([1 × 75%] + [0 × 0%]) / 2 = 37.5%

Contrast this with how grades work in traditional schoolroom classes, where while the course is progressing, a traditional grade would not include the zero scores from assignments that have not yet been given to the student (so in the example above, the student’s interim grade would be “75%” not “37.5%”)
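
The difference between the two grading conventions can be checked with a few lines of arithmetic. This is only an illustrative sketch: the function names platform_grade and interim_grade are hypothetical, and weighting is ignored, as in the formula above.

```python
def platform_grade(results):
    """Average of completion * score over all assignments.

    `results` is a list of (completion, score) pairs, one per assignment.
    """
    return sum(completion * score for completion, score in results) / len(results)


def interim_grade(results):
    """Average score over only the assignments attempted so far."""
    attempted = [score for completion, score in results if completion > 0]
    return sum(attempted) / len(attempted) if attempted else None


# One assignment completed with a score of 75%, one not yet attempted.
results = [(1, 0.75), (0, 0.0)]
print(platform_grade(results))  # 0.375
print(interim_grade(results))   # 0.75
```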


Open edX Vestigial Completion Code


There is some minimal completion-tracking code in Open edX but it is currently unused for the most part. Marco explained: “We used to have fractional progress [completion] on problem pages. The sequence bar showed a sort of visual bar underneath the icon. This broke at one point and instead of fixing it we stripped it out.”


edx-solutions fork and progress-edx-platform-extensions

This custom plugin for Open edX called progress-edx-platform-extensions (or just “progress”), together with corresponding features in the Solutions REST API, provides an implementation of completion that meets some of the goals in this document. 


  • User completion per-XBlock is stored in a model called CourseModuleCompletion.

    • Note for people checking out the code: the "stage" column now seems to be unused.

  • Another model called StudentProgress summarizes the student’s progress through the whole course (as a single number)

  • Beyond simple tracking of “completed” / “incomplete”, this progress plugin can:

    • Ignore certain block types.

  • It has some drawbacks:

    • It is designed for server-to-server access, so any authorized API client can read/write the progress of any user on the system.

    • Not known to be scalable (or possibly known not to be scalable)

    • It is not part of the core Open edX platform

    • Statistics can be inaccurate or out of date in some edge cases; if the cached data is wrong, it is necessary to force an update using the “recalculate_progress_for_users” command.
    • Some of the code necessary for actually computing a user's progress as a percentage is maintained separately in a different closed-source repository.
  • What is considered “completion”?
  • What is the API?
    • Generally, completion is written/updated automatically by the LMS
    • Completion can be read from GET /api/server/courses/{course_id}/metrics/completions/leaders/ or /api/server/courses/:course_id/completions/ but only by a trusted server (shared secret API authentication only)
      • The /completions/ API requires the API client to do further computation to determine a user’s overall completion; the /leaders/ API “actually gives the completion of a single user or N top users in a course and client does have not make any further computations” so it is more scalable and consistent.
    • Completion can be written by POST to /api/server/courses/:course_id/completions/ but only by a trusted server. We are not aware of anyone/anything that actually uses this method to update completion, however.

Implementation Proposal Overview

The following implementation approach is intended to meet all the goals described at the beginning of this document. It allows the platform to track “completion” for every single user and every single XBlock in the system. It does not provide any way to “backfill” the completion data on edx.org or other Open edX instances that aren't already using progress-edx-platform-extensions, so it could make sense to enable this on edx.org once it’s ready but to only begin using the data or enabling a UI for this data for courses that start after this feature is launched.

What is completion?

  1. Completion refers to the percentage of a given block that a learner has interacted with enough to be considered "done" with the block.
    1. It does not imply that the block is unavailable to the user.
    2. It does not imply that the user has achieved mastery of the block's content.
  2. Each block type uses one of three completion methods: completable, aggregator, or excluded.
    1. Completable blocks are XBlocks or XModules that can be directly assigned a completion value. 
      1. This will normally happen automatically when the learner interacts with the block in a specified way. 
      2. If a completable block has children, the child blocks are not traversed by the Completion API in calculating completion. 
      3. A complete block has a completion value of 1.0, and an incomplete block has a completion value of 0.0.  Future iterations of this feature may introduce partial completion values. 
      4. Specific block types are completed according to the following rules:
        1. Video blocks - Watched to N% completion, or to within M seconds of the end, i.e. T-minus-M seconds where T is the video's length (N and M are configurable).
        2. Scorable blocks - An answer (correct or incorrect) was submitted.
        3. Custom - Any completable block can emit a "mark_complete" or "mark_incomplete" event to indicate its completion status.
        4. Default - The block was viewed.
    2. Aggregators are blocks that contain other blocks, are not themselves completable, and are considered complete when all descendant blocks are complete (not counting descendants that are completely inaccessible to the user). 
      1. Aggregator blocks will only have their completion value calculated if the block type is configured as a COMPLETION_AGGREGATOR_BLOCK_TYPE in the site's settings.  Blocks of type "course" will always be aggregated, and do not need to be specifically configured.
      2. An aggregator block has a completion value equal to the sum of the completion of its descendant completable blocks, divided by the total number of descendant completable blocks.  In the case of Directed Acyclic Graphs (DAGs), if an aggregator block is an ancestor to a completable block along N paths, the completion value will be added to the aggregator block's earned value N times.
      3. Example: given the following subtree, if the learner completes prob1 and prob2, the completion value for seq1 will be 100% (2/2), the completion value for seq2 will be 50% (1/2), and the completion value for the chapter will be 75% (3/4):
               chapter
               /     \
           seq1       seq2
          /    \     /    \
        prob1   prob2    prob3
      4. When a learner marks a completable block complete or incomplete, a background (celery) task will asynchronously "roll up" the completion to the aggregator blocks in the completable block's ancestry as configured, by recalculating the completion value for each of the containing aggregator blocks.
        1. Typical aggregator block types include course, chapter, sequential, and vertical. 
      5. Aggregators are useful/required for:
        1. Displaying subsection completion status on the course outline/navigation page

        2. Including “completion” reports in grade reports (e.g. download a CSV showing the completion/progress and grades of all users in a course)

        3. Displaying course completion on the dashboard (e.g. saying “30% complete” below each course)

        4. Course-level leaderboards.

      6. More complex aggregations could be defined.
        1. More complex aggregations could map an arbitrary name to an aggregator block type and a filter for determining which blocks to include. The name would need to be unique, and not conflict with the names of any aggregator block types; it would be used as the requested_fields value for those aggregations.  A method for configuring these kinds of aggregators is beyond the scope of this proposal.
    3. Excluded block types are ones that do not participate in completion.
      1. They are neither completable nor can they be used as aggregators.
      2. Blocks that descend from excluded blocks are not included in aggregation calculations.
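
The aggregation rule and the DAG example above (prob2 is a child of both seq1 and seq2) can be reproduced with a short sketch. It is a simplification under stated assumptions, not the proposed implementation: the CHILDREN mapping and the function names are hypothetical, every leaf is treated as a completable block, and excluded or inaccessible blocks are not modeled.

```python
from collections import Counter

# Hypothetical course structure from the example: parent -> children.
# prob2 appears under both sequentials, so the structure is a DAG.
CHILDREN = {
    "chapter": ["seq1", "seq2"],
    "seq1": ["prob1", "prob2"],
    "seq2": ["prob2", "prob3"],
}


def completable_paths(block, children=CHILDREN):
    """Count how many paths lead from `block` down to each leaf block."""
    kids = children.get(block)
    if not kids:
        # Assumption: a block without children is a completable block.
        return Counter({block: 1})
    counts = Counter()
    for kid in kids:
        counts.update(completable_paths(kid, children))
    return counts


def aggregate(block, completions, children=CHILDREN):
    """Return (earned, possible) for an aggregator block.

    A completable block reachable along N paths contributes N times.
    """
    paths = completable_paths(block, children)
    possible = float(sum(paths.values()))
    earned = sum(n * completions.get(leaf, 0.0) for leaf, n in paths.items())
    return earned, possible


completions = {"prob1": 1.0, "prob2": 1.0}  # prob3 has not been completed
print(aggregate("seq1", completions))     # (2.0, 2.0)
print(aggregate("seq2", completions))     # (1.0, 2.0)
print(aggregate("chapter", completions))  # (3.0, 4.0)
```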

How is completion modeled?

  1. Create a new model in the DB (BlockCompletion) that can indicate when a user has “completed” a completable XBlock.
    1. Fields: user, course_key, usage_key, block_type, completed (float [0.0, 1.0]) + fields from TimeStampedModel.

    2. A record with "completed" == 0.0 is equivalent to having no record at all.

    3. It should have appropriate indexes for fast lookups.

    4. This is roughly based on the CourseModuleCompletion model from progress-edx-platform-extensions but with some changes. The “get_actual_completions” code and the "stage" column would not be used.

  2. Create a new model in the DB (AggregateCompletion) that indicates for each block whose block-type is an aggregator what percentage of the contained completable blocks have been completed. 
    1. Fields: user, course_key, usage_key, aggregation_name (equivalent to block_type for simple aggregations covered by this proposal), earned (float), possible (float), + fields from TimeStampedModel.
    2. When a BlockCompletion record is updated, a signal handler will trigger an asynchronous task to update the AggregateCompletion records for all ancestor aggregators of the updated block.  The possible value is the number of completable blocks contained within the aggregator block.  The earned value is the sum of the completed values for all contained completable blocks.  See the note above about DAGs.
    3. Available property: percent (float [0.0, 1.0]) = earned / possible.  An AggregateCompletion record with a possible value of 0.0 will have a percent of 1.0, reflecting the observation that if there is nothing to complete, the learner is done with the block, and that a learner should expect to be able to reach 100% completion for all aggregator blocks once they have completed the course.
    4. Note: Course-level completions will also be stored in this table, with "course" as the aggregate type.
  3. A mechanism will be provided to prevent returning invalid completion data for any users who started a course before completion data was captured, as there is no simple way to backfill this data.
    1. Proposal 1: A system-wide setting, "COMPLETION_TRACKING_START_DATE" enables calculation of aggregator completion values for all enrollments that start after the specified date.  This exists to prevent returning invalid completion data for any courses that started before the feature was implemented, as there is no way to backfill the data.  This will be calculated using the new "Content Availability Date" (CAD). A per-course start date model config will provide a way to do staged-rollout of completion aggregation on a course-by-course basis, but calculation will still be subject to the restrictions of the CAD.
    2. Proposal 2: Tracking per user when they started the course and when completion data had begun to be captured, in order to determine whether the completion data is useful for that particular user.
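
The percent property described for AggregateCompletion, including the rule that a possible value of 0.0 yields 1.0, could look roughly like this. The sketch uses a plain dataclass rather than a Django model so that it is self-contained; that substitution, the string types for the keys, and the omission of the TimeStampedModel fields are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class AggregateCompletion:
    """Plain-Python stand-in for the proposed model (not a Django model)."""
    user: str
    course_key: str
    usage_key: str
    aggregation_name: str
    earned: float
    possible: float

    @property
    def percent(self) -> float:
        # Nothing to complete means the learner is done with the block.
        if self.possible == 0.0:
            return 1.0
        return self.earned / self.possible


course = AggregateCompletion(
    user="akiko",
    course_key="course-v1:McKA+BBV+Fall2017",
    usage_key="block-v1:McKA+BBV+Fall2017+type@course+block@course",  # hypothetical key
    aggregation_name="course",
    earned=3.0,
    possible=9.0,
)
print(round(course.percent, 8))  # 0.33333333
```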

How is completion updated/written?

  1. The LMS should be modified to mark an XBlock as “completed” using this logic:

    1. If the block type indicates that it is not completable (completion_method=EXCLUDED or completion_method=AGGREGATOR), then the block is ignored.

    2. If the XBlock indicates that it is aware of the Completion API, and has a custom implementation of completion (“has_custom_completion” is true): Wait for the XBlock to emit a “complete” event with a value in the range [0.0, 1.0] (also support the deprecated name “progress” for this event, which is how it currently works for progress-edx-platform-extensions), and then mark it as completed.

      1. We will create a mixin that provides a template for custom completion and tools to make it simpler: it sets has_custom_completion to True and provides an emit_completion method.
    3. Otherwise, if the XBlock indicates that it is gradable (“has_score” is true): when the XBlock emits a grade, have a signal handler emit a completion event.

      1. An XBlock can also emit a “completion” event with a value of 0.0, to indicate that the user has no longer completed the XBlock.
    4. Otherwise: when the XBlock is first viewed by the student (student_view is loaded and rendered), mark it as completed.

      1. The code to emit the event for default blocks will live in the "vertical" XBlock JS, so that when a unit becomes visible, it emits the "complete" event for every descendant that does not match the criteria above (that is, when completion_method == COMPLETABLE && has_custom_completion == False && has_score == False).

      2. A block will be marked complete when it has been visible in the browser window for N seconds, or has had focus for N seconds (to account for users whose interface does not contain a view window).

  2. When the completion subsystem receives an emitted completion event, it will ignore the event if the block has track_completion set to False; otherwise, it will update_or_create the relevant BlockCompletion object.
  3. Whenever the "completed" value of an XBlock changes, the LMS will call an asynchronous task to update the corresponding AggregateCompletion table rows. The asynchronous task request will contain a modified timestamp of the completed block and will check the modified timestamp on those rows to verify they are out of date before performing an update. (This task must not be processed until after the completable block update is committed).

    1. This may require optimization to avoid computing aggregators too frequently.
  4. Any addition or removal of completable blocks from a subsection or course should cause cached "possible" values to be recomputed asynchronously for all enrolled users, using the same mechanism as for changed completion values, above, though the user will be less sensitive to stale values in this situation.
    1. TODO: Figure out all the conditions that will require recomputing completion values.
      1. Created or removed blocks.
      2. Visibility changes.
  5. If a user completes blocks while offline (this could be fairly common in a phone app, but is also possible in a browser), they will need to see their completion updated when they come back online.  This is being explored further in a separate proposal for Offline XBlock support.  Some draft ideas:

    1. There could be an API endpoint that can accept a mapping of block IDs to completion ratios, and then the app can mark blocks as completed for the current user when they come back online.

    2. Alternative: Allow both JavaScript XBlock runtimes and native iOS/Android XBlock views to emit events (“complete”, “submit”, etc.) somewhat like current handler calls, but asynchronous and offline-compatible. When the app is online again, submit all the timestamped events to the LMS server. Any XBlocks that received a “complete” event will be marked as complete. (This allows the definition of completion to be controlled by the XBlock, e.g. in offline webviews, rather than requiring the native app to code its own definition of complete for each XBlock).
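
The decision logic in step 1 above can be summarized as a small dispatch function. This is a sketch under assumptions: the attribute names completion_method, has_custom_completion and has_score come from the text, while the enum classes, the BlockInfo stand-in and the function name completion_trigger are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class CompletionMethod(Enum):
    COMPLETABLE = "completable"
    AGGREGATOR = "aggregator"
    EXCLUDED = "excluded"


class Trigger(Enum):
    IGNORED = "ignored"            # block is never marked complete directly
    CUSTOM_EVENT = "custom_event"  # wait for the block's own completion event
    GRADE_EMITTED = "grade"        # mark complete when a grade is emitted
    VIEWED = "viewed"              # mark complete when the block is viewed


@dataclass
class BlockInfo:
    """Hypothetical stand-in for the attributes read from an XBlock."""
    completion_method: CompletionMethod = CompletionMethod.COMPLETABLE
    has_custom_completion: bool = False
    has_score: bool = False


def completion_trigger(block: BlockInfo) -> Trigger:
    """Apply the rules in the order given in the list above."""
    if block.completion_method is not CompletionMethod.COMPLETABLE:
        return Trigger.IGNORED
    if block.has_custom_completion:
        return Trigger.CUSTOM_EVENT
    if block.has_score:
        return Trigger.GRADE_EMITTED
    return Trigger.VIEWED


print(completion_trigger(BlockInfo(has_score=True)))  # Trigger.GRADE_EMITTED
print(completion_trigger(BlockInfo()))                # Trigger.VIEWED
```

Note that custom completion takes precedence over has_score, matching the order of the rules in the list.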

How is completion read?

A RESTful API:

  • GET /api/completion/v1/courses/:course_id/[?username=username&requested_fields=comma,separated,fields] to get an individual user's course completion. Not guaranteed to be completely up to date, since this is cached and updated asynchronously.
    • If no username is provided, default to the requesting user.
    • Non-staff users can only retrieve their own completion.
    • Data for other aggregator block types can be retrieved by specifying the aggregator name (block type) in a "requested_fields" request parameter.  requested_fields is a comma-separated list of aggregator block types to retrieve data for.  If a block type is specified that is not listed as a valid aggregator, a 400 error will be returned.
    • Mean completion for the course can be requested via a "mean" value passed to requested_fields.
  • GET /api/completion/v1/courses/[?username=username&requested_fields=comma,separated,fields] to get the completion data for all courses for a given user
    • This view is paginated.  It will use the standard page size default for Open edX.
    • requested_fields and username request parameters behave the same way as above.
  • GET /api/completion/v1/leaders/:course_id/ to get the top 100 users in a course by completion, potentially filtered by criteria like cohort
    • This will be implemented behind a feature flag, as it is not desired for edx.org.
    • It will be available to all users enrolled in the course and all staff users.
    • If needed, a ?count=N parameter could be added; for v1.0, a sensible default should be sufficient.  The parameter could be added later with no concern about backward compatibility.
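
The handling of the requested_fields parameter described above (comma-separated, with a 400 error for unknown aggregator types) might be sketched as follows. The set of valid aggregator names, the BadRequest exception and the function name are assumptions for illustration; in a real deployment the valid names would come from the site's aggregator configuration.

```python
# Assumed configuration; "mean" is the extra field mentioned in the text.
VALID_AGGREGATORS = {"course", "chapter", "sequential", "vertical"}
EXTRA_FIELDS = {"mean"}


class BadRequest(ValueError):
    """Hypothetical exception that a view would translate into HTTP 400."""
    status_code = 400


def parse_requested_fields(raw):
    """Split and validate the requested_fields query parameter."""
    if not raw:
        return []
    fields = [name.strip() for name in raw.split(",") if name.strip()]
    allowed = VALID_AGGREGATORS | EXTRA_FIELDS
    unknown = [name for name in fields if name not in allowed]
    if unknown:
        raise BadRequest("Invalid requested_fields: " + ", ".join(unknown))
    return fields


print(parse_requested_fields("sequential,chapter"))  # ['sequential', 'chapter']
print(parse_requested_fields("sequential,mean"))     # ['sequential', 'mean']
```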

In the course blocks API:

  • Every XBlock can be annotated with completion percentage (float in the range [0.0, 1.0]), unless that XBlock is excluded from completion calculations. This would be an optional field and not included by default. The field would be added by a block transformer.
  • Aggregator blocks can similarly be annotated with the corresponding completion percentage.

Notes about this approach:

  • The RESTful API does not support returning completion information per "completable" XBlock, but the course blocks API does.
  • XBlocks would not be able to read their own "is completed" state - do we need this?

RESTful API

The following format would be used for API requests:

A single course for a user with extra fields

Request

GET /api/completion/v1/courses/course-v1:McKA+BBV+Fall2017/?requested_fields=sequential,chapter

Response

{
  "course_key": "course-v1:McKA+BBV+Fall2017",
  "completion": {
    "earned": 3.0,
    "possible": 9.0,
    "percent": 0.33333333
  },
  "chapter": [
    {
      "usage_key": "block-v1:McKA+BBV+Fall2017+type@chapter+block@week-1",
      "completion": {
        "earned": 3.0,
        "possible": 9.0,
        "percent": 0.33333333
      }
    }
  ],
  "sequential": [
    {
      "usage_key": "block-v1:McKA+BBV+Fall2017+type@sequential+block@week-1-a",
      "completion": {
        "earned": 3.0,
        "possible": 3.0,
        "percent": 1.0
      }
    },
    {
      "usage_key": "block-v1:McKA+BBV+Fall2017+type@sequential+block@week-1-b",
      "completion": {
        "earned": 0.0,
        "possible": 6.0,
        "percent": 0.0
      }
    }
  ]
}

Single course for a user without requested fields

Request

GET /api/completion/v1/courses/course-v1:McKA+BBV+Fall2017/


Response

{
  "course_key": "course-v1:McKA+BBV+Fall2017",
  "completion": {
    "earned": 3.0,
    "possible": 9.0,
    "percent": 0.33333333
  }
}

All courses for a user with extra requested fields

Request

GET /api/completion/v1/courses/?requested_fields=sequential,mean

Response

{
  "pagination": {"values": "TBD"},
  "data": [
    {
      "course_key": "course-v1:McKA+BBV+Fall2017",
      "completion": {
        "earned": 3.0,
        "possible": 9.0,
        "percent": 0.33333333
      },
      "mean": 0.6,
      "sequential": [
        {
          "usage_key": "block-v1:McKA+BBV+Fall2017+type@sequential+block@week-1",
          "completion": {
            "earned": 3.0,
            "possible": 3.0,
            "percent": 1.0
          }
        },
        {
          "usage_key": "block-v1:McKA+BBV+Fall2017+type@sequential+block@week-2",
          "completion": {
            "earned": 0.0,
            "possible": 6.0,
            "percent": 0.0
          }
        }
      ]
    },
    {
      "course_key": "course-v1:McKA+Drive+Fall2017",
      "completion": {
        "earned": 0.0,
        "possible": 2.0,
        "percent": 0.0
      },
      "mean": 0.95,
      "sequential": [
        {
          "usage_key": "block-v1:McKA+Drive+Fall2017+type@sequential+block@thewholething",
          "completion": {
            "earned": 0.0,
            "possible": 2.0,
            "percent": 0.0
          }
        }
      ]
    }
  ]
}


All courses for a user without requested fields

Request

GET /api/completion/v1/courses/

Response

{
  "pagination": {"values": "TBD"},
  "data": [
    {
      "course_key": "course-v1:McKA+BBV+Fall2017",
      "completion": {
        "earned": 3.0,
        "possible": 9.0,
        "percent": 0.33333333
      }
    },
    {
      "course_key": "course-v1:McKA+Drive+Fall2017",
      "completion": {
        "earned": 0.0,
        "possible": 2.0,
        "percent": 0.0
      }
    }
  ]
}

Leaderboard for a course

Request

GET /api/completion/v1/leaders/course-v1:McKA+BBV+Fall2017/

Response

{
  "results": [
    {"username": "akiko", "rank": 1, "completion": {"earned": 100.0, "possible": 100.0, "percent": 1.0}},
    {"username": "benazir", "rank": 2, "completion": {"earned": 98.0, "possible": 100.0, "percent": 0.98}},
    {"username": "clyde", "rank": 2, "completion": {"earned": 98.0, "possible": 100.0, "percent": 0.98}},
    {"username": "d'artagnan", "rank": 4, "completion": {"earned": 92.0, "possible": 100.0, "percent": 0.92}}
  ]
}
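
The sample leaderboard response gives tied users the same rank and then skips the next rank (1, 2, 2, 4). One way to produce that ordering is sketched below. The function names and the input format are hypothetical, and the proposal does not specify a tie-breaking rule, so treat this as one possible reading of the sample.

```python
def percent(earned, possible):
    """earned / possible, with 1.0 when there is nothing to complete."""
    return 1.0 if possible == 0 else earned / possible


def rank_leaders(rows):
    """Rank (username, earned, possible) rows by completion percentage.

    Tied users share a rank and the following rank is skipped.
    """
    # sorted() is stable, so tied users keep their input order.
    ordered = sorted(rows, key=lambda row: percent(row[1], row[2]), reverse=True)
    results = []
    for index, (username, earned, possible) in enumerate(ordered, start=1):
        pct = percent(earned, possible)
        if results and results[-1]["completion"]["percent"] == pct:
            rank = results[-1]["rank"]  # same percentage as the previous user
        else:
            rank = index
        results.append({
            "username": username,
            "rank": rank,
            "completion": {"earned": earned, "possible": possible, "percent": pct},
        })
    return results


rows = [
    ("clyde", 98.0, 100.0),
    ("akiko", 100.0, 100.0),
    ("d'artagnan", 92.0, 100.0),
    ("benazir", 98.0, 100.0),
]
for entry in rank_leaders(rows):
    print(entry["rank"], entry["username"])
```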