How to Configure and Drive the User Retirement Workflow

Overview

The user retirement workflow is a configurable pipeline of building-block APIs used to “forget” a user’s personally identifiable information (PII), prevent that user from logging back in, and prevent re-use of their username. The process is complicated, potentially involving several internal and external services that any given OpenEdx install may or may not use, and it allows site-specific stages to be injected to handle custom use cases.

The workflow is designed to be linear and re-runnable, allowing recovery and continuation in cases where a particular stage fails. The LMS is the authoritative source for where a user is in the retirement process, via the UserRetirementStatus model and associated APIs. UserRetirementStatus uses RetirementState objects to track progress through the workflow.

Configuring RetirementStates

The discrete “states” that make up the “stages” of the retirement workflow are represented in the RetirementState model. That model is populated via the populate_retirement_states Django management command, which in turn relies on the `RETIREMENT_STATES` LMS Django setting. The setting can be overridden in configuration to fit the process to your workflow. The states can also be configured via Django Admin, though this is not recommended because it is difficult to keep environments in sync and the sort order must be managed by hand.

The RetirementStates fall into a few different categories, which you should be familiar with before trying to set up a retirement workflow. Here’s a small example workflow configuration (as it would appear in Django settings) to refer to as we break them down:

RETIREMENT_STATES = [
  'PENDING',
  'LOCKING_ACCOUNT',
  'LOCKING_COMPLETE',
  'RETIRING_EMAIL_LISTS',
  'EMAIL_LISTS_COMPLETE',
  'RETIRING_ENROLLMENTS',
  'ENROLLMENTS_COMPLETE',
  'RETIRING_LMS',
  'LMS_COMPLETE',
  'ERRORED',
  'ABORTED',
  'COMPLETE',
]

These basic stages could be used to retire a user entirely from an LMS that didn’t use any other OpenEdx deployed applications (forums, ecommerce, notes, etc.).

  • The “PENDING”, “ERRORED”, “ABORTED”, and “COMPLETE” states are required. The retirement process won’t work without them, and attempting to configure your LMS’ RetirementStates via the management command without them will fail.
    • The “PENDING” state is the state a UserRetirementStatus row is created in, signaling that the user has requested retirement.
    • The “ERRORED”, “ABORTED”, and “COMPLETE” states are special “dead end” states. A user cannot be moved from one of these states to any other state via the API. Manual intervention is required to return one of these users to the workflow.
    • “ABORTED” is a special state that allows a user to cancel their retirement request if they change their mind. Currently no UI is planned to support this; it would have to be done manually by, for example, a customer service representative via Django admin.
  • The working states (in the example: “LOCKING_ACCOUNT”, “RETIRING_EMAIL_LISTS”, “RETIRING_ENROLLMENTS”, and “RETIRING_LMS”) indicate work in progress. While a UserRetirementStatus is in one of these states it is assumed that the workflow driver is actively taking action of some type to retire the user.
  • The completed states (in the example: “LOCKING_COMPLETE”, “EMAIL_LISTS_COMPLETE”, “ENROLLMENTS_COMPLETE”, and “LMS_COMPLETE”) indicate that the user is ready to begin the next stage.
  • Each working + completed pair constitutes a “stage”, as referred to in this document.
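
The categories above can be sketched in a few lines of Python. This is only an illustration: the names REQUIRED_STATES and split_into_stages are hypothetical and not part of the LMS, and the pairing rule (each working state immediately followed by its completed state) is an assumption taken from the example configuration.

```python
RETIREMENT_STATES = [
    'PENDING',
    'LOCKING_ACCOUNT',
    'LOCKING_COMPLETE',
    'RETIRING_EMAIL_LISTS',
    'EMAIL_LISTS_COMPLETE',
    'RETIRING_ENROLLMENTS',
    'ENROLLMENTS_COMPLETE',
    'RETIRING_LMS',
    'LMS_COMPLETE',
    'ERRORED',
    'ABORTED',
    'COMPLETE',
]

# Hypothetical constant: the states this document describes as required.
REQUIRED_STATES = ('PENDING', 'ERRORED', 'ABORTED', 'COMPLETE')


def split_into_stages(states):
    """Return a list of (working_state, completed_state) pairs.

    Assumes every non-required state belongs to a working/completed pair,
    listed in that order, as in the example configuration.
    """
    missing = [s for s in REQUIRED_STATES if s not in states]
    if missing:
        raise ValueError('Missing required states: %s' % ', '.join(missing))
    middle = [s for s in states if s not in REQUIRED_STATES]
    if len(middle) % 2 != 0:
        raise ValueError('Working and completed states must come in pairs')
    return list(zip(middle[0::2], middle[1::2]))


stages = split_into_stages(RETIREMENT_STATES)
for working, completed in stages:
    print('%s -> %s' % (working, completed))
```

With the example configuration this yields four stages, the first being LOCKING_ACCOUNT paired with LOCKING_COMPLETE.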

The states must be executed in the order they are listed in the configuration. Attempting to move to an earlier state using the API will result in an error, though a user can be shifted to any state manually via the Django admin.
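
The ordering rule can be modelled with a small check like the one below. It is a sketch of the rules as described in this document, not the LMS implementation: the function and exception names are hypothetical, the shortened state list is for illustration, and behavior the text does not specify (such as re-entering the current state) is simply allowed here.

```python
RETIREMENT_STATES = [
    'PENDING',
    'LOCKING_ACCOUNT',
    'LOCKING_COMPLETE',
    'RETIRING_LMS',
    'LMS_COMPLETE',
    'ERRORED',
    'ABORTED',
    'COMPLETE',
]

DEAD_END_STATES = ('ERRORED', 'ABORTED', 'COMPLETE')


class InvalidTransition(Exception):
    """Hypothetical error type for a transition this sketch rejects."""


def validate_transition(current_state, new_state, states=RETIREMENT_STATES):
    """Raise InvalidTransition if the move breaks the documented rules."""
    # Dead end states cannot be left via the API.
    if current_state in DEAD_END_STATES:
        raise InvalidTransition(
            '%s is a dead end state; manual intervention is required'
            % current_state)
    # Moving to a state listed earlier in the configuration is an error.
    if states.index(new_state) < states.index(current_state):
        raise InvalidTransition(
            'Cannot move backwards from %s to %s' % (current_state, new_state))


# A forward move passes silently; a backward move raises.
validate_transition('PENDING', 'LOCKING_ACCOUNT')
```

Because the dead end states are listed last, a move from any working state to “ERRORED” counts as a forward move and passes the check.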

Since the working and completed states are completely arbitrary, you are free to integrate any stages you need for your workflow anywhere between “PENDING” and the dead end states.

Reasons you may want to do this:

  • Adding in a manual stage where a person unsubscribes the user from a mailing list provider not integrated with LMS
  • Hitting custom LMS endpoints crafted to remove user data from models not handled by the default LMS endpoints
  • Hitting 3rd party services or other applications to perform retirements, or notify them of a user’s request to be forgotten
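
As an illustration, a site-specific stage could be added to the settings like this. The state names RETIRING_CUSTOM_MAILING_LIST and CUSTOM_MAILING_LIST_COMPLETE are hypothetical, chosen only for this sketch; the workflow driver would also need to know what action to take for the new working state.

```python
RETIREMENT_STATES = [
    'PENDING',
    'LOCKING_ACCOUNT',
    'LOCKING_COMPLETE',
    # Hypothetical custom stage: unsubscribe the user from a mailing list
    # provider that is not integrated with the LMS.
    'RETIRING_CUSTOM_MAILING_LIST',
    'CUSTOM_MAILING_LIST_COMPLETE',
    'RETIRING_LMS',
    'LMS_COMPLETE',
    # Dead end states stay at the end of the list.
    'ERRORED',
    'ABORTED',
    'COMPLETE',
]
```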

Driving the Workflow

OpenEdx does not provide an out-of-the-box solution for driving the workflow at this time, though one could be easily written in LMS using the various retirement models. A Jenkins Pipeline DSL workflow is in progress and is expected to be released when complete to provide a framework that follows what will be used on edx.org. Right now an external service is expected to handle the following tasks:

  1. Get a list of users who need to be advanced in their retirements
  2. Understand the order of the retirement stages and which API calls are to be made for each, along with the appropriate authentication
  3. Update LMS as to the user’s progress on each stage
    1. Updating the UserRetirementStatus to a “working” state for each stage before communicating with the appropriate API
    2. Updating the UserRetirementStatus to a “completed” or “ERRORED” state for each stage when the API call is complete

To facilitate those responsibilities, the LMS provides retirement APIs, including the retirement_queue and update_retirement_status calls referenced in the steps below.

The assumed workflow for a user’s retirement looks something like this:

  1. The user requests retirement via LMS (work for this UI is in progress).
  2. A UserRetirementStatus row is created in the default, required, state “PENDING” (generally by using the create_retirement method on the UserRetirementStatus model).
  3. The workflow driver, running on a periodic timer, calls retirement_queue for users in the “PENDING” state. The user is returned as part of a larger list.
    1. If a “cool down” period is desired before advancing the user through retirement (in case they wish to change their mind, or as a security measure) this call is configured to accept the number of days to cool down as a parameter.
  4. The workflow driver advances the user to the next “working” state in the workflow via update_retirement_status.
  5. The workflow driver, using its own configuration mapping states to actions, calls the appropriate API, or takes other appropriate action, for the given “working” state and captures the result.
  6. The workflow driver advances the user to the appropriate “completed” state for the previous “working” state via update_retirement_status, or to “ERRORED”, passing in the captured result to be logged.
  7. Steps 4-6 are repeated for all stages in the workflow.
  8. When the final stage is completed the workflow driver updates the user to the “COMPLETE” state via update_retirement_status.
  9. Eventually the UserRetirementStatus row is deleted by a separate operation (to be determined).
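
The driver loop in steps 3-8 might look roughly like the sketch below. It is not the edx.org driver. FakeLMS is an in-memory stand-in written for this example, and the method signatures on it are assumptions; the real retirement_queue and update_retirement_status APIs are remote calls whose parameters and authentication are not shown here.

```python
class FakeLMS:
    """In-memory stand-in for the LMS retirement APIs (illustration only)."""

    def __init__(self, usernames):
        self.rows = {
            name: {'current_state': 'PENDING', 'last_state': None,
                   'responses': []}
            for name in usernames
        }

    def retirement_queue(self, states):
        """Return usernames whose current state is one of `states`."""
        return [name for name, row in self.rows.items()
                if row['current_state'] in states]

    def update_retirement_status(self, username, new_state, response):
        """Move a user to `new_state` and log `response`."""
        row = self.rows[username]
        row['last_state'] = row['current_state']
        row['current_state'] = new_state
        row['responses'].append(response)


def retire_user(lms, username, stages):
    """Run one user through every stage; return True on success.

    `stages` is a list of (working_state, completed_state, action) tuples,
    where `action` is a callable taking the username.
    """
    for working, completed, action in stages:
        lms.update_retirement_status(username, working,
                                     'Starting %s' % working)
        try:
            result = action(username)
        except Exception as exc:  # any failure sends the user to ERRORED
            lms.update_retirement_status(
                username, 'ERRORED', 'Error in %s: %s' % (working, exc))
            return False
        lms.update_retirement_status(username, completed, str(result))
    lms.update_retirement_status(username, 'COMPLETE', 'Retirement complete')
    return True


def run_once(lms, stages):
    """One pass of the periodic driver over all PENDING users."""
    for username in lms.retirement_queue(['PENDING']):
        retire_user(lms, username, stages)
```

A driver built this way keeps the state-to-action mapping on its own side, as step 5 describes, and always reports the captured result back so it ends up in the status log.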

Debugging the Workflow

Site operators should monitor for rows stuck in “working” states for unusually long periods of time, and either set off monitoring alarms or automatically move those rows to “ERRORED”, to catch cases where the driver itself fails mid-pipeline. This check can be done by monitoring the database directly, or by calling the retirement_queue API with a comma-separated “states” parameter listing all of the working state names and using the ‘updated’ field returned to find the last time each row was touched. Similarly, rows ending up in the “ERRORED” state should raise alarms, as they will need to be manually moved (via the Django admin) to an appropriate state to finish the retirement.
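
A stuck-row check along these lines could be sketched as follows. The row shape (dictionaries with 'username', 'current_state', and a timezone-aware 'updated' datetime) is an assumption made for this example, as are the function name and the choice of threshold; adapt them to however your monitoring reads the data.

```python
from datetime import datetime, timedelta, timezone

# Working states from the example configuration in this document.
WORKING_STATES = [
    'LOCKING_ACCOUNT',
    'RETIRING_EMAIL_LISTS',
    'RETIRING_ENROLLMENTS',
    'RETIRING_LMS',
]


def find_stuck_rows(rows, max_age, now=None):
    """Return rows in a working state not touched for longer than max_age."""
    if now is None:
        now = datetime.now(timezone.utc)
    return [
        row for row in rows
        if row['current_state'] in WORKING_STATES
        and now - row['updated'] > max_age
    ]


# Example with a fixed clock so the outcome does not depend on when it runs.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
rows = [
    {'username': 'alice', 'current_state': 'RETIRING_LMS',
     'updated': now - timedelta(days=3)},
    {'username': 'bob', 'current_state': 'RETIRING_LMS',
     'updated': now - timedelta(minutes=5)},
    {'username': 'carol', 'current_state': 'PENDING',
     'updated': now - timedelta(days=30)},
]
stuck = find_stuck_rows(rows, max_age=timedelta(hours=6), now=now)
```

Here only the first row is reported: the second was touched recently, and the third is not in a working state.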

UserRetirementStatus contains some information that may help in debugging stuck users, or users that have ended up in the ERRORED state, namely these columns:

  • updated: Timestamp of the last time that a row was touched by the system.
  • last_state: The RetirementState that the row was in before being moved to the current state. When a row is moved to ERRORED, this should indicate the stage that failed.
  • responses: A text field that contains a log of data sent along with every call to update_retirement_status. Well-written pipeline stages should return useful error messages on failures, and workflow drivers should be sure to pass this information back for storage when updating the user state.