Data Loss and Recovery from Migrations and Rollback

Context

The system is very likely to lose data when a Django migration removes a table or drops columns. Most of the time, this is intentional. However, if the migration needs to be rolled back, it is important to recover the lost data so that the overall system returns to its previous state.

This document provides a runbook of recommended steps to prepare before the migration change merges and deploys.

Preparations

Go to the Django Admin of the edx-platform instance in each of your environments (for example: test, stage, and prod). Identify the tables whose data will be removed. You might need to update your own Django admin permissions. For edx.org, please refer to . Make sure you record the data in those tables.

  1. For example, the PR that removes the waffle flag for course persistent grades removes the data at hostname/admin/grades/coursepersistentgradesflag/. For stage, go to https://internal.courses.stage.edx.org/admin/grades/coursepersistentgradesflag/ and record the data.

  2. Do the same for the production environment.

  3. Do the same for the edge environment.
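If shell access to an environment is available, the recording step could also be scripted instead of copied by hand from Django Admin. The following is a minimal sketch, not the documented procedure: it only builds the argument list for Django's `dumpdata` management command. The `./manage.py lms` invocation style, the `backups` directory, the environment names, and the model label `grades.coursepersistentgradesflag` (inferred from the admin URL in the example above) are all assumptions to adjust for your deployment.

```python
from __future__ import annotations

from pathlib import Path

# Hypothetical environment names, for illustration only.
ENVIRONMENTS = ["stage", "prod", "edge"]

# Assumed model label, inferred from the admin URL path in the example.
MODEL_LABEL = "grades.coursepersistentgradesflag"


def backup_path(environment: str, model_label: str, backup_dir: str = "backups") -> Path:
    """Return the fixture file path for one model in one environment."""
    return Path(backup_dir) / environment / f"{model_label}.json"


def build_dumpdata_command(model_label: str, output: Path) -> list[str]:
    """Build the argument list for Django's `dumpdata` management command.

    Assumes the `./manage.py lms ...` invocation style; change the prefix
    if your instance is started differently.
    """
    return [
        "./manage.py", "lms", "dumpdata", model_label,
        "--indent", "2",
        "--output", str(output),
    ]


if __name__ == "__main__":
    for env in ENVIRONMENTS:
        path = backup_path(env, MODEL_LABEL)
        # Print rather than execute: each command has to run inside that environment.
        print(env, " ".join(build_dumpdata_command(MODEL_LABEL, path)))
```

The commands are printed rather than run so they can be reviewed first. Keep the resulting fixture files somewhere outside the database being migrated.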

Merge and deploy

Merge the code PR first. Ensure the tables and columns to be removed are unaffected at this stage. Watch the code change go through to production. Perform manual smoke tests on the system to gain confidence that nothing went wrong.

Merge the migration PR next. Let it go through the environments using your deployment pipeline. Perform manual smoke tests on your system to gain confidence that nothing broke.

Monitor the logs and your integrated monitoring system to ensure that no major errors occur because of the database table and column removal.

Revert/Roll back

If a revert and rollback is needed, follow the instructions in this doc.

Then restore, in all environments, the associated data that was removed and that you recorded as part of the Preparations step above.
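If the data was recorded as a Django JSON fixture (a list of objects with `model`, `pk`, and `fields` keys), the restore and a follow-up check could be sketched as below. This is a hedged illustration, not the documented procedure: the `./manage.py lms` invocation style, the fixture path, and the sample rows are assumptions, and `missing_records` is a hypothetical helper. Django's `loaddata` command only works once the reverted code and schema are back in place.

```python
from __future__ import annotations

from pathlib import Path


def build_loaddata_command(fixture: Path) -> list[str]:
    """Build the argument list for Django's `loaddata` management command.

    Assumes the `./manage.py lms ...` invocation style.
    """
    return ["./manage.py", "lms", "loaddata", str(fixture)]


def missing_records(recorded: list[dict], current: list[dict]) -> list[dict]:
    """Return recorded fixture entries whose (model, pk) pair is absent from `current`.

    Both arguments are parsed Django JSON fixtures: lists of dicts with
    "model", "pk", and "fields" keys.
    """
    present = {(row["model"], row["pk"]) for row in current}
    return [row for row in recorded if (row["model"], row["pk"]) not in present]


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    recorded = [
        {"model": "grades.coursepersistentgradesflag", "pk": 1, "fields": {"enabled": True}},
        {"model": "grades.coursepersistentgradesflag", "pk": 2, "fields": {"enabled": False}},
    ]
    after_restore = [recorded[0]]  # pretend only the first row came back

    # Hypothetical fixture path.
    fixture = Path("backups") / "stage" / "grades.coursepersistentgradesflag.json"
    print(" ".join(build_loaddata_command(fixture)))
    print("still missing:", [row["pk"] for row in missing_records(recorded, after_restore)])
```

To run the check, take a fresh dump after the restore and compare it with the one recorded during Preparations. An empty result from `missing_records` suggests every recorded row is present again. The helper compares only primary keys, not field values.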