This document covers what you need to know about Django migrations in the Open edX platform.  If you are unfamiliar with database migrations or, specifically, Django migrations, please read the reference documentation.  Django does a good job of abstracting away what happens behind the scenes when you run "./manage.py makemigrations" and "./manage.py migrate"; understanding what happens during these operations is extremely useful.

Reference documentation

When in doubt

If you have non-trivial migrations to apply, or if two non-local environments (e.g. stage and production) have different migration states, describe your situation in the #django Slack channel and go talk to the SRE team before doing anything else.  Similarly, migrations can become complicated when two different people create two different migrations around the same time.  When in doubt, post in #django and talk to SRE.

It is often useful to review, or to share with others, the SQL generated by a migration. See the sqlmigrate documentation for details.

Don't revert code that includes migrations, don't change old migrations.

Django migrations should be considered "applied" as soon as they land on master of a repo. Missing ("ghost") migrations cause problems for Django and require manual intervention to fix.  Fix forward on migrations. Be sure to properly consider all the points below so that you're less likely to want to delete or change a migration.  If you do delete a migration, or revert a commit that contains a migration, follow this guide to communicating with the organization and the community: How to revert a migration from master.  Also, you should never roll back migrations (manually) without rolling back the code that relies on them.
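To make the ghost-migration problem concrete, here is a minimal sketch in plain Python (no Django required). The function name find_ghost_migrations and the sample migration names are hypothetical. The only idea it illustrates is that a ghost is a migration the database has recorded as applied but whose file no longer exists in the code.

```python
def find_ghost_migrations(applied, on_disk):
    """Return migrations recorded as applied in the database but missing from the code.

    Both arguments are iterables of (app_label, migration_name) pairs.
    """
    return sorted(set(applied) - set(on_disk))


# Hypothetical state: 0003 was applied in production, then its commit was reverted.
applied_in_db = [
    ("courseware", "0001_initial"),
    ("courseware", "0002_add_field"),
    ("courseware", "0003_drop_field"),
]
files_in_repo = [
    ("courseware", "0001_initial"),
    ("courseware", "0002_add_field"),
]

ghosts = find_ghost_migrations(applied_in_db, files_in_repo)
print(ghosts)
```

Any non-empty result is a state that needs manual intervention, which is why fixing forward is preferred over reverting.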

If you still think you need to change old migrations, and you want to verify that there isn't an alternative, see the "When in doubt" section.

Also, if you absolutely must change old migrations (this includes things like squashing), after merging and verifying your migration changes, you should update the SQL that is used to populate devstack during provisioning. This prevents conflicts during future provisioning. To update the SQL, run ./update-dbs-init-sql-scripts.sh in the devstack repository. This should update edxapp.sql, edxapp_csmh.sql, and ecommerce.sql. Create a PR with the updated SQL and merge it to devstack as usual.

Don't change the parent of a migration

Along the same lines as a migration being considered "applied" once merged into master, you should never change the dependencies of a migration once it has landed on master.  Doing so will cause real problems, and probably downtime, for whichever environment it is deployed to.  When you create new migrations in a feature branch, you want those to be the most recent migrations when you merge into master.  Using an analogy to git, you always want your new migrations to be at the "HEAD" of the migration history in your app.
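The "HEAD" idea can be checked mechanically. The following is a rough sketch in plain Python, not Django's own implementation: find_heads and the migration names are hypothetical, and the dependency graph is reduced to a dictionary. An app whose history has more than one head has two migrations competing to be the most recent.

```python
def find_heads(dependencies):
    """Return migrations that no other migration depends on.

    `dependencies` maps each migration name to the list of migrations it depends on.
    """
    parents = {parent for deps in dependencies.values() for parent in deps}
    return sorted(name for name in dependencies if name not in parents)


# Hypothetical app history: two branches both built a migration on top of 0002.
graph = {
    "0001_initial": [],
    "0002_add_email": ["0001_initial"],
    "0003_add_phone": ["0002_add_email"],    # merged to master first
    "0003_add_address": ["0002_add_email"],  # created in parallel on a feature branch
}

heads = find_heads(graph)
print(heads)
```

In this sketch the feature branch's migration should be regenerated on top of 0003_add_phone before merging, rather than editing the dependencies of a migration that is already on master.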

Deployment and backward-compatible migrations

Here at edX, we use the blue-green deployment method. The important detail about this deployment method is that, for some period of time, traffic is going to both the old code and new code. That detail is especially important when deploying database migrations that alter database columns and tables in a manner that is not backward-compatible with the previous release.

Let's go through a couple examples with our user table, auth_user. It has a few different columns, but we'll use the full_name column for the examples.

Say we decide to change the column's name from full_name (with an underscore) to fullname (no underscore). Our code in production is using full_name. When it's time to deploy this new release, we simply generate a migration and deploy it. Since we are using blue-green deployments, our old code is still looking for the original column name, full_name. However, the new deployment changed the name to fullname, so the original code starts failing.

Instead of renaming the column, say we delete it completely. Again, the database is modified when we deploy, and the original code that is still running will fail.

Because we operate in an environment where new and old code are running simultaneously against the same database, new code must always be compatible with the older database schema. Newer deployments can add tables and columns, but neither can be deleted unless the old code is no longer referencing the deleted tables or columns.
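The rename failure described above can be reproduced in miniature. This is only a sketch: it uses an in-memory SQLite database as a stand-in for the production database, the helper old_code_can_read is hypothetical, and the rename is simulated by rebuilding the table rather than by a real Django migration.

```python
import sqlite3


def old_code_can_read(conn):
    """Simulate the old release, which still selects the full_name column."""
    try:
        conn.execute("SELECT full_name FROM auth_user").fetchall()
        return True
    except sqlite3.OperationalError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO auth_user (full_name) VALUES ('Ada Lovelace')")

works_before = old_code_can_read(conn)

# Stand-in for the rename migration: rebuild the table with the new column name.
conn.execute("CREATE TABLE auth_user_new (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO auth_user_new (id, fullname) SELECT id, full_name FROM auth_user")
conn.execute("DROP TABLE auth_user")
conn.execute("ALTER TABLE auth_user_new RENAME TO auth_user")

works_after = old_code_can_read(conn)
print(works_before, works_after)
```

The old query succeeds before the schema change and fails afterwards, which is the situation the old release is in while it still receives traffic during a blue-green deployment.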

Migration Unit Test

In the edx-platform codebase there is a unit test, test_migrations_are_in_sync in test_db.py, which ensures that Django migrations and models are in sync. Migrations to drop columns or tables generally require at least two releases: one which removes references and one which has the drop migration. The first release will fail the unit test. For this reason you will need to skip that unit test during your release sequence and restore it when you are done. This also applies to libraries used by edx-platform, such as edx-proctoring: the test will fail when edx-platform receives the interim version.

The skip should include a ticket number and brief info on what it's for:

    @unittest.skip(
        "Temporary skip for TICKET-1234 while the fnord column is removed from the snood table"
    )
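For context, here is a minimal, self-contained sketch of how the skip might be applied. The class name MigrationSyncTests and the test body are hypothetical stand-ins for the real test in test_db.py; only the decorator and its message come from the example above.

```python
import unittest


class MigrationSyncTests(unittest.TestCase):
    # Hypothetical stand-in for the real test in test_db.py.
    @unittest.skip(
        "Temporary skip for TICKET-1234 while the fnord column is removed from the snood table"
    )
    def test_migrations_are_in_sync(self):
        self.fail("Would fail while models and migrations are intentionally out of sync")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(MigrationSyncTests)
result = unittest.TestResult()
suite.run(result)
print(len(result.skipped), result.wasSuccessful())
```

The skip reason is reported with the test results, so the ticket number stays visible until the skip is removed in the follow-up release.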

How to drop a column

Nullable/Non-Nullable Columns

For either a nullable or non-nullable column, first make sure there are no other models or code that actually use the column. If there are, make adjustments to those first before working through the deletion flow.

For NULLABLE columns, this involves TWO releases:

  1. Remove all usages of the column, including updating the model to not refer to the field/column anymore (i.e. Model field must be removed in this step)
    1. If this change is in the edx-platform codebase, add a skip to the test_migrations_are_in_sync unit test.
  2. Drop the column (with a migration).
    1. If this change is in the edx-platform codebase, remove the skip from the test_migrations_are_in_sync unit test.

For NOT-NULLABLE columns, this involves THREE releases:

  1. Update the model and generate a migration making the column nullable (`null=True`)
  2. Remove all usages of the column, including updating the model to not refer to the field/column anymore (i.e. Model field must be removed in this step)
    1. If this change is in the edx-platform codebase, add a skip to the test_migrations_are_in_sync unit test.
  3. Drop the column (with a migration).
    1. If this change is in the edx-platform codebase, remove the skip from the test_migrations_are_in_sync unit test.
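One way to see why a NOT-NULLABLE column needs the extra first release: once the model field is removed, the new code inserts rows without supplying a value for the column, and the database rejects those rows if the column is still NOT NULL with no default. The sketch below illustrates this with an in-memory SQLite database as a stand-in. The helper new_code_can_insert and the username column are assumptions for illustration, and production databases may report the error differently.

```python
import sqlite3


def new_code_can_insert(column_definition):
    """Simulate code that no longer knows about full_name inserting a row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE auth_user ("
        f"id INTEGER PRIMARY KEY, username TEXT NOT NULL, {column_definition})"
    )
    try:
        # The new code omits full_name entirely because the model field is gone.
        conn.execute("INSERT INTO auth_user (username) VALUES ('ada')")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


still_not_null = new_code_can_insert("full_name TEXT NOT NULL")
made_nullable = new_code_can_insert("full_name TEXT")
print(still_not_null, made_nullable)
```

The insert only succeeds once the column accepts NULL, which is what the first of the three releases provides.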

Returning to our example with the auth_user table. If we still want to drop the full_name column, we should do the following:

  1. Remove every usage of the full_name column in our codebase. Skip the unit test. Release that change to production, and ensure older code is no longer running. (We once had a stale ASG in production a few hours after a release, and it caused a few issues when we dropped a column.)
  2. Create a database migration to drop the column. Restore the unit test. Release it.
  3. (This step intentionally left blank...because nothing broke in production!)

ManyToManyField Columns

When dropping ManyToManyField columns, consider that the Django ORM uses a complicated automatic mapping to map the field to certain model names. So unlike other columns where it's easy to remove all usages of the column, unexpected column usages can still occur via the Django ORM manager (such as django.db.models.fields.related_descriptors.create_reverse_many_to_one_manager). So the field removal and the migration should be two separate steps.

So, for dropping ManyToManyField columns, use at least TWO releases:

  1. Remove the ManyToManyField field from the model, while skipping the test_migrations_are_in_sync unit test. Deploy.
  2. Add the migration which actually removes the ManyToManyField from the DB (which is actually implemented via a separate table) and unskip the test_migrations_are_in_sync unit test. Deploy.

Failing to separate the two steps may result in the field being used by the old code during the blue-green deployment after the migration has been performed, resulting in production errors.

How to rename a column

Renaming a column while keeping the business logic fully functional and without taking any downtime is a very delicate and complex process.  Some things to keep in mind before you start:

  1. Do not allow downtime or alter business logic between releases.
  2. Do not allow downtime or alter business logic during a release, i.e. after migrations and before code deployment.
  3. Do not allow any data to be permanently dropped, even if only a subset of the data.
  4. Every release must have a functional rollback plan.
  5. As best as possible, avoid releases that must be immediately followed up by another release.  It should be safe to walk away from the rollout halfway through (e.g. code freezes, vacations, etc. might stop work).

THREE releases:

  1. Release:
  2. Release:
  3. Release:

How to drop a table


Due to the current workings of the Open edX ecosystem, some 2U-specific required steps are part of this process to avoid unnecessary problems for 2U, and the community as a whole.

Pre-requirements:

TWO releases (after pre-requirements):

  1. Remove all references to the table by removing references to the model and the model itself
    1. If this change is in the edx-platform codebase, add a skip to the test_migrations_are_in_sync unit test.
  2. Remove the table with a migration
    1. Important: If 2U has determined that their data can't be lost during pre-requirements above, the merge should be timed with a pipeline pause so the delete migration can be faked.
      1. This should be a rare occurrence, but it has happened.
    2. Remove the skip if you added one

Once a table is removed:

How to delete a Django app containing tables

See Removing a Djangoapp from an existing project.

Mathematical perspective: Database Expansion/Contraction

A good way to think of this is that migrations can "expand" and "contract" the database. Adding fields is an expansion, and removing them is a contraction. If you're feeling a bit more mathematical today, there's a partial ordering relation on (db, code) where your database and code are in the relation iff the set of fields in the DB is a (non-strict) superset of the fields in the code (... well, this isn't quite right, since changing fields is OK in circumstances like extending the length of a CharField. Defining the relation precisely is left as an exercise to the reader).  Under this model, changing a field (say, from a plain CharField to an EmailField) would consist of an expansion (adding the EmailField) followed later (potentially much later, but mainly not in the same release) by a contraction (deleting the CharField). Code can also expand and contract in similar ways, by changing which fields are declared in your Django models.

For a migration to be backwards compatible, the database must always be at least as "large" as the code. It can be larger (contain a field not referenced by the code), but not smaller.
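The "at least as large" rule can be written as a simple subset check. This is a simplified sketch that ignores field changes (such as lengthening a CharField); is_backward_compatible and the field sets are hypothetical names for illustration.

```python
def is_backward_compatible(db_fields, code_fields):
    """True when every field the code references exists in the database."""
    return set(code_fields) <= set(db_fields)


old_code = {"id", "username", "full_name"}
new_code = {"id", "username"}  # contracted code: full_name is no longer referenced

db_before = {"id", "username", "full_name"}
db_after_drop = {"id", "username"}  # contracted database

checks = [
    is_backward_compatible(db_before, old_code),      # old code, old schema
    is_backward_compatible(db_before, new_code),      # new code, old schema (db is "larger")
    is_backward_compatible(db_after_drop, new_code),  # new code, contracted schema
    is_backward_compatible(db_after_drop, old_code),  # old code, contracted schema
]
print(checks)
```

Only the last combination fails, which is why the database contraction must wait until no running release still references the field.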

Data migrations

If you're writing a data migration, don't import the model directly. Instead, allow Django to use the historical version of your model. This will allow your migration step to use the old (historical) version of your model, even if the model will later be changed by a subsequent database migration.


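from django.db import migrations

def combine_names(apps, schema_editor):
    # We can't import the Person model directly as it may be a newer
    # version than this migration expects. We use the historical version.
    Person = apps.get_model('yourappname', 'Person')
    for person in Person.objects.all():
        person.name = '%s %s' % (person.first_name, person.last_name)
        person.save()

class Migration(migrations.Migration):
    # Placeholder dependency; point this at your app's latest migration.
    dependencies = [('yourappname', '0001_initial')]
    operations = [migrations.RunPython(combine_names)]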

Deployment and migrations for large or problematic tables

First read "Deployment and backward-compatible migrations" for general information about handling blue-green deployments. Then see this section for special considerations for large or problematic tables.

How to add a nullable column to an existing table (AWS Aurora)

When using AWS Aurora, a nullable column can be added to existing large tables (roughly 100k rows or more) without causing downtime. However, the migration may still time out in GoCD, so please coordinate the release with SRE.

  1. Add the new field to the model with null=True.
  2. Generate the model-change migrations locally.
  3. Create a pull request containing the model change and newly-generated migrations.
  4. Merge the migration pull request.
  5. The release process will run the migration and add the nullable column to the table.
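Adding a nullable column is an expansion, so code that does not know about the column keeps working. The sketch below shows that property using an in-memory SQLite database. It is only an illustration: SQLite stands in for Aurora, the nickname column is a hypothetical example, and nothing here says anything about Aurora's locking or timing behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
conn.execute("INSERT INTO auth_user (username) VALUES ('ada')")

# Stand-in for the generated migration: add a column that allows NULL
# (the schema-level effect of null=True on the model field).
conn.execute("ALTER TABLE auth_user ADD COLUMN nickname TEXT")

# Old code, unaware of the new column, keeps inserting exactly as before.
conn.execute("INSERT INTO auth_user (username) VALUES ('grace')")

rows = conn.execute("SELECT username, nickname FROM auth_user ORDER BY id").fetchall()
print(rows)
```

Both the pre-existing row and the row written by the old code end up with NULL in the new column, so neither release is broken by the change.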

NOTES:

AWS Aurora Docs: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/AuroraMySQL.Managing.FastDDL.html

How to add an index to an existing table (AWS Aurora)

On AWS Aurora, indexes can be built on large tables without causing downtime, but this requires SRE coordination as the migration may time out in GoCD.

  1. Add the new index on model fields.
  2. Generate the model-change migrations locally.
  3. Create a pull request containing the model change and newly-generated migrations.
  4. Merge the migration pull request.
  5. The release process will run the migration and add the index to the table.

Consider Making a New Model with a OneToOneField

Adding fields to large tables can cause operational issues. What is safe varies by database version (MySQL 5.7 vs. 8.0) and by specialized backend (like Aurora). Also, even if the database supports adding things in a non-locking way, Django's migrations framework may not understand how to formulate the right SQL to do so.

A lower-risk alternative is to create a new model and link it to the existing one with a OneToOneField. You can use the primary_key=True option in order to have the new table's primary key match the values of the parent table.

This does complicate the code somewhat, and you should be careful about avoiding n+1 queries by calling select_related. The benefit is not having to hold your breath when the migration rolls out, for fear that you just froze a heavily used table and brought down the site.
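At the database level, the pattern amounts to a side table whose primary key is also the link to the parent row. The following sketch shows that layout with an in-memory SQLite database. The user_extra table and nickname column are hypothetical, and the JOIN is only an approximation of what select_related produces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT NOT NULL)")
# Hypothetical side table: its primary key doubles as the link to auth_user,
# mirroring a OneToOneField declared with primary_key=True.
conn.execute(
    "CREATE TABLE user_extra ("
    " user_id INTEGER PRIMARY KEY REFERENCES auth_user (id),"
    " nickname TEXT)"
)
conn.executemany(
    "INSERT INTO auth_user (id, username) VALUES (?, ?)", [(1, "ada"), (2, "grace")]
)
conn.execute("INSERT INTO user_extra (user_id, nickname) VALUES (1, 'countess')")

# One JOIN fetches both tables at once instead of issuing one extra query
# per user (the n+1 pattern).
rows = conn.execute(
    "SELECT u.username, e.nickname FROM auth_user u"
    " LEFT JOIN user_extra e ON e.user_id = u.id ORDER BY u.id"
).fetchall()
print(rows)
```

The large auth_user table is never altered here. Users without a row in the side table simply come back with no extra data, so the calling code has to handle that case.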

Known large and/or problematic tables

Large tables

Contentious tables

Useful Checklists

Checklist for structural migrations

Existing Tables

New Tables

Checklist for data migrations

Checklist for adding indexes

Testing migrations

Unit testing

Migrations are currently not run in unit tests.

Acceptance tests

The paver commands that kick off the Lettuce and bokchoy tests run migrations. However, because this would take a long time if we started from scratch, we cache the latest state of the database after certain intervals (every couple months when someone checks in a new cache) so all the migrations are not run, but only the ones added since the last time the database state was cached.

Common migration tasks

Making a migration to create a new table

  1. Create a new directory under djangoapps and create a models.py file within it describing your model fields (example: common/djangoapps/track/models.py).
  2. Add the name of your module to INSTALLED_APPS in the appropriate environment file. Do NOT run manage.py syncdb (recommended in the Django documentation).

Once you are happy with how your fields are defined in models.py, run the following command. The resulting file will be checked in with your PR.

./manage.py [lms|cms] --settings=devstack_docker makemigrations --initial name_of_app


Making a migration to modify an existing table 

When you make changes to your model, create a migration file and check it in:

./manage.py [lms|cms] --settings=devstack_docker makemigrations name_of_app --pythonpath=.

Make sure you are pointing to the correct environment file.

Checking SQL for a migration

After creating your migration file, if you are running Open edX via the DevStack configuration, it is sometimes useful to review the SQL for your new migration:

./manage.py [lms|cms] --settings=devstack_docker sqlmigrate name_of_app number

Performing a migration

After creating your migration file, if you are running Open edX via the DevStack configuration, you can perform the migration using the following command:

./manage.py [lms|cms] --settings=devstack_docker migrate name_of_app

Rolling back a migration

./manage.py [lms|cms] --settings=devstack_docker migrate name_of_app <number>

where <number> is the prefix of the migration file that you want to roll back to.

Unapply all migrations

There's a special "zero" migration name to unapply all migrations, including the initial migration.

./manage.py [lms|cms] --settings=devstack_docker migrate name_of_app zero

Rare migration tasks

Faking migrations

The following example fakes the migration that changes the CSM primary key to bigint.

Do the following before merging/deploying the code; otherwise the pipeline will try to run the migrations.

Copy the migration file onto a machine (this can be a worker if the app has worker machines).

# Copy file contents into /edx/app/edxapp/edx-platform/lms/djangoapps/courseware/migrations/0011_csm_id_bigint.py on worker machine
# Check that the migration shows up in the list as unapplied

root@ip-10-3-71-92:/edx/app/edxapp/edx-platform# /edx/bin/edxapp-migrate-cms --noinput --list courseware
sudo: unable to resolve host ip-10-3-71-92
WARNING:py.warnings:/edx/app/edxapp/edx-platform/lms/djangoapps/courseware/__init__.py:7: DeprecationWarning: Importing 'lms.djangoapps.courseware' as 'courseware' is no longer supported
  warnings.warn("Importing 'lms.djangoapps.courseware' as 'courseware' is no longer supported", DeprecationWarning)

2019-08-30 18:08:47,084 WARNING 11141 [enterprise.utils] [user None] utils.py:55 - Could not import Registry from third_party_auth.provider
2019-08-30 18:08:47,084 WARNING 11141 [enterprise.utils] [user None] utils.py:56 - cannot import name _LTI_BACKENDS
courseware
 [X] 0001_initial
 [X] 0002_coursedynamicupgradedeadlineconfiguration_dynamicupgradedeadlineconfiguration
 [X] 0003_auto_20170825_0935
 [X] 0004_auto_20171010_1639
 [X] 0005_orgdynamicupgradedeadlineconfiguration
 [X] 0006_remove_module_id_index
 [X] 0007_remove_done_index
 [X] 0008_move_idde_to_edx_when
 [X] 0009_auto_20190703_1955
 [X] 0010_auto_20190709_1559
 [ ] 0011_csm_id_bigint
sudo: unable to resolve host ip-10-3-71-92
WARNING:py.warnings:/edx/app/edxapp/edx-platform/lms/djangoapps/courseware/__init__.py:7: DeprecationWarning: Importing 'lms.djangoapps.courseware' as 'courseware' is no longer supported
  warnings.warn("Importing 'lms.djangoapps.courseware' as 'courseware' is no longer supported", DeprecationWarning)

2019-08-30 18:08:52,683 WARNING 11392 [enterprise.utils] [user None] utils.py:55 - Could not import Registry from third_party_auth.provider
2019-08-30 18:08:52,684 WARNING 11392 [enterprise.utils] [user None] utils.py:56 - cannot import name _LTI_BACKENDS
courseware
 [X] 0001_initial
 [X] 0002_coursedynamicupgradedeadlineconfiguration_dynamicupgradedeadlineconfiguration
 [X] 0003_auto_20170825_0935
 [X] 0004_auto_20171010_1639
 [X] 0005_orgdynamicupgradedeadlineconfiguration
 [X] 0006_remove_module_id_index
 [X] 0007_remove_done_index
 [X] 0008_move_idde_to_edx_when
 [X] 0009_auto_20190703_1955
 [X] 0010_auto_20190709_1559
 [ ] 0011_csm_id_bigint
root@ip-10-3-71-92:/edx/app/edxapp/edx-platform#

Run Migration Fake

# Add extra space in front to prevent bash from writing password to history
root@ip-10-3-71-92:/edx/app/edxapp/edx-platform#  DB_MIGRATION_USER=migrate001 DB_MIGRATION_PASS=redacted /edx/bin/edxapp-migrate-cms --fake courseware 0011_csm_id_bigint

# Confirm the migration was applied
root@ip-10-3-71-92:/edx/app/edxapp/edx-platform# /edx/bin/edxapp-migrate-cms --list courseware
sudo: unable to resolve host ip-10-3-71-92
WARNING:py.warnings:/edx/app/edxapp/edx-platform/lms/djangoapps/courseware/__init__.py:7: DeprecationWarning: Importing 'lms.djangoapps.courseware' as 'courseware' is no longer supported
  warnings.warn("Importing 'lms.djangoapps.courseware' as 'courseware' is no longer supported", DeprecationWarning)

2019-08-30 18:55:31,993 WARNING 19468 [enterprise.utils] [user None] utils.py:55 - Could not import Registry from third_party_auth.provider
2019-08-30 18:55:31,993 WARNING 19468 [enterprise.utils] [user None] utils.py:56 - cannot import name EnterpriseCustomerIdentityProvider
courseware
 [X] 0001_initial
 [X] 0002_coursedynamicupgradedeadlineconfiguration_dynamicupgradedeadlineconfiguration
 [X] 0003_auto_20170825_0935
 [X] 0004_auto_20171010_1639
 [X] 0005_orgdynamicupgradedeadlineconfiguration
 [X] 0006_remove_module_id_index
 [X] 0007_remove_done_index
 [X] 0008_move_idde_to_edx_when
 [X] 0009_auto_20190703_1955
 [X] 0010_auto_20190709_1559
 [X] 0011_csm_id_bigint
sudo: unable to resolve host ip-10-3-71-92
WARNING:py.warnings:/edx/app/edxapp/edx-platform/lms/djangoapps/courseware/__init__.py:7: DeprecationWarning: Importing 'lms.djangoapps.courseware' as 'courseware' is no longer supported
  warnings.warn("Importing 'lms.djangoapps.courseware' as 'courseware' is no longer supported", DeprecationWarning)

2019-08-30 18:55:37,151 WARNING 19583 [enterprise.utils] [user None] utils.py:55 - Could not import Registry from third_party_auth.provider
2019-08-30 18:55:37,152 WARNING 19583 [enterprise.utils] [user None] utils.py:56 - cannot import name EnterpriseCustomerIdentityProvider
courseware
 [X] 0001_initial
 [X] 0002_coursedynamicupgradedeadlineconfiguration_dynamicupgradedeadlineconfiguration
 [X] 0003_auto_20170825_0935
 [X] 0004_auto_20171010_1639
 [X] 0005_orgdynamicupgradedeadlineconfiguration
 [X] 0006_remove_module_id_index
 [X] 0007_remove_done_index
 [X] 0008_move_idde_to_edx_when
 [X] 0009_auto_20190703_1955
 [X] 0010_auto_20190709_1559
 [X] 0011_csm_id_bigint

Cleanup

# Remove the migration file
root@ip-10-3-71-92:/edx/app/edxapp/edx-platform# rm /edx/app/edxapp/edx-platform/lms/djangoapps/courseware/migrations/0011_csm_id_bigint.py
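As background on what --fake does in the steps above: Django records applied migrations in a django_migrations table, and faking a migration adds that record without executing the schema change. The sketch below imitates this with an in-memory SQLite database. The table layout is simplified, and fake_migration and schema_of are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified version of the table Django uses to record applied migrations.
conn.execute("CREATE TABLE django_migrations (app TEXT, name TEXT)")
conn.execute("CREATE TABLE courseware_studentmodule (id INTEGER PRIMARY KEY)")


def fake_migration(conn, app, name):
    """Record a migration as applied without running any of its schema changes."""
    conn.execute("INSERT INTO django_migrations (app, name) VALUES (?, ?)", (app, name))


def schema_of(conn, table):
    """Return the CREATE statement SQLite stored for the given table."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = ?", (table,)
    ).fetchone()[0]


schema_before = schema_of(conn, "courseware_studentmodule")
fake_migration(conn, "courseware", "0011_csm_id_bigint")
schema_after = schema_of(conn, "courseware_studentmodule")

applied = conn.execute("SELECT app, name FROM django_migrations").fetchall()
print(applied, schema_before == schema_after)
```

The migration shows up as applied while the table itself is untouched, so faking is only appropriate when the schema change has already been made, or will be made, by some other means.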

Squashing Migrations

See Django's Documentation for Squashing Migrations. Some useful tips for squashing:

A squashed migration can coexist with the old migration files it replaces. This enables you to squash and not mess up systems currently in production that aren’t fully up-to-date yet. The recommended process is to squash, keeping the old files, commit and release, wait until all systems are upgraded with the new release (or if you’re a third-party project, ensure your users upgrade releases in order without skipping any), and then remove the old files, commit and do a second release.