...

  1. Release:
    • Add the new field to the model.
      • If the old field has null=False, blank=False, and no default:
        • If the model is used in forms (Django admin or other forms):
          • Create the new field with null=True, editable=False.
          • Setting editable=False removes the field from the Django admin and any other ModelForm.
        • else:
          • Create the new field with null=True.
      • else if the old field is a BooleanField:
        • You might need to change the old field type to NullBooleanField so that the unit tests in release 2 pass once the old field is removed from the code but not from sqlite3.
        • Create the new field with BooleanField and the same signature, assuming there's a default set.
      • else if the old field has null=True:
        • Create the new field with the same field signature as the old.
    • Update every place in the code that creates or updates the old field.
      • Write the same value into both fields.
      • If there is a Django admin page or other form and it is used regularly to create/update rows:
        • Register a signal handler on the model to update the new field whenever the old field changes or a new row is created.
  2. Release:
    • Create a data migration to copy the values from the old field into the new field.
      • If the table is large, consider disabling atomicity and batching the copy.
    • Remove all references to the old field in the code.
      • Including removing the old field from the model in the code.
      • If this change is in the edx-platform codebase, add a skip to the test_migrations_are_in_sync unit test.
      • DO NOT include the migration for removing the old column (yet).
    • If you created the new field with a different field signature than the old one, update it now to match the old field.
      • e.g. change null back to False and editable back to True (the default).
        • CAUTION: Changing to null=False will cause a table rebuild during the ALTER. When performing this migration on a table with a large number of rows, degraded performance/downtime will likely result.
        • See 158766629 below.
      • Include the migration that goes with this change, but NOT the migration to remove the old field.
  3. Release:
    • Run makemigrations; this should pick up the field removal from the previous release.
    • If this change is in the edx-platform codebase, remove the skip to the test_migrations_are_in_sync unit test.
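
The batched copy suggested for the data migration in the second release could look roughly like the sketch below. It is only an illustration under stated assumptions: the app label myapp, the model Widget, the field names old_field and new_field, and the batch size are all hypothetical, and the primary key is assumed to be an integer.

```python
BATCH_SIZE = 10_000  # hypothetical; tune to the table and database


def pk_batches(first_pk, last_pk, batch_size):
    """Yield inclusive (start, end) primary-key ranges covering first_pk..last_pk."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    start = first_pk
    while start <= last_pk:
        end = min(start + batch_size - 1, last_pk)
        yield start, end
        start = end + 1


def copy_old_to_new(apps, schema_editor):
    """Forward function for RunPython: copy old_field into new_field in batches."""
    # Imported lazily so that pk_batches stays usable without Django installed.
    from django.db.models import F, Max, Min

    # Historical model: it still has old_field because the migration that
    # removes the column has not been generated yet.
    Widget = apps.get_model("myapp", "Widget")  # hypothetical app and model
    bounds = Widget.objects.aggregate(first=Min("pk"), last=Max("pk"))
    if bounds["first"] is None:
        return  # empty table, nothing to copy

    for start, end in pk_batches(bounds["first"], bounds["last"], BATCH_SIZE):
        # One UPDATE per batch; with atomic = False on the migration each
        # batch commits on its own instead of holding one long transaction.
        Widget.objects.filter(pk__range=(start, end)).update(new_field=F("old_field"))
```

In the migration module, the function would be wired up with migrations.RunPython(copy_old_to_new, migrations.RunPython.noop) inside a Migration class that sets atomic = False.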

How to drop a table


Warning

Pre-requirements:

  • A 2U employee must create a Data Platform 1-off request to ensure that there will be no complications due to losing the table (e.g. financial data, or otherwise), and to determine the follow-up required.
    • This must happen before removing the actual tables in 2U Production.

Check in with Data Engineering/Data Science & Analytics to understand the downstream impacts of removing a table. When you get an answer about the process around this, update the documentation.

TWO releases (after pre-requirements):

  1. Remove all references to the table by removing references to the model and the model itself
    1. If this change is in the edx-platform codebase, add a skip to the test_migrations_are_in_sync unit test.
  2. Remove the table with a migration
    1. Important: If, during the pre-requirements above, 2U determined that the data can't be lost, the merge should be timed with a pipeline pause so the delete migration can be faked.
      1. This should be a rare occurrence, but it has happened.
    2. Remove the skip if you added one

Once a table is removed, a Data Engineering ticket should be created:

  • A 2U employee should:
    • Follow up with the Data Platform ticket to clean up the table from downstream consumers of data. 

...

    • If there is an associated app-permissions group then a 2U employee must make an app-

...

How to delete a Django app containing tables

...

When using AWS Aurora, a nullable column can be added to existing large tables (roughly 100k+ rows) without causing downtime. However, the migration may still time out in GoCD, so please coordinate the release with SRE.

  1. Add the new field to the model with null=True.
  2. Generate the model-change migrations locally.
  3. Create a pull request containing the model change and newly-generated migrations.
  4. Merge the migration pull request.
  5. The release process will run the migration and add the nullable column to the table.

...

On AWS Aurora, indexes can be built on large tables without causing downtime, but this requires SRE coordination as the migration may time out in GoCD.

  1. Add the new index on the model fields.
  2. Generate the model-change migrations locally.
  3. Create a pull request containing the model change and newly-generated migrations.
  4. Merge the migration pull request.
  5. The release will run the migration and add the index to the table.

...