...

When using AWS Aurora, a nullable column can be added to large existing tables (roughly 100k+ rows) without causing downtime. However, the migration may still time out in GoCD, so please coordinate the release with DevOps.

  1. Add the new field to the model with null=True.
  2. Generate the model-change migrations locally.
  3. Create a pull request containing the model change and newly-generated migrations.
  4. Merge the migration pull request.
  5. The release process will run the migration and add the nullable column to the table.

...

On AWS Aurora, indexes can be built on large tables without causing downtime, but this requires DevOps coordination because the migration may time out in GoCD.

  1. Add the new index on the model fields.
  2. Generate the model-change migrations locally.
  3. Create a pull request containing the model change and newly-generated migrations.
  4. Merge the migration pull request.
  5. The release process will run the migration and add the index to the table.

...

  • The primary benefit of squashing migrations is that running migrations from scratch becomes faster. If you are not running migrations from scratch, squashing may not help you.

    Pros:

    • Processing time for running the actual migrations is greatly improved, but in edx-platform we almost never build from scratch. Probably only new instances of Open edX benefit from this. It is unclear which IDAs may be running migrations before unit testing.

    • Ultimately, once old files and unnecessary migrations are removed, there may be less maintenance work on the old migrations.

    Cons:

    • Processing speeds seem to be unchanged (or worse) for showmigrations, and for determining which migrations to run in GoCD when there are no new migrations.

    • To get the maximum benefit of faster from-scratch migration times, a lot of careful and potentially error-prone work is required.

    Conclusion:

    • I don’t recommend squashing unless you are starting with a clear problem to solve that isn’t already handled by cached databases containing earlier migrations. One example would be a project that runs migrations before unit tests, rather than creating the test database from the models.

    See ARCHBOM-1148 for more details.


  • Managing your squash migrations PR:
    • Keep migration squashing to its own PR. Introducing a new migration in the same PR as the squash can cause issues.
    • Keep the auto-generated squash migration file as its own initial commit on PRs.  This will help your PR reviewers.
  • You can sometimes get an improved squash by removing the data migrations or removing all old migrations to create fresh migrations.
    • Note: commit this separately from the initial auto-generated commit to help with review.
    • You may also need to remove all migrations for apps that depend on your migrations to get this to run.
    • If you use this method, ensure makemigrations shows that there is nothing missing from your squash.
  • Testing your squashed migrations:

    • Try to run all the migrations locally.
    • For pytest, you can use -vvv to show if the migrations are running, and a combination of --create-db and/or --enable-migrations should work.
    • For edx-platform:
      • To test locally, try:

        # Note: unit tests don't currently run using migrations, but this will ensure the migrations complete.
        paver test_system -s lms --enable-migrations --verbose --disable_capture
        
        # Or try the following, which you can use to run the bokchoy smoke tests against:
        # Note: the mysqldump command may fail locally with 'Unknown table 'COLUMN_STATISTICS' in information_schema (1109)', but
        #   you should at least have seen all the migrations run successfully first.
        paver update_bokchoy_db_cache


      • Note that almost everywhere, edx-platform has optimizations to skip migrations or run minimal migrations, so squashing doesn't provide much benefit.


  • Important: Squashing migrations is a two-part process, and each part needs to ship in a separate Open edX Named Release so that the community can catch up before the second part is released. From Django's docs:

...