See PLAT-2437. This is a runbook for running gh-ost to alter the id column of courseware_studentmodulehistory (CSMH) in edx-platform as safely as possible and with as little downtime as possible.


Questions we need to answer before running this:

  1. What changes do we need to make to support both versions of CSMH?
    1. Removing the FKs will likely mean using the integer IDs directly instead of StudentModule objects, for example.
    2. We will also need to handle deletes manually if they currently cascade.
  2. After we upgrade CSM, we will also need to migrate both versions of CSMH so that the column holding the fake CSM foreign keys is appropriately sized. Does having to perform maintenance that more than doubles the size of that table cause us to re-evaluate this?
  3. What do we need to do to ensure that Django migrations continue to work throughout this process? Especially after we switch over to the new table?
    1. Will Django notice that the column type has changed?
    2. Can we fake a migration that gets Django's state back in sync with reality? At what point do we do that?
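As a starting point for question 1, here is a minimal sketch of handling the delete by hand once the FK is gone. It uses an in-memory sqlite3 database as a stand-in for MySQL. The table layout, the student_module_id column name, and both helper functions are assumptions for illustration, not the real edx-platform schema or code.

```python
import sqlite3


def delete_student_module(conn, student_module_id):
    """Delete one StudentModule row and its history rows in one transaction.

    With no FK between the tables, nothing cascades (or blocks) the delete,
    so the history rows are removed explicitly by integer ID first.
    """
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "DELETE FROM courseware_studentmodulehistory "
            "WHERE student_module_id = ?",
            (student_module_id,),
        )
        conn.execute(
            "DELETE FROM courseware_studentmodule WHERE id = ?",
            (student_module_id,),
        )


def make_demo_db():
    """Build a stand-in schema with no FK between the two tables (assumed layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE courseware_studentmodule (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE courseware_studentmodulehistory ("
        "id INTEGER PRIMARY KEY, student_module_id INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO courseware_studentmodule (id) VALUES (?)", [(1,), (2,)]
    )
    conn.executemany(
        "INSERT INTO courseware_studentmodulehistory (student_module_id) VALUES (?)",
        [(1,), (1,), (2,)],
    )
    conn.commit()
    return conn
```

Doing both deletes in one transaction means a failure part-way through should not leave history rows pointing at a missing StudentModule, or the reverse.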


Per-environment runbook (in progress)

  1. Use the gh-ost cheat sheet for "Connect to replica, migrate on master": https://github.com/github/gh-ost/blob/master/doc/cheatsheet.md#a-connect-to-replica-migrate-on-master
    1. Confirm the environment is set up correctly
      1. The master is using Statement Based Replication
      2. The replica we want to use is configured with binary logs enabled (log_bin, log_slave_updates)
      3. The replica has binlog_format=ROW (gh-ost can set this for you with --switch-to-rbr)
    2. Confirm the RDS environment has enough resources for two copies of the table!
    3. Make sure we have dropped the FK from courseware_studentmodulehistory
  2. Perform a no-op run to make sure everything is set up correctly
    1. The --alter value we want to use is: "modify id bigint unsigned not null auto_increment"

  3. Run the migration on a replica only
    1. Monitor and see if the throttling variables need to be tweaked
    2. Confirm that the data all looks good and makes sense
  4. Run the migration on the master
    1. Monitor and tweak throttling as needed
    2. We should use the cut over flag file to make sure we are in the office and ready when it comes time to flip the switch: https://github.com/github/gh-ost/blob/master/doc/command-line-flags.md#postpone-cut-over-flag-file
    3. Confirm that the data looks good
  5. Cut over to the new table and monitor
  6. Trickle delete the rows from the old table, wait some time, then drop it: https://github.com/github/gh-ost/issues/307
    1. We will likely want to create a custom script / management command for the deletes
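To keep the gh-ost invocation consistent across the no-op run, the replica-only run, and the run on the master, the command could be generated from one place. The sketch below is one possible shape, not a tested command line. The host name, database name, and the --max-load and --chunk-size values are placeholders. It assumes "replica only" means gh-ost's --test-on-replica mode, and it leaves out credentials (for example --user and --password) entirely.

```python
import shlex

TABLE = "courseware_studentmodulehistory"
# MODIFY form of the alter; MySQL's CHANGE syntax needs the column name twice.
ALTER = "modify id bigint unsigned not null auto_increment"

PHASES = ("noop", "test-on-replica", "master")


def build_gh_ost_command(replica_host, database, phase, cut_over_flag_file=None):
    """Return the gh-ost argument list for one phase of the runbook.

    phase is one of:
      "noop"            - dry run: gh-ost validates the setup, no --execute
      "test-on-replica" - full migration applied to the replica only
      "master"          - connect to the replica, migrate on the master
    """
    if phase not in PHASES:
        raise ValueError("unknown phase: %r" % (phase,))

    cmd = [
        "gh-ost",
        "--host=" + replica_host,  # always the replica, never the master
        "--database=" + database,
        "--table=" + TABLE,
        "--alter=" + ALTER,
        "--max-load=Threads_running=25",  # placeholder throttle threshold
        "--chunk-size=1000",              # placeholder, tune while monitoring
        "--verbose",
    ]
    if phase == "test-on-replica":
        cmd.append("--test-on-replica")
    if phase == "master":
        if not cut_over_flag_file:
            raise ValueError("the master run needs a cut-over flag file")
        # gh-ost postpones the cut-over for as long as this file exists.
        cmd.append("--postpone-cut-over-flag-file=" + cut_over_flag_file)
    if phase != "noop":
        cmd.append("--execute")
    return cmd


def format_command(cmd):
    """Render the argument list as a shell-safe string for review or logging."""
    return " ".join(shlex.quote(arg) for arg in cmd)


if __name__ == "__main__":
    # Hypothetical host and database names, for illustration only.
    for phase in PHASES:
        print(format_command(build_gh_ost_command(
            "replica.example.internal", "edxapp", phase,
            cut_over_flag_file="/tmp/ghost.postpone.flag")))
```

Requiring the flag file for the master run means the cut-over cannot happen until someone removes the file on purpose.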
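For the trickle delete, a rough sketch of what the custom script might do is below: delete the old table in small id-ordered batches, committing and pausing between batches so replication can keep up. It is written against a generic DB-API connection and exercised with sqlite3 as a stand-in. The function name, batch size, pause length, and placeholder handling are all assumptions, and nothing here has been run against MySQL or RDS.

```python
import re
import sqlite3
import time


def trickle_delete(conn, table, batch_size=1000, pause_seconds=1.0, placeholder="%s"):
    """Delete every row of `table` in id-ordered batches, pausing in between.

    `placeholder` is the DB-API parameter marker: "%s" for MySQL drivers,
    "?" for sqlite3. Returns the total number of rows deleted.
    """
    # Table names cannot be bound as parameters, so validate before formatting.
    if not re.fullmatch(r"[A-Za-z0-9_]+", table):
        raise ValueError("unexpected table name: %r" % (table,))

    # Highest id among the next `batch_size` rows; NULL once the table is empty.
    find_upper = (
        "SELECT MAX(id) FROM (SELECT id FROM {t} ORDER BY id LIMIT {p}) AS batch"
    ).format(t=table, p=placeholder)
    delete_batch = "DELETE FROM {t} WHERE id <= {p}".format(t=table, p=placeholder)

    total = 0
    cursor = conn.cursor()
    while True:
        cursor.execute(find_upper, (batch_size,))
        upper = cursor.fetchone()[0]
        if upper is None:
            break
        cursor.execute(delete_batch, (upper,))
        total += cursor.rowcount
        conn.commit()              # keep each transaction small
        time.sleep(pause_seconds)  # give replicas time to catch up
    return total


if __name__ == "__main__":
    # Demo on an in-memory table; the name mimics gh-ost's _<table>_del convention.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE _courseware_studentmodulehistory_del (id INTEGER PRIMARY KEY)")
    demo.executemany(
        "INSERT INTO _courseware_studentmodulehistory_del (id) VALUES (?)",
        [(i,) for i in range(1, 26)],
    )
    demo.commit()
    print(trickle_delete(demo, "_courseware_studentmodulehistory_del",
                         batch_size=10, pause_seconds=0, placeholder="?"))
```

Deleting by primary-key upper bound keeps each statement a short range scan. A real script would probably also want to check replica lag before each batch instead of relying on a fixed pause.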