Django 2.2 Upgrade Plan

All of the GitHub repositories managed by edX which depend on Django need to be updated to support Django 2.2, preferably well before Django 1.11 (the version we currently run) reaches its end of life on April 1st, 2020. The earlier this is finished, the earlier we can put out the Juniper release of the Open edX platform, allowing our partners to also complete their upgrades before the support window ends. Previous Django LTS-to-next-LTS upgrades for this platform have consumed multiple months of several senior developers' time. This time we want to reduce the impact on new feature velocity and invest in tooling which will make future upgrades easier.

The plan is outlined below in rough chronological order of the steps to be taken (although some steps can be done in parallel, and some were started before the plan was finalized).

1. Identify pain points from previous upgrades to mitigate this time

The Open edX platform has already been upgraded from Django 1.4 to 1.8, and again from 1.8 to 1.11. Some of the things we learned from that:

  • Don’t maintain a long-running branch for the upgrade. Fixes should be made in small, backwards-compatible batches that can be merged to the master branch of each repository after 1-2 days of development.

  • Automate the analysis of Django deprecation warnings. Identifying and fixing all the deprecation warnings emitted by Django is a big part of the upgrade process, and doing this manually by reading the console output from pytest takes too long. We now have code that generates an HTML report of all the deprecation warnings emitted during a repository’s test suite, and the report is already shown on edx-platform Jenkins test runs. This code will be split into a separate package so it can be used in other repositories as well, and we will adapt code from the Python 3 upgrade to automatically generate Jira tickets from the report.

  • Address specific tech debt items first that accelerate the upgrade work. A repository should manage its dependencies in compliance with OEP-18, use tox in CI with all tests passing, and use pytest as the test runner with warnings enabled before we even try to support the new Django version. Thankfully, this has already been done for many of our repositories for the Python 3 upgrade.

  • Maintain a dashboard of progress towards getting each service to Django 2.2 support. A manual wiki page has already been created for edx-platform, and the first draft of an automated dashboard for any specified service is nearly complete.

  • Use automated code refactoring for routine changes that need to be made in many places. Three such refactoring scripts have been implemented already using the bowler package.
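The automated warning analysis mentioned above can be illustrated with a small sketch. This is not the edX reporting code; it is a minimal example using only the Python standard library, and the names collect_warnings, render_html, and legacy_call are hypothetical. It records the deprecation warnings raised while some code runs, counts them by category and message, and renders the counts as an HTML table.

```python
import html
import warnings
from collections import Counter

# Django's RemovedInDjangoXXWarning classes derive from one of these two
# base classes, so both are tracked.
DEPRECATION_TYPES = (DeprecationWarning, PendingDeprecationWarning)


def collect_warnings(func):
    """Run func() and count the deprecation warnings it emits."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every occurrence, not just the first
        func()
    counts = Counter()
    for item in caught:
        if issubclass(item.category, DEPRECATION_TYPES):
            counts[(item.category.__name__, str(item.message))] += 1
    return counts


def render_html(counts):
    """Render the counts as a small HTML table, most frequent first."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td><td>{}</td></tr>".format(
            html.escape(category), html.escape(message), count
        )
        for (category, message), count in counts.most_common()
    )
    header = "<tr><th>Category</th><th>Message</th><th>Count</th></tr>"
    return "<table>{}{}</table>".format(header, rows)


def legacy_call():
    """Hypothetical stand-in for a test suite that triggers warnings."""
    for _ in range(3):
        warnings.warn("old_api() is deprecated", DeprecationWarning)
    warnings.warn("unrelated", UserWarning)  # ignored by the report


report = render_html(collect_warnings(legacy_call))
```

In a real test run the warnings would come from pytest rather than from a wrapped function call, so the collection step would look different; the counting and rendering steps are the part this sketch is meant to show.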

2. Identify personnel available to help on the project

Time crunches in previous upgrades led us to allocate multiple principal engineers to the project, even for sub-tasks that more junior developers or contractors with less domain knowledge of the code base could have done competently. This time we want not only to automate more of the work, but also to delegate more of it to outside contractors and the Open edX community. We already have some offshore contractors with prior experience working on the project, and are finalizing plans to delegate more work to other Open edX community members. We will attempt to limit the staff in the Cambridge office dedicated to this project to 1-2 people, although certain tasks may occasionally require additional assistance from other full-time employees. Initial code review for each pull request should be performed by a contractor other than the one who made the changes, and edX staff will perform a final review once the contractors are happy with the quality of the work. edX staff may review more proactively early in the process to make sure expectations are aligned before too much work and re-work is done.

3. Determine the approximate scope of work

A big benefit of having the progress dashboards for each relevant service is that it will allow creation of reasonable estimates of the total amount of work to be done, and hence allocation of appropriate overall staffing levels. We’re fine with over-hiring and getting the project done earlier than planned, but can’t afford to have more people being paid than can work effectively in parallel on useful tasks. Scoping work for each service includes:

  • Counting the service’s dependencies which use Django

  • Determining how many of those dependencies still don’t support Django 2.2 in their latest release

  • Determining how much edX-managed code needs to be updated for Django 2.2 support

  • Determining how much work would be involved in fixing/forking/replacing external dependencies which don’t yet support Django 2.2

  • Estimating how much work will be needed to update the service itself based on its size.
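The first scoping item, counting the dependencies which use Django, could be approximated along these lines. This is a sketch under stated assumptions rather than the dashboard's actual logic: it only looks at requirements that installed packages declare directly, it relies on importlib.metadata (Python 3.8 or later), and the function names are hypothetical.

```python
import re
from importlib import metadata  # standard library in Python 3.8+


def requires_django(requirement_strings):
    """Return True if any requirement string names the Django package itself."""
    for req in requirement_strings or []:
        # The project name is the leading run of name characters,
        # e.g. "Django" in "Django>=1.11" or "django-filter" in "django-filter>=2.0".
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match and match.group(1).lower() == "django":
            return True
    return False


def django_dependents(requirements_by_package):
    """Return the sorted names of packages that declare Django as a requirement."""
    return sorted(
        name
        for name, reqs in requirements_by_package.items()
        if requires_django(reqs)
    )


def installed_requirements():
    """Map each installed distribution name to its declared requirements."""
    return {
        dist.metadata["Name"]: dist.requires
        for dist in metadata.distributions()
    }
```

Calling django_dependents(installed_requirements()) inside a service's virtualenv would give a first count. Packages that use Django without declaring it as a requirement would be missed, so the result is a lower bound.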

This is the list of services which we know need to be upgraded from Django 1.11.x to Django 2.2.x, along with a recent count of the number of kilobytes of Python code in each repository (excluding dependencies) which may be useful as a rough starting point for estimation:

4. Hire an appropriate number of outside contractors

Staffing was touched on in step 2, but the final number of contractors can’t be determined until the scoping work above is complete. Once we have a reasonable estimate of total project scope, we can finalize contracts for all the contractors we expect to need.

5. Upgrade pinned dependencies with Django 2.2 support available

While we’ve recently started proactively upgrading many of our dependencies as soon as new releases come out, there are still quite a few pinned to old versions for assorted reasons. The ones which use Django need to be upgraded to versions which support Django 2.2. In some cases this will be as simple as verifying that tests still pass with the version constraint removed; in others we’ll need to adapt to multiple backwards-incompatible changes made since the release we currently use. Either way, this should be attempted early to identify any unexpected major problems; each upgrade attempt will typically result in either a quick win or an early identification of further work that needs to be done.

6. Fix, fork, or replace external dependencies lacking Django 2.2 support

These cases can require a lot of communication and/or effort, and hence should be started early. Our first preference is to stay on external dependencies which are actively maintained and support recent versions of Django, but in each Django LTS cycle a few packages which used to match that description turn out to be effectively abandoned. Our rough order of preference for handling such cases:

  1. Switch to an existing more actively maintained equivalent package if that can be done with minimal effort. Often this is a fork made after maintenance of the original package ceased.

  2. Help get any existing PRs adding Django 2.2 support merged (fix tests, respond to maintainer concerns, etc.)

  3. Create a new PR adding Django 2.2 support, and work with the maintainer to get it merged. This can take a long time depending on the maintainer’s availability, so these PRs need to be created early when needed.

  4. Fork the package into the edX GitHub organization and apply the necessary updates. This is often the fallback if options 2 or 3 fail to produce results in a reasonable length of time.

7. Identify and implement useful code refactoring automation

Some backwards-incompatible changes made in Django from 1.11 to 2.2 require the same trivial change to be made in dozens or hundreds of lines of code. Rather than hunting down and changing all of these by hand in multiple repositories, we’d like to start using automated code refactoring tools to do this for us. Our initial implementation of this for the on_delete argument of ForeignKey fields, which is required as of Django 2.0, proved quite useful, so we’d like to make more of these automated fixers using the same framework. Changes that only require modifying a few lines of code or are more difficult to automate will still be done manually.
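To give a feel for the kind of pattern such a fixer targets, here is a small detection-only sketch for the on_delete case. It is not our bowler-based fixer and it does not rewrite any code; it uses the standard library ast module to list relation fields that pass no on_delete value. The function name and the sample model are hypothetical.

```python
import ast

RELATION_FIELDS = {"ForeignKey", "OneToOneField"}


def find_missing_on_delete(source):
    """Return sorted (line, field name) pairs for relation fields lacking on_delete."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both models.ForeignKey(...) and a bare ForeignKey(...).
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name not in RELATION_FIELDS:
            continue
        has_keyword = any(kw.arg == "on_delete" for kw in node.keywords)
        # on_delete is the second positional parameter of both field types.
        has_positional = len(node.args) >= 2
        if not (has_keyword or has_positional):
            missing.append((node.lineno, name))
    return sorted(missing)


# Hypothetical model source; it is only parsed, never imported or executed.
SAMPLE = '''
from django.db import models

class Enrollment(models.Model):
    user = models.ForeignKey("auth.User")
    course = models.ForeignKey("Course", on_delete=models.CASCADE)
    profile = models.OneToOneField("Profile", models.CASCADE)
'''

problems = find_missing_on_delete(SAMPLE)
```

A check like this matches on the field name alone, so an unrelated class that happens to be called ForeignKey would also be reported. That trade-off is usually acceptable for a report that a developer reviews before applying a fix.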

8. Update any edX-managed dependencies to support Django 2.2

edX has created dozens of Python packages using Django, and our partners in the Open edX community have created dozens more that we use. These packages need to be updated to support Django 2.2 before any service using them can claim such support. Most of these are quite small and can be fully updated by a single developer in 1-3 days, but there are a few that are much larger and may take significantly longer (edx-enterprise, for example). There are also specific features that we generally want these repositories to have before we start making Django 2.2 compatibility fixes, otherwise the fixes take much longer and/or the resulting pull request has to make too many changes at the same time to allow for efficient code review. The ones which have already been identified are:

  • CI is run for each pull request

  • Python package dependencies are managed in compliance with OEP-18 (summary: there’s a make upgrade target which uses pip-tools to update all unpinned requirements to the latest compatible versions in a sane, consistent manner).

  • make upgrade has been run recently to pull in reasonably current versions of most dependencies

  • pytest is used as the test runner, with deprecation warnings displayed as part of the output

  • Travis is configured to release to PyPI when a new version is tagged

  • CI includes validation of the package’s long description via twine check to make sure it will pass validation on PyPI when uploaded

  • tox is used in CI to support testing multiple versions of Django (among other benefits)

The changes typically needed after that for proper Django 2.2 support are summarized in https://openedx.atlassian.net/browse/BOM-1009 .

9. Analyze, ticket, and fix deprecation warnings in the service itself

Once all dependencies have been updated so they don’t generate a spew of deprecation warnings in the service’s test run, the service can be tested against Django 2.0, 2.1, and 2.2. The resulting Django deprecation warnings should be reviewed and turned into Jira tickets for remediation. Each ticket will typically be fixed either by running an existing automated code fixer, writing a new one if appropriate, or fixing the issue manually if automation isn’t practical in that particular case. Each type of issue should be resolved in a separate pull request to keep code review efficient (but one pull request may fix many instances of the same problem).
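The "one ticket and one pull request per type of issue" rule can be sketched as a grouping step over the collected warnings. This is only an illustration: the record format, the function name draft_tickets, the file paths, and the warning messages are all hypothetical (the messages are paraphrased, not verbatim Django output), and actual Jira ticket creation is out of scope here.

```python
from collections import defaultdict


def draft_tickets(records):
    """Group (message, path) records into one ticket draft per distinct message."""
    paths_by_message = defaultdict(set)
    for message, path in records:
        paths_by_message[message].add(path)
    drafts = [
        {"summary": message, "files": sorted(paths)}
        for message, paths in paths_by_message.items()
    ]
    # Put the most widespread issues first; break ties alphabetically.
    return sorted(drafts, key=lambda d: (-len(d["files"]), d["summary"]))


# Illustrative records, as they might be extracted from a warnings report.
RECORDS = [
    ("on_delete will be a required arg for ForeignKey", "courses/models.py"),
    ("on_delete will be a required arg for ForeignKey", "users/models.py"),
    ("Importing from django.core.urlresolvers is deprecated", "courses/views.py"),
    ("on_delete will be a required arg for ForeignKey", "courses/models.py"),
]

tickets = draft_tickets(RECORDS)
```

Each draft lists every affected file for a single kind of warning, which matches the idea that one pull request may fix many instances of the same problem.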

10. Deploy to stage and production

Once the service’s test suite is running without test failures or relevant deprecation warnings under both Django 1.11 and 2.2, the default Django version can be switched over to 2.2 and then deployed out to the staging server and production as usual. Some problems may be identified there that weren’t caught by the test suites, and we’ll either roll back or fix forward as appropriate for the scale of the issues identified.