Django 1.8 Retro
What went well
- Coordination between us and Lahore
- Timezone shift = follow-the-sun development
- Only one bug on release day
- Released on time (after updated schedule)
- Touched over 500 files with no regressions and no new defects
- Bug bash was successful and offers a model for the future
- Diverse team - TNL, platform, devops - enabled us to understand the side effects of what we were doing
- Estimation overall was pretty good
- Initially TNL team estimated 4.5 months
- After adding resources, we were able to finish in 2.5 months
- Work was well suited for parallelizing, particularly addressing test failures
- Ned's test output categorization tool was very helpful!
- Pairing on transactional changes worked well in terms of peer review back and forth
- Improved a lot of things along the way
- Requirements management - everything off of tags or hashes
- Rollback plan was written before we pushed to production
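The "everything off of tags or hashes" rule noted above could be checked mechanically. The following is a minimal sketch, assuming requirements use pip's `git+<url>@<ref>#egg=<name>` form. The function name `classify_git_requirement` and the set of "floating" branch names are hypothetical, and a tag cannot be told apart from a branch by name alone, so anything that is not a hash or a known floating branch is reported as "named".

```python
import re
from urllib.parse import urlsplit

# A short or full git commit hash.
HASH_RE = re.compile(r"^[0-9a-f]{7,40}$")
# Assumption: branch names we treat as moving targets.
FLOATING_REFS = {"master", "develop", "HEAD"}


def classify_git_requirement(line):
    """Classify one requirements line.

    Returns None for non-git lines, otherwise one of
    "hash", "named", "floating", or "missing".
    """
    line = line.strip()
    if line.startswith("-e "):
        line = line[3:].strip()
    if not line.startswith("git+"):
        return None
    # Drop the "git+" prefix and the "#egg=..." fragment.
    url = line[len("git+"):].split("#", 1)[0]
    # Look for "@ref" in the path only, so "git@host" is not mistaken for a ref.
    path = urlsplit(url).path
    if "@" not in path:
        return "missing"
    ref = path.rsplit("@", 1)[1]
    if ref in FLOATING_REFS:
        return "floating"
    if HASH_RE.match(ref):
        return "hash"
    return "named"
```

Lines reported as "missing" or "floating" are the ones that would drift between installs; "named" lines still need a human to confirm the ref is a tag.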
What went poorly / what can we improve
- We didn't finish before Django 1.4 was EOL'd (JE, FP)
- We didn't start early enough
- Upgrading from 1.4 to 1.8 was a large jump (UK)
- The jump included switching migration frameworks, a changed transaction framework, etc. - everything touching data changed
- We didn't expect migrations to be such a huge hassle
- We have a lot of use cases for the platform and considering all of them created a lot of work
- Some of this is not finished - we need to document the Cypress to Dogwood upgrade (potentially engineering work also)
- The Python toolchain sucks (FP, NB, JE, UK, MA, BB)
- Dependency management sucks
- Lots of pip quirks; for example, when there are multiple references to a dependency, the last one wins
- A similar problem with satellite repos referencing each other caused issues and could continue to do so
- Particularly an issue with devstack, which does incremental updates
- Bug bash found lots of existing bugs (JE)
- We tried to write down what we were going to do, but we ended up winging it a lot anyway (NB, FP, BB)
- Merge back to master plan
- Release plan
- Should have picked a consistent branch name
- Didn't understand the variety of git usages across repos
- Some confusion resulted in duplicated work between developers (NB)
- Lots of people in the same codebase
- Wasn't a huge drag
- Rollback plan came together late
- Load testing should have been reviewed more thoroughly
Python toolchain sucks
- We mean pip, virtualenv, setuptools, etc
- We need to understand it better
- We need to put discipline around it and set policies about how it's used
- Standardize how we write requirements.txt files
- Need to clean up what we have - too much copypasta
- We should push for changes to pip that we think make sense
- Investigate other tools
- Upgrade pip after upgrading to Python 2.7.10, which gives us pip-tools
- We need an owner for pushing work forward on dependency management
- Ned has volunteered along with Feanil and Brian
- Think about what we need from a tool in order to do the next one better
- What group does this fall to? Release/deployment, coding standards, new?
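One way to make the "multiple references to a dependency" problem visible is to scan requirements files for the same package listed with different specifiers. This is a minimal sketch, not an existing tool: the function names `parse_requirement` and `find_conflicts` are hypothetical, and the parsing only handles plain `name==version` lines and `#egg=name` URLs.

```python
import re
from collections import defaultdict

# Captures the egg name from a VCS/URL requirement, e.g. "...#egg=my-pkg".
EGG_RE = re.compile(r"#egg=([A-Za-z0-9._-]+)")
# Captures the leading project name from a plain requirement, e.g. "Django==1.8.7".
NAME_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)")


def parse_requirement(line):
    """Return (normalized_name, cleaned_line), or None for comments and options."""
    cleaned = line.split(" #", 1)[0].strip()  # drop trailing comments
    if not cleaned or cleaned.startswith("#"):
        return None
    egg = EGG_RE.search(cleaned)
    if egg:
        name = egg.group(1)
    elif cleaned.startswith("-"):
        return None  # options such as "-r other.txt" or "--index-url ..."
    else:
        match = NAME_RE.match(cleaned)
        if match is None:
            return None
        name = match.group(1)
    # Treat "-", "_" and "." as equivalent and ignore case.
    return re.sub(r"[-_.]+", "-", name).lower(), cleaned


def find_conflicts(files):
    """files maps filename -> file text.

    Returns {name: [(filename, line), ...]} for every package that is
    referenced with more than one distinct requirement line.
    """
    seen = defaultdict(list)
    for filename, text in files.items():
        for raw in text.splitlines():
            parsed = parse_requirement(raw)
            if parsed:
                name, cleaned = parsed
                seen[name].append((filename, cleaned))
    return {
        name: refs
        for name, refs in seen.items()
        if len({line for _, line in refs}) > 1
    }
```

A check like this could run in CI across a repo's requirements files, so that conflicting references are caught before pip silently picks one.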
Wrote things down but ended up winging it
- There's a difference between writing things down and figuring out what needs to be written down
- The satellite merge plan was not fleshed out well enough
- Huge number of repositories made documenting all the different scenarios difficult
- The fact that satellites have references to other satellites caused issues
- Making it up as you go isn't necessarily a problem, so long as that decision is made consciously
- Need to write things down as we go if we start winging it
- Muzaffar's wiki page on satellite repos was very useful
- We should add relationships between repos
- Need to designate a scribe when documenting plans as a group, and ask the question "who is in charge?"
- Who is the keeper of history?
- Write notes each day on what was done
- This improved over the course of the project - nightly notes were sent
- This came together when the work started to converge
- Documentation can be overdone - can't document every little issue, otherwise we'd be overwhelmed
Action items
- Brian Beggs: Finish converting bug bash tickets
- Feanil Patel: Create or join working group regarding deployment & dependency management toolchain
- Ticket to adopt better dependency management tools: pipdeptree, or find or build our own
- Document the best practices learned while doing this work, particularly wrt requirements and dependency management
- Standardize release branching process across repos (though there are some that are forks that complicate this)
- Can we align devstack more closely with production by fixing pip or some other strategy?
- Joel Barciauskas: What is our overall Django upgrade strategy? Document the shell of a plan for how to tackle large upgrades like this based on our experience
- As part of documentation of best practices, consider a "scribe" role
- Feanil Patel: Rename the table outside devops "the crisis table" and announce it at the eng all hands