These will have to be done twice in most cases, first on stage and then on prod so that we can uncover any major infrastructural issues.
Test before upgrade
After upgrade
After re-index
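One way to make the three test checkpoints above comparable is to record the result IDs for a fixed query at each point and diff them against the pre-upgrade baseline. The following is a minimal sketch only; the function name, the dataclass, and the sample IDs are all hypothetical and not taken from any app's test suite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointDiff:
    missing: frozenset     # IDs returned at baseline but absent now
    unexpected: frozenset  # IDs returned now but absent at baseline

    @property
    def matches(self) -> bool:
        return not self.missing and not self.unexpected


def compare_checkpoints(baseline_ids, current_ids) -> CheckpointDiff:
    """Diff two result sets for the same query, ignoring result order."""
    baseline = frozenset(baseline_ids)
    current = frozenset(current_ids)
    return CheckpointDiff(missing=baseline - current, unexpected=current - baseline)


# Hypothetical result IDs for one fixed query at each checkpoint.
before_upgrade = ["course-1", "course-2", "course-3"]
after_upgrade = ["course-1", "course-3"]
after_reindex = ["course-3", "course-2", "course-1"]

upgrade_diff = compare_checkpoints(before_upgrade, after_upgrade)
reindex_diff = compare_checkpoints(before_upgrade, after_reindex)
```

The comparison ignores ordering on purpose, since relevance scoring may legitimately change between Elasticsearch versions; whether ordering should also be checked is a judgment call per app.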
An example PR: https://github.com/edx/terraform/pull/2830
Set this setting to point to the new clusters
This is a manual process
eSREs will rely on SRE for assistance with how to do this
Each app has its own method for indexing data; this will need to be discussed with eSRE during setup and implementation
Understand complications based on deploy time (e.g. notes added between indexing and deploy would be missing from the new index)
Communicate schedule to #support ahead of time.
Remove old settings
Remove old clusters from terraform
Plan deploy for week of Sept 7.
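The deploy-time complication mentioned above (records added between indexing and deploy) can be reasoned about as a timestamp filter: anything modified at or after the moment indexing began may be missing from the new cluster and would need a follow-up index pass. This is a rough sketch under that assumption; the record shape, the `modified` field, and the timestamps are hypothetical.

```python
from datetime import datetime, timezone


def records_needing_reindex(records, indexing_started_at):
    """Return IDs of records modified at or after the moment indexing began."""
    return [r["id"] for r in records if r["modified"] >= indexing_started_at]


# Hypothetical timestamps: indexing began at 14:00 UTC, deploy finished later.
indexing_started_at = datetime(2000, 1, 1, 14, 0, tzinfo=timezone.utc)
records = [
    {"id": "note-1", "modified": datetime(2000, 1, 1, 13, 30, tzinfo=timezone.utc)},
    {"id": "note-2", "modified": datetime(2000, 1, 1, 14, 15, tzinfo=timezone.utc)},
    {"id": "note-3", "modified": datetime(2000, 1, 1, 15, 5, tzinfo=timezone.utc)},
]

stale_ids = records_needing_reindex(records, indexing_started_at)
print(stale_ids)  # ['note-2', 'note-3']
```

In practice each app would need its own query for this, since each has its own indexing method.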
Does not have remote config, due to Kubernetes
Can’t send out config synchronously with image
Set different variables for different clusters
Different code uses different variables
Can we hide search from users temporarily?
Action Items:
Diana Huang - try to get this working in devstack
Try adding new ELASTICSEARCH_DSL in the yaml; clean up HAYSTACK_CONNECTIONS and ELASTICSEARCH_URL after we confirm that the deploy goes smoothly
Diana Huang - schedule time next week to try to deploy to stage
Instead of reusing the existing ELASTICSEARCH_URL var for the URL of the new cluster, we’ll add a new one, ELASTICSEARCH_CLUSTER_URL, to make the cutover easier and less error prone.
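To illustrate why a separate variable can make the cutover safer, here is a minimal sketch of how settings code might prefer ELASTICSEARCH_CLUSTER_URL and fall back to ELASTICSEARCH_URL when it is absent. The fallback behaviour, the helper names, and the example hostnames are assumptions for illustration; the dict shape follows the django-elasticsearch-dsl convention for ELASTICSEARCH_DSL, and the app's real settings may differ.

```python
def resolve_search_url(config):
    """Prefer the new-cluster variable; fall back to the legacy one (assumed behaviour)."""
    return config.get("ELASTICSEARCH_CLUSTER_URL") or config.get("ELASTICSEARCH_URL")


def build_elasticsearch_dsl(config):
    """Build an ELASTICSEARCH_DSL-style dict pointing at the resolved URL."""
    url = resolve_search_url(config)
    if not url:
        raise ValueError("Neither ELASTICSEARCH_CLUSTER_URL nor ELASTICSEARCH_URL is set")
    return {"default": {"hosts": url}}


# Hypothetical config values, standing in for what the yaml would provide.
during_cutover = {
    "ELASTICSEARCH_URL": "http://old-es.example:9200",
    "ELASTICSEARCH_CLUSTER_URL": "http://new-es.example:9200",
}
before_cutover = {"ELASTICSEARCH_URL": "http://old-es.example:9200"}

ELASTICSEARCH_DSL = build_elasticsearch_dsl(during_cutover)
LEGACY_DSL = build_elasticsearch_dsl(before_cutover)
```

With both variables present, old code paths keep reading ELASTICSEARCH_URL while new code reads the new variable, so config can ship ahead of the image without changing behaviour until the code that reads it is deployed.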
Runbook
Merge remote-config PR
Merge course-discovery PR
Deploy to stage.
After Deploy
manage.py search_index --create ← This is needed if it’s a fresh ES cluster.
manage.py update_index --disable-change-limit
Test on stage.
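The post-deploy steps above could be wrapped in a small helper so the index-creation step only runs against a fresh cluster. This is a sketch, not an existing script: the helper names and the `fresh_cluster` flag are hypothetical, and it assumes `manage.py` is run with `python` from the app's working directory.

```python
import subprocess


def post_deploy_commands(fresh_cluster):
    """Return the management commands to run after deploy, in order."""
    commands = []
    if fresh_cluster:
        # Only needed when the ES cluster has no indices yet.
        commands.append(["python", "manage.py", "search_index", "--create"])
    commands.append(["python", "manage.py", "update_index", "--disable-change-limit"])
    return commands


def run_post_deploy(fresh_cluster, runner=subprocess.run):
    """Run each command in order, stopping at the first failure (check=True)."""
    for command in post_deploy_commands(fresh_cluster):
        runner(command, check=True)
```

Passing a different `runner` makes it possible to dry-run the sequence, for example `run_post_deploy(True, runner=lambda cmd, check: print(" ".join(cmd)))`.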
Other Useful commands:
Pause prod pipeline
Merge changes to master
Have revert PR available
After code is on stage:
Run these commands
python manage.py lms reindex_course_team --all
python manage.py cms reindex_course --all
Do testing in stage
Make sure to test teams.
If we feel confident with stage, unpause prod pipeline
Prevent prod/edge deploy from deploying - not added to ELB
Index data - prod/edge
This needs to be discovered and investigated - is it a checkbox? - Fred Smith
Checkbox - switch deploy_ami step from ‘On Success’ to 'Manual'
Manually set up ASGs for
Diana Huang - schedule a 4 hour window for attempting to deploy