ES7 deployment strategy
Runbooks from the ES1 migration
https://openedx.atlassian.net/wiki/spaces/ArchiveEng/pages/151191637
https://openedx.atlassian.net/wiki/spaces/ArchiveEng/pages/158541260
https://openedx.atlassian.net/wiki/spaces/ArchiveEng/pages/158541200
https://openedx.atlassian.net/wiki/spaces/ArchiveEng/pages/157772122
Generalized Strategy
In most cases these steps will have to be done twice: first on stage, then on prod, so that we can uncover any major infrastructure issues before they reach prod.
0. Create/Find a Test Bed on Stage and Prod
Test before upgrade
After upgrade
After re-index
1. Create ES7 clusters in terraform
An example PR: https://github.com/edx/terraform/pull/2830
2. Create a new setting for the ES7 configuration
Set this setting to point to the new clusters
3. Spin up new instances of app with ES7-compatible code
This is a manual process
eSREs will rely on SRE for assistance with how to do this
4. Index ES data from ES7-compatible instance
Each app has its own method for indexing data; this will need to be discussed with eSRE during setup and implementation.
Understand complications based on deploy time (e.g. missing notes that were added between indexing and deploy)
5. Merge ES7 code and configuration changes
6. Deploy the newly built changes using GoCD
Communicate schedule to #support ahead of time.
7. Re-index ES7 to include any writes that were missed
8. Clean up settings and clusters
Remove old settings
Remove old clusters from terraform
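Steps 4 through 7 leave a window in which writes reach only the old cluster. As a rough sketch of the idea behind the step 7 catch-up (not any app's real indexing code), the re-index only needs the records modified since the initial index started. The `Record` type, its `modified` field, the `records_to_reindex` helper, and the five-minute safety margin are all hypothetical names and values chosen for illustration; each app's actual re-index command differs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for an indexable row (e.g. a note)."""
    id: int
    modified: datetime


def records_to_reindex(
    records: Iterable[Record],
    index_started_at: datetime,
    safety_margin: timedelta = timedelta(minutes=5),
) -> List[Record]:
    """Return records written at or after the start of the initial index.

    The safety margin widens the window slightly so that writes racing
    with the start of the initial index are not skipped.
    """
    cutoff = index_started_at - safety_margin
    return [r for r in records if r.modified >= cutoff]


if __name__ == "__main__":
    start = datetime(2020, 9, 7, 12, 0)  # arbitrary example timestamp
    rows = [
        Record(1, start - timedelta(hours=2)),     # indexed in step 4
        Record(2, start - timedelta(minutes=3)),   # inside the safety margin
        Record(3, start + timedelta(minutes=30)),  # written after indexing began
    ]
    print([r.id for r in records_to_reindex(rows, start)])
```

A timestamp filter like this does not notice deletions; where that matters, a full re-index is the safer catch-up.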
Specific App Strategies
edx-notes
Plan deploy for week of Sept 7.
Does not have remote config, due to Kubernetes
Can’t send out config synchronously with image
Set different variables for different clusters
Different code uses different variables
Can we hide search from users temporarily?
Action Items:
@Diana Huang - try to get this working in devstack
Try adding new ELASTICSEARCH_DSL in the yaml; clean up HAYSTACK_CONNECTIONS and ELASTICSEARCH_URL after we confirm that the deploy goes smoothly
@Diana Huang - schedule time next week to try to deploy to stage
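To make the first action item concrete, here is a minimal sketch of carrying both sets of settings during the transition. The `ELASTICSEARCH_DSL` shape follows the django-elasticsearch-dsl convention of a `default` connection with `hosts`. The helper name `search_settings`, the yaml layout, and the example hostnames are assumptions for illustration, not the actual edx-notes configuration.

```python
from typing import Any, Dict


def search_settings(config: Dict[str, Any]) -> Dict[str, Any]:
    """Build search-related Django settings from a loaded yaml config.

    Assumption: the yaml carries the new ELASTICSEARCH_DSL block alongside
    the old keys until the deploy is confirmed; the old keys are then removed.
    """
    if "ELASTICSEARCH_DSL" not in config:
        raise KeyError("ELASTICSEARCH_DSL must be set before deploying ES7 code")

    settings: Dict[str, Any] = {"ELASTICSEARCH_DSL": config["ELASTICSEARCH_DSL"]}

    # Old settings are optional here: they are passed through while present
    # and simply disappear once they are cleaned out of the yaml.
    for old_key in ("HAYSTACK_CONNECTIONS", "ELASTICSEARCH_URL"):
        if old_key in config:
            settings[old_key] = config[old_key]
    return settings


if __name__ == "__main__":
    during_transition = {
        # Hypothetical hostnames, for illustration only.
        "ELASTICSEARCH_DSL": {"default": {"hosts": "es7.example.internal:9200"}},
        "ELASTICSEARCH_URL": "http://old-es.example.internal:9200",
    }
    print(sorted(search_settings(during_transition)))
```

Because the old keys are optional in this sketch, removing them from the yaml after a smooth deploy needs no further code change.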
course-discovery
Instead of using the existing ELASTICSEARCH_URL var for the URL of the new cluster, we'll add a new one, ELASTICSEARCH_CLUSTER_URL, to make the cutover easier and less error prone.
Runbook
Merge remote-config PR
Merge course-discovery PR
Deploy to stage.
After Deploy
manage.py search_index --create ← This is needed if it's a fresh ES cluster.
manage.py update_index --disable-change-limit
Test on stage.
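The reason for adding ELASTICSEARCH_CLUSTER_URL rather than reusing ELASTICSEARCH_URL can be illustrated with a small sketch: old code keeps reading the existing variable, while ES7-compatible code reads only the new one, so neither deploy can end up pointed at the other's cluster. The function names and example URLs are hypothetical and this is not course-discovery code.

```python
from typing import Mapping


def old_code_cluster_url(config: Mapping[str, str]) -> str:
    """What pre-ES7 code reads: the existing variable, untouched by the cutover."""
    return config["ELASTICSEARCH_URL"]


def es7_code_cluster_url(config: Mapping[str, str]) -> str:
    """What ES7-compatible code reads: only the new variable.

    There is deliberately no fallback to ELASTICSEARCH_URL, so a missing
    value fails loudly instead of pointing ES7 code at the old cluster.
    """
    try:
        return config["ELASTICSEARCH_CLUSTER_URL"]
    except KeyError:
        raise RuntimeError("ELASTICSEARCH_CLUSTER_URL is not configured") from None


if __name__ == "__main__":
    config = {
        # Hypothetical URLs, for illustration only.
        "ELASTICSEARCH_URL": "http://old-es.example.internal:9200",
        "ELASTICSEARCH_CLUSTER_URL": "https://es7.example.internal:443",
    }
    print(old_code_cluster_url(config))
    print(es7_code_cluster_url(config))
```

With both variables present in remote config, the order in which the config and code PRs merge matters less, since each version of the code finds the value it expects.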
Other Useful commands:
edx-search
Pause prod pipeline
Merge changes to master
Have revert PR available
After code is on stage:
Run these commands
python manage.py lms reindex_course_team --all
python manage.py cms reindex_course --all
Do testing in stage
Make sure to test teams.
If we feel confident with stage, unpause prod pipeline
Prevent the prod/edge deploy from going live - new instances not added to the ELB
Index data - prod/edge
This needs to be discovered and investigated - is it a checkbox? - @Fred Smith (Deactivated)
Checkbox - switch deploy_ami step from 'On Success' to 'Manual'
Manually set up ASGs for
@Diana Huang - schedule a 4 hour window for attempting to deploy
cs_comment_service
Example of running the catchup command with rake.