...
The SRE Team is tracking each stage, prod and edge database upgrade under this Jira epic: https://openedx.atlassian.net/browse/PSRE-301
Because each of our services has different requirements (owners, data pipelines), each database upgrade may require specific steps.
Each of the services in the diagram below (except Forums, Prospectus & Marketing) has a MySQL database that will be upgraded to Aurora 5.7. In addition to the service itself, our data pipelines also connect to these databases, so special care should be taken to ensure those pipelines and the downstream reports are not impacted by the upgrades.
At least the LMS, Ecommerce, Discovery, License Manager, Enterprise Catalog, Demographics and Credentials services also have DBT-generated Snowflake views, described here. Please be sure to sync with DE about requirements before each cutover.
At least LMS and Ecommerce have Jenkins EMR jobs that pull data into Vertica and/or Swoop jobs. These jobs should be checked carefully before and after each respective cutover.
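One small check that could accompany each cutover is confirming that the database endpoint a service or pipeline connects to actually reports a 5.7 server version. The sketch below is only an illustration, not part of the SRE runbook: it assumes the version string has already been fetched (for example with `SELECT VERSION()`), and the helper name `is_mysql_57` is hypothetical.

```python
import re


def is_mysql_57(version_string: str) -> bool:
    """Return True if a MySQL version string reports major.minor 5.7.

    Hypothetical helper for illustration only. The input is assumed to be
    the raw string returned by `SELECT VERSION()`, e.g. "5.7.12" or
    "5.7.12-log".
    """
    # Pull the leading "major.minor" numbers; ignore patch level and suffixes.
    match = re.match(r"\s*(\d+)\.(\d+)", version_string)
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) == (5, 7)
```

Running the same check before and after a cutover (expecting `False` on a 5.6 endpoint and `True` afterwards) gives a quick signal that a pipeline is pointed at the upgraded database.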
...
Notes (SRE)
- Done on 10/23 (not on Aurora)

Discovery & Ecommerce (Engagement)
- (not on Aurora) Discovery - Done on 10/29
- Ecommerce - Done on 11/17 @ 10pm

Credentials and Demographics (Aperture)
- Credentials - Done on 11/17 @ 3pm
- Demographics - Done on 12/7 @ 10am

Registrar and Portal Designer (Masters)
- Registrar Target - Done on 12/2 @ 10am
- Designer Target - Done on 12/9 @ 10am

Analytics API and Insights (DE)
- Prod Analytics API and Analytics Data - Done on 12/17 @ 10am-12pm ET
- Edge Analytics - Done on 12/18 @ 10am-12:11pm

XQueue (T&L)
- Move Stage xqueue schema from edxapp db to shared db - Done
- Prod Xqueue Target - Done on 1/5 @ 10am

License Manager, Enterprise Catalog, Blockstore, Video Encode Manager (Enterprise, T&L, Incident Management)
- Shared Cluster Target - Done (was tentatively scheduled for between 1/14 and 1/26)

Platform (Arch + TNL):
- Stage edxapp Target - Done on 1/25
- Edge edxapp Target - Done on 1/27
- Prod csmhe & edxapp Target - Done on 2/2

Communications Plan
Note: These steps may be followed twice if we do a segmented upgrade, first supporting TLS 1.2 on 5.6.48 and then going to 5.7 (e.g. for Notes).
...
Service | Business and Technical Owners (see list here) | Current MySQL Versions (best effort) | In Open edX? | 5.7 DevStack Upgrade (due November 9th if in Open edX, December 15th if not) | 5.7 Travis Upgrade (due November 9th if in Open edX, December 15th if not) | 5.7 Sandbox Upgrade | Planned 5.7 Stage Upgrade (due December 15) | Planned 5.7 Prod Upgrade, plus any read replicas (due Jan 1) | Planned 5.7 Edge Upgrade | Maintenance Mode / Smoke Test Docs | |
---|---|---|---|---|---|---|---|---|---|---|---|
Platform (LMS + CMS) + csmhe | Arch - | Stage: 5.7 | Yes | https://openedx.atlassian.net/browse/BOM-2059 - Muhammad Arif (Deactivated) Upgraded on 10/29 | Tests run on Jenkins instead of Travis https://openedx.atlassian.net/browse/BOM-2059 Upgraded on 10/29 | SRE will create a test sandbox once https://openedx.atlassian.net/browse/BOM-2059 is done. | (dedicated aurora) Upgraded on 01/26 @ 1 AM CNAME: stage-edx-edxapp.rds.edx.org | Coordinate with Jeremy/Nimisha/Sarina/Stu/Feanil Note: There are two databases. (dedicated aurora) Completed on 2/2 | Coordinate with Jeremy/Nimisha/Sarina/Stu/Feanil Completed on 1/27 | Maintenance Mode: See Runbook. Smoke Test: edxapp + csmhe Smoke Tests Run Book cc Sarina Canelake (Do Not Use) (Deactivated) | |
Engagement - | Stage: 5.7 | Yes | https://github.com/edx/devstack/pull/639 | Travis File https://openedx.atlassian.net/browse/REV-1568, now ecom team helping via https://openedx.atlassian.net/browse/REV-1568 | (dedicated aurora) | (dedicated aurora) | Not applicable (ecommerce not in Edge) | Maintenance Mode: 1: Disable ASG in Asgard | |||
Aperture - | Stage: 5.7 | Yes | | | N/A | (shared aurora) | (dedicated aurora) | N/A | Maintenance Mode: Disable ASG in Asgard | ||
Aperture - | Stage: 5.7 | No | Done as part of https://github.com/edx/demographics/pull/80/files | N/A | Adam Blackwell (Deactivated) ask Matt Tuchfarber (Deactivated) if demographics runs in sandbox. | (dedicated aurora) | (dedicated aurora) | N/A | Maintenance Mode: Disable in ArgoCD Smoke Test: Adam Blackwell (Deactivated) will reach out to Matt Tuchfarber (Deactivated) | ||
SRE - | Stage: 5.7 | Yes | https://github.com/edx/devstack/pull/629 | Travis File | (shared aurora) | (dedicated mysql) | (dedicated mysql) | Maintenance Mode: We can’t currently put just notes in maintenance, but the LMS fails gracefully for users and we can use Cloudflare. | |||
T&L - | Stage: 5.7 | Yes | Upgraded as of https://openedx.atlassian.net/browse/PSRE-322 UPGRADED 11/19 | https://openedx.atlassian.net/browse/PSRE-322 https://github.com/edx/xqueue/pull/783 UPGRADED 11/19 | (edxapp aurora) | (dedicated mysql) Will be done by SRE team, need coordination with Feanil Patel | N/A | Check with Feanil Patel Maintenance Mode: Disable ASG in Asgard Smoke Test: | |||
Registrar & Workers | Programs - | Stage: 5.7 | No | Upgraded as of | No Travis Docker Tests | (shared aurora) | (dedicated mysql) Endpoint for Stitch: | N/A | Maintenance Mode: Disable in Asgard Smoke Test: https://openedx.atlassian.net/wiki/x/bYCCeQ | ||
Enterprise - | Stage: 5.7 | Yes | Upgraded as of | No Travis Docker Tests | (shared aurora) | (shared aurora) | N/A | Maintenance Mode: Disable in Asgard | |||
Enterprise - | Stage: 5.7 | Yes, for now | Upgraded as of | No Travis or CircleCI | (shared aurora) | (shared aurora) | N/A | Maintenance Mode: Disable in Asgard Smoke Test: | |||
Engage - | Stage: 5.7 | Yes | Upgraded as of | Upgraded Travis as of | (shared aurora) | (dedicated mysql) | N/A | ||||
Masters - | Stage: 5.7 | No | Upgraded local compose file as of | No Travis or CircleCI | (shared aurora) | (dedicated aurora) | N/A | Maintenance Mode: Disable in Asgard Smoke Tests: Designer Smoke Tests Run Book | |||
T&L - | Stage: 5.7 | Yes | https://openedx.atlassian.net/browse/TNL-7642 https://openedx.atlassian.net/browse/OSPR-5101 Done 11/6 | N/A | (shared cluster) | (shared cluster) | N/A | ||||
Analytics Pipeline | Data Engineering - T:Brian Beggs | N/A | Yes | N/A | | ||||||
Data Engineering - T:Brian Beggs | Stage: 5.7 | Yes | N/A (According to Brian Beggs, insights and analytics-api do not have a devstack.) | Travis doesn’t use MySQL | Doesn’t have sandboxes. | (shared aurora) https://openedx.atlassian.net/browse/PSRE-304 No action items | (analytics aurora) | (analytics mysql) | Maintenance Mode: Brian Beggs will check with his team + Asgard | ||
Insights and/or Analytics Dashboard | Data Engineering - T:Brian Beggs | Stage: 5.7 | Yes | N/A (According to Brian Beggs, insights and analytics-api do not have a devstack.) | Travis doesn’t use MySQL | Doesn’t have sandboxes. | (shared aurora) https://openedx.atlassian.net/browse/PSRE-304 No action items | (analytics aurora) | (analytics mysql) | Maintenance Mode: Brian Beggs will check with his team + Asgard Smoke Test: Download button in Insights | |
ensure retirement | Simon Chen, Julie Davis (Deactivated) - repo ownership Jeremy Bowman (Deactivated) - for deprecation | N/A | No | Done 11/13 https://openedx.atlassian.net/browse/EDUCATOR-5378 - closed in favor of https://openedx.atlassian.net/browse/DEPR-106 | No Travis Docker or CircleCI Tests | N/A | N/A | ||||
Video Encode Manager | Stage: 5.7 | Yes | Not in DevStack: https://openedx.atlassian.net/browse/PSRE-371 | N/A | N/A | (shared cluster) | (shared cluster) Completed | N/A | Kashif Chaudhry (Unlicensed) Are you aware of how SRE can smoketest this service after we complete this database upgrade? | ||
License Manager | Stage: 5.7 | Yes | Fred Smith (Deactivated) I just realized this row hasn’t been added; do you know if any work needs to be done for License Manager? N/A | (shared cluster) | (shared cluster) Completed | N/A | |||||
ensure retirement | Website: | N/A | No | Due December 15 Will be retired by then - https://openedx.atlassian.net/browse/WS-1413 | Will be retired before then - https://openedx.atlassian.net/browse/WS-1413 | N/A | Needs to be removed from shared | Needs to be removed from shared | N/A | N/A | |
Enterprise Reporting | Markhors: | Stage: 5.7 | No | N/A | N/A | N/A | N/A | N/A |
...