...
Setting up the local Devstack will help you prepare for triage work you may need to do. Follow the Devstack setup instructions (https://github.com/edx/devstack). Having a local environment that is up to date will make the triage process easier.
Operational OpsGenie Steps:
...
- Once you are alerted by OpsGenie, please Acknowledge the alert.
- Review the alert and try to determine its impact.
- If the impact is high (e.g. an IDA is down and many users are impacted), notify the Learner-All channel, your Manager, and the Escalations Engineering Lead.
- Review the Run-books at /wiki/spaces/LEARNER/pages/789970979 for next steps.
- If there is no Run-book:
- Create a Learner JIRA ticket to track the issue, if one does not already exist.
- Triage the JIRA ticket immediately and follow the triage process defined above
- Contact the Escalations Engineering Lead using the "Customer Requests / Support / Escalation" Channel in HipChat.
- If the impact is low, Close the alert and begin looking at possible resolution steps in the Run-books at /wiki/spaces/LEARNER/pages/789970979.
- If the impact is unknown, notify the Learner-All channel of the alert and review the Run-books at /wiki/spaces/LEARNER/pages/789970979 for possible resolution steps. Follow the steps above if there is no Run-book available.
- After determining the impact and whether a Run-book is available, make sure to Close the alert. The alert will message you again if it is not closed.
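The steps above amount to a small decision table, which can be sketched in code as a memory aid. This is only an illustrative sketch, not an official tool: the names `Impact`, `NO_RUNBOOK_STEPS`, and `triage_steps` are hypothetical, nothing here calls OpsGenie or JIRA, and the ordering (Close the alert once impact and Run-book availability are known, then work the resolution) is one reading of the list.

```python
from enum import Enum
from typing import List


class Impact(Enum):
    HIGH = "high"
    LOW = "low"
    UNKNOWN = "unknown"


# Steps listed in the page for the case where no Run-book exists.
NO_RUNBOOK_STEPS = [
    "Create a Learner JIRA ticket if one does not already exist",
    "Triage the JIRA ticket immediately",
    "Contact the Escalations Engineering Lead",
]


def triage_steps(impact: Impact, runbook_exists: bool) -> List[str]:
    """Return the ordered checklist for one alert (assumed reading of the steps)."""
    steps = ["Acknowledge the alert"]

    # Who to notify depends on the impact; low impact needs no notification.
    if impact is Impact.HIGH:
        steps.append(
            "Notify the Learner-All channel, your Manager, "
            "and the Escalations Engineering Lead"
        )
    elif impact is Impact.UNKNOWN:
        steps.append("Notify the Learner-All channel")

    # Close once impact and Run-book availability are known,
    # so the alert does not page again.
    steps.append("Close the alert")

    if runbook_exists:
        steps.append("Follow the Run-book")
    else:
        steps.extend(NO_RUNBOOK_STEPS)
    return steps


if __name__ == "__main__":
    for step in triage_steps(Impact.UNKNOWN, runbook_exists=False):
        print("-", step)
```

Calling `triage_steps(Impact.LOW, runbook_exists=True)` yields the shortest path: Acknowledge, Close, then follow the Run-book.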
...