This document captures the Triage & Development processes as they relate to Jira and communication. We will be moving both the Spartans and Mavericks teams to similar workflows so that stakeholders have consistent expectations and can provide information and communications in a uniform, clearer format.
Triage
Customer Requests Service Desk
Links
Customer Requests Intake Forms: https://openedx.atlassian.net/servicedesk/customer/portal/9
Customer Requests Queues: https://openedx.atlassian.net/projects/CR/queues/
What & Why
Triage for the Sustaining & Escalations teams will filter through the Customer Requests Service Desk project. The motivations and benefits expected from this shift include:
- Create a unified system for stakeholders, which will simplify expectations and reduce confusion about differing processes
- Focus all communication necessary for triaging tickets in one location for stakeholders that have multiple tickets open
- The Service Desk project will gather data and metrics automatically that the teams had previously needed to capture manually
- Triage time will be captured, and we will be able to measure the time the teams actually work on tickets; SLA timing will be paused while waiting on a response
- Product Area filtering on ticket intake
- Automation for setting fields in Jira based on user input
- Determine priority based on user input for reach and impact
- Potentially determine GitHub repos based on platform area and product component
- Clear way to mark tickets that have been escalated (separate queue)
- Triage data will be used to define SLAs for time to first response and measure when there is an increase in load from outside requests
- Help with broader expectation setting & visibility for stakeholders in recurring reports
- Product Area data will be used to measure hot spots
- Hot spots are areas of the system where most issues are coming from
- This will ultimately help determine where sustaining projects can be focused
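To illustrate the SLA pause mentioned in the list above, here is a minimal sketch that adds up triage time from a ticket's status history while skipping time spent waiting on a response. Jira Service Desk is expected to do this tracking itself; the function name `active_triage_time`, the history format, the choice of "Waiting for Customer" as the only paused status, and the sample timestamps are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Assumption: only "Waiting for Customer" pauses the SLA clock.
PAUSED_STATUSES = {"Waiting for Customer"}


def active_triage_time(history, end):
    """Sum the time spent in non-paused statuses.

    `history` is a hypothetical list of (entered_at, status) tuples sorted by
    time; `end` is when measurement stops (for example, when the ticket
    reaches Triaged).
    """
    total = timedelta()
    # Pair each status entry with the start time of the next one.
    following = history[1:] + [(end, None)]
    for (start, status), (next_start, _) in zip(history, following):
        if status not in PAUSED_STATUSES:
            total += next_start - start
    return total


# Sample data, invented for this example.
history = [
    (datetime(2021, 3, 1, 9, 0), "Needs Triage"),
    (datetime(2021, 3, 1, 11, 0), "In Triage"),
    (datetime(2021, 3, 1, 12, 0), "Waiting for Customer"),
    (datetime(2021, 3, 2, 12, 0), "In Triage"),
]
elapsed = active_triage_time(history, end=datetime(2021, 3, 2, 13, 0))
print(elapsed)  # 4:00:00
```

The 24 hours spent in Waiting for Customer are excluded, so only the four hours of active triage count toward the SLA.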
Overview of the Customer Request Process
Jira Workflows
CR Queues View
These are example queues within the CR project. In the first iteration of this project, we will attempt to manage incoming triage queues by Platform Area.
The six Platform Areas are:
- Content & Authoring
- Catalog & Publishing
- Commerce & Payment
- Learner Experience
- Platform & Infrastructure
- Business & Enterprise
Our teams will be responsible for monitoring and triaging new items in these queues. We will work directly with the team leads and teams to define which teams are responsible for which queues.
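As a rough sketch of how a per-area triage queue could be expressed, the snippet below builds a JQL filter string for each Platform Area. The project key `CR` comes from the queue URL above; the custom field name `"Platform Area"`, the function name `triage_queue_jql`, and the ordering clause are assumptions, and the actual queue definitions may differ.

```python
# The six Platform Areas listed above.
PLATFORM_AREAS = [
    "Content & Authoring",
    "Catalog & Publishing",
    "Commerce & Payment",
    "Learner Experience",
    "Platform & Infrastructure",
    "Business & Enterprise",
]


def triage_queue_jql(platform_area):
    """Build a JQL filter for untriaged tickets in one Platform Area.

    Assumption: the area is stored in a custom field named "Platform Area".
    """
    if platform_area not in PLATFORM_AREAS:
        raise ValueError(f"unknown Platform Area: {platform_area}")
    return (
        'project = CR AND status = "Needs Triage" '
        f'AND "Platform Area" = "{platform_area}" ORDER BY created ASC'
    )


for area in PLATFORM_AREAS:
    print(triage_queue_jql(area))
```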
The Jira Workflow for the Customer Requests Project
This is the workflow of a ticket that comes through the CR project. Definitions for each of the workflow states are below.
Workflow State | Definition | Expectation |
---|---|---|
Needs Triage | Newly Created Ticket | Engineers will pick these tickets up and triage them. OLA to be determined |
In Triage | Engineering team is actively triaging the ticket | An engineer is actively triaging the ticket, communicating with the stakeholder within the CR ticket for any additional information needed to confirm the issue, validate priority, and determine correct routing of the ticket. |
Waiting for Customer | A question has been posed to an external stakeholder (requester, product, PC, etc) and triage is paused pending a response. | Additional communication and knowledge is needed for engineering team to be able to continue work on the issue. |
Triaged | Issue has been confirmed, priority has been set, and routing has been determined. A development ticket can be created. | There is enough information for the triaging team to know the priority of the issue, the Product Area and Product Component that are affected, and which team to route the issue to, and the team has updated the CR ticket as needed. This will be the final stage where the engineer needs to manually update the CR ticket, outside of any additional communications with the requester during development (validating fixes in QA/stage/sandbox, etc.). |
Backlog | Status of Development tickets that have been triaged and are not yet prioritized, or are lower in priority. | These issues will be reviewed at a cadence TBD to determine if the priority of these tickets should be changed and/or if they should be moved into the prioritized queue. |
Prioritized | Status of Development tickets that have been triaged and are already prioritized. CAT-1 & CAT-2 tickets will by default move to this state. (Discussions around CATs here are ongoing) | These are issues that are higher priority and have already been prioritized by the engineering team and product. They are ready to be picked up by the engineering team for development work to begin. |
In Progress | Status of Development tickets that are currently in development by the engineering team. This includes Dev statuses In Progress, In Code Review, Merged. | An engineer is actively working on development of this ticket and resolving the issue. |
Blocked | Status of a Development ticket that has begun development and has now reached a point where it is blocked. This can be caused by any number of items, including a release freeze, a rollback, or being blocked on an external team, to name a few. | An engineer is actively keeping up with the work on this ticket, but has reached a point where they are blocked from moving forward to completion. They need this external blocker to clear before they can proceed. |
Closed | Either the ticket has been closed without needing development work, or the Development ticket has been marked Done. | This issue has been resolved. The ticket can be marked closed if during triage it was determined that no development work was needed, or if the development ticket has been marked as Done and the fix has been released to Production. A final communication should be sent to the stakeholder to let them know that the issue is going to be closed and why, with the necessary level of context. |
Escalated | A ticket has been escalated by an external stakeholder. | This ticket has been escalated by an external stakeholder and Product and Engineering leadership have been alerted. The stakeholder has entered an escalation reason. This means that a conversation with product and eng-lead will occur to understand stakeholder needs and determine if any additional escalation is necessary within the team. This does not mean the ticket will definitely be moved to the top of the list. |
Reopened | A ticket that had previously been closed has been reopened for work. | This ticket needs to be triaged again to determine why it was reopened and why the original closure reason was not valid. If it appears that the original issue was resolved, but there is a new issue related to this ticket, a new ticket should be filed but linked to this original issue. |
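The table above rolls several development statuses up into a single CR state. A minimal sketch of that roll-up follows. Only the In Progress grouping (In Progress, In Code Review, Merged) is stated explicitly in the table; the one-to-one entries are inferred from the matching state names. The names `DEV_TO_CR_STATUS` and `cr_status_for` are hypothetical, and development statuses with no stated CR counterpart are deliberately left unmapped.

```python
# Hypothetical roll-up of development ticket statuses to CR ticket states.
DEV_TO_CR_STATUS = {
    "Backlog": "Backlog",
    "Prioritized": "Prioritized",
    # Stated in the table: these three dev statuses all show as In Progress.
    "In Progress": "In Progress",
    "In Code Review": "In Progress",
    "Merged": "In Progress",
    "Blocked": "Blocked",
    # Assumption: a dev ticket that is done and released closes the CR ticket.
    "Closed": "Closed",
}


def cr_status_for(dev_status):
    """Return the CR state for a dev status, or None if none is stated."""
    return DEV_TO_CR_STATUS.get(dev_status)


print(cr_status_for("Merged"))  # In Progress
```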
Development
Product Jira Project
Links
Mavericks Prod Kanban Board: https://openedx.atlassian.net/secure/RapidBoard.jspa?rapidView=511
Spartans Prod Kanban Board: https://openedx.atlassian.net/secure/RapidBoard.jspa?rapidView=507
Note
Currently these Jira boards are pulling in tickets assigned to the Sustaining: Spartans (77) and Sustaining: Mavericks (87) teams across the Product, Educator, and Learner projects. The goal is for the tickets on these boards to come only from the Product project. This also means that every ticket will need to have a team assigned in order to show up on the correct Kanban board. This step is expected to happen during triage, before a Product ticket is created. For current Learner/Educator tickets, team assignment will need to happen wherever it has not already happened.
What & Why
Development for Sustaining and Escalations teams will occur within the Product Jira Project. The current Jira development flows and options differ between sub-teams and can be very complex, capturing ticket states and transitions that confuse and overcomplicate the development workflow. This complexity can cause confusion in workflow transitions, slow down the process from start to finish due to overhead, and make it harder to capture clear reports when different terms/states mean different things across teams. The goals for this project are:
- To unify the teams into a similar, simple workflow to help reduce friction of state transitions from development start to finish.
- To reduce confusion over where to move a ticket once it has been triaged and what tickets to work on next.
- To be able to create one set of reports that can easily measure sub-team performance in an equal manner.
- To be able to easily grab development tickets for hot spot parts of the system in one place to use for Sustaining projects.
- An ongoing conversation and longer term goal is to have the Product project be the home of all product related tickets across edX that new Themes/Squads would be able to search through in order to see what outstanding issues exist with Product Components that they are going to be actively developing on.
Jira Workflow
This is the workflow for a development ticket that comes through the Product Project. Definitions of the workflow states are below.
Workflow State | Definition | Expectation |
---|---|---|
Backlog | Issues that are a part of the project but have not yet been prioritized or are lower in priority. | These issues will be reviewed at a cadence TBD to determine if the priority of these tickets should be changed and/or if they should be moved into the prioritized queue. |
Prioritized | Prioritized issues that are ready for development. CAT-1 & CAT-2 tickets will by default move to this state. (Discussions around CATs here are ongoing) | These are issues that are higher priority and have already been prioritized by the engineering team and product. They are ready to be picked up by the engineering team for development work to begin. |
In Progress | Issues that are currently in development by the engineering team. | An engineer is actively working on development of this ticket and resolving the issue. |
Blocked | The issue has begun development and has now reached a point where it is blocked. This can be caused by any number of items, including a release freeze, a rollback, or being blocked on an external team, to name a few. | An engineer is actively keeping up with the work on this ticket, but has reached a point where they are blocked from moving forward to completion. They need this external blocker to clear before they can proceed. |
In Code Review | Code is ready and/or being reviewed by additional engineers. | An engineer is actively keeping up with the work on this ticket and code review is underway. The issue may move back into In Progress if the review uncovers larger changes to be made. The ticket # should be linked in the PR description to ensure that the PR and the Jira issue are linked. |
Waiting on Reporter | If needed, the fix is being validated by an outside stakeholder via manual user validation. This could be the person reporting the issue, a PC, a course team, product, or another relevant stakeholder. | An engineer is actively keeping up with the work on this ticket and ensuring that the fix aligns with the stakeholder's expectations. Links to a sandbox, QA, or Stage environment will likely be required for stakeholder validation. |
Merged | A ticket has been merged into the code base, but not yet released to production. | The code fix has been merged, but not yet released to production. |
Closed | Issue has been resolved and released to Production. | The fix has been released to production and the issue should have a resolution. The CR ticket should be updated with a message confirming completion and production release. |
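The In Code Review expectation above asks for the ticket number to appear in the PR description. One way to check for it is a simple pattern match, sketched below. The project key `PROD` and the sample descriptions are placeholders (substitute the Product project's real key), and `find_issue_keys` is a hypothetical helper. The pattern follows the usual Jira issue-key shape: an uppercase project key, a hyphen, and a number.

```python
import re

# Usual Jira issue-key shape: uppercase project key, hyphen, issue number.
ISSUE_KEY = re.compile(r"\b([A-Z][A-Z0-9]+)-(\d+)\b")


def find_issue_keys(description, project_key=None):
    """Return Jira-style issue keys found in a PR description.

    If `project_key` is given, only keys from that project are returned.
    """
    keys = []
    for match in ISSUE_KEY.finditer(description):
        if project_key is None or match.group(1) == project_key:
            keys.append(match.group(0))
    return keys


# "PROD" is a placeholder project key, not necessarily the real one.
print(find_issue_keys("Fixes PROD-123: handle empty course runs", "PROD"))
# ['PROD-123']
print(find_issue_keys("Small cleanup, no ticket", "PROD"))
# []
```

A check like this could run in CI or a PR template linter to flag descriptions with no linked ticket.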
Communication Policy
What & Why
To optimize our process and prevent unnecessary interruptions to the review process, we have to be transparent with our stakeholders & customers about the work being done and how it is progressing. The following policy will help keep communication throughout the development lifecycle efficient and reliable, while building trust with our stakeholders that the right work is getting done in reasonable timeframes.
Communication Mapping to Customer Request Review
What to expect from Communications
See the CR Communication Policy Slides for a guided walkthrough. Below is a summary table:
Step | Description | Call to action (customer) |
---|---|---|
Begin Triage | Automated notice provided via a status update in the CR ticket | None |
Clarify Problem | Expected back-and-forth communication to understand the underlying problem of which the reported issue is a symptom | Respond to any requests for additional info and confirm mutual understanding of the problem we are trying to solve |
Summarize CR | A posted comment summarizing the review of the issue and the expected path forward based on the Triager’s current understanding | Ensure alignment with the general direction of the ticket based on the summary, and express any concerns with the identified path forward via the FYI or escalation workflow, as appropriate to the severity of the issue. |
Development backlog | The status will be auto-updated in the CR to reflect that a ticket is in the backlog for Cat-3 through Cat-5 tickets | None |
Prioritize | The status will be automatically updated in the CR to reflect Prioritized status. Cat-1 & Cat-2 tickets are auto-prioritized once triaged. | None |
Assign Ticket to Self | An automated post to the CR will occur when a Sustaining Eng assigns the ticket to themselves for review | None |
Determine Plan of Action | Once a technical review has been completed, the Engineer will post in the CR with the review summary and intended next steps before coding. | Ensure alignment that the intended fix will resolve the issue to the best of our ability. Raise any concerns if it will not be a satisfactory resolution. |
Daily Status updates | As long as a ticket is in active development status with an assigned engineer, a daily status post will be made to the CR on the current progress. | None |
Hold Updates | Tickets that are “Blocked” or “Waiting on Reporter” will have a manual CR update posted every other day for up to a week. If a ticket is Blocked by a dependency on another development team for more than a week, the Engineer will notify the Product owner and Eng lead and post the next expected update on the CR. Tickets that are “Waiting on Reporter” will have a follow-up posted to the CR two days after the initial request, and if no response is received after two more days the ticket will be closed. | Respond to any requests for clarification on the CR. If the blocked status threatens an important deadline for the user, escalate the CR ticket and post details about the timeframe. |
Resolution Confirmation | Once the resolution is merged, there will be a closing comment that describes where and when the resolution can be observed. | Confirm the resolution and ask any clarifying questions to help close the ticket with the end user. If there are follow-on enhancements beyond the acute issue originally reported, create the appropriate ticket for additional review. |
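The Hold Updates timing in the table above can be sketched as a small decision function. This is one reading of the policy, not a definitive implementation: it assumes "every other day" means days 2, 4, and 6 on hold, applies the more specific follow-up and closure rule to Waiting on Reporter tickets, and returns "no update due" where the policy is silent. The function name `hold_action`, its parameters, and the returned strings are hypothetical.

```python
def hold_action(status, days_on_hold, blocked_on_other_team=False):
    """Return the communication step due for a ticket on hold.

    `days_on_hold` counts whole days since the ticket entered the hold
    status (assumption: day 0 is the day it was put on hold).
    """
    if status == "Waiting on Reporter":
        if days_on_hold >= 4:
            # No response two days after the follow-up: close the ticket.
            return "close ticket"
        if days_on_hold == 2:
            # Two days after the initial request: post a follow-up.
            return "post follow-up"
        return "no update due"
    if status == "Blocked":
        if days_on_hold > 7 and blocked_on_other_team:
            # More than a week blocked on another development team.
            return "notify Product owner and Eng lead"
        if 0 < days_on_hold <= 7 and days_on_hold % 2 == 0:
            # Manual CR update every other day during the first week.
            return "post CR update"
        return "no update due"
    raise ValueError(f"not a hold status: {status}")


for day in (2, 3, 4):
    print(day, hold_action("Waiting on Reporter", day))
```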