Community Engineering New Relic Integrations

This document describes our squad’s footprint in New Relic. It’s not so bad.

About Code Owners

Code in edx-platform has been annotated with code owners via the Tech Ownership Assignment spreadsheet; our squad appears there as community-engineering. When code throws an error, it gets logged to New Relic. Our alert policies below have been specifically set up to include a “where” clause in their NRQL (New Relic Query Language) that filters down to code owned by us. You’ll see it if you look at the NRQL in the policy alert conditions. Conceptually, everything reported to New Relic is evaluated against all the alert policies; each policy picks up whatever matches its conditions and sends alerts out on its notification channels. We’ve created policies that filter by our squad name.

So that’s why we only get alerted for stuff we own.
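To make the “where” clause concrete, here is a rough sketch of the kind of NRQL one of these conditions might use. It is an illustration, not a copy of our real conditions: the code_owner attribute name, the TransactionError event type, and the prod-edge-edxapp-lms app name are all assumptions, so check the actual alert condition in New Relic before relying on any of them.

```python
def owner_filtered_nrql(squad: str, app_name: str,
                        event_type: str = "TransactionError") -> str:
    """Build an NRQL query that counts events owned by one squad in one app.

    Assumptions (not confirmed by this page): ownership is exposed as a
    custom attribute called `code_owner`, and errors are queried from the
    `TransactionError` event type.
    """
    for value in (squad, app_name):
        # Refuse quotes rather than guess at NRQL escaping rules.
        if "'" in value or "\\" in value:
            raise ValueError(f"unsupported character in {value!r}")
    return (
        f"SELECT count(*) FROM {event_type} "
        f"WHERE appName = '{app_name}' "
        f"AND code_owner = '{squad}'"
    )


# Hypothetical app name, guessed from the policy naming scheme.
query = owner_filtered_nrql("community-engineering", "prod-edge-edxapp-lms")
print(query)
```

Dropping the code_owner part of the WHERE clause gives the unfiltered flavor of query, which is closer to what the micro-frontend policies do.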

We could also create alert policies that aren’t filtered by code owner if we really wanted, and that’s what we do for the two micro-frontends we own. There are alert policies for those; we just set them to only alert our squad. A different squad could take them over just by changing the notification channels.

Alert Policies

You can search alert policies here: https://one.nr/0bEjOpGmpQ6

The community engineering squad has four alert policies in New Relic, and all our policies include the string community-engineering for easy searching. Two are for LMS (one is prod, one is edge), two for frontends in prod. The LMS policies include both web and worker tier alerts.

A search showing our alert policies: https://one.nr/0oqQaoWD0R1

The individual policies:

prod-edx-edxapp-lms-community-engineering

prod-edge-edxapp-lms-community-engineering

prod-frontend-app-profile-community-engineering

prod-frontend-app-account-community-engineering

In each policy, you’ll see a number of alert conditions. We should feel free to tweak the condition thresholds, advanced signal settings, etc., to make these alert conditions work for us. It’s possible that these alert thresholds are too high: our squad’s code used to be lumped in with other code T&L owned, so the thresholds may have been set around some particularly spammy thing we don’t actually own now.

Notification Channels

Notification channels are how New Relic communicates alerts to outside services. We use three, two of which are specific to our squad:

Slack integration with #ce-alerts

OpsGenie integration with the Community Engineering team

AlertStatsCollection: shared with other squads; used to send events to Insights, I believe.
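
As a mental model for how policies and channels fit together, here is a small sketch of handing a policy over to another squad by swapping its notification channels. This is plain Python, not the New Relic API. It assumes the profile frontend policy has all three channels attached, and #other-squad-alerts is a made-up channel name.

```python
from dataclasses import dataclass, field


@dataclass
class AlertPolicy:
    """Toy stand-in for a New Relic alert policy and its channels."""
    name: str
    channels: set = field(default_factory=set)


def hand_over(policy: AlertPolicy, old_channels: set, new_channels: set) -> AlertPolicy:
    """Return a copy of the policy with one squad's channels swapped for another's."""
    return AlertPolicy(policy.name, (policy.channels - old_channels) | new_channels)


# The two squad-specific channels from the list above.
CE_CHANNELS = {"Slack #ce-alerts", "OpsGenie Community Engineering"}

# Assumption: this policy notifies all three channels.
profile_policy = AlertPolicy(
    "prod-frontend-app-profile-community-engineering",
    CE_CHANNELS | {"AlertStatsCollection"},
)

# Hypothetical new owner; the shared channel stays attached.
handed_over = hand_over(profile_policy, CE_CHANNELS, {"Slack #other-squad-alerts"})
print(sorted(handed_over.channels))
```

The alert conditions themselves would not change in a handover like this; only where the notifications go.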