Exploring Analytics for Microfrontends

Exploring Business Requirements

High-level business requirement questions:

- What are the business requirements for tracking events for the header?
- What are the business requirements for generic event tracking?
- What is the overlap?

Notes

  • Gabe Mulley's experience is that:
    • Even events designed specifically for a particular analysis are nearly always tweaked to get them right, and legacy data from before the tweak is rendered useless.
    • It is possible, but unlikely, that general data would be useful because there are too many variables (e.g. what did that event mean at that particular time, since things can move).
      • Given this, general collection only makes sense if it is really cheap (infrastructure, design time, architecturally, etc.)
  • Events are tracked through Segment (not sent directly to GA).

Additional related questions:

  • Where would generic event handling be good enough?
  • For the shared header, what tracking events are required?
    • Is it important to have any consistency across headers?
    • Would the name need to change per event?
    • Do we need to know what Front-end Application we are in?
    • What other data?
    • Category and label? Mobile? Link? Other properties?
    • Would a generic solution solve this?

Example Header Events (Prospectus):

User Menu and Site Navigation:

window.analytics.track('edx.bi.link.header_navigation.clicked', {
  category: 'navigation',
  label,
  link: event.currentTarget.href,
  mobile: false,
});

analyticsAllSearch() {
  window.analytics.track('edx.bi.user.search.submitted', {
    category: 'search',
  });
}

analyticsSearch(label) {
  window.analytics.track('edx.bi.user.query.submitted', {
    category: 'search',
    label,
  });
}


Footer:

window.analytics.track('edx.bi.footer.link', {
  category: 'outbound_link',
  label: $link.attr('href')
});


Exploring Potential Implementations

The intention is to move this to an RST doc at some point.  If we need to capture comments before then, I can do the move sooner.


This section details some exploration of implementations for both our microfrontend apps and shared components.

Background

The LMS provides the ability to send events to Segment or to the tracking log (and potentially on to Segment).  At this time, these docs do not yet cover events that go to the tracking log.

Different microfrontend applications have selected different front-end libraries for analytics, so each library needs to be reviewed for the functionality it provides.

Setting up Segment for a new Microfrontend

/wiki/spaces/AN/pages/938705325

Identify User

When using Segment, we need to call window.analytics.identify() with the appropriate user identification details.

  • Background
    • Enterprise Portal was using the user's email.  It has been established that that is not the right solution, but we are working out the details.
    • LMS/CMS currently use the user id.  LMS/CMS also send email and username as additional details, but not the main identifier.
    • Prospectus:
      • Unclear at this time what it is using.  I only found this code which doesn't provide the main id as far as I can tell.
  • General Notes
    • Note: Segment recommends that identify is called upon loading a new page in addition to the more obvious call after login.
    • Options for user tracking id:
      • Option 1: LMS User ID (Recommended) 
        • This is the current recommendation as documented by the analytics team.  It is also what is mostly in use (e.g. in LMS/CMS). 
        • Implementation proposal: Add "user_id" to JWT.
        • Note: The LMS/CMS also pass username and email as additional traits to identify.
          • It is unclear what specific integration is using this additional information.  It seems unlikely it is GA.
          • For now, we will exclude this information from new microfrontends.
      • Option 2: Anything else (Rejected)
        • See analytics documentation under Option 1 for detailed explanation.
    • ARCH-379: this ticket is about implementing the id on the new User Account (Profile Page) microfrontend app.
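
A minimal sketch of the Option 1 recommendation, assuming the LMS user ID has already been read from the JWT (the "user_id" claim is the proposal above, not an existing field). The helper name identifyUser and its analytics parameter are hypothetical; analytics.identify(userId) is Segment's call. In line with the note above, no username or email traits are passed.

```javascript
// Hypothetical helper: identify the current user to Segment by LMS user ID.
// `analytics` defaults to the global Segment object when one is present.
function identifyUser(userId, analytics = globalThis.analytics) {
  if (userId === undefined || userId === null || !analytics) {
    // Anonymous visitor, or Segment is not loaded: nothing to identify.
    return false;
  }
  // The ID is sent as a string; no traits are sent from new microfrontends for now.
  analytics.identify(String(userId));
  return true;
}
```

Following the Segment recommendation noted above, an app would call this on each page load as well as after login.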

Passing Analytics Properties to Children

One may need the ability to pass analytics properties from parents to children, so they are available when a child component tracks an event, without the child needing to know about its context.

  • Background 
    • NYTimes React Tracking
      • Prospectus is successfully using this library to track events within the application.
      • It provides a way to make event tracking context available down the React component hierarchy.
      • Uses a Higher-Order Component to add props.
      • Decorators provide much of the benefit, but they aren't an official feature of JavaScript.
        • Is all react-tracking functionality available without the decorators?
        • Note: Prospectus is using their Higher-Order Component without the Decorators.
    • Medium article: React-based user behavioral tracking
      • Coursera describes proprietary solution using:
        • React Context to pass analytics properties.
        • Analytics specific components (atoms) like TrackedButton.
      • A home-grown solution could pull from this model.
  • General Notes:
    • For applications:
      • React-tracking is getting the job done for Prospectus, and is good enough for now.
      • When and if a solution is required for shared components, it may or may not replace the need for this third-party library.
    • For shared components:
      • We don't have a clear need for this yet.
      • It does not make sense to be dependent on a third-party library like react-tracking, although it may make sense to have a way to bridge the systems.
      • Until we have requirements needing a POC, it is unclear whether a final solution should use:
        • React Context, and/or
        • Higher-Order Component, and/or
        • Standard props/render props.
      • The solution should use reasonable namespacing. For example, avoiding conflicts with react-tracking's props.tracking property.
      • We don't have clear requirements/scenarios where this is needed in a shared component, other than knowing we may want it to be compatible with third-party libraries like react-tracking without being dependent on them.
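
Whichever mechanism ends up carrying the data (React Context, a Higher-Order Component, or props), the underlying idea is that each level merges its own properties over what it inherited. A framework-free sketch of that idea follows; every name in it, including createTrackingScope, is hypothetical and not part of react-tracking or any existing package.

```javascript
// Hypothetical sketch: a tracking scope that children extend without knowing their context.
function createTrackingScope(dispatch, inherited = {}) {
  return {
    // A child scope adds its own properties on top of the parent's.
    child(properties) {
      return createTrackingScope(dispatch, { ...inherited, ...properties });
    },
    // Event-specific properties win over inherited ones.
    track(eventName, properties = {}) {
      dispatch(eventName, { ...inherited, ...properties });
    },
  };
}

// Example: the header sets the category once; the link only supplies its label.
const sent = [];
const appScope = createTrackingScope((name, props) => sent.push({ name, props }));
const headerScope = appScope.child({ category: 'navigation' });
headerScope.track('edx.bi.link.header_navigation.clicked', { label: 'courses' });
```

In a React application the scope would travel through Context or props rather than being passed by hand, but the merge order would be the same.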

Firing Analytics Events

  • Background:
    • Enterprise Portal
      • Uses redux-beacon/segment
        • Library allows a redux application to use redux actions for eventing.
        • See wiring code here
        • It is unclear how and if this solution would play nicely with solutions for "Passing Analytics Properties to Children" (see above).
    • Prospectus
      • Uses Segment's interface to fire track events.
      • Uses react-tracking to solve a separate problem (see above for "Passing Analytics Properties to Children")
      • Note: it is also possible to use react-tracking to wrap Segment for dispatching, but Prospectus did not do this.  Alasdair had concerns that one might not want to do this in certain cases, although it was not clear to me which ones.
  • For applications:
    • Recommend encapsulating Segment's window.analytics.track call in a wrapper function.
      • It is not yet clear whether this should be sharable code in the cookiecutter or a package.
    • Further review needed in User Account to determine when and if redux-beacon/segment provides benefit for certain events.
  • For shared components:
    • Atomic components (like Paragon's HyperLink)
      • Waiting on need/POC.
      • Needs to play well with onClick.
    • Non-atomic shared components (molecule or organism)
      • Initial MVC POC for footer shared component uses a new render prop (i.e., prop function).  The prop function would take two arguments aligning with Segment's track interface: (eventName, eventProperties), a.k.a. (eventKey, eventValue).

        Although the options, pros, and cons attempt to capture what is known right now, I am sure that this will evolve as we build more.  So, this is an attempt to put our best foot forward without pretending that it will be perfect.

        • Naming Convention proposal:
          • Use the term "trackEvent" or "TrackEvent" somewhere in the prop name.
          • Start the prop name with "handle".  Not sure if this is a React best-practice, but it is in React's documented examples.  
        • There are several options for prop scope:
          • Option 1: Single prop for all events. (Proposal: Recommend)
            • Example prop: handleAllTrackEvents
            • For application overrides, the component would need to supply constants for comparison in an if-statement.
              • In the case of the POC, there would be a single constant for 'edx.bi.footer.link'.
            • Pros
              • Ease of use for app that doesn’t want to override. They get all analytics (with a single prop).
            • Cons
              • Requires an if-statement for overrides by the application.
              • If there is one event definition (like in this case), and an application overrides without an if-statement, they could set themselves up to mistakenly override future events.

          • Option 2: Single prop per event name. (Proposal: Reject)
            • Example prop: handleLinkClickedTrackEvent
            • Pros
              • Enables app to more easily override specific events.
            • Cons
              • More work for an app if it is not concerned with overrides.
              • Could we have a case with many different events?  We don't yet.
          • Option 3: Single prop per event name or value difference. (Proposal: Strong Reject)
            • Example prop: handleOutboundLinkClickedTrackEvent
              • In the case of the POC, these events are only sent from outbound links with category "outbound_link".
            • This solution seems unsustainable based on potential permutations of the values.
          • Note for future: As a best practice, a component could offer Option 1 or Option 2 (or both) depending on the likelihood of needing to override.
            • In the case of the footer, we may want to standardize on the event, reducing the likelihood of overriding.
            • If we do want to differentiate between apps in the footer events, the apps could also add an additional event property.  We may or may not want best-practices/helpers for this as well.
        • Required or optional analytics props: 
          • We want a solution that makes it difficult to forget analytics, but is simple to implement for an app.
          • Option A: Required analytics props. (Proposal: Recommend)
            • Combined with Option 1 above:
              • The app only needs to supply the handleAllTrackEvents once, even if more events are added in the future.
            • Combined with Option 2 above:
              • The app needs to supply a new required prop each time a new name will be fired.
              • Pro
                • May be useful to prompt thinking about whether an app wants to override for each new event.
              • Con
                • Makes roll-out of new events more painful because they will break apps at build time.
          • Option B: Optional analytics props. (Proposal: Reject)
            • Optional properties are simpler, but possibly too risky for eventing, where mistakes are easy to make and hard to notice.
          • Note for future: We may someday want the ability to pass a prop like handleAllTrackEvents using React Context.  If so, would there be any way to require either the prop or the context?  If not, does that rule out React Context if we wish to require this?
      • Implementing the header (and other shared components) is certain to evolve any current thinking.
      • Our solution could be affected by the introduction of "Passing Analytics Properties to Children" (see above).
      • Future solutions could potentially use:

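A minimal sketch of two of the recommendations above: an application-level wrapper around Segment's window.analytics.track, and an app-supplied handler for Option 1 that overrides one known event by comparing against a constant supplied by the component. The names trackEvent, FOOTER_LINK_EVENT, and createFooterTrackHandler, and the app event property, are hypothetical; only analytics.track(eventName, properties) comes from Segment.

```javascript
// Hypothetical wrapper so application code never touches window.analytics directly.
function trackEvent(eventName, properties = {}, analytics = globalThis.analytics) {
  if (!analytics) {
    return; // Segment snippet not loaded (e.g. in tests); drop the event quietly.
  }
  analytics.track(eventName, properties);
}

// Constant the shared footer would export so apps can compare against it.
const FOOTER_LINK_EVENT = 'edx.bi.footer.link';

// Builds the function an app would pass as the footer's handleAllTrackEvents prop.
function createFooterTrackHandler(appName, analytics) {
  return function handleAllTrackEvents(eventName, properties) {
    if (eventName === FOOTER_LINK_EVENT) {
      // Override only the event we know about, here by adding an extra property.
      trackEvent(eventName, { ...properties, app: appName }, analytics);
      return;
    }
    // Any event the footer adds later passes through unchanged.
    trackEvent(eventName, properties, analytics);
  };
}
```

The if-statement is what keeps the app from silently overriding events the footer adds later, which is the risk listed under the cons of Option 1.
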
Sending events to the (LMS) Tracking Log

See "Consolidating Event Tracking" section first.  This section is older.


Microfrontends (e.g. User Account with Profile):

  • Ideally, the frontend could send all events to Segment, and everything else (including the tracking log), would be handled downstream.
  • We may need a way to bypass the tracking log, which might apply only to "edx.bi.*" events.
  • Events need to go to GA.
    • Would any events ever not go to GA?
    • How do we determine GA property(ies) to send to?  For example, the legacy profile page event went to the LMS GA property.  I assume that means we want this same event, "refactored" to the microfrontend, to go to the LMS GA property. Would that mean that all future events from the same microfrontend would also go to this GA property?
    • Does it make sense to reuse /segmentio/event, or do we need a 3rd endpoint (see additional questions below)?
      • Answer: No.
    • The Profile page has a legacy tracking log event which is basically a glorified page view with additional metadata.
      • Is there a better way to handle these more generally through page events?
      • How would one determine what page events they would or would not want to go to the tracking log? Is there a reason we wouldn't want to send these to the tracking log?

Consolidating Event Tracking

  • Wish list for event tracking from microfrontends:
    • All events sent to Segment using (wrapped) Segment calls.  This gives the additional benefit of resilient JS code (e.g. trackLink and trackForm).
      • Note: for Tracking Log events, the front-end code is not resilient, and it relies on calling the LMS, which is not the most reliable option.
    • Ideally, all events are sent to a reliable Destination (e.g. an AWS bucket) so that none are lost.
    • Processing could be handled async on server.
      • Could handle adding user_id given username (e.g. for Profile API).
    • Need a way to indicate stability of event.
      • Versioned? Other indicator? Available to GA as well?
      • Can/should solution be kept out of the name (e.g. "edx.bi"), or should the version be part of the name?
      • Using RAPID, who are all the I's for event design at this point, including potential for refactoring legacy events? 
    • Need a way to configure where an event should be sent, including tracking logs and GA.
      • Simplified defaults and decision making around this?
    • Special considerations for GA?  Send from backend processor back to Segment and on to GA?  Different GA properties?
    • Front-end consistent naming needed.  Use Segment names?  Tracking log names?  Other?  See code comment.
    • Bill DeRusha has additional thoughts that require a conversation.
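
One way the "configure where an event should be sent" and "simplified defaults" items could look is a small routing table with defaults and per-event overrides. This is only a sketch under assumptions: the destination labels ('ga', 's3', 'tracking_log'), the override entry, and the names DEFAULT_DESTINATIONS, EVENT_OVERRIDES, and resolveDestinations are all hypothetical, and the "edx.bi.*" rule is just the possibility raised in the tracking log section, not a decision.

```javascript
// Hypothetical defaults: every event goes to these destinations unless a rule says otherwise.
const DEFAULT_DESTINATIONS = ['ga', 's3', 'tracking_log'];

// Hypothetical per-event overrides, keyed by exact event name.
const EVENT_OVERRIDES = {
  'edx.bi.footer.link': ['ga'],
};

function resolveDestinations(eventName) {
  if (Object.prototype.hasOwnProperty.call(EVENT_OVERRIDES, eventName)) {
    return EVENT_OVERRIDES[eventName];
  }
  if (eventName.startsWith('edx.bi.')) {
    // Possible rule: "edx.bi.*" events bypass the tracking log.
    return DEFAULT_DESTINATIONS.filter((destination) => destination !== 'tracking_log');
  }
  return DEFAULT_DESTINATIONS;
}
```

With defaults like these, only events listed in the override table would need per-event decisions and testing.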

QA for Events and Configuration

It would be great to find ways to make this simpler. Here is an example test plan for the Profile microfrontend events.

  • Each new microfrontend requires a new Segment source, and likely a Segment destination for GA and S3.
    • GA testing should review real-time events in GA.  In Production, this probably needs a Category for the search in order to find the event.
    • S3 setup doesn't have a good process defined, since Data Engineering needs to both whitelist and verify.  There may need to be a ticketing system for this so one can know that it was properly completed. 
  • Events sent to the LMS Event API can be tested in one of two ways:
    • For whitelisted events that are sent to LMS Segment, you can use the Segment Debugger.
    • For events that are not sent to LMS Segment, only Data Engineering or Devops can test by reviewing splunk4tracking.  Other engineers are not allowed access.
  • One needs to test setup once per microfrontend in every environment to know if all is well (Segment Stage, Prod, possibly Edge).
  • For whitelisted events, one needs to test each event for each environment.
    • Note: it would be great if the defaults minimized this type of testing.