Date: 17 March 2023
The Data Working Group focuses primarily on advancing the data and analytics capabilities of the Open edX platform. Our primary goals are to establish and promote data and analytics best practices across the ecosystem, and ensure that the Open edX platform provides and supports analytics capabilities specifically for small- to medium-sized Open edX deployments.
This report summarizes what we’ve done over the past year, what we plan to do in the next six months, and a wider vision of the group’s future beyond that.
Over the past year, our major accomplishments were:
Discovery and Specification for OARS V1
We have decided to replace Insights with the Open Analytics Reference System (OARS), a lightweight, flexible data pipeline based primarily on third-party open source solutions for routing, storing, and analyzing Open edX event data. The OARS architecture will be cost-effective for small- to medium-sized Open edX sites, will scale appropriately for their expected data use, and will consist of loosely coupled components that operators can swap out if their deployments require it. We have investigated and decided on the technologies for the reference implementation, including ClickHouse, Ralph, and Superset.
Reference implementation for OARS V1
Alongside the discovery and specification, we have also begun the reference implementation for OARS in the Tutor environment.
Redis Streams as a Message Bus
We have a funded contribution project in progress with OpenCraft to complete a reference implementation of Redis Streams as a second concrete implementation of the Open edX message bus. This should allow all Open edX operators to gain the benefits of asynchronous messaging, and it advances our plans for a less coupled architecture.
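To make the asynchronous messaging idea concrete, here is a minimal sketch of the Redis Streams primitives a message bus can build on: a producer appends events to a stream, and a consumer group reads and acknowledges them. This is not the Open edX message bus API. The function names, the stream name, and the event type are hypothetical, and the sketch assumes a redis-py style client created with `decode_responses=True` and a consumer group that already exists (for example via `xgroup_create`).

```python
import json
from typing import Any, Callable, Dict


def publish_event(client: Any, stream: str, event_type: str,
                  payload: Dict[str, Any]) -> str:
    """Append one event to a Redis stream and return the entry ID.

    `client` is assumed to expose redis-py's `xadd(name, fields)`.
    """
    fields = {"type": event_type, "data": json.dumps(payload)}
    return client.xadd(stream, fields)


def consume_once(client: Any, stream: str, group: str, consumer: str,
                 handler: Callable[[str, Dict[str, Any]], None],
                 count: int = 10) -> int:
    """Read entries not yet delivered to this group, handle and ack each.

    Returns the number of entries handled. The ">" ID asks Redis for
    entries that have never been delivered to the consumer group.
    """
    batches = client.xreadgroup(group, consumer, {stream: ">"}, count=count)
    handled = 0
    for _stream_name, entries in batches:
        for entry_id, fields in entries:
            handler(fields["type"], json.loads(fields["data"]))
            # Acknowledge only after the handler succeeds, so a crash
            # leaves the entry pending for a later retry.
            client.xack(stream, group, entry_id)
            handled += 1
    return handled
```

With a real server, `client` would be something like `redis.Redis(decode_responses=True)`. Because the producer only appends to the stream and never calls the consumer directly, the two services stay loosely coupled.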
Google Analytics 4 upgrade
We also have a funded contribution project in progress with Raccoon Gang to upgrade our Google Analytics support and expand GA tracking into several microfrontends that did not have GA support added when they were moved out of edx-platform.
Concrete Plans - Next 6 Months
Over the next six months, the Data Working Group expects to start seeing the fruits of our planning. We hope to make a functional OARS V1 available for testing and feedback, and to make detailed plans for OARS V2, which will center on returning processed analytical data to instructors directly in the CMS, reacting to early feedback, and growing the dataset and available reports.
Future Vision for the Group
Moving beyond the next six months, the Data Working Group is looking toward forming a cohesive xAPI profile for Open edX, growing our community’s data capabilities, and collaborating with other working groups on the foundational pieces of adaptive learning. With upcoming projects across the platform for a tagging and taxonomy system, modular learning, and standards-based learning traces in a data lake, we will have a solid foundation to push the boundaries of how learner experiences can be tailored to improve educational outcomes!
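For readers unfamiliar with xAPI, the sketch below shows the general shape of a statement: an actor, a verb, and an object, plus a timestamp. It is only an illustration of the xAPI statement structure, not the profile the group intends to define. The helper name, the LMS URL, the username, and the course identifier are hypothetical, while the verb IRI is the standard ADL "completed" verb.

```python
from typing import Any, Dict

# Standard ADL verb IRI for "completed".
COMPLETED_VERB = "http://adlnet.gov/expapi/verbs/completed"


def build_statement(username: str, homepage: str, verb_id: str,
                    verb_name: str, object_id: str,
                    timestamp: str) -> Dict[str, Any]:
    """Assemble a minimal xAPI statement as a plain dictionary.

    The actor is identified by an account (home page plus name) rather
    than an email address; `timestamp` is an ISO 8601 string.
    """
    return {
        "actor": {
            "objectType": "Agent",
            "account": {"homePage": homepage, "name": username},
        },
        "verb": {"id": verb_id, "display": {"en-US": verb_name}},
        "object": {"objectType": "Activity", "id": object_id},
        "timestamp": timestamp,
    }


# Hypothetical example values, for illustration only.
statement = build_statement(
    username="learner-123",
    homepage="https://lms.example.com",
    verb_id=COMPLETED_VERB,
    verb_name="completed",
    object_id="https://lms.example.com/course/example-course",
    timestamp="2023-03-17T12:00:00+00:00",
)
```

A profile would go further than this by fixing which verbs, activity types, and extensions Open edX events map to, so that statements from different deployments can be analyzed consistently.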
Deep Dive: Open Analytics Reference System
The following links have details about the high-level architectural decisions that have informed OARS: