Event Bus Architecture Overview
Vague Drawing
TBA (will have to be ported from a Google Drawing)
Glossary
TODO - straighten out these terms and make sure we all agree on what an event bus is (system vs server)
TODO - how does the Django documentation talk about signals?
Signal - An instance of OpenEdxPublicSignal. (synonym: channel?)
Event data - A dictionary whose structure is determined by the init_data attribute of an instance of OpenEdxPublicSignal. Event data is sent via a call to MY_SIGNAL.send_event(**event_data).
Note: Django (sort of) also refers to these as signals. More details to follow. The keyword arguments of a call to signal.send() are what we are calling event_data. TODO - find better wording for this note.
Message - A unit of information to be sent from one service to another via a message broker. Sometimes called an event (i.e., the ‘event’ of ‘event bus’), but for the purposes of this document it is easier to use the term ‘message’.
Message broker - A service that receives messages from other services and organizes/stores them in some way such that other services can query it (the message broker) and receive the messages they want
Worker - A machine or process that can run code from a service but doesn’t run the web app
Event bus - Code that manages sending messages asynchronously from one service to another
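To make the "signal" and "event data" terms concrete, here is a minimal sketch in plain Python. SketchSignal is a hypothetical stand-in, not the real OpenEdxPublicSignal; it only mimics the two ideas from the glossary: init_data determines the structure of the event data, and send_event(**event_data) hands that data to every registered receiver. The event type string and the thing_id field are made up for illustration.

```python
class SketchSignal:
    """Hypothetical stand-in for OpenEdxPublicSignal (not the real class)."""

    def __init__(self, event_type, init_data):
        self.event_type = event_type
        self.init_data = init_data  # maps each expected keyword to its type
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send_event(self, **event_data):
        # Reject event data whose keys do not match init_data.
        if set(event_data) != set(self.init_data):
            raise ValueError(f"expected keys {sorted(self.init_data)}")
        for key, expected_type in self.init_data.items():
            if not isinstance(event_data[key], expected_type):
                raise TypeError(f"{key} must be {expected_type.__name__}")
        # Receivers are called one after another, in the sender's process.
        return [(r, r(signal=self, **event_data)) for r in self.receivers]


# Made-up event type and field, for illustration only.
MY_SIGNAL = SketchSignal("org.example.thing.happened.v1", {"thing_id": int})

received = []
MY_SIGNAL.connect(lambda signal, **event_data: received.append(event_data))

event_data = {"thing_id": 7}
MY_SIGNAL.send_event(**event_data)
```

After the call, received holds the same dictionary that was passed in, which is the sense in which the keyword arguments are the event data.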
Relevant repositories
openedx/openedx-events:
Used as a library
Defines an Open edX-specific implementation of a Django signal (OpenEdxPublicSignal)
Defines all instances of OpenEdxPublicSignal that are used in the platform (aspirationally)
[TODO] Put in a link to studio code that creates its own signal as an example of what we want to change
Contains utilities for converting between event data dictionaries and Avro records
openedx/event-bus-kafka (in development):
Used as a plugin
A specific implementation of an event bus
Must be configured to point to a specific Kafka cluster
Receives event data from a signal sent by the main service, converts it into a message, and sends the message to the configured Kafka cluster under the correct topic
Inside a worker, can run a management command that continuously polls the configured Kafka cluster for new messages, then converts those messages into event data and emits them via the correct signal to the main service
How exactly it will know which messages to put under which topics is TBD
Note: some of this code is currently in edx/edx-arch-experiments but will be moved over shortly
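The conversion between event data dictionaries and serialized records mentioned above can be pictured as a round trip. The sketch below is only an illustration under loud assumptions: it uses JSON from the standard library as a stand-in for Avro, and SketchEnrollmentData, its fields, and both helper functions are hypothetical names rather than anything defined in openedx-events.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SketchEnrollmentData:
    """Hypothetical, trimmed-down event data class; field names are made up."""
    username: str
    course_key: str
    is_active: bool


def event_data_to_record(event_data):
    """Turn an event data dictionary into bytes (JSON here, Avro in the real library)."""
    plain = {key: asdict(value) for key, value in event_data.items()}
    return json.dumps(plain, sort_keys=True).encode("utf-8")


def record_to_event_data(record, init_data):
    """Rebuild the event data dictionary, using init_data to pick each value's class."""
    plain = json.loads(record.decode("utf-8"))
    return {key: init_data[key](**fields) for key, fields in plain.items()}


init_data = {"enrollment": SketchEnrollmentData}
event_data = {
    "enrollment": SketchEnrollmentData("alice", "course-v1:Org+Course+Run", True)
}

record = event_data_to_record(event_data)
round_tripped = record_to_event_data(record, init_data)
```

The point of the sketch is that the signal's init_data carries enough type information for the receiving side to rebuild the same event data that the producing side sent.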
Producing
Abstraction: Service A has the event-bus plugin. The event bus plugin is an abstraction that receives event data from the service and sends it via an API call to a message broker somewhere.
Implementation: Services in the Open edX ecosystem use the openedx-events library to create OpenEdxPublicSignal instances that can emit event data to registered receivers
In some instance of Service A, Something Happens
2U/OCM Implementation: Most of our services are run [somewhere], a few are run via Kubernetes
Code executes in Service A to create event data, something like
event_data = CourseEnrollmentData(**bunch_of_data)
Code executes in Service A to emit this event data to all registered receivers by using the correct signal, e.g.
COURSE_ENROLLMENT_CREATED.send_event(enrollment=event_data)
All registered receivers within the Service A process are called
This enables decoupling rather than asynchronous execution of code, i.e., the receivers are registered callbacks
Abstraction: The event bus plugin notices that the COURSE_ENROLLMENT_CREATED signal has fired, takes the event data the signal sent, and sends it as a message to a message broker (again, via API call)
Implementation: event-bus-kafka is (or rather, will be) a concrete implementation of this plugin abstraction, where a signal receiver takes the emitted event data, serializes it into an Avro record, and sends that record to a machine (or machines) running Kafka somewhere.
Note: there are other serialization formats besides Avro, but for now we are hardcoding Avro as the serialization format for anyone using the Kafka event bus
2U/OCM implementation: Our Kafka cluster is externally managed by Confluent. We will point the plugin at it via configuration variables, which vary across Open edX installations.
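The producing flow above can be sketched end to end in plain Python. This is a toy model under stated assumptions, not event-bus-kafka itself: InMemoryBroker stands in for the Kafka cluster, SketchSignal stands in for an OpenEdxPublicSignal instance, JSON stands in for Avro, and the event type string, topic name, and make_bus_receiver helper are all made up for illustration.

```python
import json
from collections import defaultdict


class InMemoryBroker:
    """Hypothetical stand-in for a message broker such as a Kafka cluster."""

    def __init__(self):
        self.topics = defaultdict(list)

    def produce(self, topic, message):
        self.topics[topic].append(message)


class SketchSignal:
    """Hypothetical stand-in for an OpenEdxPublicSignal instance."""

    def __init__(self, event_type):
        self.event_type = event_type
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send_event(self, **event_data):
        for receiver in self.receivers:
            receiver(signal=self, **event_data)


def make_bus_receiver(broker, topic):
    """Build a signal receiver that forwards event data to the broker."""
    def send_to_event_bus(signal, **event_data):
        # JSON is a stand-in for Avro serialization in this sketch.
        payload = {"type": signal.event_type, "data": event_data}
        broker.produce(topic, json.dumps(payload).encode("utf-8"))
    return send_to_event_bus


broker = InMemoryBroker()
COURSE_ENROLLMENT_CREATED = SketchSignal("course.enrollment.created")  # placeholder type
COURSE_ENROLLMENT_CREATED.connect(make_bus_receiver(broker, "course-enrollment"))

# In Service A, Something Happens, so the signal is sent.
COURSE_ENROLLMENT_CREATED.send_event(enrollment={"username": "alice", "is_active": True})
```

Service A's business code only knows about the signal; the receiver installed by the plugin is the only piece that knows a broker exists, which is the decoupling the abstraction is after.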
Consuming
A worker that runs the Service B code is running somewhere
2U/OCM Implementation: We will use Kubernetes to configure and run these workers
Service B has the event-bus plugin and openedx-events libraries installed
The worker is set up to run a management command from the event-bus plugin. The command itself is an infinite loop whose job is to perpetually ask the message broker for any new messages via API call, then transform those messages into event data that can be emitted by the appropriate signal
Implementation: event-bus-kafka asks the Kafka service for a new record and deserializes it from its Avro format into event data that can be emitted by the relevant signal
2U/OCM implementation: We specifically ask the same Confluent-managed cluster for this information
Service B has receivers that listen to the relevant signal and run whatever business logic is appropriate using the information from the emitted event
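The consuming flow above can be sketched the same way. Again, this is a toy model under stated assumptions rather than the real management command: InMemoryBroker and SketchSignal are hypothetical stand-ins, JSON replaces Avro, the event type string is a placeholder, and the loop is capped by a max_polls parameter (the command described above would loop forever).

```python
import json
from collections import deque


class InMemoryBroker:
    """Hypothetical stand-in for the message broker the worker polls."""

    def __init__(self, messages):
        self._queue = deque(messages)

    def poll(self):
        """Return the next message, or None when nothing new is available."""
        return self._queue.popleft() if self._queue else None


class SketchSignal:
    """Hypothetical stand-in for an OpenEdxPublicSignal instance."""

    def __init__(self, event_type):
        self.event_type = event_type
        self.receivers = []

    def connect(self, receiver):
        self.receivers.append(receiver)

    def send_event(self, **event_data):
        for receiver in self.receivers:
            receiver(signal=self, **event_data)


def consume(broker, signals_by_type, max_polls):
    """Poll the broker and re-emit each message through the matching signal."""
    for _ in range(max_polls):
        message = broker.poll()
        if message is None:
            continue  # a real consumer would wait before polling again
        payload = json.loads(message.decode("utf-8"))  # JSON stands in for Avro
        signals_by_type[payload["type"]].send_event(**payload["data"])


COURSE_ENROLLMENT_CREATED = SketchSignal("course.enrollment.created")  # placeholder type
handled = []


def on_enrollment_created(signal, **event_data):
    # Service B's business logic would go here.
    handled.append(event_data["enrollment"]["username"])


COURSE_ENROLLMENT_CREATED.connect(on_enrollment_created)

message = json.dumps(
    {"type": "course.enrollment.created", "data": {"enrollment": {"username": "alice"}}}
).encode("utf-8")
signals = {COURSE_ENROLLMENT_CREATED.event_type: COURSE_ENROLLMENT_CREATED}
consume(InMemoryBroker([message]), signals, max_polls=3)
```

From the point of view of Service B's receiver, the event looks the same as one sent by a signal inside its own process; only the worker loop knows it arrived through a broker.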