Building Event-Driven Microservices Ch 3-4, 2021-09-29

Discussions of the book https://learning.oreilly.com/library/view/building-event-driven-microservices/9781492057888/

Chapters 3-4 Outline

Chapter 3: Communication and Data Contracts

Event-Driven Data Contracts
- Intro
  - data contract = data definition + triggering logic
- Using Explicit Schemas as Contracts
- Schema Definition Comments
- Full-Featured Schema Evolution
- Code Generator Support
- Breaking Schema Changes
Selecting an Event Format
Designing Events
- Tell the Truth, the Whole Truth, and Nothing but the Truth
- Use a Singular Event Definition per Stream
- Use the Narrowest Data Types
- Keep Events Single-Purpose
- Minimize the Size of Events
- Involve Prospective Consumers in the Event Design
- Avoid Events as Semaphores or Signals
Summary

Chapter 4: Integrating Event-Driven Architectures with Existing Systems

What Is Data Liberation?
- Compromises for Data Liberation
- Converting Liberated Data to Events
Data Liberation Patterns
Data Liberation Frameworks
Liberating Data by Query
- Bulk Loading
- Incremental Timestamp Loading
- Autoincrementing ID Loading
- Custom Querying
- Incremental Updating
- Benefits of Query-Based Updating
- Drawbacks of Query-Based Updating
Liberating Data Using Change-Data Capture Logs
- Benefits of Using Data Store Logs
- Drawbacks of Using Data Base Logs
Liberating Data Using Outbox Tables
- Performance Considerations
- Isolating Internal Data Models
- Ensuring Schema Compatibility
- Capturing Change-Data Using Triggers
Making Data Definition Changes to Data Sets Under Capture
- Handling After-the-Fact Data Definition Changes for the Query and CDC Log Patterns
- Handling Data Definition Changes for Change-Data Table Capture Patterns
Sinking Event Data to Data Stores
The Impacts of Sinking and Sourcing on a Business
Summary

Discussion Notes

Author strongly recommends schema management. How do we feel about schema management for events?
- An event is always going to have a schema; it’s a matter of how much you manage it.
- Formal schema management is a useful tool for doing this thoughtfully at scale.
- Where does schema management live?
  - A schema registry holds schemas and can be used to evaluate whether a new schema is compatible with the existing ones.
  - Compatibility mode
    - Start with full and go from there.
    - Hard to imagine not having full compatibility if we want to have a large number of consumers.
      - Counterpoint: We could deprecate old versions of schema and communicate deadlines between producer team and consumer teams.
        This suggests another line of communication between various teams.
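To make the compatibility modes above concrete, here is a minimal sketch of the kind of check a schema registry performs. It assumes a deliberately simplified schema model (a dict of field names to a type and an optional default) and ignores type promotion, nesting, and aliases. The function names and the example enrollment schemas are hypothetical; this is not the API of any real registry.

```python
from typing import Any, Dict

# Simplified, hypothetical schema model: field name -> {"type": str, optional "default": value}
Schema = Dict[str, Dict[str, Any]]


def can_read(reader: Schema, writer: Schema) -> bool:
    """True if a consumer using `reader` can decode data produced with `writer`."""
    for name, spec in reader.items():
        if name in writer:
            # Shared fields must keep the same type (no type promotion in this sketch).
            if writer[name]["type"] != spec["type"]:
                return False
        elif "default" not in spec:
            # The reader expects a field the writer never sent and has no fallback.
            return False
    return True


def is_backward_compatible(new: Schema, old: Schema) -> bool:
    """Consumers on the new schema can still read old events."""
    return can_read(reader=new, writer=old)


def is_forward_compatible(new: Schema, old: Schema) -> bool:
    """Consumers still on the old schema can read new events."""
    return can_read(reader=old, writer=new)


def is_fully_compatible(new: Schema, old: Schema) -> bool:
    return is_backward_compatible(new, old) and is_forward_compatible(new, old)


# Hypothetical enrollment event schemas.
v1: Schema = {"user_id": {"type": "string"}, "course_id": {"type": "string"}}
v2: Schema = {**v1, "mode": {"type": "string", "default": "audit"}}  # new field with default
v3: Schema = {**v1, "mode": {"type": "string"}}                      # new field, no default
```

In this sketch, `v2` is fully compatible with `v1` because the added field carries a default, while `v3` is only forward compatible: consumers on `v3` could not read old events that lack `mode`. Starting with full compatibility, as suggested above, means every change must pass both checks.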
Design Section
- Many of the principles seemed to conflict.
  - One event definition per stream.
    - How do we handle CRUD? A separate stream for each action seems very heavy, and we’d have to worry about ordering at the consumer.
  - The idea is to not overload your entity topics.
    - Let a thousand streams bloom but each stream should be one entity.
  - Where ordering matters, we may want to push away from one event per stream, to be able to reason about when events happened.
  - There’s pure click stream, entity streams, and there are things in between.
    - Eg. An action I need to take when an enrollment occurs.
    - Eg. User used to pass and used to fail.
    - Some things may need to be referenced by ID so event sizes don’t blow up. The referenced IDs would be the keys of the corresponding entity events.
- When mapping from tables to entities, how do you deal with ids and foreign keys?
  - Might have to be critical of what the domain concept is that you want to convey. This might be at odds with how it’s laid out in a relational database.
  - Be mindful of entities growing too big.
  - But also be careful about pushing a bunch of IDs into a message and then seeing a bunch of callbacks to fetch the data for those IDs.
The idea of redundancy is not really built into the messaging systems.
- Keeping track of what the context was of a change.
- Eg. Indicate what the enrollment mode was before the change and what it is after this event.
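One way to carry the context of a change is to put both the previous and the new value in the event itself. The sketch below assumes a hypothetical `EnrollmentModeChanged` event; the field names and the `"audit"` / `"verified"` mode values are illustrative assumptions, not taken from the book.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class EnrollmentModeChanged:
    """Hypothetical event that records the state before and after the change."""
    user_id: str
    course_id: str
    previous_mode: Optional[str]  # None when the enrollment was just created
    new_mode: str
    occurred_at: str              # ISO-8601 timestamp in UTC


def is_upgrade_to_verified(event: EnrollmentModeChanged) -> bool:
    # The consumer can decide from the event alone, without looking up prior state.
    return event.new_mode == "verified" and event.previous_mode != "verified"


event = EnrollmentModeChanged(
    user_id="u1",
    course_id="c1",
    previous_mode="audit",
    new_mode="verified",
    occurred_at=datetime.now(timezone.utc).isoformat(),
)
```

Because the previous mode travels with the event, a consumer does not need to keep its own copy of enrollment state or call back to the producer to learn what changed.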
Do we want to keep entity events and subset of entity events?
- Sometimes you might not care about the underlying event but a meta concept on top of it.
  - Grade change vs pass a course.
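The "grade change vs pass a course" distinction can be sketched as a small derivation step that reads the lower-level stream and emits the meta concept. Everything here is an assumption for illustration: the event shapes, the `PASSING_PERCENT` threshold, and the rule that a pass is a grade crossing that threshold.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class GradeChanged:
    """Hypothetical underlying event: every grade update for a learner in a course."""
    user_id: str
    course_id: str
    previous_percent: float
    new_percent: float


@dataclass(frozen=True)
class CoursePassed:
    """Hypothetical derived event: the meta concept many consumers actually care about."""
    user_id: str
    course_id: str


PASSING_PERCENT = 0.5  # hypothetical passing threshold


def derive_course_passed(events: Iterable[GradeChanged]) -> Iterator[CoursePassed]:
    for e in events:
        # Emit only when the grade crosses the threshold from below,
        # so later grade changes above the threshold do not repeat the event.
        if e.previous_percent < PASSING_PERCENT <= e.new_percent:
            yield CoursePassed(e.user_id, e.course_id)


grade_changes: List[GradeChanged] = [
    GradeChanged("u1", "c1", 0.2, 0.4),
    GradeChanged("u1", "c1", 0.4, 0.6),
    GradeChanged("u1", "c1", 0.6, 0.8),
]
passed = list(derive_course_passed(grade_changes))
```

Publishing the derived stream alongside the entity stream lets consumers that only care about passing subscribe to it directly, at the cost of one more stream and schema to maintain.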
Author is pushing Kafka
- Expectation is that it’s treated more akin to the SQL database behind an app.
- Commit to the data store as a core data store that can make high reliability guarantees.
- We don’t say what to do if the database falls out of sync.
Data Liberation
- Saying “I’m going to entity stream everything” amounts to saying “I don’t like encapsulation.”
- By exposing all your internal RDBMS schema, change management can be very complex and schema management becomes more difficult.
- “Things happen and you need to react to them” is a core part of the business. The book provides many strategies, but it’s up to you to build good events that give you enough context to take the correct business actions.