Note

During this discussion, we decided this book is not relevant enough to our current use cases. Unless that discussion interests you, you may wish to skip this.

Chapter 6 Outline

Chapter 6: Deterministic Stream Processing

  • Determinism with Event-Driven Workflows

  • Timestamps

    • Synchronizing Distributed Timestamps

    • Processing with Timestamped Events

  • Event Scheduling and Deterministic Processing

    • Custom Event Schedulers

    • Processing Based on Event Time, Processing Time, and Ingestion Time

    • Timestamp Extraction by the Consumer

    • Request-Response Calls to External Systems

  • Watermarks

    • Watermarks in Parallel Processing

  • Stream Time

    • Stream Time in Parallel Processing

  • Out-of-Order and Late-Arriving Events

    • Late Events with Watermarks and Stream Time

    • Causes and Impacts of Out-of-Order Events

    • Time-Sensitive Functions and Windowing

      • Tumbling windows

      • Sliding windows

      • Session windows

  • Handling Late Events

  • Reprocessing Versus Processing in Near-Real Time

  • Intermittent Failures and Late Events

  • Producer/Event Broker Connectivity Issues

  • Summary and Further Reading

Discussion Notes

...

  • If avoiding logic errors requires every stream to always carry correct, NTP-backed timestamps, it seems this will never work satisfactorily in production over the long term.

  • The book is helpful in some places, terrifying in others; how can we flag which ones are which?

  • FYI: Arch-BOM has pivoted its prototype from a credentials use case to license-manager events.

    • This should be a very simple example that lets us just cover the basics first.

  • We do need to account for the fact that there will be latency.

    • How much is acceptable?

  • Is this the wrong kind of book for what we need right now? Maybe it enumerates the things we need to consider rather than offering advice on balancing them in practice?

    • No, even that would be more useful. It reads more like “here’s a very complex solution, impractical for most use cases, that theoretically offers the best of all worlds but probably won’t in practice.”

    • It isn’t clear about use cases; it more or less assumes that all implementations will want to make the same trade-offs.

    • At least a few of us are struggling to keep up with the chapters, and those who are keeping up aren’t sure how much they’re getting out of it.

    • Maybe something like https://www.confluent.io/dummies/ would be more useful?  Unsure.

    • The book seems to make the opposite choice for almost every item in the edX Architecture Manifesto.

  • Links on architectural goals related to the event bus: