Chapter 5 Outline

Chapter 5: Event-Driven Processing Basics

  • Composing Stateless Topologies

    • Transformations

    • Branching and Merging Streams

  • Repartitioning Event Streams

    • Example: Repartitioning an Event Stream

  • Copartitioning Event Streams

    • Example: Copartitioning an Event Stream

  • Assigning Partitions to a Consumer Instance

    • Assigning Partitions with the Partition Assignor

    • Assigning Copartitioned Partitions

    • Partition Assignment Strategies

      • Round-robin assignment

      • Static assignment

      • Custom assignment

  • Recovering from Stateless Processing Instance Failures

  • Summary

Discussion Notes

  • Notes will be moved here from the Google Doc after the meeting.

  • The book earlier suggested that the streams themselves are immutable, but partitioning and repartitioning let you change how things get read (see the sketch below).

  • Partitioning is really big in Kafka, and the book essentially says “you can use anything as long as it has these features of Kafka,” so you see a lot of bias toward Kafka.

  • In Kafka, how we partition data on disk and how we read data are coupled. In Pulsar this is less the case: Pulsar internally handles the storage partitioning and repartitioning.
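
  • Sketch: repartitioning an event stream. A minimal illustration of “repartitioning changes how things get read,” assuming the confluent-kafka Python client; the topic names (grade-changes, grade-changes-by-user), the JSON event shape, and the user_id key field are all hypothetical. The events are copied unchanged to a new topic under a different key, so downstream consumers see them partitioned by user instead of by the original key.

        import json

        from confluent_kafka import Consumer, Producer

        consumer = Consumer({
            "bootstrap.servers": "localhost:9092",
            "group.id": "grade-changes-repartitioner",
            "auto.offset.reset": "earliest",
        })
        producer = Producer({"bootstrap.servers": "localhost:9092"})
        consumer.subscribe(["grade-changes"])  # hypothetical source topic

        try:
            while True:
                msg = consumer.poll(1.0)
                if msg is None or msg.error():
                    continue
                event = json.loads(msg.value())
                # Re-key the unchanged event by user_id and write it to a new
                # topic, so all events for one user land in the same partition.
                producer.produce(
                    "grade-changes-by-user",  # hypothetical repartitioned topic
                    key=str(event["user_id"]),
                    value=msg.value(),
                )
                producer.poll(0)  # serve producer delivery callbacks
        finally:
            producer.flush()
            consumer.close()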

  • When would you need to create a new stream of events using filtering?

    • Credentials currently might need to consume all of the grade changes even though it only needs some of them (see the filtering sketch below).

    • At the level of data we have, we probably don’t need to worry about advanced filtering.
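
    • Sketch: filtering a stream into a new topic. This has the same consume/transform/produce shape as the repartitioning sketch above, again assuming the confluent-kafka Python client; the topic names and the “passed” flag standing in for “the grade changes Credentials actually cares about” are hypothetical.

        import json

        from confluent_kafka import Consumer, Producer

        consumer = Consumer({
            "bootstrap.servers": "localhost:9092",
            "group.id": "credentials-grade-filter",
            "auto.offset.reset": "earliest",
        })
        producer = Producer({"bootstrap.servers": "localhost:9092"})
        consumer.subscribe(["grade-changes"])  # hypothetical source topic

        try:
            while True:
                msg = consumer.poll(1.0)
                if msg is None or msg.error():
                    continue
                event = json.loads(msg.value())
                # Forward only the events the downstream service cares about,
                # unchanged and under the same key.
                if not event.get("passed"):
                    continue
                producer.produce("grade-changes-passing", key=msg.key(), value=msg.value())
                producer.poll(0)
        finally:
            producer.flush()
            consumer.close()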

  • Do we want performance over the conveniences provided by more abstract tools like Redis Streams or Pulsar?

    • Kafka pushes a lot of performance concerns to the end user, which is great for performance but requires us to build much more of the partitioning and consumer-management logic ourselves.

  • Consumer Implementation Thoughts

    • How do we handle a microservice that cares about multiple topics? How do you read from multiple topics in a safe way? What does the pseudocode look like?

      • You need separate loops, one for each topic you’re talking about (see the sketch at the end of these notes).

        • E.g. separate threads or processes for each topic you care about.

        • Each microservice has a group of processes for each topic if it wants to scale.

      • Counterpoint

        • If each consumer is only listening to one stream, we may need consumers to provide more complex data.

      • Where split streams are going to be tricky is when we’re reacting to out-of-sync events.

        • In order to enroll, you need to be a certain kind of user.

        • User A turns into that kind of user.

        • Then the user is enrolled in the course.

        • If the two events are on two different topics, the destination service might not process them in the right order.

      • For multiple streams, is this where we would use stream transactions?
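
      • Sketch: one consumer loop per topic. A minimal illustration of the “separate loops, one per topic” answer, assuming the confluent-kafka Python client; the topic names (user-role-changes, enrollments), group ids, and handler functions are hypothetical. Note that this pattern by itself does nothing about ordering across topics, which is exactly the out-of-sync problem raised above.

        import json
        import threading

        from confluent_kafka import Consumer

        def consume_topic(topic, group_id, handle):
            # One dedicated consumer and polling loop per topic.
            consumer = Consumer({
                "bootstrap.servers": "localhost:9092",
                "group.id": group_id,
                "auto.offset.reset": "earliest",
            })
            consumer.subscribe([topic])
            try:
                while True:
                    msg = consumer.poll(1.0)
                    if msg is None or msg.error():
                        continue
                    handle(json.loads(msg.value()))
            finally:
                consumer.close()

        def handle_user_role_change(event):
            print("user role change:", event)

        def handle_enrollment(event):
            print("enrollment:", event)

        # One thread per topic inside this service; each topic's loop could also
        # run as its own group of processes if it needs to scale independently.
        threads = [
            threading.Thread(target=consume_topic,
                             args=("user-role-changes", "my-service.user-role-changes", handle_user_role_change)),
            threading.Thread(target=consume_topic,
                             args=("enrollments", "my-service.enrollments", handle_enrollment)),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()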