Building Event-Driven Microservices Ch 5, 2021-11-10

Chapter 5 Outline

Chapter 5: Event-Driven Processing Basics

  • Composing Stateless Topologies

    • Transformations

    • Branching and Merging Streams

  • Repartitioning Event Streams

    • Example: Repartitioning an Event Stream

  • Copartitioning Event Streams

    • Example: Copartitioning an Event Stream

  • Assigning Partitions to a Consumer Instance

    • Assigning Partitions with the Partition Assignor

    • Assigning Copartitioned Partitions

    • Partition Assignment Strategies

      • Round-robin assignment

      • Static assignment

      • Custom assignment

  • Recovering from Stateless Processing Instance Failures

  • Summary

Discussion Notes

  • The book earlier suggested that streams are immutable, but partitioning and repartitioning let you change how events get read.

  • Partitioning is really big in Kafka, and the book essentially says “You can use anything as long as it has these features of Kafka.” So you see a lot of bias toward Kafka.

  • In Kafka, partitioning couples how we put data on disk with how we read it. In Pulsar this is less the case: Pulsar handles storage partitioning and repartitioning internally.
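
To make the partitioning point concrete, here is a minimal sketch of key-based partitioning, assuming a partition is chosen by hashing the event key modulo the partition count. The function names, event fields, and the use of crc32 are illustrative assumptions (Kafka's own default partitioner uses a different hash, murmur2). The events themselves never change; repartitioning by a different key only changes which events sit together and are therefore read together.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (crc32 is a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_stream(events, key_field, num_partitions):
    """Group events into partitions according to the chosen key field."""
    partitions = defaultdict(list)
    for event in events:
        index = partition_for(str(event[key_field]), num_partitions)
        partitions[index].append(event)
    return dict(partitions)


# Hypothetical grade events, used only for illustration.
events = [
    {"user_id": "u1", "course_id": "c9", "grade": 91},
    {"user_id": "u2", "course_id": "c9", "grade": 78},
    {"user_id": "u1", "course_id": "c3", "grade": 85},
]

# Same immutable events, two different layouts.
by_user = partition_stream(events, "user_id", num_partitions=4)
by_course = partition_stream(events, "course_id", num_partitions=4)
```

With `by_user`, every event for a given user lands in one partition; with `by_course`, every event for a given course does.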

  • When would you need to create a new stream of events using filtering?

    • The Credentials service might currently need to consume all grade changes even though it needs only some of them.

    • At the level of data we have, we probably don’t need to worry about advanced filtering.
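
As a rough illustration of the filtering question, the sketch below derives a new stream from an existing one with a predicate. The event shape, the `final` flag, and the passing threshold of 70 are all hypothetical; they only stand in for "the grade changes Credentials actually needs."

```python
from typing import Callable, Iterable, Iterator

Event = dict


def filter_stream(
    source: Iterable[Event], predicate: Callable[[Event], bool]
) -> Iterator[Event]:
    """Yield only the events that satisfy the predicate (stateless)."""
    for event in source:
        if predicate(event):
            yield event


def is_final_passing_grade(event: Event) -> bool:
    """Hypothetical rule: only final grades of 70 or more matter downstream."""
    return event["final"] and event["grade"] >= 70


# Hypothetical grade-change events.
grade_changes = [
    {"user_id": "u1", "grade": 65, "final": False},
    {"user_id": "u1", "grade": 88, "final": True},
    {"user_id": "u2", "grade": 52, "final": True},
]

credential_relevant = list(filter_stream(grade_changes, is_final_passing_grade))
```

Because the filter holds no state, it could run either inside the consumer or as its own small service that writes a narrower stream.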

  • Do we want performance over the conveniences provided by more abstract tools like Redis Streams or Pulsar?

    • Kafka pushes a lot of performance concerns to the end user, which is great for performance but requires us to build much more of the partitioning and consumer-management logic ourselves.

  • Consumer Implementation Thoughts

    • How do we handle a microservice that cares about multiple topics? How do you read from multiple topics in a safe way? What does the pseudocode look like?

      • You need separate loops, one for each topic you’re talking about.

        • E.g., separate threads or processes for each topic you care about.

        • Each microservice has a group of processes for each topic if it wants to scale.

      • Counterpoint

        • If each consumer is only listening to one stream, we may need consumers to provide more complex data.

      • Split streams get tricky when we’re reacting to out-of-sync events.

        • In order to enroll you need to be a certain kind of user.

        • User A becomes that kind of user.

        • User A is then enrolled in the course.

        • If the two events are on two topics

        • Destination service might not process these in the right order.

      • For multiple streams, is this where stream transactions come in?
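
One possible shape for the "separate loops" idea, combined with the enrollment example, is sketched below. It assumes in-memory queues as stand-ins for topics, one thread per topic, and a hypothetical `EnrollmentService` that buffers an enrollment until the matching user event arrives. This is not a stream transaction, just one way to tolerate events arriving in either order.

```python
import queue
import threading

STOP = object()  # sentinel that tells a consumer loop to exit


class EnrollmentService:
    """Hypothetical destination service fed by two topics."""

    def __init__(self):
        self._lock = threading.Lock()  # both loops share this state
        self.eligible_users = set()
        self.pending = {}      # user_id -> course_ids waiting on eligibility
        self.enrollments = []  # accepted (user_id, course_id) pairs

    def on_user_event(self, event):
        """A user became eligible; release any enrollments that were waiting."""
        with self._lock:
            user_id = event["user_id"]
            self.eligible_users.add(user_id)
            for course_id in self.pending.pop(user_id, []):
                self.enrollments.append((user_id, course_id))

    def on_enrollment_event(self, event):
        """Accept the enrollment now, or park it until the user is eligible."""
        with self._lock:
            user_id, course_id = event["user_id"], event["course_id"]
            if user_id in self.eligible_users:
                self.enrollments.append((user_id, course_id))
            else:
                self.pending.setdefault(user_id, []).append(course_id)


def consume(topic, handler):
    """One loop per topic: block on the queue, hand each event to the handler."""
    while True:
        event = topic.get()
        if event is STOP:
            break
        handler(event)


def run(user_events, enrollment_events):
    service = EnrollmentService()
    user_topic, enrollment_topic = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=consume, args=(user_topic, service.on_user_event)),
        threading.Thread(
            target=consume, args=(enrollment_topic, service.on_enrollment_event)
        ),
    ]
    for thread in threads:
        thread.start()
    # Publish the enrollment first to mimic the out-of-order case.
    for event in enrollment_events:
        enrollment_topic.put(event)
    for event in user_events:
        user_topic.put(event)
    user_topic.put(STOP)
    enrollment_topic.put(STOP)
    for thread in threads:
        thread.join()
    return service


service = run(
    user_events=[{"user_id": "a"}],
    enrollment_events=[{"user_id": "a", "course_id": "c1"}],
)
```

Whichever loop runs first, the service ends up with the same single enrollment and nothing left pending. The cost is that the buffer is state the service must now manage, and an enrollment whose user event never arrives waits forever unless a timeout is added.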