Notes on Building Microservices: Designing Fine-Grained Systems by Sam Newman
Ch 3. Bounded Contexts
- Seams with Loose coupling and High cohesion
- Beware of premature decomposition before domains and usages are solidified.
Ch 4. Integration
- Avoid database integration at all costs
- DRY and code-reuse can couple microservices
- Sync vs Async
- request/response vs event-based vs reactive (observe for results)
- request/response can be sync or async - async by registering a callback
- "orchestration" pattern leads to centralized authorities with anemic CRUD-based services
- these systems are more brittle with a higher cost of change
- technology choices
- RPC
- easy to use
- watch out for: technology coupling, incorrectly treating remote calls as local calls, lock-step releases
- REST over HTTP
- more resilient to changes than RPC - sensible default choice
- not suited for low latency and small message size (consider WebSockets)
- event-based - better decoupling; intelligence is distributed
- "choreographed" pattern leads to implicit view of the business process (since no centralized workflow)
- additional work to monitor/track - can create a monitoring system that matches the view of the business process - validates flowchart expectations
- these systems are more loosely coupled, flexible, and amenable to change
- technology choices
- RabbitMQ
- HTTP + ATOM
- managing complexities
- maximum retry limits
- dead letter queue (for failed messages) - with UI to view and retry
- good monitoring
- correlation IDs to trace requests
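The "managing complexities" items above (maximum retry limits, a dead letter queue, correlation IDs) can be sketched together. This is a minimal in-memory illustration, not how RabbitMQ or any other broker implements them; `Message`, `process_all`, `flaky_handler`, and `MAX_RETRIES` are hypothetical names chosen for the example.

```python
import uuid
from collections import deque
from dataclasses import dataclass, field

MAX_RETRIES = 3  # assumed limit; a real system would tune this per queue


@dataclass
class Message:
    payload: dict
    # The correlation ID travels with the message so log lines emitted by
    # different services can be joined into a single trace.
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    attempts: int = 0


def process_all(queue, handler, dead_letters, max_retries=MAX_RETRIES):
    """Drain the queue, retrying failures up to max_retries, then dead-letter."""
    while queue:
        msg = queue.popleft()
        msg.attempts += 1
        try:
            handler(msg)
        except Exception as exc:
            if msg.attempts >= max_retries:
                # Park the message (with the failure reason) for later
                # inspection and manual retry instead of looping forever.
                dead_letters.append((msg, repr(exc)))
            else:
                queue.append(msg)


def flaky_handler(msg):
    # Hypothetical handler: rejects any payload flagged as poisonous.
    if msg.payload.get("poison"):
        raise ValueError("cannot process payload")


queue = deque([Message({"order": 1}), Message({"order": 2, "poison": True})])
dead_letters = []
process_all(queue, flaky_handler, dead_letters)
```

After the run, the healthy message has been handled once and the poisoned one sits in `dead_letters` with its attempt count and correlation ID, which is the information a viewing and retry UI would need.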
- Versioning
- Postel's Law, a.k.a. Robustness principle: "Be conservative in what you do, be liberal in what you accept from others."
- consumer-driven contracts to catch breaking changes early
- semantic versioning - self-documented impact
- expand and contract pattern when versioning breaking changes.
- co-existing versions needed for blue-green deployments and canary releases
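Two of the versioning ideas above can be shown in a few lines: a consumer that follows Postel's Law by reading only the fields it needs, and a check that treats a major-version bump as the semantic-versioning signal for a breaking change. This is a sketch under assumptions; the field names, payloads, and function names are invented for illustration.

```python
import json


def read_customer(raw: str) -> dict:
    """Tolerant reader: extract only the fields this consumer depends on."""
    doc = json.loads(raw)
    # Fields the consumer does not know about are ignored, so the producer
    # can add new ones without breaking this reader.
    return {"id": doc["id"], "email": doc["email"]}


def is_breaking_upgrade(current: str, candidate: str) -> bool:
    """Under semantic versioning, a higher MAJOR number signals a breaking change."""
    current_major = int(current.split(".")[0])
    candidate_major = int(candidate.split(".")[0])
    return candidate_major > current_major


# Hypothetical payloads: the second adds a field the consumer never asked for.
v1 = '{"id": 7, "email": "a@example.com"}'
v1_1 = '{"id": 7, "email": "a@example.com", "loyalty_tier": "gold"}'
```

Here `read_customer(v1)` and `read_customer(v1_1)` return the same dictionary, so the added field is a non-breaking (minor) change for this consumer.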
- UI as the compositional layer
- UI Fragment composition
- server-side rendering of coarse-grained fragments works well when aligned with team ownership
- problem: consistency of UX - mitigated with shared CSS/images/etc
- problem: doesn't support native applications / thick clients
- problem: doesn't work for cross-cutting features that don't fit into a widget/page
- Backends for Frontends
- aggregated backend layers, with dedicated backends serving UI/APIs for dedicated frontend experiences
- danger: keep business logic within the underlying services. Aggregated backends should contain only front-end specific behavior
- Hybrid of both approaches above
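The Backends for Frontends warning above (keep business logic in the underlying services) can be illustrated with a small sketch. The two service functions are stand-ins for remote calls to hypothetical customer and order services, and the response shapes are assumptions made for the example.

```python
def customer_service(customer_id):
    # Stand-in for a call to a hypothetical customer microservice.
    return {"id": customer_id, "name": "Ada", "credit_limit_cents": 50000}


def order_service(customer_id):
    # Stand-in for a call to a hypothetical order microservice.
    return [{"id": n, "total_cents": 1000 * n} for n in range(1, 6)]


def mobile_account_summary(customer_id,
                           get_customer=customer_service,
                           get_orders=order_service):
    """Backend for the mobile frontend: aggregates and trims, nothing more."""
    customer = get_customer(customer_id)
    orders = get_orders(customer_id)
    # Frontend-specific shaping only: the small screen shows a name and the
    # three most recent order IDs. Pricing or credit rules stay downstream.
    return {
        "name": customer["name"],
        "recent_order_ids": [order["id"] for order in orders[-3:]],
    }


summary = mobile_account_summary(42)
```

The backend decides what the mobile screen needs but makes no business decisions, so a second frontend could get its own backend without duplicating rules.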
- Third-party software
- Build or buy commercial off-the-shelf (COTS)?
- "Build if it is unique to what you do, and can be considered a strategic asset; buy if your use of the tool isn't that special" - Build if core to your business
- Problems with COTS
- lack of control
- customization - avoid complex customizations - rather, change your organization's functions
- integration spaghetti - with different protocols, etc
- On your own terms
- Hide COTS CMS behind your own web frontend, putting the COTS within your own service facade
- Use Strangler Application Pattern to capture and intercept calls to the old system
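A rough sketch of the Strangler Application Pattern mentioned above: a facade sits in front of the old system, intercepts every call, and routes the paths that have already been migrated to new services. All class names and path prefixes here are hypothetical.

```python
class LegacySystem:
    """Stand-in for the old COTS or monolithic system."""

    def handle(self, path):
        return f"legacy:{path}"


class NewCatalogService:
    """Stand-in for a service that has taken over one slice of behavior."""

    def handle(self, path):
        return f"new:{path}"


class StranglerFacade:
    """Intercepts all calls and decides which system should answer."""

    def __init__(self, legacy, migrated):
        self.legacy = legacy
        self.migrated = migrated  # maps path prefix -> replacement service

    def handle(self, path):
        for prefix, service in self.migrated.items():
            if path.startswith(prefix):
                return service.handle(path)
        # Anything not yet migrated still goes to the old system.
        return self.legacy.handle(path)


facade = StranglerFacade(LegacySystem(), {"/catalog": NewCatalogService()})
```

Migrating another slice then means adding one more entry to `migrated`; callers keep talking to the facade throughout.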
Ch 5. Splitting the Monolith
- Seams
- Identify seams that can become service boundaries (not for the purpose of cleaning up the codebase).
- Bounded contexts as seams, as they are cohesive and loosely coupled boundaries.
- Reasons to split
- Pace of change is faster with autonomous units.
- Team autonomy
- Replaceable with alternative implementation
- Tangled dependencies - use a dependency analysis tool to view the seams as a DAG to find the least depended on seam.
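As a sketch of the dependency analysis suggested above, the snippet below counts inbound dependencies in a small graph and reports the seams nothing else depends on. The package names and edges are invented; a real analysis tool would derive them from the codebase.

```python
from collections import Counter

# Hypothetical package-level dependencies: each key depends on its values.
dependencies = {
    "catalog": ["finance"],
    "warehouse": ["catalog", "finance"],
    "recommendation": ["catalog"],
    "finance": [],
}


def inbound_counts(deps):
    """Count how many other seams depend on each seam."""
    counts = Counter({name: 0 for name in deps})
    for targets in deps.values():
        for target in targets:
            counts[target] += 1
    return counts


def least_depended_on(deps):
    """Return the seams with the fewest inbound dependencies, sorted by name."""
    counts = inbound_counts(deps)
    lowest = min(counts.values())
    return sorted(name for name, n in counts.items() if n == lowest)


candidates = least_depended_on(dependencies)
```

With this sample graph the candidates are `recommendation` and `warehouse`, since no other seam depends on either of them.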
- Coupling at the database layer
- Examples of database refactoring to separate schemas
- Transactional boundaries - split across databases
- Design patterns for failures (success in one db, but failure in another)
- Eventual consistency - try again later
- Compensating transaction - abort entire operation
- more complex to recover
- need other compensating behavior to fix up inconsistencies
- Distributed transaction using a transaction manager
- 2-phase commit
- Locks on resources can lead to contention, inhibiting scaling
- Rather than requiring distributed transactions, actually create a higher-level concept that represents the transaction.
- Gives a natural place to focus logic around the end-to-end process and to handle exceptions
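One way to picture the "higher-level concept" together with compensating transactions is an object that tracks the end-to-end operation and undoes completed steps when a later one fails. This is a sketch, not the book's implementation; `InProcessOrder` and the step functions are hypothetical, and two plain Python objects stand in for the separate databases.

```python
class InProcessOrder:
    """Hypothetical higher-level concept tracking one end-to-end operation."""

    def __init__(self):
        self.state = "pending"
        self.completed = []  # (step name, compensation) in completion order

    def run(self, steps):
        for name, action, compensate in steps:
            try:
                action()
            except Exception:
                # A later step failed: undo what already succeeded.
                self._compensate()
                self.state = "aborted"
                return self.state
            self.completed.append((name, compensate))
        self.state = "completed"
        return self.state

    def _compensate(self):
        # Undo in reverse order of completion.
        while self.completed:
            _name, compensate = self.completed.pop()
            compensate()


orders_db = []  # stand-in for the orders database


def insert_order():
    orders_db.append("order-1")


def delete_order():
    orders_db.remove("order-1")


def insert_pick():
    # Simulated failure in the second (warehouse) database.
    raise RuntimeError("warehouse db unavailable")


def delete_pick():
    pass


order = InProcessOrder()
result = order.run([
    ("order", insert_order, delete_order),
    ("pick", insert_pick, delete_pick),
])
```

The sketch assumes compensations always succeed. As the notes point out, a compensation can fail too, which is why further clean-up behavior is needed in practice.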
- Reporting Systems
- Can use a read-replica to access the data - but couples database technology
- Pulling data through service APIs - doesn't scale to the large data volumes reporting needs
- Data pumps to push the data, rather than have reporting system pull the data
- service owners write their own pump so not coupled to service schemas
- reporting schema treated as a published API
- aggregated view of all service pumps within the reporting system
- need to deal with complexity of segmented schema
- Event data pump
- Reporting service just binds to the events emitted by upstream services
- Looser coupling and fresher data
- May not scale as well as data pumps though
- Backup data pump
- Variant of the data pump used by Netflix: Hadoop jobs run over Cassandra backup data stored in S3
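The event data pump described above can be sketched with a tiny in-process publish/subscribe bus: the reporting side binds to events emitted upstream and maintains its own reporting view. `EventBus`, `ReportingStore`, and the event shape are assumptions for illustration; a real system would use a message broker.

```python
from collections import defaultdict


class EventBus:
    """Minimal in-process stand-in for a message broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, event):
        for handler in self._subscribers[event_type]:
            handler(event)


class ReportingStore:
    """Reporting-side view built purely from upstream events."""

    def __init__(self):
        self.revenue_by_customer = defaultdict(int)

    def on_order_placed(self, event):
        # The reporting schema is updated per event, so data stays fresh
        # and the reporting side never reads the order service's database.
        self.revenue_by_customer[event["customer_id"]] += event["total_cents"]


bus = EventBus()
store = ReportingStore()
bus.subscribe("order_placed", store.on_order_placed)

# The upstream order service would publish these as orders are placed.
bus.publish("order_placed", {"customer_id": "c1", "total_cents": 1200})
bus.publish("order_placed", {"customer_id": "c1", "total_cents": 800})
```

The reporting side depends only on the event format, which matches the "looser coupling and fresher data" point in the notes.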
- Cost of change
- Make small, incremental changes to understand the impact of each alteration - mitigates cost of mistakes
- Small cost: Moving code around within a codebase
- Large cost: Splitting apart a database
- Whiteboard
- Make mistakes where the impact will be the lowest: on the whiteboard
- Go through use cases
- Class-responsibility-collaboration (CRC) cards - borrowed from OOP - each card lists the name of the class, its responsibilities, and its collaborators
Ch 6. Deployment
- Continuous Integration
- CI server detects that code has been committed, checks it out, verifies it, and runs the tests
- Versioned Artifacts are also created for further validation and usage in downstream deployments
- Confirms that the artifacts deployed are the ones tested
- Reused without continual recreation
- Traceability back to the commit
- 3 questions from Jez Humble on whether you're really doing it
- Do you check in to mainline once per day?
- Even if you are using short-lived branches, integrate frequently
- Do you have a suite of tests to validate your changes?
- When the build is broken, is it the #1 priority of the team to fix it?
- Repo
- Single repo and single CI build for all microservices
- requires lock-step releases
- OK for the early stage and a short period of time
- Cycle time impacted - speed of moving a single change to being live
- Ownership issues
- Single repo with separate CI builds mapping to different parts of the source tree
- Better than the above
- Can get into the habit of slipping changes that couple services together.
- Separate repo with separate CI builds for each microservice
- Faster development
- Clearer ownership
- More difficult to make changes across repos - can be easier via command-line scripts
- Continuous Delivery
- Treat each check-in as a release candidate, getting constant feedback on its production readiness
- Build pipeline
- One stage for faster tests and another stage for slower tests
- Feel more confident about the change as it goes through the pipeline
- Fast tests → Slow tests → User acceptance tests → Performance tests → Production
- One microservice per build
- Is the goal
- However, while service boundaries are still being defined, a single build for all services reduces the cost of cross-service changes
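The build pipeline from the Continuous Delivery notes above (fast tests first, slower stages later) can be sketched as an ordered list of stages that stops at the first failure. Stage names follow the notes; the check functions are placeholders, and one failure is simulated.

```python
def run_pipeline(stages):
    """Run stages in order and stop at the first failing stage.

    Returns (names of stages that passed, name of the failed stage or None).
    """
    passed = []
    for name, check in stages:
        if not check():
            return passed, name
        passed.append(name)
    return passed, None


# Placeholder checks; a real pipeline would invoke builds and test suites.
stages = [
    ("fast tests", lambda: True),
    ("slow tests", lambda: True),
    ("user acceptance tests", lambda: False),  # simulated failure
    ("performance tests", lambda: True),
    ("production", lambda: True),
]

passed, failed_at = run_pipeline(stages)
```

Putting the cheap stages first means a bad change is rejected quickly, and every stage a release candidate passes adds confidence before it reaches production.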