Ch 11. Microservices at Scale

  • At scale, the main concerns are failures (statistically likely) and performance
  • Easy to over-optimize unless you know the requirements for:
    • response time/latency
    • availability
    • durability of data
  • Graceful degradation
  • Architectural safety measures
    • Antifragile organization (term from Nassim Taleb's book Antifragile)
      • Netflix and Google intentionally cause failures to verify that their systems tolerate them (e.g., Netflix's Chaos Monkey)
    • Timeouts
      • too long slows down the whole system
      • too short fails calls that would have succeeded
      • pick default timeouts, log when they fire → monitor → adjust
    • Circuit Breakers
      • Fail fast after a certain number of failures
        • gracefully degrade or error
        • queue for later if async
      • Reset the breaker (resume sending requests) once the downstream service looks healthy again, e.g. after a cool-down period
    • Bulkheads
      • Lose one part of the ship but the rest remains intact
      • Separation of concerns via separate microservices
    • Timeouts and circuit breakers help free up resources when they become constrained
    • Bulkheads ensure they don't become constrained in the first place
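
A minimal Python sketch of the circuit breaker and bulkhead ideas above, not a production implementation. The class names, the `max_failures` and `reset_after` parameters, and the injectable `clock` are all assumptions made for illustration. Timeouts are assumed to be applied inside the wrapped function (for example via the HTTP client's own timeout setting), so a timed-out call simply shows up here as an exception. The breaker is deliberately not thread-safe to keep it short.

```python
import threading
import time


class CircuitOpenError(Exception):
    """The breaker is open: the call was rejected without being attempted."""


class BulkheadFullError(Exception):
    """All slots for this downstream are in use: the call was rejected."""


class CircuitBreaker:
    """Fails fast after `max_failures` consecutive failures (hypothetical API)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds before a trial call
        self.clock = clock              # injectable so tests need not sleep
        self.failures = 0
        self.opened_at = None           # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("failing fast; downstream assumed unhealthy")
            # Cool-down elapsed: fall through and let one trial call proceed.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success: close the breaker and forget earlier failures.
        self.failures = 0
        self.opened_at = None
        return result


class Bulkhead:
    """Caps concurrent calls to one downstream so it cannot use up shared workers."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Reject immediately instead of queueing behind a slow downstream.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("no free slot for this downstream")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()
```

A caller would typically catch `CircuitOpenError` or `BulkheadFullError` and either return a degraded response (cached or default data) or, for asynchronous work, queue the request for later. Giving each downstream service its own `Bulkhead` and `CircuitBreaker` instance keeps one slow dependency from consuming resources needed by the others.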

Ch 12. Final