Clean Architecture Book Club
Kickoff
Purchasing the book
Purchase from anywhere and submit an expense report for it
Reason for book-clubbing
What do we hope to get out of this?
Hoping to read the followup to Clean Code and see where it goes
Hoping to gain explicit knowledge about architectural things
Enjoy Uncle Bob
Hoping to gain an architectural intuition and vocabulary.
Hoping to gain some motivation to learn more about architecture
Likes book clubs
Curiosity, hoping to start from the ground up.
Scheduling
How often should we meet?
Once a week
What time should we meet?
Lunch works fine
Next meeting?
Two weeks to get books
Read first two chapters (i.e. Part I)
Conducting
How do we want to run the meetings?
Rotating note taker per meeting.
Reading amount
How much do we want to read per week
~40 pages / week. Will be about ~5ish average chapters
Chapter lengths aren't very even. Most are 6-8 pages, a few are 4, and a few are 16-20.
Retro
Week 1: Chapter 1 & 2
Week 2: Chapter 3 & 4
Week 3: Chapter 5 & 6 & 7
Week 4: Chapter 8 & 9 & 10 & 11
Week 5: Chapter 12 + additional resources (http://butunclebob.com/ArticleS.UncleBob.PrinciplesOfOod)
Week 6: Chapter 13
Week 7: Chapter 14
Week 8: Chapter 15 & 16
Week 9: Chapter 17 & 18
Week 10: Chapter 19 & 20 & 21
Week 11: Chapter 22 & 23
Week 12: Chapter 24 & 25 & 26
Week 13: Chapter 27 & 28 & 29
Week 14: Chapter 30 & 31 & 32
Book Notes:
Part I: Introduction
Chapter 1: What Is Design and Architecture?
Does this utopian place with great architecture exist?
edX in "ball of mud" state - monolith
Is lines of code a good measurement for productivity?
TDD example seemed contrived. Unclear what was remarkable about this.
Sometimes it's helpful to have some idea of how underlying implementation works.
Tests with legacy code can build confidence when refactoring
Chapter 2: A Tale of Two Values
Behavior & Structure
Eisenhower's Matrix
Hard to sell important / not urgent since it's unclear when it will become an issue.
Sometimes the fact that you have waited a long time is used as evidence that it's not important
At the all-hands, Adam mentioned that getting features out first can be important
A lot of hackathon projects are around tech debt. If it's important, then why is it only done in a hackathon?
Does the rest of business think of software engineers as stakeholders?
Interfacing with product is a continual struggle around advocating
Part II: Starting with the Bricks: Programming Paradigms
Chapter 3: Paradigm Overview
Didn't think of structured programming as its own paradigm
Interesting to think of paradigms as a subset of programming possibilities
Stack/Heap decision for Object Oriented was thought provoking
Three concerns of architecture: function, separation of components, data management
Chapter 4: Structured Programming
An okay overview and a lot of history
Testing is breaking down programs to prove correctness
Chapter 5: Object-Oriented Programming
Discusses dependency inversion
"Inversion" feels poorly named
Tied together with dependency injection
We don't do a lot of dependency inversion in edX today
Object oriented didn't give us new features
Gives a better way to do something, or limits the ways you can do it
Makes it easier to be disciplined
Breaks perfect encapsulation
Also makes it more convenient
Inheritance worked previously but it's better now
Sometimes can hide details behind the "magic"
Python enabled Dynamic Inheritance
Can change super classes at runtime
We use "mixins" or abstract classes as interfaces
Can be used as the "glue" that ties together functional components.
Since OO uses polymorphism, any dependency can be inverted.
Polymorphism allows for a plugin architecture.
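The inversion and plugin points above can be sketched in Python. This is a minimal illustration, not code from the book or from edX; the names `Notifier`, `EmailNotifier`, and `enroll` are hypothetical. The high-level policy owns the interface and the plugin implements it, so the source-code dependency points toward the policy.

```python
from typing import Protocol


class Notifier(Protocol):
    """Interface owned by the high-level policy (hypothetical name)."""

    def send(self, message: str) -> None: ...


def enroll(user: str, course: str, notifier: Notifier) -> str:
    """High-level policy: knows only the Notifier interface, never a concrete class."""
    message = f"{user} enrolled in {course}"
    notifier.send(message)
    return message


class EmailNotifier:
    """Low-level detail: a plugin that satisfies Notifier structurally."""

    def __init__(self) -> None:
        self.outbox = []

    def send(self, message: str) -> None:
        # A real plugin would talk to a mail server; this one just records.
        self.outbox.append(message)


email = EmailNotifier()
enroll("alice", "CS101", email)
```

Swapping in a different notifier requires no change to `enroll`, which is the "plugin" property the notes describe.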
Chapter 6: Functional Programming
Examples at edX:
React, tuples
SuccessFactors integration just stores transactions
We tend to be mutable
Tried to make CourseBlocks API functional but was too expensive
If we passed around immutable objects, we could have fewer bugs
Elm doesn't allow side effects in body of function
Code can be both object oriented and functional
Even if data structures are mutable, we can be clear if functions are mutating or not
Immutable / append-only tables
Beautiful, but has issues
Becomes expensive in the "roll-forward" stage
Capturing snapshots allows roll-forwards to be cheaper
Great for making software testable and reliable, but breaks down when it comes to resources.
"All race conditions, .... are due to mutable data"
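The append-only idea and the snapshot trade-off discussed above can be sketched as follows. This is an illustrative assumption, not a description of how any edX table or integration works; `Ledger` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Ledger:
    """Append-only store: the balance is derived by replaying transactions."""

    transactions: List[int] = field(default_factory=list)
    # Snapshot = (number of transactions covered, balance at that point).
    snapshot: Tuple[int, int] = (0, 0)

    def append(self, amount: int) -> None:
        # Existing entries are never edited or removed.
        self.transactions.append(amount)

    def take_snapshot(self) -> None:
        self.snapshot = (len(self.transactions), self.balance())

    def balance(self) -> int:
        # Roll forward only from the snapshot, not from the beginning.
        count, total = self.snapshot
        return total + sum(self.transactions[count:])


ledger = Ledger()
for amount in (100, -30, 25):
    ledger.append(amount)
ledger.take_snapshot()
ledger.append(-5)
```

Without the snapshot, every call to `balance` would replay the whole history, which is the "roll-forward" cost mentioned in the notes.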
Part III: Design Principles
Chapter 7: SRP: The Single Responsibility Principle
Reading about the SOLID principles ahead of time would be beneficial
Separating it out into different structures felt less clean
Creates fewer dependencies for any given change
Responsibility of code is not currently defined, so it's hard to know what needs to be changed when requirements change
SRP vs Django
Django design philosophies
Django recommends fat models
Django steers us away from encapsulation
If your description of responsibilities includes "and" you're breaking it (Stackify article)
Chapter 8: OCP: The Open-Closed Principle
Examples hard to follow with just diagrams and no code - how to map abstract descriptions to concrete code.
Dependency inversion is a mechanism to support open-closed (and other principles).
How would we support open-closed without dependency inversion? Is it desirable to do so?
Only in this order so that we have acronym SOLID, could be reordered
Having the interfaces live with the component that needs it, and then have other things outside implement the interfaces helped outline pluggability.
In figure 8.2 on page 72, arrows between double-lined boxes only go in one direction, i.e. all arrows between Controller and Interactor go from Controller to Interactor.
Closed doesn't always mean never change, it may be refactored to support new abstractions. But, it is closed to adding new features directly within.
Chapter 9: LSP: The Liskov Substitution Principle
How do we anticipate LSP violations?
For example, overriding methods gets protection in compiled languages like Java, but still could have semantic issues.
Could use linters in languages like Python to avoid this
Beginning statement is very mathy, but seems like it is just saying that if one class could replace another, they are of the same type.
"is a" relationship isn't enough to establish that something should be a subclass, as seen in the Square/Rectangle example.
To do this properly, we would need an abstraction that doesn't have setH and setW, since those only apply to rectangle.
Only methods that apply to both the rectangle and square would be on this abstraction.
Code examples on the website
This is why broader product vision is useful to figure out where generic abstractions are necessary.
In API example, the change in the api fields exposed the violation.
Be conservative in what you send, and liberal with what you accept.
Should we invite Barbara Liskov!?
We can learn a lot from the formalism that comes out of academia.
Most production code doesn't have many inheritance levels... does this crop up in everyday work?
Not all examples of this principle rely on inheritance, like the API example that was given.
How much time should you spend on trying to get to architectural "perfection", vs. the second use case revealing abstractions?
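The Square/Rectangle point above can be sketched in Python. This is one possible shape for the shared abstraction, under the assumption that `area` is the only behavior both shapes need; the class and function names are hypothetical. The setters stay on the concrete classes, so no caller holding a `Shape` can rely on width and height changing independently.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class Shape(ABC):
    """Abstraction holding only the behavior that both shapes share."""

    @abstractmethod
    def area(self) -> float: ...


class Rectangle(Shape):
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def set_width(self, width: float) -> None:
        self.width = width

    def set_height(self, height: float) -> None:
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Square(Shape):
    """Not a subclass of Rectangle, so it never inherits set_width/set_height."""

    def __init__(self, side: float) -> None:
        self.side = side

    def set_side(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


def total_area(shapes: Iterable[Shape]) -> float:
    # Works for any Shape; either subtype can be substituted safely.
    return sum(shape.area() for shape in shapes)
```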
Chapter 10: ISP: The Interface Segregation Principle
Useful when deciding what goes into common libraries.
Figure 10.2 looks like the facade pattern.
What is Facade?
Facade pattern is when you make an interface that only exposes some methods of the class, i.e. only expose read even though backing implementation can read and write
Adapter is for supporting incompatible interfaces.
What if a User needs op1 and op2? Do we create a new interface for op1 and op2 together?
Seems like it could get bloated.
Do these three functions belong in the same class?
Demonstrates the benefit of composition, like if we need op1 and op2, we can just say we have a U1Ops and U2Ops, even though they are the same object underneath.
For languages like Python, violations of this won't hit at compile/run time necessarily, but it is still advantageous.
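The composition idea above (one object underneath, several narrow interfaces) can be sketched with `typing.Protocol`. `U1Ops` and `U2Ops` follow the names used in the notes; `Ops` and the user functions are hypothetical. Python will not enforce these interfaces at runtime, but a static type checker can, and the narrow annotations document what each caller actually depends on.

```python
from typing import Protocol


class U1Ops(Protocol):
    def op1(self) -> str: ...


class U2Ops(Protocol):
    def op2(self) -> str: ...


class Ops:
    """Single backing object that happens to implement every operation."""

    def op1(self) -> str:
        return "op1"

    def op2(self) -> str:
        return "op2"

    def op3(self) -> str:
        return "op3"


def user1(ops: U1Ops) -> str:
    # Depends only on op1; changes to op2 or op3 do not concern this caller.
    return ops.op1()


def user_needing_both(first: U1Ops, second: U2Ops) -> str:
    # No combined interface is needed: the caller takes both narrow ones.
    return first.op1() + "+" + second.op2()


shared = Ops()
result = user_needing_both(shared, shared)
```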
Chapter 11: DIP: The Dependency Inversion Principle
Has other benefits besides enabling principles such as open-closed, e.g. it gives better testability.
<Extra SOLID>
Invest the time to apply SOLID principles selectively because a lot of effort is needed to apply SOLID at all times
Invest the time in heavily shared components (plugins)
Part IV: Component Principles
Chapter 12: Components
Very historical chapter
Plugins, discussed
Chapter 13: Component Cohesion
Chapter 14: Component Coupling
Attaching mathematics to principles probably works really well for some things, but not others
The formulas also help us be rigorous if we need to in the future
It's hard to know what will be updated frequently and what won't
Drawing a dependency map helps you determine what items should be stable
We have a dependency graph generator for edx-platform (!!!)
DEPR
Removing things is hard because there's a lack of context
Large monoliths tend to have a majority of their code unused
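The formulas mentioned above are the book's instability metric, I = Fan-out / (Fan-in + Fan-out), and the distance from the main sequence, D = |A + I - 1|, where A is abstractness. A rough sketch of computing them from a dependency map follows. The map and component names are made up, and counting whole components rather than classes is a simplifying assumption.

```python
from typing import Dict, Set

# Hypothetical dependency map: component -> components it depends on.
deps: Dict[str, Set[str]] = {
    "views": {"services", "models"},
    "services": {"models"},
    "models": set(),
}


def instability(component: str, graph: Dict[str, Set[str]]) -> float:
    """I = fan_out / (fan_in + fan_out); 0 is maximally stable, 1 maximally unstable."""
    fan_out = len(graph[component])
    fan_in = sum(1 for targets in graph.values() if component in targets)
    total = fan_in + fan_out
    return fan_out / total if total else 0.0


def distance_from_main_sequence(abstractness: float, instab: float) -> float:
    """D = |A + I - 1|; values near 0 sit on the main sequence."""
    return abs(abstractness + instab - 1)
```

In this made-up map, `models` is depended on by everything and depends on nothing, so it comes out as the component that most needs to be stable.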
Part V: Architecture
Chapter 15: What is Architecture?
Architects should also continue working on the code while they architect.
Architecture has very little bearing on how the system works; it more likely affects deployment and manageability.
Architecture should make the system's use clear
Separate things by business concerns, not technical concerns
IDEA: Test new architectural strategies with future users of the code (i.e. "Does this look approachable")
IDEA: Mock external package inside a repository until it can be pulled out into its own package.
It doesn't matter whether code is a library or a microservice as long as it's broken out into a component; the difference between an API call and a library call is minimal.
Chapter 16: Independence
As the life-cycle evolves, applications that were broken out into microservices may be brought back into the original service if it's better for the process (e.g. edX did this with the Programs IDA)
Don't try to unify items that may diverge for the sake of deduplication
Something to keep in mind when pulling components into libraries
Example: Reusing a translation string so you don't need to create another.
Some things like a Course client (fetch and cache) should be pulled into a library, but a Course should have different domain requirements for each IDA
Attempt to keep libraries from becoming the "kitchen sink" with good documentation of the library's purpose
Breaking delete/create of item into separate views could possibly be done because:
Users that delete may be different than those that create
UI workflows for deletion and creation may be very different
Scaling of services
Breaking into micro-services by line-count is arbitrary and not helpful
Build apps to be individually deployable, but delay deploying them separately until necessary.
Chapter 17: Boundaries: Drawing Lines
Business rules should contain the interface that a plugin/detail layer (i.e. DB) should implement
Prototyping is not wasted code
Is abstracting for something as broad as SQL vs NoSQL over-engineering?
Should we have a layer between our code and the Django ORM?
Chapter 18: Boundary Anatomy
Chapter 19: Policy and Level
Every component in your project should be part of a DAG
Rewriting the bad code would look like:
encrypt would accept objects that implement the writeChar and readChar interfaces.
    function encrypt(writer, reader) {
      // Depends only on the two interfaces, not on concrete IO devices.
      while (true) {
        writer.writeChar(translate(reader.readChar()));
      }
    }
The line "the IODevices component depends on the Encryption component" is a bit confusing
Plugins implement the requirements of the component they're plugging in to.
To "plug in" a plugin, the component must have the right "holes"
Chapter 20: Business Rules
Do Django views map to use cases?
Views should really only be input/output marshaling
They are currently handling both the input/output and the logic
Would be nice to use api.py even within the same app.
Would make testing easier as well because there would be less mocking.
Chapter 21: Screaming Architecture
A quick glance at a component should tell you what the component does, not what framework you used.
Don't have a big collection of apps all together, co-locate them by feature. (i.e. have a "learning" folder filled with the apps related to learning)
Chapter 22: The Clean Architecture
Entities - data & methods; Use Cases - similar to a user story (a workflow)
Creating boundaries between layers can be difficult when you break code into async (out of process) tasks
Don't hand objects from the database down to the views
Should have an intermediate data structure.
How would this work with querysets? We would have to turn the queryset into a list of the data before passing it up a layer. This can be a HUGE performance issue.
One of the reasons we find issues between our code and these principles is that Django was built to allow distinct skillsets to work on single sections of the code. Examples:
Designer: Only needs to edit the template, but may need text or string manipulation, which leads to logic in the view (of MVC, not Django)
DBA: Only needs to touch models. But since they want to add methods to allow for efficient extraction of data, we end up with code in our model layer as well.
"Smashing things together is bad"
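The "intermediate data structure" idea above can be sketched without Django. `CourseSummary` and `to_summaries` are hypothetical names, and the rows are stand-ins for whatever the persistence layer returns. Converting lazily with a generator avoids building a full list up front, though it does not preserve queryset features such as further filtering in the database, so the performance concern raised in the notes only partly goes away.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class CourseSummary:
    """Plain, immutable data structure handed to the view layer (hypothetical)."""

    course_id: str
    title: str


def to_summaries(rows: Iterable[Dict[str, str]]) -> Iterator[CourseSummary]:
    """Convert database rows one at a time instead of building a full list."""
    for row in rows:
        yield CourseSummary(course_id=row["id"], title=row["title"])


# Stand-in for rows coming out of the persistence layer.
rows = [{"id": "c1", "title": "Intro"}, {"id": "c2", "title": "Advanced"}]
summaries = list(to_summaries(rows))
```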
Chapter 23: Presenters and Humble Objects
Chapter 24: Partial Boundaries
"Reciprocal interfaces" aren't mentioned anywhere else
We've done similar work here in edx-enterprise where we assumed we'd be a separate IDA, and are now looking to assume we're just an installed app.
XModule is an example of a "facade" pattern here at edX
Chapter 25: Layers and Boundaries
This chapter made a lot of ideas from previous chapters more concrete.
Takes a simple problem then shows how the architecture grows and becomes more important as features are added
You have to keep an eye on the inflection point where making the software boundaries is going to be less painful than dealing with not having them in the future.
Chapter 26: The Main Component
Due to plugins, it's possible to have many different main components that will change with each deployment.
Instead of a different main for each of our deployments, we use a settings file so we can easily do multiple deployments without editing code itself.
If a component is "dirty" it is very likely to change and does not have any of the pretty abstractions; that's why main is the dirtiest of the components.
Chapter 27: Services: Great and Small
Microservices don't mean you have an architecture
You can have an architecture even with a monolith
Discovery
Is it architecturally significant on its own?
Is it better as a service or an app?
Should courses live in discovery?
Only pass simple data structures across services
Don't pass data models that require a standard manager class that we have to install across services
Chapter 28: The Test Boundary
Move from testing pyramid (unit tests / integration tests) to component tests (just test the public API of the application).
Better to test stable components than unstable.
Some exceptions happen
We still may want to test performance with num_query tests
Some complex operators may want unit tests
Chapter 29: Clean Embedded Architecture
Skipped
Part VI: Details
Chapter 30: The Database Is a Detail
A layer of abstraction over different relational databases (MySQL, Postgres, etc.) seems reasonable and useful
When possible, abstracting the fact that it's a relational DB at all may be useful
Performance concerns associated with this
Many companies have DBAs that "own" the database. They write all the SQL, and they know how to change it in a way that won't break things.
Can it be useful to use an ORM, but have a DBA as an "advisor"? Possibly! Could help edX with standards & practices
Uncle Bob thinks RDBMSs & disk storage in general are going away
There is some reality to this (Dell is working on "The Machine"), but largely, it is over-optimistic
Even if we didn't think about disks, it still matters how you persist things. It affects your ability to quickly access that data.
Relational data, irrespective of storage, is useful
Chapter 31: The Web Is a Detail
Architecture is fractal - if the frontend is a "detail", then it is detail that has its own business logic and sub-details
Chapter 32: Frameworks Are Details
Skipped
Chapter 33: Case Study: Video Sales
Chapter 34: The Missing Chapter