Book Clubbers: Michael Katz (Deactivated), Christopher Lee (Deactivated), Cliff Dyer (Deactivated), akivaR (Deactivated), Andy Armstrong (Deactivated), ChristinaR (Deactivated), Eric Fischer (Deactivated), Robert Raposa, Renzo Lucioni (Deactivated), ClintonB (Deactivated), Adam Palay (Deactivated), Nimisha Asthagiri (Deactivated)

Jan 6, 2016

Thoughts on organizing the book club

Case study of actual edX code
Go through each chapter and decide on what applies to edX
Overall thoughts on the book - writing, content, approach

Book Thoughts

We like that he acknowledges that there are multiple perspectives and multiple schools of thought
We like that he takes something complicated and makes it simpler

Chapter 3: Functions

Does smaller (and more) functions actually make it more readable?
In python
- By the way, function calls don't actually get optimized away.
- Performance issues should be exceptional cases - and should be commented as such.
- Is Python's coding practice counter to these rules?
As a team that does a lot of code reviews, code readability is very useful.
Functions should only do 1 thing
Triad rule of number of parameters?
- Python does have keyword arguments that helps mitigate this.
- But JS doesn't.
- But too many arguments means more combinations for testing
- And also, we tend to have arguments that are mutually exclusive of another.
- We may want to still flag this situation and consider why.
Step-down rule
Boy Scout rule
- Clean up while updating code.
- This is very difficult to do because it takes us down a long path.
- How to encourage and make this happen at edX?
  - Be stricter in code reviews.
  - Have separate commits for refactoring.
    - Sometimes it's difficult because of merge conflicts.
  - Ways of deprecating - come back to it.
Duck-typing in Python?
- Example: Argument can be a String or an Object.
- PyContracts
- Functions that are both getters-and-setters
Side Effects
Error Handling
- all error handling in a separate function? really?
- case study: views.py versus api.py separation of concerns.
api.py really needed?
- If there are no in-process requirement for it, then just exclude it?
- If people start using api.py and we want to separate into an IDA, it makes it difficult.
- If we don't create an api.py initially, we can create it later when it's actually required.
- For an app with many views, you can use a views folder with multiple modules.

edX Takeaways

Is there somewhere we can point to for best practices?

Jan 13, 2016

Chapter 4: Comments

Comments do get out of date. So commenting on things that aren't obvious makes sense.
Do we have an obligation when it comes to comments because of the open source nature of the platform
- API Docs
- pylint - docstring requirements
  - explicit disabling locally versus updating pylint to no longer require docstrings
  - but if explicitly disabled locally, that can also get stale
Why do docstrings have to be multi-line?
- Takes up a lot of vertical space
- Rationale: Code diffs are clearer? (similar to the trailing comma guideline)
- Petition against Ned Batchelder (Deactivated)?
Coding convention: triple double quotes
Ambiguity of parameters
- "Could be a string or a list"
- Duck typing
- JSON is mildly typed
API bindings
- DRF-Swagger will automatically create docstrings
- Now that DRF is upgraded, we can look into this.
- ClintonB (Deactivated) will send a link to how they do this for Course Discovery
- Can create common DRF Serializer fields - for example, for OpaqueKeys.
TODO comments
- Have corresponding JIRA tickets
- Distinguish between
  - short-term TODOs (address before merging the PR)
  - long-term TODOs (with JIRA tickets)
  - TODOs for code improvements - may not need tickets
- lint-checker for TODO to contain ticket numbers
When to use comments and when not to seemed conflicted
- All comments can get stale
- So only write comments that are really necessary
In place of comments
- use intermediate, well named, local variables
  - for conditionals
  - for long lines
  - But when there are local variables, you expect it to be used multiple times.
- better method names
List comprehensions that are 5 lines should probably be for loops instead - for readability.
The python ternary operator reads much better since it's in one line.
- But some developers don't like them as much.
"x or y" OR "x if x else y"

Chapter 5: Formatting

Python is very explicit
Dependent functions
- In test files, we've generally been putting the helper functions on top (which is reverse from his recommendations)
- But for other classes, we generally use the top-down approach.
For xCode
- File hierarchy on disk is different from hierarchy in the IDE
- So needs to be manually updated
Agree: Extra whitespace brings attention to wrong things
Like: Breaking indentation always returning back to the indented form
Agree: If statements not having braces
In JS, variable declarations always on top because of scoping rules, but is it helpful for readability?

Jan 20, 2016

Chapter 6: Objects and Data Structures

Law of Demeter
- "objects that you created" is Java-specific
- distinguishing between objects that you've created and those that you depend on
- abstraction layers
- in some places, it says no more than one period.
- Train wrecks
  - distinguishing between breaking abstractions through objects, and
  - accessing data models directly
Objects versus Data Structures
- Our django models have become hybrids
- What's the right pattern of having separation of concerns between persistence and the object-level abstraction?
Should we create a new business logic layer on top of our models?
Too many layers is not advantageous: API view -> Utils function -> .... eventually a Model
Trade-off between making use of the framework and having abstraction layers
If we think about the edx-platform codebase history, it seems we add more behaviors overtime, but no data structures.
- This implies perhaps we should think more procedural, rather than OO.

Chapter 7: Error Handling

Using exceptions rather than error codes
- No brainer.
Throwing exceptions instead of returning None
- We do this in our codebase.
- It's fine as long as we document it - we don't have a result for you.
- We do like his suggestion of having a special case object.
Using django framework conventions
- ObjectDoesNotExist exception
- get_or_none - returns None
- first - return None
In Java, exceptions are expensive, so they try to avoid them.
- But in Python, that's not the case.
For example, getattr, just catches the exception and returns False/True.
- But if catching the exception is acceptable for code, then go fo it.
By returning error codes, the callee can be assured that the caller will handle all cases.
- But then too many if statements.
- And probably end up wrapping it anyway, and throw an exception.
- If you make a mistake, and completely miss a case, then would get unnoticed.
  - In some other languages, however, you have to pull out your object explicitly.
- hasattr - in case the object raises an exception other than AttributeError, later version of Python now catches it and returns False. Before, the exception was passed through.
API wrapper mixin makes it easy to translate View exceptions to formatted HTTP error return objects.
Some languages have @Nullable decorators to explicitly specify which functions have chosen to return Null.
Passing Null around
- In some cases, we do want to do this - for example, Null values in Address fields, as opposed to empty strings.
- One way of thinking about this for Python: passing None for optional parameters is fine, but for non-optional parameters should be seriously vetted.
- "?." is not yet accepted into Python.

Next time: Chapters 8 and 9.

Jan 27, 2016

Chapter 8 and 9

Cliff discovered a useful tool! Mocking server for IDAs that need to talk to each other.
- There's also a python tool called beta-max which "caches" IDA responses (so you can write tests against responses)

Boundaries

Wrapping 3rd party libraries: it can easily be overdone.
Forking, and making minimal changes, can bite us, since it makes it more complicated to upgrade the library
- Could be a good place to get in the habit of wrapping
Coverage of 3rd-party library — does it matter if it has low-percentage test coverage?
- If we wrapped 3rd-party libraries, having unit tests for the wrapper could prevent against 3rd-party libraries of unknown quality
Difference between testing against 3rd-party libraries and services

Testing

"What we talk about when we talk about testing" — "I'm sure there's a blog post called that out there"
TDD
- Clean code: don't write more code than is sufficient to pass the test
- Relationship to red/green/refactor? — Similar, when you get down to first principles
- Sometimes DRY-ness in tests make them harder to understand (DRY = don't repeat yourself)
- DRY tests make it easier to do large refactoring
DSLs for Testing
- Course DSL for creating courses in tests
- Using named-tuples for ddt --> makes tests more readable (could also use a dict)

Feb 10, 2016

Chapter 10: Classes

Small classes?
- Need better developer tools
  - It would be great if our IDEs had a better visual tool for presenting relationships between our classes - so it's easier to understand how all the small classes fit together.
  - EComm used django-oscar, which had a ton of classes - but made it hard to figure out where to start.
- Hard to discover classes, but easier to add new classes.
Single responsibility principle
- What is a responsibility?
Underlying principles though help
- make complex code easier to understand.
- make it easier to test.
Can go overboard on generalizing everything.
Are the SQL classes in table 10-10 really better than the original code in 10-9?
Description of the class without using if, or, and, but.
Cohesion as a useful tool to gauge a class - as a guideline.

Chapter 11: Systems

Dependency Injection
- decoupling, extension mechanism
- iOS attempted this for string resolution, but instead of passing the dependency through all layers, created a singleton as a lookup service.
Enterprise Java Beans
- property setters/getters decoupled from the storage layer and the behavior
Aspects
- Declare in a separate place all the before and after operations to do for its decorated methods.
Separated Code dependency (code depending code implemented elsewhere)
- Aspects
- Python Mixins
- Python Decorators
DRF is a good example for us to look at.
Design exercise: What design patterns to use when designing an access control permission checking framework?

Feb 17, 2016

Chapter 12: Emergence

Not controversial to keep the code clean
This chapter didn't introduce any new ideas
Code that doesn't have tests/hasn't been tested, should not be shipped.
Keep minimal classes AND keep them short?
- The idea is that if the code is clean, classes will be minimal and short.
- Short could also be interpreted as conceptually short instead by lines.
In edx, seems like duplication can make the code easier to read.
In test code, code duplication could make the test much easier to read.
- There are some testing frameworks that use real sentences to describe tests instead of code. EdX used to use Lettuce, which did this.
- Pylint complains about test function name length. This limit has been tweaked up. Possibly increase this?
- Englishy test frameworks might be better for UI testing.

Chapter 13: Concurrency

Ned made a great concurrency joke
Concurrency is hard.
We believe edx is single threaded, but has multiple processes.
- Mongo and django, have their own mechanisms for concurrency.
GIL! Global Interpreter Lock
- Python code that runs on two threads, are still not run at the same time and on a single core
- Ssssssssss

March 9, 2016

Chapter 17: Smells and Heuristics

Removing commented out code - Yes or No? YES!!
G30: Functions should do one thing
Creating a Coding Standard Convention Wiki
- Find terminology used throughout the book that we want to use to identify.
- Make sure we don't get too jargony - that it makes it hard for newcomers to come up to speed.
- Make sure PR reviewers still continue to explain themselves rather than just throw out a term.
- Hyperlinkable sections.
- Create a confluence wiki - contributor guidelines.
Action Items:
- Order "Effective Python" Cliff Dyer (Deactivated)
- Invite all of Eng to club Cliff Dyer (Deactivated)
- Ping IT on creating Github team Christopher Lee (Deactivated)
- Create wiki for coding convention Nimisha Asthagiri (Deactivated)

March 30, 2016

Effective Python: Chapter 1 & 2

Item 2:
- double-underscore is not for private variables.
Item 3:
- How to handle native strings?
  - from __future__ import unicode_literal & silence when needed (variable behavior between files (yuck))
  - lint native strings
  - have coercion functions
- Needs engineering education
Item 5: slice
- Is it bad that it is forgiving?
  - Sometimes it's useful
  - Maybe there should be a failfast version.
Item 10: enumerate
- Very useful
- Takes initial index! Cool!
Item 11: zip
- Makes Akiva nervous, because it has weird behavior with different length strings
- zip_longest does the longer one (padding with Nones)
Item 12:
- Renzo disagrees
- should we avoid else blocks.
- "If a feature is rare and somewhat surprising, you shouldn't use it." – Akiva
- Alternative is good: break loop out into a separate function
- Use with comment to describe behavior.
Item 14: Avoid None
- not var is a source of bugs because None and 0 are both false
- It's fairly useful for containers (lists, dicts, strings, etc.)
Item 15: Closures
- Accessing non-locals: create a class
- Use nonlocals as read only
Item 16: consider generators instead of lists
- small lists are fine
- generators can be problematic. See Item 17.
- we don't do much of this
- generators are awesome
- custom containers are awesomer
Item 17: be defensive when iterating over sequences twice.
- iter(foo) is iter(foo)
  - different from iter(foo) is foo?
- convert to list if you hit this kind of error and don't want to refactor into a custom container.
Item 21: keyword only args
- Nimisha does not recommend using **kwargs to enforce keyword-only args, because it makes the code ugly and hard to read
  - Sometimes it's useful anyway

Not Discussed:

Item 1: Know which version of Python you're using
Item 4: Write helper functions instead of complex expressions
Item 6: avoid using start, end, and stride in a single slice
- I think we did discuss this, at least obliquely. I didn't capture notes.
Item 7: Use list comprehensions instead of map and filter
Item 8: Avoid more than two expressions in list comprehensions
Item 9: Consider generator expressions for large comprehensions
Item 13: Take advantage of each block in try/except/else/finally
Item 18: Reduce visual noise with variable positional arguments
Item 19: Provide optional behavior with keyword arguments
Item 20: Use None and docstrings to specify dynamic default arguments

Clean Code Book Club Notes

Jan 6, 2016

Thoughts on organizing the book club

Book Thoughts

Chapter 3: Functions

edX Takeaways

Jan 13, 2016

Chapter 4: Comments

Chapter 5: Formatting

Jan 20, 2016

Chapter 6: Objects and Data Structures

Chapter 7: Error Handling

Jan 27, 2016

Chapter 8 and 9

Feb 10, 2016

Chapter 10: Classes

Chapter 11: Systems

Feb 17, 2016

Chapter 12: Emergence

Chapter 13: Concurrency

March 9, 2016

Chapter 17: Smells and Heuristics

March 30, 2016

Effective Python: Chapter 1 & 2