Clean Code Notes - Run 1

Book Clubbers: Michael Katz (Deactivated), Christopher Lee (Deactivated), Cliff Dyer (Deactivated)akivaR (Deactivated), Andy Armstrong (Deactivated), ChristinaR (Deactivated), Eric Fischer (Deactivated)Robert Raposa, Renzo Lucioni (Deactivated), ClintonB (Deactivated)Adam Palay (Deactivated), Nimisha Asthagiri (Deactivated)

Jan 6, 2016

Thoughts on organizing the book club

  • Case study of actual edX code
  • Go through each chapter and decide on what applies to edX
  • Overall thoughts on the book - writing, content, approach

Book Thoughts

  • We like that he acknowledges that there are multiple perspectives and multiple schools of thought
  • We like that he takes something complicated and makes it simpler

Chapter 3: Functions

  • Does smaller (and more) functions actually make it more readable?
  • In python
    • By the way, function calls don't actually get optimized away.
    • Performance issues should be exceptional cases - and should be commented as such.
    • Is Python's coding practice counter to these rules?
  • As a team that does a lot of code reviews, code readability is very useful.
  • Functions should only do 1 thing
  • Triad rule of number of parameters?
    • Python does have keyword arguments that helps mitigate this.
    • But JS doesn't.
    • But too many arguments means more combinations for testing
    • And also, we tend to have arguments that are mutually exclusive of another.
    • We may want to still flag this situation and consider why.
  • Step-down rule
  • Boy Scout rule
    • Clean up while updating code.
    • This is very difficult to do because it takes us down a long path.
    • How to encourage and make this happen at edX?
      • Be stricter in code reviews.
      • Have separate commits for refactoring.
        • Sometimes it's difficult because of merge conflicts.
      • Ways of deprecating - come back to it.
  • Duck-typing in Python?
    • Example: Argument can be a String or an Object.
    • PyContracts
    • Functions that are both getters-and-setters
  • Side Effects
  • Error Handling
    • all error handling in a separate function?  really?
    • case study: views.py versus api.py separation of concerns.
  • api.py really needed?
    • If there are no in-process requirement for it, then just exclude it?
    • If people start using api.py and we want to separate into an IDA, it makes it difficult.
    • If we don't create an api.py initially, we can create it later when it's actually required.
    • For an app with many views, you can use a views folder with multiple modules.

edX Takeaways

  • Is there somewhere we can point to for best practices?

Jan 13, 2016

Chapter 4: Comments

  • Comments do get out of date.  So commenting on things that aren't obvious makes sense.
  • Do we have an obligation when it comes to comments because of the open source nature of the platform
    • API Docs
    • pylint - docstring requirements
      • explicit disabling locally versus updating pylint to no longer require docstrings
      • but if explicitly disabled locally, that can also get stale
  • Why do docstrings have to be multi-line?
    • Takes up a lot of vertical space
    • Rationale: Code diffs are clearer?  (similar to the trailing comma guideline)
    • Petition against Ned Batchelder (Deactivated)?
  • Coding convention: triple double quotes
  • Ambiguity of parameters
    • "Could be a string or a list"
    • Duck typing
    • JSON is mildly typed
  • API bindings
    • DRF-Swagger will automatically create docstrings
    • Now that DRF is upgraded, we can look into this.
    • ClintonB (Deactivated) will send a link to how they do this for Course Discovery
    • Can create common DRF Serializer fields - for example, for OpaqueKeys.
  •  TODO comments
    • Have corresponding JIRA tickets
    • Distinguish between
      • short-term TODOs (address before merging the PR)
      • long-term TODOs (with JIRA tickets)
      • TODOs for code improvements - may not need tickets
    • lint-checker for TODO to contain ticket numbers
  • When to use comments and when not to seemed conflicted
    • All comments can get stale
    • So only write comments that are really necessary
  • In place of comments
    • use intermediate, well named, local variables
      • for conditionals
      • for long lines
      • But when there are local variables, you expect it to be used multiple times.
    • better method names
  • List comprehensions that are 5 lines should probably be for loops instead - for readability.
  • The python ternary operator reads much better since it's in one line.
    • But some developers don't like them as much.
  • "x or y" OR "x if x else y"

Chapter 5: Formatting

  • Python is very explicit
  • Dependent functions
    • In test files, we've generally been putting the helper functions on top (which is reverse from his recommendations)
    • But for other classes, we generally use the top-down approach.
  • For xCode
    • File hierarchy on disk is different from hierarchy in the IDE
    • So needs to be manually updated
  • Agree: Extra whitespace brings attention to wrong things
  • Like: Breaking indentation always returning back to the indented form
  • Agree: If statements not having braces
  • In JS, variable declarations always on top because of scoping rules, but is it helpful for readability?

Jan 20, 2016

Chapter 6: Objects and Data Structures

  • Law of Demeter
    • "objects that you created" is Java-specific
    • distinguishing between objects that you've created and those that you depend on
    • abstraction layers
    • in some places, it says no more than one period.
    • Train wrecks
      • distinguishing between breaking abstractions through objects, and
      • accessing data models directly
  • Objects versus Data Structures
    • Our django models have become hybrids
    • What's the right pattern of having separation of concerns between persistence and the object-level abstraction?
  • Should we create a new business logic layer on top of our models?
  • Too many layers is not advantageous: API view -> Utils function -> .... eventually a Model
  • Trade-off between making use of the framework and having abstraction layers
  • If we think about the edx-platform codebase history, it seems we add more behaviors overtime, but no data structures.
    • This implies perhaps we should think more procedural, rather than OO.

Chapter 7: Error Handling

  • Using exceptions rather than error codes
    •  No brainer.
  • Throwing exceptions instead of returning None
    • We do this in our codebase.
    • It's fine as long as we document it - we don't have a result for you.
    • We do like his suggestion of having a special case object.
  • Using django framework conventions
    • ObjectDoesNotExist exception
    • get_or_none - returns None
    • first - return None
  • In Java, exceptions are expensive, so they try to avoid them.
    • But in Python, that's not the case.
  • For example, getattr, just catches the exception and returns False/True.
    • But if catching the exception is acceptable for code, then go fo it.
  • By returning error codes, the callee can be assured that the caller will handle all cases.
    • But then too many if statements.
    • And probably end up wrapping it anyway, and throw an exception.
    • If you make a mistake, and completely miss a case, then would get unnoticed.
      • In some other languages, however, you have to pull out your object explicitly.
    • hasattr - in case the object raises an exception other than AttributeError, later version of Python now catches it and returns False.  Before, the exception was passed through.
  • API wrapper mixin makes it easy to translate View exceptions to formatted HTTP error return objects.
  • Some languages have @Nullable decorators to explicitly specify which functions have chosen to return Null.
  • Passing Null around
    • In some cases, we do want to do this - for example, Null values in Address fields, as opposed to empty strings.
    • One way of thinking about this for Python: passing None for optional parameters is fine, but for non-optional parameters should be seriously vetted.
    • "?." is not yet accepted into Python.

Next time: Chapters 8 and 9.

Jan 27, 2016

Chapter 8 and 9

  • Cliff discovered a useful tool! Mocking server for IDAs that need to talk to each other.
    • There's also a python tool called beta-max which "caches" IDA responses (so you can write tests against responses)

Boundaries

  • Wrapping 3rd party libraries: it can easily be overdone.
  • Forking, and making minimal changes, can bite us, since it makes it more complicated to upgrade the library
    • Could be a good place to get in the habit of wrapping
  • Coverage of 3rd-party library — does it matter if it has low-percentage test coverage?
    • If we wrapped 3rd-party libraries, having unit tests for the wrapper could prevent against 3rd-party libraries of unknown quality
  • Difference between testing against 3rd-party libraries and services

Testing

  • "What we talk about when we talk about testing" — "I'm sure there's a blog post called that out there"
  • TDD
    • Clean code: don't write more code than is sufficient to pass the test
    • Relationship to red/green/refactor? — Similar, when you get down to first principles
    • Sometimes DRY-ness in tests make them harder to understand (DRY = don't repeat yourself)
    • DRY tests make it easier to do large refactoring
  • DSLs for Testing
    • Course DSL for creating courses in tests
    • Using named-tuples for ddt --> makes tests more readable (could also use a dict)

Feb 10, 2016

Chapter 10: Classes

  • Small classes?

    • Need better developer tools
      • It would be great if our IDEs had a better visual tool for presenting relationships between our classes - so it's easier to understand how all the small classes fit together.
      • EComm used django-oscar, which had a ton of classes - but made it hard to figure out where to start.
    • Hard to discover classes, but easier to add new classes.
  • Single responsibility principle
    • What is a responsibility?
  • Underlying principles though help
    • make complex code easier to understand.
    • make it easier to test.
  • Can go overboard on generalizing everything.
  • Are the SQL classes in table 10-10 really better than the original code in 10-9?
  • Description of the class without using if, or, and, but.
  • Cohesion as a useful tool to gauge a class - as a guideline.

Chapter 11: Systems

  • Dependency Injection
    • decoupling, extension mechanism
    • iOS attempted this for string resolution, but instead of passing the dependency through all layers, created a singleton as a lookup service.
  • Enterprise Java Beans
    • property setters/getters decoupled from the storage layer and the behavior
  • Aspects
    • Declare in a separate place all the before and after operations to do for its decorated methods.
  • Separated Code dependency (code depending code implemented elsewhere)
    • Aspects
    • Python Mixins
    • Python Decorators
  • DRF is a good example for us to look at.
  • Design exercise: What design patterns to use when designing an access control permission checking framework?

Feb 17, 2016

Chapter 12: Emergence

  • Not controversial to keep the code clean

  • This chapter didn't introduce any new ideas
  • Code that doesn't have tests/hasn't been tested, should not be shipped.
  • Keep minimal classes AND keep them short?
    • The idea is that if the code is clean, classes will be minimal and short.
    • Short could also be interpreted as conceptually short instead by lines.
  • In edx, seems like duplication can make the code easier to read. 
  • In test code, code duplication could make the test much easier to read.
    • There are some testing frameworks that use real sentences to describe tests instead of code. EdX used to use Lettuce, which did this.
    • Pylint complains about test function name length. This limit has been tweaked up. Possibly increase this?
    • Englishy test frameworks might be better for UI testing. 

Chapter 13: Concurrency

  • Ned made a great concurrency joke
  • Concurrency is hard.
  • We believe edx is single threaded, but has multiple processes. 
    • Mongo and django, have their own mechanisms for concurrency. 
  • GIL! Global Interpreter Lock
    • Python code that runs on two threads, are still not run at the same time and on a single core
    • Ssssssssss

March 9, 2016

Chapter 17: Smells and Heuristics

  • Removing commented out code - Yes or No?  YES!!
  • G30: Functions should do one thing
  • Creating a Coding Standard Convention Wiki
    • Find terminology used throughout the book that we want to use to identify.
    • Make sure we don't get too jargony - that it makes it hard for newcomers to come up to speed.
    • Make sure PR reviewers still continue to explain themselves rather than just throw out a term.
    • Hyperlinkable sections.
    • Create a confluence wiki - contributor guidelines.
  • Action Items: