Test Early and Test Often
Q: What does this mean exactly? TDD?
A: If your definition of done includes tests, you are automatically forced to come up with a strategy to make that happen.
“Three amigos”: Product Owner, Developer, Tester/QA
If you have a test framework, you can translate what they agree on directly into tests.
Not just unit tests. You might not want to run the full gamut every time, but the later you test, the higher the cost.
The Definition of Done should include writing or updating automated tests.
Annotate your tests
Q: What criteria/annotations should we use to determine which unit/e2e tests are most important to always run in CI/CD before shipping to production? (For example, if we have tests that take 60 minutes and only want to run 20 minutes' worth, what methods can you use to decide which 20 minutes of tests to run?)
A: Assuming that 60 minutes includes the unit tests and API tests that are already being run: are those your fast tests, or do you have things that are bloated? How much parallelization is available? Which kinds of tests are in the mix (Selenium, microbenchmarks, etc.)? What’s important to test depends on the application and strategy.
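One possible way to make the "which 20 minutes" decision mechanical is sketched below. It assumes every test carries a measured runtime and a priority annotation; the `SuiteEntry` type, the priority scale, and the test names are all hypothetical, and the ordering rule (priority first, then fastest first) is just one reasonable choice.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SuiteEntry:
    name: str
    minutes: float   # measured average runtime
    priority: int    # hypothetical annotation: 1 = must run, 3 = nice to have

def select_tests(tests, budget_minutes):
    """Pick tests in priority order (fastest first within a priority) until the budget is used up."""
    chosen, used = [], 0.0
    for t in sorted(tests, key=lambda t: (t.priority, t.minutes)):
        if used + t.minutes <= budget_minutes:
            chosen.append(t)
            used += t.minutes
    return chosen, used

# Made-up suite totalling 60 minutes
suite = [
    SuiteEntry("checkout_e2e", 12, 1),
    SuiteEntry("login_api", 3, 1),
    SuiteEntry("search_ui", 15, 2),
    SuiteEntry("profile_api", 4, 2),
    SuiteEntry("report_export", 26, 3),
]
chosen, used = select_tests(suite, budget_minutes=20)
print([t.name for t in chosen], used)
```

With these made-up numbers, both priority-1 tests and one fast priority-2 test fit in the budget; the slow UI and export tests are left for the full run.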
Q: If all tests were annotated, how to decide which ones are worth running at any given time? How has that ruleset evolved?
A: Depends on the contract within the team or org. Example: at one company the full suite took 13 hours to run, and it had to run before a merge could happen. They created strategies to parallelize it, because breaking it was not an option: “Break anything in the engine and you break the whole company.” Over time, as the code became more modular/broken out, teams that made progress on that could run smaller sets of tests for their own development (no more than 20 minutes). Each group was asked which tests were the important ones. The PR system would still run the full set of tests for everyone. At the end of the day, it helped set a baseline.
Q: Do annotations give more flexibility in the locations of the tests?
A: Annotations were used for integration-level tests, not unit tests (which are co-located with the code).
To know which unit tests to run: go by the feature being tested, using a property file mapping.
Have we thought about doing RCAs (root cause analyses) for our own bugs, including CAT-2s (not just critical outages)? Focus on the why, not the what. It often comes down to people and processes, not code.
Can do reporting by severity and type to get a better understanding of where issues are coming from.
Previous experience: each team does its own RCA, and those then bubble up into a higher-level RCA to find common patterns, missing infrastructure, missing training, etc.
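Reporting by severity and type can start out as a simple tally. The sketch below uses invented RCA records and made-up category names ("process", "code", "training"), purely to show the shape of such a report.

```python
from collections import Counter

# Hypothetical RCA records: (severity, root-cause type). Invented for illustration.
rcas = [
    ("CAT-1", "process"),
    ("CAT-2", "process"),
    ("CAT-2", "code"),
    ("CAT-2", "training"),
    ("CAT-2", "process"),
]

by_type = Counter(kind for _, kind in rcas)   # where do issues come from overall?
by_severity_and_type = Counter(rcas)          # the same, split by severity

print(by_type.most_common())
print(by_severity_and_type[("CAT-2", "process")])
```

Rolling several teams' records into one list before counting gives the higher-level view across teams.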
Q: Re: training: how do you teach testing (how to test, when to use which kind of testing)? What are good resources? How do you train people who are new to testing, and how do you keep that level of skill across an org?
Q: Only one assert per unit test?
A: Yes, to make it obvious exactly what broke when there are failures.
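As an illustration of the one-assert idea, here is a small sketch using Python's standard `unittest` module. `parse_price` is a hypothetical function under test, not something from the discussion.

```python
import unittest

def parse_price(text):
    """Hypothetical function under test: '$1,234.50' -> 1234.5."""
    return float(text.replace("$", "").replace(",", ""))

class ParsePriceTest(unittest.TestCase):
    # One behaviour and one assert per test: a failing test name
    # says exactly which behaviour broke.
    def test_strips_dollar_sign(self):
        self.assertEqual(parse_price("$5"), 5.0)

    def test_strips_thousands_separator(self):
        self.assertEqual(parse_price("1,234"), 1234.0)

    def test_keeps_cents(self):
        self.assertEqual(parse_price("0.50"), 0.5)
```

Run it with `python -m unittest`. If all three checks lived in a single test, the first failing assert would hide whether the other two still pass.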
Q: How do you even get a handle on the state of unit tests (which are really a big combo of integration tests and unit tests)?
A: They created a metadata tool to inspect their tests (about 3K test classes) and try to categorize them based on test attributes, metadata, and annotations.
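A toy version of such a categorizing tool might look like the following. It assumes test classes carry a `tags` annotation and that touching an external resource is what makes a test "integration"; the decorator, tag names, and class names are all hypothetical and say nothing about how the real tool worked.

```python
def tags(*names):
    """Hypothetical annotation: attach a set of tag names to a test class."""
    def wrap(cls):
        cls.tags = frozenset(names)
        return cls
    return wrap

@tags("db")
class OrderRepositoryTest:
    pass

@tags()
class PriceMathTest:
    pass

@tags("http", "db")
class CheckoutFlowTest:
    pass

# Assumed rule: any tag naming an external resource means "integration"
EXTERNAL = {"db", "http", "filesystem"}

def categorize(test_classes):
    """Sort test classes into 'unit' and 'integration' based on their tags."""
    report = {"unit": [], "integration": []}
    for cls in test_classes:
        kind = "integration" if getattr(cls, "tags", frozenset()) & EXTERNAL else "unit"
        report[kind].append(cls.__name__)
    return report

print(categorize([OrderRepositoryTest, PriceMathTest, CheckoutFlowTest]))
```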
To help avoid breaking things across applications when selecting “important tests to run”: if code/tests are coupled across teams, then those teams also need to share the tests that are run, so they know when they’re breaking each other.
As we refactor, we can start moving those tests around to try to remove dependencies, but this is expensive.
How do you know the quality of your tests?
You can use mutation testing tools such as PIT.
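To show the idea behind mutation testing (this is a hand-rolled illustration, not PIT, which is a Java tool that generates mutants automatically): make a small change to the code and check whether any test notices. The function and both "suites" below are invented.

```python
def is_adult(age):
    return age >= 18

def is_adult_mutant(age):
    # Mutation: ">=" changed to ">", the kind of small edit a mutation tool makes
    return age > 18

def weak_suite(fn):
    """Checks only values far from the boundary."""
    return fn(30) is True and fn(5) is False

def strong_suite(fn):
    """Adds the boundary case, which is what distinguishes the mutant."""
    return weak_suite(fn) and fn(18) is True

print(weak_suite(is_adult), weak_suite(is_adult_mutant))      # True True: the mutant survives
print(strong_suite(is_adult), strong_suite(is_adult_mutant))  # True False: the mutant is killed
```

A surviving mutant means the suite cannot tell correct code from subtly broken code, which is a sharper quality signal than line coverage alone.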
How effective are your tests?
Examine how much coverage each test actually provides. Can we replace it with cheaper/faster unit tests?
Types of Testing
git bisect is helpful for tracking down things like performance regressions, where the test suite is too expensive to run on every commit.
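The reason bisecting is cheap is that it is a binary search over history. The sketch below shows that search on a made-up list of commits with a stand-in for the expensive check; it illustrates the principle and is not how git implements it.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad() is True.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1   # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]

# Made-up history of 100 commits; the regression is assumed to land in c73
commits = [f"c{i}" for i in range(1, 101)]
runs = []

def is_bad(commit):
    runs.append(commit)            # stands in for one expensive benchmark run
    return int(commit[1:]) >= 73

print(first_bad(commits, is_bad), len(runs))
```

The expensive check runs only a handful of times instead of once per commit. With git itself, the equivalent workflow is `git bisect start`, marking one `bad` and one `good` commit, and optionally `git bisect run <script>` to automate the check.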
Q: What do you mean by acceptance vs. integration tests?
A: Acceptance vs. smoke vs. regression is a business decision, based on what’s acceptable.
If you’re working in Auth, is it acceptable for you to make a change and not run it against any of the applications using it? If it’s not, you’ll probably have an acceptance test.
If those other applications aren’t modular enough, probably worth writing a UI test for that functionality.
Are you going to break everybody else?
Q: Is deciding which kind of test it is something that happens in collaboration with Product?
A: It’s a developer productivity call, not a Product call, in this case. PO input/decision comes later, in smoke testing.
Curious about test granularity and how to choose what types of automated tests to write for micro-frontends. Jest has “snapshot testing” which we’ve found to result in hard-to-update tests in cases where different snapshots overlap each other.
In writing this, I think it comes down to the “one assertion” principle again - your snapshot test should be snapshotting one thing, not itself and the entire DOM tree below it. Doing the latter is likely to result in changes in one code area affecting many otherwise unrelated tests.
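The "snapshot one thing" idea can be sketched without Jest. Below, the parent's snapshot is taken with its children stubbed out, so a change to the child's markup only touches the child's own snapshot. The render functions and snapshot strings are hypothetical, and this is plain Python standing in for a component framework.

```python
def render_button(label):
    return f"<button>{label}</button>"

def render_toolbar(render_child=render_button):
    # The child renderer is injectable so tests can swap in a stub
    return f"<nav>{render_child('Save')}{render_child('Undo')}</nav>"

def stub_button(label):
    """Placeholder that records only which child was used and with what input."""
    return f"<Button label={label!r}/>"

# Stored snapshots (hypothetical): each one covers a single component
SNAPSHOT_TOOLBAR = "<nav><Button label='Save'/><Button label='Undo'/></nav>"
SNAPSHOT_BUTTON = "<button>Save</button>"

assert render_toolbar(render_child=stub_button) == SNAPSHOT_TOOLBAR
assert render_button("Save") == SNAPSHOT_BUTTON
```

If the button later gains a CSS class, only `SNAPSHOT_BUTTON` needs updating; a deep snapshot of the toolbar would have changed too.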
Shweta said she didn’t have much guidance here. She suggested trying to maintain 80-90% unit test coverage, which felt right to her.
Q: How is it different without dedicated testers?
A: It’s all about the tasks. They came up with checklists and guidelines; sometimes people would rotate through the QA role, but often it just came down to tasks. For example, one person can write the acceptance tests while someone else writes the feature. The idea is to identify the work; who does it is not as important.