In the previous blog, I covered the principles of Shift Left testing and shared an overview of the test pyramid. Now, I will turn to a curated set of blogs and articles that address various problems with each test type described in the test pyramid.
End to End testing - the most common yet most unstable test layer
Testing a product through its user interface is necessary because you built the product with an end user in mind and you must verify that the product meets those requirements. Given the clear definition and purpose, it is not surprising that end-to-end (E2E) tests are the most common form of testing for software products. Despite the clarity, E2E testing also suffers from a few tough problems.
A failing test does not directly benefit the user. A bug fix directly benefits the user.
To evaluate any testing strategy, you cannot just evaluate how it finds bugs. You also must evaluate how it enables developers to fix (and even prevent) bugs.
The above two quotes are from the timeless blog about E2E Tests. The authors are saying that developers need to be able to fix a bug and rerun the tests quickly to verify the fix. Unfortunately, the execution time of E2E tests and their dependency on a “full setup” often slow down the bug-fixing process.
The second problem with E2E tests is unreliability. User interfaces tend to change often and are susceptible to timing issues, which causes long-running E2E tests to fail randomly. The failure could also be due to the test code itself (which is sizeable in the case of E2E tests). Analysing failed test cases adds further delay and hence reduces the ability to fix the issue quickly.
The Dropbox engineering team has built a system called Athena which automatically “quarantines” flaky or unreliable tests. They continue to run the flaky tests in a separate pipeline so that the issues get analysed and fixed, but their main build pipeline continues to run with the most reliable test cases. A similar approach is also advocated by Google.
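To make the quarantine idea concrete, here is a minimal sketch in Python. It is not how Athena works internally. It only illustrates one simple rule: a test that both passes and fails on the same commit is treated as flaky and moved out of the main pipeline. The function names, the run-history format, and the test names are all assumptions made up for this example.

```python
from collections import defaultdict


def find_flaky_tests(runs):
    """Return the names of tests that both passed and failed on the same commit.

    `runs` is an iterable of (test_name, commit, passed) tuples. This record
    format is hypothetical, chosen only for illustration.
    """
    outcomes = defaultdict(set)
    for test_name, commit, passed in runs:
        outcomes[(test_name, commit)].add(passed)
    # Mixed outcomes on identical code point to flakiness, not to a real bug.
    return {name for (name, _commit), seen in outcomes.items() if seen == {True, False}}


def split_pipelines(all_tests, runs):
    """Split tests into a main pipeline list and a quarantine pipeline list."""
    flaky = find_flaky_tests(runs)
    main = [t for t in all_tests if t not in flaky]
    quarantine = [t for t in all_tests if t in flaky]
    return main, quarantine


# Hypothetical run history: every test was executed twice on commit "abc123".
RUNS = [
    ("test_login", "abc123", True),
    ("test_login", "abc123", True),
    ("test_checkout", "abc123", True),
    ("test_checkout", "abc123", False),  # mixed results: flaky
    ("test_search", "abc123", False),
    ("test_search", "abc123", False),    # consistent failure: a real bug
]

main, quarantine = split_pipelines(["test_login", "test_checkout", "test_search"], RUNS)
print("main pipeline:", main)
print("quarantine pipeline:", quarantine)
```

Note that a test which fails consistently stays in the main pipeline, because that failure is a signal worth blocking the build on. Only tests with mixed results on the same code are quarantined.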
The bottom line is to keep the suite of E2E test cases small and high quality. This might reduce the test coverage but it will also reduce the time taken to execute E2E tests. This drop in coverage needs to be addressed via Unit and Integration tests. Therefore, not only do the Unit and Integration tests need to provide better coverage and be more reliable, they also need to run faster than E2E tests.
Integration tests are the most ambiguous layer of the test pyramid. Aren’t E2E tests a form of integration tests? After all, in E2E tests, the entire system is integrated and then tested. Another challenge with integration tests is responsibility. It is well understood that developers are responsible for Unit tests and QA/SDET engineers are responsible for E2E tests. What about Integration tests? In this well-written blog, the author Colin But expands the Integration test layer into API tests and Component tests. He also identifies who owns these sub-layers.
In a distributed/microservice architecture, defining “components” and APIs is a very useful exercise. If you use Kubernetes, each Deployment could constitute a component. The key thing to note is that all tests are done via APIs. Microservices interact with each other using internal APIs, and another set of APIs is exposed to the end user. The external APIs can be part of E2E testing, but testing internal APIs is an important part of integration tests.
The final element of integration tests is “hermetic testing”. As described in this blog about “hermetic servers”, the hermetic testing approach creates a “sandbox” for a component so that it can be tested with its dependencies. The blog uses “isolation” as a concept for hermetic testing, but for me “self-contained” is a better term. Basically every time you need to run component tests, you recreate the system under test with all its dependencies, ideally in a single VM or server. This is best accomplished with mock servers and simulators.
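As a rough illustration of the “self-contained” idea, the sketch below starts a fake dependency on the local machine and runs a component check against it, using only the Python standard library. The inventory service, its `/stock/<sku>` endpoint, the response shape, and the `can_order` function are all hypothetical and exist only for this example.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeInventoryHandler(BaseHTTPRequestHandler):
    """Stand-in for a real inventory service: every SKU has 3 items in stock."""

    def do_GET(self):
        sku = self.path.rsplit("/", 1)[-1]
        body = json.dumps({"sku": sku, "in_stock": 3}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def can_order(inventory_url, sku, quantity):
    """Component under test: asks the inventory API whether an order can be filled."""
    # An empty ProxyHandler bypasses any configured proxy so traffic stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{inventory_url}/stock/{sku}", timeout=5) as resp:
        stock = json.load(resp)
    return stock["in_stock"] >= quantity


def run_hermetic_check():
    """Start the fake dependency, exercise the component, then tear everything down."""
    server = HTTPServer(("127.0.0.1", 0), FakeInventoryHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        base_url = f"http://127.0.0.1:{server.server_port}"
        return can_order(base_url, "widget", 2), can_order(base_url, "widget", 5)
    finally:
        server.shutdown()
        server.server_close()


fits, too_many = run_hermetic_check()
print("order of 2 accepted:", fits)
print("order of 5 accepted:", too_many)
```

Because the fake server is created and destroyed inside the test, the check needs no shared environment and can run on a developer laptop or a single CI machine.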
Unit Tests (UT)
Finally, let’s look at Unit tests. The responsibility for this layer lies with the developers, but it would be fair to say that most developers ignore unit tests. Unit tests by definition should not require any runtime system or dependency. They are expected to run very fast and to provide the most coverage. Yet developers find unit testing the most challenging. Here are a couple of reasons why Unit testing can be daunting:
- Tightly coupled code - Unit tests are used for testing the smallest unit of code. The most common unit is a function or method, but it boils down to each line of code. When code is tightly coupled, it is hard to write a UT for a function because the function might be using variables and objects initialized elsewhere.
- 3rd party libraries - The second challenge is that functions may be using 3rd-party libraries for making DB requests, updating a cache, etc. As mentioned earlier, a UT should not assume the existence of any external services. You can use “mock” toolkits to handle such situations, but that adds to the bloat of code in the UT.
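Both challenges can be softened by passing dependencies in rather than creating them inside the function. The sketch below is one possible approach, using Python's built-in `unittest` and `unittest.mock`. The `UserService` class and the `fetch_user`, `get`, and `set` methods on its collaborators are hypothetical names invented for this example, not the API of any real DB or cache library.

```python
import unittest
from unittest.mock import Mock


class UserService:
    """Receives its DB and cache clients from the caller instead of creating them."""

    def __init__(self, db, cache):
        self.db = db
        self.cache = cache

    def display_name(self, user_id):
        cached = self.cache.get(user_id)
        if cached is not None:
            return cached
        row = self.db.fetch_user(user_id)
        name = f"{row['first']} {row['last']}"
        self.cache.set(user_id, name)
        return name


class UserServiceTest(unittest.TestCase):
    def test_cache_miss_reads_db_and_fills_cache(self):
        db = Mock()
        db.fetch_user.return_value = {"first": "Ada", "last": "Lovelace"}
        cache = Mock()
        cache.get.return_value = None  # simulate a cache miss

        service = UserService(db, cache)

        self.assertEqual(service.display_name(7), "Ada Lovelace")
        db.fetch_user.assert_called_once_with(7)
        cache.set.assert_called_once_with(7, "Ada Lovelace")

    def test_cache_hit_skips_db(self):
        db = Mock()
        cache = Mock()
        cache.get.return_value = "Ada Lovelace"  # simulate a cache hit

        service = UserService(db, cache)

        self.assertEqual(service.display_name(7), "Ada Lovelace")
        db.fetch_user.assert_not_called()
```

Saved in a file such as `test_user_service.py` (a hypothetical name), these tests can be run with `python -m unittest`. No database or cache needs to be running, and the mock setup stays at a few lines per test because the dependencies are injected.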
These are genuine problems that need to be addressed but the bigger picture is that Unit testing is the closest layer to test-driven development (TDD). This practice allows you to write better code in order to make it easier to test. Like everything else, there are pros and cons. But there are ways to take a balanced approach. I am a firm believer that the right level of unit tests can provide much higher return on investment. In the next blog, I will share some of my recommendations for Unit tests.
In this blog, I shared interesting ideas and suggestions related to the 3 test tiers. This will allow you to get a better idea about the importance of each test tier and help you decide the right approach. In the final blog of this series, I will share my recommendations and ideas on how to progress on your Shift Left testing journey.