“We don’t need this E2E test if all teams have their pipelines green” – hearing this made me uneasy and slightly annoyed. I went on a tiny rant about automation, checks, tests, integrations, and how a green pipeline may not mean the product is fine. What do I mean by that? I believe we need to make sure that our test automation is correct, extensive, and meaningful to give us a good foundation for product quality.

With the arrival of DevOps, many companies started adopting Continuous Integration, Continuous Delivery, and Continuous Deployment principles: there are green/red pipelines, quicker releases, faster feedback… To make sure we build quality in, more and more teams are learning the advantage of creating checks for their code (I am intentionally avoiding the word tests here, even though many of the specific definitions include tests such as unit tests, integration tests, contract tests, etc.). If there are enough automated checks, we have a better safety net that prevents major issues and allows us to release to production faster.

With this comes a lot of trust in those checks, though. Many people tend to believe checks are correct simply because they exist, and that’s a danger zone.

What if the check for a certain thing: a) does not even exist – the deployment will still be green; or b) exists, but does not check what it should (for example, it verifies a wrong assumption and merely confirms what was misunderstood)?

Automation checks should be meaningful

Checks should be created correctly, not just for the sake of having them. As a result, a healthy test pyramid has various levels of checks – not only unit tests, but E2E tests as well. Their count may be much smaller, but verifying a user journey can be extremely beneficial – this approach takes the user’s perspective and may reveal issues that were not covered at lower levels.
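
To illustrate what such a journey-level check might look like, here is a minimal sketch using Selenium in Python; the URL, element IDs, and credentials are hypothetical and only stand in for a real login flow.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By

def test_user_can_log_in_and_see_dashboard():
    # E2E check of one user journey: open the login page, sign in, land on the dashboard.
    driver = webdriver.Chrome()
    try:
        driver.get("https://shop.example.com/login")          # hypothetical URL
        driver.find_element(By.ID, "email").send_keys("user@example.com")
        driver.find_element(By.ID, "password").send_keys("secret")
        driver.find_element(By.ID, "submit").click()
        assert "Dashboard" in driver.title                     # the journey ends where the user expects
    finally:
        driver.quit()
```

A handful of checks like this at the top of the pyramid can catch wiring and integration problems that hundreds of unit tests will never see.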

Question the validity of checks

You could very easily write wrong unit tests. Imagine that for some reason you have a strong belief that 2 + 2 should return 5, so you implement an addition function which yields exactly what you think is correct, and then you write a unit test that verifies it and passes. Tests are green, the pipeline screams yay, but is it correct? Not at all. Only the human judgement behind the checks can determine whether they make sense and are correct. Check out this nice article on the correctness of checks with this example and more.
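
Here is that scenario as a minimal Python sketch (the function and test names are made up for illustration): the implementation and the unit test share the same wrong assumption, so the pipeline stays green.

```python
# add.py – implementation written under the wrong belief that 2 + 2 should be 5
def add(a, b):
    if a == 2 and b == 2:
        return 5  # the author "fixes" the result to match their misunderstanding
    return a + b


# test_add.py – the check encodes the very same assumption, so it happily passes
def test_add_two_and_two():
    assert add(2, 2) == 5  # green in the pipeline, wrong in reality
```

The check is perfectly consistent with the code, which is exactly why it proves nothing about correctness.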

Validity is a very common problem I notice: sometimes the product does not work as expected, yet the checks created during implementation pass. The reason they pass is usually not as obvious as 2 + 2 equaling 5 – sometimes the mocks used in automation are silently misleading.
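
As a hypothetical Python example (the payment gateway, its method, and the field names are invented for illustration), a mock can keep a test green long after the real dependency has changed its contract:

```python
from unittest.mock import Mock

# Production code expects the payment gateway to return a dict with a "status" key.
def is_payment_successful(gateway, order_id):
    response = gateway.charge(order_id)
    return response.get("status") == "ok"

def test_payment_success_with_mock():
    # The mock still answers with the old contract, even if the real gateway
    # meanwhile returns something entirely different.
    gateway = Mock()
    gateway.charge.return_value = {"status": "ok"}
    assert is_payment_successful(gateway, "order-42")  # stays green regardless of reality
```

The test only proves the code agrees with the mock, not with the real gateway – a contract or integration check at a higher level is what would expose the drift.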

Observe the right level for checks

If you can write a unit test, is that big Selenium suite checking exactly the same functionality really necessary? There may be cases where it is – when the product is very UI-heavy – but in most cases it is worth questioning whether the test automation we are doing is being done smartly, rather than just being done in order to have something. Questioning the levels of checks can be a good start.
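
For instance, a pure validation rule does not need a browser at all; a sketch like the one below (the rule and names are hypothetical) covers it in milliseconds at the unit level, leaving the UI suite for the flows that genuinely need the UI.

```python
import re

# Hypothetical email validation rule: checking it through the UI with Selenium
# would work, but a unit test verifies the same behaviour far more cheaply.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def is_valid_email(value: str) -> bool:
    return bool(EMAIL_PATTERN.match(value))

def test_rejects_email_without_domain():
    assert not is_valid_email("user@")

def test_accepts_simple_email():
    assert is_valid_email("user@example.com")
```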

Aim for a healthy amount of checks

It is easy to make the pipeline green if checks are missing – if you never write a test, how can it fail? This reminds me of a meme we once printed for the team I was in:

Tests can't fail if there are no tests

On the other hand, we may also over-automate, so we have to balance our checks. How much should we automate? I really like Alan Page’s stock automation phrase (his article which introduced it): “You should automate 100% of the tests that should be automated”.

So, if I had to summarize my thoughts, I’d say:

Instead of just looking at whether the pipeline is green, the implemented test automation should be examined too: its meaningfulness, its correctness, and the balance between test levels and the amount of checks.

Assumptions are a breaking force – every team having a green deployment tells us nothing about their quality apart from the fact that their written automated checks passed. It does not assure that those checks are correct, that they make sense in general, or that the coverage is good.