At GeeCon Prague 2015, I saw a great talk by J.B. Rainsberger about integrated tests. It was basically a live version of his essay Integrated Tests Are a Scam, which I had read earlier. But it was great to hear his reasoning live, in narrated form.

“I use the term integrated test to mean any test whose result (pass or fail) depends on the correctness of the implementation of more than one piece of non-trivial behavior.” (J.B. Rainsberger)
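
To make the definition concrete, here is a minimal sketch in Python. The functions and numbers are made up for illustration and do not come from the talk or the essay. The first test is "integrated" in the sense above because its result depends on two pieces of non-trivial behavior. The other two tests each depend on only one.

```python
# Hypothetical example: two pieces of non-trivial behavior.
def parse_price(text: str) -> float:
    """Turn a string such as 'EUR 1,200.00' into a number."""
    cleaned = text.replace("EUR", "").replace(",", "").strip()
    return float(cleaned)


def apply_discount(price: float, percent: float) -> float:
    """Reduce a price by a percentage, rounded to cents."""
    return round(price * (1 - percent / 100), 2)


def discounted_price(text: str, percent: float) -> float:
    """Combine both behaviors."""
    return apply_discount(parse_price(text), percent)


def test_discounted_price_integrated():
    # Integrated: fails if parse_price OR apply_discount is wrong,
    # and the failure alone does not say which one.
    assert discounted_price("EUR 1,200.00", 25) == 900.0


def test_parse_price_isolated():
    # Depends only on the parsing behavior.
    assert parse_price("EUR 1,200.00") == 1200.0


def test_apply_discount_isolated():
    # Depends only on the discount behavior.
    assert apply_discount(200.0, 25) == 150.0
```

When the integrated test goes red, you still have to find out which of the two functions is broken. When one of the isolated tests goes red, you already know.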

The Scam: Original Version

The gist of his talk (and his essay, which you should read) is: If you start to depend on integrated tests, you need to write more and more of them over time. Since they don’t put as much positive pressure on your design as unit tests would, your code quality will suffer, more defects will escape your net of tests, and you’ll need more integrated tests to reproduce them.

The driving force here is the positive design pressure caused by unit tests. We are missing this positive pressure with integrated tests, and this sends us into a tailspin.
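
Here is a small sketch of what that design pressure can look like in practice. The class and method names (`InvoiceCalculator`, `rate_for`) are hypothetical and only serve the illustration. Because I want to test the calculation without a real tax-rate source, I am pushed to pass the collaborator in from outside, which makes the dependency visible in the design.

```python
from unittest.mock import Mock


class InvoiceCalculator:
    """Hypothetical class; the names are made up for this sketch."""

    def __init__(self, tax_rates):
        # The collaborator is passed in. Wanting to test this class in
        # isolation is what pushes the design toward this explicit seam.
        self._tax_rates = tax_rates

    def gross(self, net: float, country: str) -> float:
        rate = self._tax_rates.rate_for(country)
        return round(net * (1 + rate), 2)


def test_gross_adds_tax_from_collaborator():
    # A mock stands in for the real tax-rate source.
    tax_rates = Mock()
    tax_rates.rate_for.return_value = 0.25
    calculator = InvoiceCalculator(tax_rates)

    assert calculator.gross(100.0, "SE") == 125.0
    # The test also pins down how the collaborator is used.
    tax_rates.rate_for.assert_called_once_with("SE")
```

An integrated test would happily let `InvoiceCalculator` create its own database connection or HTTP client internally. It would never complain about that design, and that is exactly the missing pressure.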

The Problem with that Version

Now, some people seem to have a hard time understanding this narrative. After the talk, I heard things like this from several people:

“Integrated tests might not be the best possible solution, but they are not so bad. Better have some tests than none.”

“This sounds nice in theory, but I don’t think his solution (unit tests with mocks) will work in the real world. It is not any better than having the integrated tests.”

“But we have to test if all components are working together, don’t we?”

Some Don't Feel the Pressure

I almost dismissed what they said as “some people always complain”. But then I started thinking. For me, the reasoning in the talk was absolutely plausible and the conclusion to use unit tests and mocks seemed just right. Why was this not the case for other people?

And I think I found a possible explanation…

Some people don’t feel the positive design pressure from their tests. They write tests, but don’t take too much care to write good unit tests. Often, they don’t refactor enough. Then, after a very small change, 20 tests break, and they complain that testing is waste. I have seen this scenario, and I did those things myself.

In other words: TDD is hard. You have to learn it, practice it, and take the time to do it right. But when you do it right, you can get great benefits from it.

Alternative Version of the Scam

Even if you do TDD like that, integrated tests are still a scam: They still lead you down this vicious circle where you need more of them the more you have.

  1. We already have some integrated tests
  2. They don't catch many errors (because you never have enough integrated tests)
  3. When an error occurs, we try to write a test
  4. Since we already have some integrated tests, and we do not refactor enough, some parts of our system are really hard to test
  5. So, let's just add some code to an existing test or write another integrated test, and then fix the bug
  6. We have less time to write unit tests or improve our design
  7. Go to 1
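
Step 2 says you never have enough integrated tests. A rough way to see why is to count code paths. The numbers below are hypothetical, and the model is a deliberate simplification: it assumes every path of one component can combine with every path of the others, and it ignores the extra tests you need to check that isolated components agree on their contracts.

```python
from math import prod

# Hypothetical numbers of code paths in three collaborating components.
paths_per_component = [4, 5, 3]

# Covering every combination end to end multiplies the counts...
integrated_tests_needed = prod(paths_per_component)

# ...while testing each component on its own only adds them.
isolated_tests_needed = sum(paths_per_component)

print(integrated_tests_needed)  # 60
print(isolated_tests_needed)    # 12
```

Add a fourth component with five paths and the first number grows to 300, while the second grows to 17. That gap is why the integrated suite always feels incomplete.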

I know that one can explain this cycle with “the positive design pressure is missing”. But it feels different for the people in the cycle, who are not “listening to their tests” anyway. Because if they were, they would not be in this cycle.

Second Vicious Circle

There is a second vicious circle that often seems to happen at the same time:

  1. We already have some integrated tests
  2. We don't want to cover a line of code with different tests, because we think this would be waste
  3. So we don't write unit tests, since we already covered the code with integrated tests ("There is already a test for this class, why do you write another one?")
  4. Oh shit, writing those integrated tests is hard
  5. Also, they never catch regressions (see above)
  6. So we write fewer of them, and make them bigger
  7. Bigger tests cover more lines -> Go to 1

Trust in Automated Testing Suffers

So, writing integrated tests is hard. And those tests seldom catch real regressions. I mean, they do, but they'll also often fail when we make a perfectly legitimate change. So, over time, people will start to consider automated testing a waste.

Also, integrated tests often become flaky. There are simply too many reasons why they might fail, so they'll fail "for no reason" from time to time. You'll hear your team members say: "Yeah, the nightly build is red, but whatever. This test simply fails once or twice a month." Over time, the team trusts the test suite less and less. You can easily get into a situation where a large percentage of the builds fail because of flaky tests, and nobody fixes them, because fixing those tests is hard, and they never catch any regressions anyway.
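
A quick back-of-the-envelope calculation shows how a few "harmless" flaky tests add up to a large share of red builds. This is only a sketch under two assumptions of mine: each flaky test fails in about 1 of 20 nightly builds (roughly "once or twice a month"), and the tests fail independently of each other.

```python
def chance_build_is_red(flaky_tests: int, failure_rate: float) -> float:
    """Probability that at least one flaky test fails in a single build,
    assuming the tests fail independently of each other."""
    return 1 - (1 - failure_rate) ** flaky_tests


# Assumed failure rate: about 1 in 20 nightly builds per flaky test.
for count in (1, 5, 20):
    print(count, round(chance_build_is_red(count, 0.05), 2))
```

Under these assumptions, one flaky test turns about 5% of the builds red, five of them about 23%, and twenty about 64%. At that point a red build no longer tells the team anything.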

Manual Testing

When trust in automated testing decreases, teams often rely more and more on manual testing. "Red-Green-Refactor" becomes (if we ever had it) "Change-Test-Debug". They'll have testers (on the team or in an external testing department) who will "validate" the results once the programmers are "finished". At best, they will automate some test cases "through the GUI", but often they will just click through the program. This manual testing and automating through the UI slows down all feedback cycles considerably. Later feedback means that fixing the problems we find becomes more expensive, because someone has to go back and change something that was supposed to be "finished".

Conclusion

You cannot write all the integrated tests you need to verify that your software works correctly under all circumstances. Even covering all the code paths is incredibly hard. So, if you rely too much on them, you will increase the likelihood that some defects will slip all your safety nets and "escape" to production. Integrated tests are self-replicating: When you have more, you'll need them even more. Then they often become flaky and trust in your test suite starts to suffer. But when you rely too much on manual testing as a result, you slow down your feedback cycles, making your development unresponsive and expensive. So: Be wary of integrated tests!