Łukasz Makuch

How to prevent intermittent test failures

[Image: a park]

Imagine a warm summer afternoon. You're chilling out, enjoying a glass of refreshing lemonade. All you can hear is distant chatter and birdsong.

Suddenly, there's a buzzing sound! A fly passes right next to your head. Focus on that insect! Try to keep it in sight for as long as you can.

What was its trajectory? Did it fly straight from point A to point B? Of course not! No insect ever does! Observing it for as little as a couple of seconds was enough to see it darting all over the place.

Why does it do that? Does the fly have trouble flying? No, quite the opposite!

In fact, it exhibits extraordinary maneuverability! And it literally uses it to fight for its life! How? Random changes in direction make the fly harder to catch, effectively deceiving predators.

But what's a fly's gain is a bird's loss.

Coping with unpredictability isn't a challenge that only birds face. It also affects us, IT professionals. We've all seen automated tests fail, only to pass after nothing but a restart. Even when everything works fine on our machine, we can never know if it will work in the CI environment. Once we spot the problem, we may discover that the issue is somehow related to a statement like "wait 100 ms". How do we solve it? Changing "100 ms" to "200 ms" may fix it temporarily, but it doesn't guarantee the failure won't happen again. More importantly, it doesn't silence our inner voice telling us that this may not be the most professional thing to do.

In the end, we want our test suite to tell us when something is broken. We don't want the tests to be the thing that breaks. When they fail, our reaction should be "Oh no, the app broke!" and not "Oh no, the tests broke!".

Let's say that we have an app that fetches a list of items. Before the list is ready, the user sees a spinner. It takes some time for the list to load, and we can use that time to verify that the spinner meets the requirements. Is it displayed? Is some button disabled while the list is being loaded? Are the previous results hidden?

The idea of such a test is simple. In reality, however, it tends to be anything but trivial to build. One thing in particular may pose a challenge: timing. How can we make sure that the test case has enough time to perform all the necessary assertions before the data is loaded and the screen changes? A common workaround is to delay the response from the mocked service by some number of milliseconds and hope that it will be enough. Unfortunately, we can never know what the right delay is. 100 ms? 84 ms? 1 second, maybe? It depends on plenty of things we cannot control, such as the machine the tests run on and even the current load of that machine. A little slowdown of the test runner may be all it takes for the mocked service to respond and make the app transition to a different state before the test finishes making assertions about the loading state.
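
To see that race in code, here's a minimal sketch of the workaround, assuming the mocked service is a small Express app (the /todos path, the data, and the 100 ms figure are all just illustrative):

const express = require('express');

const app = express();

app.get('/todos', (req, res) => {
  // Hold the response for an arbitrary 100 ms and hope that every
  // loading-state assertion finishes before it fires. On a slow CI
  // machine, that hope is exactly what breaks.
  setTimeout(() => {
    res.json(['a', 'b', 'c']);
  }, 100);
});

app.listen(3000);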

A term to describe our problem was coined over half a century ago: a race condition. It happens when the correctness of a process depends on the order of events we cannot control. And much like betting on horse races, relying on tests that wait an arbitrary number of milliseconds is closer to gambling than to a profession.

Imagine how much easier our lives would be if the test and the mocked service communicated instead of competing with each other. There would be no race conditions if the test suite could just tell the mock: "Hey, I've finished all the assertions for now and I'm ready for you to give the app the data it asked for." We would see improvements in multiple areas, from the stability of individual test cases, through the performance of the whole pipeline, to our wellbeing as developers. Just think how comfortable it would be if no unexpected build failure ever delayed a deployment again!

I've got something that may give you relief. In Endpoint Imposter, you can assign what's called a release key to any mocked response.

{
  // Match requests to the /todos path.
  request: { path: '/todos' },
  // Hold the response until the 'todos' release key is used.
  releaseOn: 'todos',
  // The data the app will eventually receive.
  response: { json: ['a', 'b', 'c'] }
}

The mocked service won't give the app the items it asked for until the test gives it permission to do so by issuing a request to http://mock/admin/release?session=sessid&key=todos.
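
Here's a sketch of what the test side could look like. I'm assuming Jest with Puppeteer (the page global provided by jest-puppeteer) and Node's built-in fetch; the selectors, the app URL, and the session id are made up for the example:

test('the loading state meets the requirements', async () => {
  await page.goto('http://app/todos');

  // The mock is holding the /todos response, so we have all the
  // time in the world to inspect the loading state.
  expect(await page.$('.spinner')).not.toBeNull();
  expect(await page.$('.reload-button[disabled]')).not.toBeNull();
  expect(await page.$('.todo-item')).toBeNull();

  // All assertions done: tell the mock to release the response.
  await fetch('http://mock/admin/release?session=sessid&key=todos');

  // Only now may the app transition to the loaded state.
  await page.waitForSelector('.todo-item');
});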

That way you will always have enough time to make all the necessary assertions about the loading state before the screen changes.

It's that simple.

If there's one thing I'd like you to take away from this post, it's to let Endpoint Imposter make it easy for you to catch bugs.

From the author of this blog

  • howlong.app - a timesheet built for freelancers, not against them!