Let me tell you the story of one of the most valuable test suites I have ever written.

api

I was working on an application communicating with a web API. I had done similar things before. I had written plenty of methods making HTTP calls, processing the output and handling errors. But this was a different story. There wasn’t a single domain; there were a few. Moreover, in order to perform one operation, I needed to send many HTTP requests. They were strictly correlated: each request contained some data from the previous response. And did I mention a single business process was distributed among many processes of a task queue? It already seemed hard, but the worst was yet to come. During the development phase I wrote many unit tests. I used the mocking tool provided by the test framework to create test doubles of objects communicating with the external world. I checked whether the correct methods were called, what the method calls returned when the API gave a particular response, and so on.

integration collapsed

I did it. It worked perfectly on my dev machine. Great! But it miserably failed in production. Great… Most of the time the user saw nothing but a message saying that an error occurred while trying to communicate with the external system. How was that possible? I wrote automated tests. I did some manual testing as well. Never saw anything like this.

race condition

A quick investigation revealed there was a race condition problem. The API was responding with some IDs. Right after receiving those IDs, I was sending them back in order to ask for more details. It turned out that the production environment was fast enough to get a 404 Not Found response for resources which, according to the API, had just been created. There were also a few very similar problems. And just one solution: to repeat the request.
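Repeating the request can be sketched roughly like this. It is a minimal Python illustration, not the code from the original application: the function name, the number of attempts and the delay are all assumptions, and the HTTP call itself is passed in as a plain callable so the sketch stays independent of any particular client library.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, dict]  # (HTTP status code, decoded JSON body)


def get_with_retry(
    fetch: Callable[[], Response],
    attempts: int = 5,
    delay_seconds: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call fetch() until it stops returning 404 or the attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        status, body = fetch()
        if status != 404:
            return status, body
        if attempt < attempts:
            # The resource may simply not be visible yet; wait and ask again.
            sleep(delay_seconds)
    # Still 404 after the last attempt: hand the failure back to the caller.
    return status, body


# Stand-in for the real HTTP call: the resource "appears" on the third try.
replies = iter([(404, {}), (404, {}), (200, {"id": 123})])
result = get_with_retry(lambda: next(replies), sleep=lambda _: None)
print(result)
```

Passing `sleep` as a parameter is a deliberate choice: a test can replace it with a no-op and run instantly.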

how to test?

I got concerned. How was I supposed to test this? I already knew it would never occur on my machine. Pretty soon I realised that really good, isolated and automated tests were not my support. They were my only hope. There was simply no other way. I needed to somehow create a test double of the whole, stateful web API!

The HTTP client I was using was a piece of object-oriented software. I could easily use the mocking tool to create its test double which would always respond with some particular response to some particular request. But such a simple, stateless behavior was not even close to how the actual service behaved. I seriously needed a test double which would work (and fail) like the real API.

The very first idea I got was to write some imperative code that would mimic the stateful behavior. It didn’t feel correct. If I can make mistakes writing simple imperative code making some calls, how can I know that other imperative code responding to these calls is free of bugs? I faced this problem around the time when I got interested in the practical application of state machines in web applications (which a few years later led to the creation of Rosmaro). I recalled the State design pattern and implemented the interface of the HTTP client around the idea of a graph where every leaf was a response and every arrow was a request. It allowed me to mimic some pretty complex behavior in a way that was easy on the brain and closely reflected how the API behaved in production.
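To give a feel for the idea, here is a minimal Python sketch of a test double driven by such a graph. It is not the original implementation: the class name, the endpoint paths, the ID 42 and the state names are made up for illustration. It models the race condition described earlier, where the first request for a freshly created resource gets a 404 and the repeated one succeeds.

```python
from typing import Dict, Tuple

Request = Tuple[str, str]    # (HTTP method, path)
Response = Tuple[int, dict]  # (status code, JSON body)


class FakeApi:
    """Test double whose behavior is a graph: nodes are states, arrows are requests."""

    def __init__(self) -> None:
        self.state = "Started"
        # (current state, request) -> (response, next state)
        self.arrows: Dict[Tuple[str, Request], Tuple[Response, str]] = {
            ("Started", ("POST", "/items")): ((201, {"id": 42}), "Created"),
            # The freshly created item is not visible yet...
            ("Created", ("GET", "/items/42")): ((404, {}), "Visible"),
            # ...but repeating the request finds it.
            ("Visible", ("GET", "/items/42")): ((200, {"id": 42}), "Visible"),
        }

    def send(self, method: str, path: str) -> Response:
        arrow = self.arrows.get((self.state, (method, path)))
        if arrow is None:
            # No arrow leaves the current node for this request: fail loudly.
            raise LookupError(f"unexpected {method} {path} in state {self.state!r}")
        response, self.state = arrow
        return response


api = FakeApi()
print(api.send("POST", "/items"))
print(api.send("GET", "/items/42"))
print(api.send("GET", "/items/42"))
```

Because the whole behavior lives in one table, adding a new failure mode means adding an arrow rather than writing more branching code.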

For the purpose of this article, I’d like to focus on a simple yet stateful interaction with a web API. Because we’re talking about web applications, it’s natural to choose a ToDo App as the example. Our task is to add a ToDo and then mark it as done. There are two endpoints: one to add a ToDo and another one to mark a ToDo as done. After adding a ToDo we’re given its ID. We can then use it to mark it as done.

I solved the problem I faced by implementing my own mocking tool using the State design pattern. But I wouldn’t do it again. The only reason I did it was that I didn’t know about a great tool called WireMock. Let me quote the official documentation to explain what sets it apart from other mocking tools.

WireMock supports state via the notion of scenarios. A scenario is essentially a state machine whose states can be arbitrarily assigned.

It had me at behavior modeled as state machines.

Let’s run it as a standalone process and configure it using a bunch of JSON files in order to mimic the behavior shown in the picture below. This WireMock process will simply behave like an HTTP server. We can then configure our system under test to send requests to an address like localhost:8888 instead of example.com/api.

adding a todo

Here comes the first arrow connecting the initial node (which is always called Started) with the Added node:

{
  "scenarioName" : "Add and mark as done",
  "requiredScenarioState": "Started",
  "newScenarioState": "Added",
  "request" : {
    "url" : "/todos/add",
    "method" : "POST",
    "bodyPatterns" : [ {
      "equalToJson" : "{ \"todo\": \"make a test double\" }"
    } ]
  },
  "response" : {
    "status" : 200,
    "headers": {
      "Content-Type": "application/json"
    },
    "jsonBody" : {
      "id": 123
    }
  }
}

The request key tells us that we expect a request to add a ToDo. We will then get the response described under the response key. It contains the ID our ToDo was given. Following this arrow will lead us to the Added node. From there, the only thing our mock supports is marking the added ToDo as done. The arrow between Added and Marked as done is defined in the following way:

{
  "scenarioName" : "Add and mark as done",
  "requiredScenarioState": "Added",
  "newScenarioState": "Marked as done",
  "request" : {
    "url" : "/todos/markDone/123",
    "method" : "POST"
  },
  "response" : {
    "status" : 200
  }
}

The requiredScenarioState key is very important. It states that this is an arrow from the Added node. It means that even if we send a matching request while the current state is different (let’s say we’ve just started and we’re still in the Started node), the arrow is not going to be followed. It forces us to first add a ToDo and only then mark it as done. The ID within the request is also worth noticing. It’s the exact same ID the server responded with when we added a ToDo.

{
  "scenarioName" : "Add and mark as done",
  "requiredScenarioState": "Marked as done",
  "request" : {
    "url" : "/passed",
    "method" : "GET"
  },
  "response" : {
    "status" : 200
  }
}

The last node of this simple graph makes the server respond with the 200 HTTP code only if the ToDo has been successfully added and marked as done. This will allow us to verify that the code worked correctly. Let’s see how our mock API works.

We’re adding a ToDo and it’s given an ID.

$ curl \
    -H "Content-Type: application/json" \
    -X POST \
    -d '{"todo": "make a test double"}' \
    http://localhost:8888/todos/add
{"id":123}

Now we can mark it as done.

$ curl -i -X POST http://localhost:8888/todos/markDone/123
HTTP/1.1 200 OK

And finally we’re making sure the ToDo was marked as done.

$ curl -i http://localhost:8888/passed
HTTP/1.1 200 OK

record

If you are like me and you got worried that it may be tedious work to properly mimic all the real endpoints, let me tell you about another WireMock feature I find amazing. Instead of describing how the mock behaves in the first place, we can order WireMock to record the traffic between our machine and the real API. All we need to do is to fire up the browser, go to http://where_is_wiremock_listening/__admin/recorder, enter the address of the real API and press the Record button. Now we can make requests to localhost:8888 as if it were example.com/api. All the requests and responses are going to be saved as JSON files for us.

Of course it’s still the happy path. We could probably test it using the real API, even though it would be slower, less repeatable and dependent on the internet connection. But the real power of mocking APIs lies in the ability to test how our code behaves in situations which are quite hard or even impossible to reproduce, like getting some specific errors or experiencing long delays. In order to do it, we can simply tweak the graph representing the happy path a bit by putting some 500 Internal Server Errors here and there. That’s exactly what I did to make sure my application was handling situations where the 5th request out of 8 fails three times.
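Writing such a chain of failing stubs by hand gets repetitive, so one option is to generate them. The sketch below is only an illustration under a few assumptions: the helper name and the intermediate state names are invented, and it relies solely on the mapping keys already shown in this article (scenarioName, requiredScenarioState, newScenarioState, request, response). It replaces the single arrow from Added to Marked as done with three 500 responses followed by a 200.

```python
import json
from typing import List


def failing_then_ok(scenario: str, method: str, url: str,
                    start_state: str, end_state: str, failures: int) -> List[dict]:
    """Build stub mappings: `failures` 500 responses, then one 200 response."""
    # Intermediate state names are made up; any unique strings would do.
    states = [start_state] + [
        f"{start_state} (failed {n}x)" for n in range(1, failures + 1)
    ]
    mappings = []
    for i, state in enumerate(states):
        is_last = i == failures
        mappings.append({
            "scenarioName": scenario,
            "requiredScenarioState": state,
            # Each failure moves one node further; the success leaves the chain.
            "newScenarioState": end_state if is_last else states[i + 1],
            "request": {"url": url, "method": method},
            "response": {"status": 200 if is_last else 500},
        })
    return mappings


stubs = failing_then_ok(
    scenario="Add and mark as done",
    method="POST",
    url="/todos/markDone/123",
    start_state="Added",
    end_state="Marked as done",
    failures=3,
)
for stub in stubs:
    print(json.dumps(stub, indent=2))
```

Each printed object could then be saved as its own JSON file next to the other mappings, in place of the original Added to Marked as done stub.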

Another thing worth noticing is that with a mock of the API we can test code interacting with endpoints which don’t exist yet. Both the API client and the API server can be developed simultaneously, because the team responsible for the client doesn’t need to wait for the team working on the server to deliver a testing environment. They just need to agree on the API contract, which is necessary anyway.

WireMock has many other features, such as powerful request matchers, response templates and transformers, and custom recording settings. This post is by no means an attempt to describe them all. I just wanted to encourage you to apply state machines to HTTP testing, because as you already know, it worked wonders with my test suite. It was literally the only way I found to deliver a working solution in a reasonable amount of time. If you’re willing to learn more about this amazing tool, I recommend the official documentation of WireMock.

Happy testing to all of you!