This is one of the most important lessons I try to teach less experienced developers. Besides making sure that your code works now, it's essential to make sure that developers can fix defects in the future. The first step to achieving this is to write tests. Preferably, you write the tests before you write the code. But even if you have done that, your tests might still be inaccessible to other developers or make it hard for them to figure out what broke. Here's my list of properties that take tests from good to great.
TL;DR
If you don't want to read the whole article, here is the list. Each entry is explained in detail below.
- Each test should test one thing and one thing only
- The test should do this with precisely the data it needs, nothing more
- There should be no connection between tests
- When a test fails, the error message should provide enough context to find the problem
Mixed concerns in tests
A concern in software engineering is the *what* in "What does this code do?". If your code reads from the command line and writes to a database, you are handling at least two concerns. When you test your code, it's important to test one concern at a time. That way, when a test fails, you get a much clearer picture of which aspect of your code stopped working. If your tests handle two or more concerns simultaneously, you have to do extra work before you can even start looking for the defect. You need to:
- figure out the number of concerns the test handles,
- how they relate to each other, and
- which one causes the trouble.
All this extra work takes time, and depending on the defect, that time might be critical.
I follow a simple rule of thumb. Whenever I write the word "and" in a test description, I write two tests instead. Let's look at an example.
describe("Database manager", () => {
it("should support reading and writing to the database")
})
This might be a great test that fails whenever something is wrong with the database connection. But could you tell which part has the problem? It could be the part of the application that reads the data, but it could also be the part that writes it. Even worse, it could be both. We can get out of this situation by splitting the test in two.
describe("Database manager", () => {
it("should support reading from the database.")
it("should support writing to the database.")
})
Now when the first test fails, we know something is wrong with the code that reads data. Likewise, when the second test fails, we know that the code that writes to the database has a flaw.
For me, not combining tests is hardest when I'm starting on a new feature and all the different use cases I need to test pop into my head at once. In these situations, I use to-do tests (I mainly work with Jest, which supports them via `it.todo`). To-dos help me track what I still need to implement without bloating the tests I have already written.
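Here is a minimal sketch of how that can look (the extra "connection errors" use case is made up for illustration):

```js
describe("Database manager", () => {
  it("should support reading from the database.", () => {
    // already implemented
  })

  // Jest lists these as "todo" in the test summary until
  // they receive a real implementation.
  it.todo("should support writing to the database.")
  it.todo("should report connection errors.")
})
```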
Extraneous data in tests
I like my tests best when they are concise. For me, this means that every line of code inside a test has a purpose. For instance, imagine a test like the following.
it("should communicate the value when a user changes the input.", () => {
const onChange = jest.fn()
const onKeyDown = jest.fn()
const value = "test value"
render(
<Component value="initialValue" onChange={onChange} onKeyDown={onKeyDown} />
)
fireEvent.click(screen.getByRole("textbox"))
fireEvent.change(screen.getByRole("textbox", { target: { value } }))
fireEvent.blur(screen.getByRole("textbox"))
expect(onChange).toHaveBeenCalledWith(value)
})
Now, answer the following questions:
- Why do we need to set an `onKeyDown` handler?
- Is it essential to first click and also blur the `input`?
- Does this component require an initial `value` to work correctly?
You probably can't answer any of these questions with "yes" or "no," at least not without looking into other tests or into the code under test.
If your test needs to perform some non-trivial actions, you might want to extract them into another function with a descriptive name.
For instance, if clicking and blurring are important to change the input, then you could create a `simulateChange` helper.
```jsx
function simulateChange(value) {
  fireEvent.click(screen.getByRole("textbox"))
  fireEvent.change(screen.getByRole("textbox"), { target: { value } })
  fireEvent.blur(screen.getByRole("textbox"))
}
```
This makes the test easier to read and clarifies that a change consists of multiple steps.
it("should communicate the value when a user changes the input.", () => {
const onChange = jest.fn()
const onKeyDown = jest.fn()
const value = "test value"
render(
<Component value="initialValue" onChange={onChange} onKeyDown={onKeyDown} />
)
simulateChange(value)
expect(onChange).toHaveBeenCalledWith(value)
})
We still have parts of our test that are not self-explanatory. For instance, the `onKeyDown` handler is a mock function, but there is no assertion for it. If we need to assign the `value` prop and the `onKeyDown` handler for the test to work, we should add a comment that says so. But if neither is essential for the test, then we can remove them.
it("should communicate the value when a user changes the input.", () => {
const onChange = jest.fn()
const value = "test value"
render(<Component onChange={onChange} />)
simulateChange(value)
expect(onChange).toHaveBeenCalledWith(value)
})
We have now reduced the test to its essentials and made sure that future readers will have less trouble understanding it.
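If several specs share the same baseline setup, a small setup function can keep each of them minimal without hiding what matters. Here is a sketch; the `renderComponent` helper and its default props are hypothetical, not part of the original tests:

```jsx
// Hypothetical helper: fills in required props with fresh mocks so each
// spec only spells out the data it actually cares about.
function renderComponent(overrides = {}) {
  const props = { onChange: jest.fn(), ...overrides }
  render(<Component {...props} />)
  return props
}

it("should communicate the value when a user changes the input.", () => {
  const value = "test value"
  const { onChange } = renderComponent()
  simulateChange(value)
  expect(onChange).toHaveBeenCalledWith(value)
})
```

Because the helper creates a fresh mock on every call, it avoids the shared-state trap described in the next section.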
Bloated test hooks
Many developers know that interdependent tests aren't desirable: when you change one test, another one might fail. "Bloated test hooks" are my special version of this. Tests become interdependent when one test relies on another test having run before it. The same applies when tests rely on a shared setup.
https://twitter.com/mfeathers/status/1281384915842371590
You can run into this situation in the React world when you have a common `render` mechanism that's shared between tests. Let's look at the following example of a test that ensures that an `onClick` handler is called when a user clicks a button.
```jsx
let onClick = jest.fn()

beforeEach(() => {
  render(<Button onClick={onClick} />)
})

it("should communicate when the button was clicked", () => {
  fireEvent.click(screen.getByRole("button"))
  expect(onClick).toHaveBeenCalled()
})
```
This test runs fine. The developer might have chosen to put the `onClick` handler outside the scope of the test because she wanted to avoid extraneous data in the test itself. I observe this behavior when developers try to keep their tests DRY. But we need to free ourselves from being strictly DRY in test cases. Some repetition is good when it's needed for the individual use case.
Imagine we wanted to add another use case that asserts that disabled buttons cannot be clicked.
it("should not be possible to click disabled buttons.", () => {
render(<Button disabled onClick={onClick} />)
fireEvent.click(screen.getByRole("button"))
expect(onClick).not.toHaveBeenCalled()
})
What do you think will happen? Will the test fail or succeed? We can't be sure. If this test happens to run before the other one, it will probably succeed. But if it runs after the first one, it will break, because the shared mock still holds the call recorded during the first test. We've introduced interdependence between the tests by extracting the `onClick` handler mock.
To resolve the connection between the two tests, we need to move everything that is important for a test case into the test itself. In both tests, we assert whether the `onClick` handler was called. Even though it seems like we're repeating ourselves, it's much better to move that setup into each test case to make it evident that it's important for that particular test to achieve its goal.
it("should communicate when the button was clicked", () => {
const onClick = jest.fn()
render(<Button onClick={onClick} />)
fireEvent.click(screen.getByRole("button"))
expect(onClick).toHaveBeenCalled()
})
it("should not be possible to click disabled buttons.", () => {
const onClick = jest.fn()
render(<Button disabled onClick={onClick} />)
fireEvent.click(screen.getByRole("button"))
expect(onClick).not.toHaveBeenCalled()
})
With this change, every test case is slightly larger but also encapsulates all the code it needs. We could now also move the rendering part into each test. But some developers prefer to keep rendering the component outside of the specs. Go with what feels best to you, as long as it doesn't hinder you from writing well-encapsulated specs.
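If you do keep mocks in a shared scope, Jest can at least reset their recorded calls between tests. A minimal sketch:

```js
// Resets the recorded calls of every jest.fn() mock before each test,
// so a call from one test can't leak into the next.
beforeEach(() => {
  jest.clearAllMocks()
})
```

The same behavior can be enabled globally with the `clearMocks: true` option in your Jest configuration.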
Improper use of assertions
When you are not working with TDD, this is the easiest one to get wrong without noticing it. It all comes down to the fact that we can express 99% of assertions as an equality check. Let's look at some examples.
```js
expect(user.name).toEqual("John")

expect(screen.getByRole("button").disabled).toEqual(true)

expect(error.indexOf("A custom error message")).not.toEqual(-1)
```
None of the above assertions are incorrect. They might also represent the mental model of the developer when she wrote the spec. What's the problem then? Let's look at what you might see in your console when these tests fail.
```js
expect(user.name).toEqual("John")
// > Expected "Jane" to equal "John"

expect(screen.getByRole("button").disabled).toEqual(true)
// > Expected "false" to equal "true"

expect(error.indexOf("A custom error message")).not.toEqual(-1)
// > Expected "-1" not to equal "-1"
```
When I look at that output, I definitely need to look at the respective test to figure out what's wrong. But I'm not particularly eager to do that. I'm probably working on something specific, and one of my changes made that test fail. Wouldn't it be great if the test failure gave me more detail? Then I might spot my mistake without having to switch context and read the code of the failing test.
The good news is that every testing framework I know offers more specific assertions. We "just" need to use them. I would rewrite the tests as follows (using Jest and `jest-dom` assertions).
```js
expect(user).toHaveProperty("name", "John")
// > Expected property "name" to have value "John", but got "Jane"

expect(screen.getByRole("button")).toBeDisabled()
// > Expected prop "disabled" to be "true", but got "false"

expect(error).toContain("A custom error message")
// > Expected "Generic error" to contain "A custom error message", but it didn't
```
The advantage of this approach is that failing tests now give you some context about why they fail. For instance, you now know that when `"Jane"` did not equal `"John"`, that had something to do with the `name` property. Or that when `false` wasn't `true`, this was about the `disabled` prop of a component. Even though these are small pieces of information, they might save you a lot of time over the course of your career.
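One practical note: matchers like `toBeDisabled` don't ship with Jest itself; they come from `@testing-library/jest-dom`, which you register once in a setup file, for example:

```js
// jest.setup.js — extends Jest's expect with the jest-dom matchers
// (toBeDisabled, toHaveTextContent, and friends) for all test files.
import "@testing-library/jest-dom"
```

Point Jest at this file via the `setupFilesAfterEnv` option in your Jest configuration.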