Engineering

Testing Database Interactions with Jest

Akshay Nathan
April 19, 2020

Testing Database Interactions with Jest

Jest has quickly become one of the most popular Javascript testing libraries. While Jest may be mostly used in the context of frontend applications, at walrus.ai we use Jest for testing our backend Node.js services as well.

Jest aims to make testing 'delightful', and a large component of that delight comes from speed. Jest by default runs concurrently with worker processes, a pattern that encourages and even requires test isolation. While this is relatively simple to accomplish for frontend code, there's a shared mutable state elephant in the room for backend code: the database.

Why test database interactions?

For unit tests, we generally follow best practice of mocking any interactions that are outside the unit. Consider the following function:

async function changeUserName(db, userId, username) {
  const userRepository = db.getRepository(User);
  const updated = await userRepository.updateUserName(userId, username);

  return updated;
}

This function takes a handle to database connection, the userId, and a new username, and updates the username in the database. We abstract away the underlying SQL necessary to make the database update with the Repository pattern. This allows us to test this function rather easily.

describe('changeUserName', () => {
  it('should update username in db', async () => {
    const db = { getRepository: jest.fn() };
    const repository = { updateUserName: jest.fn() };

    db.getRepository.mockReturnValue(repository);
    repository.updateUserName.mockReturnValue(Promise.resolve('updated'));

    const result = await changeUserName(db, '1', 'username');

    expect(result).toEqual('updated');
    expect(repository.updateUserName).toHaveBeenCalledTimes(1);
    expect(repository.updateUserName).toHaveBeenCalledWith('1', 'username');
  });
});

However, what if we want to test our actual Repository? The code for the repository probably looks something like this:

class UserRepository {
  ...
  public async update(id, username) {
	  await this.db.sql(`
	    UPDATE users SET username = :username WHERE id = :id
    `, { id, username });
  }
}

If we wanted to test method, we could obviously mock the db connection and assert that .sql is called with the expected parameters. But what if this SQL is invalid, or more probably, what if the SQL is valid but is doing the wrong thing?

Presumably, at some point, we'll want to test the actual interaction with the database. We won't get into what we actually call these tests (there's probably 1000s of internet discussions on whether we've crossed the line from unit tests to integration tests by involving a real database), we'll simply cover how to do it safely and concurrently with Jest.

Setting up the database for Jest

As we've discussed, Jest by default runs tests concurrently — this makes sharing a database problematic. Tests that are running in parallel will clobber each other's database state, causing spurious failures and flakes.

The simplest option to overcome this limitation is to run Jest with the --runInBand option. This forces Jest to only use one process to run all your tests. However, this probably will make your test suite far slower. At walrus.ai this took our test suite from 10s of seconds to a few minutes, and simply wasn't tenable for our CI/CD processes of constant deployments.

Luckily, parallel testing with a database is a pretty solved problem. We can simply spin up a database for each worker process we're using. If all tests running against a particular database are run serially, then we don't have to worry about parallel processes mutating database state.

The easiest solution for this would be something like the following. We could spin up a database before each Jest worker, and shut it down after. Since all tests within a worker run serially, each worker could operate on its individual database safely.

beforeWorker(async () => {
  db = await createDatabase(`db_${process.env.JEST_WORKER_ID}`);
});

// All tests run serially here.

afterWorker(async () => {
  await destroyDatabase(db);
});

Unfortunately, while Jest exposes the JEST_WORKER_ID environment variable to distinguish between workers, it doesn't expose any simple way of hooking in per-worker setup and teardown methods.

This means that we can't dynamically spin up and tear down databases for each worker. We can, however, do the next best thing, using a static number of Jest workers.

First, we'll need a test setup script that prepares our multiple databases.

Note: The following code examples use Typeorm, but the code could easily be extended for any other database interaction library such as Sequelize, Massive.js, Knex etc.

(async () => {
  const connection = await createConnection({
    type: 'postgres',
    username: process.env.DATABASE_USER,
    password: process.env.DATABASE_PASSWORD,
    database: process.env.DATABASE_MASTER',
    host: process.env.DATABASE_HOST,
    port: 5432,
  });
  const databaseName = `walrus_test_template`;
  const workers = parseInt(process.env.JEST_WORKERS || '1');

  await connection.query(`DROP DATABASE IF EXISTS ${databaseName}`);
  await connection.query(`CREATE DATABASE ${databaseName}`);

  const templateDBConnection = await createConnection({
    name: 'templateConnection',
    type: 'postgres',
    username: process.env.DATABASE_USER,
    password: process.env.DATABASE_PASSWORD,
    database: 'walrus_test_template',
    host: process.env.DATABASE_HOST,
    migrations: ['src/migrations/*.ts'],
    port: 5432,
  });

  await templateDBConnection.runMigrations();
  await templateDBConnection.close();

  for (let i = 1; i <= workers; i++) {
    const workerDatabaseName = `walrus_test_${i}`;

    await connection.query(`DROP DATABASE IF EXISTS ${workerDatabaseName};`);
    await connection.query(`CREATE DATABASE ${workerDatabaseName} TEMPLATE ${databaseName};`);
  }

  await connection.close();
})();

With this script, we create a connection to our database, Postgres in this instance. Then, for each worker that we are using (set statically in the JEST_WORKERS environment variable, we initialize a new database.

Because we're using Postgres, we can use a handy feature called Template Databases. This makes new database creation cheap, and allows us to only run our migrations once. At a high level, we create one template database, run our migrations once against the template database, and then quickly copy over the database for each Jest worker.

Now, we simply have to connect to the correct database in all of our tests. With the JEST_WORKER_ID environment variable, this is trivial:

beforeAll(async () => {
  connection = await createConnection({
    type: 'postgres',
    host: process.env.DATABASE_HOST,
    port: 5432,
    username: process.env.DATABASE_USER,
    password: process.env.DATABASE_PASSWORD,
    database: `walrus_test_${process.env.JEST_WORKER_ID}`,
    logging: false,
    entities,
    namingStrategy: new SnakeNamingStrategy(),
  });
});

afterAll(async () => {
  connection.close();
});

Now all of our workers will use individual databases, allowing us to run tests in parallel.

Cleaning up between tests

While parallel tests are no longer problematic, we still have a problem with serial tests. Consider the following example, again of our contrived UserRepository:

describe('UserRepository', () => {
	it('should create user', () => {
    await repository.createUser('username');

    expect(await repository.countUsers()).toEqual(1); 
  });

	it('should delete user', () => {
    const userId = await repository.createUser('username');
    await repository.deleteUser(userId);

    expect(await repository.countUsers()).toEqual(0); 
  });
});

Notice anything wrong here? The second test will fail. While we're setting up different databases for each of our parallel workers, tests within the same file are run serially. This means the user created in the first test is still present in the second test, causing the test to fail.

We considered two approaches to solve this problem. The first approach is to wrap each test in a database transaction:

beforeEach(() => {
  db.startTransaction();
});

afterEach(() => {
  db.rollbackTransaction();
});

With this approach, any database updates made within the test are wrapped into the initialized transaction. When the test is finished, we simply rollback the transaction, discarding any of those updates. While this approach is fast and generally supported by all databases, it's not always the best for certain classes of integration-tests.

Sometimes, the behavior under test may actually be the transaction itself. For example, we may want to test that when an update fails, certain components of the update are preserved (committed), and others are rolled back. This logic would require us to manually start and stop transactions within our code, and wrapping the code in a parent transaction with this method would keep us from effectively testing rollbacks.

Another, simpler but slower approach, is to just clear out the database after before every test. While this may be slower, it's less likely to bite us later. We can do this in a simple beforeEach block.

beforeEach(async () => {
    const queryRunner = getConnection().createQueryRunner();

    await queryRunner.query(`
      DO
      $func$
      BEGIN
        EXECUTE (
          SELECT 'TRUNCATE TABLE ' || string_agg(oid::regclass::text, ', ') || ' CASCADE'
            FROM pg_class
            WHERE relkind = 'r'
            AND relnamespace = 'public'::regnamespace
        );
      END
      $func$;
    `);
    await queryRunner.release();
  });

The above code iterates through all our tables and clears them using the SQL TRUNCATE command. In the walrus.ai test suite, this occurs in the order of milliseconds, and is a worthwhile performance tradeoff for keeping our tests simple.

Conclusion

By clearing the database between tests, and using one test per worker, we can continue getting the delightful Jest experience for testing database connected backend applications.

While testing database interactions with Jest helps increase unit and integration test coverage without sacrificing test stability, running true end-to-end tests with browser automation tools like Selenium or Cypress can still be flaky and unstable.

Jest's goal is to make unit and integration testing 'delightful' — our goal at walrus.ai is to do the same for end-to-end testing. Engineering teams can write tests in plain english, and we take care of automating the tests, resolving flakes, and maintaining tests as their applications change. Our contrived example above showed how to test the database end of a username update, here's how simple the corresponding end-to-end test with walrus.ai could be:

walrus -u your-application.com -i \
  'Login' \
  'Change your username' \
  'Verify you receive a confirmation email at the new email address' \
  'Verify your username is changed'

Follow us on Twitter