How to clear database in Spring Boot tests?

Nowadays using a production like database in unit¹ tests is a common practice. Calling a real database can increase our confidence that a tested code actually works. Having said that a database, by its very nature, brings external state into a test that will affect its behavior, hence we need to pay special attention to prepare the test execution. There are couple of ways to handle the database state in tests and I’m going to describe an approach I like most.

{: .center-image}

Problems with Spring Boot Transactional tests

Spring Boot offers many helpers to make testing application easier. Among many you can use a @DataJpaTest which by default will configure an in-memory embedded database. You can use a production type database in tests by adding @AutoConfigureTestDatabase(replace=Replace.NONE) like so:

@RunWith(SpringRunner.class)
@DataJpaTest
@AutoConfigureTestDatabase(replace=Replace.NONE)
public class ExampleRepositoryTests {

    // ...

}

The @DataJpaTest uses @Transactional under the hood. A test is wrapped inside a transaction that is rolled back at the end. This means that when using e.g. Hibernate one needs to pay special attention to how the tested code is written. As shown in the Java example below, a manual flush is indeed required:

@RunWith(SpringRunner.class)
@ContextConfiguration(classes = TestConfig.class)
@Transactional
public class HibernateUserRepositoryTests {
    ...
    @Test
    public void createUser() {
        // track initial state in test database:
        final int count = countRowsInTable("user");

        User user = new User(...);
        repository.save(user);

        // Manual flush is required to avoid false positive in test
        sessionFactory.getCurrentSession().flush();
        assertNumUsers(count + 1);
    }
}

Using @Transactional annotation on tests is certainly easy but I still don’t use it for the following reasons:

The production code is using transactions with different scope.
It is easy to forget about a flush and thus have false positive in test.
On failure and when debugging it is hard to see what values were actually saved in db.
It is much harder to write tests of production code that requires a transaction to be committed.
The test code needs to be more tightly coupled to production code and we all know that it hinders refactoring.

Cleaning database with SQL

In tests involving a database I reset its state before each test using plain old SQL. This makes the test code less dependent on how a transaction is scoped inside production code. Furthermore, one can easily review the values saved after a test failure. It turns out it is easy to write a JUnit @Rule or BeforeEachCallback that will remove all rows from all tables. Moreover, we can do so without hard coding table names which would increase maintenance cost.

Let’s start with defining a @Rule in Kotlin in that will be called before each test:

import org.junit.rules.ExternalResource
import org.springframework.stereotype.Component
import javax.sql.DataSource

@Component
class DatabaseCleanerRule(private val dataSource: DataSource) : ExternalResource() {

    override fun before() {
        if (databaseCleaner == null) {
            // Consider inspecting dataSource to check if we are connecting to test database
            databaseCleaner = DatabaseCleaner(dataSource::getConnection)
        }
        databaseCleaner!!.reset()
    }

    companion object {
        internal var databaseCleaner: DatabaseCleaner? = null
    }
}

Consider inspecting dataSource to check if we are about to connect to test database and not one used for development. It is very easy to use incorrect Spring Profile and wipe out your development data. Ask me how I know?

We can use the DatabaseCleanerRule in a spring enabled test as any other JUnit rule e.g. @Rule @Inject lateinit var cleanerRule: DatabaseCleanerRule.

Notice that we’ve delegated the actual important work to DatabaseCleaner class defined in Kotlin below.

import com.practi.util.iterator
import org.slf4j.LoggerFactory
import java.sql.Connection
import java.sql.PreparedStatement
import java.sql.SQLException

class DatabaseCleaner(private val connectionProvider: () -> Connection) {
    private val tablesToExclude = mutableSetOf<String>()
    private var tablesForClearing: List<TableRef>? = null

    fun excludeTables(vararg tableNames: String) {
        tablesToExclude += tableNames.flatMap { listOf(it, it.toLowerCase()) }
    }

    fun reset() {
        if (notPrepared) {
            prepare()
        }
        executeReset()
    }

    private val notPrepared get() = tablesForClearing == null

    private fun prepare() {
        connectionProvider().use { connection ->
            val metaData = connection.metaData
            val tableRefs = metaData.getTables(connection.catalog, null, null, arrayOf("TABLE")).use { tables ->
                iterator(tables::next) { tables.getString("TABLE_NAME") }
                    .asSequence()
                    .filterNot(tablesToExclude::contains)
                    .map(::TableRef)
                    .toList()
            }

            tablesForClearing = tableRefs

            LOG.info("Prepared clean db command: {}", tablesForClearing)
        }
    }

    private fun executeReset() {
        try {
            connectionProvider().use { connection ->
                val reset = buildClearStatement(connection)
                val result = reset.executeBatch()
                result
            }
        } catch (e: SQLException) {
            val status = engineInnoDbStatus()
            LOG.error("Failed to remove rows because {}. InnoDb status: {}", e, status)
            throw e
        }
    }

    private fun engineInnoDbStatus(): String { ... }

    private fun buildClearStatement(connection: Connection): PreparedStatement {
        val reset = connection.prepareStatement("")
        reset.addBatch("SET FOREIGN_KEY_CHECKS = 0")
        tablesForClearing?.forEach { ref ->
            reset.addBatch("DELETE FROM `${ref.name}`")
        }
        reset.addBatch("SET FOREIGN_KEY_CHECKS = 1")
        return reset
    }

    data class TableRef(val name: String)

    companion object {
        private val LOG = LoggerFactory.getLogger(DatabaseCleaner::class.java)!!
    }
}

Notice that we’ve defined tablesToExclude set that allows us to omit certain tables. This comes in handy when you’re using a database migration tool that stores its state inside some table(s).

The JDBC metadata allows us to introspect schema regardless of the database vendor. The iterator is a tiny Kotlin function that aids consuming iterator like objects:

inline fun <T> iterator(crossinline next: () -> Boolean, crossinline value: () -> T): AbstractIterator<out T> = object : AbstractIterator<T>() {
    override fun computeNext() {
        if (next()) {
            setNext(value())
        } else {
            done()
        }
    }
}

The buildClearStatement constructs a large query that DELETEs all rows from each relevant table. The example above uses MySQL where it is very easy to disable foreign key checks. This is important since foreign keys would prevent rows to be removed unless we paid special attention to the order of removal. A more generic example of how to deal with referential integrity when clearing a database can be found in the Respawn project.

Last but not least, when a SQLException is thrown we log the exception accompanied with SHOW ENGINE INNODB STATUS. The status information can hint us about failure reason e.g. another test process executing against the same database or a rogue, runaway thread that locks some rows.

private fun engineInnoDbStatus(): String {
    return connectionProvider().use { connection ->
        connection.createStatement().executeQuery("SHOW ENGINE INNODB STATUS ").use {
            iterator(it::next) { it.getString("Status") }.asSequence().toList()
        }.joinToString(System.lineSeparator())
    }
}

The above examples show that it is not hard to manually reset the database. I’ve found that using this approach makes my tests more trustworthy and less coupled to the underlying persistence layer. In fact, we can easily switch e.g. from JPA to JdbcTemplate in a performance critical code area without a need to change a test.

¹ Whether it is actually unit or integration test is a different topic.

Published 13 Oct 2017

TDD fan eager to learn new things. Always focused, always eager to ask why?Piotr Mionskowski