Coder Thoughts on software, technology and programming.

Piotr Mionskowski

  • Database timeouts

    31 October 2017

    Last time I outlined the importance of timeouts. Without carefully considered timeouts our application can easily become unresponsive. In this post I will focus on configuring the various timeouts involved in interacting with a database. I am going to focus specifically on relational databases; the principles and practices, however, apply equally well to other types of databases.

    Database

    Different kinds of timeouts

    Asking a database for results of a query is one of the most common activities a back end application will do. Let us decompose this simple task into steps:

    1. Establish database connection(s) in pool
    2. Take the connection out of the pool
    3. Validate the acquired connection
    4. Send statement(s) to database
    5. Read query results

    Each of the above steps can involve a specific timeout configuration. The details depend on particular technology stack and type of database we are querying.

    Establish database connection(s) in pool

    Dealing with raw database connections is almost always done through a connection pool. Establishing a connection to a database is very expensive compared to running a simple statement. The pool alleviates this cost by keeping connections open and reusing them for as long as needed.

    The first timeout is the maximum duration until a database connection is established. In JDBC this can be controlled by:

    • connectTimeout in MySQL JDBC driver

      Timeout for socket connect (in milliseconds), with 0 being no timeout. Only works on JDK-1.4 or newer. Defaults to ‘0’.

      The default is infinite ‼️

    • socketTimeout in MySQL JDBC driver

      Timeout (in milliseconds) on network socket operations (0, the default means no timeout).

    • loginTimeout in MySQL JDBC driver

      Sets the maximum time in seconds that a driver will wait while attempting to connect to a database once the driver has been identified.

      The default value 0 means infinite ‼️

    • loginTimeout in PostgreSQL JDBC driver:

      Specify how long to wait for establishment of a database connection. The timeout is specified in seconds.

      Default is infinite ‼️

    • connectTimeout in PostgreSQL JDBC driver:

      The timeout value used for socket connect operations. If connecting to the server takes longer than this value, the connection is broken. The timeout is specified in seconds and a value of zero means that it is disabled.

    • socketTimeout in PostgreSQL JDBC driver:

      The timeout value used for socket read operations. If reading from the server takes longer than this value, the connection is closed. This can be used as both a brute force global query timeout and a method of detecting network problems. The timeout is specified in seconds and a value of zero means that it is disabled.

    You have probably noticed a recurring theme above: the default timeouts are either infinite or disabled at the driver level. In the case of socketTimeout and connectTimeout there can still be a system-level timeout involved, both on Linux and on Windows. However, those only apply to blocking send and receive operations, and how the JDBC driver interacts with the socket is, for the most part, an implementation detail.
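    For example, the driver-level timeouts can be set through the JDBC URL; the values below are illustrative, not recommendations (note the different units):

    ```properties
    # MySQL Connector/J - both values in milliseconds
    jdbc:mysql://localhost:3306/database?connectTimeout=3000&socketTimeout=30000

    # PostgreSQL JDBC - all three values in seconds
    jdbc:postgresql://localhost:5432/database?connectTimeout=3&socketTimeout=30&loginTimeout=3
    ```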

    In order to demonstrate how the above timeouts work in practice we will use the following test cases:

    import com.mysql.cj.jdbc.MysqlDataSource // com.mysql.jdbc.jdbc2.optional.MysqlDataSource in Connector/J 5.1
    import org.junit.jupiter.api.Assertions.assertTimeoutPreemptively
    import org.junit.jupiter.api.Test
    import org.postgresql.ds.PGSimpleDataSource
    import java.time.Duration
    import java.time.Instant

    class JdbcTimeoutTest {
        @Test
        fun `mysql getConnection`() {
            val mysqlDataSource = mysqlDataSource()
    
            assertTimeoutPreemptively(Duration.ofMinutes(3)) {
                useResource { mysqlDataSource.connection }
            }
        }
    
        @Test
        fun `postgresql getConnection`() {
            val postgresDataSource = postgreSQLDataSource()
    
            assertTimeoutPreemptively(Duration.ofMinutes(3)) {
                useResource { postgresDataSource.connection }
            }
        }
    
        fun useResource(resourceProvider: () -> AutoCloseable) {
            val start = Instant.now()
            try {
                resourceProvider().use {
                    println("Completed in ${Duration.between(start, Instant.now())}")
                }
            } catch (e: Exception) {
                println("Error $e after ${Duration.between(start, Instant.now())}")
            }
        }
    
        fun mysqlDataSource(): MysqlDataSource {
            return MysqlDataSource().apply {
                this.setURL("jdbc:mysql://localhost:3306/database")
                this.user = "user"
                this.setPassword("password")
            }
        }
    
        fun postgreSQLDataSource(): PGSimpleDataSource {
            return PGSimpleDataSource().apply {
                this.user = "user"
                this.password = "password"
                this.databaseName = "database"
                this.serverName = "localhost"
            }
        }
    }
    

    To simulate a misbehaving database server we’ll use netcat listening on the standard MySQL or PostgreSQL port, e.g.:

    nc -k -l 3306 # listen on MySQL port, PostgreSQL uses 5432 by default
    

    Both of the above tests will fail due to assertTimeoutPreemptively.

    The most appropriate candidate for limiting connection establishment is loginTimeout. This works in PostgreSQL but not in MySQL. Apparently the MySQL JDBC driver in versions 5.1, 6.0 and 8.0 implements the method as a no-op. Interestingly, it is possible to force the MySQL driver to respect the timeout when it is set globally through the static method java.sql.DriverManager.setLoginTimeout.

    A slightly less accurate option is to use connectTimeout or socketTimeout. The socket-level options are oblivious to the database protocol, hence it is impossible to set a timeout that accurately covers the whole connection-establishment operation. Additionally, socketTimeout applies to all socket read operations, not only to the initial connection handshake. The fake netcat server is not suitable for testing connectTimeout, but we can use it for socketTimeout. The PostgreSQL driver correctly reported an error about 2 seconds after setting pgSimpleDataSource.socketTimeout = 2. Unfortunately, setting socketTimeout in the MySQL driver had no effect on the getConnection behavior. Interestingly, no matter what value I set socketTimeout to, the error was thrown after about 26 seconds. I have no idea why it behaves like that 🤔.

    Be aware of the shortcomings of MySQL JDBC Driver.

    Take the connection out of the pool

    Reusing database connections gives an application a great performance boost. However, writing an efficient and bug-free database connection pool is no easy task, thus we should all rely on proven solutions. In the JVM world there are multiple choices when it comes to JDBC:

    • Hikari Claims to be the fastest and has a limited number of configuration knobs with sane defaults. My favorite by far.
    • DBCP 2 A recently resurrected project which, with its commons-pool2 module, has the potential of being applicable to all resource pools.
    • Tomcat JDBC Connection Pool Commonly used, with lots of configuration options. Came to be as a replacement for DBCP.

    When there are no connections available in the pool, the code asking for one needs to wait until a connection becomes available. The amount of time a thread is blocked waiting for a connection needs to be considered carefully, and there are two situations to consider. The first is when the pool has reached its maximum size and all connections are already in use; here very little actual work is required to acquire a connection once one is returned. The second is when all currently opened connections are in use but the pool is allowed to create a new one because it is not yet full; here we need to keep in mind that establishing a database connection may easily take around 200 ms, hence the timeout should not be too short. Below you’ll find how to configure this timeout in the aforementioned connection pools:

    • Hikari: connectionTimeout

      This property controls the maximum number of milliseconds that a client (that’s you) will wait for a connection from the pool. If this time is exceeded without a connection becoming available, a SQLException will be thrown. Lowest acceptable connection timeout is 250 ms. Default: 30000 (30 seconds)

    • DBCP: maxWaitMillis

      The maximum number of milliseconds that the pool will wait (when there are no available connections) for a connection to be returned before throwing an exception, or -1 to wait indefinitely.

      Default is infinite ‼️

    • Tomcat: maxWait

      (int) The maximum number of milliseconds that the pool will wait (when there are no available connections) for a connection to be returned before throwing an exception. Default value is 30000 (30 seconds)

    My rule of thumb is to set this timeout to under 5 seconds.
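    For instance, capping the wait at 5 seconds in each of the pools mentioned above (the property names come from the respective documentation quoted earlier, the value is my rule of thumb):

    ```properties
    # Hikari - milliseconds
    connectionTimeout=5000
    # DBCP 2 - milliseconds
    maxWaitMillis=5000
    # Tomcat JDBC - milliseconds
    maxWait=5000
    ```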

    Validate the acquired connection

    A database connection can stay open for several hours or even days. However, because a network is involved, there are numerous cases where a socket that is seemingly open on the client side may in fact belong to a broken connection. A well-behaved connection pool should avoid handing such a connection to application code. A common strategy to alleviate the problem is to test the connection just before it is taken out of the pool. In the past the test was performed using a simple SQL query, e.g. SELECT 1. Nowadays there is an isValid method available on the JDBC Connection itself, which moves the responsibility to the driver:

    Returns true if the connection has not been closed and is still valid. The driver shall submit a query on the connection or use some other mechanism that positively verifies the connection is still valid when this method is called.

    • Hikari: validationTimeout:

      This property controls the maximum amount of time that a connection will be tested for aliveness. This value must be less than the connectionTimeout. Lowest acceptable validation timeout is 250 ms. Default: 5000

    • DBCP: validationQueryTimeout:

      The timeout in seconds before connection validation queries fail. If set to a positive value, this value is passed to the driver via the setQueryTimeout method of the Statement used to execute the validation query.

      Default is infinite ‼️

    • Tomcat: validationQueryTimeout:

      (int) The timeout in seconds before a connection validation queries fail. This works by calling java.sql.Statement.setQueryTimeout(seconds) on the statement that executes the validationQuery. The pool itself doesn’t timeout the query, it is still up to the JDBC driver to enforce query timeouts. A value less than or equal to zero will disable this feature. The default value is -1.

      Default is infinite ‼️

    Send statement(s) to database and read query results

    We have finally arrived at the most common usage. Every query that we send to a database should have a timeout configured either at the statement level or at the transaction level. When it comes to individual statements, there is setQueryTimeout available:

    Sets the number of seconds the driver will wait for a Statement object to execute to the given number of seconds. By default there is no limit on the amount of time allowed for a running statement to complete. If the limit is exceeded, an SQLTimeoutException is thrown. A JDBC driver must apply this limit to the execute, executeQuery and executeUpdate methods.

    Additionally, it’s up to the driver to decide what the above timeout means exactly:

    Note: JDBC driver implementations may also apply this limit to ResultSet methods (consult your driver vendor documentation for details).

    Note: In the case of Statement batching, it is implementation defined as to whether the time-out is applied to individual SQL commands added via the addBatch method or to the entire batch of SQL commands invoked by the executeBatch method (consult your driver vendor documentation for details).

    The time required for a query to complete is very use-case dependent, thus we should not expect a sane default to exist. Instead, we need to ask ourselves how long we are willing to wait for a query to complete. It is very easy to forget about this rule, hence it is very handy to be able to set this timeout globally:

    • DBCP: defaultQueryTimeout

      If non-null, the value of this Integer property determines the query timeout that will be used for Statements created from connections managed by the pool. null means that the driver default will be used.

      Default is null, i.e. no timeout ‼️

    • Tomcat: queryTimeout available through QueryTimeoutInterceptor

      (int as String) The number of seconds to set for the query timeout. A value less than or equal to zero will disable this feature. The default value is 1 second.

    • Hikari: Not available but fairly easy to add by wrapping a DataSource e.g.:

    class CustomTimeoutsDataSource(val innerDataSource: DataSource, private val queryTimeoutProperties: DataSourceConfiguration.QueryTimeoutProperties) : DataSource by innerDataSource {
    
        override fun getConnection(username: String?, password: String?) = configureTimeouts(innerDataSource.getConnection(username, password))
        override fun getConnection() = configureTimeouts(innerDataSource.connection)
    
        private fun configureTimeouts(connection: Connection):Connection = CustomTimeoutsConnection(connection, queryTimeoutProperties)
    
        private class CustomTimeoutsConnection(val innerConnection: Connection, private val queryTimeoutProperties: DataSourceConfiguration.QueryTimeoutProperties)
            : Connection by innerConnection {
    
            private fun <T : Statement> configure(statement: T): T {
                // 0 means no timeout
                val desiredTimeout = queryTimeoutProperties.statementQueryTimeoutInSeconds ?: 0
                statement.queryTimeout = desiredTimeout
                LOG.trace("Configured timeout {} seconds for statement {}", desiredTimeout, statement)
                return statement
            }
    
            override fun prepareStatement(sql: String?, autoGeneratedKeys: Int) = configure(innerConnection.prepareStatement(sql, autoGeneratedKeys))
            override fun prepareStatement(sql: String?, resultSetType: Int, resultSetConcurrency: Int, resultSetHoldability: Int) = configure(innerConnection.prepareStatement(sql, resultSetType, resultSetConcurrency, resultSetHoldability))
            override fun prepareStatement(sql: String?) = configure(innerConnection.prepareStatement(sql))
            override fun prepareStatement(sql: String?, columnNames: Array<out String>?) = configure(innerConnection.prepareStatement(sql, columnNames))
            override fun prepareStatement(sql: String?, resultSetType: Int, resultSetConcurrency: Int) = configure(innerConnection.prepareStatement(sql, resultSetType, resultSetConcurrency))
            override fun prepareStatement(sql: String?, columnIndexes: IntArray?) = configure(innerConnection.prepareStatement(sql, columnIndexes))
            override fun prepareCall(sql: String?) = configure(innerConnection.prepareCall(sql))
            override fun prepareCall(sql: String?, resultSetType: Int, resultSetConcurrency: Int) = configure(innerConnection.prepareCall(sql, resultSetType, resultSetConcurrency))
            override fun prepareCall(sql: String?, resultSetType: Int, resultSetConcurrency: Int, resultSetHoldability: Int) = configure(innerConnection.prepareCall(sql, resultSetType, resultSetConcurrency, resultSetHoldability))
            override fun createStatement() = configure(innerConnection.createStatement())
            override fun createStatement(resultSetType: Int, resultSetConcurrency: Int) = configure(innerConnection.createStatement(resultSetType, resultSetConcurrency))
            override fun createStatement(resultSetType: Int, resultSetConcurrency: Int, resultSetHoldability: Int) = configure(innerConnection.createStatement(resultSetType, resultSetConcurrency, resultSetHoldability))
    
            override fun toString(): String {
                return "CustomTimeoutsConnection(innerConnection=$innerConnection)"
            }
        }
    
        companion object {
            private val LOG = LoggerFactory.getLogger(CustomTimeoutsDataSource::class.java)
        }
    }
    

    The JDBC-level queryTimeout is enforced on the application side, i.e. there is code executed after the timeout elapses which cancels the query execution. Recent releases of both MySQL and PostgreSQL offer statement timeout capabilities at the database server level.

    • MySQL: MAX_EXECUTION_TIME

      The MAX_EXECUTION_TIME hint is permitted only for SELECT statements. It places a limit N (a timeout value in milliseconds) on how long a statement is permitted to execute before the server terminates it.

    • PostgreSQL: statement_timeout

      Abort any statement that takes more than the specified number of milliseconds, starting from the time the command arrives at the server from the client. If log_min_error_statement is set to ERROR or lower, the statement that timed out will also be logged. A value of zero (the default) turns this off. Setting statement_timeout in postgresql.conf is not recommended because it would affect all sessions.
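    Both can be applied per statement or per session; an illustrative sketch (the table name is made up):

    ```sql
    -- MySQL: let the server terminate this SELECT after 1000 ms
    SELECT /*+ MAX_EXECUTION_TIME(1000) */ * FROM orders;

    -- PostgreSQL: abort any statement in this session running longer than 2 s
    SET statement_timeout = '2s';
    SELECT * FROM orders;
    ```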

    If you are using a JPA provider like Hibernate, you might be urged to use javax.persistence.query.timeout. However, from my experience with Hibernate, this timeout, when configured globally, is enforced in some use cases and completely ignored in others. Multiple bugs have been reported against this feature: Bug 303360, HHH-9929, “Query timeout in persistence.xml doesn’t work” and some of them are still not addressed.

    There is no transaction-scoped timeout available in JDBC. However, one can still apply a timeout in application code, e.g. through Spring’s @Transactional.timeout.
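    A minimal sketch with Spring declarative transactions (the service and method names are made up for illustration):

    ```java
    import org.springframework.stereotype.Service;
    import org.springframework.transaction.annotation.Transactional;

    @Service
    public class ReportService {

        // The whole transaction, including every statement executed inside it,
        // must complete within 5 seconds, otherwise it is rolled back.
        @Transactional(timeout = 5)
        public void generateReport() {
            // queries executed here share the 5 second budget
        }
    }
    ```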

    Keep all timeouts short

    The best timeout is a short one. It is often tempting to increase a query or wait timeout in the face of performance or throughput problems. However, doing so increases the number of resources blocked on the server and is thus rarely a good choice. Blocking more and more server resources, e.g. threads, may at some point cause the entire server to collapse abruptly. I keep timeouts as short as possible, especially when a certain API is called often. If you are looking for a good read on the topic I highly recommend Release It! by Michael T. Nygard. This book covers many resiliency-related topics, including timeouts, and provides strategies that help avoid increasing them.

  • The importance of timeouts

    23 October 2017

    Timeouts are not an exciting thing to talk about. They do not add immediately perceivable value. They are difficult to get right and force one to consider problems that are hard to solve. In fact, in my experience, timeouts are only ever considered when our software stops working or is about to. That is an enormous shame since, in my opinion, carefully applied timeouts can vastly improve software resiliency.

    Man with a wrist clock

    An example application

    Let us consider a simplistic example of a Spring Boot application generated using Spring Initializr. The application will only expose the actuator API, which by default defines a health check endpoint. Our example will also have a mail module configured.

    The dependencies section of build.gradle:

    dependencies {
    	compile('org.springframework.boot:spring-boot-starter-actuator')
    	compile('org.springframework.boot:spring-boot-starter-mail')
    	compile('org.springframework.boot:spring-boot-starter-web')
    	compile("org.jetbrains.kotlin:kotlin-stdlib-jre8:${kotlinVersion}")
    	compile("org.jetbrains.kotlin:kotlin-reflect:${kotlinVersion}")
    }
    

    A typical health check endpoint verifies that the application's integration points work correctly. If the service talks to a database, a connection is established and verified. Free disk space is checked. If the service sends emails through SMTP, a connection to the server is established.

    The health checks are auto-discovered and enabled when you include Spring Boot Actuator. By default, the /health path is used for the endpoint. The SMTP server host, port and credentials obviously need to be configured; at minimum a host entry is required, e.g. spring.mail.host=localhost. For debugging purposes one can disable actuator security with management.security.enabled=false to verify which health checks are performed by the actuator.
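    The application.properties entries mentioned above boil down to:

    ```properties
    spring.mail.host=localhost
    # only for local debugging - do not disable security in production
    management.security.enabled=false
    ```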

    The last part of our example is the trivial application code:

    @SpringBootApplication
    class Application {
        companion object {
            @JvmStatic
            fun main(args: Array<String>) {
                SpringApplication.run(Application::class.java, *args)
            }
        }
    }
    

    When you request /health the API will return a response similar to:

    HTTP/1.1 200 
    Content-Type: application/vnd.spring-boot.actuator.v1+json;charset=UTF-8
    Date: Mon, 23 Oct 2017 08:08:32 GMT
    Transfer-Encoding: chunked
    X-Application-Context: application
    
    {
        "diskSpace": {
            "free": 105755779072,
            "status": "UP",
            "threshold": 10485760,
            "total": 499046809600
        },
        "mail": {
            "location": "localhost:-1",
            "status": "UP"
        },
        "status": "UP"
    }
    

    An application health API, like the one in our example, is often hooked into external monitoring software. The monitor asks the target application about its health at regular intervals, e.g. every 5 seconds.

    Shooting yourself in the foot

    The above example has an issue that can kill a production server. More importantly, the metrics that are usually monitored, e.g. CPU and memory usage, will not warn about the upcoming, dreadful service stall. Nor will the application be suffering from an enormous number of requests or emails being sent.

    Imagine that the health endpoint is checked every 5 seconds and that there is an intermittent issue with the SMTP server. The health endpoint will rightfully try to connect to the SMTP server and from time to time respond with an error. From my experience, when a health check is introduced it typically takes a while to tweak the monitor thresholds so that we get rid of false alarms. It is thus very easy to ignore intermittent errors when we think they are caused by overly sensitive thresholds. However, the ignored errors can, after a while, cause our server to stop responding to any request.


    Why can this happen, you ask? There is no timeout configured!

    The mail server health check uses javax.mail.Service.connect under the hood. For a variety of reasons an attempt to establish a TCP connection can take arbitrarily longer than usual. Unfortunately, the default timeouts used by javax.mail.* are infinite. A thread that waits for the connection to be established cannot serve other requests even though it barely uses any CPU. The default maximum thread pool size used by embedded Tomcat in a Spring Boot application is 200. Assuming that a blocked connection attempt happens twice an hour, our application will stop working after about 4 days.
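    The arithmetic behind that estimate is a simple budget calculation; a sketch (the leak rate of two threads per hour is a made-up assumption, 200 is Tomcat's default maximum thread pool size):

    ```java
    public class ThreadBudget {
        // Hours until every request thread is blocked forever, given a pool of
        // `poolSize` threads and `leaksPerHour` threads lost to infinite waits.
        static long hoursUntilExhausted(long poolSize, long leaksPerHour) {
            return poolSize / leaksPerHour;
        }

        public static void main(String[] args) {
            long hours = hoursUntilExhausted(200, 2);
            // prints "100 hours, about 4 days"
            System.out.println(hours + " hours, about " + hours / 24 + " days");
        }
    }
    ```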

    Never use infinite timeouts

    As you can see, it is very easy to miss the need to configure a timeout. To be fair, the Spring Boot documentation states clearly:

    In particular, certain default timeout values are infinite and you may want to change that to avoid having a thread blocked by an unresponsive mail server:

    spring.mail.properties.mail.smtp.connectiontimeout=5000
    spring.mail.properties.mail.smtp.timeout=3000
    spring.mail.properties.mail.smtp.writetimeout=5000
    

    In my opinion any library or framework should either force the programmer to configure a timeout or provide a sensible default. Unfortunately it is not always possible to introduce such defaults later on without breaking changes, hence we should always check which timeouts are used when calling any external service.

    Timeouts needed everywhere

    Imagine a controller action method that inserts a single row into a database. Let us further assume that the endpoint is called 50 times per second and typically takes 100 ms to complete. Things work well until we encounter an intermittent sloppiness of the database and the insert now takes 2 seconds to complete. The clients calling the API do not slow down. More request threads are blocked and more database connections are taken out of the pool. Soon all database connections are in use and all other API endpoints start to fail. This is an example of a cascading failure, i.e. a problem in one component propagating to others. Such issues are easier to avoid when there is a timeout configured on both the controller action and the interaction with the database.
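    Little's law makes the resource math concrete: the average number of in-flight requests, and hence busy threads and connections, equals the arrival rate times the latency. A sketch of the numbers from the example above:

    ```java
    public class InFlight {
        // Little's law: average requests in progress = arrival rate x latency.
        static double inFlight(double requestsPerSecond, double latencySeconds) {
            return requestsPerSecond * latencySeconds;
        }

        public static void main(String[] args) {
            // 50 req/s at 100 ms: about 5 connections busy at any time
            System.out.println(inFlight(50, 0.1));
            // 50 req/s at 2 s: about 100 busy - more than most pools allow
            System.out.println(inFlight(50, 2.0));
        }
    }
    ```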

    Every API endpoint should have a timeout configured if we want our services to be resilient. Unfortunately this rule is easy to neglect. Moreover, some frameworks do not even expose such an ability. Even in the immensely popular Spring MVC I could not find a way to set such a timeout other than using an async method. Fortunately, there are libraries, e.g. Hystrix, that tackle this problem and can be integrated easily.

    Just to recap, here is a short and incomplete list of places where a timeout should be configured:

    • a controller action
    • a database query, statement
    • a database pool interaction
    • a thread pool interaction
    • an API client e.g. HTTP, SOAP, SMTP

    I will describe how to deal with the cases in the following posts.

  • How to clear database in Spring Boot tests?

    13 October 2017

    Nowadays using a production-like database in unit tests is a common practice. Calling a real database can increase our confidence that the tested code actually works. Having said that, a database, by its very nature, brings external state into a test, and that state will affect the test's behavior, hence we need to pay special attention to preparing the test execution. There are a couple of ways to handle database state in tests and I’m going to describe the approach I like most.

    Database

    Problems with Spring Boot Transactional tests

    Spring Boot offers many helpers that make testing an application easier. Among them is @DataJpaTest, which by default will configure an in-memory embedded database. You can use a production-type database in tests by adding @AutoConfigureTestDatabase(replace=Replace.NONE) like so:

    @RunWith(SpringRunner.class)
    @DataJpaTest
    @AutoConfigureTestDatabase(replace=Replace.NONE)
    public class ExampleRepositoryTests {
    
        // ...
    
    }
    
    

    The @DataJpaTest uses @Transactional under the hood. A test is wrapped inside a transaction that is rolled back at the end. This means that when using e.g. Hibernate one needs to pay special attention to how the tested code is written. As shown in the Java example below, a manual flush is indeed required:

    @RunWith(SpringRunner.class)
    @ContextConfiguration(classes = TestConfig.class)
    @Transactional
    public class HibernateUserRepositoryTests {
        ...
        @Test
        public void createUser() {
            // track initial state in test database:
            final int count = countRowsInTable("user");
    
            User user = new User(...);
            repository.save(user);
    
            // Manual flush is required to avoid false positive in test
            sessionFactory.getCurrentSession().flush();
            assertNumUsers(count + 1);
        }
    }
    

    Using the @Transactional annotation on tests is certainly easy, but I still don’t use it, for the following reasons:

    • The production code uses transactions with a different scope.
    • It is easy to forget about a flush and thus get a false positive in a test.
    • On failure, and when debugging, it is hard to see which values were actually saved in the database.
    • It is much harder to test production code that requires a transaction to be committed.
    • The test code needs to be more tightly coupled to the production code, and we all know that hinders refactoring.

    Cleaning database with SQL

    In tests involving a database I reset its state before each test using plain old SQL. This makes the test code less dependent on how a transaction is scoped inside the production code. Furthermore, one can easily review the values saved after a test failure. It turns out it is easy to write a JUnit @Rule or BeforeEachCallback that will remove all rows from all tables. Moreover, we can do so without hard-coding table names, which would increase the maintenance cost.

    Let’s start with defining a @Rule in Kotlin that will be called before each test:

    import org.junit.rules.ExternalResource
    import org.springframework.stereotype.Component
    import javax.sql.DataSource
    
    @Component
    class DatabaseCleanerRule(private val dataSource: DataSource) : ExternalResource() {
    
        override fun before() {
            if (databaseCleaner == null) {
                // Consider inspecting dataSource to check if we are connecting to test database
                databaseCleaner = DatabaseCleaner(dataSource::getConnection)
            }
            databaseCleaner!!.reset()
        }
    
        companion object {
            internal var databaseCleaner: DatabaseCleaner? = null
        }
    }
    

    Consider inspecting dataSource to check whether we are about to connect to the test database and not the one used for development. It is very easy to use an incorrect Spring profile and wipe out your development data. Ask me how I know…

    We can use the DatabaseCleanerRule in a Spring-enabled test like any other JUnit rule, e.g. @Rule @Inject lateinit var cleanerRule: DatabaseCleanerRule.

    Notice that we’ve delegated the actual important work to the DatabaseCleaner class, defined in Kotlin below.

    import com.practi.util.iterator
    import org.slf4j.LoggerFactory
    import java.sql.Connection
    import java.sql.PreparedStatement
    import java.sql.SQLException
    
    class DatabaseCleaner(private val connectionProvider: () -> Connection) {
        private val tablesToExclude = mutableSetOf<String>()
        private var tablesForClearing: List<TableRef>? = null
    
        fun excludeTables(vararg tableNames: String) {
            tablesToExclude += tableNames.flatMap { listOf(it, it.toLowerCase()) }
        }
    
        fun reset() {
            if (notPrepared) {
                prepare()
            }
            executeReset()
        }
    
        private val notPrepared get() = tablesForClearing == null
    
        private fun prepare() {
            connectionProvider().use { connection ->
                val metaData = connection.metaData
                val tableRefs = metaData.getTables(connection.catalog, null, null, arrayOf("TABLE")).use { tables ->
                    iterator(tables::next) { tables.getString("TABLE_NAME") }
                        .asSequence()
                        .filterNot(tablesToExclude::contains)
                        .map(::TableRef)
                        .toList()
                }
    
                tablesForClearing = tableRefs
    
                LOG.info("Prepared clean db command: {}", tablesForClearing)
            }
        }
    
    private fun executeReset() {
        try {
            connectionProvider().use { connection ->
                buildClearStatement(connection).use { reset ->
                    reset.executeBatch()
                }
            }
        } catch (e: SQLException) {
            val status = engineInnoDbStatus()
            LOG.error("Failed to remove rows because {}. InnoDb status: {}", e, status)
            throw e
        }
    }
    
        private fun engineInnoDbStatus(): String { ... }
    
        private fun buildClearStatement(connection: Connection): PreparedStatement {
            val reset = connection.prepareStatement("")
            reset.addBatch("SET FOREIGN_KEY_CHECKS = 0")
            tablesForClearing?.forEach { ref ->
                reset.addBatch("DELETE FROM `${ref.name}`")
            }
            reset.addBatch("SET FOREIGN_KEY_CHECKS = 1")
            return reset
        }
    
        data class TableRef(val name: String)
    
        companion object {
            private val LOG = LoggerFactory.getLogger(DatabaseCleaner::class.java)!!
        }
    }
    

    Notice that we’ve defined a tablesToExclude set that allows us to omit certain tables. This comes in handy when you’re using a database migration tool that stores its state inside some table(s).
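
    For instance, when Flyway manages the schema, one could keep its bookkeeping table intact between tests. The table name below is an assumption - it differs between Flyway versions - and dataSource stands for whatever connection source the test uses:

    ```kotlin
    val cleaner = DatabaseCleaner { dataSource.connection }
    // Leave the migration tool's metadata table alone so the schema
    // is not re-migrated on every test run.
    cleaner.excludeTables("schema_version")
    ```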

    The JDBC metadata API allows us to introspect the schema regardless of the database vendor. The iterator is a tiny Kotlin function that aids consuming iterator-like objects:

    inline fun <T> iterator(crossinline next: () -> Boolean, crossinline value: () -> T): Iterator<T> = object : AbstractIterator<T>() {
        override fun computeNext() {
            if (next()) {
                setNext(value())
            } else {
                done()
            }
        }
    }
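
    To see the helper in action outside of JDBC, here is a small self-contained sketch; the list-backed cursor merely stands in for a ResultSet:

    ```kotlin
    // The same helper as above, repeated so the sketch compiles on its own.
    inline fun <T> iterator(crossinline next: () -> Boolean, crossinline value: () -> T): Iterator<T> =
        object : AbstractIterator<T>() {
            override fun computeNext() = if (next()) setNext(value()) else done()
        }

    // A list-backed "cursor" standing in for a java.sql.ResultSet:
    // next() advances the position, value() reads the current row.
    fun tableNamesFrom(rows: List<String>): List<String> {
        var index = -1
        return iterator({ ++index < rows.size }, { rows[index] })
            .asSequence()
            .toList()
    }

    fun main() {
        println(tableNamesFrom(listOf("users", "orders", "order_items")))
        // prints [users, orders, order_items]
    }
    ```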
    

    The buildClearStatement constructs a large batch statement that DELETEs all rows from each relevant table. The example above uses MySQL, where it is very easy to disable foreign key checks. This is important since foreign keys would prevent rows from being removed unless we paid special attention to the order of removal. A more generic example of how to deal with referential integrity when clearing a database can be found in the Respawn project.

    Last but not least, when a SQLException is thrown we log the exception accompanied by the output of SHOW ENGINE INNODB STATUS. The status information can hint at the failure reason, e.g. another test process executing against the same database or a rogue, runaway thread locking some rows.

    private fun engineInnoDbStatus(): String {
        return connectionProvider().use { connection ->
        connection.createStatement().executeQuery("SHOW ENGINE INNODB STATUS").use {
                iterator(it::next) { it.getString("Status") }.asSequence().toList()
            }.joinToString(System.lineSeparator())
        }
    }
    

    The above examples show that it is not hard to reset the database manually. I’ve found that this approach makes my tests more trustworthy and less coupled to the underlying persistence layer. In fact, we can easily switch e.g. from JPA to JdbcTemplate in a performance-critical code area without having to change a test.

    1 Whether it is actually unit or integration test is a different topic.

  • Simple Config Sections 1.2 released

    25 December 2015 .net

    About a week ago I released another version of SimpleConfigSections. It’s marked 1.2.0 and you can install it through Install-Package SimpleConfigSections.

    What is SimpleConfigSections?

    SimpleConfigSections is a little library that simplifies defining .NET configuration sections. As you can see in the MSDN documentation, the amount of boilerplate code required is substantial.

    The equivalent definition in SimpleConfigSections is more concise and a bit easier to write. Examples can be found on the project’s GitHub page.

  • Android Data Binding Part 2: Observer pattern to the bone

    27 July 2015 android

    In the previous part I described typical problems we face when developing applications on Android. I also highlighted that some of them can be mitigated when the data binding API is used properly. It’s time to dive into more details of how this promising API works.

    Observer Pattern

    At the root of many solutions found in today’s APIs lies a design pattern. In the case of the discussed Android API it is the Observer pattern, applied to the bone. In fact, this particular pattern is so common and powerful that some languages and runtimes (C# with .NET, Objective-C and Swift on iOS and Mac) provide neat support for it. You may wonder why this matters in the context of Android data binding. The answer is simple yet easy to ignore - memory management. The following diagram depicts the dependencies between the observer pattern’s elements in the context of Android data binding: Android data binding diagram. On the right side we have the observable part producing notifications about changes - an object implementing the android.databinding.Observable interface. In the observer pattern an event producer can have many event listeners - OnPropertyChangedCallback implementations. Android data binding provides listener implementations taking care of property value changes as well as collection element changes.

    Note that the Observable object knows nothing about concrete OnPropertyChangedCallback implementations (Dependency Inversion Principle at its best). However, at runtime the Observable object will hold references to many OnPropertyChangedCallback instances. If you look further down the diagram you’ll see that for the OnPropertyChangedCallback implementations provided by the Android data binding library to update the state of a view component, a reference to it is required. This means that although the view model knows nothing about the view components at compile time, it will reference them at runtime.

    How does the Android data binding library aid memory management?

    As stated above, in a simple implementation we would have a lightweight and testable view model retain expensive view widgets - thus making it heavy. In the Android context it means that even though an Activity got destroyed, we could have other objects still retaining its Views, trying to update them and preventing them from being garbage collected. Not so nice, huh?

    If you look closer at how Android data binding is implemented you’ll immediately see that its OnPropertyChangedCallback implementations hold only weak references to views. WeakPropertyListener, WeakListListener, WeakMapListener and ViewDataBinding.WeakListener make sure that the Observable object is not retaining views. This means that there’s no need for a developer to manually stop data binding in an activity’s onDestroy or a fragment’s onDestroyView method. If you use RxJava you probably know that this extra step is tedious and requires a great deal of attention and boilerplate code. You can find out more about the problem in at least a couple of issues on GitHub: 1, 2.
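
    The gist of the weak-listener trick can be sketched in a few lines of Kotlin. The class names below are simplified stand-ins for the library’s Observable, OnPropertyChangedCallback and WeakListener types, not the real API; the real implementations additionally unregister themselves from the Observable once the target has been collected.

    ```kotlin
    import java.lang.ref.WeakReference

    // Simplified stand-in for android.databinding.Observable.OnPropertyChangedCallback.
    interface OnPropertyChangedCallback {
        fun onPropertyChanged(propertyName: String)
    }

    // Simplified stand-in for an Observable view model: it only knows
    // about the callback interface, never about concrete views.
    class SimpleObservable {
        private val callbacks = mutableListOf<OnPropertyChangedCallback>()

        fun addOnPropertyChangedCallback(callback: OnPropertyChangedCallback) {
            callbacks += callback
        }

        fun notifyPropertyChanged(propertyName: String) =
            callbacks.forEach { it.onPropertyChanged(propertyName) }
    }

    // The callback holds only a WeakReference to the view, so a long-lived
    // view model never keeps a destroyed view alive.
    class WeakViewListener<V : Any>(
        view: V,
        private val update: (V, String) -> Unit
    ) : OnPropertyChangedCallback {
        private val viewRef = WeakReference(view)

        override fun onPropertyChanged(propertyName: String) {
            // If the view has been garbage collected the update is simply skipped.
            viewRef.get()?.let { update(it, propertyName) }
        }
    }
    ```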

    Because the Android data binding library takes extra steps to make sure a view model will not cause views to be retained, we can detach a view model’s lifecycle from that of an activity or a fragment. We now have an easy way to e.g. do simple in-memory caching via singleton view models. More importantly, we can cross out at least one suspect when looking for the cause of a memory leak.

    This article is cross-posted with my company blog