Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HikariPool getConnection does not timeout #2245

Open
Ducius6 opened this issue Sep 26, 2024 · 9 comments
Open

HikariPool getConnection does not timeout #2245

Ducius6 opened this issue Sep 26, 2024 · 9 comments

Comments

@Ducius6
Copy link

Ducius6 commented Sep 26, 2024

Every once in a while, we experience an issue in our application where applications stops on executing some mssql server query. We figured out that the issue is that HikariPool getConnection does not time out even though it should based on the connectionTimeout value set.

Here is the config of HikariDataSource
image

Here is the log where you can see that getConnection timed out after ~13 hours.

`2024-09-26T08:55:36.530Z ERROR 1597804 --- [DatabaseJobsScheduler-6] o.i.c.c.s.p.queue.jobs.DatabaseJob : Error on execution for Old collected traffic removal job for accountId=20001; error: Failed to execute removeOldCollectedTraffic.

java.lang.RuntimeException: Failed to execute removeOldCollectedTraffic.
at org.infobip.cdr.collector.service.persistence.queue.repository.impl.CdrDataRepositoryImpl.removeOldCollectedTraffic(Unknown Source)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
at java.base/java.lang.reflect.Method.invoke(Method.java:580)
at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:355)
at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:196)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:768)
at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:138)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:768)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:768)
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:720)
at org.infobip.cdr.collector.service.persistence.queue.repository.impl.CdrDataRepositoryImpl$$SpringCGLIB$$0.removeOldCollectedTraffic()
at org.infobip.cdr.collector.service.persistence.queue.jobs.DatabaseJobsScheduler.lambda$scheduleRemovalOfOldCollectedTraffic$9(DatabaseJobsScheduler.java:128)
at org.infobip.cdr.collector.service.persistence.queue.jobs.DatabaseJob.execute(Unknown Source)
at org.infobip.cdr.collector.service.persistence.queue.jobs.DatabaseJob.run(Unknown Source)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
Caused by: java.sql.SQLTransientConnectionException: cdr_collector_op_20000_db_1094001 - Connection is not available, request timed out after 47269980ms (total=6, active=0, idle=6, waiting=0)
at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:686)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:179)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:144)
at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:99)
at org.infobip.cdr.collector.service.persistence.queue.repository.impl.CdrDataRepositoryImpl.removeOldCollectedTraffic(Unknown Source)
... 23 common frames omitted
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:233)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:1848)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.setNetworkTimeout(SQLServerConnection.java:7228)
at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:562)
at com.zaxxer.hikari.pool.PoolBase.isConnectionDead(PoolBase.java:169)
at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:168)
... 26 common frames omitted`

Also worth to notice is that these errors happen but the timeout usually does trigger, but for some reason there is this case where every once in a while getConnection lasts this long...
Here is the log of getConnection when timeout does happen in time:

Caused by: java.sql.SQLTransientConnectionException: cdr_collector_op_30000_db_1094001 - Connection is not available, request timed out after 30001ms (total=2, active=2, idle=0, waiting=11) at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:686) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:179) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:144) at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:99) at org.infobip.cdr.collector.service.persistence.queue.repository.impl.CdrDataRepositoryImpl.executeAggregationProcedure(CdrDataRepositoryImpl.java:74) ... 23 common frames omitted Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed. at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:233) at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:1848) at com.microsoft.sqlserver.jdbc.SQLServerConnection.setNetworkTimeout(SQLServerConnection.java:7228) at com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:562) at com.zaxxer.hikari.pool.PoolBase.isConnectionDead(PoolBase.java:169) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:168) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:144) at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:99) at org.infobip.cdr.database.helper.SecondStageImportExecutor.execute(SecondStageImportExecutor.java:23) at org.infobip.cdr.collector.service.persistence.queue.repository.impl.TwoStageImportJobRepositoryImpl.executeCompleteTwoStageImport(TwoStageImportJobRepositoryImpl.java:26) at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103) at java.base/java.lang.reflect.Method.invoke(Method.java:580) at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:355) at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:196) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:768) at org.springframework.dao.support.PersistenceExceptionTranslationInterceptor.invoke(PersistenceExceptionTranslationInterceptor.java:138) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:184) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.proceed(CglibAopProxy.java:768) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:720) at org.infobip.cdr.collector.service.persistence.queue.repository.impl.TwoStageImportJobRepositoryImpl$$SpringCGLIB$$0.executeCompleteTwoStageImport(<generated>) at org.infobip.cdr.collector.service.write.cdr.twostageimport.SecondStageImportJob.executeImport(SecondStageImportJob.java:65) at org.infobip.cdr.collector.service.write.cdr.twostageimport.SecondStageImportJob.process(SecondStageImportJob.java:55) at org.infobip.common.concurrent.AbstractProcessor.run(AbstractProcessor.java:91) ... 1 common frames omitted

Looking at the HikariPool getConnection, I am not sure how could this happen but it does :D .
HikariCP version that we are using is 5.1.0.

@quaff
Copy link
Contributor

quaff commented Sep 27, 2024

I think it could be caused by SQL Server JDBC Driver itself, have you tried to reproduce by other database such as MySQL PostgreSQL?

@Ducius6
Copy link
Author

Ducius6 commented Sep 28, 2024

I have not tried, I am also unsure what happens to cause this on SQL Server. Hm could be the driver yes, stuck somewhere in the setNetworkTimeout..

@JJ-WM
Copy link

JJ-WM commented Oct 28, 2024

HI @Ducius6 I am seeing the same issue on azure sql, what's the version of azure sql driver are you using?

@JJ-WM
Copy link

JJ-WM commented Oct 28, 2024

msql jdbc driver version used is 12.6.3.jre11

@brettwooldridge
Copy link
Owner

brettwooldridge commented Nov 2, 2024

I feel like a broken record. Ensure that the JDBC driver socket timeout is configured. Ensure that the driver keepalive is configured, or that the OS TCP keepalive is configured.

Also check Google.

@joyaljohn
Copy link

@brettwooldridge Here, the connection timeout should be respected even if the validation request takes longer or encounters a broken record, correct? Typically, applications set the socket timeout to more than 5 minutes for long-running queries and the connection timeout to less than 10 seconds, they cannot wait for the connection request for more than 5 minutes.

@brettwooldridge
Copy link
Owner

brettwooldridge commented Nov 4, 2024

Connections that get hung during close, for example, will prevent connections from being added to the pool. Additionally, most OS's have a keepalive timeout of 2 hours which can also cause such issues. If the OS blocks a close, and literally blocks our thread, the timeout cannot be honored.

By the way, what is going on here?

Caused by: java.sql.SQLTransientConnectionException: cdr_collector_op_30000_db_1094001 - Connection is not
 available, request timed out after 30001ms (total=2, active=2, idle=0, waiting=11) at
 com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:686) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:179) at 
com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:144) at 
com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:99) at 

org.infobip.cdr.collector.service.persistence.queue.repository.impl.CdrDataRepositoryImpl.executeAggregationProcedure(
CdrDataRepositoryImpl.java:74) ... 23 common frames omitted Caused by: 
com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed. at 

com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:233) at 
com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:1848) at 
com.microsoft.sqlserver.jdbc.SQLServerConnection.setNetworkTimeout(SQLServerConnection.java:7228) at 
com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:562) at 
com.zaxxer.hikari.pool.PoolBase.isConnectionDead(PoolBase.java:169) at 
com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:168) at 

What is org.infobip.cdr.collector.service.persistence.queue.repository.impl.CdrDataRepositoryImpl.executeAggregationProcedure doing in the stack between HikariCP and the driver?

@Ducius6
Copy link
Author

Ducius6 commented Nov 13, 2024

msql jdbc driver version used is 12.6.3.jre11

we were using the same driver version when this happened. Now upgraded to 12.6.4 and haven't had the issue since

@Ducius6
Copy link
Author

Ducius6 commented Nov 13, 2024

Connections that get hung during close, for example, will prevent connections from being added to the pool. Additionally, most OS's have a keepalive timeout of 2 hours which can also cause such issues. If the OS blocks a close, and literally blocks our thread, the timeout cannot be honored.

By the way, what is going on here?

Caused by: java.sql.SQLTransientConnectionException: cdr_collector_op_30000_db_1094001 - Connection is not
 available, request timed out after 30001ms (total=2, active=2, idle=0, waiting=11) at
 com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:686) at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:179) at 
com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:144) at 
com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:99) at 

org.infobip.cdr.collector.service.persistence.queue.repository.impl.CdrDataRepositoryImpl.executeAggregationProcedure(
CdrDataRepositoryImpl.java:74) ... 23 common frames omitted Caused by: 
com.microsoft.sqlserver.jdbc.SQLServerException: The connection is closed. at 

com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:233) at 
com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:1848) at 
com.microsoft.sqlserver.jdbc.SQLServerConnection.setNetworkTimeout(SQLServerConnection.java:7228) at 
com.zaxxer.hikari.pool.PoolBase.setNetworkTimeout(PoolBase.java:562) at 
com.zaxxer.hikari.pool.PoolBase.isConnectionDead(PoolBase.java:169) at 
com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:168) at 

What is org.infobip.cdr.collector.service.persistence.queue.repository.impl.CdrDataRepositoryImpl.executeAggregationProcedure doing in the stack between HikariCP and the driver?

Application first fetches a connection from HikariPool in try with resources, and in the next step it executes java.sql CallableStatement.

It looks something like this:

` try (Connection connection = datasourceProvider.getDataSource().getConnection();
CallableStatement statement = connection.prepareCall(procedureCall)) {
statement.registerOutParameter(1, Types.INTEGER);
statement.setInt(2, value1);
statement.setInt(3, value2);
statement.execute();

        return DatabaseJobResult.forId(statement.getInt(1));
    } catch (SQLException e) {
        throw new RuntimeException("Failed to execute ...", e);
    }`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants