[Bug]: LocalStack & @SqsListener - SdkClientException "Unable to execute HTTP request: Connection refused: /127.0.0.1:{portNumber}" during tests #7454
Replies: 12 comments 12 replies
-
Hi @daniel-frak, I've moved the issue to a discussion until we can identify whether it is an issue or not. Please consider raising a discussion next time in order to triage it, or join the Slack. I am not able to reproduce the issue with https://github.com/testcontainers/tc-guide-testing-aws-service-integrations-using-localstack. Did you change something? If SQS were not available, the test would fail, and that's not the case, right? I think this is related to the container/test shutdown. Once the test finishes, the container is killed and removed. At that point the SQS client is still trying to poll the LocalStack container, but it no longer exists. So the logs you see relate to the end of the test, not to the beginning, which I understand is your concern.
-
I have not changed the code in any way:
I'm running this on Ubuntu with OpenJDK 17 and Maven 3.9.0:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
java --version
openjdk 17.0.8.1 2023-08-24
OpenJDK Runtime Environment (build 17.0.8.1+1-Ubuntu-0ubuntu122.04)
OpenJDK 64-Bit Server VM (build 17.0.8.1+1-Ubuntu-0ubuntu122.04, mixed mode, sharing)
Apache Maven 3.9.0 (9b58d2bad23a66be161c4664ef21ce219c2c8584)
Maven home: /opt/maven
Java version: 17.0.8.1, vendor: Private Build, runtime: /usr/lib/jvm/java-17-openjdk-amd64
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "6.2.0-31-generic", arch: "amd64", family: "unix"

But the issue also happens on GitLab CI runners.

Upon further investigation, it seems you were right that it's related to container/test shutdown - when I add a

In production code, I also noticed exceptions related to acknowledgement of messages by an @SqsListener (among others), but I didn't include them so as not to muddy the issue. There I also get a

Finally, it seems like this is an issue that has been persisting for some time; looking at this random blog example: the author silences 'Connection refused' messages, which in that earlier version of the AWS library (which his code depends on) used to be logged as WARN instead of ERROR:

<!-- Noisy logs when shutting down the context, connection refuse messages for LocalStack -->
<logger name="io.awspring.cloud.messaging.listener" level="error"/>

Let me know if I can provide any more info.
-
I have the same issue: the SqsListener spams the logs with "connection refused" after the test has finished (so the LocalStack container is shut down) until the Spring context is destroyed. Is there any way to stop the listener directly after the test?
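One possible way (a sketch, not taken from this thread; the class name is illustrative) is to stop the context's Lifecycle beans at the end of the test class, before LocalStack is removed - assuming the SQS listener containers participate in Spring's Lifecycle handling, which they appear to, given that they keep polling until the context is destroyed:

import org.junit.jupiter.api.AfterAll;
import org.junit.jupiter.api.TestInstance;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.context.ConfigurableApplicationContext;

@SpringBootTest
@TestInstance(TestInstance.Lifecycle.PER_CLASS) // allows a non-static @AfterAll
class SqsListenerIntegrationTest {

    @Autowired
    private ConfigurableApplicationContext context;

    @AfterAll
    void stopLifecycleBeans() {
        // Stops the running Lifecycle beans (which should include the SQS listener
        // containers) without closing the cached test context.
        context.stop();
    }
}

Note the trade-off: the test context is cached and shared, so later test classes reusing it would start with those Lifecycle beans stopped.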
-
I have the exact same issue too. It also happens mostly on Jenkins.
-
I have one more quite minimalistic repo where this issue can be reproduced (though in my case it's DynamoDB sending these messages). It started happening with no code change, and now it happens for every commit. This is not the first time this error has happened to me; last time it was fixed by bumping the LocalStack image version, so I have a suspicion that the problem is on the LocalStack end.
-
My theory is that this happens both because Spring keeps the beans in a cache that can be reused later to save on context start-up time, and because requests that have already gone out are left hanging while the LocalStack SQS container is shut down. Our workaround is to mark the context as dirty (see the sketch below). Use whichever mode fits your needs best, but remember that the narrower the scope, the longer it takes, since a new context has to be spun up for every dirtied one. If you only need to do it once per test class, you only spin up the context one extra time; if you have 10 tests and dirty the context after each test, you spin it up 10 more times.
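For reference, a minimal sketch of that workaround (the test class name is illustrative); pick the classMode that matches the trade-off described above:

import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.annotation.DirtiesContext;
import org.springframework.test.annotation.DirtiesContext.ClassMode;

@SpringBootTest
// Close the application context after this test class, so no cached beans keep
// polling a LocalStack container that no longer exists. A fresh context is created
// for the next test class that needs one.
@DirtiesContext(classMode = ClassMode.AFTER_CLASS)
class SqsListenerIntegrationTest {
    // tests...
}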
-
I think the problem basically boils down to this:
And that's it, basically. There's no easy way around this - you can disable Ryuk with an environment variable, but then you get no cleanup at all. I think the "easiest" solution is to turn off logging for
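For example, the logback equivalent for the logger name that shows up in the error output of current Spring Cloud AWS versions (see the original report at the bottom of this page) would be something like the following - with the obvious caveat that it also hides genuine polling errors:

<!-- Silence the "Error polling for messages in queue" noise that appears while the
     context shuts down after LocalStack has already been removed. -->
<logger name="io.awspring.cloud.sqs.listener.source.AbstractPollingMessageSource" level="OFF"/>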
-
The problem is that sometimes the processes executed by Spring's JVM shutdown hook expect and require a working connection (e.g. to the DB) that is handled by Testcontainers. In the examples above the main problem is just log spam, so the warnings can be ignored, but this is not always the case. E.g. Spring Integration executes

When Testcontainers provides this connection, all connections in the pool become invalid before the pool itself is shut down. Example:

Marking each method with

I think we need a mechanism to synchronize the shutdown of Testcontainers and the Spring context. They should not shut down in parallel. They should shut down in a guaranteed order, where Testcontainers always shuts down last (the same way it always starts first). Maybe Spring's
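One way to approximate that ordering today (a sketch, not a built-in Testcontainers mechanism; a later comment in this thread shows the same pattern for LocalStack) is to let the Spring context own the container as a bean, so it is stopped as part of the context shutdown instead of by a separate Testcontainers JVM shutdown hook racing it. The Postgres flavour and names are illustrative, and how the dependent beans (DataSource, pool) get the container's connection details is omitted here - they still need to be wired to the container bean (e.g. via @ServiceConnection in Spring Boot 3.1+) for the destruction order to work out:

import org.springframework.boot.test.context.TestConfiguration;
import org.springframework.context.annotation.Bean;
import org.testcontainers.containers.PostgreSQLContainer;
import org.testcontainers.utility.DockerImageName;

@TestConfiguration(proxyBeanMethods = false)
class DatabaseContainerConfig {

    // Started when the context is created and stopped via destroyMethod during
    // context close, i.e. inside Spring's ordered shutdown rather than in an
    // unrelated JVM shutdown hook.
    @Bean(initMethod = "start", destroyMethod = "stop")
    PostgreSQLContainer<?> postgres() {
        return new PostgreSQLContainer<>(DockerImageName.parse("postgres:16-alpine"));
    }
}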
-
We experienced this issue more frequently after a recent round of library upgrades, which included Testcontainers, LocalStack, Camel and Spring Boot. The way we addressed it was by creating an @EventListener that stops all Camel routes when the application starts shutting down:

/**
* This is a **Test** [EventListener] used to stop all Camel routes when the application begins the process
* of shutting down by switching to a [ReadinessState.REFUSING_TRAFFIC] state.
*
* This is a hack to avoid the situation occurring after running tests where the _localstack_ shutdown hook
* terminates the process before the Camel shutdown hook performs its shutdown routine as is described in the
* following [testcontainers discussion](https://github.com/testcontainers/testcontainers-java/discussions/7454).
*/
@EventListener
fun stopRoutesOnShutdown(event: AvailabilityChangeEvent<*>) {
val source = event.source
if (event.state == ReadinessState.REFUSING_TRAFFIC && source is ApplicationContext) {
logger.info { "Event received: eventState=${event.state}. Shutting down Camel routes in reverse startup order.." }
with(source.getBean<CamelContext>()) {
routes.sortedByDescending { it.startupOrder }.forEach { route ->
routeController.stopRoute(route.routeId, 250, TimeUnit.MILLISECONDS)
}
}
}
}
-
My update on this issue: I was able to prevent this error from happening by enabling container reuse:

public abstract class IntegrationTest {
static final LocalStackContainer localstack =
new LocalStackContainer(DockerImageName.parse("localstack/localstack:3.2.0"))
.withServices(LocalStackContainer.Service.SQS)
.withReuse(true);
static {
TestcontainersConfiguration.getInstance()
.updateUserConfig("testcontainers.reuse.enable", "true");
}
@BeforeAll
static void init() {
localstack.start();
localstack.followOutput(new Slf4jLogConsumer(LoggerFactory.getLogger("localstack")));
}
}

In case you're running your tests in the context of the Spring Boot extension, another option would be to manage the container lifecycle through Spring:

@SpringBootTest
public abstract class IntegrationTest {
}
@Configuration
public class TestContainersConfig {
@Bean(initMethod = "start", destroyMethod = "stop")
public LocalStackContainer container() {
return new LocalStackContainer(DockerImageName.parse("localstack/localstack:3.2.0"))
.withServices(LocalStackContainer.Service.SQS);
}
}
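A small addition (an assumption on my part, since the snippet above doesn't show it): the configuration class still needs to be part of the test context, and the application still needs the container's endpoint, e.g. via @ServiceConnection (Spring Boot 3.1+) or a dynamic property. Pulling the config in could look like:

@SpringBootTest
@Import(TestContainersConfig.class)
public abstract class IntegrationTest {
}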
-
According to the testcontainers commit that introduced this issue last October:
As per the above description, this change was aimed at expediting Ryuk's finalisation routine, but it has introduced a whole raft of other issues when components such as Spring JMS or Camel routes that depend on the services monitored by Ryuk are suddenly terminated - akin to having the rug pulled out from under them while they are still in the middle of their own shutdown. Should we consider making the shutdown hook above opt-in until a more permanent solution has been found?
-
Module
LocalStack
Testcontainers version
1.19.0
Using the latest Testcontainers version?
Yes
Host OS
Linux
Host Arch
x86
Docker version
Client: Docker Engine - Community
 Version:           24.0.5
 API version:       1.43
 Go version:        go1.20.6
 Git commit:        ced0996
 Built:             Fri Jul 21 20:35:18 2023
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          24.0.5
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.20.6
  Git commit:       a61e2b4
  Built:            Fri Jul 21 20:35:18 2023
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.22
  GitCommit:        8165feabfdfe38c65b599c4993d227328c231fca
 runc:
  Version:          1.1.8
  GitCommit:        v1.1.8-0-g82f18fe
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
What happened?
When using the LocalStack container in tests with an @SqsListener, the logs get spammed with software.amazon.awssdk.core.exception.SdkClientException for a while, even though the container is supposedly already running. It seems that Testcontainers is not quite ready after the "Ready." log.
This can make CI logs unreadable, as sometimes the LocalStack initialization takes a long time, overwhelming the log file with errors.
Relevant log output
Additional Information
To reproduce the issue, the official Testcontainers example can be cloned:
https://github.com/testcontainers/tc-guide-testing-aws-service-integrations-using-localstack
The MessageListenerTest::shouldHandleMessageSuccessfully test creates the attached log with:
ERROR io.awspring.cloud.sqs.listener.source.AbstractPollingMessageSource - Error polling for messages in queue
I have also tested this in my own project, using the latest version of Testcontainers (1.19.0), and the result is the same.
Additionally, I have tried creating the queue using a LocalStack init script (/etc/localstack/init/ready.d/init-aws.sh), but the error persists. I have tried explicitly waiting for my SQS queue to be created:
This also does not fix the error.
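For context, such an init script is usually just an awslocal call placed at the path mentioned above; a minimal sketch (the queue name is illustrative, not taken from the reproducer):

#!/bin/bash
# Placed at /etc/localstack/init/ready.d/init-aws.sh inside the LocalStack container;
# LocalStack runs it once the container reports "Ready."
awslocal sqs create-queue --queue-name test-queue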