Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tools: run integration tests with testcontainers-go #519

Merged
merged 11 commits into from
Feb 20, 2024

Conversation

mdelapenya
Copy link
Contributor

@mdelapenya mdelapenya commented Jan 17, 2024

What is this PR solving?

In #517, we discussed the opportunity to include tests with real dependencies for the datastores used in the project (at the moment, MySQL and Postgres), so that it's possible to verify the real behaviour of their integration for any PRs or merge commit.

The rationale and motivation to add have this integration tests is that it would be possible to detect potential bugs for this store layer, the soonest, when it's still cheap to fix them.

And it's cheap because it'll be possible to run those tests with confidence and in a reliable manner both locally and in the CI: so anybody cloning the project would be able to run the integration tests (ITs) exactly the same as in the CI. As a consequence, there is no need to add an env var to include the tests in the execution, as the backing containers for the datastores will be automatically created from the test code (Infra-as-code to the rescue).

Another important thing for having these ITs is to use real dependencies for testing the integrations, exactly as in production. I'm choosing mysql:8 and postgres:13 but please let me know if other versions are the reference ones.

As mentioned in the issue, the langchain4j project already uses testcontainers-java for creating and running the ITs (please see https://github.com/langchain4j/langchain4j/pulls?q=is%3Apr+testcontainers+is%3Aclosed), so it could be considered as a good reference to adopt testcontainers-go based tests.

Closes #517

PR Checklist

  • Read the Contributing documentation.
  • Read the Code of conduct documentation.
  • Name your Pull Request title clearly, concisely, and prefixed with the name of the primarily affected package you changed according to Good commit messages (such as memory: add interfaces for X, Y or util: add whizzbang helpers).
  • Check that there isn't already a PR that solves the problem the same way to avoid creating a duplicate.
  • Provide a description in this PR that addresses what the PR is solving, or reference the issue that it solves (e.g. Fixes #123).
  • Describes the source of new concepts.
  • References existing implementations as appropriate.
  • Contains test coverage for new functions.
  • Passes all golangci-lint checks.

@mdelapenya
Copy link
Contributor Author

@tmc and community, if you like this approach I can send follow-up PRs for the rest of tests that are skipped on CI for the vector databases (vectorstores package) whenever the Docker image exists and satisfies the needs of the project. Please let me know anything you need from me here 🙏

Copy link
Owner

@tmc tmc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be straightforward to keep the DSN reading approach? I like that this would allow folks to keep a db up for testing and inspection.

Thanks so much for your contribution!

@mdelapenya
Copy link
Contributor Author

Do you mean that, for local development, not testing, it would be convenient to provide the backend service (MySQL, Postgres) for manual inspection? If that's the case, I have something in mind: https://www.atomicjar.com/2023/08/local-development-of-go-applications-with-testcontainers/

In a nutshell:

  • leverage a build tag, e.g. dev, which is used for local development
  • leverage Go's init function to start up the dependencies of interest
  • start the app passing the build tag flag: go run -v -tags dev ./... in order to start in dev mode with the dependencies

If that's the case, I can work on it plus clearly document the setup

@tmc
Copy link
Owner

tmc commented Jan 21, 2024

I think I'd prefer to have some env vars that if they are already populated we skip the database container spinup and run the tests against the provided connection string.

@mdelapenya
Copy link
Contributor Author

I'd prefer to have some env vars that if they are already populated we skip the database container spinup and run the tests against the provided connection string.

✅ Done in 8e3b9c3

* main:
  googleai: add initial Vertex (GCP) implementation of Model (tmc#540)
  docs: add pgvector page
* main:
  googleai: refactor to better separate generated code (tmc#547)
  googleai: add embeddings to vertex (tmc#546)
  embeddings: add cybertron local embeddings
  googleai: move the PaLM provider into googleai (tmc#541)
@tmc
Copy link
Owner

tmc commented Jan 27, 2024

Really appreciate you adding this! I have used ory/dockertest in the past and before merging this I'd at least some basic analysis of how they differ. The number of added dependencies here (while they are test dependencies) is a tad high.

I should get to this later today but if someone else wants to supply some fair analysis of the options I'd appreciate it.

@mdelapenya
Copy link
Contributor Author

I usually use https://deps.dev/go/github.com%2Ftestcontainers%2Ftestcontainers-go/v0.27.0/dependencies/graph to analyze the incoming deps. Dockertest comes with a few less, but please consider the value of the ones in the extra delta.

Regarding your previous experience with Dockertest I couldn't say anything against another wonderful OSS library, as it's always done a great job. It's a matter of a personal preference on how to use it and how much value it provides. As a maintainer, I can only offer my assistance for any doubt you have.

@mdelapenya
Copy link
Contributor Author

@tmc any thought on this? Please let me know if I can be of any help to support any decision you take.

@mdelapenya
Copy link
Contributor Author

Hi @tmc I'd like to contribute more tests for the rest of the vector databases, even creating specific modules for them, but I'd need to validate the efforts through this PR. Is there anything I could do for this PR?

@mdelapenya
Copy link
Contributor Author

@tmc in case you are interested, we have added testcontainers-go modules for the majority of the vector databases this project is using: testcontainers/testcontainers-go#2245.

Hope they support any decision you want to take in the case you want to run the integration tests in the CI.

Thanks in advance!

@tmc
Copy link
Owner

tmc commented Feb 20, 2024

Thanks! Added skipping in case docker is not present/available, LGTM!

Copy link
Owner

@tmc tmc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tmc tmc enabled auto-merge February 20, 2024 23:34
@tmc tmc merged commit 7193152 into tmc:main Feb 20, 2024
3 checks passed
@mdelapenya mdelapenya deleted the testcontainers-go branch February 21, 2024 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run integration tests for the database layer (Mysql & Postgres) using testcontainers-go
2 participants