
Wrong docker compose for Clickhouse + Hasura Setup #9964

Closed
tushar5526 opened this issue Nov 3, 2023 · 23 comments · May be fixed by #9965
Labels
c/v3-ndc-clickhouse k/bug Something isn't working

Comments

@tushar5526

Version Information

Server Version:
CLI Version (for CLI related issue):

Environment

OSS

What is the current behaviour?

Health check error in the ClickHouse + Hasura setup.

What is the expected behaviour?

Services should start normally.

How to reproduce the issue?

https://github.com/hasura/graphql-engine/blob/master/install-manifests/enterprise/clickhouse/docker-compose.yaml

Both the data-connector and hasura services are exposed on port 8080. The logs also show that the data connector runs internally on port 8080, so the correct port mapping should be 8081:8080.
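For example, the ports section could look like this (just a sketch of the intended mapping, assuming the connector keeps listening on 8080 inside the container):

services:
  hasura:
    ports:
      - 8080:8080   # GraphQL engine keeps host port 8080
  data-connector-agent:
    ports:
      - 8081:8080   # host 8081 maps to the connector's internal 8080, avoiding the clash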

Screenshots or Screencast

Please provide any traces or logs that could help here.

Any possible solutions/workarounds you're aware of?

Keywords

@tushar5526 tushar5526 added the k/bug Something isn't working label Nov 3, 2023
@tushar5526 tushar5526 changed the title Same ports are exposed in docker compose Wrong docker compose for Clickhouse + Hasura Setup Nov 3, 2023
@dameleney
Contributor

Are you having any issues when you update the port number for the ClickHouse data connector and its health check to 8081? @BenoitRanque can update the docker-compose file.

@tushar5526 Would you find it useful if we updated the docker compose file to also spin up an OSS instance of ClickHouse, or are you planning to test Hasura OSS with ClickHouse Cloud? We have created similar docker compose files for OSS partners such as CockroachDB. At a minimum, we can include the lines needed to spin up ClickHouse OSS and comment them out.
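For illustration, the commented-out lines could look roughly like this (untested sketch; the image tag, credentials, and volume name are placeholders):

  # clickhouse:
  #   image: clickhouse/clickhouse-server    # pin a specific tag in practice
  #   environment:
  #     CLICKHOUSE_USER: default
  #     CLICKHOUSE_PASSWORD: changeme        # placeholder credential
  #   ports:
  #     - 8123:8123                          # HTTP interface the data connector talks to
  #   volumes:
  #     - clickhouse_data:/var/lib/clickhouse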

@tushar5526
Author

The docker compose file I am using:

version: "3.7"
services:
  redis:
    image: redis:7
    restart: always
    # ports:
    #   - 6379:6379
  postgres:
    image: postgres:15
    restart: always
    volumes:
      - db_data:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: postgrespassword
  hasura:
    image: hasura/graphql-engine:v2.35.0
    restart: always
    ports:
      - 8080:8080
    environment:
      HASURA_GRAPHQL_DATABASE_URL: postgres://postgres:postgrespassword@postgres:5432/postgres
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true"
      HASURA_GRAPHQL_DEV_MODE: "true"
      HASURA_GRAPHQL_REDIS_URL: redis://redis:6379
      HASURA_GRAPHQL_RATE_LIMIT_REDIS_URL: "redis://redis:6379"
      HASURA_GRAPHQL_MAX_CACHE_SIZE: "200"
      HASURA_GRAPHQL_METADATA_DEFAULTS: '{"backend_configs":{"dataconnector":{"clickhouse":{"uri":"http://data-connector-agent:8080"}}}}'
    # depends_on:
    #   data-connector-agent:
    #     condition: service_healthy
  data-connector-agent:
    image: hasura/clickhouse-data-connector:v2.32.0
    restart: always
    ports:
      - 8081:8080
    # healthcheck:
    #   test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
    #   interval: 5s
    #   timeout: 10s
    #   retries: 5
    #   start_period: 5s

  clickhouse:
    image: docker.io/bitnami/clickhouse:23
    environment:
      CLICKHOUSE_USER: user
      CLICKHOUSE_PASSWORD: secret
    ports:
      - '8123:8123'
    volumes:
      - clickhouse_data:/bitnami/clickhouse
volumes:
  db_data:
  clickhouse_data:
    driver: local

I had to turn off the health check because it was failing, and Hasura would not even start because of it.

After connecting ClickHouse in Hasura, I can't see any tables to track. I ingested the sample NYC taxi dataset https://clickhouse.com/docs/en/getting-started/example-datasets/nyc-taxi from ClickHouse's demo examples.

(Two screenshots of the Hasura console, 2023-11-04 at 1:08 PM.)

The error I am getting after connecting the ClickHouse instance in Hasura:

hasura-clickhouse-data-connector-agent-1  | {"timestamp":"   0.000020584s","level":"INFO","fields":{"message":"init logging & tracing"},"target":"init_tracing_opentelemetry::tracing_subscriber_ext"}
hasura-clickhouse-data-connector-agent-1  | {"timestamp":"   0.000131500s","level":"DEBUG","fields":{"key":"service.name","value":"unknown_service"},"target":"otel::setup::resource"}
hasura-clickhouse-data-connector-agent-1  | {"timestamp":"   0.000134334s","level":"DEBUG","fields":{"key":"os.type","value":"linux"},"target":"otel::setup::resource"}
hasura-clickhouse-data-connector-agent-1  | {"timestamp":"   0.000155167s","level":"DEBUG","fields":{"OTEL_EXPORTER_OTLP_TRACES_ENDPOINT":"http://localhost:4318"},"target":"otel::setup"}
hasura-clickhouse-data-connector-agent-1  | {"timestamp":"   0.000157167s","level":"DEBUG","fields":{"OTEL_EXPORTER_OTLP_TRACES_PROTOCOL":"http/protobuf"},"target":"otel::setup"}
hasura-clickhouse-data-connector-agent-1  | {"timestamp":"   0.002722375s","level":"DEBUG","fields":{"OTEL_TRACES_SAMPLER":"\"parentbased_always_on\""},"target":"otel::setup"}
hasura-clickhouse-data-connector-agent-1  | {"timestamp":"   0.002760542s","level":"DEBUG","fields":{"OTEL_PROPAGATORS":"tracecontext,baggage"},"target":"otel::setup"}
hasura-clickhouse-data-connector-agent-1  | {"timestamp":"   0.000593500s","level":"INFO","fields":{"message":"Server listening on port 8080"},"target":"clickhouse_gdc"}
hasura-clickhouse-data-connector-agent-1  | OpenTelemetry trace error occurred. error sending request for url (http://localhost:4318/): error trying to connect: tcp connect error: Address not available (os error 99)
hasura-clickhouse-data-connector-agent-1  | OpenTelemetry trace error occurred. error sending request for url (http://localhost:4318/): error trying to connect: tcp connect error: Address not available (os error 99)
hasura-clickhouse-data-connector-agent-1  | OpenTelemetry trace error occurred. error sending request for url (http://localhost:4318/): error trying to connect: tcp connect error: Address not available (os error 99)

I hope it helps!

@tushar5526
Author

I don't have much experience with OpenTelemetry, but it seems the service is trying to send telemetry data somewhere and failing. I can't figure out what I am missing here.

@tushar5526
Author

tushar5526 commented Nov 6, 2023

Hi @dameleney, I also tried setting up a Hasura Cloud instance and added the ClickHouse data connector agent. I am facing the same error there as well: the ClickHouse DB is added successfully, but I cannot see any tables to track in it. (I assume the same errors are popping up in the cloud instance as the ones I am seeing locally.)

Was there any release/version that was working before? Let me know if there is something I can help, looking forward to using this. Thanks!

@BenoitRanque
Contributor

BenoitRanque commented Nov 6, 2023

Two issues:

  1. The port mapping is incorrect; it should be 8081:8080.
  2. The healthcheck is incorrect. The image is based on scratch, so curl is not available.
     Working on a fix now; this will require publishing a new image.

Please note you should be able to work around this issue by temporarily removing the healthcheck.
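For illustration, the temporary workaround could look like this (untested sketch of the relevant services; the healthcheck and the service_healthy condition are dropped until the fixed image is published):

  hasura:
    depends_on:
      - data-connector-agent          # plain dependency, no healthy condition
  data-connector-agent:
    image: hasura/clickhouse-data-connector:v2.32.0
    restart: always
    ports:
      - 8081:8080
    # healthcheck removed: the image is built from scratch, so curl is not available inside it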

Was there any release/version that was working before?

Yes, this should be working, so I'm unsure what the issue is. Will look further into this. Failure to send telemetry should be a non-issue; will look into whether we can silence those logs when no telemetry target is configured.
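One possible interim way to quiet the telemetry errors from the compose file (untested sketch; it assumes the connector's OpenTelemetry setup honors the standard SDK environment variables, which I haven't verified):

  data-connector-agent:
    environment:
      # Standard OpenTelemetry SDK variables: disabling the SDK (or setting the
      # traces exporter to none) should stop the export attempts to localhost:4318.
      OTEL_SDK_DISABLED: "true"
      # OTEL_TRACES_EXPORTER: none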

@tushar5526
Author

tushar5526 commented Nov 6, 2023

Please note you should be able to work around this issue by temporarily removing the healthcheck.

@BenoitRanque I was able to add the ClickHouse data connector in Hasura, but I could not see any ClickHouse tables on either the Docker or the cloud instance of Hasura; the tables are not getting tracked.

For testing, I ran a ClickHouse instance on a public server and imported the demo taxi data from ClickHouse's website.

Please refer to the screenshots above, thanks.

@BenoitRanque
Contributor

@tushar5526 Confirming the issue. It seems there's a missing third-party library of some kind, which causes an issue at runtime when building from scratch. This went unnoticed during recent efforts to reduce image size. We have a PR with the required fix.

@tushar5526
Author

tushar5526 commented Nov 8, 2023

Thanks for the quick fixes.

I couldn't find the repo that controls the ClickHouse connector Docker image: https://hub.docker.com/r/hasura/clickhouse-data-connector/tags.

It would be helpful if you could get the docker image updated as well.

@tushar5526
Author

@BenoitRanque PS: I tried the updated GDC repo locally and I am still not able to track my tables in ClickHouse. I suppose there are other PRs yet to be created, or were you talking about hasura/clickhouse_gdc_v2#9?

@BenoitRanque
Contributor

@tushar5526 we released hasura/clickhouse-data-connector:v2.35.0, can you try using that?

We're also updating the sample docker compose file.

Please let us know if this works
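For reference, the connector service entry would then look roughly like this (sketch; it assumes the new image still listens on port 8080 internally):

  data-connector-agent:
    image: hasura/clickhouse-data-connector:v2.35.0
    restart: always
    ports:
      - 8081:8080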

@tushar5526
Author

@BenoitRanque there are no health check errors anymore, and I can connect to a DB now, but I still cannot track any tables present in ClickHouse. Can you verify whether the data connector is behaving as expected?

@BenoitRanque
Contributor

@tushar5526 we've released a new version 2.35.1 with a fix for the introspection issue.

The problem caused introspection to return no tables. It seems some assumptions about some system enums did not hold true in all environments.

Can you let us know if this fixes the issue? Thank you for your patience!

@tushar5526
Author

@BenoitRanque this is working nicely; I can see the tables. Thanks for the help, folks. Looping in @choxx. I can help with the docker-compose if needed; otherwise, this issue is resolved.

@choxx

choxx commented Nov 16, 2023

Hi @BenoitRanque, the fix is indeed working and works fine for normal tables.
But for some foreign tables, it fails to load the tables list.

Sharing the details of the API call that responds with 400:

Request

URL: POST /v1/metadata
Payload:

{"type":"dataconnector_get_source_tables","args":{"source":"posthog"}}

Response

{
    "error": "error decoding response body: unknown variant `FOREIGN TABLE`, expected `BASE TABLE` or `VIEW` at line 54 column 32",
    "path": "$.args",
    "code": "data-connector-error",
    "internal": null
}


ClickHouse version: 22.3.18

Could you please help?

@BenoitRanque
Contributor

BenoitRanque commented Nov 17, 2023

@choxx acknowledging this issue.

This is happening because you have a table of type FOREIGN TABLE in the information schema.
Will fix ASAP. Working out whether we can safely treat those as normal tables, or whether we should exclude them from the schema.

@BenoitRanque
Contributor

@choxx The ClickHouse documentation mentions FOREIGN TABLE as a possible value for the table type here.
I could not find other mentions.

Two questions:

  • How are these tables created? Could you share sample SQL?
  • Do you need these tables exposed in your API?

You can use this SQL to find table names and types:

SELECT
    tables.table_name AS "name",
    tables.table_type AS "table_type"
FROM INFORMATION_SCHEMA.TABLES AS tables
WHERE tables.table_catalog = currentDatabase()
AND tables.table_type IN ('BASE TABLE', 'VIEW', 'FOREIGN TABLE')

We can either exclude them, which is easiest but may not be what you need.
Or we can include them, but can only do so if they behave like normal tables.
Specifically, we need to be able to mention those tables in complex SQL queries.

We'd appreciate any help or pointers.

@choxx

choxx commented Nov 18, 2023

How are these tables created? Could you share sample SQL?

These tables were created by a third-party tool we are using (Posthog). Table creation and migrations are all managed by Posthog, so we don't have much control over them.

Do you need these tables exposed in your API?

For our current use case, we don't need these tables exposed over Hasura. We also have no idea whether these tables would behave like normal tables or not.

Using the query you mentioned above, it seems all the foreign tables are related to Kafka:

│ kafka_events_dead_letter_queue             │ FOREIGN TABLE │
│ kafka_events_json                          │ FOREIGN TABLE │
│ kafka_groups                               │ FOREIGN TABLE │
│ kafka_person                               │ FOREIGN TABLE │
│ kafka_person_distinct_id                   │ FOREIGN TABLE │
│ kafka_person_distinct_id2                  │ FOREIGN TABLE │
│ kafka_plugin_log_entries                   │ FOREIGN TABLE │
│ kafka_session_recording_events             │ FOREIGN TABLE │

@BenoitRanque
Contributor

@choxx We've published v2.35.2 which should fix this issue. For the time being we've opted to exclude foreign tables, so you won't see them from Hasura when using the connector. This may change in future versions if we're able to determine that there is a need for it, and that foreign tables behave as expected in the complex queries we generate.

Please let us know if this fixes the problem so we can close the issue.

@dameleney
Contributor

@choxx @tushar5526 Have your issues been resolved? My understanding is that you do not need foreign tables for your use case. Would this be a useful enhancement to make in the future?

@tushar5526
Author

Hey @BenoitRanque @dameleney, we wrote our own custom APIs on top of our DB since we needed that urgently. I can help with testing whether the foreign table issue persists if you want any input on that front; otherwise, you can close this issue.

Much appreciate all the help!

@spkprav

spkprav commented Jan 3, 2024

Hi @BenoitRanque, Happy New Year. I tried a fresh setup with v2.36.0 and the issue still exists. I have a simple MergeTree engine table ('BASE TABLE') in ClickHouse that doesn't show up under untracked tables/views. Here's my docker-compose.yml:

version: "3.7"
services:
  redis:
    image: redis:7
    restart: always
  hasura:
    image: hasura/graphql-engine:v2.36.0
    restart: always
    ports:
      - 8080:8080
    environment:
      ## Add your license key below
      # HASURA_GRAPHQL_EE_LICENSE_KEY: ""
      HASURA_GRAPHQL_ADMIN_SECRET: myadminsecretkey
      ## The metadata database for this Hasura GraphQL project. Can be changed to a managed postgres instance
      HASURA_GRAPHQL_DATABASE_URL: postgresql://[email protected]/clickhouse_pg_db
      # HASURA_GRAPHQL_READ_REPLICA_URLS: postgres://postgres:postgrespassword@postgres:5432/postgres

      ## Optional settings
      ## enable the console served by server
      HASURA_GRAPHQL_ENABLE_CONSOLE: "true"
      ## enable required apis; metrics api exposes a prometheus endpoint, uncomment to enable
      # HASURA_GRAPHQL_ENABLED_APIS: 'graphql,metadata,config,developer,pgdump,metrics'
      ## secure metrics endpoint with a secret, uncomment to enable
      # HASURA_GRAPHQL_METRICS_SECRET: 'secret'
      ## enable debugging mode. It is recommended to disable this in production
      HASURA_GRAPHQL_DEV_MODE: "true"
      # HASURA_GRAPHQL_LOG_LEVEL: debug
      ## enable offline console assets if you wish to access console without internet connectivity
      # HASURA_GRAPHQL_CONSOLE_ASSETS_DIR: "/srv/console-assets"
      HASURA_GRAPHQL_REDIS_URL: redis://redis:6379
      HASURA_GRAPHQL_RATE_LIMIT_REDIS_URL: "redis://redis:6379"
      HASURA_GRAPHQL_MAX_CACHE_SIZE: "200"
      # Configures the connection to the Data Connector agent for Clickhouse by default
      # You can also omit this and manually configure the same thing via the 'Data' tab, then 'Add Agent'
      # in the Hasura console
      HASURA_GRAPHQL_METADATA_DEFAULTS: '{"backend_configs":{"dataconnector":{"clickhouse":{"uri":"http://data-connector-agent:8080"}}}}'
    depends_on:
      data-connector-agent:
        condition: service_healthy
  data-connector-agent:
    image: hasura/clickhouse-data-connector:v2.36.0
    restart: always
    ports:
      - 8081:8080
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 5s
      timeout: 10s
      retries: 5
      start_period: 5s
  clickhouse:
    image: clickhouse/clickhouse-server
    user: "101:101"
    container_name: clickhouse
    hostname: clickhouse
    volumes:
      - ${PWD}/fs/volumes/clickhouse/etc/clickhouse-server/config.d/config.xml:/etc/clickhouse-server/config.d/config.xml
      - ${PWD}/fs/volumes/clickhouse/etc/clickhouse-server/users.d/users.xml:/etc/clickhouse-server/users.d/users.xml
    ports:
      - "127.0.0.1:8123:8123"
      - "127.0.0.1:9000:9000"

@spkprav

spkprav commented Jan 11, 2024

The tables do show up when I use the default database; it doesn't work with other database names, but this is fine for me as of now. Thanks @BenoitRanque for the help.

@manasag
Contributor

manasag commented Apr 11, 2024

Closing this issue. Also note that ticket #10094 covers the issue where tables in non-default databases don't show up.

@manasag manasag closed this as completed Apr 11, 2024