Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow authentication over HTTP #530

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

koszti
Copy link
Member

@koszti koszti commented Feb 5, 2025

Description

The Trino Python client currently rejects all authentication attempts over HTTP. While this seems like is a sensible security measure on the part of the Trino client, using an HTTP URL to access the Trino server does not inherently indicate an insecure environment.

This could be useful for testing purposes or in modern service meshes, such as those using SPIFFE authentication where injection of the spiffe headers and adding TLS are handled by spire and envoy separately in sidecars. Furthermore the trino server also has the option of http-server.authentication.allow-insecure-over-http=true (doc here) so it might be reasonable to not rejecting non-https authentication attempts in the client side.

This pull request includes the following improvements:

  • The SQLAlchemy dialect no longer automatically sets http_scheme to https when authentication is detected in the connection URL.
  • Authentication over HTTP is now allowed.
  • To authenticate over HTTP using SQLAlchemy, the http_scheme="http" connection argument is now respected. If http_scheme is not specified, the scheme defaults to https when the port is 443; otherwise, it defaults to http.

Example auth over HTTP:

- DBAPI

    ```python
    from trino.dbapi import connect
    from trino.auth import BasicAuthentication

    conn = connect(
        user="<username>",
        auth=BasicAuthentication("<username>", "<password>"),
        http_scheme="http",
        ...
    )
    ```

- SQLAlchemy

    ```python
    from sqlalchemy import create_engine

    # defaults to use http if the port is not 443
    engine = create_engine("trino://<username>:<password>@<host>:<port>/<catalog>")

    # or as connect_args
    from trino.auth import BasicAuthentication
    engine = create_engine(
        "trino://<username>@<host>:<port>/<catalog>",
        connect_args={
            "auth": BasicAuthentication("<username>", "<password>"),
            "http_scheme": "http",
        }
    )
    ```

Non-technical explanation

This PR adds functionality to optionally authenticate over HTTP.

Release notes

( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

* Allow authentication over HTTP ({issue}`530`)

@cla-bot cla-bot bot added the cla-signed label Feb 5, 2025
@koszti koszti force-pushed the allow-insecure-auth-arg branch 2 times, most recently from 550cadd to 84ceec7 Compare February 5, 2025 12:32
trino/client.py Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@wendigo wendigo requested a review from hashhar February 5, 2025 12:54
@koszti koszti force-pushed the allow-insecure-auth-arg branch from 84ceec7 to 0e8b860 Compare February 5, 2025 12:56
@hashhar
Copy link
Member

hashhar commented Feb 5, 2025

I think we should simply remove the check.

After all the server does reject auth over HTTP unless:

  • insecure over HTTP is allowed via config
  • OR the connection was forwarded from a LB that terminates TLS (signalled via X-Forwarded-Proto header in the request)

What does the JDBC driver do right now?

@koszti koszti force-pushed the allow-insecure-auth-arg branch from 0e8b860 to de652a1 Compare February 5, 2025 12:59
@koszti
Copy link
Member Author

koszti commented Feb 5, 2025

I think we should simply remove the check.

@hashhar , this needs to be tested, but removing it without introducing a new flag could lead to breaking changes. The SQLAlchemy dialect automatically sets http_scheme="https" for all authentication types (here), so we’d also need to remove that along with the actual check (here).

Would this mean that clients not explicitly specifying http_scheme would end up using an unknown protocol?

@hashhar
Copy link
Member

hashhar commented Feb 5, 2025

I'm saying to only remove these two lines

if self._http_scheme == constants.HTTP:
raise ValueError("cannot use authentication with HTTP")

It's server's job to disallow or allow whatever combination it wants.

The http_scheme should still remain obviously.

@koszti
Copy link
Member Author

koszti commented Feb 5, 2025

removing those two lines only doesn’t help unfortunately because the http_sceme is set to https automatically for all auth types. Client actually has no chance to overwrite it by passing http sceme in connect_args and they end up with https only whatever they do.

@hashhar
Copy link
Member

hashhar commented Feb 5, 2025

I think the SQLAlchemy dialect code is being too smart for no reason. It should not be automatically setting http_scheme in any case. Same as how the db-api doesn't do it. We can remove the code from there too.

The DB-API code (client.py TrinoRequest.__init__) will already anyway determine correct http_scheme from the URI itself. The only case when user needs to pass http_scheme explicitly will be for a bare URI which I don't think is common at all.

@koszti koszti force-pushed the allow-insecure-auth-arg branch from de652a1 to 731a0cc Compare February 6, 2025 09:11
@koszti koszti changed the title Add allow_insecure_auth connection arg Allow authentication over HTTP Feb 6, 2025
@koszti koszti force-pushed the allow-insecure-auth-arg branch from 731a0cc to f6cb973 Compare February 6, 2025 09:18
@koszti koszti force-pushed the allow-insecure-auth-arg branch from f6cb973 to 47ba278 Compare February 6, 2025 09:31
@koszti
Copy link
Member Author

koszti commented Feb 6, 2025

@hashhar I've updated the PR as per your recommendations. The SQLAlchemy dialect no longer sets the http_scheme automatically, and the client now accepts authentication over HTTP. No new arguments have been introduced. I've added more details to the PR description, including connection examples. Additionally, I've updated the tests and verified functionality with other Trino instances. What do you think?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants