how to restrict which operators can be used and which props can be used for ordering #2442

adrian-gierakowski · 2022-08-22T09:33:06Z

Environment

PostgreSQL version: (if using docker, specify the image)

14.3

PostgREST version: (if using docker, specify the image)

9.0.1.20220717

Operating system:

linux (nixos, debian)

Description of issue

By default, users of the api served by postgrest can execute queries using all available operators and sort the results based on all available columns. I'd like to be able to restrict which operators can be used (and with which columns) as well as only enable ordering on specific columns. I don't see any mention of such capabilities in the postgrest docs. What's the recommended way of doing this?

Thanks!

steve-chavez · 2022-08-23T00:35:34Z

This would be an interesting feature. I don't know the interface yet but Ideally we'd reuse some PostgreSQL mechanism.

which operators can be used (and with which columns) as well as only enable ordering on specific columns.

For example, if we would expose operators based on indexes, I see that CREATE OPERATOR CLASS has a FOR SEARCH | FOR ORDER BY distinction.

The CREATE OPERATOR FAMILY can be schema namespaced, it would be great if we could have an operator family for postgREST.

(Mostly thinking out loud above, those interfaces might not work)

Another idea on #2028 (comment)

What's the recommended way of doing this?

For now you could use functions/views and only expose the columns you want to be searched on. The most restrictive way would be scalar functions(that don't return table valued types) which wouldn't allow any filter or order, those would have to be done internally.

Ideally pg would get the hint from:

grant EXECUTE on FUNCTION pg_catalog.int4eq(test.projects.id%TYPE,integer) TO postgrest_test_anonymous;
grant EXECUTE on FUNCTION pg_catalog.int4eq(test.projects.client_id%TYPE,integer) TO postgrest_test_anonymous;

Tthat =(int4eq) is only allowed on the id and client_id columns.

steve-chavez · 2022-08-23T01:08:11Z

Another idea on #2028 (comment)

Using the YAML idea above but namespacing the configs to our role, it could be like:

ALTER ROLE authenticator SET pgrst.operators TO $$
available: 
  -lt: '<'
 - gt: '>' 
 - like: 'LIKE'
schema:
  table:
    - column1: lt, gt
    - column2: like
    - order: column3
$$;

steve-chavez · 2022-08-23T17:38:21Z

@adrian-gierakowski Instead of whitelisting the operations, what do you think of disallowing requests by their plan cost using the pg_plan_filter extension?

Using https://www.postgresql.org/docs/current/pgstatstatements.html, it seems possible to determine the query with the highest cost and set that as a cost limit.

adrian-gierakowski · 2022-08-23T20:32:36Z

@adrian-gierakowski Instead of whitelisting the operations, what do you think of disallowing requests by their plan cost using the pg_plan_filter extension?

This does not sound like what I'm looking for.

What I'm looking for is something which would allow me to start by crafting a minimal API for a particular purpose, and then easily expand it as the need arises, making sure that any allowed queries can be server efficiently. The solution I'm looking for should also enable the generated OpenApi spec to reflect exactly what is allowed. And error messages returned for the rejected requests should clearly indicate why the reason for rejection (for example: "ordering by name is not allowed").

I could obviously slap a reverse proxy in front of the postgrest instance and reject unwanted queries, but that would required a re-implementation of the parser for the not so trivial postgrest query language on the proxy, which I think doesn't make sense, given that postgrest does the parsing already.

steve-chavez · 2022-08-23T21:58:10Z

Fair points.

Some other ideas. Ideally pg would get the hint from:

grant EXECUTE on FUNCTION pg_catalog.int4eq(test.projects.id%TYPE,integer) TO postgrest_test_anonymous;
grant EXECUTE on FUNCTION pg_catalog.int4eq(test.projects.client_id%TYPE,integer) TO postgrest_test_anonymous;

That eq/=(int4eq) is only allowed on the id and client_id columns for the test.projects relation but that doesn't work.

Another idea is to have a convention on a function like:

CREATE FUNCTION eq(test.projects, integer) RETURNS bool
AS $_$ 
  SELECT $1.id = $2 
$_$ LANGUAGE SQL STABLE;

That wouldnn't work as well because we need the function to apply for integer types coming from different columns. Defining operators in this way could get long too.

Another option would be

CREATE FUNCTION "pgrst.projects.id.eq"(test.projects, test.projects.id%TYPE) RETURNS bool
AS $_$ 
  SELECT $1.id = $2 
$_$ LANGUAGE SQL STABLE;

CREATE FUNCTION "pgrst.projects.client_id.eq"(test.projects, test.projects.client_id%TYPE) RETURNS bool
AS $_$ 
  SELECT $1.client_id = $2 
$_$ LANGUAGE SQL STABLE;

steve-chavez · 2022-08-26T18:50:15Z

There's another way that michelp suggested to me.

PostgreSQL has the SECURITY LABEL command, we could use it as:

SECURITY LABEL FOR pgrest ON TABLE foo.bar IS $$
  GRANT ORDER ON col TO anonymous
  GRANT USAGE ON col2.eq TO anonymous
$$;

The IS string is a label and the syntax is arbitrary, we can have a nice DSL there.

The benefit is that it would allow us to restrict ORDER or any other feature, unlike the FUNCTION approach.

The drawback is that it would require installing a postgresql extension but we can fallback to our in-db config or env vars using the same syntax.

adrian-gierakowski · 2022-08-27T14:46:19Z

Nice, this sounds really promising! Thank you for giving it such a thorough consideration. Let me know if there is anything I could do to help make it happen.

wolfgangwalther · 2022-08-27T15:46:23Z

PostgreSQL has the SECURITY LABEL command

I thought about using SECURITY LABEL to add metadata to database objects a while back, too. The big upside is, that we'd have our own "namespace", so we wouldn't conflict with other metadata, as in the COMMENT ON approach. The downside is, that SECURITY LABEL is currently not supported for all database objects. Examples are CONSTRAINT, which could be useful in embedding contexts, OPERATOR, in this context here, CAST in terms of input/output conversions, etc..

I'm not sure whether we should start attaching metadata to objects in this way, when we're that limited down the road.

steve-chavez · 2022-08-27T16:53:19Z

The downside is, that SECURITY LABEL is currently not supported for all database objects. Examples are CONSTRAINT, which could be useful in embedding contexts, OPERATOR, in this context here, CAST in terms of input/output conversions, etc..

Not sure we'll need all of those but we could propose extending SECURITY LABEL in postgres core(feasible now that I have a couple of patches landed) and for past versions we could fallback to our in-db config.

Edit: Additionally, since we'll have to parse the label inside anyway, we could extend the approach this way

SECURITY LABEL FOR pgrest ON TABLE foo.bar IS $$
  HIDE CONSTRAINT mypkey MESSAGE WITH 'duplicate id'
$$;

Meaning we could use the label to refer to other objects of the table. Using SECURITY LABEL .. ON SCHEMA would allow us to annotate all possible db objects I think.

wolfgangwalther · 2022-08-28T09:23:41Z

Edit: Additionally, since we'll have to parse the label inside anyway, we could extend the approach this way
SECURITY LABEL FOR pgrest ON TABLE foo.bar IS $$
  HIDE CONSTRAINT mypkey MESSAGE WITH 'duplicate id'
$$;
Meaning we could use the label to refer to other objects of the table. Using SECURITY LABEL .. ON SCHEMA would allow us to annotate all possible db objects I think.

One of the upsides of using COMMENT ON or SECURITY LABEL ON is, that those are tied to the objects they reference - so they will be automatically removed when the object is removed, too. No orphaned comments possible. But this is not the case with those "in-label-references".

The downside is, that SECURITY LABEL is currently not supported for all database objects. Examples are CONSTRAINT, which could be useful in embedding contexts, OPERATOR, in this context here, CAST in terms of input/output conversions, etc..

Not sure we'll need all of those but we could propose extending SECURITY LABEL in postgres core(feasible now that I have a couple of patches landed) and for past versions we could fallback to our in-db config.

Yeah, extending SECURITY LABEL to other objects would be great. I can imagine, however, that this will be rejected, because to make use of security labels e.g. for selinux, there would need to be a sensible way to do some kind of "access check" (via object_access_hook) on the objects to be added...

Another option could be to try to extend COMMENT ON in postgres core to support something like COMMENT FOR pgrst ON ..., similar to security labels FOR, essentially providing namespaces for comments.

steve-chavez · 2022-08-29T00:19:35Z

One of the upsides of using COMMENT ON or SECURITY LABEL ON is, that those are tied to the objects they reference - so they will be automatically removed when the object is removed, too. No orphaned comments possible.

Hm, I don't see how's that a dealbreaker for going with SECURITY LABEL. Orphaned metadata can be noticed and cleaned up on git-versioned database code and we could warn or err when trying to apply it. After all we're going to have to parse the label metadata on Haskell code.

Also consider that we'd need to support the in-db config and the other configs as fallback too

ALTER ROLE authenticator SET pgrest.sec.table."foo.bar" IS $$
  HIDE CONSTRAINT mypkey MESSAGE WITH 'duplicate id'
$$;

And those could leave orphaned metadata too

steve-chavez · 2022-10-24T19:49:25Z

I've been thinking this could be kept friendlier(less cognitive overload for the user) and conciser if we use plain HTTP syntax for an allowlist.

GET /clients?select=projects(),tasks()&id=eq.*&name=like.*&order=name # users can only use eq on id, like on name, order on name and only embed on projects and tasks

PATCH /clients?id=eq.* # users can only patch by using the id filter

Every other request would fail. This allowlist could limit the Schema Cache and also the OpenAPI output.

(Specific columns in select are not included here because we want users to secure those in the database)

Ideally there would be a mode where PostgREST starts without an allowlist but can generate it based on the requests it receives. So users can start with an unrestricted instance, build a test suite, run it against the instance and have it generate the allowlist(likely with an endpoint on /admin). This would avoid having to create the allowlist manually.

adrian-gierakowski · 2022-10-24T23:08:15Z

@steve-chavez this sounds great!

steve-chavez · 2023-04-04T20:58:40Z

Just so I don't forget.

This could be more easily implemented with #1710 plus a pre-request function. With that we could filter postgREST query params on SQL.

pre-request might not be as performant but that might change with pipeline mode.

steve-chavez · 2023-05-16T17:22:36Z

So I've been trying hard to restrict operators using pg itself(ref). The interface turns out to be clunky and requires a lot of database objects. And that doesn't even cover restricting order.

Instead of trying to encode HTTP concerns into pg. What if we take an OpenAPI spec as input?

The user can take our openapi output and then store it in a db function, say db-openapi-input-spec. And simply remove all the unwanted query params. This would solve this issue.

Notes:

A YAML openapi would be nicer to edit.
Seems simpler to implement as it won't require using the schema cache. It would be purely config.
We would need to solve Improvements to query language #2066 first to restrict operators?

steve-chavez added the idea Needs of discussion to become an enhancement, not ready for implementation label Aug 23, 2022

adrian-gierakowski changed the title ~~how to restrict which operators operators can be used and which props can be used for ordering~~ how to restrict which operators can be used and which props can be used for ordering Aug 23, 2022

steve-chavez mentioned this issue Aug 24, 2022

Support for PGroonga operators #2028

Open

steve-chavez mentioned this issue Aug 26, 2022

feat: add request.param guc #1710

Draft

steve-chavez mentioned this issue Aug 30, 2022

Ordering by random PostgREST/postgrest-docs#671

Open

steve-chavez mentioned this issue May 20, 2023

OpenAPI spec as input #2792

Closed

steve-chavez mentioned this issue May 28, 2023

Configurable query grammar #2805

Open

steve-chavez mentioned this issue Aug 9, 2024

Request to support order-by with COLLATE #3683

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

how to restrict which operators can be used and which props can be used for ordering #2442

how to restrict which operators can be used and which props can be used for ordering #2442

adrian-gierakowski commented Aug 22, 2022

steve-chavez commented Aug 23, 2022 •

edited

Loading

steve-chavez commented Aug 23, 2022

steve-chavez commented Aug 23, 2022

adrian-gierakowski commented Aug 23, 2022

steve-chavez commented Aug 23, 2022 •

edited

Loading

steve-chavez commented Aug 26, 2022

adrian-gierakowski commented Aug 27, 2022

wolfgangwalther commented Aug 27, 2022

steve-chavez commented Aug 27, 2022 •

edited

Loading

wolfgangwalther commented Aug 28, 2022

steve-chavez commented Aug 29, 2022 •

edited

Loading

steve-chavez commented Oct 24, 2022

adrian-gierakowski commented Oct 24, 2022

steve-chavez commented Apr 4, 2023

steve-chavez commented May 16, 2023 •

edited

Loading

how to restrict which operators can be used and which props can be used for ordering #2442

how to restrict which operators can be used and which props can be used for ordering #2442

Comments

adrian-gierakowski commented Aug 22, 2022

Environment

Description of issue

steve-chavez commented Aug 23, 2022 • edited Loading

steve-chavez commented Aug 23, 2022

steve-chavez commented Aug 23, 2022

adrian-gierakowski commented Aug 23, 2022

steve-chavez commented Aug 23, 2022 • edited Loading

steve-chavez commented Aug 26, 2022

adrian-gierakowski commented Aug 27, 2022

wolfgangwalther commented Aug 27, 2022

steve-chavez commented Aug 27, 2022 • edited Loading

wolfgangwalther commented Aug 28, 2022

steve-chavez commented Aug 29, 2022 • edited Loading

steve-chavez commented Oct 24, 2022

adrian-gierakowski commented Oct 24, 2022

steve-chavez commented Apr 4, 2023

steve-chavez commented May 16, 2023 • edited Loading

steve-chavez commented Aug 23, 2022 •

edited

Loading

steve-chavez commented Aug 23, 2022 •

edited

Loading

steve-chavez commented Aug 27, 2022 •

edited

Loading

steve-chavez commented Aug 29, 2022 •

edited

Loading

steve-chavez commented May 16, 2023 •

edited

Loading