STAC Search with "ids" and "fields" not working as specified in the STAC spec #707

Closed
adrienDog opened this issue May 4, 2021 · 9 comments · Fixed by #708

Comments

@adrienDog

Describe the bug
I am trying to use the advanced POST /search from the STAC API documentation here

Some queries are not behaving as specified:

  1. Using the ids query field: it does not seem to be taken into account at all
  2. Using fields to specify which fields to include/exclude in each feature response: the include/exclude lists are not applied

Expected behavior
Example advanced search:

curl --location --request POST 'mySTAC/search' \
--header 'Content-Type: application/json' \
--data-raw '{
    "ids": [ "id_1", "id_2"],
    "fields": {
        "include": ["id"],
        "exclude": ["geometry"]
    }
}'
Actual behavior:

  1. The response contains every feature, across all collections, rather than only id_1 and id_2.
  2. All fields are returned, geometry included.
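
For illustration, the filtering this request was expected to trigger can be sketched in Python. This is a simplified sketch, not the fields extension's full rules: it handles top-level keys only, and `apply_fields` and the sample feature are hypothetical names invented for this example.

```python
from copy import deepcopy


def apply_fields(feature, include=None, exclude=None):
    """Return a trimmed copy of `feature` (hypothetical helper, top-level keys only)."""
    include = include or []
    exclude = exclude or []
    # With a non-empty include list, keep only the named keys;
    # otherwise start from the whole feature.
    if include:
        result = {k: deepcopy(v) for k, v in feature.items() if k in include}
    else:
        result = deepcopy(feature)
    # Excluded keys are dropped last, so exclude wins over include in this sketch.
    for key in exclude:
        result.pop(key, None)
    return result


# Made-up feature, for illustration only.
feature = {
    "type": "Feature",
    "id": "id_1",
    "collection": "collection_a",
    "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
    "properties": {"datetime": "2021-05-04T00:00:00Z"},
}

trimmed = apply_fields(feature, include=["id"], exclude=["geometry"])
print(trimmed)  # {'id': 'id_1'}
```

Under these assumptions, the request above would return features carrying only their id, with geometry dropped.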

Additional context

@jisantuc
Contributor

jisantuc commented May 4, 2021

👋🏻 the fields behavior is specified in an extension that we've never implemented -- https://github.com/radiantearth/stac-api-spec/tree/master/item-search#fields. Can you say some more about how fields would be useful to you? It's always seemed kind of superfluous to me outside of bandwidth-constrained environments.

You're right about the item search though, and I'm getting a PR up to fix that now.

@adrienDog
Author

Hi James! Thanks for the quick reply :)

You are right, fields is about saving bandwidth on big result lists. It's not critical and we won't use it for now. I just meant to use it as a quick way to check whether the id I was requesting was in the result list.

For the ids query, I see from the specs that if set:

> All other filter parameters that further restrict the number of search results are ignored

Does that mean, for example, that specifying collections will have no effect?
Example:

  • if id_1 is in collection_a, and we query for
{
  "collections": ["collection_b"], 
  "ids": ["id_1"]
}

then id_1 will be returned anyway
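
A minimal sketch of that reading of the spec, in Python, over an in-memory list. The function name `search_ids_override` and the sample items are assumptions made up for this example, not part of any STAC implementation.

```python
# Made-up catalogue: id_1 lives in collection_a, as in the example above.
ITEMS = [
    {"id": "id_1", "collection": "collection_a"},
    {"id": "id_2", "collection": "collection_b"},
    {"id": "id_3", "collection": "collection_b"},
]


def search_ids_override(items, ids=None, collections=None):
    """When `ids` is set, ignore every other filter (the quoted spec wording)."""
    if ids:
        return [item for item in items if item["id"] in ids]
    if collections:
        return [item for item in items if item["collection"] in collections]
    return list(items)


result = search_ids_override(ITEMS, ids=["id_1"], collections=["collection_b"])
print([item["id"] for item in result])  # ['id_1']
```

Here the collections filter is silently dropped, so id_1 comes back even though it is not in collection_b.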

Thanks for addressing this one! It's important to us when there is some client-side grouping which does not fit the STAC model.

@jisantuc
Contributor

jisantuc commented May 4, 2021

> All other filter parameters that further restrict the number of search results are ignored

I didn't notice this before. The next big spec push is around the API spec, sometime in the next few months, so there's time to refine how that works if the currently described behavior isn't great. I'd prefer to apply whatever filters a client provides (Franklin's current behavior) instead of requiring consumers to understand the spec well enough to know that some filters won't be applied under certain conditions. What do you think?

@jisantuc
Contributor

jisantuc commented May 4, 2021

Closed as fixed, but we can continue talking about the filter behavior here -- I'll open another issue for implementing the fields extension

@adrienDog
Author

> I didn't notice this before. The next big spec push is around the API spec, sometime in the next few months, so there's time to refine how that works if the currently described behavior isn't great. I'd prefer to apply whatever filters a client provides (Franklin's current behavior) instead of requiring consumers to understand the spec well enough to know that some filters won't be applied under certain conditions. What do you think?

I fully agree, I thought this was quite a weird specification tbh, counter-intuitive at least.

Originally I thought:

  • "collections": results have to be in one of those collections
  • "ids": results have to be in this list of ids

--> so composing the two criteria would have meant: results have to be in one of those collections and in this list of ids.

If users of the API didn't care about a certain criterion (e.g. "collections"), they wouldn't specify it, imo.
Maybe this specification is meant to simplify STAC implementations?
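
The composition described above (every supplied criterion must hold, omitted criteria match everything) could look roughly like this in Python. It is only a sketch of the semantics being discussed, not Franklin's actual code; `search_conjunctive` and the sample items are hypothetical.

```python
# Made-up catalogue, for illustration only.
ITEMS = [
    {"id": "id_1", "collection": "collection_a"},
    {"id": "id_2", "collection": "collection_b"},
    {"id": "id_3", "collection": "collection_b"},
]


def search_conjunctive(items, ids=None, collections=None):
    """Apply every filter the client supplies; a filter left as None matches all items."""

    def matches(item):
        if ids is not None and item["id"] not in ids:
            return False
        if collections is not None and item["collection"] not in collections:
            return False
        return True

    return [item for item in items if matches(item)]


both = search_conjunctive(ITEMS, ids=["id_1", "id_2"], collections=["collection_b"])
print([item["id"] for item in both])  # ['id_2']
```

With these semantics, asking for id_1 within collection_b returns nothing, because id_1 belongs to collection_a.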

@jisantuc
Contributor

jisantuc commented May 5, 2021

Ok I think we're on the same page here -- I'll open an issue in the STAC API specification repo to clarify and ideally revert that choice

@adrienDog
Author

awesome! thanks James :)

@adrienDog
Author

btw we deployed the latest docker tag published and the query with ids works as expected in combination with collections filter, thanks a lot!

@philvarner

@jisantuc just as one example for fields -- when I indexed MODIS and had a well-decimated polygon in the proj:geometry field, it accounted for about 90% of the entire Item JSON, so excluding it by default (but allowing a user to get it if they really wanted) was useful
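
One rough way to check how much of an Item a single property takes up is to compare serialized sizes with and without it. The sketch below uses a synthetic Item with a made-up 200-vertex ring, not real MODIS data, and `property_share` is a hypothetical helper.

```python
import json


def property_share(item, prop):
    """Fraction of the serialized Item taken up by one entry of `properties`."""
    full = len(json.dumps(item))
    slim_props = {k: v for k, v in item["properties"].items() if k != prop}
    slim = len(json.dumps({**item, "properties": slim_props}))
    return (full - slim) / full


# Synthetic polygon ring, closed by repeating the first vertex.
ring = [[500000.0 + i, 4000000.0 + i] for i in range(200)]
ring.append(ring[0])

item = {
    "type": "Feature",
    "id": "synthetic_item",
    "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
    "properties": {
        "datetime": "2021-05-04T00:00:00Z",
        "proj:geometry": {"type": "Polygon", "coordinates": [ring]},
    },
}

share = property_share(item, "proj:geometry")
print(f"proj:geometry share of serialized Item: {share:.0%}")
```

The exact share depends on the number of vertices and on the rest of the Item, so the figure printed here says nothing about real collections.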

3 participants