Browseable recommendations and Children conformance class #229

philvarner · 2021-11-08T15:32:44Z

Related Issue(s): #17 #159 #138 #137 #230

Proposed Changes:

Better description of what the Core conformance class means
Detailed descriptions of how to use sub-catalogs for browse
STAC API - Children conformance class to get all child Catalog and Collection metadata in one call

PR Checklist:

This PR is made against the dev branch (all proposed changes except releases should be against dev, not master).
This PR has no breaking changes.
This PR does not make any changes to the core spec in the stac-spec directory (these are included as a subtree and should be updated directly in radiantearth/stac-spec)
I have added my changes to the CHANGELOG or a CHANGELOG entry is not required.

jisantuc · 2021-11-16T17:28:51Z

core/README.md

@@ -62,24 +72,66 @@ A `service-doc` endpoint is recommended, but not required.
 | ------------- | ----------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | `service-doc` | `/api.html` | OAFeat OpenAPI | An HTML service description.  Uses the `text/html` media type to refer to a human-consumable description of the service. The path for this endpoint is only recommended to be `/api.html`, but may be another path. |

-Additionally, `child` relations may exist to individual catalogs and collections.
+Additionally, `child` relations may exist to child Catalogs and Collections and `item` relations to Items. These
+relations form a directed acyclic graph that supports browseable traversal.


I'm throwing this comment here because it's the first mention of the acyclicality of the graph -- I don't think this is technically correct. I think the existence of parent links means that the graph has cycles. Moreover, I think it's good that the graph has cycles -- it means we can get back up a level without using the back button / that we can retrace our steps using only the data. Is acyclicality an important property? I've thought about this a bit and I'm not sure what it gets us. I think this is just a normal directed graph.

Ah, so that's a good point and one I should be more clear on. I was primarily thinking about the graph of child links, which must be a DAG. Also to make it clear that these are DAGs and not only trees, since the subcatalogs don't have to be distinct (e.g., you can slice them up by date or grid id, not only one)

Is that even necessarily the case (that child links are acyclic)? I doubt anyone has mutual child relationships, but:

it's not explicitly disallowed

I'm not aware of any notes recommending against multi-part cycles like catalog a has item a as a child, item a has catalog b as a child (nothing says items can't have children as far as I know), catalog a is a child of catalog b

I have no idea why anyone would do such a thing, I just read a nice Alloy 6 tutorial earlier today and it might have poisoned my brain forever.

Either way, unless the acyclicality is important for some reason, I'm not sure it's doing much work except encouraging thinking about edge cases where it doesn't hold -- I think a plain directed graph should be sufficient for supporting browseable traversal

I agree with @jisantuc here.

updated the language to only be directed graph
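As a rough illustration of the point settled in this thread, here is a minimal sketch of a client walking a plain directed graph. The ids and the `LINKS` layout are made up for illustration and are not taken from the spec. The walk tracks visited nodes, so it terminates whether or not `parent` links (or unusual `child` links) introduce cycles.

```python
from collections import deque

# Hypothetical in-memory link graph: each key is a STAC object id and each
# value lists (rel, target) pairs. Ids and layout are invented for this sketch.
LINKS = {
    "root": [("child", "landsat")],
    "landsat": [("parent", "root"), ("child", "2021")],
    "2021": [("parent", "landsat"), ("item", "scene-1")],
    "scene-1": [("parent", "2021")],
}


def browse(start, links, rels=("child", "item")):
    """Breadth-first walk that follows only the given rels and never revisits a node."""
    seen = {start}
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for rel, target in links.get(node, []):
            if rel in rels and target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order
```

Following `parent` as well as `child` and `item` makes the graph cyclic, yet the same function still returns each node once, which is all that browseable traversal needs.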

m-mohr · 2021-11-27T17:18:25Z

Would it make sense to split the Children conformance class and the rest of the PR into two PRs? This PR is so large that it's nearly impossible for me to really do a full review in one go... Also, could it be that this PR also includes some other changes? For example, I've also seen pagination changes in here...

PRINCIPLES.md

ogcapi-features/README.md

core/README.md

geospatial-jeff · 2021-11-27T19:30:43Z

children/README.md

+- **Extension [Maturity Classification](../extensions.md#extension-maturity):** Pilot
+- **Dependencies**: [STAC API - Core](../core)
+
+A STAC API can return information about all STAC [Catalogs](../stac-spec/catalog-spec/catalog-spec.md) available using a link


I'm confused if the server is expected to return just the first generation of children, the last generation of children, or the entire ancestry. For example, if I had Landsat catalogs organized as /catalogs/landsat_8_c1/{path}_{row}_{date}, the /children endpoint could:

Return all {path} catalogs (immediate children).

Return every combination of {path}_{row}_{date} (last generation of children).

Return every combination of {path}, {path}_{row}, and {path}_{row}_{date} (entire ancestry).

Hmm... is this a conformance class a "simple" way for providers to express their "preferred list of catalogs and collections to show on a frontpage"?

To answer @geospatial-jeff 's question, it's up to the implementer. I think typically it will return exactly the same set of children that have child relations from the root. The benefit to having it at this endpoint is that the title, desc, etc can all be returned, so a client doesn't have to retrieve every single child link just to find out the title to display -- this was Rob's use case.

Maybe "simple" isn't the right word. I'll think about this.
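To make the "one call instead of one request per child" benefit concrete, here is a minimal sketch of what a server might assemble for `/children`. It assumes the endpoint returns exactly the objects behind the root's `rel=child` links, as described above. The `OBJECTS` store, the hrefs, the titles, and the `children` response key are all assumptions for illustration.

```python
# Hypothetical store of full STAC objects keyed by href; all values are made up.
OBJECTS = {
    "/catalogs/landsat_8_c1": {
        "type": "Catalog", "id": "landsat_8_c1", "title": "Landsat 8 C1",
    },
    "/collections/sentinel_2_l2a": {
        "type": "Collection", "id": "sentinel_2_l2a", "title": "Sentinel-2 L2A",
    },
}

# Hypothetical landing page with two direct children.
ROOT = {
    "type": "Catalog",
    "id": "root",
    "links": [
        {"rel": "self", "href": "/"},
        {"rel": "child", "href": "/catalogs/landsat_8_c1"},
        {"rel": "child", "href": "/collections/sentinel_2_l2a"},
    ],
}


def children_response(catalog, objects):
    """Return the full objects behind the catalog's direct rel=child links (not recursive)."""
    hrefs = [link["href"] for link in catalog["links"] if link["rel"] == "child"]
    # The "children" key is an assumption about the response body shape.
    return {"children": [objects[href] for href in hrefs], "links": []}
```

A client can then read every child's title from a single response rather than dereferencing each `child` link in turn.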

core/README.md

geospatial-jeff · 2021-11-27T20:14:17Z

children/README.md

+A STAC API can return information about all STAC [Catalogs](../stac-spec/catalog-spec/catalog-spec.md) available using a link
+from the landing page that uses the link relation `children`, which links to an endpoint called
+`/children`. The purpose of this endpoint is to present a single resource from which clients can retrieve
+all the child objects (Catalogs and Collections) of a Catalog. This eliminates then need for a client to


I think that returning both Catalogs and Collections in the /children endpoint is confusing. My understanding after reading this PR is that collections are for searching while catalogs are for browsing. It even says in the Browseable Catalogs best practices that collections should not be included as part of the browseable tree of catalogs. Allowing collections to be used for browsing is poor separation of concerns, and instead of just recommending against their use the spec should forbid using collections in this way.

Yeah, this is a bit confusing. Like do I not need to retrieve /collections anymore if /children is implemented? How should clients handle all the duplication to show a list of unique catalogs/collections?

I can see the benefit of having children return both Catalogs and Collections - it more fits with the STAC core specification which uses rel=child for both of those cases. It also allows for all of the catalogs and collections to be retrieved in a single paginated call - useful for rendering the content of the API.

One addition we could make is that a query parameter can determine which type to return - e.g.

https://stac.api/children?type=catalog
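A minimal sketch of how such a filter could behave, assuming each returned object carries the STAC `type` field (`Catalog` or `Collection`). The `type` query parameter is only the suggestion made in this comment, not part of the PR, and the function name is hypothetical.

```python
def filter_children(children, type_param=None):
    """Filter STAC objects by a hypothetical `type` query parameter value.

    With no parameter, every child is returned unchanged.
    """
    if type_param is None:
        return list(children)
    wanted = type_param.lower()
    return [c for c in children if c.get("type", "").lower() == wanted]


# Made-up children list mixing both object types.
CHILDREN = [
    {"type": "Catalog", "id": "landsat_8_c1"},
    {"type": "Collection", "id": "sentinel_2_l2a"},
]
```

With this data, `filter_children(CHILDREN, "catalog")` keeps only the Catalog entry, while omitting the parameter keeps both.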

Where is the best practices text you referenced located?

It's included as part of https://github.com/philvarner/stac-api-spec/blob/catalogs-and-browseable/core/README.md#browseable-catalogs introduced in this PR. The exact language is:

These are the two standard ways of structuring a browseable tree of catalogs, the only difference being whether the Collection is used as part of the tree or not:

Catalog (root) -> Catalog* -> Item (recommended)

Catalog (root) -> Collection -> Catalog* -> Item

I'll see if I explicitly stated it anywhere, but the intention was that /children would return the same list of entities that are linked to via rel=child from the root. The benefit (as Rob has articulated before) is that a client can get all of the entities with one call instead of having to make one HTTP request for each one to, say, only get the title of the collection or catalog.

geospatial-jeff · 2021-11-27T20:19:55Z

children/README.md

@@ -0,0 +1,156 @@
+# STAC API - Children


Are catalogs returned by this endpoint allowed to implement item search (i.e. can they have a /search endpoint)?

The landing page is just a catalog, and itself implements /search, so I'm guessing this is allowed. And the /search response would only return items that are contained by the catalog (potentially across many collections).

Yes, any (sub) Catalog can implement a search endpoint.

Thanks for bringing this up. I thought I'd explicitly mentioned this, but I can't find it, so I'll make sure to add something.

geospatial-jeff · 2021-11-27T20:27:03Z

The top level README (https://github.com/philvarner/stac-api-spec/tree/catalogs-and-browseable#in-this-repository) should be updated to link to the children folder w/ description.

lossyrob · 2021-11-29T17:08:32Z

Some comments from talking through this PR with Matthias, Matt, and Chris:

The /children endpoint should contain the STAC Objects for any directly linked Catalogs or Collections (not recursive)
Catalogs in a STAC API should not contain Items (as a best practice)
Collections in a STAC API should not contain any children (as a best practice)
One question that came up - should the /collections endpoint return all Collections contained in the STAC API (recursively through the Catalog)? Or only the Collections whose parent is the STAC API?
Regardless of the answer above, the /search endpoint will search Collections that are direct children and also the children of any children (recursive). That way a search on the root API can find any Items contained in the catalog as a whole, while searching on sub-APIs (child subcatalogs that have conformance classes/are an API) will only search Items in the Collections that are returned by their own /collections endpoint
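The search scoping described in the last point can be sketched as follows. This is only an illustration under assumptions: the nested `TREE` layout, the ids, and the helper names are hypothetical, and a real API would resolve links rather than hold objects inline.

```python
# Hypothetical nested layout: Catalogs hold children, Collections hold item ids.
TREE = {
    "type": "Catalog", "id": "root", "children": [
        {"type": "Collection", "id": "naip", "items": ["naip-1"]},
        {"type": "Catalog", "id": "modis", "children": [
            {"type": "Collection", "id": "MOD14A2", "items": ["mod-1", "mod-2"]},
        ]},
    ],
}


def collections_in_scope(node):
    """Yield every Collection reachable from node, recursing through child Catalogs."""
    for child in node.get("children", []):
        if child["type"] == "Collection":
            yield child
        else:
            yield from collections_in_scope(child)


def find_catalog(node, catalog_id):
    """Return the Catalog with the given id, or None if it is not in the tree."""
    if node["type"] == "Catalog" and node["id"] == catalog_id:
        return node
    for child in node.get("children", []):
        found = find_catalog(child, catalog_id)
        if found is not None:
            return found
    return None


def search_item_ids(node):
    """Item ids a /search rooted at this node would cover in this sketch."""
    return [item for coll in collections_in_scope(node) for item in coll["items"]]
```

Searching from the root covers all three items, while searching from the `modis` sub-catalog covers only the two MODIS items.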

Matt suggested scheduling a working session to clarify these points and others, and get this PR over the finish line - he will follow up

m-mohr · 2021-11-29T17:38:43Z

Catalogs in a STAC API should not contain Items (as a best practice)

Collections in a STAC API should not contain any children (as a best practice)

I was confused by this in the call and did not pick this up as being agreed on yet. What's the background on this?

cholmes

Not quite done with full review, but want to get some of my comments in as not sure when I'll get back to it.

PRINCIPLES.md

cholmes · 2021-11-29T16:47:51Z

core/README.md

-  that indicates to clients that this is a STAC API and how to access conformance classes, including this
-  one. The relevant conformance URI's are listed in each part of the
-  API specification. If a conformance URI is listed then the service must implement all of the required capabilities.
+Whenever a static STAC catalog is served over HTTP, it is a defacto hypermedia-driven web API. Even without implementing any


Probably should define what a 'static STAC catalog' is? Maybe it's just rephrasing this, but as it reads it seems to assume some knowledge of a static STAC. Could also link to the section in stac-spec on static catalogs.

cholmes · 2021-11-29T16:51:23Z

core/README.md

+Whenever a static STAC catalog is served over HTTP, it is a defacto hypermedia-driven web API. Even without implementing any
+STAC API conformance classes, the entire catalog can be traversed from the root via `child` and `item` link relations. Support for 
+this "browse" mode of interaction is complementary to the dynamic search capabilities defined by other STAC API conformance classes.
+Conversely, many STAC API implementations do not support browse, even though the root is a Catalog object, because they do not


This line and the next seem to be talking about existing STAC API implementations? I think it'd be good to rephrase it more abstractly for the spec. Describe the use case when APIs do not support browse, and perhaps in parentheses explain how many 1.0-beta and early catalogs didn't have the appropriate link relations to traverse. Like the spec should read as the spec, without too much dialog on the existing state of things.

cholmes · 2021-11-29T16:52:51Z

core/README.md

+Conversely, many STAC API implementations do not support browse, even though the root is a Catalog object, because they do not
+have the appropriate `child` and `item` link relations to traverse over the objects in the catalog. 
+Providing users with these two different, complementary ways of navigating the catalog allows them to interrogate the data in whichever
+way best meets their needs.  Supporting these also opens up a catalog to both


Maybe explain the use cases of both of these ways? One for crawling / browsing, one for getting the endpoints to search against.

(could also be a link to a spot that discusses it more)

cholmes · 2021-11-29T17:55:47Z

core/README.md

+1. Catalog -> Catalog (product) -> Catalog (date) -> Catalog (path) -> Catalog (row)
+2. Catalog -> Catalog (product) -> Catalog (path) -> Catalog (row) -> Catalog (date)
+
+There are many options for how to structure these catalog graphs, so it will take some analysis work to figure out


Could be good to have some sort of 'best practices' linked to where options for this are discussed more. It also might be worth a bit of guidance here, just like thinking about it from a user's perspective - how people would like to browse to data.

I filed an issue to come back to these #243

I think this is an improvement over what we had before, but recognize there's still a better way to describe these that I can't quite articulate right now
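For readers comparing the two orderings quoted in the diff, here is a small sketch of how the same scene ends up at different browse paths. The scene values, the `/catalogs` prefix, and the ordering names are assumptions made up for illustration.

```python
# Hypothetical scene attributes; the values are invented for this sketch.
SCENE = {"product": "landsat_8_c1", "date": "2021-11-08", "path": "042", "row": "034"}

# The two orderings from the quoted diff, expressed as key sequences.
ORDERINGS = {
    "date_first": ("product", "date", "path", "row"),
    "path_first": ("product", "path", "row", "date"),
}


def catalog_path(scene, ordering, prefix="/catalogs"):
    """Build the nested sub-catalog path for a scene under one ordering."""
    return prefix + "/" + "/".join(scene[key] for key in ordering)
```

Both paths lead to the same leaf Item, which is why the choice is mostly about how users expect to drill down.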

cholmes · 2021-11-29T18:03:42Z

core/README.md

+    - child -> /catalogs/sentinel_2_l2a
+
+Since the catalogs structure is a directed acyclic graph which allows 
+you to provide numerous different Catalog and Collection graphs reach leaf Items. For example, for a Landsat 8 data


It might be good to provide some guidance on the final 'item' link from different views. Like should they all link to the exact same item url? Or have different item urls that all have canonical hrefs to the same one? And should the canonical one be in the browse hierarchy, or in the collection?
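One possible answer to this question, sketched under assumptions: each view serves its own Item URL but includes a `rel=canonical` link to a single shared Item, and clients de-duplicate on that link. The hrefs and helper names below are hypothetical, and the PR itself may settle on a different approach.

```python
def canonical_href(item):
    """Prefer the rel=canonical link, fall back to rel=self, else None."""
    by_rel = {}
    for link in item.get("links", []):
        # Keep the first href seen for each rel.
        by_rel.setdefault(link["rel"], link["href"])
    return by_rel.get("canonical") or by_rel.get("self")


def unique_items(items):
    """Drop Items that resolve to an already-seen canonical href, keeping order."""
    seen = set()
    result = []
    for item in items:
        key = canonical_href(item)
        if key not in seen:
            seen.add(key)
            result.append(item)
    return result


# The same Item reached through two hypothetical browse views.
VIEW_A = {"id": "scene-1", "links": [
    {"rel": "self", "href": "/catalogs/by-date/scene-1"},
    {"rel": "canonical", "href": "/collections/landsat/items/scene-1"},
]}
VIEW_B = {"id": "scene-1", "links": [
    {"rel": "self", "href": "/catalogs/by-path/scene-1"},
    {"rel": "canonical", "href": "/collections/landsat/items/scene-1"},
]}
```

Here the canonical Item lives under the Collection, but the same de-duplication works if the canonical copy sits in the browse hierarchy instead.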

lossyrob · 2021-11-30T20:37:08Z

@m-mohr

Catalogs in a STAC API should not contain Items (as a best practice)
Collections in a STAC API should not contain any children (as a best practice)

I see this is in direct conflict with this declaration of Catalogs as a browsable unit that contains Items. When we were discussing it, I was less thinking about the "browseable" part of APIs, which to me is less interesting. The children extension is interesting for my use case in that it allows the ability to create sub-STAC APIs to organize Collections into a multi-level hierarchy, which solves an issue when you have many Collections in an API, some of them strongly related. If we consider Collections as the "searchable" container vs child Catalogs being "sub-APIs", while a Catalog that is a child of a Landing Page root of a STAC API could have Items, it is confusing to have Items in Catalogs, which can't be searched, and confusing for Collections to contain anything but Items. This is true if we only want to use Catalogs in an API as an organizational mechanism for Collections.

Collections that contain Catalogs that organize the Items into things like date/path/row would allow for users to click through an organized set of items, enabling the "Browsable" part of all this work - I see that now. This compounds the complexity of using Catalogs for both organizing Collections (and existing as sub-APIs) and also organizing Items (which may or may not be a sub-API, though I'm not sure you'd want that many sub-APIs for groups of Items in the same Collection, though in order to have their own /children endpoint they may need to be a STAC API themselves). So let me retract those points.

To think through something specific and imagine an API that might have both cases:

A root STAC API Landing Page/Catalog (which I'll refer to as a STAC) that contains a large number of Collections. These Collections are organized into Catalogs that themselves represent STACs (sub-STACs).
The root /collections endpoint of the root STAC returns either all collections, recursive through the tree of sub-STACs, or only the collections that have the root STAC as a parent (not sure what's best here).
The root /children endpoint returns all direct children of the STAC, i.e. any Catalog (sub-STAC) or Collection that has the root STAC set as the parent
The root /search endpoint will search through all collections contained in the STAC or sub-STAC, recursively.
Say you have a sub-STAC called "MODIS" that contains all Collections for products related to MODIS. This is retrievable through the /children endpoint. Its self link points to the STAC endpoint that returns the Landing Page/Catalog for that sub-STAC. Say it's at /children/modis.
The /children/modis/collections endpoint will return all the MODIS Collections, which have their parent link set to the MODIS sub-STAC endpoint
The /children/modis/children endpoint returns all children of that sub-STAC. In this case, the MODIS sub-STAC only contains Collections (and no sub-STACs of its own), so it returns the same as the /children/modis/collections endpoint
The /children/modis/search endpoint searches against only the MODIS Collections contained by this sub-STAC
Now let's consider a specific MODIS Collection, say /children/modis/collections/MOD14A2. If I call /children/modis/collections/MOD14A2/children ... this is where I'm confused. The Collection isn't a STAC API, so it wouldn't have a children endpoint to 'crawl' through. So it seems like the /children/modis/children would have to contain the Collection's subcatalogs, even though the parent of those Catalogs is actually the Collection?

Perhaps someone can help clear that up for me by continuing that specific example? I was trying to get to the point where the MOD14A2 Collection has subcatalogs that organize the Items into browsable categories, let's say year month day. Then thinking through how that would translate into a front-end experience like STAC Browser, i.e. what endpoints the front end would call when.

README.md

philvarner · 2021-12-06T16:06:04Z

I extracted a bunch of the typo and wordsmithing changes into https://github.com/radiantearth/stac-api-spec/pull/234/files

philvarner · 2021-12-07T19:14:16Z

One other issue to consider is that OGC uses the term "crawlable", which I think is synonymous with our use of "browseable", so we should consider adopting that term instead.

philvarner · 2021-12-08T23:17:31Z

The top level README (https://github.com/philvarner/stac-api-spec/tree/catalogs-and-browseable#in-this-repository) should be updated to link to the children folder w/ description.

Thanks @geospatial-jeff -- I noticed the same and added it

cholmes

Looks great! I added a few commitable suggestions, but feel free to tweak them. And one or two other suggestions that are not 'must have', so I'm approving this.

I was wondering why you ended up with 'browseable', as I thought you were leaning towards 'crawlable' like OGC. I'm happy with it as is, just curious.

browseable/README.md

cholmes · 2022-01-04T22:12:39Z

browseable/README.md

+This JSON is what would be expected from an API that implements `STAC API - Browseable`. 
+
+This particular catalog provides both the ability to browse down to child Catalog objects through its
+`child` links, which then will eventually reach Items through `item` link relations.


The example below doesn't seem particularly meaningful. Could be good to try to illustrate the point more, since usually the example helps show that. Perhaps a little diagram, that shows a non-browsable catalog vs a browsable one? Or perhaps just explain after the example that a catalog not implementing browsable would not have the 'child' links, but it would have them in the 'data' rel link. I think it'd be good to just provide more context, and to help make it clear to existing implementations what they need to do.

children/README.md

core/README.md

matthewhanson · 2022-01-04T23:23:31Z

This looks great, approved, I just left a comment regarding the use of the word "shall".

lossyrob

I have some comments related to using catalogs/ instead of collections/ for hosting sub-catalogs for Item groupings for the Hierarchy recommendations. However I don't think these comments are blocking.

💯 well done!

browseable/README.md

core/README.md

lossyrob · 2022-01-05T02:26:46Z

core/README.md

+
+| Endpoint              | Returns                                        | Description          |
+| --------------------- | ---------------------------------------------- | -------------------- |
+| `/catalogs/{catalogId}` | [Catalog](../stac-spec/catalog-spec/README.md) | child Catalog object |


My previous question on the example using /catalogs/landsats-8-c1 is answered here. I see that it's a recommendation to avoid conflicts, though I think implementations could work around that and it would provide a cleaner set of URLs to avoid both a Catalog and Collection with the same name. Actually I was surprised to see that Catalog.id and Collection.id don't at least recommend there isn't a conflict inside the same root, as having duplicate IDs for Collections and Catalogs might get confusing.

I think that's a pretty reasonable recommendation to make. My only hesitancy in making it is that I feel like Catalog and Collections (as we currently define them) are not related in a way in which there would ever be confusion between them. If we'd defined a Collection as-a Catalog (maybe with the additional restriction that only one Collection can exist in a hierarchy?), then I could see a good case for not duplicating ids.
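If such a recommendation were adopted, an implementation could check for clashes along these lines. This is a minimal sketch; the function name and the sample objects are hypothetical, and nothing in the PR requires this check.

```python
from collections import defaultdict


def conflicting_ids(objects):
    """Return ids used by both a Catalog and a Collection, sorted alphabetically."""
    types_by_id = defaultdict(set)
    for obj in objects:
        types_by_id[obj["id"]].add(obj["type"])
    return sorted(
        obj_id
        for obj_id, types in types_by_id.items()
        if {"Catalog", "Collection"} <= types
    )


# Made-up objects under one root: "landsat" is used by both types.
SAMPLE = [
    {"type": "Catalog", "id": "landsat"},
    {"type": "Collection", "id": "landsat"},
    {"type": "Collection", "id": "sentinel_2_l2a"},
]
```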

geospatial-jeff

Nice work @philvarner!

philvarner · 2022-01-05T22:45:57Z

I was wondering why you ended up with 'browseable', as I thought you were leaning towards 'crawlable' like OGC. I'm happy with it as is, just curious.

My understanding was that the semantics of this may be slightly different than whatever crawlable ends up meaning in OGC, and that crawlable might not even be the term they end up using. Given that uncertainty, if we go with our own terminology, we can easily align it to theirs when their work is finalized, whereas it would be very confusing if we had something that was named the same but meant something different.

philvarner added 3 commits October 25, 2021 15:23

merge and initial catalogs and browseable

67bfa4b

finish browseable

a7cfb08

add template mention

84bc36c

philvarner requested review from m-mohr, cholmes and matthewhanson November 8, 2021 15:32

Add Children conformance class

f12c2ad

philvarner changed the title ~~Browseable recommendations~~ Browseable recommendations and Children conformance class Nov 8, 2021

jisantuc reviewed Nov 16, 2021

geospatial-jeff reviewed Nov 27, 2021

geospatial-jeff reviewed Nov 27, 2021

cholmes reviewed Nov 29, 2021

philvarner added 2 commits December 3, 2021 14:04

Merge branch 'dev' into catalogs-and-browseable

7981ce3

fix bad merge in CHANGELOG

5beb369

philvarner commented Dec 6, 2021

philvarner added 4 commits December 6, 2021 15:09

catalog graphs are not specifically acyclic

969d125

remove bit about aligning with OGC standards

369ae52

add Children to root README

9abae23

Merge branch 'dev' into catalogs-and-browseable

b568773

add Browseable conformance class

a40b9de

philvarner added 12 commits December 16, 2021 11:25

remove unused openapi file, update order of summary items in children and browseable

d7fbd6d

updates

f8d3d10

clarify what should be returned by children

7b6bd6f

update children openapi example

788be5e

clarify browseable and children

3bcf17b

wordsmithing

be01abb

claify browseable vs. search

c128863

add mention of single canonical link to Item

966d433

remove mention of search in the browseable landing page example

491af9c

wordsmithing

b6d4379

wordsmithing

0c8829a

updates

5a694d5

cholmes approved these changes Jan 4, 2022

matthewhanson approved these changes Jan 4, 2022

lossyrob approved these changes Jan 5, 2022

geospatial-jeff approved these changes Jan 5, 2022

philvarner and others added 10 commits January 5, 2022 15:30

Merge branch 'dev' into catalogs-and-browseable

5bd2d2e

Update browseable/README.md

caa5ef1

Co-authored-by: Chris Holmes <[email protected]>

add link about items link rel in features

03200a5

add link about items link rel in features

4443078

Update core/README.md

d6581b4

Co-authored-by: Rob Emanuele <[email protected]>

clarify browseable example

f112c01

Update core/README.md

2a54b6a

Co-authored-by: Chris Holmes <[email protected]>

remove sub-collection

435fb5a

Update core/README.md

8cc1642

Co-authored-by: Chris Holmes <[email protected]>

update items lang in core

5353b53

philvarner merged commit 4f0ced2 into radiantearth:dev Jan 5, 2022

philvarner deleted the catalogs-and-browseable branch January 5, 2022 22:46
