Allow documenting HATEOAS APIs (pathless operation / interface / URI class / operation type) #577
I guess this is a bit related to #576 (that discussion caused me to write this now down, though I had the ideas in me for a while), but this is much wider in scope. |
See my comments on #576 for one possible approach to this. |
@mpnally's comment on #576 gives some ideas on how this could be done, thanks. I can imagine something like this (modified from the linked comment):

    uri-classes:
      Widget:
        get:
          ...
        put:
          ...
    paths:
      /widget/{widget_id}:
        parameters:
          - name: widget_id
            in: path
            ...
        uri-class:
          $ref: '#/uri-classes/Widget'
    definitions:
      Doohickey:
        properties:
          widget:
            description: hyperlink to a Widget
            format: uri
            type: string
            uri-class:
              $ref: '#/uri-classes/Widget'

(The path definition here isn't even needed if clients don't have to know the widget_id, and can reach that kind of uri from somewhere else. It might be needed for implementation help on the server side, but is not needed in the interface to be used by the client.) I'm not sure if Of course, using $ref here seems to imply an URI class can also be defined inline instead of referring to one defined in a special section. |
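As a rough illustration of how a tool might consume the uri-class keyword sketched above, the following Python snippet resolves a local $ref and lists the HTTP methods a client may use on the linked URI. The uri-classes section and the uri-class keyword are only the proposal from this comment, not part of any released OpenAPI version, and the helper names resolve_ref and allowed_methods are hypothetical.

```python
# Spec fragment mirroring the proposal above, written as a plain dict.
SPEC = {
    "uri-classes": {
        "Widget": {"get": {}, "put": {}},
    },
    "definitions": {
        "Doohickey": {
            "properties": {
                "widget": {
                    "type": "string",
                    "format": "uri",
                    "uri-class": {"$ref": "#/uri-classes/Widget"},
                }
            }
        }
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def resolve_ref(spec, ref):
    """Resolve a local reference such as '#/uri-classes/Widget'.

    Simplification: JSON Pointer escaping (~0, ~1) is not handled.
    """
    if not ref.startswith("#/"):
        raise ValueError("only local references are supported in this sketch")
    node = spec
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def allowed_methods(spec, model, prop):
    """Return the sorted HTTP methods declared by the property's URI class."""
    schema = spec["definitions"][model]["properties"][prop]
    uri_class = schema["uri-class"]
    if "$ref" in uri_class:  # an inline URI class would be used as-is
        uri_class = resolve_ref(spec, uri_class["$ref"])
    return sorted(m for m in uri_class if m in HTTP_METHODS)


print(allowed_methods(SPEC, "Doohickey", "widget"))
```

Note that the lookup never touches the paths section, which matches the point that the client does not need the path definition at all.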
Terminology aside, your use of $ref is better than using the YAML merge operator (<<). I used YAML merge because I needed a solution that was legal today in swagger 2.0 and so would work with existing tools. |
The use of x-interfaces / x-interface in #576 seems to solve a different problem (that of several APIs or API subresources implementing an API contract/interface/pattern.) I'm more interested in the OP question of defining and documenting an API via HATEOAS. We'd like to hide the paths in the API (i.e. in the UI) and encourage clients to consume links rather than hard-code the API to fixed paths. (The paths become more of an implementation detail. They still exist and could be revealed in the UI for reference / lookup, or presented in an alternate view.) Thus, an API starts at the root and the client can discover the set of links to nested resources via compound keys that include a link relation (name), a verb (PUT, POST, etc.), a type (request and/or response media type), and a description. The definition of a link may also include a path but that is more of an implementation attribute. Thus, each path in an OAS can be associated with one or more contained resources For example if the API has two primary resources,
So from the root, one can find models and activities (as resources), and also discover how to create activities, and from the collection, one can access an individual model. From a model, there are links for operations on that model (the uri is inherited from the current path context). The UI could then navigate the API by exploring links (hiding the URI by default). |
@DavidBiesack I'm not sure if I understand you right ... is the example code there a proposed syntax for a API specification, example output, or something else? I guess |
It is a sketch for additions to OAS -- i.e. addition of a Yes, It is path-centric in that I'm trying to integrate with the existing Swagger 2.0 schema of paths rather than completely undo it all. I'm trying to express resource-oriented architecture, not RPC style. Again, the UI or other tools can hide the paths (optionally) and present a link/resource/discovery focused view of the API. |
@DavidBiesack I would define the resource types separately, and then in the paths (and in content schema definitions for URIs) just refer to them. The resource types in your case would be (if I understood right):
I guess it would be possible to merge the "list of" with the "creation" resources, if that is supposed to always be the same URI. Then we can map the paths to those resource types:
Except for the entry point this mapping doesn't need to be public in the API description (but it can be, for legacy users/tools who don't really know how HATEOAS works). The actual links are then provided by actually invoking the resources containing the links; they are not hardcoded in the OpenAPI definition. (My point of view is that the API definition is something stable, from which e.g. a client library can be generated and used in some program using it. The actual URIs in use can then change on the fly without any changes necessary to the client. This is something separate from the use of tools like Swagger UI, which are able to construct the UI and do the interactions on the fly from the API definition, and thus can work easily with changing definitions). |
@ePaul yes, defining resources separately and using a $ref mechanism is fine. I just started with a sketch, it certainly needs refinement. For example, some APIs expose resources via multiple paths, so sharing the resource definitions (which includes links but may have more...) is important. For example, the sets of links would need to be mergeable/inherited -- that is, it is important to be able to use composition to define a specific resource. (The creation operations/links are an example of that: we wish to expose the createModel link both at the API root and also in the "list of models" result.) We've been calling these aspects of the API 'traits' (a word we borrowed from RAML). Regarding stability -- our view is that we also want clients to be coded to consume links at runtime, so that the client has as few hard-coded paths as possible (for example, the root). Start at the root, get the "models" or "createModel" link (where the API provides the uri) and invoke that operation from that information. |
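The composition of link sets described above could look roughly like the following Python sketch. It assumes a link set is a mapping from link relation name to link description; the function compose_links, the dictionary layout, and the choice of POST for createModel are illustrative assumptions, while the relation names "models" and "createModel" come from the comment.

```python
def compose_links(*traits):
    """Merge link sets; a later trait overrides an earlier one on the same relation."""
    merged = {}
    for trait in traits:
        merged.update(trait)
    return merged


# A reusable trait: the link that creates a model (method is assumed here).
CREATE_MODEL = {"createModel": {"method": "POST", "description": "create a model"}}

# Links specific to each resource.
ROOT_LINKS = {"models": {"method": "GET", "description": "list of models"}}
MODEL_LIST_LINKS = {"self": {"method": "GET", "description": "this list"}}

# The same trait is composed into both the API root and the list of models.
root = compose_links(ROOT_LINKS, CREATE_MODEL)
model_list = compose_links(MODEL_LIST_LINKS, CREATE_MODEL)

print(sorted(root))
print(sorted(model_list))
```

Defining the createModel link once and composing it into two resources keeps both occurrences in sync, which is the reason for wanting traits in the first place.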
@DavidBiesack, the x-interfaces / x-interface design in #576 is trying to address the problem you want to solve. When you write:

    definitions:
      Doohickey:
        properties:
          widget:
            description: hyperlink to a Widget
            format: uri
            type: string
            x-interface:
              $ref: '#/x-interfaces/Widget'

the x-interface declaration is saying that the 'widget' property of a Doohickey will hold an opaque URI: there is no path declaration for this URI and its format is not modeled in any other way. Further, this opaque URI supports the methods defined at #/x-interfaces/Widget. Of course, just because this design addresses your problem, doesn't mean you like it :) An advantage of this design is that it is quite incremental over the current Swagger 2.0; in fact it is legal Swagger 2.0 today. Note that the definition at /x-interfaces/Widget is also legal Swagger 2.0: it is the same Swagger that would normally be nested under a path. It can be included in the definition of a regular Swagger path, or, as above, used to define the interface that is supported by an opaque URL. |
@mpnally I finally spent enough time reading your x-interfaces posts to fully understand what you are suggesting. My ignorance, not your lack of clarity. When you commented on my suggestion to introduce "OperationTypes" in #563 I didn't understand why you suggested I would need to use YAML if my URL had parameters. Now I understand. I consider the definition of the parameters to be part of the "OperationType" and therefore wouldn't need to be defined explicitly under the path. Where (and if) the parameters defined by the OperationType appear in the URL is the job of the path template. This allows me to use a simple $ref under the path to identify the operation type. The only missing piece of the puzzle is a standard place to store these definitions. My use of the term OperationType was a thinly veiled attempt to get Link relation support into Open API. Personally, I think @ePaul 's term |
@darrelmiller I think parameters sometimes belong in the 'OperationType' itself and sometimes belong in the path. Here is an example. Suppose I am a person, and my URI is 'Interface' seems like a more intuitive name to me than 'OperationType' or 'URI class'. I think all 3 are trying to get at the same idea, unless I have perhaps misunderstood the other 2. Link Relation Types from RFC 5988 look to me more like predicate names, which would map more naturally to JSON properties. Many 'link relation types' may reference people, but there is a single 'interface' for all people across all those 'link relation types'. |
@darrelmiller When I think of "Link relation type", it always depends on two resources, the linked one, and the linking one. My "URI class" depends only on the linked one. Going with @mpnally's example, |
@ePaul The context resource of a link relation type may affect how the response is processed, but it is rare that it affects how the HTTP request to the target is actually made. Think rel="search", rel="stylesheet", rel="help". I can't think of any example where the mechanics of the request (parameters, payload, headers, response body) are affected by the identity of the context resource. |
@darrelmiller I agree with what you said in this last post, Darrel, but I don't see how it relates to @ePaul 's comment, which also seems correct to me. I must be missing something. |
@mpnally With reference to your first comment, I think I see the challenge. I'm used to using hypermedia to define relationships between resources, not using URL structures to do it. Having resources that are identified using URI parameters that are not communicated using link relation types (or some kind of embedded form) creates out of band coupling. However, I recognize that approach is targeted at people who are aiming for very low coupling and high evolvability, which comes at a cost. I need to think more about this from a more "coupling tolerant" perspective. |
(Not related to the current discussion about link relations, I'll answer separately on this.)

A note about implementation

Assume we defined an API with pathless operations (or "Operation types", "URI classes", "Interfaces" – however those are called), which are referenced in URI properties in response models. A client receiving such a response can then simply take the URI, and know which operations to do on it (from the Operation types). I can imagine a code generator putting an object implementing a generated interface/class in place of the URI, and when some of its methods are called, the correct HTTP operation is applied on the URI. (Of course, the server could do the same with a (foreign) URI it receives in some parameter property.)

For the server producing such URIs, and implementing the URI class, this is obviously not that easy – in contrast to the current model, where every path supported by the server has its "interface" defined in the API definition, now we might additionally have ones which are not. How can the HTTP-facing layer of the application know which implementation part to call for an incoming request?

One possible strategy would be to have a second (non-public) API definition file, which actually has all those paths defined, referring to the interface definitions in the public file for details. Then each one of those can be implemented separately. Yet we can update the server, add more paths implementing the same interfaces (or remove ones in the hope no client has stored the URI), and refer to them from the responses, without having to change anything in the public definition file, or any client.

A more dynamic strategy would be to use a database storing the association of path to interface. When we generate a new URI on our server (i.e. at the latest when it is handed out in some response, but could happen earlier), we store that URI (or the path portion of it), its interface (or some implementation specific information also identifying the interface), and enough implementation information to produce the actual content. |
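The "more dynamic strategy" could be sketched as follows in Python, with an in-memory dictionary standing in for the database. Everything here (the class InterfaceRegistry, the /r/N path scheme, the handler table keyed by interface name and method) is an assumption made for illustration, not something prescribed in the discussion.

```python
import itertools


class InterfaceRegistry:
    """In-memory stand-in for the database mapping paths to interfaces."""

    def __init__(self):
        self._routes = {}  # path -> (interface name, implementation data)
        self._ids = itertools.count(1)

    def mint(self, interface, data):
        """Create a fresh opaque path and record which interface it implements."""
        path = f"/r/{next(self._ids)}"
        self._routes[path] = (interface, data)
        return path

    def dispatch(self, method, path, handlers):
        """Pick the handler for an incoming request from the stored interface."""
        if path not in self._routes:
            return 404, None
        interface, data = self._routes[path]
        handler = handlers.get((interface, method.lower()))
        if handler is None:
            return 405, None
        return 200, handler(data)


# One handler per (interface, method) pair; only Person/GET exists in this sketch.
HANDLERS = {("Person", "get"): lambda data: {"name": data["name"]}}

registry = InterfaceRegistry()
path = registry.mint("Person", {"name": "MP"})
print(registry.dispatch("GET", path, HANDLERS))
```

The HTTP-facing layer only needs the stored interface name to find the implementation, so new paths can be minted at any time without touching the public API definition.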
@mpnally I'm just saying that when a client follows a link relation, the action it needs to take to make the request is not, in my experience, dependent on the context resource. It is also true that the action the client takes is not simply dependent on the type of the target resource. A "cancel" operation might apply to many different kinds of resources. |
@ePaul I think you are trying to take this much further than I was trying. I was trying to associate OpenAPI paths with a single "OperationType/UriClass/Interface". I think you are suggesting that there could be multiple and I think you are asking how the server can disambiguate which one was called. If that is the case, the problem there is that you are trying to layer a new set of semantics on top of HTTP, whereas I was simply trying to package up some HTTP semantics and communicate them to the client. |
@darrelmiller The difference between Interface/URI class/Operation type on the one hand, and link relation type on the other hand is more of a conceptual nature than on a technical one. The link relation type describes to the client how the linked resource relates to the context resource. All your examples "help", "stylesheet", "search", even "cancel" have different meanings (for the client) depending on where you find those links. (And yes, I agree that the relation type is important too.) But this is not what our "interface" is about. This is more about "In whichever relation I find a link declared as To expand, there might be this interface definitions:
      Person:
        type: object
        properties:
          name:
            type: string
          self:
            type: string
            format: uri
            interface:
              $ref: "#/interfaces/Person"
          siblings:
            type: array
            items:
              type: string
              format: uri
              relation: sibling
              interface:
                $ref: "#/interfaces/Person"
          partners:
            type: array
            items:
              type: string
              format: uri
              relation: partner
              interface:
                $ref: "#/interfaces/Person"
    interfaces:
      Person:
        get:
          responses:
            200:
              schema:
                $ref: '#/definitions/Person'

Doing a GET on a Person URI (e.g.

    {
      "name": "MP",
      "self": "https://example.org/people/12345",
      "siblings": [ "https://example.org/people/98765", "https://example.org/people/98764" ],
      "partners" : [ "https://example.org/people/56789" ]
    }

The links in the first array here would have |
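To make the client side of this Person example concrete, here is a small Python sketch that follows the "siblings" links purely by value, relying only on the knowledge that each link implements the Person interface (so a GET returns a Person). The dictionary FAKE_SERVER replaces real HTTP calls, and the second person's name and content are invented for the example.

```python
# Canned responses standing in for real HTTP GETs (second entry is invented).
FAKE_SERVER = {
    "https://example.org/people/12345": {
        "name": "MP",
        "self": "https://example.org/people/12345",
        "siblings": ["https://example.org/people/98765"],
        "partners": [],
    },
    "https://example.org/people/98765": {
        "name": "Sibling of MP",
        "self": "https://example.org/people/98765",
        "siblings": ["https://example.org/people/12345"],
        "partners": [],
    },
}


def get_person(uri):
    """GET on a Person URI; the Person interface promises a Person in return."""
    return FAKE_SERVER[uri]


def sibling_names(person_uri):
    """Follow every 'siblings' link without inspecting the URI structure."""
    person = get_person(person_uri)
    return [get_person(uri)["name"] for uri in person["siblings"]]


print(sibling_names("https://example.org/people/12345"))
```

The client treats each URI as opaque: it would keep working if the server started handing out sibling URIs on a different host or with a different path layout.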
@darrelmiller My "implementation note essay" is not about the case that there are multiple interfaces for a path, but where in the API specification there is no path for an interface, i.e. the case where the association path → interface is hidden at API definition time (and only known at run time to the client – and at "implementation time" (which is after API definition time) to the server). |
@darrelmiller If I were implementing the 'Person' API in the example, I would do exactly what you would do—I would always use Jane's |
@ePaul Regarding your "essay". Now I understand. OpenAPI is all about being explicit about what resources exist. If you only want to discover them at runtime (a la hypermedia) then OpenAPI quickly becomes mismatched. I think I see what you are trying to achieve with the notion of interface. Personally, I'm quite happy to use a combination of link relations and media type in the content-type header to do that description for me. I'm not really sure I am a fan of introducing another concept for clients and servers to couple on. |
@mpnally Yup. Got it. I would use |
@darrelmiller Right. So, back on the original topic, what are the HTTP requests I can send to |
@Bert-R OAS is never going to be a purist specification. It facilitates a variety of HTTP API design approaches, including retrofitting existing APIs whose design approaches may best be described as "ad hoc". All OAS can do is enable different API approaches to be described. So the only question here is "can OAS be used to properly describe a HATEOAS API?", and as long as the OAS supports semantics of "these things may or may not exist at runtime, and additional things not described here might also exist", then the answer is yes. Any debate about what is or isn't proper HATEOAS is irrelevant to the OpenAPI Specification. |
@handrews I understand. OAS is a broad specification and that's a good thing. My point is: for a HATEOAS extension of OAS, I don't see any value in expressing link availability. It's the nature of HATEOAS that link availability is a runtime aspect. @cbornet Creating an API spec for a HATEOAS application certainly makes sense: when creating a client for such a server, one needs to know:
|
I don't want to start a debate here but knowing anything about the application upfront is the opposite of REST. |
@cbornet Interesting thought. We shouldn't have that debate here, but do you have a blog or so (by yourself or someone else) that explains how one would write say a mobile app or some other client without knowing the things I mentioned? |
@Bert-R: Fielding has written about it and, among other things, states the following:
OAS is by definition the "out-of-band information" Fielding describes. @cbornet's OHM attempts to fix this, by bridging the gap between the runtime hypermedia controls and the external, static OAS document. |
@asbjornu Thanks! In either case, a static specification is created that defines the resource representation and its interpretation. I don't see the point of adding this information dynamically as proposed with OHM. Providing this dynamically causes two major issues:
|
@Bert-R, I don't understand it as a requirement that the entirety of the schema must be discovered during runtime. It just must be "entirely defined within the scope of the processing rules for a media type". And it is definitely possible to use OAS in a way that upholds that requirement, but only if you adhere to the following constraints:
All of these points are generalized and won't be true in every imaginable context, of course. For point 1 it can be argued that if you use a custom |
@asbjornu I don't see why we would need to rigidly stick with "entirely defined within the scope of the processing rules for a media type". OAS with JSON Schema could describe:
The paths that are used in the links should not be defined as paths in the OAS, as that would stimulate wrong behavior: just use these endpoints instead of following the links. With that, you stay within the concept of defining a mediatype, but in a pragmatic way, while retaining the dynamic hypermedia as the engine of application state. |
Coupling requests and responses to paths goes against HATEOAS. Knowing which possible requests and responses exist is of course required to be able to interoperate with a server, but knowing where and when they are going to show up is a limiting factor.
Sure.
Preferably not. The schema or profile of a given request and response should preferably not be coupled to either a path or an operation. Each request and response should be described "within the scope of the processing rules for the media type". Each request is decoupled from both the response and from which resource the request was made against. It is the server that should decide when a given response is served, and which possible future requests it makes available in that response. The client must of course know the shape of all possible requests and responses in order to deal intelligently with them when they appear, but the client should have no hard-coded presumptions about when they appear. The protocol that describes each possible request and response needs to be rich enough to allow for this flexibility and the effort of designing this protocol is what makes creating a custom hypermedia-enabled media type so difficult.
Indeed.
I can agree that your suggested solution may be considered pragmatic, but I would not say it's adhering to the ultimate dynamism of HATEOAS. |
I understand that HATEOAS supports even more dynamic models than the one I described, but I don't see the added benefits of it, while I clearly see the added cost (complexity of client development, effort of creating a new interface specification language, effort of creating client and server frameworks that support the hypermedia types). |
You still need to describe the URLs for one or several entry points. The client will need to start the interaction somewhere. |
I will second the idea that describing the JSON schema of response is exactly in line with Roy Fielding's description, albeit in a more simple way than using media types. It does not go against HATEOAS. |
The proposal of @mikekistler here would be a great step toward documenting HATEOAS APIs. |
@Bert-R While I agree it's a step in the right direction, I'm not a fan of using Unsafe, non-idempotent, and/or complex operations in HTML are called |
In the purist HATEOAS approach, OAS has no place at all. Everything is dynamic and the content and possibilities are only known at the time you have a response in hand. |
I'm curious, because I've never seen this "purist" approach described anywhere. It doesn't fit with Roy Fielding's descriptions, in his thesis or later. Who does HATEOAS with zero out-of-band information? |
Since that would be impossible, I guess the answer is: No one. |
Well, it's possible in toy examples, but as far as I know, you're right, nobody does that. I'd like to confirm that it's a strawman argument. |
Which argument would that be, and who in this thread has made that argument? |
@cbornet :
|
@kephas, I think that quote is taken a bit out of context. It is of course necessary for the client to know the details about the media type the server responds with. But it's a balancing act, and I agree with @cbornet that having the media type be But we do have a way to go before OpenAPI can be considered within the realm of HATEOAS. |
@kephas IMHO, you are missing a major point here: the OpenAPI specification describes content and possibilities, in a way that a media type could do as well. The responses contain actual links that reflect and drive the application state. A client application should never use the specification to determine what path to follow but only use the links that are returned from the server. A client developer should read the specification to see what possible links will be returned and what is required and can be expected when following these links. |
If an OpenAPI specification didn't contain any URL but the entry points and described the content of each kind of resource and where links are (and how they're supposed to be used, with the methods and type of bodies), it could BE a media type description. |
Frankly, I don't. This idea has been put forth several times across several issues around HATEOAS and I have never seen any context that gives it a more nuanced meaning. |
OK folks, there are plenty of places to debate HATEOAS purity or lack thereof, all of which are better than a long-closed issue. I'm going to lock this issue now. For ideas on HATEOAS and OpenAPI (keeping in mind that the OAS has never claimed to be a runtime hypermedia system and uses the phrase "HTTP APIs" rather than "REST APIs" very intentionally) there is OAI/sig-moonwalk#30. Please join the discussion there where it might actually influence the future direction of OpenAPI. For general debates about HATEOAS/hypermedia/etc., there are many slack/discord/whatever forums. |
TLDR
(added after some discussion)
Background / Current Situation
I work at a company where our Restful API Guidelines prescribe both "document your API using OpenAPI" and "Use HATEOAS" ("Hypertext as the Engine of Application State", one of the core principles of REST).
Unfortunately both do not work together nicely:
Currently the intersection between both is "have links in your result, but just to URIs which are also described as paths in your Swagger definition, and describe in descriptions what kind of URIs you can expect".
Ideas
A core point of HATEOAS are media types and link relations. An API definition should mainly be a description of the media types and what operations they imply for links included in them. (Paraphrased from a blogpost of Roy Fielding.)
Media type definitions are analogous to what OpenAPI's definitions are currently, might even be an extension of that. For models with a property of type string, format uri we can then additionally refer to the URI class of that URI.
A URI class is defined analogously to the current path definitions, but with some name to be used in model definitions. The URI class defines what operations are possible on an URI, which parameters are allowed/needed (no path parameters, I guess), and what can be expected in return. (The URI class is not a property of the URI itself, but of its usage in a specific place. The same URI could appear with multiple URI classes.)
I'm not yet sure how link relations can come into this – I guess we might need a way to define the URI class depending on link relation or similar.
A client needs just some initial "bookmark URI" (so a single path definition might be fine), and then can take a link (with a given URI class) and use the operation behind the URI class to do what it wants to do. The format of the second URI doesn't matter anymore to the client – it can even have a different domain name than the bookmark one. But still everything can be strongly typed if wanted.
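One way to picture the "strongly typed" link idea is a small wrapper that pairs a URI with the operations its URI class declares and refuses anything else. The Python sketch below is only an illustration under that assumption: TypedLink, fake_transport, and the example URI on a different domain are all made up, and a real client would plug in an HTTP library as the transport.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedLink:
    """A URI together with the operations its URI class declares."""

    uri: str
    uri_class: str
    operations: frozenset

    def request(self, method, transport):
        """Send the request only if the URI class declares this method."""
        method = method.lower()
        if method not in self.operations:
            raise PermissionError(
                f"{self.uri_class} does not declare {method.upper()}"
            )
        return transport(method, self.uri)


def fake_transport(method, uri):
    # Stand-in for an HTTP library call; it just echoes what would be sent.
    return f"{method.upper()} {uri}"


# The link may point anywhere; only its URI class matters to the client.
link = TypedLink(
    uri="https://other.example.net/x/42",
    uri_class="Widget",
    operations=frozenset({"get", "put"}),
)
print(link.request("GET", fake_transport))
```

The client code never parses or constructs the URI, so its format and even its domain can differ from the bookmark URI.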