Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/1693 api v3 versioned model - Dataset v3 implementation #2046

Merged
merged 34 commits into from
May 9, 2022

Conversation

dk1844
Copy link
Contributor

@dk1844 dk1844 commented Apr 4, 2022

This PR currently implements Enceladus API v3 for

  • general versioned model v3

    • note that general /{modelName}/{entityName}/{versions}/validate exists
  • dataset model v3; for the dataset API, a few changes have been done in the design:

    • .../properties sub-endpoints has been moved from the general versioned model to datasets (dataset-specific feature)
    • .../properties/{propertyName} has not been implemented, since the .../properties endpoint has no additional logic of validation (because of the existence of a general validation endpoint instead)
  • later ADDED: schema and MT existence checking (those referenced in datasets (for MTs - in MCR))

  • schemas, mapping tables, and property definitions are to follow; probably in another PR.

Partially implements #1693
Closes #1611

Release notes suggestion

/api-v3/datasets/ Rest API V3 added. Includes checks on entities and is to be used externally. The API is now truly RESTful (endpoint naming, structure, methods, Location header on creating responses). Also, Swagger API doc reacts to Spring dev profile (JVM arg -Dspring.profiles.active=dev) to show legacy V2 API doc, too.

dk1844 added 19 commits March 7, 2022 15:35
…e contours, empty DatasetControllerV3 to test
 - login allowed at /api/login & /api-v3/login
 - v2/v3 BaseRestApiTest distinquished
…st and post-import + IT to prove correct behavior
…mport + IT to prove correct behavior - fix with common location processing with segment stripping (+normalization)
…ion} now works for # or 'latest' (IT = regression test)
…{name}/{version} and /{name}/latest - improved
…- supports latest for as version-expression, impl for datasets improved by actual existence checking + IT test cases for non-existing/non-latest queries
…properties - supports latest for as version-expression; get impl is unvalidated, put impl checks validity

 - login is now common, under /api/login for both v2 and v3 (did not work previously)
…properties - supports latest for as version-expression; get impl is unvalidated, put impl checks validity - extended for different validation cases

 - login is now common, under /api/login for both v2 and v3 (did not work previously)
…ture -> CompletableFuture (mistake reverted)
… added. IT mostly adjusted, but there are todos

 - DatasetServiceV3 introduced to carry difference in behavior to DatasetService. original entity validation has been divided into create-validation and regular-entity validation.
 - buildfix for VersionedModelServiceTest
…asets/dsName/version/rules, GET datasets/dsName/version/rules/# + IT
Copy link
Contributor

@AdrianOlosutean AdrianOlosutean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it overall, just some small comments.

Also, do you plan to change the frontend to use this api?

@GetMapping(Array("/{name}/export"))
@ResponseStatus(HttpStatus.OK)
def exportLatestEntity(@PathVariable name: String): CompletableFuture[String] = {
versionedModelService.exportLatestItem(name) // todo: remove in favor of the above? (that supports /{name}/latest/export)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my point of view, this can stay

Copy link
Contributor Author

@dk1844 dk1844 Apr 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this is part of the v3 API outline. I consider this misleading, because since you have the option to export specific versions already ( at .../{model}/{name}/{versionEitherNumberOrLatestKeyword}/export, having the option for export here would have to have another meaning - e.g. export the whole entity with all the versions. But since this is designed as the latest export as well, I consider this unnecessary, and as I said, even misleading.

That is why I am proposing to omit this endpoint entirely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if we gave the "latest" version we can omit this.

Another point to think about can be - export/import being on the entity itself. Export having parameters of name and version.

Copy link
Contributor Author

@dk1844 dk1844 May 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After even further discussion, I have decided to remove the endpoint based on the following logic:

The structure of API endpoints representing a collection should, in general, behave in the following fashion:

  • GET /entityNameInPlural -> list items or provide information about all/more items
  • GET /entityNameInPlural/individualEntityKeyEgId -> give information about a specific entity

By that logic, GET /{name}/export would be appropriate if all entities or information about all/more was returned -- perhaps all versions, while GET /{name}/{version}/export would export just a single version.

But since we mapped just the single version (the latest one) export to a collection's endpoint, the logic does not fit. Moreover, given the fact that we also accept latest as a valid version identifier, it seem much more appropriate to use GET /{name}/{version}/export here (specifically GET /{name}/latest/export)

@dk1844
Copy link
Contributor Author

dk1844 commented Apr 6, 2022

...
Also, do you plan to change the frontend to use this API?

In the end, I think the UI should get changed to use this API. But that is obviously outside of the scope of the API update PRs.

@dk1844 dk1844 added the work in progress Work on this item is not yet finished (mainly intended for PRs) label Apr 6, 2022
@dk1844 dk1844 removed the work in progress Work on this item is not yet finished (mainly intended for PRs) label Apr 7, 2022
@dk1844 dk1844 linked an issue Apr 7, 2022 that may be closed by this pull request
@dk1844 dk1844 requested a review from AdrianOlosutean April 7, 2022 08:40
@@ -16,3 +16,5 @@
package za.co.absa.enceladus.model.versionedModel

case class VersionedSummary(_id: String, latestVersion: Int)

case class VersionList(_id: String, versions: Seq[Int])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't Range be used for versions?

Copy link
Contributor Author

@dk1844 dk1844 Apr 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, it could. Or even VersionedSummary with just the latest version would do (assuming that all entities start with v1).

Attachments are the exception, they take after the schema version and there may be holes, but attachments are not exposed directly.

Then there is the matter of not being able to prove that all versions are actually there in the DB, we assume they should be, but... how sure can you be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps let's just use just VersionedSummary - with the latest version

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -113,7 +113,7 @@ class DatasetController @Autowired()(datasetService: DatasetService)
def replaceProperties(@AuthenticationPrincipal principal: UserDetails,
@PathVariable datasetName: String,
@RequestBody newProperties: Optional[Map[String, String]]): CompletableFuture[ResponseEntity[Option[Dataset]]] = {
datasetService.replaceProperties(principal.getUsername, datasetName, newProperties.toScalaOption).map {
datasetService.updatePropertiesV2(principal.getUsername, datasetName, newProperties.toScalaOption).map {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the V2 and V3 suffixes are because of Spring autowiring?
(Looks ugly to me, to be honest 😄 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this has nothing to do with spring autowiring. This is just a way to have V2 and V3 implementation in some services so that the v2 version can be removed easily when only v3 is to stay.

If we want to keep v2 and v3 long-term, this would require to create separate services, e.g. datasetService and datasetServiceV3. One of these solutions is, I believe, necessary if you want to keep the original implementation working.

It then begs the question - what is the intended lifetime of V2 API? Knowing this better, it could help to design better: copy-paste-and-change-a-few-things knowing that the V2 impl will be gone soon vs. keeping v3 extending v2 (no code duplication, but harder removal)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The v3 method moved to DatasetServiceV3, so now it is tidier. Thanks for the tip, @benedeki.

Comment on lines +206 to +209
validation <- for {
generalValidation <- validate(item)
creationValidation <- validateForCreation(item)
} yield generalValidation.merge(creationValidation)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will execute the Futures in serial manner.

Suggested change
validation <- for {
generalValidation <- validate(item)
creationValidation <- validateForCreation(item)
} yield generalValidation.merge(creationValidation)
val generalValidationFuture = validate(item)
val creationValidationFuture = validateForCreation(item)
validation <- for {
generalValidation <- generalValidationFuture
creationValidation <- creationValidationFuture
} yield generalValidation.merge(creationValidation)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I don't see it as a problem (faster multiple-at-once vs possibly not needed - I don't even see a clear winner here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO Let's comment in place on how the for-comprehension will only run the necessary checks up until the point where it fails first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explanatory comment added, thanks for the tip.

Comment on lines +56 to +60
for {
originalValidation <- super.validate(item)
propertiesValidation <- validateProperties(item.propertiesAsMap)
schemaValidation <- validateSchemaExists(item.schemaName, item.schemaVersion)
rulesValidation <- validateRules(item)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto with concurrency:

Suggested change
for {
originalValidation <- super.validate(item)
propertiesValidation <- validateProperties(item.propertiesAsMap)
schemaValidation <- validateSchemaExists(item.schemaName, item.schemaVersion)
rulesValidation <- validateRules(item)
val originalValidationFuture = super.validate(item)
val propertiesValidationFuture = validateProperties(item.propertiesAsMap)
val schemaValidationFuture = validateSchemaExists(item.schemaName, item.schemaVersion)
val rulesValidationFuture = validateRules(item)
for {
originalValidation <- originalValidationFuture
propertiesValidation <- propertiesValidationFuture
schemaValidation <- schemaValidationFuture
rulesValidation <- rulesValidationFuture

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explanatory comment added, thanks for the tip.

AdrianOlosutean
AdrianOlosutean previously approved these changes May 2, 2022
Copy link
Contributor

@AdrianOlosutean AdrianOlosutean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

Comment on lines 42 to 48
or[String](
// api v2
regex("/api/dataset.*"), regex("/api/schema.*"),
regex("/api/mappingTable.*"), regex("/api/properties.*"),
regex("/api/monitoring.*"),regex("/api/runs.*"),
regex("/api/monitoring.*"), regex("/api/runs.*"),
regex("/api/user.*"), regex("/api/spark.*"),
regex("/api/configuration.*")
regex("/api/configuration.*"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not document v2 API as we do not want people to use it. Thoughts?

Maybe spring fox could have multiple profiles? Dev and non-dev

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with not documenting v2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I have changed the implementation to the following:

  • JVM being supplied -Dspring.profiles.active=dev will result in full V2 + V3 API as before,
  • by default (no special profile setup) will only allow Swagger to show API V3 + import/export methods of API v2 (these we consider public API in V2 already).

Hope this is satisfactory.

Showcase DEV -Dspring.profiles.active=dev

menas-api-dev

Showcase Non-DEV

menas-api-nondev

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redone due to #2046 (comment).
Menas DEV API remains unchanged, for Menas non-DEV API now only contains V3 API endpoints (i.e. V2 import/export removed)

…vice and DatasetServiceV3 with renaming.

VersionList removed in favor of VersionedSummary everywhere.
- explanatory comments
Copy link
Contributor

@Zejnilovic Zejnilovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have tested all the endpoints at this point.

Have not reviewed all the code.

versionedModelService.getAuditTrail(name)
}

@GetMapping(Array("/{name}/{version}/used-in"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does a global used-in make sense since Dataset does not need it? I think this should be specific for the entity.

Copy link
Contributor Author

@dk1844 dk1844 May 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I think it makes sense to keep it.

Since the disable is available for all entity types, so should used-in. The nice thing about it is that there can be a common approach to entities:

  1. making sure that there is anything preventing from disabling (used-in)
  2. dealing with linked references returned by used-in if any
  3. disabling the entity knowing that it should succeed (without surprises)

It is quite reassuring for the API user to be able to call used-in even if currently for a certain type there cannot be any entity linked, it perhaps may be in the future?

All in all, I am a fan of having used-ins everywhere (where disable is).

@GetMapping(Array("/{name}/export"))
@ResponseStatus(HttpStatus.OK)
def exportLatestEntity(@PathVariable name: String): CompletableFuture[String] = {
versionedModelService.exportLatestItem(name) // todo: remove in favor of the above? (that supports /{name}/latest/export)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if we gave the "latest" version we can omit this.

Another point to think about can be - export/import being on the entity itself. Export having parameters of name and version.

val paths: Seq[Predicate[String]] = if (isDev) {
v2devPaths ++ v3paths
} else {
v2prodPaths ++ v3paths
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
v2prodPaths ++ v3paths
v3paths

So prod APIs should be only v3 paths as the import-export being "prod" is only an ABSA thing.

This would result in dev having all v2 and v3 paths and prod only v3.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification, suggestion applied (along with removing the unnecessary code)

@Zejnilovic Zejnilovic added the PR:tested Only for PR - PR was tested by a tester (person) label May 9, 2022
@sonarcloud
Copy link

sonarcloud bot commented May 9, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 17 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@dk1844 dk1844 requested a review from Zejnilovic May 9, 2022 08:13
Copy link
Contributor

@Zejnilovic Zejnilovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code reviewed
Api tested

@dk1844 dk1844 merged commit a3fdde4 into develop-ver-3.0 May 9, 2022
@dk1844 dk1844 deleted the feature/1693-api-v3-versioned-model branch May 9, 2022 08:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR:tested Only for PR - PR was tested by a tester (person)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Experimental API 3 - Versioned Model
4 participants