feat: Validate dependencies.yaml using jsonschema #29
As an example, applying this patch:
and rerunning tests results in
Just making some notes of my planned next steps here to help reviewers understand where I was going with this and identify any blind spots I may have:
In addition to the inline-comments, some general recommendations:
- We should strive to significantly increase the composability of the schema file. Using `$ref` links, it is possible to define sub-schemas such as RequirementString, RequirementStrings (or RequirementStringList), and Requirements, which could be either of these two.
- As already mentioned, the schema should be placed in a separate file.
- We need to think about how the schema is maintained and how it is published. One of the easiest approaches is to tie it to the package version and then access it directly via the GitHub raw or download endpoint. That way the version would be encoded directly in the URL. The downside of that approach is that the schema version could change even when the schema itself has not. Finally, it would need to be understood that the schema is part of the API (even if the user documentation does not directly reference it) and must therefore be backwards-compatible unless a release with breaking changes is made.
- We should also validate the schema itself as part of the tests for this repository.
Overall, I would recommend refactoring this schema to be significantly more modular before actually checking its correctness. I find it extremely hard to read right now, and reviewing it as-is would take me significantly more effort than refactoring it first and then comparing it to the current non-formalized specification in the documentation.
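To illustrate the composability and self-validation points above, here is a minimal sketch, assuming draft-07 and the `jsonschema` package. The schema is a hypothetical, heavily simplified stand-in and not the schema from this PR; the definition names and the `packages` key are made up for illustration.

```python
from jsonschema import Draft7Validator

# Hypothetical, simplified schema: the names under "definitions" follow the
# suggestion above and do not reflect the project's actual schema.
SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "definitions": {
        "requirement_string": {"type": "string", "minLength": 1},
        "requirement_strings": {
            "type": "array",
            "items": {"$ref": "#/definitions/requirement_string"},
        },
        # Either a single requirement string or a list of them.
        "requirements": {
            "oneOf": [
                {"$ref": "#/definitions/requirement_string"},
                {"$ref": "#/definitions/requirement_strings"},
            ]
        },
    },
    "type": "object",
    "properties": {"packages": {"$ref": "#/definitions/requirements"}},
    "required": ["packages"],
    "additionalProperties": False,
}

# Validate the schema itself against the draft-07 metaschema. This raises
# jsonschema.exceptions.SchemaError if the schema is malformed, so it can
# serve as a unit test for the schema file.
Draft7Validator.check_schema(SCHEMA)

validator = Draft7Validator(SCHEMA)
single_ok = validator.is_valid({"packages": "numpy>=1.21"})
list_ok = validator.is_valid({"packages": ["numpy", "pandas"]})
bad = validator.is_valid({"packages": [1, 2]})
```

Each sub-schema is defined once and reused through `$ref`, which keeps the top-level structure short enough to compare against the prose specification.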
The previous version would actually require an argument.
refactor: schema validation
Thanks to a makeover from @csadorf I think this PR is ready for review. @ajschmidt8 let me know what you think of it. One minor note, I snuck in an isort bugfix.
🎉 This PR is included in version 1.1.0 🎉 The release is available on: Your semantic-release bot 📦🚀
This PR enables validating the contents of a dependencies.yaml file directly, without doing any processing. The schema is encoded using [JSON Schema](https://json-schema.org/) and validated using [the Python implementation](https://python-jsonschema.readthedocs.io/). The new Python code is fairly minimal; it would be even shorter had I not used the object-oriented API to show all errors in a file instead of only the first error, as `jsonschema.validate` does. The majority of the new lines are from the schema definition.

The validation is injected into the normal CLI usage so that dependencies.yaml files are always validated against the schema before dependency files are generated, ensuring that developers see useful errors about why their dependencies.yaml file is invalid rather than opaque runtime errors when dfg fails to use the file.

Co-authored-by: Simon Adorf <[email protected]>
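The difference between reporting the first error and reporting all errors can be sketched as follows. This is a minimal illustration assuming the `jsonschema` package and draft-07; the schema, the sample document, and the helper names `first_error` and `all_errors` are hypothetical and are not the code added in this PR.

```python
from jsonschema import Draft7Validator, ValidationError, validate

# Hypothetical toy schema, not the project's real one.
SCHEMA = {
    "type": "object",
    "properties": {
        "channels": {"type": "array", "items": {"type": "string"}},
        "files": {"type": "object"},
    },
    "required": ["files"],
}

# Two independent problems: "channels" is not a list, and "files" is missing.
document = {"channels": "conda-forge"}


def first_error(instance):
    """Functional API: validation stops at a single raised error."""
    try:
        validate(instance, SCHEMA, cls=Draft7Validator)
    except ValidationError as err:
        return err.message
    return None


def all_errors(instance):
    """Object-oriented API: iter_errors yields every violation."""
    validator = Draft7Validator(SCHEMA)
    errors = sorted(
        validator.iter_errors(instance),
        key=lambda e: [str(part) for part in e.absolute_path],
    )
    return [
        f"{'/'.join(str(p) for p in e.absolute_path) or '<root>'}: {e.message}"
        for e in errors
    ]


for line in all_errors(document):
    print(line)
```

Collecting every error in one pass lets a user fix a file in a single round instead of rerunning the tool after each fix.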