Use a build tool #959
Comments
I think this is the first thing we need if we are to get some automation going. I'll make a draft PR soonish. |
Hi @fkleedorfer! Good idea, but could you elaborate a bit on what you want to automate?
|
I would like the work to be done in reasonably small chunks (because I don't have enormous amounts of time for it), so I'd like to first not add new functionality, just automate what exists. We are looking at a lot of things that can be added once the build automation is in place. The first problem is choosing the build system itself. I did not get a lot of input on that question in the discussion; however, the current favorite is Maven. That's what my PR will be about. At the moment I am looking at how to do TTL formatting in that setting. (Probably Jena prettyprint, but we'll see; there is also https://github.com/atextor/turtle-formatter ). Weirdly, there is no Maven integration for either. (Sideglance spotless) |
@fkleedorfer But is there a problem with the turtle formatting of QUDT? I think it comes from TQ, and I think it's just fine? |
(Accidentally deleted my post, so I'm rewriting it here.) If formatting were part of the build, our life would be easier. That is not to say that TQ formatting is bad. If we can use it in a build, then maybe we should. |
I think the serialization we use in TopBraid is fairly common - alphabetical by grouped subject - isn't it? I assume that same serialization is available via the TQ API if we use that for inferencing and validation in the build, although I haven't checked. I'm not sure what the PySHACL library does, but my understanding is that it is slower and not complete. |
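For illustration, the "alphabetical by grouped subject" ordering mentioned above can be sketched in a few lines of Python. This is a toy sketch that treats triples as plain string tuples; it is not the TopBraid serializer or the algorithm of any formatter named in this thread, and the function name and sample data are hypothetical. It also does nothing about blank node labels.

```python
from collections import defaultdict


def group_and_sort(triples):
    """Group (subject, predicate, object) tuples by subject.

    Subjects are sorted alphabetically, and the (predicate, object)
    pairs of each subject are sorted alphabetically as well.
    """
    grouped = defaultdict(list)
    for s, p, o in triples:
        grouped[s].append((p, o))
    return [(s, sorted(grouped[s])) for s in sorted(grouped)]


# Hypothetical sample data, for illustration only.
triples = [
    ("unit:M", "rdfs:label", '"metre"'),
    ("unit:KiloGM", "rdf:type", "qudt:Unit"),
    ("unit:M", "rdf:type", "qudt:Unit"),
]

for subject, pairs in group_and_sort(triples):
    print(subject)
    for predicate, obj in pairs:
        print(f"    {predicate} {obj}")
```

Because the ordering depends only on the strings themselves, two contributors serializing the same named resources would get the same output, which is what keeps diffs small.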
OWL-API is also common. @ashleysommer @nicholascar can you comment on completeness of pySHACL? |
Else go for RDF Canonicalization https://www.w3.org/TR/rdf-canon/ (is this in the TQ Suite?) |
Canonicalization is relevant for consistent ordering of blank nodes across multiple serializations. That's the one thing most formatters will fail to do. |
Don't most contributors submit relatively small PRs, typically new units, where they can follow the existing formatting even by hand? In addition to the question of formatting, let's collect other needs for a build workflow. Like checking data consistency using SPARQL. see my two bullets above. |
Would you be ok wrapping the SPARQL queries in a SHACL shape or would you prefer another way, such as a folder with files containing sparql queries, and some convention for how their results should be interpreted? |
I vote for a SHACL shape, since we already do other validations that way (not yet part of the build). |
@steveraysteveray and @fkleedorfer SHACL vs SPARQL:
I like it. If classes and props follow naming conventions, then that sorts them in the proper order. But I see Florian contributing to https://github.com/atextor/turtle-formatter: |
My point would be that formatting should be accessible to any developer who wants to contribute. I don't think that will be the case with TopBraid. I was hoping to be able to do it with jena, but it's not so simple. turtle-formatter is a decent solution for us (if it works, which is what I'm working on). As there is more to formatting your codebase than just formatting one file, I've prepared a contribution to spotless - a spotless RDF plugin, if you like, that will use whatever we manage on the file-formatting side (turtle-formatter for TTL, jena for everything else, or just not support anything else), to format the whole codebase. The spotless RDF plugin is more or less done, except for tests, and we'll need a published turtle-formatter jar with our changes. EDIT: My impression of turtle-formatter is that its default output is ok, it is highly configurable, and the codebase is small and I'm confident we can contribute any formatting options that we need, for example, individuals last. |
|
@nicholascar is this the formatter you use? |
|
RDF Toolkit seems like a good tool, but it does not have the stable inline blank nodes feature I just put into turtle-formatter: edmcouncil/rdf-toolkit#49. The good thing is that now I know how to do it ;-) - but I don't know if I want to put in the time again. However, I like their approach to formatting (a git hook with the binary; all you need to do is install Java and set JAVA_HOME), and I do like the end result of their pipeline: https://spec.edmcouncil.org/fibo/ontology/ @ralphtq @steveraysteveray @jhodgesatmb might want to see this as another possible direction to take the whole build/publication process. Not convinced at this point that all of this warrants a switch, but it's certainly worth thinking about. |
Hi all, pySHACL is under active development, as we are using it every day in large projects led by Ashley, its main developer.
Also, please note that there is a SHACL WG proposed (some of you have contributed to this proposal already!), and one of the deliverables for the WG, which I will likely lead, is an "Inferencing Rules" spec that will work to standardise how SHACL build rules may be ordered, bundled, selected etc.: https://w3c.github.io/shacl/charter-1.2/shacl-wg.html#deliverables
Sure, there are lots of build rule systems out there, but many are not widely used (e.g. RIF-CS), whereas SHACL is widely used for validation, some UI generation and some data building, as per TQ products. So let's extend and standardise that so all our data building rules can be just a bunch of RDF Shapes files, not weird other-language things or over-the-top reasoning envelopes.
Nick
|
Well, FWIW, the shacl-maven-plugin was just released, which supports validation and inferencing. @nicholascar thanks for the pointer to those standardization efforts. Very much looking forward to the results. Hopefully, not too far off SHACL-AF Rules. |
PR #975 addresses this issue |
Use a build tool?
Problem: All Issues brought up so far require or aim at some kind of build automation. There currently is none.
Why is that a problem: Anything that needs to be done manually will cause errors, bottlenecks and dependency on individuals
Cause: Most programming languages/frameworks come with a variety of build tools, and most projects use one. However, this is an ontology project, inherently independent from programming languages, and therefore, it is not obvious what should be used. That is probably the reason why none is in use.
Fix: Choose one build tool that the community can live with and refactor the project so it uses that tool. Bonus: github actions become easier to make and maintain because they might only need to run some build targets
So, question: What would be your criteria for choosing a build tool, and which one, if any, should it be?
Originally posted by @fkleedorfer in #942 (comment)
Edit: collecting requirements/ideas/aspects from the comments here (and my own)
This issue is not about adding new functionality, just about automating what is currently done manually or semi-automatically
Incomplete list of future functionality to be implemented in the build
derivedCoherentUnitOfSystem, hasBaseUnit and conversionMultiplier
#952 to check that all and only "fundamental units" have conversionMultiplier=1
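As a rough illustration of the "all and only" check in the last item, here is a minimal Python sketch. It assumes the multipliers have already been read into a plain dict and that the set of "fundamental units" is supplied as input (how that set is determined is up to #952). The function name and the `ex:` identifiers are hypothetical, and in the build such a check would more likely be expressed as a SHACL shape.

```python
from decimal import Decimal


def check_multipliers(multipliers, fundamental):
    """Check that all and only fundamental units have multiplier 1.

    multipliers: dict mapping unit id -> Decimal conversion multiplier
    fundamental: set of unit ids considered fundamental (an input assumption)

    Returns two sorted lists:
      - fundamental units whose multiplier is missing or not 1
      - non-fundamental units whose multiplier is 1
    """
    one = Decimal(1)
    bad_fundamental = sorted(u for u in fundamental if multipliers.get(u) != one)
    bad_other = sorted(
        u for u, m in multipliers.items() if m == one and u not in fundamental
    )
    return bad_fundamental, bad_other


# Hypothetical sample data, for illustration only.
multipliers = {
    "ex:BaseUnit": Decimal("1"),
    "ex:ScaledUnit": Decimal("1000"),
    "ex:Suspicious": Decimal("1.0"),
}
fundamental = {"ex:BaseUnit"}

bad_fundamental, bad_other = check_multipliers(multipliers, fundamental)
print("fundamental units without multiplier 1:", bad_fundamental)
print("other units with multiplier 1:", bad_other)
```

Returning both lists separately keeps the two directions of "all and only" distinguishable, so a failing build can report which half of the rule was violated.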