`--inline-blank-nodes` on Turtle-Turtle normalization fails on blank node that only bears a `rdf:type` #52

ajnelson-nist · 2023-05-18T21:40:27Z

I've encountered another issue with the --inline-blank-nodes flag. I don't suspect it is wholly related to #49, but some shared code could be an influence.

My Java runtime is version 18, and I've freshly reproduced this issue on v1.14.2.

$ openssl dgst -sha3-256 rdf-toolkit-1.14.2.jar 
SHA3-256(rdf-toolkit-1.14.2.jar)= 2d0efd578994243d43e629629b3bf44da4350268aee8d3c1bae2784ca243a924

I have this input data:

@prefix ex: <http://example.org/ontology/> .
@prefix kb: <http://example.org/kb/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

kb:thing-1
	a owl:Thing ;
	ex:property-1 [
		a owl:Thing ;
	] ;
	.

On running this command ...

java -jar rdf-toolkit-1.14.2.jar --source-format turtle --target-format turtle --source test-input.ttl --target test-output.ttl

... I get output I roughly expect.

@prefix ex: <http://example.org/ontology/> .
@prefix kb: <http://example.org/kb/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

kb:thing-1
	a owl:Thing ;
	ex:property-1 _:blank1 ;
	.

_:blank1
	a owl:Thing ;
	.

However, on running the same command with --inline-blank-nodes added ...

java -jar rdf-toolkit-1.14.2.jar --inline-blank-nodes --source-format turtle --target-format turtle --source test-input.ttl --target test-output.ttl

... I get a stack trace:

17:17:07.807 ERROR o.e.rdf_toolkit.RdfFormatter - RdfFormatter: stopped by unexpected exception: 
17:17:07.809 ERROR o.e.rdf_toolkit.RdfFormatter - RDFHandlerException: unable to generate/write RDF output
17:17:07.809 ERROR o.e.rdf_toolkit.RdfFormatter - org.eclipse.rdf4j.rio.RDFHandlerException: unable to generate/write RDF output
	at org.edmcouncil.rdf_toolkit.writer.SortedTurtleWriter.endRDF(SortedTurtleWriter.java:179)
	at org.eclipse.rdf4j.rio.Rio.write(Rio.java:582)
	at org.edmcouncil.rdf_toolkit.runner.RdfToolkitRunner.runOnFile(RdfToolkitRunner.java:218)
	at org.edmcouncil.rdf_toolkit.runner.RdfToolkitRunner.run(RdfToolkitRunner.java:104)
	at org.edmcouncil.rdf_toolkit.RdfFormatter.run(RdfFormatter.java:64)
	at org.edmcouncil.rdf_toolkit.RdfFormatter.main(RdfFormatter.java:47)
Caused by: org.eclipse.rdf4j.rio.RDFHandlerException: unable to generate/write RDF output
	at org.edmcouncil.rdf_toolkit.writer.SortedRdfWriter.endRDF(SortedRdfWriter.java:534)
	at org.edmcouncil.rdf_toolkit.writer.SortedTurtleWriter.endRDF(SortedTurtleWriter.java:177)
	... 5 more
Caused by: java.lang.NullPointerException: Cannot invoke "org.edmcouncil.rdf_toolkit.model.SortedTurtleObjectList.iterator()" because "firstValues" is null
	at org.edmcouncil.rdf_toolkit.comparator.ComparisonUtils.isCollection(ComparisonUtils.java:138)
	at org.edmcouncil.rdf_toolkit.writer.SortedTurtleWriter.writeObject(SortedTurtleWriter.java:410)
	at org.edmcouncil.rdf_toolkit.writer.SortedTurtleWriter.writeObject(SortedTurtleWriter.java:397)
	at org.edmcouncil.rdf_toolkit.writer.SortedTurtleWriter.writePredicateAndObjectValues(SortedTurtleWriter.java:335)
	at org.edmcouncil.rdf_toolkit.writer.SortedTurtleWriter.writeSubjectTriples(SortedTurtleWriter.java:294)
	at org.edmcouncil.rdf_toolkit.writer.SortedRdfWriter.endRDF(SortedRdfWriter.java:508)
	... 6 more

Strangely, none of these similar test inputs trigger an error:

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[]
	a owl:Thing ;
	.

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

[ a owl:Thing ; ] .

@prefix ex: <http://example.org/ontology/> .
@prefix kb: <http://example.org/kb/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

kb:thing-1
	a owl:Thing ;
	ex:property-1 [
		rdfs:label ""@en ;
	] ;
	.

@prefix ex: <http://example.org/ontology/> .
@prefix kb: <http://example.org/kb/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

kb:thing-1
	a owl:Thing ;
	ex:property-1 [] ;
	.

This next sample did fail with the same stack trace:

@prefix ex: <http://example.org/ontology/> .
@prefix kb: <http://example.org/kb/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

kb:thing-1
	a owl:Thing ;
	ex:property-1 _:blank1 ;
	.

kb:thing-2
	a owl:Thing ;
	ex:property-1 _:blank1 ;
	.

_:blank1
	a owl:Thing ;
	.

In summary, the stack trace seems to appear when:

* --inline-blank-nodes is passed,
* there is a blank node that is in the object position of one or more triples, and
* the only triple with that blank node as subject has predicate rdf:type.
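
The graph-side conditions in the summary above can be checked mechanically. The following is only a rough sketch, not rdf-toolkit code: it assumes triples are plain (subject, predicate, object) string tuples and that blank nodes carry a `_:` label prefix. The function name `type_only_object_bnodes` is hypothetical.

```python
# Illustrative only: triples are plain (subject, predicate, object) string
# tuples, and blank nodes are assumed to be labelled with a "_:" prefix.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def type_only_object_bnodes(triples):
    """Return blank nodes that occur as an object and whose only
    triple as subject has predicate rdf:type."""
    object_bnodes = {o for _, _, o in triples if o.startswith("_:")}
    predicates_by_subject = {}
    for s, p, _ in triples:
        predicates_by_subject.setdefault(s, []).append(p)
    return sorted(
        b for b in object_bnodes
        if predicates_by_subject.get(b) == [RDF_TYPE]
    )


# Mirrors the first failing input from this report.
failing_input = [
    ("kb:thing-1", RDF_TYPE, "owl:Thing"),
    ("kb:thing-1", "ex:property-1", "_:blank1"),
    ("_:blank1", RDF_TYPE, "owl:Thing"),
]
# Mirrors the passing input where the blank node only has an rdfs:label.
labelled_input = [
    ("kb:thing-1", RDF_TYPE, "owl:Thing"),
    ("kb:thing-1", "ex:property-1", "_:blank1"),
    ("_:blank1", "rdfs:label", '""@en'),
]

print(type_only_object_bnodes(failing_input))   # ['_:blank1']
print(type_only_object_bnodes(labelled_input))  # []
```

A check like this could flag inputs likely to hit the stack trace before rdf-toolkit is run on them with --inline-blank-nodes.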

Trying --target-format rdf-xml had no effect; I still got the stack trace under the same conditions as with --target-format turtle.

Impact: This bug has lead to some code re-designs, because of running rdf-toolkit to normalize some inferencing results that could only infer anonymous nodes having a type.

…nodes

This patch uses the inherence UUID functions from `case-utils` PR 112 to replace the blank nodes generated with SPARQL Construct queries. As side effects of this migration, some bugs were fixed with generating some associations, and inherence modeling assumptions are now specified in code comments.

This patch also adds `prov:Start` and `prov:End` nodes to reify `prov:Activity` (and `case-investigation:InvestigativeAction`) time boundaries. This will be a significant assistance in OWL-Time-based visualization under development for `case-prov` PR 54. Creating the `prov:Start` and `prov:End` nodes as IRI-identified is also necessary because of a bug observed in `rdf-toolkit`; see their Issue 52.

Since `case_prov_rdf` will now be able to generate non-blank nodes, it has picked up two behaviors used in other projects importing `case-utils`:

* The `--use-deterministic-uuids` flag has been added.
* The `CASE_DEMO_NONRANDOM_UUID_BASE` environment variable can now be used to make non-inherent deterministic UUIDs.

A follow-on patch will regenerate Make-managed files.

References:

* #54
* casework/CASE-Utilities-Python#112
* edmcouncil/rdf-toolkit#52

Signed-off-by: Alex Nelson <[email protected]>

mereolog · 2023-05-19T06:31:00Z

@ajnelson-nist could you try to run it using 4293ce8, i.e., the latest commit from master?

ajnelson-nist · 2023-05-19T12:28:45Z

@mereolog I'm happy to report all tests I listed in this issue passed with the --inline-blank-nodes flag.

This test dumped a strange message without the --inline-blank-nodes flag, though:

@prefix ex: <http://example.org/ontology/> .
@prefix kb: <http://example.org/kb/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

kb:thing-1
	a owl:Thing ;
	ex:property-1 [] ;
	.

Shell transcript:

$ make check
java -jar ../target/rdf-toolkit-1.14.2.jar \
	  --source passed-4.ttl \
	  --source-format turtle \
	  --target _normalized-passed-4.ttl \
	  --target-format turtle
**** blank node not a subject: node1h0pvf100x1
mv _normalized-passed-4.ttl normalized-passed-4.ttl

If it helps, here's the Makefile:

#!/usr/bin/make -f

# Portions of this file contributed by NIST are governed by the following
# statement:
#
# This software was developed at the National Institute of Standards
# and Technology by employees of the Federal Government in the course
# of their official duties. Pursuant to title 17 Section 105 of the
# United States Code this software is not subject to copyright
# protection and is in the public domain. NIST assumes no
# responsibility whatsoever for its use by other parties, and makes
# no guarantees, expressed or implied, about its quality,
# reliability, or any other characteristic.
#
# We would appreciate acknowledgement if the software is used.

SHELL := /bin/bash

RDF_TOOLKIT_JAR := ../target/rdf-toolkit-1.14.2.jar

all: check

check: \
  normalized-failed-1.ttl \
  normalized-failed-2.ttl \
  normalized-passed-1.ttl \
  normalized-passed-2.ttl \
  normalized-passed-3.ttl \
  normalized-passed-4.ttl

normalized-%.ttl: \
  %.ttl
	java -jar $(RDF_TOOLKIT_JAR) \
	  --source $< \
	  --source-format turtle \
	  --target _$@ \
	  --target-format turtle
	mv _$@ $@

mereolog · 2023-05-24T15:54:55Z

As you may expect, this happens when a bnode is not a subject in any triple. The serialiser does not like such cases because, as I tried to explain in #49 (comment), it cannot sort such nodes (and consequently the triples in which they occur).

The comment in the code says "last resort - this should never happen", which is rather dogmatic, but it is just a warning.
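
For illustration, the situation behind that warning (a blank node that is an object somewhere but a subject nowhere) can be detected with a few lines. This is a sketch under assumptions, not the serialiser's actual logic: triples are plain string tuples, blank nodes are labelled with a `_:` prefix, and `bnodes_never_subject` is a hypothetical name.

```python
# Illustrative only: not rdf-toolkit's implementation. Triples are plain
# (subject, predicate, object) string tuples; blank nodes start with "_:".


def bnodes_never_subject(triples):
    """Return blank nodes that occur as objects but never as subjects."""
    subjects = {s for s, _, _ in triples}
    return sorted(
        {o for _, _, o in triples if o.startswith("_:") and o not in subjects}
    )


# Mirrors passed-4.ttl: the object of ex:property-1 is an empty blank node.
passed_4 = [
    ("kb:thing-1", "rdf:type", "owl:Thing"),
    ("kb:thing-1", "ex:property-1", "_:b1"),
]

print(bnodes_never_subject(passed_4))  # ['_:b1']
```

Such a node has no outgoing triples to compare on, which is consistent with the explanation above that it cannot be sorted.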

dbpierson · 2023-07-27T19:06:49Z

Hi. I’m well retired, but I’ve been following this. I believe the past tense of the common verb lead is led. The noun spelled lead that sounds like led is a heavy metal. I see this too often not to comment. Please forgive me for intruding into your discussion. Dennis Pierson

ElisaKendall · 2023-07-27T19:43:51Z

@dbpierson Too funny - more likely a typo than intentional based on what I know of the author. Some of our other participants are non-native speakers of English, and I would not be surprised to see more of this sort of thing in their responses, though.

Nice to see you are still "lurking". Hope you are enjoying your retirement!

mereolog closed this as not planned on Jul 27, 2023
