Changes to ontology (possibly around Rhea xrefs or riot) prevent full ontology build and block other pipeline builds #28404

Closed
1 of 2 tasks
kltm opened this issue Jul 5, 2024 · 4 comments

@kltm
Member

kltm commented Jul 5, 2024

Between 8am and 4pm PT, on July 4th, the ontology build started to fail, bringing down or blocking multiple pipelines. Looking at the PRs in that window, I suspect the changes happened earlier, but took a couple of runs to sync into the build.

The fatal build step seems to be:

11:39:45  INVALID ONTOLOGY FILE ERROR Could not load a valid ontology from file: enhanced.owl
11:39:45  For details see: http://robot.obolibrary.org/errors#invalid-ontology-file-error
11:39:45  Use the -vvv option to show the a trace.
11:39:45  Use the --help option to see usage information.
11:39:45  
11:39:45  real	0m8.802s
11:39:45  user	0m48.590s
11:39:45  sys	0m3.159s
11:39:45  make: *** [Makefile:202: reasoned.owl] Error 1

The most suspicious part leading up to that seems to be this set of Rhea identifier errors:

11:38:20  2024.07.05 18:38:17 [WARN] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.xrefsToRemove:69:16 - No such Rhea identifier, filtering xref: RHEA:80287
11:38:20  2024.07.05 18:38:17 [WARN] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.xrefsToRemove:69:16 - No such Rhea identifier, filtering xref: RHEA:79423
11:38:20  2024.07.05 18:38:17 [WARN] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.xrefsToRemove:69:16 - No such Rhea identifier, filtering xref: RHEA:80271
11:38:20  2024.07.05 18:38:17 [WARN] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.xrefsToRemove:69:16 - No such Rhea identifier, filtering xref: RHEA:80271
11:38:20  2024.07.05 18:38:17 [WARN] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.$anonfun.applyOrElse:82:20 - No such Rhea identifier, filtering definition xref: RHEA:80287
11:38:20  2024.07.05 18:38:17 [WARN] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.$anonfun.applyOrElse:82:20 - No such Rhea identifier, filtering definition xref: RHEA:79423
11:38:20  2024.07.05 18:38:17 [WARN] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.$anonfun.applyOrElse:82:20 - No such Rhea identifier, filtering definition xref: RHEA:80271
11:38:20  2024.07.05 18:38:17 [ERROR] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.main:97:46 - Obsolete Rhea ID used in xref: RHEA:67620
11:38:20  2024.07.05 18:38:17 [ERROR] ammonite.$file.$up.util.filter$minusrhea$minusxrefs.main:97:46 - Obsolete Rhea ID used in xref: RHEA:67624

I would also note the following oddity:

11:36:32  riot -q --nocheck --output ntriples go-edit.facts.ttl | sed 's/ /\t/' | sed 's/ /\t/' | sed 's/ \.$//' >go-edit.facts
11:37:10  18:37:04 WARN  riot            :: [line: 349, col: 1 ] Bad IRI: <http://www.geneontology.org/formats/oboInOwl#http://purl.obolibrary.org/obo/go#source> Code: 0/ILLEGAL_CHARACTER in FRAGMENT: The character violates the grammar rules for URIs/IRIs.
11:37:10  18:37:05 WARN  riot            :: [line: 403735, col: 4 ] Bad IRI: <http://www.geneontology.org/formats/oboInOwl#http://purl.obolibrary.org/obo/go#source> Code: 0/ILLEGAL_CHARACTER in FRAGMENT: The character violates the grammar rules for URIs/IRIs.
11:37:10  18:37:06 WARN  riot            :: [line: 526884, col: 4 ] Bad IRI: <http://www.geneontology.org/formats/oboInOwl#http://purl.obolibrary.org/obo/go#source> Code: 0/ILLEGAL_CHARACTER in FRAGMENT: The character violates the grammar rules for URIs/IRIs.
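
The riot warnings above all point at an IRI whose fragment itself contains a second '#', which the IRI grammar does not allow. A minimal sketch of a check for that pattern follows, assuming the input is line-oriented N-Triples-like text. The function name and the sample triples are hypothetical and are not part of the GO pipeline; only the bad IRI is copied from the log.

```python
import re

# Matches anything written as <...>, i.e. an IRI reference in N-Triples/Turtle.
IRI_PATTERN = re.compile(r"<([^<>\s]*)>")


def find_bad_fragment_iris(lines):
    """Yield (line_number, iri) for IRIs whose fragment contains another '#'."""
    for number, line in enumerate(lines, start=1):
        for iri in IRI_PATTERN.findall(line):
            # Everything after the first '#' is the fragment; a second '#'
            # inside it is what riot reports as ILLEGAL_CHARACTER in FRAGMENT.
            _, separator, fragment = iri.partition("#")
            if separator and "#" in fragment:
                yield number, iri


# Hypothetical sample triples; only the second subject IRI comes from the log.
sample = [
    '<http://purl.obolibrary.org/obo/GO_0141207> '
    '<http://www.geneontology.org/formats/oboInOwl#hasDbXref> "RHEA:80271" .',
    '<http://www.geneontology.org/formats/oboInOwl#http://purl.obolibrary.org/obo/go#source> '
    '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> '
    '<http://www.w3.org/2002/07/owl#AnnotationProperty> .',
]

bad = list(find_bad_fragment_iris(sample))
for line_number, iri in bad:
    print(f"line {line_number}: suspicious IRI {iri}")
```

A check along these lines could run on the generated facts file before the reasoning step, so that a malformed property IRI is reported in the PR rather than in the full build.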

Ideally, this issue should only be closed when both of these criteria are met:

  • a fix to the ontology is in place
  • tests on PRs are in place to prevent this happening in the future

Tagging @pgaudet @balhoff @sjm41

kltm added the bug label Jul 5, 2024
kltm changed the title from "Changes to ontology (possibly around Rhea xrefs or riot) prevent full ontology build and block full pipeline builds" to "Changes to ontology (possibly around Rhea xrefs or riot) prevent full ontology build and block other pipeline builds" Jul 5, 2024
@sjm41
Contributor

sjm41 commented Jul 5, 2024

I'm not sure the RHEA errors/warnings are the root cause of the issue here, but here's a quick report on the offending IDs:

RHEA:80287
RHEA:80271
RHEA:79423

From looking at the associated tickets, these three RHEAs are in the RHEA internal DB but not yet in the public file. They all relate to recently created GO terms:

id: GO:0141208
name: protein lysine delactylase activity
namespace: molecular_function
def: "Catalysis of the reaction: H2O + N6-lactoyl-L-lysyl-[protein] + NAD = L-lysyl-[protein] + nicotinamide +2''-O-lactoyl-ADP-D-ribose, removing a lactoyl group attached to a lysine residue in a protein." [PMID:38512451, RHEA:80287]
xref: RHEA:80287
is_a: GO:0033558 ! protein lysine deacetylase activity
property_value: term_tracker_item "#28015" xsd:anyURI

id: GO:0141207
name: peptide lactyltransferase (ATP-dependent) activity
namespace: molecular_function
def: "Catalysis of the reaction: lactate + ATP + L-lysyl-[protein] = N(6)-lactoyl-L-lysyl-[protein]+ AMP + diphosphate. Can also act on free lactate." [PMID:38512451, PMID:38653238, RHEA:80271]
synonym: "peptide lactyltransferase (ATP dependent) activity" EXACT []
synonym: "peptide lactyltransferase activity" BROAD []
xref: RHEA:80271 {source="skos:exactMatch"}
xref: RHEA:80271 {comment="skos:narrowMatch"}
is_a: GO:0016886 ! ligase activity, forming phosphoric ester bonds
is_a: GO:0140096 ! catalytic activity, acting on a protein
property_value: term_tracker_item "#28015" xsd:anyURI

id: GO:0141200
name: UTP thiamine diphosphokinase activity
namespace: molecular_function
def: "Catalysis of the reaction: UTP + thiamine = UMP + thiamine diphosphate." [PMID:38547260, RHEA:79423]
xref: RHEA:79423
is_a: GO:0016778 ! diphosphotransferase activity
property_value: term_tracker_item "#27518" xsd:anyURI
created_by: pg

I see there's an additional problem with GO:0141207, where the second line needs deleting:
xref: RHEA:80271 {source="skos:exactMatch"}
xref: RHEA:80271 {comment="skos:narrowMatch"}
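
A repeated xref like this one could be caught mechanically. The following is a minimal sketch, assuming that the same xref ID appearing on more than one line of a single term is always worth flagging. The function name is hypothetical and the stanza is abbreviated from the one quoted above.

```python
from collections import Counter


def duplicate_xrefs(stanza_text):
    """Return the xref IDs that appear on more than one 'xref:' line of a stanza."""
    ids = []
    for line in stanza_text.splitlines():
        line = line.strip()
        if line.startswith("xref:"):
            # The ID is the first token after the tag; trailing qualifiers
            # such as {source="..."} are ignored for this comparison.
            tokens = line[len("xref:"):].split()
            if tokens:
                ids.append(tokens[0])
    counts = Counter(ids)
    return sorted(xref_id for xref_id, count in counts.items() if count > 1)


# Abbreviated copy of the GO:0141207 stanza quoted above.
stanza = """id: GO:0141207
name: peptide lactyltransferase (ATP-dependent) activity
xref: RHEA:80271 {source="skos:exactMatch"}
xref: RHEA:80271 {comment="skos:narrowMatch"}
"""

print(duplicate_xrefs(stanza))  # ['RHEA:80271']
```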


Obsolete Rhea ID used in xref: RHEA:67620

This has been replaced with RHEA:78471 and RHEA:78475

Obsolete Rhea ID used in xref: RHEA:67624

And this has been replaced with RHEA:78479 and RHEA:78507

That might mean that all four new RHEAs should be narrowMatch xrefs on the associated GO term, but I haven't checked:

id: GO:0071164
name: RNA cap trimethylguanosine synthase activity
namespace: molecular_function
def: "Catalysis of two successive methyl transfer reactions from AdoMet to the N-2 atom of guanosine, thereby converting 7-methylguanosine in an RNA cap to 2,2,7 trimethylguanosine." [PMID:11983179, PMID:18775984]
comment: A 2,2,7-trimethylguanosine (TMG) cap is found on many RNA polymerase II transcribed small noncoding RNAs including small nuclear RNA (snRNA), small nucleolar RNA (snoRNA) and telomerase RNA. It is also found on nematode mRNAs that undergo trans-splicing of a 5'-capped leader sequence.
synonym: "cap hypermethylase activity" EXACT [PMID:11983179]
synonym: "RNA trimethylguanosine synthase activity" EXACT []
synonym: "small nuclear RNA methyltransferase activity" RELATED [GOC:rl]
synonym: "snRNA methyltransferase activity" RELATED [GOC:rl]
xref: RHEA:67620 {source="skos:narrowMatch"}
xref: RHEA:67624 {source="skos:narrowMatch"}
is_a: GO:0008173 ! RNA methyltransferase activity
is_a: GO:0008757 ! S-adenosylmethionine-dependent methyltransferase activity
relationship: part_of GO:0036261 ! 7-methylguanosine cap hypermethylation
property_value: term_tracker_item "#25717" xsd:anyURI
property_value: term_tracker_item "#26934" xsd:anyURI
created_by: mah
creation_date: 2009-11-19T03:23:20Z

@pgaudet
Contributor

pgaudet commented Jul 8, 2024

From looking at the associated tickets, these three RHEAs are in the RHEA internal DB but not yet in the public file. They all relate to recently created GO terms:

I don't think this is what is causing the problem; new RHEAs (not yet public) are allowed (see https://wiki.geneontology.org/Guidelines_for_database_cross_references#Database_cross-references).

@sjm41
Contributor

sjm41 commented Jul 9, 2024

I've fixed the "xref: RHEA:80271 {comment="skos:narrowMatch"}" issue in #28015.

For GO:0071164, I've checked the new RHEAs, and they should all be added as narrowMatch xrefs to this term, so I'll do that now.
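
For illustration, the swap of obsolete Rhea IDs for their replacements can be sketched as a small helper. The replacement IDs are the ones reported earlier in this thread. The function name is hypothetical, and carrying the existing qualifier over to each new line is an assumption that happens to match the narrowMatch decision for this term; it is not a description of how the edit was actually made.

```python
# Replacements reported earlier in this thread for GO:0071164.
REPLACEMENTS = {
    "RHEA:67620": ["RHEA:78471", "RHEA:78475"],
    "RHEA:67624": ["RHEA:78479", "RHEA:78507"],
}


def replace_obsolete_xrefs(stanza_lines, replacements):
    """Swap each obsolete xref line for one line per replacement ID.

    The qualifier block of the obsolete line (if any) is copied unchanged
    onto every replacement line; all other lines pass through untouched.
    """
    updated = []
    for line in stanza_lines:
        if line.startswith("xref: "):
            xref_id, _, qualifier = line[len("xref: "):].partition(" ")
            if xref_id in replacements:
                for new_id in replacements[xref_id]:
                    updated.append(f"xref: {new_id} {qualifier}".rstrip())
                continue
        updated.append(line)
    return updated


# Lines copied from the GO:0071164 stanza quoted earlier.
old_lines = [
    'xref: RHEA:67620 {source="skos:narrowMatch"}',
    'xref: RHEA:67624 {source="skos:narrowMatch"}',
    "is_a: GO:0008173 ! RNA methyltransferase activity",
]

new_lines = replace_obsolete_xrefs(old_lines, REPLACEMENTS)
for line in new_lines:
    print(line)
```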

sjm41 added a commit that referenced this issue Jul 9, 2024
sjm41 added a commit that referenced this issue Jul 9, 2024
sjm41 removed their assignment Jul 18, 2024
@balhoff
Member

balhoff commented Nov 6, 2024

Closing this issue, since this was fixed and we have another ticket to add QC for problems with the 'source' property: #28417
