Hmmm. So it looks like the closure is supposed to happen over part_of. So it is a mystery why the top terms in the spreadsheet are missing from the closure.
I, with a lot of help from @ukemi, have a question about how the reasoning over a specific relationship chain in the "AmiGO (with regulates)" option and in MouseMine is working.
The question arose from an MGI user's report of repeatedly seeing discrepancies between AmiGO and MGI in the results of queries for gene sets annotated to a GO term and its children, with one specific example being the GO term “embryonic morphogenesis”.
To investigate, I obtained lists of GO annotations to get the gene lists associated with the GO term “embryonic morphogenesis” from each of these sources:
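** i. MGI Gene Ontology Annotations page (Excel download): https://www.informatics.jax.org/go/term/GO:0048598
** ii. MGI MouseMine: https://www.mousemine.org/mousemine/results.do?trail=%257Cquery
** iii. AmiGO default option (without “includes regulates”): https://amigo.geneontology.org/amigo/term/GO:0048598?relation=isa_partof
** iv. AmiGO with “includes regulates”: https://amigo.geneontology.org/amigo/term/GO:0048598?relation=regulates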
Here is the summary of my comparison of the GO terms that are present within each of the above 4 sets of annotations to the term “embryonic morphogenesis”
Two of these options, ii. = MGI MouseMine and iv. = AmiGO with “includes regulates” produced identical sets of GO terms.
The third option, iii. = AmiGO default option (without “includes regulates”), produces a smaller set of GO terms lacking the 10 regulation terms present in options ii. & iv. as expected.
However, option i., downloading an Excel file from MGI’s Gene Ontology Annotations page for “embryonic morphogenesis” produced a set of GO terms with 17 additional regulates terms NOT included in options ii. (MouseMine) or iv. (AmiGO with regulates).
Picture showing all differences between the 4 sets. Note that the list of terms present in all 4 sets is truncated.
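For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how the four term sets could be compared once each download has been reduced to a set of GO term IDs. The function name `compare_term_sets` and the placeholder term labels are hypothetical and only for illustration; they are not the actual data from the downloads.

```python
def compare_term_sets(sets):
    """Return (terms common to all sources, terms unique to each source)."""
    common = set.intersection(*sets.values())
    only = {}
    for name, terms in sets.items():
        # Union of every other source's terms (empty if there is only one source).
        others = set().union(*(t for n, t in sets.items() if n != name))
        only[name] = terms - others
    return common, only


# Placeholder labels, NOT real GO IDs: "A"/"B" stand for ordinary child terms,
# "R1"/"R2" stand for regulation terms.
toy_sets = {
    "i": {"A", "B", "R1", "R2"},   # MGI annotations page (Excel download)
    "ii": {"A", "B", "R1"},        # MouseMine
    "iii": {"A", "B"},             # AmiGO default
    "iv": {"A", "B", "R1"},        # AmiGO with regulates
}

common, only = compare_term_sets(toy_sets)
extra_in_i = toy_sets["i"] - toy_sets["ii"]  # terms in option i. but not in option ii.
```

In this toy data, `extra_in_i` plays the role of the 17 additional regulates terms seen only in option i.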
Our question: What causes the difference between option i. and options ii. & iv.? The 17 terms included in option i. that are NOT included in options ii. & iv. are all regulates terms. David and I have looked carefully at two representative terms and think we may have an explanation of what is going on.
Example 1 – representative of regulation terms present in options i., ii., AND iv.
** OLS term page: https://www.ebi.ac.uk/ols4/ontologies/go/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FGO_006068
*** Shows is_a hierarchy within regulation terms
** OLS term page: https://www.ebi.ac.uk/ols4/ontologies/go/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FGO_0001715?lang=en
*** Searching within the page for “embryonic morphogenesis” shows an ALL is_a hierarchy between the original search term “embryonic morphogenesis” and the child term “prostatic bud formation”
Example 2 – representative of regulation terms present ONLY in option i.
** OLS term page: https://www.ebi.ac.uk/ols4/ontologies/go/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FGO_0042666
*** Shows is_a hierarchy within regulation terms
** OLS term page: https://www.ebi.ac.uk/ols4/ontologies/go/classes/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FGO_0060513?lang=en
*** Searching within the page for “embryonic morphogenesis” shows the presence of part_of relationships in between the original search term “embryonic morphogenesis” and the child term “ectodermal cell fate specification”
David’s recollection is that there was a conscious decision that it is not appropriate to reason over the relationship chain “regulates-over-part_of” as this chain does NOT always mean that the first term "regulates" the third term in all places where a “regulates-over-part_of” chain occurs in the ontology. We did look at the RO term “regulates (RO:0002211)” in Protégé (see attached picture) and this confirms David’s recollection that this chain is not asserted for the term “regulates”.
Getting back to our question: we would like input on whether this explanation (NOT reasoning over “regulates over part_of” relationship chains) is consistent with the reasoning over relationships that is applied in AmiGO with regulates (option iv.) and in MouseMine (option ii.), and is thus a possible explanation for this discrepancy. We are also assuming that the MGI Gene Ontology Annotations page (option i.) is NOT reasoning over relationships, but is just going down the chains of relationships and including ALL terms.
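To make the hypothesis concrete, here is a minimal sketch of the two behaviours we are contrasting. It is only an illustration of our explanation, not the actual AmiGO, MouseMine, or MGI implementation. The toy graph is heavily simplified: the intermediate term and the direct edges are hypothetical, and the "regulation of ..." labels are used only as stand-ins for the regulation terms in Examples 1 and 2.

```python
from collections import defaultdict


def descendants(edges, target, relations):
    """Terms reachable downward from target using only the given relations."""
    children = defaultdict(set)
    for child, rel, parent in edges:
        if rel in relations:
            children[parent].add(child)
    seen, stack = set(), [target]
    while stack:
        term = stack.pop()
        for child in children[term]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def terms_for_query(edges, target, regulates_over_part_of):
    """Terms returned for target, with or without the regulates-over-part_of chain."""
    structural = descendants(edges, target, {"is_a", "part_of"})
    if regulates_over_part_of:
        # Follow every chain: regulators of anything below the target count.
        regulated = structural | {target}
    else:
        # Only regulators of terms linked to the target purely by is_a count.
        regulated = descendants(edges, target, {"is_a"}) | {target}
    regulators = {c for c, rel, p in edges if rel == "regulates" and p in regulated}
    for reg in list(regulators):
        regulators |= descendants(edges, reg, {"is_a"})
    return structural | regulators


# Simplified toy graph; "intermediate process (hypothetical)" is not a real term.
EDGES = [
    ("prostatic bud formation", "is_a", "embryonic morphogenesis"),
    ("intermediate process (hypothetical)", "is_a", "embryonic morphogenesis"),
    ("ectodermal cell fate specification", "part_of", "intermediate process (hypothetical)"),
    ("regulation of prostatic bud formation", "regulates", "prostatic bud formation"),
    ("regulation of ectodermal cell fate specification", "regulates",
     "ectodermal cell fate specification"),
]

strict = terms_for_query(EDGES, "embryonic morphogenesis", regulates_over_part_of=False)
loose = terms_for_query(EDGES, "embryonic morphogenesis", regulates_over_part_of=True)
only_in_loose = loose - strict
```

Under these assumptions, `strict` corresponds to what we think options ii. and iv. are doing, `loose` to what we think option i. is doing, and `only_in_loose` contains just the regulation term whose path to “embryonic morphogenesis” passes through a part_of relationship.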
FAO: @balhoff @kltm