TAIR locus identifiers: ATGs vs numeric IDs #5

Open
cmungall opened this issue Apr 19, 2018 · 11 comments

Comments

@cmungall

Trying to reconcile this with GO

tair.locus is expected to be ^AT[1-5]G\d{5}$

https://www.ebi.ac.uk/miriam/main/datatypes/MIR:00000050
https://github.com/identifiers-org/registry/blob/master/prefix/tair.locus.md

However, in GO, we use prefix TAIR and local IDs locus:2005496

https://github.com/geneontology/go-site/blob/73ee1c0dd6128e08b788dffbe0025eb1fd4c3c06/metadata/db-xrefs.yaml#L2845-L2867

We would like to use URIs such as http://identifiers.org/tair.locus/2005496 but these don't resolve.

There doesn't appear to be anything in id.org for the numeric IDs, just the At3g15890 accessions
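To make the mismatch concrete, here is a minimal sketch that checks a few candidate local IDs against the pattern quoted above. The helper name `matches_registered_pattern` is hypothetical and only for illustration. The check is case-sensitive, so the mixed-case spelling `At3g15890` fails as written, as do the numeric forms.

```python
import re

# Pattern quoted in this issue for tair.locus (as registered when the issue was opened).
TAIR_LOCUS_PATTERN = re.compile(r"^AT[1-5]G\d{5}$")


def matches_registered_pattern(local_id: str) -> bool:
    """Return True if local_id matches the quoted tair.locus pattern (hypothetical helper)."""
    return TAIR_LOCUS_PATTERN.match(local_id) is not None


if __name__ == "__main__":
    # Candidate local IDs taken from the discussion above.
    for candidate in ["AT3G15890", "At3g15890", "2005496", "locus:2005496"]:
        print(candidate, matches_registered_pattern(candidate))
```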

cc @tberardini @tonysawfordebi @kltm

@tberardini

Not all TAIR loci are of the format ^AT[1-5,M,C]G\d{5}$. We annotate some unsequenced genomic loci which have names that don't conform to the sequenced loci. This is why the identifier is the bare number locus:nnnnnn or gene:nnnnnn. How can we (TAIR) work with identifiers.org on the numerical accessions so that all annotated objects can be resolved through their system?
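The two identifier shapes described here could be told apart roughly as follows. This is only a sketch under stated assumptions: the commas in the quoted character class are read as separators rather than literal characters, `locus:nnnnnn` and `gene:nnnnnn` are read as a lowercase type, a colon, and one or more digits, and the function name and category labels are made up for illustration.

```python
import re

# Assumption: commas in the quoted class [1-5,M,C] are separators, so the class is [1-5MC].
SEQUENCED_LOCUS = re.compile(r"^AT[1-5MC]G\d{5}$")
# Assumption: numeric accessions look like "locus:<digits>" or "gene:<digits>".
NUMERIC_ACCESSION = re.compile(r"^(locus|gene):(\d+)$")


def classify(identifier: str) -> str:
    """Label an identifier as a sequenced locus name, a numeric accession, or unrecognized."""
    if SEQUENCED_LOCUS.match(identifier):
        return "sequenced-locus-name"
    numeric = NUMERIC_ACCESSION.match(identifier)
    if numeric:
        # group(1) is the object type: "locus" or "gene".
        return f"numeric-{numeric.group(1)}"
    return "unrecognized"


if __name__ == "__main__":
    # Illustrative inputs only; they are not claimed to be a complete set of TAIR ID forms.
    for example in ["AT3G15890", "ATMG00010", "locus:2005496", "gene:2005496", "foo"]:
        print(example, classify(example))
```

Because the numeric accession carries its object type, it covers both sequenced and unsequenced loci, which is the point made above.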

@sarala
Member

sarala commented Apr 19, 2018

Hi Chris and Tanya,

Thanks for pointing this out. I have fixed the pattern now. However, to be consistent with what we already have for tair.gene, the URL looks like https://identifiers.org/tair.locus/Locus:2005496.

I would like to explore your suggestion tair.locus/2005496. Is this how you would prefer to access this resource? What about tair.gene and tair.protein? Please let me know your thoughts.

@tberardini

I don't have a preference for either tair.locus/2005496 or tair.locus/Locus:2005496. Using the numerical id + the type (locus) covers our use case of having both sequenced and unsequenced loci.

@cmungall
Author

I would prefer tair.locus/nnnn

(no more MGI:MGIs!)

cmungall added a commit to geneontology/noctua-models that referenced this issue Apr 19, 2018
@jmcmurry

jmcmurry commented Apr 24, 2018

+1 to tair.locus/nnnn; this is what I recommended a few years ago

@cmungall
Author

Any further decision on this?

@tberardini

Who needs to decide? Do we (TAIR) need to do anything?

@cmungall
Author

cmungall commented May 18, 2018 via email

@jmcmurry

@cmungall Unfortunately, at the moment, identifiers.org prefix file is a one-way street; a PR would not be worthwhile. In time...

@sarala
Member

sarala commented May 21, 2018

I have updated the tair.locus record to support http://identifiers.org/tair.locus/2005496; alternatively, you can use the compact identifier form tair.locus:2005496. This will require everyone to change how they access tair.locus.
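For anyone migrating existing GO-style IDs, a conversion along these lines might help. It is a sketch, not an official client: the function names are hypothetical, and it assumes GO-style IDs have exactly the shape `TAIR:locus:<digits>` mentioned earlier in this thread.

```python
IDENTIFIERS_ORG = "http://identifiers.org"


def go_to_local_id(go_id: str) -> str:
    """Turn a GO-style ID such as 'TAIR:locus:2005496' into the bare local ID '2005496'."""
    # Raises ValueError on unpacking if there are not exactly three colon-separated parts.
    prefix, kind, number = go_id.split(":")
    if prefix != "TAIR" or kind != "locus" or not number.isdigit():
        raise ValueError(f"not a TAIR locus ID of the assumed shape: {go_id!r}")
    return number


def resolver_url(local_id: str) -> str:
    """URL form described in the comment above."""
    return f"{IDENTIFIERS_ORG}/tair.locus/{local_id}"


def compact_identifier(local_id: str) -> str:
    """Compact identifier form described in the comment above."""
    return f"tair.locus:{local_id}"


if __name__ == "__main__":
    local_id = go_to_local_id("TAIR:locus:2005496")
    print(resolver_url(local_id))
    print(compact_identifier(local_id))
```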

@sarala
Member

sarala commented May 21, 2018

Regarding MGI - https://identifiers.org/MGI:2442292 works. If you are using the URL form it will be https://identifiers.org/mgi/MGI:2442292
