Vizier query for columns with underscore in name #3124

Closed
rjs3273 opened this issue Oct 24, 2024 · 7 comments · Fixed by #3153

@rjs3273

rjs3273 commented Oct 24, 2024

I am unable to get catalogue queries to work on Vizier for columns which contain an underscore. I know I have done it in the past, so I must be doing something daft now. In fact I have the code that used to work about a year or so ago. It still runs, but simply ignores columns that contain an underscore in the name.

Here are a couple of examples on two different catalogues.

from astroquery.vizier import Vizier
from astropy.coordinates import SkyCoord
import astropy.units as u
vizierCatName = "II/336/apass9"
catalogs = Vizier.get_catalogs(Vizier.find_catalogs(vizierCatName).keys())
print(catalogs[0].keys())
v = Vizier(catalog=vizierCatName,columns=['RAJ2000', 'DEJ2000', 'Vmag', 'r_mag'])
cent = SkyCoord(ra=100*u.degree, dec=30*u.degree, frame='fk5')
result = v.query_region(cent, radius=0.05*u.degree)
print(result)
print(result[0])

The results say three columns where I asked for four. It includes the Vmag data, but not the r_mag.

['RAJ2000', 'DEJ2000', 'e_RAJ2000', 'e_DEJ2000', 'Field', 'nobs', 'mobs', 'B-V', 'e_B-V', 'Vmag', 'e_Vmag', 'Bmag', 'e_Bmag', 'g_mag', 'e_g_mag', 'r_mag', 'e_r_mag', 'i_mag', 'e_i_mag']

TableList with 1 tables:
'0:II/336/apass9' with 3 column(s) and 20 row(s)

 RAJ2000    DEJ2000    Vmag
   deg        deg      mag
---------- ---------- ------
100.022444  29.953985 14.110
100.033717  29.969932     --
100.022465  29.954233     --
100.010035  29.953945 15.400
 99.989656  29.959382 15.368
100.006628  29.972751 15.475
100.017447  29.977752 15.638
100.006698  29.972827     --

And a similar query on Tycho, for which the RA,DEC have underscores in their name.

from astroquery.vizier import Vizier
from astropy.coordinates import SkyCoord
import astropy.units as u
vizierCatName = "I/259/tyc2"
catalogs = Vizier.get_catalogs(Vizier.find_catalogs(vizierCatName).keys())
print(catalogs[0].keys())
v = Vizier(catalog=vizierCatName,columns=['RA_ICRS_', 'DE_ICRS_', 'BTmag', 'VTmag'])
cent = SkyCoord(ra=100*u.degree, dec=30*u.degree, frame='fk5')
result = v.query_region(cent, radius=0.1*u.degree)
print(result)
print(result[0])

Which returns the following. The table includes only two columns for the two magnitudes. No RA,DEC.

['TYC1', 'TYC2', 'TYC3', 'pmRA', 'pmDE', 'BTmag', 'VTmag', 'HIP', 'RA_ICRS_', 'DE_ICRS_']

TableList with 1 tables:
'0:I/259/tyc2' with 2 column(s) and 4 row(s)

BTmag  VTmag
 mag    mag
------ ------
11.540 10.461
12.700 11.887
11.625 10.366
11.831 10.931

@keflavich
Contributor

The problem is that there is no r_mag in that catalog, instead it is r'mag:

image

So if you replace r_mag with r'mag, it works:

v = Vizier(catalog=vizierCatName, columns=['RAJ2000', 'DEJ2000', 'Vmag', "r'mag"])
result = v.query_region(cent, radius=0.05*u.degree)
result[0]

gives

['RAJ2000', 'DEJ2000', 'e_RAJ2000', 'e_DEJ2000', 'Field', 'nobs', 'mobs', 'B-V', 'e_B-V', 'Vmag', 'e_Vmag', 'Bmag', 'e_Bmag', 'g_mag', 'e_g_mag', 'r_mag', 'e_r_mag', 'i_mag', 'e_i_mag']
TableList with 1 tables:
	'0:II/336/apass9' with 4 column(s) and 20 row(s)
Out[7]:
<Table length=20>
 RAJ2000    DEJ2000     Vmag   r_mag
   deg        deg       mag     mag
 float64    float64   float32 float32
---------- ---------- ------- -------
100.022444  29.953985  14.110  13.746
100.033717  29.969932      --  15.479
100.022465  29.954233      --      --
100.010035  29.953945  15.400  15.207
 99.989656  29.959382  15.368  14.950
100.006628  29.972751  15.475  15.291
100.017447  29.977752  15.638  15.325
100.006698  29.972827      --      --
100.006732  29.972883      --      --
100.044166  30.012513  15.607  15.308
 99.980119  29.981877  14.538  14.470
 99.976337  29.967789  16.271  15.763
 99.976501  29.968596      --  16.122
 99.980080  29.981995  14.596  14.512
 99.943878  30.004463  14.531  14.246
 99.971428  29.999754  13.034  12.894
 99.958174  30.019172  16.043  16.094
 99.952535  30.028457      --      --
100.031142  30.032612  10.786  10.478
 99.990081  30.037989  16.034  15.787

There is a real problem here, which is that catalogs[0].keys() has replaced ' with _. I think we do this internally to avoid other errors related to special characters, but it is clearly unexpected behavior.
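
To make the renaming concrete, here is a minimal sketch of the kind of substitution that appears to be happening. The function `sanitize_name` and its regular expression are assumptions, chosen only to reproduce the names seen in this thread (hyphens survive, as in `B-V`); this is not the actual astropy or astroquery code.

```python
import re


def sanitize_name(name: str) -> str:
    """Hypothetical stand-in for the renaming step.

    Replaces every character other than ASCII letters, digits,
    '_' and '-' with '_'. The real rule may differ.
    """
    return re.sub(r"[^A-Za-z0-9_-]", "_", name)


# Names as VizieR reports them for II/336/apass9 (taken from this thread).
for original in ["r'mag", "B-V", "Vmag"]:
    print(f"{original!r} -> {sanitize_name(original)!r}")
```

Under this assumption the key shown by `catalogs[0].keys()` is the sanitized form, while the `columns=` argument still has to match the original VizieR name.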

@keflavich
Contributor

The return from Vizier is correct:

    <FIELD name="r'mag" ucd="PHOT_SDSS_R" datatype="float" width="6" precision="3" unit="mag">
      <DESCRIPTION>[5.1/23.9]? r'-band AB magnitude, Sloan filter</DESCRIPTION>
      <VALUES null="NaN" />
    </FIELD>

and as far as I can tell, this renaming is happening in the astropy votable parser.
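
For anyone who wants to check the raw response without going through the votable parser, a small sketch with the standard library is enough to read the `name` attribute as sent. It parses only the FIELD fragment quoted above; a full VizieR response is a complete VOTable document with an XML namespace, so the lookup would need adjusting there.

```python
import xml.etree.ElementTree as ET

# The FIELD element quoted above, as a standalone fragment.
field_xml = """
<FIELD name="r'mag" ucd="PHOT_SDSS_R" datatype="float" width="6" precision="3" unit="mag">
  <DESCRIPTION>[5.1/23.9]? r'-band AB magnitude, Sloan filter</DESCRIPTION>
  <VALUES null="NaN" />
</FIELD>
"""

field = ET.fromstring(field_xml.strip())
field_name = field.get("name")  # attribute value exactly as in the XML
field_unit = field.get("unit")

print(field_name, field_unit)
```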

@rjs3273
Author

rjs3273 commented Oct 24, 2024

Thanks. That is starting to make a bit of sense. Using the r'mag name gives me a workaround for now.

As a bit of background, there seem to have been some changes on the formatting of this database not long ago. For years I had code running that used "r_mag" on Vizier APASS searches. In January this year that stopped working and I modified the code to use "r'mag" as you mention. That worked until last week, when that too stopped working. I have not been able to get to the bottom of whether the changes were in astroquery or the database endpoint on Vizier.

As you demonstrate, "r'mag" is working again for me now, so I can use that for the time being.

And then, on my second example in the original question, the Tycho catalogue had 'RA_ICRS_', 'DE_ICRS_' for the RA and DEC. Looking on Vizier itself, they actually use "RA(ICRS)" and "DE(ICRS)". I just tested and I can indeed run searches for those if I use the parentheses.

>>> vizierCatName = "I/259/tyc2"
>>> catalogs = Vizier.get_catalogs(Vizier.find_catalogs(vizierCatName).keys())
>>> print(catalogs[0].keys())
['TYC1', 'TYC2', 'TYC3', 'pmRA', 'pmDE', 'BTmag', 'VTmag', 'HIP', 'RA_ICRS_', 'DE_ICRS_']
>>> v = Vizier(catalog=vizierCatName,columns=["RA(ICRS)", "DE(ICRS)", 'BTmag', 'VTmag'])
>>> cent = SkyCoord(ra=100*u.degree, dec=30*u.degree, frame='fk5')
>>> result = v.query_region(cent, radius=0.1*u.degree)
>>> print(result[0])
  RA_ICRS_     DE_ICRS_   BTmag  VTmag 
    deg          deg       mag    mag  
------------ ------------ ------ ------
 99.96182083  29.91847056 11.540 10.461
100.05428361  29.95318139 12.700 11.887
100.05607917  29.94456611 11.625 10.366
100.02939944  30.03363583 11.831 10.931

It therefore looks like multiple problematic special characters are getting mapped to underscore.

You can still retrieve the values by checking the underlying column name on Vizier, but as you say it is probably 'unexpected behaviour'.
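
As a stopgap, the lookup from the sanitized key back to the underlying Vizier name can be sketched as below. Both `sanitize_name` and `resolve_columns` are hypothetical helpers, not part of astroquery, and the substitution rule is an assumption that merely matches the names seen in this thread. The list of Vizier names has to be supplied by hand, for example from the catalogue page.

```python
import re


def sanitize_name(name):
    # Assumed rule: anything other than letters, digits, '_' and '-' becomes '_'.
    return re.sub(r"[^A-Za-z0-9_-]", "_", name)


def resolve_columns(requested, vizier_names):
    """Map each requested name to the Vizier name whose sanitized form matches.

    Names that already match a Vizier name, or that match nothing, are
    returned unchanged. If two Vizier names sanitize to the same string,
    the later one wins, so check the result in that case.
    """
    by_sanitized = {sanitize_name(n): n for n in vizier_names}
    return [n if n in vizier_names else by_sanitized.get(n, n) for n in requested]


# Column names as shown on Vizier for I/259/tyc2 (from the session above).
vizier_names = ["RA(ICRS)", "DE(ICRS)", "BTmag", "VTmag"]
resolved = resolve_columns(["RA_ICRS_", "DE_ICRS_", "BTmag", "VTmag"], vizier_names)
print(resolved)
```

The resolved list can then be passed as `columns=` in the same way as in the session above.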

@rjs3273
Author

rjs3273 commented Oct 24, 2024

My original question title seems potentially misleading now that you have spotted the true problem. Would it be good practice for me to edit the question title or is it best left as is? The real issue seems to relate to column names in Vizier that contain any punctuation marks.

@bsipocz bsipocz added the vizier label Oct 25, 2024
@bsipocz
Member

bsipocz commented Oct 25, 2024

> and as far as I can tell, this renaming is happening in the astropy votable parser.

@keflavich - Would you mind either reporting it upstream with a self-contained example or should we just move this same issue?

@ManonMarchand
Member

Hi all,

I checked the history of these catalogs and the column names did not change in VizieR.

The only instruction I could find in the VOtable standard is that column names should not start with a number, and can contain any unicode character. I don't think VizieR is doing anything wrong here.

@keflavich
Contributor

Thanks @ManonMarchand, good to know this is allowed.

@bsipocz I think this needs to become an astropy votable issue. We'll need an appropriate MWE; if I can find 20m, I'll do that, but probably after Nov 15
