Improve caching documentation #1239

Open
5 tasks done
keflavich opened this issue Sep 18, 2018 · 17 comments

Comments

@keflavich
Contributor

keflavich commented Sep 18, 2018

There's very little documentation on the caching process, and users sometimes complain about the cache filling up.

In short, we need documentation stating:

  • The default location is ~/.astropy/cache/astroquery
  • get_cache_dir can be used to find the cache dir
  • some documentation about how clear_download_cache can be used to clear the cache, with examples

[Edit]:

  • add a repeated troubleshooting section to every module's narrative docs that refers to clearing the cache (and upgrading the version) as possible ways to fix failing queries.
  • possibly link the above in the class docstrings, too (along with timeout, row_limit, etc.)
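
A minimal sketch of the kind of example the requested documentation could include, written with the Python standard library only. The path is the default location named in the list above. `cache_size_bytes` and `clear_cache_dir` are hypothetical helpers invented for illustration; they are not astroquery or astropy functions, and the sketch does not call `get_cache_dir` or `clear_download_cache`, which the real documentation would still need to cover.

```python
import shutil
from pathlib import Path

# Default location named in this issue; adjust if your setup differs.
DEFAULT_CACHE = Path.home() / ".astropy" / "cache" / "astroquery"


def cache_size_bytes(cache_dir: Path = DEFAULT_CACHE) -> int:
    """Return the total size of all files under cache_dir (0 if it is absent)."""
    if not cache_dir.is_dir():
        return 0
    return sum(p.stat().st_size for p in cache_dir.rglob("*") if p.is_file())


def clear_cache_dir(cache_dir: Path = DEFAULT_CACHE) -> int:
    """Delete the contents of cache_dir, keep the directory, return bytes freed."""
    freed = cache_size_bytes(cache_dir)
    if cache_dir.is_dir():
        for entry in cache_dir.iterdir():
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)  # per-service subdirectory
            else:
                entry.unlink()  # plain file or symlink
    return freed
```

Calling `cache_size_bytes()` before and after `clear_cache_dir()` gives users a quick way to confirm that a full cache was the problem.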
@bsipocz
Member

bsipocz commented Sep 18, 2018

This is a duplicate of #300, but I'm closing that one because this one has more concrete to-do items.

@keflavich
Contributor Author

Dangit, I searched for 'cache' but not 'caching'. My memory fails.

@bsipocz
Member

bsipocz commented Sep 18, 2018

I only remembered because of the recent simbad issues 😄

@Himanshu-Garg

Anyone on this?

@bsipocz
Member

bsipocz commented Mar 19, 2019

no one, please go ahead!

@Himanshu-Garg

Can you please point me to the doc where I should insert the new info?

I cannot find it in the docs folder.

@Himanshu-Garg

The cache folder does not show up in my cloned repo because the clone is of an older version, but the following page is available in the newer version:
http://docs.astropy.org/en/stable/api/astropy.config.get_cache_dir.html#astropy.config.get_cache_dir

So, how do I make changes to this page?

@UdokaVrede

UdokaVrede commented Oct 20, 2020

Can I take this up? I am an Outreachy applicant.

@bsipocz
Member

bsipocz commented Oct 20, 2020

I'm afraid this is a bit miscategorized as package-novice, in the sense that it is best suited for users who have run into cache issues. E.g., we cannot give much guidance on where to document these things so that they are most effective, i.e. where users would most likely see them.
Maybe a list of frequent solutions on the opening page, under Troubleshooting?

So you can certainly have a go at it, but it would be most impactful with input from users. You can reach some of them on the astroquery Slack channel or the astropy mailing list, or ping the people in issues from the past 1-2 years where the solution was cache related and extra docs would have helped.

So, in summary, you're welcome to work on this, but it is not a straightforward issue because of the meta layer.

@UdokaVrede

UdokaVrede commented Oct 20, 2020 via email

@bsipocz bsipocz removed the Easy label Nov 17, 2021
@bsipocz bsipocz added the cache label Nov 18, 2022
@bsipocz
Member

bsipocz commented Nov 18, 2022

@ceb8 - while the cache stuff is still fresh, could you have a look at the checklist above and see what else is missing in the updated docs?

@ceb8
Member

ceb8 commented Jul 8, 2023

@keflavich Can you explain what you mean by the last item?

@keflavich
Contributor Author

My idea was that we should have global documentation about the cache & timeout parameters, and each individual function's docstring that describes those parameters briefly should link back to the (longer and more complete) global documentation.
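
One way to picture that layout is the sketch below: a short per-parameter description in the function docstring that defers to the longer global page. `query_example` and its defaults are hypothetical and exist only to show the docstring shape; it is not an astroquery function, and the wording of the pointer to the global documentation is an assumption.

```python
def query_example(target, *, cache=True, timeout=60):
    """Hypothetical query function illustrating the proposed docstring layout.

    Parameters
    ----------
    target : str
        Name of the object to look up.
    cache : bool, optional
        Whether to reuse a cached response. See the global caching
        documentation for the cache location and how to clear it.
    timeout : float, optional
        Seconds to wait for the server. See the global documentation
        on timeouts for details.
    """
    # No network access here: the function only echoes its arguments.
    return {"target": target, "cache": cache, "timeout": timeout}
```

Each docstring stays brief, and the details about cache location and clearing live in one place.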

@JohannesBuchner

On sciserver.org, the home directory has very limited space, and one is supposed to work in workspace/Storage//persistent/. It would be good to be able to set the astroquery cache directory to another location.

Some workarounds:

  • regularly delete ~/.astropy/cache/download
  • make ~/.astropy a symlink to another place
  • delete ~/.astropy and create a regular file named ~/.astropy in its place; the astroquery cache then fails with the two warnings below and stores the data in TMPDIR instead:
      WARNING: CacheMissingWarning: Remote data cache could not be accessed due to FileExistsError: [Errno 17] File exists: '/home/idies/.astropy' [astropy.utils.data]
      WARNING: CacheMissingWarning: ("Cache directory cannot be read or created ([Errno 17] File exists: '/home/idies/.astropy'), providing data in temporary file instead.", '/home/idies/workspace/Storage/jbuchner2/chispec/tmp/astropy-download-12547-5hzkwzf3') [astropy.utils.data]

This is unexpected because I am using SDSS.get_spectra(..., cache=False), so I expected no caching.

An environment variable or some other way to influence caching could be useful.
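
The symlink workaround from the list above could be scripted roughly as follows. This is a sketch under stated assumptions: `relocate_cache` is a hypothetical helper, not part of astropy or astroquery, and the paths passed to it are placeholders that the caller must supply.

```python
import shutil
from pathlib import Path


def relocate_cache(cache_dir: Path, new_location: Path) -> Path:
    """Move cache_dir to new_location and leave a symlink at the old path."""
    cache_dir = Path(cache_dir).expanduser()
    new_location = Path(new_location).expanduser()
    if cache_dir.is_symlink():
        # Already relocated; report where the link points.
        return cache_dir.resolve()
    new_location.parent.mkdir(parents=True, exist_ok=True)
    if cache_dir.exists():
        if new_location.exists():
            raise FileExistsError(f"{new_location} already exists")
        # Keep existing cached files so nothing is downloaded again.
        shutil.move(str(cache_dir), str(new_location))
    else:
        new_location.mkdir(parents=True, exist_ok=True)
        cache_dir.parent.mkdir(parents=True, exist_ok=True)
    cache_dir.symlink_to(new_location, target_is_directory=True)
    return new_location
```

On a system with a small home directory, the first argument would be the ~/.astropy directory and the second a directory on the larger persistent storage.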

@keflavich
Contributor Author

@JohannesBuchner cache=False absolutely should result in no files being cached. When I run SDSS.query_region(..., cache=False), it does not write files - as expected. I've tested get_spectra, too, and nothing is written to disk:

rslt = SDSS.query_region(SkyCoord(200.5*u.deg, 20*u.deg), radius=45*u.arcsec, cache=False, spectro=True)
spec = SDSS.get_spectra(matches=rslt, cache=False)

The other workaround for your situation is to change the cache_location:

SDSS.cache_location = 'workspace/Storage//persistent/'

If you're seeing a case where cache=False is still caching a file, please report that as a separate issue. As reported right now, I cannot reproduce your problem.

@bsipocz
Member

bsipocz commented Oct 21, 2024

@ceb8 - is astroquery still on your radar? It would be nice to wrap up the caching issues and make it all consistent and well documented.

@ceb8
Member

ceb8 commented Oct 25, 2024

@bsipocz Yes. I don't know when I will get back to it. But it's definitely on my radar.


6 participants