JuliaAstro/JuliaSpace use case #26
Comments
Thanks @helgee, I really appreciate these concrete use cases (despite the slow reply)! I think DataSets is exactly the right kind of package to help with some of these things, though certainly there's some thinking to do and code to write to make it all work smoothly.

Firstly, I think I noted on Zulip that DataSets doesn't have a web data download backend yet, so there's some work to do there. Perhaps some of the code from RemoteData/DataDeps could help inform the details.

For your data workflows, I think we can support (2) and (3) already quite nicely. In particular, the current system is quite flexible about mapping dataset names to data storage via the data projects mechanism and [...]

Ironically, I think it's workflow (1) which will require some work. For this, there's currently no nice way to define datasets which are distributed via [...]

DataSets.register_package_dataset("path/to/package/Data.toml")

Within Data.toml, we might have something like the following:

[[datasets]]
name="EarthOrientation/EOP"
description="IERS Earth orientation parameter data"
[datasets.storage]
driver="download"
type="BlobTree"
[datasets.storage.cache]
updates="thursdays"
[[datasets.storage.files]]
url="https://datacenter.iers.org/data/csv/finals.all.csv"
path="finals.csv"
[[datasets.storage.files]]
url="https://datacenter.iers.org/data/csv/finals2000A.all.csv"
path="finals2000A.csv"

I've specified the currently-fictitious "download" storage driver here, but it could also be some other DataSets driver depending on the need. For example, small amounts of static data could just be distributed with the source code.

After that, how do you access the datasets within your package code? If you want type safety and in-memory caching of the data, there will still be a need for [...]

# Lazily loaded in-memory cache of EOP_DATA for current Julia session.
# Uses the name "EarthOrientation/EOP" to connect to the data defined in Data.toml
# Can be overridden if the user has a dataset of the same name in JULIA_DATASETS_PATH
const EOP_DATA = DataRef{EOParams}("EarthOrientation/EOP")
function use_data()
    data = EOP_DATA[] # Internally, uses `dataset("EarthOrientation/EOP")` to read the dataset if not already in memory?
    # do something with `data`
end

About your data dependencies with the EarthOrientation.jl vs Astrodynamics.jl approaches... these both seem like valid patterns for their own use cases. DataSets can enable various data to be swapped in depending on the environment though, so it could be valid to have [...]
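As a rough illustration of the lazily loaded cache idea in the snippet above, here is a minimal Julia sketch. `DataRef` is hypothetical and not an existing DataSets.jl type, and the `loader` argument, `load_eop`, `LOAD_COUNT` and the `rows` field are assumptions made up for this example. The loader stands in for whatever would eventually call `dataset("EarthOrientation/EOP")` and parse the result.

```julia
# Hypothetical sketch: DataRef is NOT part of DataSets.jl.
# It holds a dataset name plus a loader and reads the data on first access only.
mutable struct DataRef{T}
    name::String
    loader::Function            # assumed stand-in for reading via `dataset(name)`
    value::Union{T,Nothing}     # `nothing` until the first access
end

# Convenience constructor: start with an empty cache.
DataRef{T}(loader, name::AbstractString) where {T} =
    DataRef{T}(String(name), loader, nothing)

# `ref[]` loads the data once and returns the cached, typed value afterwards.
function Base.getindex(ref::DataRef{T}) where {T}
    if ref.value === nothing
        ref.value = ref.loader(ref.name)::T
    end
    return ref.value::T
end

# Placeholder payload type; the real EOParams would hold parsed IERS tables.
struct EOParams
    rows::Vector{String}
end

# Counts loader calls so the caching behaviour can be observed.
const LOAD_COUNT = Ref(0)

function load_eop(name)
    LOAD_COUNT[] += 1
    return EOParams(["placeholder row for $name"])
end

const EOP_DATA = DataRef{EOParams}(load_eop, "EarthOrientation/EOP")

function use_data()
    data = EOP_DATA[]           # triggers load_eop on the first call only
    return length(data.rows)
end
```

Calling `use_data()` repeatedly should invoke the loader a single time. How a user-supplied dataset of the same name would override the packaged one is left out here, since that depends on DataSets internals.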
Continuation of our discussion on Zulip: https://julialang.zulipchat.com/#narrow/stream/295423-juliaspace/topic/Lift-off!/near/248190318
CC: @ronisbr
Background
Within the JuliaAstro/JuliaSpace ecosystem there are several packages which need access to data sets on the internet, some of which get updated regularly.
This includes:
Workflows
We foresee several different workflows depending on the environment:
Current Solution
We currently use a combination of OptionalData.jl and RemoteFiles.jl to handle workflows 1 & 2. As of now, we do not have a solution for workflow 3.
Here's an example from EarthOrientation.jl:
Issues
[...] update function for manual updates. For example, AstroBase.jl depends on AstroTime.jl which depends on EarthOrientation.jl. In principle, AstroBase.jl needs an update function which calls AstroTime.jl's update function which calls EarthOrientation.jl's update function and so on.
[...] position(eph::AbstractEphemeris, t, ...). Astrodynamics.jl is the top-layer opinionated metapackage and defines a global default ephemeris via the approach above, e.g. position(t, ...).
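To make the chained-update issue concrete, here is a minimal Julia sketch. The three modules are hypothetical stand-ins (hence the `Sketch` suffix) for EarthOrientation.jl, AstroTime.jl and AstroBase.jl and do not reflect their real code. Each `update` returns the list of packages it refreshed so the call order is visible.

```julia
# Hypothetical stand-ins for the real packages; only the call chain matters here.
module EarthOrientationSketch
    # Bottom of the chain: this is where the remote files would be re-downloaded.
    update() = ["EarthOrientation"]
end

module AstroTimeSketch
    import ..EarthOrientationSketch
    # Must forward to its dependency by hand before reporting itself.
    update() = vcat(EarthOrientationSketch.update(), "AstroTime")
end

module AstroBaseSketch
    import ..AstroTimeSketch
    # Same boilerplate again, one layer further up.
    update() = vcat(AstroTimeSketch.update(), "AstroBase")
end
```

Every layer has to know about and forward to the layer below it, which is the boilerplate the issue describes. A central registry of named datasets would presumably let a single call refresh everything instead.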