Convention for persistent per-package data location? #777
Comments
Yes. Though it is assumed packages-private is only used when there is a name clash. Solving this with a project directory is also fine, until you have a name clash within a given project. |
Just to be clear, by "per-package data location", I mean "for Conda.jl" and "for DataDeps.jl" not "for each package using DataDeps.jl". So I guess it would be something like |
Ah, ok then. In general, I am of the opinion that when throwing around multiple gigabyte folders, Conda is different. You actively do not want the Conda environments bleeding over between julia environments. |
I agree with @oxinabox. I think that we need a mechanism for sharing read-only data in a content-addressable fashion. For writable data, it should be isolated and per-environment, ideally recorded in the project file somewhere or in a separate configuration file. |
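To make the "content-addressable, read-only" idea concrete, here is a minimal sketch, assuming a store keyed by the SHA-256 of the data. The helper name `store_readonly` and the flat directory layout are hypothetical, not an existing Pkg API.

```julia
using SHA

# Hypothetical helper (not part of Pkg): store `bytes` under `root` in a file
# named after its SHA-256 hash, so identical content is stored only once.
function store_readonly(root::AbstractString, bytes::Vector{UInt8})
    key = bytes2hex(sha256(bytes))   # 64-character hex digest
    path = joinpath(root, key)
    if !isfile(path)
        mkpath(root)
        write(path, bytes)
        chmod(path, 0o444)           # mark the stored file read-only
    end
    return path
end

root = mktempdir()
a = store_readonly(root, Vector{UInt8}("same content"))
b = store_readonly(root, Vector{UInt8}("same content"))
a == b   # both callers share one file
```

Because the path depends only on the content, several environments can point at the same entry without coordinating, which is the de-duplication property being asked for. Writable state would still need a separate, per-environment location.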
That's not the point of my suggestion. Use of a UUID was just an example. It can be
You need de-duplication sometimes, exactly like Pkg3 is doing. For Conda.jl, it is sub-optimal to install a full Miniconda installation for each environment (JuliaPy/Conda.jl#123 (comment)). A Miniconda installation (the so-called "base" environment) has a package cache to avoid downloading each package every time you create a new environment. IIRC it also uses hard links for package installations to save disk space. You can make use of those efforts of |
It's not 100% clear to me how to get a reasonable path for the current environment in There doesn't seem to be any documented API in Julia 1.0 to get the environment of a package. The best I can come up with (based on

    function projectof(m::Module)
        pkg = get(Base.module_keys, m, nothing)
        pkg === nothing && return nothing
        if pkg.uuid === nothing
            for env in Base.load_path()
                Base.project_deps_get(env, pkg.name) !== false && return env
            end
        else
            for env in Base.load_path()
                Base.manifest_uuid_path(env, pkg) !== nothing && return env
            end
        end
        return nothing
    end

This returns the project file or path in To get a unique name from the environment path, I suppose we could use |
It's also not clear to me how to achieve the reproducibility goals of a Project.toml file with a Conda.jl installation, since that is a completely foreign (to Pkg) system of dependencies that the user could have mucked with arbitrarily (by |
Here is the Pkg API and usage in Conda.jl I have in mind. The logic below should be implementable if we use package name or UUID instead of module.

    module Hypothetical

    module Pkg

    # nameof(m) drops the parent-module prefix, so `Conda` maps to `data/Conda`.
    """ Hypothetical package data API. """
    datapath(m::Module) = joinpath(DEPOT_PATH[1], "data", string(nameof(m)))

    """ Hypothetical package option API. """
    function options(m::Module)
        if endswith(string(m), "Conda")
            # Stub: pretend the option is set at random.
            return Dict(:private_env => rand() < 0.5)
        end
        return Dict()
    end

    end # module

    module Conda

    using ..Pkg

    """ Return `~/.julia/data/Conda/envs/\$env`. """
    prefix(env) =
        joinpath(Pkg.datapath(Conda), "envs", string(env))

    """
    Get `conda` executable from the base miniconda installation.
    Return `~/.julia/data/Conda/base/bin/conda`.
    """
    conda_cmd() = joinpath(Pkg.datapath(Conda), "base", "bin", "conda")

    # there is probably a better way to do this...
    is_named_env() = startswith(Base.active_project(),
                                joinpath(DEPOT_PATH[1], "environments"))

    """ Get `prefix` for the current conda environment. """
    function current_prefix()
        if get(Pkg.options(Conda), :private_env, false)
            if is_named_env()
                # Use Conda environment ~/.julia/data/Conda/envs/$ENV for
                # Julia environment ~/.julia/environments/$ENV if package
                # option `private_env` is set to `true` for this
                # environment:
                name = basename(dirname(Base.active_project()))
                return prefix(name)
            else
                # For environment outside DEPOT_PATH[1], use
                # $(dirname(Base.active_project()))/.condajl or something
                # (or maybe make it configurable)?
                return joinpath(dirname(Base.active_project()), ".condajl")
            end
        else
            # Otherwise, default to environment per-Julia installation:
            return prefix("v$(VERSION.major).$(VERSION.minor)")
            # return prefix("v$(VERSION.major)")  # maybe?
        end
    end

    end # module

    end # module
I suggest something like
I guess you can use |
Indeed, packages with mutable state (like a package wrapping another package manager) break the assumption that the content hash is enough to reproduce the content. |
I don't understand why
Is this directory even guaranteed to be writable? |
It is unique because (But I guess
When you set the package option |
Depending on the API, |
I'm closing it. It looks like |
In the longer run, we still need something tied to the environment and package options, but getting package options working seems like a higher priority to me. |
It looks like Conda.jl JuliaPy/Conda.jl#123 and DataDeps.jl oxinabox/DataDeps.jl#48 need a stable location that persists across package versions. (I'm less certain about DataDeps.jl; @oxinabox please correct me if I'm missing something.) Furthermore, sometimes it is useful to share such a location even across different Julia versions to save disk space. I think it would then make sense to document a convention for the directory that each package can use without worrying about name clashes and cluttering the `~/.julia` directory.

For example, how about letting `$(DEPOT_PATH[i])/data/$UUID` be used by the package whose UUID is `$UUID`? (e.g., `~/.julia/data/8f4d0f93-b110-5947-807f-2305c1781a2d` for Conda.jl) (Question: should it be writable only when `i==1`?)

cc: @stevengj @oxinabox @Evizero
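As a rough illustration of the proposed layout, here is a minimal sketch, assuming the directory is simply `data/<UUID>` under a depot. The helper name `package_data_dir` is hypothetical and not an existing Pkg or Base function; defaulting to the first depot is also an assumption, matching the open `i==1` question.

```julia
using UUIDs

# Hypothetical helper (not an existing API): data directory for the package
# identified by `uuid`, placed under `depot` (default: first entry of DEPOT_PATH).
function package_data_dir(uuid::UUID; depot::AbstractString = first(DEPOT_PATH))
    return joinpath(depot, "data", string(uuid))
end

# UUID quoted in the post as the one for Conda.jl.
conda_uuid = UUID("8f4d0f93-b110-5947-807f-2305c1781a2d")

# Use a temporary directory as a stand-in depot so the sketch does not touch ~/.julia.
dir = package_data_dir(conda_uuid; depot = mktempdir())
mkpath(dir)   # create the directory on first use
```

Keying on the UUID rather than the package name avoids clashes between same-named packages, at the cost of a directory name that is less readable when browsing the depot by hand.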