Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make registering same DataDep twice a no-op #88

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

briochemc
Copy link
Contributor

Trying to fix issue #86 (not sure what I am doing at this point)

Trying to fix issue oxinabox#86 (not sure what I am doing at this point)
@codecov-io
Copy link

codecov-io commented Feb 15, 2019

Codecov Report

Merging #88 into master will increase coverage by 0.4%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master      #88     +/-   ##
=========================================
+ Coverage    53.5%   53.91%   +0.4%     
=========================================
  Files          12       12             
  Lines         228      230      +2     
=========================================
+ Hits          122      124      +2     
  Misses        106      106
Impacted Files Coverage Δ
src/registration.jl 60% <100%> (+10%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6c94424...f43cb0f. Read the comment docs.

@oxinabox oxinabox changed the title Update registration.jl Make registering same DataDep twice a no-op Feb 15, 2019
@@ -5,6 +5,9 @@ const registry = Dict{String, AbstractDataDep}()
function register(datadep::AbstractDataDep)
name = datadep.name
if haskey(registry, name)
if isequal(datadep, registry[name])
return registry
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return registry
return registry[name]

Might be a cleaner way to do this, using say:

existing = get(regisitry, name, nothing)
if existing != nothing
     if isequal(datadep, existing)
          return existing
    else
          @warning...
     end
end

or something like that?

@oxinabox
Copy link
Owner

Cool! thanks. I know digging into a new code-base can be tough and I appreciate it.

Needs tests.
Take a look at the Test.logger stuff for checking what warning messages are, or are not printed.

I think isequal (and thus hash) will need to be defined for a DataDep to make this work, but I am not sure.
Tests will prove it

@zerefwayne
Copy link
Contributor

@oxinabox Is any work remaining on this PR? I need this functionality for my new package.

@oxinabox
Copy link
Owner

oxinabox commented Nov 2, 2020

Something is wrong if you need this for your package.
Can you give more info?

@zerefwayne
Copy link
Contributor

zerefwayne commented Nov 2, 2020

@oxinabox

// Returns the URL of dataset
function download(country) {
 register(Datadep(country, ...))
...
}

function get(country) {
resource_path = download(country)

...further processing
}

There's a single get() function in my package which may be called multiple times by a user during runtime. Which leads to register being called multiple times. There's no way to get the path of the dataset or checking the existence if I don't register it once.

If register is not a very heavy process, I don't think doing it multiple times during runtime should be a problem.

UPDATE: I've solved the problem by catching the KeyError thrown by @datadep_str.

@oxinabox
Copy link
Owner

oxinabox commented Nov 2, 2020

recommendation is that register is called from the __init__ of your module.
It registers it into a global dictionary so that datadep"abc" works.
Basically so one can pretend it is a filepath string, which is how it acts.

If you have a function you call that return the DataDep object, you actually don't need to register it.
You can resolve it directly, to get a file path (which may mean it will also do a download if not already done)

function resolve(datadep::AbstractDataDep, inner_filepath, calling_filepath)::String

Though in many packages i have that could do that, I register in the __init__ anyway.
Its just the way normally it is worked with.
Something to rethink maybe if we ever did a big overhall of the API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants