Skip to content
This repository has been archived by the owner on Jun 18, 2023. It is now read-only.

Error when phenology fitting is enabled #49

Closed
valpasq opened this issue Oct 22, 2015 · 12 comments
Closed

Error when phenology fitting is enabled #49

valpasq opened this issue Oct 22, 2015 · 12 comments
Labels
Milestone

Comments

@valpasq
Copy link
Collaborator

valpasq commented Oct 22, 2015

Error occurs when attempting to run phenology fitting:

  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 190, in line
    ltm = pheno.LongTermMeanPhenology(yatsm, **cfg['phenology'])
TypeError: __init__() got an unexpected keyword argument 'year_interval'

Config location: /projectnb/landsat/projects/Massachusetts/p012r031/p012r031_config.yaml

@ceholden ceholden added the docs label Oct 22, 2015
@ceholden ceholden added this to the v0.5.4 milestone Oct 22, 2015
@ceholden
Copy link
Owner

Whoops. If you look at line 190 of yatsm line you'll see that I'm passing the entire phenology section of the config file to the __init__ method of the LTM phenology class. Some of the configuration options are actually for the fit method, and one of them (enable) is just a toggle.

This speaks to a larger issue when parameterizing the algorithms with a config file -- we probably need to separate the __init__ and fit sections. An example of what I'm proposing would be as follows:

# Section for phenology fitting
phenology:
    enable: False
    init:
        # Specification for dataset indices required for EVI based phenology monitoring
        red_index: 2
        nir_index: 3
        blue_index: 0
        # Scale factor for reflectance bands
        scale: 0.0001
        # You can also specify index of EVI if contained in dataset to override calculation
        evi_index:
        evi_scale:
        # Number of years to group together when normalizing EVI to upper and lower percentiles
        year_interval: 3
        # Upper and lower percentiles of EVI used for max/min scaling
        q_min: 10
        q_max: 90

This way it's clear what are hyperparameters and what are parameters used in the fitting. Related to #50. This proposed change would affect all other algorithm configurations (e.g., CCDCesque)

ceholden added a commit that referenced this issue Oct 22, 2015
@valpasq
Copy link
Collaborator Author

valpasq commented Oct 22, 2015

Okay, I think I'm following, but just to be clear, to run the phenology code, I should have enable: True ? (I wasn't getting the error with enable: False, was assuming that was because I wasn't running that section with the toggle off)

I guess my bigger question is what (if anything) can I do to start running the phenology code? Is there a change I can make to the .yaml config to prevent that error from happening (e.g. what you proposed above)? Or does something need to change in the code. Sorry if this is a dumb question.

@ceholden
Copy link
Owner

For the short term, the phenology calculation should run without error and without you changing your configuration file (commit 8c4313e).

In the next week though I'd like to adjust the config file format to subsection it into init and fit sections as long as that seems like something that makes sense to do. Most algorithms designed around the scikit-learn object API model have hyperparameters in their __init__ methods and many also have a data-dependent variables that control the fitting (e.g., an iteration tolerance). Since we're adapting these conventions where possible, it seems to me to make sense to further split the config file into fit and init.

Any thoughts? This config file change seems like an omission from the v0.5.0 release that should be made for the future and would be included in another release, but I don't want to keep proposing changes just to keep change things.

@ceholden ceholden added bug and removed docs labels Oct 23, 2015
@valpasq
Copy link
Collaborator Author

valpasq commented Oct 23, 2015

Thanks for the fix! Looking like YATSM with phenology `enable: True`` is running without issue now...

I think it does make sense to subsection the config file into init and fit sections, especially if this is a convention used by scikit-learn. This way it's more clear how the parameters are used.

In terms of making changes, I would say better to tweak things now while we are all in testing mode after the new release. It will only get harder as more users start working with the code, so if something makes sense and improves clarity and functionality, it's worth making the change sooner than later.

@valpasq
Copy link
Collaborator Author

valpasq commented Oct 23, 2015

Uh-oh. New problem:

File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/cli/line.py", line 201, in line
    q_max=pcfg['q_max'])
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/phenology.py", line 283, in fit
    year_interval, q_min, q_max)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/phenology.py", line 249, in _fit_record
    ) + 1 + peak_doy + 1
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/numpy/lib/nanfunctions.py", line 218, in nanmin
    res = np.fmin.reduce(a, axis=axis, out=out, keepdims=keepdims)
ValueError: zero-size array to reduction operation fmin which has no identity

Not sure if this info helps, but 871 YATSM results (lines) were produced before the above error occurred..

@ceholden
Copy link
Owner

Is that for line 872 then? Looks like there is a bug for situations I haven't encountered before. My best guess is that the percentile scaling creates an array that is 100% np.nan, but I definitely need to get find the data that caused this to track it down as I've never encountered this.

@valpasq
Copy link
Collaborator Author

valpasq commented Oct 23, 2015

I'm not sure how to tell which line(s) actually generated the error...
I had 100 jobs, as far as I can tell, all ended with that error.

The log files are in
projectnb/landsat/projects/Massachusetts/p012r031/, results are in
./images/YATSM/

I ran the code using the following script:

module load python/2.7.5
module load gdal/1.11.1
module load R/R-3.1.1

njob=100
for job in $(seq 1 $njob); do
    qsub -j y -V -l h_rt=24:00:00 -N yatsm_$job -l eth_speed=10 -b y  \
        yatsm line --resume \

/projectnb/landsat/projects/Massachusetts/p012r031/p012r031_config.yaml \
        $job $njob
done

Let me know if there's any other info that might help.

@ceholden
Copy link
Owner

If you turn on verbose messages, using yatsm -v [subcommand], then yatsm -v line will spit out what line it's working on.

As for the error you're encountering, the bad news is that I'm having a remarkably hard time reproducing it given that all of your jobs seem to fail. I've placed a "now running this column" log style log statement and submitted two qsub jobs for your stack. One job is running with almost exactly your environment (e.g., dependencies from module system) and another using conda. I'll see if that can help track down what data are causing the issue.

@valpasq
Copy link
Collaborator Author

valpasq commented Oct 24, 2015

Thanks for looking into it. I didn't realize there was a verbose option for yatsm line (though I have noticed my log files are relatively sparse before an error occurs. I've added the -v to my script for future runs.

If there are any runs you want me to do or other ways I could help diagnose, just let me know.

@ceholden
Copy link
Owner

Reproduced the bug for column 5238 in line 900. The pixel timeseries that throws the error looks to be from a very dark and non-vegetated land cover. The DOY profile for EVI for this timeseries is so poor that the estimated middle of the season (peak_doy) is so late into the year that there are no observations for the autumn, hence the "zero-size array" in the exception message. There are likely other scenarios in which the estimated date is so early that there are no observations in the spring period.

I've pushed what I think fixes the problem to issue49 branch. I check to make sure spring and autumn splined EVI arrays have 1+ observations.

It fixed the issue for me for that one pixel, but I'm waiting on the results for 1/100th (jobid 1 out of 100) to run through before I'll confirm. You can pull and checkout the issue49 branch if you want to test but I should get to merge into master before Monday.

ceholden added a commit that referenced this issue Oct 26, 2015
Remove data-related vars from __init__ and move to fit.
Remove hyperparams from fit and move to __init__.
Also split pheno configuration into init and fit
ceholden added a commit that referenced this issue Oct 26, 2015
ceholden added a commit that referenced this issue Oct 26, 2015
ceholden added a commit that referenced this issue Oct 26, 2015
ceholden added a commit that referenced this issue Oct 26, 2015
@valpasq
Copy link
Collaborator Author

valpasq commented Oct 30, 2015

Pulled updated repository yesterday (currently up to date). Tried running YATSM with phenology enabled. A good number of lines process with no issue, but eventually the following error occurs:

  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/phenology.py", line 309, in fit
    self.q_min, self.q_max)
  File "/usr3/graduate/valpasq/Documents/yatsm/yatsm/phenology.py", line 241, in _fit_record
    pad_start = np.arange(1, yeardoy[:, 1].min() + 1)
  File "/project/earth/packages/Python-2.7.5/lib/python2.7/site-packages/numpy/core/_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation minimum which has no identity 

yatsm_*.o* logfiles located in /projectnb/landsat/projects/Massachusetts/
Verbose turned on to help ID which lines caused the error.

@ceholden
Copy link
Owner

Moving to new ticket...

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

2 participants