Fix Multi-Objective cache #872

Merged
renesass merged 16 commits into development from fix_mo_cache
Jul 7, 2022

@renesass renesass commented Jul 4, 2022

Closes #852.

I slightly rearranged the runhistory so that it's more intuitive.

What else did I do?
average_cost/min_cost/sum_cost return a list of floats (instead of a float) now.
For example, average_cost averages each objective separately: two runs with costs [100, 200] and [0, 0] yield [50, 100]. However, once you call get_cost, this cached [50, 100] is normalized based on all entries passed to the runhistory and reduced to a single value, so get_cost still returns a float in the end.
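
To make the per-objective averaging concrete, here is a minimal sketch. The helper name `average_per_objective` is hypothetical and is not part of the runhistory API; it only reproduces the arithmetic of the example above.

```python
from statistics import fmean


def average_per_objective(costs):
    """Average each objective separately across runs (illustrative helper)."""
    # zip(*costs) regroups the values by objective instead of by run.
    return [fmean(objective_values) for objective_values in zip(*costs)]


runs = [[100.0, 200.0], [0.0, 0.0]]
print(average_per_objective(runs))  # [50.0, 100.0]
```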

Unfortunately, the intensifier and some other methods work with average_cost or sum_cost directly (these should actually be private methods), so I have to call normalize_cost in the intensifier again. Since _cost_per_config (the cache for average_cost) always stores the current state with all objective values, the caching is correct, and we normalize only once it's needed in the get_cost method.

Edit: I added an argument to average_cost/min_cost/sum_cost to return the normalized (single float) values. This makes the code simpler and more scalable.
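
The idea of caching raw per-objective values and normalizing only on request could look roughly like the sketch below. `TinyRunHistory`, its `normalize` flag, the plain-mean aggregation, and the fallback of 0.0 for an objective whose observed minimum equals its maximum are all assumptions made for illustration; they are not the code in this PR.

```python
from statistics import fmean


class TinyRunHistory:
    """Hypothetical stand-in for a runhistory; not SMAC's actual class."""

    def __init__(self):
        self._runs = {}             # config id -> list of raw cost vectors
        self._cost_per_config = {}  # cache: config id -> per-objective average
        self._bounds = []           # per-objective (min, max) over all runs

    def add(self, config_id, cost):
        cost = [float(c) for c in cost]
        self._runs.setdefault(config_id, []).append(cost)
        # Refresh only the cache entry of the configuration that changed.
        self._cost_per_config[config_id] = [
            fmean(values) for values in zip(*self._runs[config_id])
        ]
        # Track the observed range of every objective.
        if not self._bounds:
            self._bounds = [(c, c) for c in cost]
        else:
            self._bounds = [
                (min(low, c), max(high, c))
                for (low, high), c in zip(self._bounds, cost)
            ]

    def average_cost(self, config_id, normalize=False):
        raw = self._cost_per_config[config_id]
        if not normalize:
            return raw  # list of floats, one per objective
        # Assumption: min-max scaling, 0.0 when an objective has no spread.
        scaled = [
            (c - low) / (high - low) if high > low else 0.0
            for c, (low, high) in zip(raw, self._bounds)
        ]
        return fmean(scaled)  # single float


history = TinyRunHistory()
history.add("a", [100, 200])
history.add("a", [0, 0])
print(history.average_cost("a"))                  # [50.0, 100.0]
print(history.average_cost("a", normalize=True))  # 0.5
```

Because the cache holds unnormalized values, adding a run for another configuration changes the bounds but leaves the cached entry for "a" untouched; only the normalized result changes.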

One more thing that is really weird: in runhistory2epm we iterate over the run dictionary (runkey, runvalue), but there we use run.cost directly and hence have to call normalize_costs again in the multi-objective case. I wonder why it is done this way? @mfeurer

@renesass renesass linked an issue Jul 4, 2022 that may be closed by this pull request
@renesass renesass requested review from benjamc and dengdifan July 6, 2022 06:50
"""
Transform a multi-objective loss to a single loss.

Parameters
----------
- values (np.ndarray): Normalized values.
+ values : list[float]
Collaborator

I would argue that the normalization is part of the aggregation strategy and should probably be moved here instead.

Collaborator Author

I don't see a reason why the values should not be normalized here.

Collaborator

@timruhkopf timruhkopf left a comment

Normalization returns a vector of ones if budget = None. This may be neither intuitive nor desirable.

  • We might want to make normalization part of the aggregation strategy instead (making it explicit).

  • We could also rename the AggregationStrategy to ScalarizationStrategy, implying that we want to turn the multi-objective values into a scalar objective.
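
As a rough sketch of the first suggestion, a strategy object could own the normalization step itself. The class name `MeanScalarization`, the `bounds` argument, and the handling of zero-width bounds are hypothetical choices for illustration, not an existing interface.

```python
from typing import Sequence, Tuple


class MeanScalarization:
    """Hypothetical strategy that normalizes explicitly before aggregating."""

    def __call__(
        self,
        values: Sequence[float],
        bounds: Sequence[Tuple[float, float]],
    ) -> float:
        normalized = []
        for value, (low, high) in zip(values, bounds):
            span = high - low
            # Assumption: an objective without spread contributes 0.0.
            normalized.append((value - low) / span if span > 0 else 0.0)
        # Scalarize by taking the plain mean of the normalized objectives.
        return sum(normalized) / len(normalized)


strategy = MeanScalarization()
print(strategy([50.0, 100.0], [(0.0, 100.0), (0.0, 200.0)]))  # 0.5
```

With this layout the caller passes raw objective values, so the normalization is visible at the point where the scalar is produced.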

@renesass renesass merged commit 100e2f3 into development Jul 7, 2022
@renesass renesass deleted the fix_mo_cache branch July 7, 2022 09:02

# Normalize st all theta values sum up to 1
theta = theta / (np.sum(theta) + 1e-10)

# Weight the values
theta_f = theta * values

- return np.max(theta_f, axis=1) + self.rho * np.sum(theta_f, axis=1)
+ return np.max(theta_f, axis=0) + self.rho * np.sum(theta_f, axis=0)
Contributor

I'm somewhat surprised to see that the summation over an axis changes without a respective test changing. Does this mean that there is no test for ParEGO?
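
A regression test for this could pin the scalarized value for a single, already normalized cost vector. The sketch below is a pure-Python restatement of the snippet above for one 1-D vector, where the only axis to reduce over is axis 0; the function name `augmented_tchebycheff`, the explicit `theta` argument, and the default `rho=0.05` are assumptions for illustration.

```python
def augmented_tchebycheff(values, theta, rho=0.05):
    """Scalarize one cost vector; mirrors the snippet above for 1-D input."""
    # Normalize so that all theta values sum up to (almost exactly) 1.
    total = sum(theta) + 1e-10
    weights = [t / total for t in theta]

    # Weight the values.
    weighted = [w * v for w, v in zip(weights, values)]

    # Reduce over the objectives, i.e. the only axis of a 1-D vector.
    return max(weighted) + rho * sum(weighted)


def test_augmented_tchebycheff_single_vector():
    result = augmented_tchebycheff([0.2, 0.8], theta=[1.0, 1.0])
    # Weighted values are about [0.1, 0.4]: 0.4 + 0.05 * 0.5 = 0.425.
    assert abs(result - 0.425) < 1e-6


test_augmented_tchebycheff_single_vector()
```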

- numerator = data - min_value
- normalized_values.append(numerator / denominator)
+ cost = p / q
+ costs += [cost]
Contributor

Why not costs.append()?
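
For context, both spellings extend the list in place; `append` simply avoids building a temporary one-element list. A small illustration with made-up values:

```python
costs = [0.25]
alias = costs  # second reference to the same list object

costs += [0.5]      # in-place extend with a temporary one-element list
costs.append(0.75)  # in-place append, no temporary list

print(alias)  # [0.25, 0.5, 0.75]
```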

renesass added a commit that referenced this pull request Jul 14, 2022
## Features
* [BOinG](https://arxiv.org/abs/2111.05834): A two-stage Bayesian optimization approach to allow the 
optimizer to focus on the most promising regions.
* [TurBO](https://arxiv.org/abs/1910.01739): Reimplementation of the TurBO-1 algorithm.
* Updated pSMAC: Can pass arbitrary SMAC facades now. Added example and fixed tests.

## Improvements
* Enabled caching for multi-objectives (#872). Costs are now normalized in `get_cost`
or, optionally, in `average_cost`/`sum_cost`/`min_cost` to obtain a single float value. Therefore,
the cached cost values no longer need to be updated every time a new entry is added to the runhistory.

## Interface changes
* We changed the location of Gaussian processes and random forests. They are in the folders
`epm/gaussian_process` and `epm/random_forest` now.
* Also, we restructured the optimizer folder and therefore the location of the acquisition functions
and configuration chooser.
* Multi-objective functions are located in the folder `multi_objective`.
* pSMAC facade was moved to the facade directory.

Co-authored-by: Difan Deng <[email protected]>
Co-authored-by: Eddie Bergman <[email protected]>
Co-authored-by: Carolin Benjamins <[email protected]>
Co-authored-by: timruhkopf <[email protected]>
github-actions bot pushed a commit that referenced this pull request Jul 14, 2022
sharpe5 added a commit to sharpe5/SMAC3 that referenced this pull request Aug 11, 2022
Successfully merging this pull request may close these issues.

Integrate Multi-Objective caching
3 participants