[breaking] return config struct from each layer #67
This is a breaking change because it changes the return value of `add_predictor`.
So playing around with this. One option is:

```julia
struct ModelComponents
    variables::Vector{Any}
    constraints::Vector{Any}
end

function add_predictor(model::JuMP.Model, ::Tanh, x::Vector)
    y = JuMP.@variable(model, [1:length(x)], base_name = "moai_Tanh")
    _set_bounds_if_finite.(y, -1.0, 1.0)
    cons = JuMP.@constraint(model, y .== tanh.(x))
    return ModelComponents(y, cons), y
end

function add_predictor(::JuMP.Model, ::ReducedSpace{Tanh}, x::Vector)
    return ModelComponents(Any[], Any[]), tanh.(x)
end
```

with the syntax:

```julia
components, y = add_predictor(model, predictor, x)
```

Another option is:

```julia
struct PredictionModel{T}
    output::Vector{T}
    variables::Vector{Any}
    constraints::Vector{Any}
end

function add_predictor(model::JuMP.Model, ::Tanh, x::Vector)
    y = JuMP.@variable(model, [1:length(x)], base_name = "moai_Tanh")
    _set_bounds_if_finite.(y, -1.0, 1.0)
    cons = JuMP.@constraint(model, y .== tanh.(x))
    return PredictionModel(y, y, cons)
end

function add_predictor(::JuMP.Model, ::ReducedSpace{Tanh}, x::Vector)
    return PredictionModel(tanh.(x), Any[], Any[])
end
```

with the syntax:

```julia
ml_model = add_predictor(model, predictor, x)
y = ml_model.output
```

Yet another option is:

```julia
y = add_predictor(model, predictor, x)
components, y = add_predictor(model, predictor, x; return_components = true)
```
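As an illustration (not part of the proposal above), the keyword-controlled variant could be wired up roughly like this, reusing the `ModelComponents` and `Tanh` names from the snippets above:

```julia
# Hypothetical sketch: one method whose keyword decides whether the components are returned.
function add_predictor(
    model::JuMP.Model,
    predictor::Tanh,
    x::Vector;
    return_components::Bool = false,
)
    y = JuMP.@variable(model, [1:length(x)], base_name = "moai_Tanh")
    cons = JuMP.@constraint(model, y .== tanh.(x))
    components = ModelComponents(y, cons)
    return return_components ? (components, y) : y
end
```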
Thoughts @Robbybp?
> The idea is that there is no overlap between ...

Are you thinking that variables/constraints are completely unstructured, or somehow mimic the structure of the network (e.g., are indexed by nodes/layers)? Asking because I think this would be nice, but I'm not sure if it can/should be generalized.
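To make the "structured" alternative concrete, a minimal sketch of what layer-indexed components could look like (the `LayerComponents`/`StructuredComponents` names are made up for illustration, not part of any proposal here):

```julia
# Hypothetical: group what each layer adds to the model, instead of one flat vector.
struct LayerComponents
    variables::Vector{Any}
    constraints::Vector{Any}
end

struct StructuredComponents
    layers::Vector{LayerComponents}  # layers[i] holds the components added by layer i
end
```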
Nah, with overlap. Perhaps each predictor should have a dedicated ...
Personally, I'd prefer no overlap.
Maybe, although I could see this getting complicated quickly. Maybe a default ...
One issue is that the type of our result ...
Yip. That is exactly what this is: #80
Here are @pulsipher's thoughts from #82:

I definitely think it is a good idea to have access to the variables and constraints created for the predictor. This helps to demystify the transformations and I believe provides a way to delete a predictor (i.e., manually deleting the variables and constraints). Here are some of my thoughts on syntax to accomplish this. I prefer options 2 or 3.

1. Using the approach proposed in #80

I don't have any major issues with this approach, except that I think ... One side question would be why is ...

2. Tweaking #80 to return only one object

Instead of returning ..., one could use:

```julia
struct SimpleFormulation{P<:AbstractPredictor} <: AbstractFormulation
    predictor::P
    outputs::Array{Any}  # new field for `y`
    variables::Vector{Any}
    constraints::Vector{Any}
end
```

Then the user can just extract the outputs `y` from the formulation object as wanted. Going one step further, one could even overload:

```julia
Base.getindex(f::SimpleFormulation, inds...) = getindex(f.outputs, inds...)
```

3. Store the formulation info in the model and use references

Adding a little more complexity, we could store the formulation objects in the model and use references:

```julia
predictor = add_predictor(model, nn, x)
y = outputs(predictor)
cons = transformation_constraints(predictor)
vars = transformation_variables(predictor)
predictor_obj = predictor_object(predictor)
set_transformation(predictor, TransformType())  # change the transformation used in-place
delete(model, predictor)  # removes the predictor and anything it added to the model
```

Most of the above API could also be added with option 2. Moreover, we could also overload ...
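For concreteness, usage under option 2 might look roughly like this (a sketch; `nn` and the field names follow the snippet above):

```julia
# Option 2: a single returned object that carries the outputs.
formulation = add_predictor(model, nn, x)
y = formulation.outputs   # explicit field access
y1 = formulation[1]       # or via the proposed Base.getindex overload
```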
My thought for choosing option 1 is that most people don't actually want the formulation. Most codes will do:

```julia
_, y = MathOptAI.add_predictor(model, predictor, x)
# or
y, _ = MathOptAI.add_predictor(model, predictor, x)
# or
y = first(MathOptAI.add_predictor(model, predictor, x))
```

I get that a single return simplifies things, but then I assume most people will immediately do:

```julia
formulation = MathOptAI.add_predictor(model, predictor, x)
y = formulation.output
```

This violates my design principle, and is one of the things I don't like about OMLT: https://github.com/lanl-ansi/MathOptAI.jl/blob/main/docs/src/developers/design_principles.md#omlt
I think we can make ...

I'm not in a hurry to merge #80. At minimum, I'll wait until we can make the repo public and set up CI etc. It's a pretty big change to the library.
Another API idea would be to treat an added predictor as an operator-like object. For a NN this might look like:

```julia
NN = add_predictor(model, flux_chain)  # creates a modelling object that can be treated as an operator
@objective(model, Min, sum(NN(x)))
```

where ...

In my research, we commonly use a single NN in an optimal control problem that steps forward in time. So the same NN is used over ... What is nice about this approach is that ...

```julia
@predictor(model, NN, flux_chain)
# containers of predictors could also be supported
@predictor(model, NN[i = 1:10], flux_chain[i])
```
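One way such an operator-like object could be realized is as a small functor that remembers the model and the underlying predictor; a sketch under that assumption (the `PredictorOperator` name is hypothetical):

```julia
# Hypothetical wrapper: calling it formulates the predictor on the given input vector.
struct PredictorOperator{P}
    model::JuMP.Model
    predictor::P
end

function (op::PredictorOperator)(x::Vector)
    # Assumes the output vector is the first element of whatever `add_predictor`
    # returns, as in `y = first(add_predictor(...))` elsewhere in this thread.
    return first(add_predictor(op.model, op.predictor, x))
end
```

A real implementation would presumably also cache or reuse the underlying nonlinear operator so that calling `NN` repeatedly does not rebuild the formulation each time.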
At one point I played around with ... I also considered the macros. But I don't know that they add much functionality. They're really just macros for the sake of it. I decided to go with ...

I'm open to revisiting this in the future though, but it would require some large-scale practical examples that clearly demonstrate the benefit.
Enabling ... would look something like:

```julia
function (predictor::AbstractPredictor)(x::Vector)
    model = _get_model_or_error(x)
    return add_predictor(model, predictor, x)
end
```

where ...

```julia
function (predictor::AbstractPredictor)(x::Vector)
    model = _get_model_or_error(x)
    return first(add_predictor(model, predictor, x))
end
```

which provides a natural way to avoid getting the formulation information if you don't want it.

I still would like to also avoid making redundant formulations by using the same nonlinear operator over a set of inputs and by using the same formulation if ...

A large-scale example that exists in the optimal control community (and my own research) is an MPC formulation that uses a NN model that steps in time. The NN takes the states ...

```julia
K = 100
raw_NN = # put raw Flux/Lux/PyTorch model here
@variable(model, x_lb[i] <= x[i = 1:10, 0:K] <= x_ub[i])
@variable(model, u_lb[j] <= u[j = 1:2, 0:K] <= u_ub[j])
NN = build_predictor(raw_NN, gray_box = true)
@constraint(model, [k = 0:K-1], NN(vcat(x[:, k], u[:, k])) .== x[:, k+1])
```

where only one new nonlinear operator is created for ...
Closing as won't-fix for now. Let's see how far our current setup gets us.
As discussed with Russell and Kaarthik