[modeling] add modeling implementations #1067

Merged: 18 commits into robynpy_release from review_ready_impl_modeling_new, Oct 2, 2024

@alxlyj commented Oct 2, 2024

Project Robyn

This PR adds the Python implementations for the modeling component.

Test Plan and Diagrams

graph TD
    %% Color definitions for better readability in both light and dark modes
    classDef r_node fill:#FFE6E6,stroke:#FF6B6B,stroke-width:2px,color:black;
    classDef py_node fill:#E6FFE6,stroke:#4CAF50,stroke-width:2px,color:black;
    classDef mapped fill:#E6F3FF,stroke:#1E90FF,stroke-width:2px,color:black;
    classDef partially_mapped fill:#FFFFE6,stroke:#FFD700,stroke-width:2px,color:black;
    classDef missing fill:#FFE6F0,stroke:#FF69B4,stroke-width:2px,color:black;

    subgraph R ["R Implementation (model.R)"]
        R_robyn_run["robyn_run()"]:::r_node
        R_robyn_train["robyn_train()"]:::r_node
        R_robyn_mmm["robyn_mmm()"]:::r_node
        R_model_decomp["model_decomp()"]:::r_node
        R_model_refit["model_refit()"]:::r_node
        R_lambda_seq["lambda_seq()"]:::r_node
        R_hyper_collector["hyper_collector()"]:::r_node
    end

    subgraph Python ["Python Implementation"]
        P_BaseModelExecutor["BaseModelExecutor class"]:::py_node
        P_ModelExecutor["ModelExecutor class"]:::py_node
        P_RidgeModelBuilder["RidgeModelBuilder class"]:::py_node
        P_build_models["build_models()"]:::py_node
        P_model_train["_model_train()"]:::py_node
        P_run_nevergrad_optimization["_run_nevergrad_optimization()"]:::py_node
        P_evaluate_model["_evaluate_model()"]:::py_node
        P_prepare_data["_prepare_data()"]:::py_node
        P_geometric_adstock["_geometric_adstock()"]:::py_node
        P_hill_transformation["_hill_transformation()"]:::py_node
        P_calculate_rssd["_calculate_rssd()"]:::py_node
        P_calculate_mape["_calculate_mape()"]:::py_node
        P_lambda_seq["_lambda_seq()"]:::py_node
    end

    R_robyn_run --> |Mapped| P_ModelExecutor:::mapped
    R_robyn_train --> |Mapped| P_build_models:::mapped
    R_robyn_mmm --> |Partially Mapped| P_run_nevergrad_optimization:::partially_mapped
    R_model_decomp --> |Partially Mapped| P_evaluate_model:::partially_mapped
    R_model_refit --> |Partially Mapped| P_evaluate_model:::partially_mapped
    R_lambda_seq --> |Mapped| P_lambda_seq:::mapped
    R_hyper_collector --> |Partially Mapped| P_RidgeModelBuilder:::partially_mapped

    P_BaseModelExecutor --> P_ModelExecutor
    P_ModelExecutor --> P_RidgeModelBuilder
    P_RidgeModelBuilder --> P_build_models
    P_build_models --> P_model_train
    P_model_train --> P_run_nevergrad_optimization
    P_run_nevergrad_optimization --> P_evaluate_model
    P_evaluate_model --> P_prepare_data
    P_prepare_data --> P_geometric_adstock
    P_prepare_data --> P_hill_transformation
    P_evaluate_model --> P_calculate_rssd
    P_evaluate_model --> P_calculate_mape
    P_evaluate_model --> P_lambda_seq

    subgraph Missing ["Missing or Incomplete"]
        M_robyn_converge["robyn_converge()"]:::missing
        M_ts_validation["ts_validation()"]:::missing
        M_robyn_outputs["robyn_outputs()"]:::missing
    end

    R_robyn_run --> M_robyn_converge
    R_robyn_run --> M_ts_validation
    R_robyn_run --> M_robyn_outputs

    %% Add a note about color coding
    classDef note fill:#f9f9f9,stroke:#333,stroke-width:1px,color:black;

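
In the flow above, `_prepare_data()` fans out to `_geometric_adstock()` and `_hill_transformation()`. The following is a minimal sketch of what those two steps conventionally compute, not the code merged in this PR. It assumes the usual geometric carryover recursion and a Hill curve whose inflexion point is interpolated between the minimum and maximum of the input by `gamma`. Plain Python lists stand in for the `pd.Series` arguments shown in the class diagram, and the free-function names are hypothetical.

```python
from typing import List, Sequence


def geometric_adstock(x: Sequence[float], theta: float) -> List[float]:
    """Carry over a fraction `theta` of the previous adstocked value.

    Assumed recursion: out[t] = x[t] + theta * out[t - 1], with out[-1] = 0.
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta is assumed to lie in [0, 1)")
    out: List[float] = []
    carry = 0.0
    for value in x:
        carry = value + theta * carry
        out.append(carry)
    return out


def hill_transformation(x: Sequence[float], alpha: float, gamma: float) -> List[float]:
    """Apply a Hill saturation curve to non-negative inputs.

    Assumed form: x^alpha / (x^alpha + inflexion^alpha), where the inflexion
    point is min(x) * (1 - gamma) + max(x) * gamma.
    """
    if len(x) == 0:
        return []
    inflexion = min(x) * (1.0 - gamma) + max(x) * gamma
    k = inflexion ** alpha
    result: List[float] = []
    for value in x:
        numerator = value ** alpha
        denominator = numerator + k
        # Guard the degenerate all-zero case instead of dividing by zero.
        result.append(numerator / denominator if denominator != 0 else 0.0)
    return result


# Usage: adstock first, then saturate, mirroring the order in the diagram.
adstocked = geometric_adstock([100, 0, 0], theta=0.5)  # [100.0, 50.0, 25.0]
saturated = hill_transformation(adstocked, alpha=1.0, gamma=0.5)
```

The sketch assumes non-negative spend values; negative inputs combined with a fractional `alpha` would need separate handling.
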
classDiagram
    class RidgeModelBuilder {
        +mmm_data: MMMData
        +holiday_data: HolidaysData
        +calibration_input: CalibrationInput
        +hyperparameters: Hyperparameters
        +featurized_mmm_data: FeaturizedMMMData
        +__init__(mmm_data, holiday_data, calibration_input, hyperparameters, featurized_mmm_data)
        +build_models(trials_config: TrialsConfig, dt_hyper_fixed: Optional[Dict], ts_validation: bool, add_penalty_factor: bool, seed: int, rssd_zero_penalty: bool, objective_weights: Optional[List[float]], nevergrad_algo: NevergradAlgorithm, intercept: bool, intercept_sign: str, cores: int) : ModelOutputs
        -_model_train(hyper_collect: Dict, trials_config: TrialsConfig, intercept_sign: str, intercept: bool, nevergrad_algo: NevergradAlgorithm, dt_hyper_fixed: Optional[Dict], ts_validation: bool, add_penalty_factor: bool, rssd_zero_penalty: bool, objective_weights: Optional[List[float]], seed: int, cores: int) : List[Trial]
        -_run_nevergrad_optimization(hyper_collect: Dict, iterations: int, cores: int, nevergrad_algo: NevergradAlgorithm, intercept: bool, intercept_sign: str, ts_validation: bool, add_penalty_factor: bool, objective_weights: Optional[List[float]], dt_hyper_fixed: Optional[Dict], rssd_zero_penalty: bool, trial: int, seed: int, total_trials: int) : Trial
        -_prepare_data(params: Dict[str, float]) : Tuple[pd.DataFrame, pd.Series]
        -_geometric_adstock(x: pd.Series, theta: float) : pd.Series
        -_hill_transformation(x: pd.Series, alpha: float, gamma: float) : pd.Series
        -_calculate_rssd(coefs: np.ndarray, rssd_zero_penalty: bool) : float
        -_calculate_mape(model: Ridge) : float
        -_evaluate_model(params: Dict[str, float], ts_validation: bool, add_penalty_factor: bool, rssd_zero_penalty: bool, objective_weights: Optional[List[float]]) : Tuple[float, float, float, float, Optional[pd.DataFrame], Optional[pd.DataFrame], pd.DataFrame, float, float, float, float, float, float, float, int]
        -_lambda_seq(x: np.ndarray, y: np.ndarray, seq_len: int, lambda_min_ratio: float) : np.ndarray
        -_select_best_model(output_models: List[Trial]) : str
        -_calculate_convergence(output_models: List[Trial]) : Dict[str, Any]
        -_create_moo_distrb_plot(nrmse_values: List[float], decomp_rssd_values: List[float]) : str
        -_create_moo_cloud_plot(nrmse_values: List[float], decomp_rssd_values: List[float]) : str
        -_create_ts_validation_plot(output_models: List[Trial]) : str
    }

    class ModelExecutor {
        +mmm_data: MMMData
        +holiday_data: HolidaysData
        +calibration_input: CalibrationInput
        +hyperparameters: Hyperparameters
        +featurized_mmm_data: FeaturizedMMMData
        +__init__(mmm_data, holiday_data, calibration_input, hyperparameters, featurized_mmm_data)
        +model_run(dt_hyper_fixed: Optional[Dict], ts_validation: bool, add_penalty_factor: bool, refresh: bool, seed: int, cores: int, trials_config: Optional[TrialsConfig], rssd_zero_penalty: bool, objective_weights: Optional[Dict[str, float]], nevergrad_algo: NevergradAlgorithm, intercept: bool, intercept_sign: str, outputs: bool, model_name: Models, lambda_control: Optional[float]) : ModelOutputs
        -_validate_input() : None
        -_prepare_hyperparameters(dt_hyper_fixed: Optional[Dict], add_penalty_factor: bool, ts_validation: bool) : Dict[str, Any]
        -_setup_nevergrad_optimizer(hyperparameters: Dict[str, Any], iterations: int, cores: int, nevergrad_algo: NevergradAlgorithm) : ng.optimizers.base.Optimizer
        -_calculate_objective(train_score: float, test_score: Optional[float], rssd: float, objective_weights: Optional[Dict[str, float]]) : float
        -_apply_transformations(channel: str, hyperparameters: Dict[str, Any]) : np.ndarray
        -_prepare_model_data(hyperparameters: Dict[str, Any]) : np.ndarray
        -_generate_additional_outputs(model_outputs: ModelOutputs) : Dict[str, Any]
    }

    class Trial {
        +result_hyp_param: pd.DataFrame
        +lift_calibration: Optional[pd.DataFrame]
        +decomp_spend_dist: Optional[pd.DataFrame]
        +nrmse: float
        +decomp_rssd: float
        +mape: float
        +x_decomp_agg: pd.DataFrame
        +rsq_train: float
        +rsq_val: float
        +rsq_test: float
        +lambda_: float
        +lambda_hp: float
        +lambda_max: float
        +lambda_min_ratio: float
        +pos: int
        +elapsed: float
        +elapsed_accum: float
        +sol_id: str
        +trial: int
        +iter_ng: int
        +iter_par: int
        +train_size: float
    }

    class ModelOutputs {
        +trials: List[Trial]
        +train_timestamp: str
        +cores: int
        +iterations: int
        +intercept: bool
        +intercept_sign: str
        +nevergrad_algo: str
        +ts_validation: bool
        +add_penalty_factor: bool
        +hyper_updated: Dict[str, Any]
        +hyper_fixed: bool
        +convergence: Dict[str, Any]
        +ts_validation_plot: Any
        +select_id: str
        +seed: int
        +hyper_bound_ng: Dict[str, Any]
        +hyper_bound_fixed: Dict[str, Any]
    }

    class TrialsConfig {
        +trials: int
        +iterations: int
    }

    class NevergradAlgorithm {
        <<enumeration>>
        DE
        TWO_POINTS_DE
        ONE_PLUS_ONE
        DOUBLE_FAST_GA_DISCRETE_ONE_PLUS_ONE
        DISCRETE_ONE_PLUS_ONE
        PORTFOLIO_DISCRETE_ONE_PLUS_ONE
        NAIVE_TBPSA
        CGA
        RANDOM_SEARCH
    }

    class Models {
        <<enumeration>>
        RIDGE
    }

    RidgeModelBuilder --> ModelOutputs : produces
    RidgeModelBuilder --> Trial : creates
    ModelExecutor --> RidgeModelBuilder : uses
    ModelExecutor --> ModelOutputs : produces
    ModelOutputs --> "*" Trial : contains
    RidgeModelBuilder ..> TrialsConfig : uses
    RidgeModelBuilder ..> NevergradAlgorithm : uses
    ModelExecutor ..> TrialsConfig : uses
    ModelExecutor ..> NevergradAlgorithm : uses
    ModelExecutor ..> Models : uses
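
The small configuration types in the class diagram map naturally onto Python's `dataclasses` and `enum` modules. The sketch below is one way they might be declared and is not taken from the merged code. The member names come from the diagram; the string values (chosen to resemble Nevergrad optimizer names), the `frozen=True` setting, and the validation in `__post_init__` are assumptions added for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class NevergradAlgorithm(Enum):
    # Member names follow the diagram; the string values are assumed.
    DE = "DE"
    TWO_POINTS_DE = "TwoPointsDE"
    ONE_PLUS_ONE = "OnePlusOne"
    DOUBLE_FAST_GA_DISCRETE_ONE_PLUS_ONE = "DoubleFastGADiscreteOnePlusOne"
    DISCRETE_ONE_PLUS_ONE = "DiscreteOnePlusOne"
    PORTFOLIO_DISCRETE_ONE_PLUS_ONE = "PortfolioDiscreteOnePlusOne"
    NAIVE_TBPSA = "NaiveTBPSA"
    CGA = "cGA"
    RANDOM_SEARCH = "RandomSearch"


class Models(Enum):
    RIDGE = "ridge"  # assumed value; the diagram only names the member


@dataclass(frozen=True)
class TrialsConfig:
    trials: int       # number of independent optimization trials
    iterations: int   # optimizer iterations per trial

    def __post_init__(self) -> None:
        # Validation is an illustrative addition, not part of the diagram.
        if self.trials < 1 or self.iterations < 1:
            raise ValueError("trials and iterations must both be >= 1")


# Usage: the total optimizer budget is trials * iterations.
config = TrialsConfig(trials=5, iterations=2000)
total_evaluations = config.trials * config.iterations
```

Freezing `TrialsConfig` keeps a run's budget from being mutated after it has been passed to `build_models()`; whether the real class does this is not stated in the PR.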
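
`_lambda_seq(x, y, seq_len, lambda_min_ratio)` is marked as a direct mapping of the R `lambda_seq()` helper. As a rough illustration of what such a helper typically does, the sketch below builds a log-spaced sequence of ridge penalties running from `lambda_max` down to `lambda_max * lambda_min_ratio`. The way `lambda_max` is derived here (largest absolute dot product between standardized columns and `y`, divided by `0.001 * n`) is an assumption modeled on the R version and may differ from the merged Python code. Nested lists replace `np.ndarray` to keep the example dependency-free, and the default argument values are assumed.

```python
import math
from typing import List, Sequence


def lambda_seq(
    x: Sequence[Sequence[float]],
    y: Sequence[float],
    seq_len: int = 100,
    lambda_min_ratio: float = 1e-4,
) -> List[float]:
    """Return `seq_len` penalties, log-spaced from lambda_max downwards."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    if seq_len < 2:
        raise ValueError("seq_len must be at least 2")

    max_dot = 0.0
    for j in range(len(x[0])):
        column = [row[j] for row in x]
        mean = sum(column) / n
        # Population standard deviation (divide by n, not n - 1).
        sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
        if sd == 0.0:
            continue  # a constant column standardizes to all zeros
        dot = sum((v - mean) / sd * target for v, target in zip(column, y))
        max_dot = max(max_dot, abs(dot))
    if max_dot == 0.0:
        raise ValueError("cannot derive lambda_max: no column varies with y")

    lambda_max = max_dot / (0.001 * n)  # 0.001 scaling is an assumption
    log_hi = math.log(lambda_max)
    log_lo = math.log(lambda_max * lambda_min_ratio)
    step = (log_lo - log_hi) / (seq_len - 1)
    return [math.exp(log_hi + i * step) for i in range(seq_len)]


# Usage: two predictors, four observations, ten candidate penalties.
lambdas = lambda_seq([[1, 2], [2, 1], [3, 5], [4, 3]], [1, 2, 3, 4], seq_len=10)
```

Log spacing gives finer resolution among small penalties, where ridge coefficients change fastest.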

@facebook-github-bot added the CLA Signed label Oct 2, 2024
@alxlyj marked this pull request as ready for review October 2, 2024 00:12
@alxlyj merged commit e0458e7 into robynpy_release Oct 2, 2024
3 checks passed
@alxlyj deleted the review_ready_impl_modeling_new branch October 2, 2024 21:19
4 participants