Column mean_exposure doesn't exist." #1191

Open
mahaja1 opened this issue Dec 19, 2024 · 9 comments

mahaja1 commented Dec 19, 2024

Hey, I came across this issue in Robyn and it's not going away:

Run all trials and iterations. Use ?robyn_run to check parameter definition

OutputModels <- robyn_run(
  InputCollect = InputCollect, # feed in all model specification
  cores = NULL, # NULL defaults to (max available - 1)
  iterations = 10000, # 2000 recommended for the dummy dataset with no calibration
  trials = 8, # 5 recommended for the dummy dataset
  ts_validation = TRUE, # 3-way-split time series for NRMSE validation.
  add_penalty_factor = FALSE # Experimental feature. Use with caution.
)
Input data has 42 weeks in total: 2023-08-14 to 2024-05-27
Initial model is built on rolling window of 42 week: 2023-08-14 to 2024-05-27
Time-series validation with train_size range of 50%-80% of the data...
Using geometric adstocking with 14 hyperparameters (14 to iterate + 0 fixed) on 15 cores

Starting 8 trials with 10000 iterations each using TwoPointsDE nevergrad algorithm...
Running trial 1 of 8
| | 0%Timing stopped at: 0.73 0.11 0.86
Error in { : task 1 failed - "Can't select columns that don't exist.
✖ Column mean_exposure doesn't exist."

I'd really appreciate it if anyone could help me out with this one.

mahaja1 (Author) commented Dec 19, 2024

dt_input

A tibble: 65 × 6

DATE Call RESPONSIVE_SEARCH VIDEO_TRUEVIEW_IN_STREAM FB Conversion

1 2023-09-18 00:00:00 0 818. 75.0 501. 220
2 2023-09-25 00:00:00 0 805. 74.2 450. 198
3 2023-10-02 00:00:00 0 781. 70.2 500. 211
4 2023-10-09 00:00:00 0 802. 55.9 509. 194
5 2023-10-16 00:00:00 0.241 801. 69.6 483. 240
6 2023-10-23 00:00:00 2.87 797. 102. 462. 206
7 2023-10-30 00:00:00 18.9 713. 64.4 460. 224
8 2023-11-06 00:00:00 5.99 740. 82.3 510. 238
9 2023-11-13 00:00:00 2.86 818. 72.0 519. 193
10 2023-11-20 00:00:00 0 718. 81.3 481. 187

ℹ 55 more rows

ℹ Use print(n = ...) to see more rows

All sign control are now automatically provided: "positive" for media & organic variables and "default" for all others. User can still customise signs if necessary.

Documentation is available, access it anytime by running: ?robyn_inputs

InputCollect <- robyn_inputs(
  dt_input = dt_input,
  dt_holidays = dt_prophet_holidays,
  date_var = "DATE", # date format must be "2020-01-01"
  dep_var = "Conversion", # there should be only one dependent variable
  dep_var_type = "conversion", # "revenue" (ROI) or "conversion" (CPA)
  prophet_vars = c("trend", "season", "holiday"), # "trend", "season", "weekday" & "holiday"
  prophet_country = "US", # input country code. Check: dt_prophet_holidays
  paid_media_spends = c("Call", "RESPONSIVE_SEARCH", "VIDEO_TRUEVIEW_IN_STREAM", "FB"), # mandatory input
  paid_media_vars = c("Call", "RESPONSIVE_SEARCH", "VIDEO_TRUEVIEW_IN_STREAM", "FB"),
  # mandatory. paid_media_vars must have same order as paid_media_spends. Use media
  # exposure metrics like impressions, GRP etc. If not applicable, use spend instead.
  # context_vars = c("events"),
  # factor_vars = c("events"), # force variables in context_vars or organic_vars to be categorical
  # window_start = "2021-05-03",
  # window_end = "2023-12-25",
  adstock = "geometric" # geometric, weibull_cdf or weibull_pdf.
)
Warning message:
In check_datadim(dt_input, all_ind_vars, rel = 10) :
  There are 7 independent variables & 65 data points. We recommend row:column ratio of 10 to 1

print(InputCollect)
Total Observations: 65 (weeks)
Input Table Columns (6):
Date: DATE
Dependent: Conversion [conversion]
Paid Media: Call, RESPONSIVE_SEARCH, VIDEO_TRUEVIEW_IN_STREAM, FB
Paid Media Spend: Call, RESPONSIVE_SEARCH, VIDEO_TRUEVIEW_IN_STREAM, FB
Context:
Organic:
Prophet (Auto-generated): trend, season, holiday on US
Unused variables: None

Date Range: 2023-09-18:2024-12-09
Model Window: 2023-09-18:2024-12-09 (65 weeks)
With Calibration: FALSE
Custom parameters: None

Adstock: geometric
Hyper-parameters: Not set yet

Default media variable for modelling has changed from paid_media_vars to paid_media_spends. Also, calibration_input are required to be spend names. Hyperparameter names are based on paid_media_spends names too. See right hyperparameter names:

hyper_names(adstock = InputCollect$adstock, all_media = InputCollect$all_media)
[1] "Call_alphas" "Call_gammas" "Call_thetas"
[4] "FB_alphas" "FB_gammas" "FB_thetas"
[7] "RESPONSIVE_SEARCH_alphas" "RESPONSIVE_SEARCH_gammas" "RESPONSIVE_SEARCH_thetas"
[10] "VIDEO_TRUEVIEW_IN_STREAM_alphas" "VIDEO_TRUEVIEW_IN_STREAM_gammas" "VIDEO_TRUEVIEW_IN_STREAM_thetas"

1. IMPORTANT: set plot = TRUE to create example plots for adstock & saturation hyperparameters and their influence in curve transformation.

plot_adstock(plot = FALSE)
plot_saturation(plot = FALSE)

4. Set individual hyperparameter bounds. They either contain two values e.g. c(0, 0.5), or only one value, in which case you'd "fix" that hyperparameter.

Run hyper_limits() to check maximum upper and lower bounds by range

hyper_limits()
thetas alphas gammas shapes scales
1 >=0 >0 >0 >=0 >=0
2 <1 <10 <=1 <20 <=1
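
As an aside on the note above about fixing a hyperparameter with a single value, here is a minimal hedged sketch (the 0.2 below is an arbitrary illustrative value, not taken from this thread):

# Ranges are searched by Nevergrad; a single value fixes the hyperparameter.
hyperparameters_example <- list(
  FB_alphas = c(0.5, 3), # range: searched within these bounds
  FB_gammas = c(0.3, 1), # range
  FB_thetas = 0.2        # single value: this hyperparameter is fixed
)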

Example hyperparameters ranges for Geometric adstock

hyperparameters <- list(
  FB_alphas = c(0.5, 3),
  FB_gammas = c(0.3, 1),
  FB_thetas = c(0, 0.3),
  Call_alphas = c(0.5, 3),
  Call_gammas = c(0.3, 1),
  Call_thetas = c(0, 0.3),
  RESPONSIVE_SEARCH_alphas = c(0.5, 3),
  RESPONSIVE_SEARCH_gammas = c(0.3, 1),
  RESPONSIVE_SEARCH_thetas = c(0, 0.3),
  VIDEO_TRUEVIEW_IN_STREAM_alphas = c(0.5, 3),
  VIDEO_TRUEVIEW_IN_STREAM_gammas = c(0.3, 1),
  VIDEO_TRUEVIEW_IN_STREAM_thetas = c(0, 0.3)
)

InputCollect <- robyn_inputs(InputCollect = InputCollect, hyperparameters = hyperparameters)

Running feature engineering...
Warning message:
In check_hyperparameters(InputCollect$hyperparameters, InputCollect$adstock, :
Automatically added missing hyperparameter range: 'train_size' = c(0.5, 0.8)
print(InputCollect)
Total Observations: 65 (weeks)
Input Table Columns (6):
Date: DATE
Dependent: Conversion [conversion]
Paid Media: Call, RESPONSIVE_SEARCH, VIDEO_TRUEVIEW_IN_STREAM, FB
Paid Media Spend: Call, RESPONSIVE_SEARCH, VIDEO_TRUEVIEW_IN_STREAM, FB
Context:
Organic:
Prophet (Auto-generated): trend, season, holiday on US
Unused variables: None

Date Range: 2023-09-18:2024-12-09
Model Window: 2023-09-18:2024-12-09 (65 weeks)
With Calibration: FALSE
Custom parameters: None

Adstock: geometric
Hyper-parameters ranges:
FB_alphas: [0.5, 3]
FB_gammas: [0.3, 1]
FB_thetas: [0, 0.3]
Call_alphas: [0.5, 3]
Call_gammas: [0.3, 1]
Call_thetas: [0, 0.3]
RESPONSIVE_SEARCH_alphas: [0.5, 3]
RESPONSIVE_SEARCH_gammas: [0.3, 1]
RESPONSIVE_SEARCH_thetas: [0, 0.3]
VIDEO_TRUEVIEW_IN_STREAM_alphas: [0.5, 3]
VIDEO_TRUEVIEW_IN_STREAM_gammas: [0.3, 1]
VIDEO_TRUEVIEW_IN_STREAM_thetas: [0, 0.3]
train_size: [0.5, 0.8]

Check spend exposure fit if available

if (length(InputCollect$exposure_vars) > 0) {
  lapply(InputCollect$modNLS$plots, plot)
}
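
Since the failing column is mean_exposure, it may also help to look at what these exposure-related fields contain (field names are taken from the snippet above; with paid_media_vars identical to paid_media_spends they would normally be empty):

# Quick check of the spend-exposure fields referenced above.
InputCollect$exposure_vars
str(InputCollect$modNLS, max.level = 1)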

Run all trials and iterations. Use ?robyn_run to check parameter definition

OutputModels <- robyn_run(
  InputCollect = InputCollect, # feed in all model specification
  cores = NULL, # NULL defaults to (max available - 1)
  iterations = 5000, # 2000 recommended for the dummy dataset with no calibration
  trials = 5, # 5 recommended for the dummy dataset
  ts_validation = TRUE, # 3-way-split time series for NRMSE validation.
  add_penalty_factor = FALSE # Experimental feature. Use with caution.
)
Input data has 65 weeks in total: 2023-09-18 to 2024-12-09
Initial model is built on rolling window of 65 week: 2023-09-18 to 2024-12-09
Time-series validation with train_size range of 50%-80% of the data...
Using geometric adstocking with 14 hyperparameters (14 to iterate + 0 fixed) on 15 cores

Starting 5 trials with 5000 iterations each using TwoPointsDE nevergrad algorithm...
Running trial 1 of 5
| | 0%Timing stopped at: 0.88 0.05 0.94
Error in { : task 1 failed - "Can't select columns that don't exist.
✖ Column mean_exposure doesn't exist."

The code was running fine for me before but started giving me problems today. Please help me with this issue @gufengzhou.

mahaja1 (Author) commented Dec 19, 2024

I found the error! You now need an organic variable column in the input data as well for the model to run. This was not the case before; kindly look into it.
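
For reference, a minimal sketch of that workaround, assuming dt_input gains some organic column (the newsletter name below is a placeholder, not a column from this issue; the rest mirrors the robyn_inputs() call earlier in the thread):

# Hypothetical workaround: declare an existing organic column so the run succeeds.
InputCollect <- robyn_inputs(
  dt_input = dt_input,
  dt_holidays = dt_prophet_holidays,
  date_var = "DATE",
  dep_var = "Conversion",
  dep_var_type = "conversion",
  prophet_vars = c("trend", "season", "holiday"),
  prophet_country = "US",
  paid_media_spends = c("Call", "RESPONSIVE_SEARCH", "VIDEO_TRUEVIEW_IN_STREAM", "FB"),
  paid_media_vars = c("Call", "RESPONSIVE_SEARCH", "VIDEO_TRUEVIEW_IN_STREAM", "FB"),
  organic_vars = c("newsletter"), # placeholder organic column
  adstock = "geometric"
)

Note that any organic variable also needs its own alphas/gammas/thetas entries in the hyperparameters list.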

gufengzhou (Contributor) commented:

Thanks! Will look into it. @laresbernardo would you have time for this one?

gufengzhou (Contributor) commented:

@mahaja1 I can't reproduce your error. I can run the demo without organic using the latest version.

gufengzhou self-assigned this Dec 20, 2024

mahaja1 (Author) commented Dec 20, 2024

I am still getting it. What options do I have? When I ran the model using the traffic column, it assigned a large contribution value to traffic. I want to run it without any organic variable.

mahaja1 (Author) commented Dec 20, 2024

I really need help with this. Please look into it.

gufengzhou (Contributor) commented:

Which version are you using? You can check by running packageVersion("Robyn"). Your copy-pasted message above is quite messy. I might need your dataset as well as your robyn_inputs() config to debug.
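
For anyone following along, checking the installed version and updating to the latest GitHub build would look roughly like this (the install path below is the one documented in the Robyn repository, to the best of my knowledge; verify before running):

# Check the installed Robyn version
packageVersion("Robyn")

# Update to the latest development version from GitHub
# install.packages("remotes") first if needed
remotes::install_github("facebookexperimental/Robyn/R")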

laresbernardo (Collaborator) commented Dec 23, 2024

I wasn't able to reproduce the error either. Potentially changing a hard select(...) to select(any_of(...)) could fix it, if that's the case @gufengzhou. There are only 4 occurrences of mean_exposure in the code.
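
For context, a small standalone dplyr example of the difference being suggested (the data frame is made up for illustration):

library(dplyr)

df <- tibble(spend = c(100, 200), response = c(10, 20)) # no mean_exposure column

# A hard select() errors when a requested column is absent:
# df %>% select(spend, mean_exposure)
#> Error: Can't select columns that don't exist. Column `mean_exposure` doesn't exist.

# select(any_of()) silently skips names that are not present:
df %>% select(any_of(c("spend", "mean_exposure")))
#> returns a tibble with just the spend column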

dhpz-11 commented Jan 12, 2025

Hi team,

I'm getting the same error as well. I've checked some possible solutions with Gemini, but they didn't work either.

Here's the code I'm using:

# Parameters:

InputCollect <- robyn_inputs(
  dt_input = mmm,
  date_var = "Date",
  dep_var = "Sales",
  dep_var_type = "revenue",
  paid_media_spends = c("TikTok", "Facebook", "Google"),
  paid_media_vars = c("TikTok", "Facebook", "Google"),
  media_vars = NULL, # This was a recommendation from Gemini, in order to solve the error from the robyn output
  window_start = "2018-01-07", # mandatory
  window_end = "2021-10-31", # mandatory
  adstock = "geometric" # mandatory
)
print(InputCollect)

#Hyperparameters:

hyper_names(adstock = InputCollect$adstock, all_media = InputCollect$all_media)
hyperparameters <- list(
Facebook_alphas = c(0.5, 3),
Facebook_gammas = c(0.3, 1),
Facebook_thetas = c(0, 0.4),
Google_alphas = c(0.5, 3),
Google_gammas = c(0.3, 1),
Google_thetas = c(0, 0.4),
TikTok_alphas = c(0.5, 3),
TikTok_gammas = c(0.3, 1),
TikTok_thetas = c(0, 0.4),
train_size = c(0.5, 0.8)
)
InputCollect <- robyn_inputs(InputCollect = InputCollect, hyperparameters = hyperparameters)
print(InputCollect)

#Robyn Outputs:

Outputmodels <- robyn_run(
  InputCollect = InputCollect,
  cores = NULL,
  iterations = 200,
  trials = 5,
  ts_validation = TRUE,
  add_penalty_factor = FALSE
)
print(Outputmodels)

This is the error I'm getting:

Input data has 200 weeks in total: 2018-01-07 to 2021-10-31
Initial model is built on rolling window of 200 week: 2018-01-07 to 2021-10-31
Time-series validation with train_size range of 50%-80% of the data...
Using geometric adstocking with 11 hyperparameters (11 to iterate + 0 fixed) on 7 cores

Starting 5 trials with 2000 iterations each using TwoPointsDE nevergrad algorithm...
Running trial 1 of 5
| | 0%Timing stopped at: 0.425 0.533 0.25
Error in { : task 1 failed - "Can't select columns that don't exist.
✖ Column mean_exposure doesn't exist."

mmm.xlsx
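
For anyone trying to reproduce this, a rough sketch of how the attached workbook might be loaded (column names are taken from the config above; the sheet layout itself is an assumption):

library(readxl)

# Assumed layout: one sheet with Date, Sales, TikTok, Facebook, Google columns
mmm <- read_excel("mmm.xlsx")
mmm$Date <- as.Date(mmm$Date) # robyn_inputs expects a proper date column, e.g. "2018-01-07"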
