
Error in summary.connection(connection) : invalid connection #534

Closed
cynthia10wang opened this issue Nov 14, 2022 · 24 comments

@cynthia10wang (Author)

Project Robyn

I'm getting this error when running the robyn_outputs() step:

Running Pareto calculations for 10000 models on 3 fronts...
Error in summary.connection(connection) : invalid connection

My dataset has about 300 rows and 40 variables, and I'm running 2000 iterations * 5 trials. On the old 3.6.x version of Robyn everything ran perfectly fine. I looked at previous issues and tried manually setting cores to 1 or 7 (the max is 8), but it still doesn't work.
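For context, the cores setting mentioned above is an argument of robyn_run(); here is a minimal sketch following the public Robyn demo (InputCollect and the exact values shown are illustrative, not taken from this report):

```r
# Sketch of the run whose output later fails in robyn_outputs()
OutputModels <- robyn_run(
  InputCollect = InputCollect, # built earlier with robyn_inputs()
  iterations = 2000,
  trials = 5,
  cores = 7 # tried 1 and 7 here; this machine's max is 8
)
```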

Environment & Robyn version

  • Robyn version: 3.7.2
  • R version: 4.2.2
  • OS: Windows
@gufengzhou (Contributor) commented Nov 15, 2022

Can you please share your sessionInfo()$R.version? Also, can you try installing this commit version with remotes::install_github("facebookexperimental/Robyn/R", ref = "80dbaa0018422bde2a65826398eeea364ed28d33"), then restart, reload the library, and retry to see if the error still occurs?

Updated: fixed quote in this reply.

@cynthia10wang (Author) commented Nov 15, 2022

$platform
[1] "x86_64-w64-mingw32"

$arch
[1] "x86_64"

$os
[1] "mingw32"

$crt
[1] "ucrt"

$system
[1] "x86_64, mingw32"

$status
[1] ""

$major
[1] "4"

$minor
[1] "2.2"

$year
[1] "2022"

$month
[1] "10"

$day
[1] "31"

$`svn rev`
[1] "83211"

$language
[1] "R"

$version.string
[1] "R version 4.2.2 (2022-10-31 ucrt)"

$nickname
[1] "Innocent and Trusting"

BTW, I can't install the commit version due to an "API rate limit exceeded" error, probably because of our corporate firewall.
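A common workaround for the "API rate limit exceeded" error is to authenticate the GitHub API calls: remotes reads a personal access token from the GITHUB_PAT environment variable. A sketch (the token value is a placeholder):

```r
# Create a token at https://github.com/settings/tokens and expose it to remotes
Sys.setenv(GITHUB_PAT = "ghp_XXXXXXXXXXXXXXXX") # placeholder, not a real token
remotes::install_github(
  "facebookexperimental/Robyn/R",
  ref = "80dbaa0018422bde2a65826398eeea364ed28d33"
)
```

Note this only addresses rate limiting; it won't help if a corporate proxy blocks github.com outright.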

@Alopes-data commented Nov 15, 2022

I was having a similar issue earlier today.
I ran the code successfully a few times this morning, but sometime mid-day I started getting the same connection error at the same point, when I switched from Geometric to Weibull-PDF adstock. After the initial error, the connection error kept occurring even when I switched back to Geometric models.

I restarted my R session and ran the code provided above by the contributor:
remotes::install_github("facebookexperimental/Robyn/R", ref = "80dbaa0018422bde2a65826398eeea364ed28d33")

  • Note: I had to retype the last quotation mark.

This required me to update my plyr package from version 1.8.7 to 1.8.8. After running it, I was able to run a Geometric model where before it would fail at the first trial with the connection error Cynthia received.
I also confirmed I can fit a Weibull model after running the above code, which updated the plyr package.

Here is my R session information:

$platform
[1] "x86_64-w64-mingw32"

$arch
[1] "x86_64"

$os
[1] "mingw32"

$crt
[1] "ucrt"

$system
[1] "x86_64, mingw32"

$status
[1] ""

$major
[1] "4"

$minor
[1] "2.1"

$year
[1] "2022"

$month
[1] "06"

$day
[1] "23"

$`svn rev`
[1] "82513"

$language
[1] "R"

$version.string
[1] "R version 4.2.1 (2022-06-23 ucrt)"

$nickname
[1] "Funny-Looking Kid"

@laresbernardo (Collaborator)

That's really useful information, @Alopes-data.
Just checked: the plyr package was updated to 1.8.8 on CRAN a couple of days ago (2022-11-11). What's weird is that the plyr changelog shows only check fixes (no new features, bug fixes, or changes of default behavior) for almost 7 years.

@cynthia10wang (Author)

Mine is still having issues after upgrading plyr, and it's definitely running much slower than before, when there were no issues. It takes 30 minutes for the CSV export and then fails when plotting the one-pagers.

One thing I noticed is that my OutputModels from robyn_run is extremely large. My latest run with 3 trials on 2000 iterations generates a 3.2 GB OutputModels, whereas colleagues running a similar dataset with 5000 iterations and 6 trials get less than 400 MB...
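When comparing sizes like this, the in-memory footprint can be measured directly with base R's object.size(); a sketch assuming OutputModels is the object returned by robyn_run():

```r
# Total size of the returned object
format(object.size(OutputModels), units = "GB")

# Rough per-element breakdown to see which components dominate
sizes <- sapply(OutputModels, object.size)
head(sort(sizes, decreasing = TRUE))
```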

@gufengzhou (Contributor)

It's quite puzzling that your file size differs from your colleagues'. What machines / operating systems are you and your colleagues using?

@cynthia10wang (Author)

@gufengzhou We're all on Windows. I compared the OutputModels objects, and it seems mine is much larger due to vec_collect, which is in the current version's robyn_run() function but not in the previous version. It creates additional xDecompVec etc. variables in trial[i] and also a vec_collect item of its own.

@laresbernardo self-assigned this Nov 21, 2022
@laresbernardo (Collaborator)

Hello! Can you both update to the latest dev version and check how it goes? I've reduced the size of OutputModels by aggregating results into a single data.frame, thus returning a ~50% lighter object.

@cynthia10wang (Author) commented Nov 22, 2022

@laresbernardo Hi, I re-ran the demo datasets on 2000 iterations * 5 trials and compared the size of OutputModels across 3 recent packages:

  • 3.8.1: 538.1 MB
  • dev version you posted above: 1 GB
  • 3.8.2: 563.1 MB (please note I was trying to install 3.7.2 using remotes::install_github("facebookexperimental/Robyn/R", version='3.7.2'), but for some reason it installed 3.8.2 instead). Compared to 3.7.2 it still has all those additional xDecompVec items in each trial, although vec_collect is gone, so the size is down to half of what I saw earlier.

[Screenshot attached (2022-11-22): OutputModels object structure in 3.8.2]

@laresbernardo (Collaborator)

Hi @cynthia10wang, can you run it once more with the latest version (3.8.2, just released)? We got rid of that vec_collect list and now calculate the decomp values from bottom to top, so no memory issues should occur. Sorry for all these changes, but we believe this latest iteration will be the final one going forward.

@cynthia10wang (Author) commented Nov 22, 2022

Yes, I checked the version before running those 3 comparisons, and the last one is 3.8.2.
My point is that it's still significantly larger than the 3.7.x versions due to xDecompVec... and I'd like to revert to 3.7.2 so I can measure the exact size of the same OutputModels under 3.7.2 (it could be 10x smaller, based on my experience with another laptop), but I can't seem to install 3.7.2 now.

Even after uninstalling and reinstalling with remotes::install_github("facebookexperimental/Robyn/R", version='3.7.2'), it still installs 3.8.2 for me.

@laresbernardo (Collaborator)

To downgrade: remotes::install_github("facebookexperimental/Robyn/R@v3.7.2")
But I'd recommend you keep using the latest version so you can leverage all the new features and bug fixes. If you actually ran the latest version I deployed a couple of hours ago, you shouldn't have the vec_collect list, so the size of your output won't be as large as you mention. Please check again, and remember that after changing versions you must restart your session.
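A full downgrade-and-verify sequence might look like the following sketch (assuming the release is tagged v3.7.2; restart R between the install and the library call):

```r
remotes::install_github("facebookexperimental/Robyn/R@v3.7.2")
# ...restart the R session here, then:
library(Robyn)
packageVersion("Robyn") # confirm the downgrade took effect
```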

@cynthia10wang (Author) commented Nov 22, 2022

Sorry, I just realized I dropped two words in the previous comment where I posted the screenshot, so it might be confusing.

  1. Yes, in 3.8.2 vec_collect is gone, so the size of OutputModels is down to half compared with 3.8.1 or the 3.8.1 dev version.
  2. However, it's still considerably larger than 3.7.2, because although vec_collect is gone, xDecompVec, xDecompVecImmediate, and xDecompVecCarryOver are still kept in each trial, while 3.7.2 doesn't keep any of these in OutputModels. I successfully downgraded to 3.7.2 and, running the demo with everything else the same, the size of OutputModels is only 37.9 MB, compared with 563.1 MB from 3.8.2.

Attached is the OutputModels screenshot from 3.7.2; you can see the differences compared with the previous screenshot I attached from 3.8.2.

[Screenshot attached (2022-11-22): OutputModels object structure in 3.7.2]

Unless the three (xDecompVec, xDecompVecImmediate, xDecompVecCarryOver) are kept for further use, this could have a huge impact with larger datasets, higher iteration counts, etc.

@laresbernardo (Collaborator)

YES! You're absolutely right; I missed that redundancy. I've just merged all 3 results (xDecompVec, xDecompVecImmediate, xDecompVecCarryOver) into a single data.frame (xDecompVec) in the OutputModels object. Doing this reduced the results from 563 MB to 330 MB (a 41% reduction), following the demo (5 trials, 2000 iterations) as you did.

We are already delivering ~30% of the 3.8.1 OutputModels size, and there are probably a few other things we can do to keep reducing it, but not much more without having to re-calculate these results.

@laresbernardo (Collaborator) commented Nov 23, 2022

The last commit reduced the OutputModels size to 234 MB (using the same demo example), so we are now ~80% smaller than the 3.8.1 version! Is that good enough for you to adopt the new version, which is still roughly 6x larger than 3.7.2, given we've split immediate and carryover effects for calibrating models and plotting the carryover % of effect in the one-pagers?

@cynthia10wang (Author)

Hi, some of our models are at least 300-400 MB under 3.7.2, upgrading to 3.8.1 means roughly a 1.5-2 GB difference, and we found it sometimes causes issues like RStudio crashing and parallel plotting errors (using one fewer core still doesn't help).
Is there a way to turn this off if we're not doing calibration?

@laresbernardo (Collaborator) commented Nov 23, 2022

@cynthia10wang can you please update to the latest 3.8.2 version and see how it goes? It will be larger than the results you got with 3.7.2, but that's because we are splitting immediate and carryover effects for each model, not only to calibrate against one of them but also to provide the % of carryover / immediate response for each media variable (paid and non-paid). That's something we can't turn off, because we strongly believe this information helps users pick the model that best describes their marketing strategy and understand the mid-term effect of their marketing efforts.

@gufengzhou (Contributor)

> Hi, some of our models are at least 300-400 MB under 3.7.2, upgrading to 3.8.1 means roughly a 1.5-2 GB difference, and we found it sometimes causes issues like RStudio crashing and parallel plotting errors (using one fewer core still doesn't help). Is there a way to turn this off if we're not doing calibration?

@cynthia10wang We'll try to further reduce the object size in the future. Until then, one workaround: you could reduce sizes by running fewer iterations at a time, e.g. run 2000*5 to see how the hyperparameter ranges change at higher iterations, then narrow down the hyperparameter ranges and rerun 2000*5 (or even fewer) while watching for the metrics to converge. This way you avoid having to go for large iteration counts in one go. Let us know how it works for you.
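The iterative approach described above might be sketched as follows (argument names follow the public Robyn demo; the hyperparameter name and the narrowed range are purely illustrative):

```r
# Round 1: explore with wide hyperparameter ranges set via robyn_inputs()
OutputModels <- robyn_run(InputCollect = InputCollect,
                          iterations = 2000, trials = 5)

# Inspect where the optimizer converges, then narrow the ranges, e.g.:
hyperparameters$facebook_S_alphas <- c(1.5, 2.5) # illustrative narrowed range
InputCollect <- robyn_inputs(InputCollect = InputCollect,
                             hyperparameters = hyperparameters)

# Round 2: rerun with the narrowed ranges; repeat until metrics converge
OutputModels <- robyn_run(InputCollect = InputCollect,
                          iterations = 2000, trials = 5)
```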

@JJohnson-DA
I'm on version 3.9.0 on Windows 10 and am having this same issue: I run out of memory while trying to export the one-pagers. It gets through the first few and then fails. My OutputModels object is over 1.5 GB, with each trial's xDecompVec being 596,596 rows.

@laresbernardo (Collaborator)

Closing this task given it's been inactive for some months. Feel free to re-open.

@Luinehil
Hello, I have the same problem on Robyn 3.10.0.9000. I can't output models if I set clusters to TRUE. This is the error:

Running Pareto calculations for 10000 models on auto fronts...
Automatically selected 5 Pareto-fronts to contain at least 100 pareto-optimal models (111)
Calculating response curves for all models' variables (555)...
Error in summary.connection(connection) : invalid connection

@laresbernardo (Collaborator)

Hi @Luinehil, as mentioned in this thread, this is a memory issue. Did you try with fewer iterations or fewer trials? We are currently working on reducing the size of these objects so users don't hit memory problems this easily (10K models in your case). We'll update once we have a permanent, scalable solution.

@Luinehil
Hey @laresbernardo, I managed to get a clustered output on Robyn 3.9.0 with 2 trials and 2000 iterations, but I had to run it twice (I got the same error on the first run), and it took almost 99% of my memory (I'm on Windows 11 with 32 GB RAM).

@skonduri-sreeja
I get Error in summary.connection(connection) : invalid connection when I run the model with the train size changed to use 90% of the data; when I set train_size = 0.9, it errors out.
