Error in summary.connection(connection) : invalid connection #534
Comments
Can you please share your session info? Updated: fixed quote in this reply.
(R sessionInfo() output) Btw, I can't install the commit version due to "API rate limit exceeded", probably due to a corporate firewall.
I was having a similar issue earlier today. I restarted my R session and ran the code provided above by the contributor.
This required me to update my plyr package from version 1.8.7 to 1.8.8; after running it I was able to run a geometric model where before it would fail at the first trial and produce the connection error that Cynthia received. Here is my R session information: (R sessionInfo() output)
That's really useful information, @Alopes-data.
Mine is still having issues after upgrading plyr, and it's definitely running much slower than before, when there were no issues. It takes 30 minutes to export the CSVs and then it fails when plotting the one-pagers. One thing I noticed is that my OutputModels from robyn_run is extremely large: my latest run with 3 trials and 2000 iterations generates a 3.2 GB OutputModels, while colleagues running a similar dataset with 5000 iterations and 6 trials get less than 400 MB...
It's quite puzzling that you're getting a different file size than your colleagues. What machines / OS are you and your colleagues using?
@gufengzhou We're all on Windows. I compared the OutputModels objects, and it seems mine is much larger due to vec_collect, which is in the current version's robyn_run function but not in the previous version. It creates additional xDecompVec etc. variables in trial[i] and also a vec_collect item of its own.
Hello! Can you guys update to latest dev version and check how it goes? I've reduced the size of the |
@laresbernardo Hi, I re-ran the demo dataset with 2000 iterations * 5 trials and compared the size of OutputModels for 3 recent package versions: 3.8.1: 538.1 MB
Hi @cynthia10wang can you run it once more with the latest version (3.8.2, just released)? We got rid of that
Yes, I checked the version before running those 3 comparisons, and the last one was 3.8.2. After uninstalling and reinstalling with remotes::install_github("facebookexperimental/Robyn/R", version='3.7.2'), it still installs 3.8.2 for me.
To downgrade:
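A hedged sketch of one way to pin an older Robyn release with the remotes package; the tag name `v3.7.2` is an assumption about how the repository tags releases, so check the repo's Releases page for the exact name:

```r
# Install a specific tagged release of Robyn from GitHub.
# NOTE: the tag name "v3.7.2" is an assumption; verify it on the Releases page.
remotes::install_github("facebookexperimental/Robyn",
                        subdir = "R",
                        ref = "v3.7.2")
packageVersion("Robyn")  # confirm which version is now installed
```

Restart the R session after installing so the older version is the one loaded.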
Sorry, I just realized I missed typing two words in the previous comment where I posted the screenshot, so it might be confusing.
Attached is the OutputModels screenshot from 3.7.2; you can see the differences compared to the previous screenshot I attached from 3.8.2. Unless the three (xDecompVec, xDecompVecImmediate, xDecompVecCarryOver) are kept there for further use, this could have a huge impact on larger datasets / higher numbers of iterations etc.
YES! You're absolutely right, I missed that redundancy. I've just merged all 3 results (xDecompVec, xDecompVecImmediate, xDecompVecCarryOver) into a single data.frame (xDecompVec) in the OutputModels object. Doing this reduced the results from 563 MB to 330 MB (a 41% reduction), following the demo (5 trials, 2000 iterations) as you did. We are already delivering ~30% of the 3.8.1 OutputModels size, and there are probably a few other things we can do to keep reducing it, but not much more without having to re-calculate these results.
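The consolidation described could look roughly like this in base R; the function and column names here are illustrative, not Robyn's actual internals:

```r
# Illustrative only: collapse three per-model decomposition tables into one
# data.frame distinguished by a `type` column, instead of storing all three
# as separate (largely redundant) objects.
merge_decomp <- function(immediate, carryover, total) {
  immediate$type <- "immediate"
  carryover$type <- "carryover"
  total$type     <- "total"
  rbind(immediate, carryover, total)
}

imm <- data.frame(media = c("tv", "search"), effect = c(10, 5))
car <- data.frame(media = c("tv", "search"), effect = c(4, 1))
tot <- data.frame(media = c("tv", "search"), effect = c(14, 6))
merged <- merge_decomp(imm, car, tot)
nrow(merged)  # 6 rows: one object instead of three
```

Keeping one long table also makes it cheap to filter by `type` later when plotting immediate vs. carryover effects.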
The last commit reduced the OutputModels size to 234 MB (using the same demo example), so we are now ~80% smaller than the 3.8.1 version! Is that good enough for you to adopt this new version, which is still ~6 times larger than 3.7.2, given that we've split immediate and carryover effects for calibrating models and for plotting the carryover % of the effect in one-pagers?
Hi, some of our models are at least 300-400 MB under 3.7.2, so upgrading to 3.8.1 means about a 1.5-2 GB difference, and we found it sometimes causes issues like RStudio crashing and parallel plotting errors (using 1 fewer core still doesn't help).
@cynthia10wang can you please update to the latest 3.8.2 version and see how it goes? It will be larger than the results you had when running 3.7.2, but that's because we split immediate and carry-over effects for each model, not only to calibrate against one of them but also to provide the % of carryover / immediate response for each media variable (paid and not paid). That's something we can't turn off, because we strongly believe this information will help users pick the model that best describes their marketing strategy and understand the mid-term effect of their marketing efforts.
@cynthia10wang We'll try further reducing the object size in the future. Until then, one possible workaround: you could reduce the size by running fewer iterations each time, e.g. run 2000 iterations * 5 trials to see how the hyperparameter ranges evolve at higher iterations, then narrow down the hyperparameter ranges and rerun 2000 * 5 (or even less) if you can observe the metrics converging. This way you avoid having to go for a large number of iterations in one go. Let us know how it works for you.
I'm on version 3.9.0 on Windows 10 and am having this same issue: I run out of memory while trying to export the one-pagers. It gets through the first few and then fails. My OutputModels object is over 1.5 GB, with each trial's xDecompVec being 596,596 rows.
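To see which components dominate a large object like OutputModels, base R's object.size() can be applied element-wise; this generic snippet is just a diagnostic sketch, not part of Robyn:

```r
# Rank the top-level elements of a list by their approximate memory use,
# largest first, using only base R.
element_sizes <- function(x) {
  sort(vapply(x, function(e) as.numeric(object.size(e)), numeric(1)),
       decreasing = TRUE)
}

obj <- list(big = numeric(1e6), meta = letters)
element_sizes(obj)  # `big` dominates (~8e6 bytes)
```

Running something like `element_sizes(OutputModels$trials[[1]])` on your own output should show whether xDecompVec is the component blowing up the object.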
Closing this issue given it's been inactive for some months. Feel free to re-open.
Hello, I have the same problem on Robyn 3.10.0.9000. I can't output models if I set clusters to TRUE; this is the error:
Hi @Luinehil, as mentioned in this thread, this is a memory issue. Did you try with fewer iterations or fewer trials? We are currently working on a way to reduce the size of these objects so users don't hit memory problems this easily (10K models in your case). We'll update once we have a permanent, scalable solution.
Hey @laresbernardo I managed to get a clustered output on Robyn 3.9.0 with 2 trials of 2000 iterations, but I had to run it twice (I got the same error on the first run), and it took almost 99% of my memory (I'm on Windows 11 with 32 GB RAM).
I get Error in summary.connection(connection) : invalid connection when I run the model and want to change the train size to use 90% of the data. When I set train_size = 0.9, it errors out.
Project Robyn
I'm getting this error when running the robyn_outputs step.
My dataset has about 300 rows and 40 variables, and I'm running 2000 iterations * 5 trials. On the old 3.6.x version of Robyn everything was perfectly fine. I looked at previous issues and tried manually setting cores to 1 or 7 (as the max is 8), but it still doesn't work.
Environment & Robyn version