Skip set.seed() if NA_integer_ is given. #53
This should address mlampros#52, but currently it does not work because `NA_integer_` is implemented as the most negative integer value, and a negative seed causes the error "negative length vectors are not allowed". `base::set.seed()` accepts negative seeds without any problem, which suggests ClusterR is doing something wrong with negative seeds.
Sorry, my bad. I made a mistake in the initialization of
With this modification, it becomes possible to call `set.seed(1L)` followed by `KMeans_arma(iris[,-5], 3L, seed = NA_integer_)`, which generates exactly the same result as `KMeans_arma(iris[,-5], 3L, seed = 1L)`. It is also possible to do the same thing at the Rcpp level.
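The idea described above can be sketched in plain C++. This is only an illustration, not the code from this PR: `kNaInteger` is a stand-in for R's `NA_INTEGER` (which R defines as `INT_MIN`), and `FakeRng` and `maybe_seed` are hypothetical names introduced here so the sketch compiles without R or Rcpp headers.

```cpp
#include <climits>

// Stand-in for R's NA_INTEGER, which is the smallest 32-bit int (INT_MIN).
constexpr int kNaInteger = INT_MIN;

// Hypothetical RNG wrapper that records how often it has been reseeded.
struct FakeRng {
    int state = 0;
    int reseed_count = 0;

    void set_seed(int seed) {
        state = seed;
        ++reseed_count;
    }
};

// Reseed only when the caller supplied a real seed.
// An NA seed leaves the generator in whatever state the caller set up.
void maybe_seed(FakeRng& rng, int seed) {
    if (seed == kNaInteger) {
        return;  // skip seeding: keep the existing RNG state
    }
    rng.set_seed(seed);
}
```

With this shape, seeding once up front and then passing the NA sentinel leaves the generator untouched, which mirrors the `set.seed(1L)` followed by `seed = NA_integer_` usage described above.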
Thanks for the pull request and the additional functionality. A few requests from my side:
OK, for now, I have added the minimal set of tests to validate the modification.
We have to make sure that the previous version of ClusterR and the updated version with your adjustments do not give different results. Therefore, we have to test every function that has a "seed" argument.
If you use compiled code in an R package, the code is tested for undefined behaviour. Your adjustment, which is the following line
will take an NA (missing value) as input, and this might lead to undefined behaviour (or might not; I don't know, because I don't know the internals of the Rcpp package). From personal experience, I have spent many hours in the past fixing CRAN errors related to ASAN and UBSAN to avoid the package being excluded from CRAN.
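To make the undefined-behaviour concern concrete, here is a minimal sketch, assuming the NA reaches compiled code as a plain `int` holding `INT_MIN` (how R's C API defines `NA_INTEGER`; whether Rcpp adds anything on top is not verified here). The names `kNaInteger`, `is_na_seed`, and `seed_magnitude` are hypothetical. Under that assumption, comparing against the sentinel is well-defined, while signed arithmetic such as negating the value would overflow.

```cpp
#include <climits>

// Stand-in for R's NA_INTEGER (INT_MIN); an assumption of this sketch.
constexpr int kNaInteger = INT_MIN;

// Equality comparison on int is always well-defined, including for INT_MIN.
bool is_na_seed(int seed) {
    return seed == kNaInteger;
}

// Hypothetical helper: magnitude of a seed without signed overflow.
// Writing -seed or std::abs(seed) directly would overflow for INT_MIN,
// which is undefined behaviour, so the value is widened first.
long long seed_magnitude(int seed) {
    long long wide = seed;  // widening conversion, always safe
    return wide < 0 ? -wide : wide;
}
```

A sanitizer run (for example via rhub, as mentioned in this thread) would still be needed to confirm that the real code path behaves the same way.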
I appreciate the purpose, but partially disagree on the conclusion:
because it should already have been logically proven by the combination of the existing tests and my additional tests. Is there any ClusterR function that uses
But, OK, you are the author of this package, and I will add some tests for the other functions in the near future.
Thank you for the effort. You saved my day(s). Is it possible to skip our local ASAN/UBSAN tests and let CRAN run them (hopefully automatically)?
By "spent time" I meant debugging previous error cases related to ASAN and UBSAN, not anything related to your current additions in the .cpp files. This is something you have to test using existing packages such as rhub, or any other package that you think can serve the purpose.
I'm sorry, I cannot do that. If you are not currently in a position to perform the tests, feel free to close the PR and re-open it or create a new one once you are ready. Thanks.
I'll close the PR for now. Feel free to re-open or open a new one.