-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use std of 0 for singleton vectors #90
base: main
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## main #90 +/- ##
=======================================
Coverage 99.25% 99.25%
=======================================
Files 12 12
Lines 135 135
=======================================
Hits 134 134
Misses 1 1
Continue to review full report at Codecov.
|
src/scaling.jl
Outdated
@@ -68,7 +68,8 @@ struct MeanStdScaling <: AbstractScaling | |||
end | |||
end | |||
|
|||
compute_stats(x) = (mean(x), std(x)) | |||
# Set std to 0 using corrected=false if x is a singleton | |||
compute_stats(x) = (mean(x), std(x; corrected=(length(x) != 1))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it's better to pass this via MeanStdScaling
as a kwarg? since it's already exposed via Statistics.std
anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see where you're coming from by leaving it up to the user. I guess it would just be inconvenient in the case that came up for me. I would rather change my downstream test case so that the data doesn't end up being one row, than add corrected=(length(x) != 1)
to MeanStdScaling
in the transform pipeline, in the source code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think it's better to not make those kinds of assumptions in the code here, especially when Statistics.std
doesn't. In the instance of edge-case we just have to handle it on that end.
@@ -31,8 +31,8 @@ struct MeanStdScaling <: AbstractScaling | |||
σ::Real | |||
|
|||
""" | |||
MeanStdScaling(A::AbstractArray; dims=:, inds=:) -> MeanStdScaling | |||
MeanStdScaling(table, [cols]) -> MeanStdScaling | |||
MeanStdScaling(A::AbstractArray; dims=:, inds=:, corrected=true) -> MeanStdScaling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just pass generic kwargs...
? Or does it get confused about what to do with the dims
kwarg?
Closes #89 by adding an optional kwarg
corrected
which is passed to thestd
function.