Error: Computation failed in `stat_bin()`: `binwidth` must be positive
#3043
Comments
This looks like an issue where you're using reprex (this repo, which helps with reprex mechanics) to pose a question about ggplot2, yes? In that case, I recommend you close this issue and open the same one over at https://github.com/tidyverse/ggplot2/issues
Please reduce your code example to the absolute minimum necessary to produce the issue and then make it reproducible by running it through the reprex package. These articles may help:
Thanks @batpigandme for moving the issue to ggplot2, and thanks @jennybc and @clauswilke for pointing me to reprex. This is a really nice way to have a conversation. I hope the snippet below helps. I am trying to create multiple histograms in one plot. It seems that features with zero variance (a single distinct value), such as Over 18 and Standard Hours, are not being plotted. Please help me resolve this.
library(readxl)
library(httr)
library(tidyverse)
library(tidyquant)
#> Loading required package: lubridate
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
#> Loading required package: PerformanceAnalytics
#> Loading required package: xts
#> Loading required package: zoo
#>
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#>
#> as.Date, as.Date.numeric
#>
#> Attaching package: 'xts'
#> The following objects are masked from 'package:dplyr':
#>
#> first, last
#>
#> Attaching package: 'PerformanceAnalytics'
#> The following object is masked from 'package:graphics':
#>
#> legend
#> Loading required package: quantmod
#> Loading required package: TTR
#> Version 0.4-0 included new data defaults. See ?getSymbols.
url <- "https://community.watsonanalytics.com/wp-content/uploads/2016/06/HR-Employee-Attrition-data.xlsx"
GET(url, write_disk(tf <- tempfile(fileext = ".xlsx")))
#> Response [https://community.watsonanalytics.com/wp-content/uploads/2016/06/HR-Employee-Attrition-data.xlsx]
#> Date: 2018-12-21 19:19
#> Status: 200
#> Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
#> Size: 266 kB
#> <ON DISK> /var/folders/gl/wb7rqfgd3v708q5m0ydj3pkc0000gq/T//RtmpYF44cr/filef2d148f65c0.xlsx
data<- read_excel(tf)
bins = 10
ncol = 5
fct_reorder = FALSE
fct_rev = FALSE
fill = palette_light()[[3]]
color = "white"
scale = "free"
data_factored <- data %>%
mutate_if(is.character, as.factor) %>%
mutate_if(is.factor, as.numeric) %>%
gather(key = key, value = value, factor_key = TRUE)
if(fct_reorder) {
data_factored <- data_factored %>%
mutate(key = as.character(key) %>% as.factor())
}
if(fct_rev){
data_factored <- data_factored %>%
mutate(key = fct_rev(key))
}
g <- data_factored %>%
ggplot(aes(x = value, group = key )) +
geom_histogram(bins = bins, fill = fill , color = color,
) +
facet_wrap(~ key, ncol = ncol, scale = scale) +
theme_tq()
g
#> Warning: Computation failed in `stat_bin()`:
#> `binwidth` must be positive
#> Warning: Computation failed in `stat_bin()`:
#> `binwidth` must be positive
#> Warning: Computation failed in `stat_bin()`:
#> `binwidth` must be positive
Created on 2018-12-22 by the reprex package (v0.2.1)
This does not look like a "minimal example"... Can you identify which of these many plots is associated with the error (retry plot by plot)? Judging from the error message, are you attempting to set too many bins (10) for too little data?
Hey @ptoche, thanks for your response. I hope the following example helps. I have tried plot by plot and still get a similar warning. The following example sets bins = 5, which produces a similar error.
library(readxl)
library(httr)
library(tidyverse)
library(tidyquant)
#> Loading required package: lubridate
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
#> Loading required package: PerformanceAnalytics
#> Loading required package: xts
#> Loading required package: zoo
#>
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#>
#> as.Date, as.Date.numeric
#>
#> Attaching package: 'xts'
#> The following objects are masked from 'package:dplyr':
#>
#> first, last
#>
#> Attaching package: 'PerformanceAnalytics'
#> The following object is masked from 'package:graphics':
#>
#> legend
#> Loading required package: quantmod
#> Loading required package: TTR
#> Version 0.4-0 included new data defaults. See ?getSymbols.
url <- "https://community.watsonanalytics.com/wp-content/uploads/2016/06/HR-Employee-Attrition-data.xlsx"
GET(url, write_disk(tf <- tempfile(fileext = ".xlsx")))
#> Response [https://community.watsonanalytics.com/wp-content/uploads/2016/06/HR-Employee-Attrition-data.xlsx]
#> Date: 2018-12-21 20:22
#> Status: 200
#> Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
#> Size: 266 kB
#> <ON DISK> /var/folders/gl/wb7rqfgd3v708q5m0ydj3pkc0000gq/T//RtmpfHG0YO/filef3692827f789.xlsx
data<- read_excel(tf)
bins = 5
ncol = 1
fill = palette_light()[[3]]
color = "white"
scale = "free"
zeroVar <- function(data, useNA = 'ifany') {
out <- apply(data, 2, function(x) {length(table(x, useNA = useNA))})
which(out==1)
}
# The following features produce the error
zeroVar(data)
#> Employee Count Over 18 Standard Hours
#> 1 8 23
data_factored <- data %>%
select('Employee Count', 'Over 18', 'Standard Hours')%>%
mutate_if(is.character, as.factor) %>%
mutate_if(is.factor, as.numeric) %>%
gather(key = key, value = value, factor_key = TRUE)
data_factored %>%
ggplot(aes(x = value, group = key )) +
geom_histogram(bins = bins, fill = fill , color = color,
) +
facet_wrap(~ key, ncol = ncol, scale = scale) +
theme_tq()
#> Warning: Computation failed in `stat_bin()`:
#> `binwidth` must be positive
#> Warning: Computation failed in `stat_bin()`:
#> `binwidth` must be positive
#> Warning: Computation failed in `stat_bin()`:
#> `binwidth` must be positive
Created on 2018-12-22 by the reprex package (v0.2.1)
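One possible way to sidestep the warning, sketched here in base R under stated assumptions, is to detect the single-valued columns and drop them before reshaping and plotting. The data frame `toy` below is a made-up stand-in for the HR data (only the column names echo the thread), and `is_constant` is a hypothetical name, not something from the original code.

```r
# Toy stand-in for the HR data: column names echo the thread, values are made up.
toy <- data.frame(
  Age = c(41, 49, 37, 33),
  `Over 18` = c("Y", "Y", "Y", "Y"),
  `Standard Hours` = c(80, 80, 80, 80),
  check.names = FALSE
)

# Flag columns that contain at most one distinct value.
is_constant <- vapply(toy, function(col) length(unique(col)) <= 1, logical(1))

# Columns that would trigger the warning in a faceted histogram.
constant_cols <- names(toy)[is_constant]

# Keep only the columns that actually vary.
toy_varying <- toy[, !is_constant, drop = FALSE]
```

Whether dropping these columns is acceptable depends on the goal: it hides the constant features rather than plotting them as a single bar.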
Can you cut that down to one plot, |
Thanks @ptoche for following up. I hope the following is what you are looking for to reproduce the problem.
library(readxl)
library(httr)
library(tidyverse)
library(tidyquant)
#> Loading required package: lubridate
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
#> Loading required package: PerformanceAnalytics
#> Loading required package: xts
#> Loading required package: zoo
#>
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#>
#> as.Date, as.Date.numeric
#>
#> Attaching package: 'xts'
#> The following objects are masked from 'package:dplyr':
#>
#> first, last
#>
#> Attaching package: 'PerformanceAnalytics'
#> The following object is masked from 'package:graphics':
#>
#> legend
#> Loading required package: quantmod
#> Loading required package: TTR
#> Version 0.4-0 included new data defaults. See ?getSymbols.
url <- "https://community.watsonanalytics.com/wp-content/uploads/2016/06/HR-Employee-Attrition-data.xlsx"
GET(url, write_disk(tf <- tempfile(fileext = ".xlsx")))
#> Response [https://community.watsonanalytics.com/wp-content/uploads/2016/06/HR-Employee-Attrition-data.xlsx]
#> Date: 2018-12-24 21:07
#> Status: 200
#> Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
#> Size: 266 kB
#> <ON DISK> /var/folders/gl/wb7rqfgd3v708q5m0ydj3pkc0000gq/T//RtmpA1oFEE/file108fe1d346270.xlsx
data<- read_excel(tf)
data <- data %>%
select('Over 18')
bins = 5
ncol = 1
fct_reorder = FALSE
fct_rev = FALSE
fill = palette_light()[[3]]
color = "white"
scale = "free"
data_factored <- data %>%
mutate_if(is.character, as.factor) %>%
mutate_if(is.factor, as.numeric) %>%
gather(key = key, value = value, factor_key = TRUE)
dput(head(data_factored,10))
#> structure(list(key = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
#> 1L, 1L, 1L), .Label = "Over 18", class = "factor"), value = c(1,
#> 1, 1, 1, 1, 1, 1, 1, 1, 1)), row.names = c(NA, -10L), class = c("tbl_df",
#> "tbl", "data.frame"))
data_factored %>%
ggplot(aes(x = value, group = key )) +
geom_histogram(bins = bins, fill = fill , color = color,
) +
facet_wrap(~ key, ncol = ncol, scale = scale) +
theme_tq()
#> Warning: Computation failed in `stat_bin()`:
#> `binwidth` must be positive
Created on 2018-12-25 by the reprex package (v0.2.1)
You can use the result of
data_factored <- structure(list(key = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = "Over 18", class = "factor"), value = c(1,
1, 1, 1, 1, 1, 1, 1, 1, 1)), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))
data_factored %>%
ggplot(aes(x = value, group = key )) +
geom_histogram(bins = bins, fill = fill , color = color) +
facet_wrap(~ key, ncol = ncol, scale = scale) +
theme_tq()
I guess the code below is the minimal version of the problem; if there's only one value,
library(ggplot2)
d <- data.frame(x = rep(1, 100))
ggplot(d, aes(x = x)) +
geom_histogram()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> Warning: Computation failed in `stat_bin()`:
#> `binwidth` must be positive
Created on 2018-12-25 by the reprex package (v0.2.1)
This is because Line 104 in 7f13dfa
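As a rough illustration of why a single distinct value can lead to this message, here is a simplified base R sketch. It is not the ggplot2 source: the formula (data range divided by the number of bins minus one) and the name `sketch_bin_width` are assumptions made for illustration. Under that assumption, a zero range gives a zero width, which cannot pass a "must be positive" check.

```r
# Hypothetical helper (not ggplot2 code): derive a bin width from the data
# range and the requested number of bins.
sketch_bin_width <- function(x, bins = 30) {
  rng <- range(x, na.rm = TRUE)
  (rng[2] - rng[1]) / (bins - 1)
}

# All values identical: the range is zero, so the width is zero.
width_constant <- sketch_bin_width(rep(1, 100))

# Values spanning 0 to 29 with 30 bins: the width is one.
width_spread <- sketch_bin_width(c(0, 29))
```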
@yutannihilation: this is pretty much the problem we suspected from the start. I don't think the error message is that bad... Having said that, a possible improvement would be to set bins to 1 if there isn't enough data to compute a width, e.g. along the lines of
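For illustration only (this is not the code from the comment above, and not how ggplot2 is implemented), a fallback of this kind could be sketched in base R. The helper `sketch_breaks` and its `fallback_width` argument are hypothetical: when every value is identical, it returns a single bin centred on that value instead of failing.

```r
# Hypothetical helper: compute histogram breaks, falling back to one bin
# when the data have zero range.
sketch_breaks <- function(x, bins = 30, fallback_width = 1) {
  x <- x[is.finite(x)]
  rng <- range(x)
  if (rng[1] == rng[2]) {
    # Zero range: a single bin centred on the only value present.
    return(c(rng[1] - fallback_width / 2, rng[1] + fallback_width / 2))
  }
  # Otherwise, `bins` equal-width bins spanning the range.
  seq(rng[1], rng[2], length.out = bins + 1)
}

d <- data.frame(x = rep(1, 100))
breaks <- sketch_breaks(d$x, bins = 30)

# Count the observations per bin without drawing anything.
counts <- hist(d$x, breaks = breaks, plot = FALSE)$counts
```

With the constant data above, all 100 observations land in the single fallback bin. The choice of `fallback_width = 1` is arbitrary and would need to suit the scale of the data.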
Thanks, but I think |
@yutannihilation, Indeed you're right, From a basic definition of a histogram (Wikipedia, I'm afraid): "To construct a histogram, divide the entire range of values into a series of intervals and then count how many values fall into each interval." In the case you have highlighted, What I mean is that I think the histogram should look like this:
library(ggplot2)
d <- data.frame(x = rep(1, 100))
ggplot(d, aes(x = x)) +
geom_bar()
Created on 2018-12-26 by the reprex package (v0.2.1)
Does that make sense?
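To make the expected picture concrete without drawing it, the counts that such a bar chart would display can be sketched in base R. This is only an illustration of the idea (one bar whose height is the number of observations); `bar_heights` is a hypothetical name.

```r
d <- data.frame(x = rep(1, 100))

# One row per distinct value of x, with the number of observations at that value.
bar_heights <- as.data.frame(table(x = d$x), responseName = "count")
```

For the constant data above this yields a single row: the value 1 with a count of 100, i.e. one bar of height 100.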
I basically agree with you in that this is a problem, I just disagreed with this part:
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/ |
This might be related to a previous issue:
#2312
I am using the function below to create multiple histograms in one plot. It seems that features with zero variance (a single distinct value) are not being plotted. This worked with a previous version of ggplot2. Please help me fix this.
plot_hist_facet <- function(data, bins = 10, ncol = 5,
                            fct_reorder = FALSE, fct_rev = FALSE,
                            fill = palette_light()[[3]],
                            color = "white", scale = "free") {
  # Convert character columns to numeric codes and reshape to long format
  data_factored <- data %>%
    mutate_if(is.character, as.factor) %>%
    mutate_if(is.factor, as.numeric) %>%
    gather(key = key, value = value, factor_key = TRUE)
  if (fct_reorder) {
    data_factored <- data_factored %>%
      mutate(key = as.character(key) %>% as.factor())
  }
  if (fct_rev) {
    data_factored <- data_factored %>%
      mutate(key = fct_rev(key))
  }
  # One histogram per feature, each in its own facet
  g <- data_factored %>%
    ggplot(aes(x = value, group = key)) +
    geom_histogram(bins = bins, fill = fill, color = color) +
    facet_wrap(~ key, ncol = ncol, scale = scale) +
    theme_tq()
  return(g)
}
Getting a "Computation failed in `stat_bin()`: `binwidth` must be positive" error.
[Screenshot: snip20181221_20]