incorrect ordering when having multiple across calls inside arrange #6538
sessioninfo::package_info("attached")
#>  package   * version date (UTC) lib source
#>  dplyr     * 1.0.10  2022-09-01 [1] CRAN (R 4.2.1)
#>  forcats   * 0.5.2   2022-08-19 [1] CRAN (R 4.2.1)
#>  ggplot2   * 3.4.0   2022-11-04 [1] CRAN (R 4.2.2)
#>  purrr     * 0.3.5   2022-10-06 [1] CRAN (R 4.2.1)
#>  readr     * 2.1.3   2022-10-01 [1] CRAN (R 4.2.1)
#>  stringr   * 1.4.1   2022-08-20 [1] CRAN (R 4.2.1)
#>  tibble    * 3.1.8   2022-07-22 [1] CRAN (R 4.2.1)
#>  tidyr     * 1.2.1   2022-09-08 [1] CRAN (R 4.2.1)
#>  tidyverse * 1.3.2   2022-07-18 [1] CRAN (R 4.2.2)
Nothing fishy here, you can reproduce with identical calls to

library(tidyverse)
df <- tribble(
  ~other_text, ~categ_1, ~categ_2, ~points_1, ~points_2, ~total,
  "x", "A", "B", 22L, 20L, 42L,
  "z", "A", "B", 20L, 22L, 42L,
  "y", "A", "B", 22L, 20L, 42L
)
set.seed(3)
purrr::map_lgl(
  1:20,
  ~ identical(
    df %>%
      slice_sample(n = nrow(.)) %>%
      arrange(
        desc(total),
        categ_1, categ_2,
        desc(points_1), desc(points_2)
      ),
    df %>%
      slice_sample(n = nrow(.)) %>%
      arrange(
        desc(total),
        categ_1, categ_2,
        desc(points_1), desc(points_2)
      )
  )
)
#>  [1] FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE
#> [13]  TRUE  TRUE FALSE FALSE FALSE FALSE FALSE  TRUE

The problem is that you have two rows that are identical in every column except
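In the data above, rows "x" and "y" hold the same values in every column passed to arrange(), so their relative order after sorting depends on the order produced by slice_sample(). A minimal sketch of one way to make the result reproducible, assuming other_text is acceptable as a final tie-breaker (shuffle_and_sort is a hypothetical helper, not something from dplyr or from this thread):

```r
library(dplyr)

df <- tibble::tribble(
  ~other_text, ~categ_1, ~categ_2, ~points_1, ~points_2, ~total,
  "x", "A", "B", 22L, 20L, 42L,
  "z", "A", "B", 20L, 22L, 42L,
  "y", "A", "B", 22L, 20L, 42L
)

# Hypothetical helper: shuffle the rows, then sort with other_text as the
# last key so that no two rows tie on the full set of sort keys.
shuffle_and_sort <- function(data) {
  data %>%
    slice_sample(n = nrow(data)) %>%
    arrange(
      desc(total),
      categ_1, categ_2,
      desc(points_1), desc(points_2),
      other_text
    )
}

set.seed(3)
# Compare two independently shuffled-and-sorted copies, 20 times over.
all_equal <- vapply(
  1:20,
  function(i) identical(shuffle_and_sort(df), shuffle_and_sort(df)),
  logical(1)
)
all(all_equal)
```

With the tie-breaker in place every row has a unique sort key, so the shuffled input order should no longer influence the output.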
Sorry @DavisVaughan, I was trying to write a reproducible example that was simpler than the real one.

library(tidyverse)
df <- tibble::tribble(
  ~resultado_1, ~resultado_2, ~resultado_3, ~nota_1, ~nota_2,
  "SU", "SU", "", 26.3, 18.75,
  "SU", "SU", "", 28.3, 22.5
)
df
#> # A tibble: 2 × 5
#>   resultado_1 resultado_2 resultado_3 nota_1 nota_2
#>   <chr>       <chr>       <chr>        <dbl>  <dbl>
#> 1 SU          SU          ""            26.3   18.8
#> 2 SU          SU          ""            28.3   22.5
identical(
  df %>%
    mutate(across(
      starts_with("resultado_"),
      ~ case_when(
        . %in% c("SU", "EX") ~ 1L,
        . == "NS" ~ 2L,
        . == "" ~ 3L
      )
    )) %>%
    arrange(
      resultado_1, resultado_2, resultado_3,
      desc(nota_1), desc(nota_2)
    ),
  df %>%
    arrange(
      across(starts_with("resultado_"), ~ case_when(
        . %in% c("SU", "EX") ~ 1,
        . == "NS" ~ 2,
        . == "" ~ 3
      )),
      across(starts_with("nota_"), desc)
    )
)
#> [1] FALSE

Could you check it?
Those are just different data frames because you mutated the

library(tidyverse)
df <- tibble::tribble(
  ~resultado_1, ~resultado_2, ~resultado_3, ~nota_1, ~nota_2,
  "SU", "SU", "", 26.3, 18.75,
  "SU", "SU", "", 28.3, 22.5
)
df
#> # A tibble: 2 × 5
#>   resultado_1 resultado_2 resultado_3 nota_1 nota_2
#>   <chr>       <chr>       <chr>        <dbl>  <dbl>
#> 1 SU          SU          ""            26.3   18.8
#> 2 SU          SU          ""            28.3   22.5
df %>%
  mutate(across(
    starts_with("resultado_"),
    ~ case_when(
      . %in% c("SU", "EX") ~ 1L,
      . == "NS" ~ 2L,
      . == "" ~ 3L
    )
  )) %>%
  arrange(
    resultado_1, resultado_2, resultado_3,
    desc(nota_1), desc(nota_2)
  )
#> # A tibble: 2 × 5
#>   resultado_1 resultado_2 resultado_3 nota_1 nota_2
#>         <int>       <int>       <int>  <dbl>  <dbl>
#> 1           1           1           3   28.3   22.5
#> 2           1           1           3   26.3   18.8
df %>%
  arrange(
    across(starts_with("resultado_"), ~ case_when(
      . %in% c("SU", "EX") ~ 1,
      . == "NS" ~ 2,
      . == "" ~ 3
    )),
    across(starts_with("nota_"), desc)
  )
#> # A tibble: 2 × 5
#>   resultado_1 resultado_2 resultado_3 nota_1 nota_2
#>   <chr>       <chr>       <chr>        <dbl>  <dbl>
#> 1 SU          SU          ""            28.3   22.5
#> 2 SU          SU          ""            26.3   18.8

Created on 2022-11-14 with reprex v2.0.2.9000
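The two printed results above differ in their column types (<int> versus <chr>) rather than in row order. A minimal sketch that checks this programmatically, assuming the same recoding as in the thread; recode_resultado and col_types are hypothetical helpers introduced only for this illustration:

```r
library(dplyr)

df <- tibble::tribble(
  ~resultado_1, ~resultado_2, ~resultado_3, ~nota_1, ~nota_2,
  "SU", "SU", "", 26.3, 18.75,
  "SU", "SU", "", 28.3, 22.5
)

# Hypothetical helper wrapping the case_when() recoding used in the thread.
recode_resultado <- function(x) {
  case_when(
    x %in% c("SU", "EX") ~ 1L,
    x == "NS" ~ 2L,
    x == "" ~ 3L
  )
}

# mutate() stores the recoded values in the data frame ...
mutated <- df %>%
  mutate(across(starts_with("resultado_"), recode_resultado))

# ... while arrange() only uses them as sort keys and keeps the columns as-is.
arranged <- df %>%
  arrange(across(starts_with("resultado_"), recode_resultado))

# Hypothetical helper: first class of every column, as a named character vector.
col_types <- function(data) {
  vapply(data, function(col) class(col)[1], character(1))
}

col_types(mutated)
col_types(arranged)
```

Because identical() also compares column types, the two pipelines cannot match unless the same recoding is applied to both results before comparing.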
My previous intention was to write

library(tidyverse)
df <- tibble::tribble(
  ~resultado_1, ~resultado_2, ~resultado_3, ~nota_1, ~nota_2,
  "SU", "SU", "", 26.3, 18.75,
  "SU", "SU", "", 28.3, 22.5
)
df
#> # A tibble: 2 × 5
#>   resultado_1 resultado_2 resultado_3 nota_1 nota_2
#>   <chr>       <chr>       <chr>        <dbl>  <dbl>
#> 1 SU          SU          ""            26.3   18.8
#> 2 SU          SU          ""            28.3   22.5
identical(
  df %>%
    mutate(across(
      starts_with("resultado_"),
      ~ case_when(
        . %in% c("SU", "EX") ~ 1L,
        . == "NS" ~ 2L,
        . == "" ~ 3L
      )
    )) %>%
    arrange(
      resultado_1, resultado_2, resultado_3,
      desc(nota_1), desc(nota_2)
    ),
  df %>%
    arrange(
      across(starts_with("resultado_"), ~ case_when(
        . %in% c("SU", "EX") ~ 1,
        . == "NS" ~ 2,
        . == "" ~ 3
      )),
      across(starts_with("nota_"), desc)
    ) %>%
    mutate(across(
      starts_with("resultado_"),
      ~ case_when(
        . %in% c("SU", "EX") ~ 1L,
        . == "NS" ~ 2L,
        . == "" ~ 3L
      )
    ))
)

This returns TRUE when using a recent tidyverse/dplyr version from GitHub.

sessioninfo::package_info("attached")
 package   * version     date (UTC) lib source
 dplyr     * 1.0.99.9000 2022-11-14 [1] Github (tidyverse/dplyr@50c58dd)
 forcats   * 0.5.2       2022-08-19 [1] CRAN (R 4.2.1)
 ggplot2   * 3.4.0       2022-11-04 [1] CRAN (R 4.2.2)
 purrr     * 0.3.5       2022-10-06 [1] CRAN (R 4.2.1)
 readr     * 2.1.3       2022-10-01 [1] CRAN (R 4.2.1)
 stringr   * 1.4.1       2022-08-20 [1] CRAN (R 4.2.1)
 tibble    * 3.1.8       2022-07-22 [1] CRAN (R 4.2.1)
 tidyr     * 1.2.1       2022-09-08 [1] CRAN (R 4.2.1)
 tidyverse * 1.3.2       2022-07-18 [1] CRAN (R 4.2.2)

I had (since Oct 4th, #6490) installed locally a recent version from GitHub. So, this issue is analogous to #6490 and should remain closed. Thanks for your help and diagnostics @DavisVaughan.
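Since the behaviour reported here depended on which dplyr build was installed, a quick version check can help when comparing results between machines. A minimal sketch, assuming the CRAN release 1.0.10 shown in the first session info is the baseline of interest (cran_version and has_newer_dplyr are names chosen for this illustration):

```r
# Version reported in the CRAN session info earlier in the thread.
cran_version <- package_version("1.0.10")

# Version of dplyr installed in the current library.
installed_version <- utils::packageVersion("dplyr")

# TRUE when the local install is newer than that CRAN release,
# for example a development build installed from GitHub.
has_newer_dplyr <- installed_version > cran_version
has_newer_dplyr
```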
There's something fishy here...
When having multiple across inside arrange, sometimes the ordering is correct and other times it is not.