Skip to content

Commit

Permalink
Rework simplification (#909)
Browse files Browse the repository at this point in the history
Introduce new `list_simplify()` which provides vctrs based simplifcation. This is used to power a new `list_transpose()` and `accumulate()`/`accumulate2()`.

* Fixes #875 by introducing `list_transpose()`
* Fixes #809 by allowing `accumulate()` to opt out of simplification.
* Fixes #774 by aligning `accumulate()` and `accumulate2()` semantics.
* Fixes #900 by concluding the flatten/simplification re-alignment.
  • Loading branch information
hadley authored Sep 12, 2022
1 parent 61fb2ac commit 1276623
Show file tree
Hide file tree
Showing 26 changed files with 886 additions and 146 deletions.
2 changes: 2 additions & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,8 @@ export(list_flatten)
export(list_merge)
export(list_modify)
export(list_rbind)
export(list_simplify)
export(list_transpose)
export(list_update)
export(lmap)
export(lmap_at)
Expand Down
17 changes: 15 additions & 2 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,13 @@
* `*_dfc()` and `*_dfr()` have been deprecated in favour of using the
appropriate map function along with `list_rbind()` or `list_cbind()` (#912).

* `simplify()`, `simplify_all()`, and `as_vector()` have been deprecated in
favour of `list_simplify()`. It provides a more consistent definition of
simplification (#900).

* `transpose()` has been deprecated in favour of `list_transpose()` (#875).
It has built-in simplification.

### Deprecation next steps

* `as_function()`, `at_depth()`, and the `...f` argument to `partial()`
Expand Down Expand Up @@ -70,8 +77,14 @@
* New `list_c()`, `list_rbind()`, and `list_cbind()` make it easy to
`c()`, `rbind()`, or `cbind()` all of the elements in a list.

* `accumulate()` now uses vctrs for simplifying the output. This
ensures a more principled and flexible coercion behaviour.
* New `list_simplify()` reduces a list of length-1 vectors to a simpler atomic
or S3 vector (#900).

* New `list_transpose()` which automatically simplifies if possible (#875).

* `accumulate()` and `accumulate2()` now both simplify the output if possible
using vctrs. New arguments `simplify` and `ptype` allow you to control the
details of simplification (#774, #809).

### Tidyverse consistency

Expand Down
58 changes: 33 additions & 25 deletions R/coercion.R
Original file line number Diff line number Diff line change
@@ -1,38 +1,35 @@
#' Coerce a list to a vector
#'
#' `as_vector()` collapses a list of vectors into one vector. It
#' checks that the type of each vector is consistent with
#' `.type`. If the list can not be simplified, it throws an error.
#' `simplify` will simplify a vector if possible; `simplify_all`
#' will apply `simplify` to every element of a list.
#' @description
#' `r lifecycle::badge("deprecated")`
#'
#' `.type` can be a vector mold specifying both the type and the
#' length of the vectors to be concatenated, such as `numeric(1)`
#' or `integer(4)`. Alternatively, it can be a string describing
#' the type, one of: "logical", "integer", "double", "complex",
#' "character" or "raw".
#' These functions are deprecated in favour of `list_simplify()`:
#'
#' * `as_vector(x)` is now `list_simplify(x)`
#' * `simplify(x)` is now `list_simplify(x, strict = FALSE)`
#' * `simplify_all(x)` is `map(x, list_simplify, strict = FALSE)`
#'
#' @param .x A list of vectors
#' @param .type A vector mold or a string describing the type of the
#' input vectors. The latter can be any of the types returned by
#' [typeof()], or "numeric" as a shorthand for either
#' "double" or "integer".
#' @param .type Can be a vector mold specifying both the type and the
#' length of the vectors to be concatenated, such as `numeric(1)`
#' or `integer(4)`. Alternatively, it can be a string describing
#' the type, one of: "logical", "integer", "double", "complex",
#' "character" or "raw".
#' @export
#' @keywords internal
#' @examples
#' # Supply the type either with a string:
#' # was
#' as.list(letters) %>% as_vector("character")
#' # now
#' as.list(letters) %>% list_simplify(ptype = character())
#'
#' # Or with a vector mold:
#' as.list(letters) %>% as_vector(character(1))
#'
#' # Vector molds are more flexible because they also specify the
#' # length of the concatenated vectors:
#' # was:
#' list(1:2, 3:4, 5:6) %>% as_vector(integer(2))
#'
#' # Note that unlike vapply(), as_vector() never adds dimension
#' # attributes. So when you specify a vector mold of size > 1, you
#' # always get a vector and not a matrix
#' # now:
#' list(1:2, 3:4, 5:6) %>% list_c(ptype = integer())
as_vector <- function(.x, .type = NULL) {
lifecycle::deprecate_warn("0.4.0", "as_vector()", "list_simplify()")

if (can_simplify(.x, .type)) {
unlist(.x)
} else {
Expand All @@ -43,6 +40,7 @@ as_vector <- function(.x, .type = NULL) {
#' @export
#' @rdname as_vector
simplify <- function(.x, .type = NULL) {
lifecycle::deprecate_warn("0.4.0", "as_vector()", "list_simplify()")
if (can_simplify(.x, .type)) {
unlist(.x)
} else {
Expand All @@ -53,7 +51,17 @@ simplify <- function(.x, .type = NULL) {
#' @export
#' @rdname as_vector
simplify_all <- function(.x, .type = NULL) {
map(.x, simplify, .type = .type)
lifecycle::deprecate_warn("0.4.0", "as_vector()", I("map() + list_simplify()"))

# Inline simplify to avoid double deprecation
simplify <- function(.x) {
if (can_simplify(.x, .type)) {
unlist(.x)
} else {
.x
}
}
map(.x, simplify)
}


Expand Down
90 changes: 90 additions & 0 deletions R/list-simplify.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,90 @@
#' Simplify a list to an atomic or S3 vector
#'
#' Simplification maintains a one-to-one correspondence between the input
#' and output, implying that each element of `x` must contain a vector of
#' length 1. If you don't want to maintain this correspondence, then you
#' probably want either [list_c()] or [list_flatten()].
#'
#' @param x A list.
#' @param strict What should happen if simplification fails? If `TRUE`,
#' it will error. If `FALSE` and `ptype` is not supplied, it will return `x`
#' unchanged.
#' @param ptype An optional prototype to ensure that the output type is always
#' the same.
#' @returns A vector the same length as `x`.
#' @export
#' @examples
#' list_simplify(list(1, 2, 3))
#'
#' try(list_simplify(list(1, 2, "x")))
#' try(list_simplify(list(1, 2, 1:3)))
list_simplify <- function(x, strict = TRUE, ptype = NULL) {
if (!is_bool(strict)) {
cli::cli_abort(
"{.arg strict} must be `TRUE` or `FALSE`, not {.obj_type_friendly {strict}}."
)
}

simplify_impl(x, strict = strict, ptype = ptype)
}

# Wrapper used by purrr functions that do automatic simplification
list_simplify_internal <- function(x,
simplify = NA,
ptype = NULL,
error_call = caller_env()) {
if (length(simplify) > 1 || !is.logical(simplify)) {
cli::cli_abort(
"{.arg simplify} must be `TRUE`, `FALSE`, or `NA`.",
arg = "simplify",
call = error_call
)
}
if (!is.null(ptype) && isFALSE(simplify)) {
cli::cli_abort(
"Can't specify {.arg ptype} when `simplify = FALSE`.",
arg = "ptype",
call = error_call
)
}

if (isFALSE(simplify)) {
return(x)
}

simplify_impl(
x,
strict = !is.na(simplify),
ptype = ptype,
error_call = error_call
)
}

simplify_impl <- function(x,
strict = TRUE,
ptype = NULL,
error_call = caller_env()) {
vec_check_list(x, call = error_call)

can_simplify <- every(x, vec_is, size = 1)

if (can_simplify) {
tryCatch(
# TODO: use `error_call` when available
list_unchop(x, ptype = ptype),
vctrs_error_incompatible_type = function(err) {
if (strict || !is.null(ptype)) {
cnd_signal(err)
} else {
x
}
}
)
} else {
if (strict) {
cli::cli_abort("All elements must be length-1 vectors.", call = error_call)
} else {
x
}
}
}
136 changes: 136 additions & 0 deletions R/list-transpose.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,136 @@
#' Transpose a list
#'
#' @description
#' `list_transpose()` turns a list-of-lists "inside-out". For instance it turns a pair of
#' lists into a list of pairs, or a list of pairs into a pair of lists. For
#' example, if you had a list of length `n` where each component had values `a`
#' and `b`, `list_transpose()` would make a list with elements `a` and
#' `b` that contained lists of length `n`.
#'
#' It's called transpose because `x[["a"]][["b"]]` is equivalent to
#' `list_transpose(x)[["b"]][["a"]]`, i.e. transposing a list flips the order of
#' indices in a similar way to transposing a matrix.
#'
#' @param x A list of vectors to transpose.
#' @param template A "template" that describes the output list. Can either be
#' a character vector (where elements are extracted by name), or an integer
#' vector (where elements are extracted by position). Defaults to the names
#' of the first element of `x`, or if they're not present, the integer
#' indices.
#' @param simplify Should the result be [simplified][list_simplify]?
#' * `TRUE`: simplify or die trying.
#' * `NA`: simplify if possible.
#' * `FALSE`: never try to simplify, always leaving as a list.
#'
#' Alternatively, a named list specifying the simplification by output
#' element.
#' @param ptype An optional vector prototype used to control the simplification.
#' Alternatively, a named list specifying the prototype by output element.
#' @param default A default value to use if a value is absent or `NULL`.
#' Alternatively, a named list specifying the default by output element.
#' @export
#' @examples
#' # list_transpose() is useful in conjunction with safely()
#' x <- list("a", 1, 2)
#' y <- x %>% map(safely(log))
#' y %>% str()
#' # Put all the errors and results together
#' y %>% list_transpose() %>% str()
#' # Supply a default result to further simplify
#' y %>% list_transpose(default = list(result = NA)) %>% str()
#'
#' # list_transpose() will try to simplify by default:
#' x <- list(list(a = 1, b = 2), list(a = 3, b = 4), list(a = 5, b = 6))
#' x %>% list_transpose()
#' # this makes list_tranpose() not completely symmetric
#' x %>% list_transpose() %>% list_transpose()
#'
#' # use simplify = FALSE to always return lists:
#' x %>% list_transpose(simplify = FALSE) %>% str()
#' x %>%
#' list_transpose(simplify = FALSE) %>%
#' list_transpose(simplify = FALSE) %>% str()
#'
#' # Provide an explicit template if you know which elements you want to extract
#' ll <- list(
#' list(x = 1, y = "one"),
#' list(z = "deux", x = 2)
#' )
#' ll %>% list_transpose()
#' ll %>% list_transpose(template = c("x", "y", "z"))
#' ll %>% list_transpose(template = 1)
#'
#' # And specify a default if you want to simplify
#' ll %>% list_transpose(c("x", "y", "z"), default = NA)
list_transpose <- function(x, template = NULL, simplify = NA, ptype = NULL, default = NULL) {
vec_check_list(x)

if (length(x) == 0) {
template <- integer()
} else {
template <- template %||% vec_index(x[[1]])
}

if (!is.character(template) && !is.numeric(template)) {
cli::cli_abort(
"{.arg template} must be a character or numeric vector, not {.obj_type_friendly {template}}.",
arg = template
)
}

simplify <- match_template(simplify, template)
default <- match_template(default, template)
ptype <- match_template(ptype, template)

out <- rep_along(template, list())
if (is.character(template)) {
names(out) <- template
}

for (i in seq_along(template)) {
idx <- template[[i]]
res <- map(x, idx, .default = default[[i]])
res <- list_simplify_internal(res,
simplify = simplify[[i]] %||% NA,
ptype = ptype[[i]]
)
out[[i]] <- res
}

out
}

match_template <- function(x, template, error_arg = caller_arg(x), error_call = caller_env()) {
if (is.character(template)) {
if (is_bare_list(x) && is_named(x)) {
extra_names <- setdiff(names(x), template)
if (length(extra_names)) {
cli::cli_abort(
"{.arg {error_arg}} contains unknown names: {.str {extra_names}}.",
arg = error_arg,
call = error_call
)
}

out <- rep_named(template, list(NULL))
out[names(x)] <- x
out
} else {
rep_named(template, list(x))
}
} else if (is.numeric(template)) {
if (is_bare_list(x) && length(x) > 0) {
if (length(x) != length(template)) {
cli::cli_abort(
"Length of {.arg {error_arg}} ({length(x)}) and {.arg template} ({length(template)}) must be the same when transposing by position.",
call = error_call
)
}
x
} else {
rep_along(template, list(x))
}
} else {
abort("Invalid `template`", .internal = TRUE)
}
}
Loading

0 comments on commit 1276623

Please sign in to comment.