Rework flattening (#912)
The key idea is to introduce a new family of "combining" functions, `list_c()`, `list_rbind()`, and `list_cbind()`, which replace `flatten_lgl()`, `flatten_int()`, `flatten_dbl()`, and `flatten_chr()` (now `list_c()`), `flatten_dfc()` (now `list_cbind()`), and `flatten_dfr()` (now `list_rbind()`). The new functions are thin wrappers around vctrs functions, but they feel natural in purrr to me.
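
A minimal sketch of the three combining functions, assuming a purrr release that exports them (1.0.0 or later). The input data is made up for illustration and is not taken from this commit:

```r
library(purrr)  # assumption: purrr >= 1.0.0, which exports list_c() and friends

# list_c() concatenates the elements of a list into a single vector
nums <- list_c(list(1, 2, 3))

# list_rbind() stacks a list of data frames by row
rows <- list_rbind(list(data.frame(x = 1:2), data.frame(x = 3L)))

# list_cbind() places a list of data frames side by side
cols <- list_cbind(list(data.frame(x = 1:2), data.frame(y = c("a", "b"))))
```

Unlike the typed `flatten_*()` variants, the output type is derived from the inputs, or from `ptype` when one is supplied to `list_c()` or `list_rbind()`.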

This leaves `flatten()`, which had a rather idiosyncratic interface. It has been replaced by `list_flatten()`, which always removes exactly one layer of list hierarchy (and nothing else). While working on this I realised that this is what `splice()` did, so overall this feels like a major improvement in naming consistency.
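
To illustrate the "single layer" behaviour, here is a short sketch that reuses the nested list from the `list_flatten()` documentation in this commit. It assumes purrr 1.0.0 or later is installed:

```r
library(purrr)  # assumption: purrr >= 1.0.0

x <- list(1, list(2, 3), list(4, list(5)))

# One call removes one layer: the innermost list(5) is still a list
once <- list_flatten(x)

# A second call removes the remaining layer
twice <- list_flatten(once)

str(once)
str(twice)
```

Non-list elements such as the leading `1` are left alone, which is why repeated calls eventually stop changing the result.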

With those functions in place we can deprecate `map_dfr()` and `map_dfc()`, which are really "flat" map functions because they combine, rather than simplify, the results. They never belonged with `map_int()` and friends because they don't satisfy that family's invariants: `length(map(.x, .f)) == length(.x)`, and `.f` must return a length-1 result. This also strongly implies that `flat_map()` would just be `map_c()` and is thus not necessary.
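
A rough sketch of the migration this implies, assuming purrr 1.0.0 or later. The helper `f` and the input `xs` are hypothetical and exist only for this example:

```r
library(purrr)  # assumption: purrr >= 1.0.0

xs <- 1:3
# Hypothetical helper: returns a data frame with `i` rows
f <- function(i) data.frame(id = i, value = seq_len(i))

# map() keeps the invariant: one output element per input element
pieces <- map(xs, f)

# The combining step is now explicit; this stands in for map_dfr(xs, f)
combined <- list_rbind(pieces)

length(pieces)
nrow(combined)
```

The combined result has 1 + 2 + 3 rows, not `length(xs)`, which is the sense in which a row-binding map cannot satisfy the length invariant. Note that `map_dfr()` binds with dplyr while `list_rbind()` uses vctrs, so edge cases may differ.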

* Fixes #376 by deprecating `map_dfc()`
* Fixes #405 by clearly ruling against `map_c()`
* Fixes #472 by deprecating `map_dfr()`
* Fixes #575 by introducing `list_c()`, `list_rbind()`, and `list_cbind()`
* Fixes #757 by deprecating `flatten_dfr()` and `flatten_dfc()`
* Fixes #758 by introducing `list_rbind()` and `list_cbind()`
* Part of #900
hadley authored Sep 8, 2022
1 parent d2896e2 commit 4f78bd3
Showing 46 changed files with 1,291 additions and 569 deletions.
7 changes: 4 additions & 3 deletions DESCRIPTION
@@ -17,7 +17,7 @@ Imports:
lifecycle (>= 1.0.1.9001),
magrittr (>= 1.5.0),
rlang (>= 0.4.10),
vctrs (>= 0.3.2)
vctrs (>= 0.4.1.9000)
Suggests:
covr,
dplyr (>= 0.7.8),
@@ -36,5 +36,6 @@ LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.1
Config/testthat/edition: 3
Remotes:
r-lib/lifecycle
Remotes:
r-lib/lifecycle,
r-lib/vctrs
4 changes: 4 additions & 0 deletions NAMESPACE
@@ -129,8 +129,12 @@ export(lift_lv)
export(lift_vd)
export(lift_vl)
export(list_along)
export(list_c)
export(list_cbind)
export(list_flatten)
export(list_merge)
export(list_modify)
export(list_rbind)
export(lmap)
export(lmap_at)
export(lmap_if)
12 changes: 11 additions & 1 deletion NEWS.md
@@ -18,7 +18,8 @@
manipulation that is very uncommon in R code (#871).

* `splice()` is deprecated because we no longer believe that automatic
splicing makes for good UI. Instead use `list2()` + `!!!` (#869).
splicing makes for good UI. Instead use `list2()` + `!!!` or
`list_flatten()` (#869).

* `as_function()`, `at_depth()`, and the `...f` argument to `partial()`
are no longer supported. They have been defunct for quite some time.
@@ -36,8 +37,17 @@
* `*_raw()` have been deprecated because they are of limited use and you can
now use `map_vec()` instead (#903).

* `flatten()` and friends are all deprecated in favour of `list_flatten()`,
`list_c()`, `list_cbind()`, and `list_rbind()`.

* `*_dfc()` and `*_dfr()` have been deprecated in favour of using the
appropriate map function along with `list_rbind()` or `list_cbind()` (#912).

## Features and fixes

* New `list_c()`, `list_rbind()`, and `list_cbind()` make it easy to
`c()`, `rbind()`, or `cbind()` all of the elements in a list.

* `_lgl()`, `_int()`, and `_dbl()` now use the same (strict) coercion
methods as vctrs (#904). This means that:

2 changes: 1 addition & 1 deletion R/arrays.R
@@ -59,7 +59,7 @@ array_branch <- function(array, margin = NULL) {
}
as.list(array)
} else {
flatten(apply(array, margin, list))
list_flatten(apply(array, margin, list))
}
}

47 changes: 36 additions & 11 deletions R/flatten.R
@@ -1,8 +1,15 @@
#' Flatten a list of lists into a simple vector.
#'
#' These functions remove a level of hierarchy from a list. They are similar to
#' [unlist()], but they only ever remove a single layer of hierarchy and they
#' are type-stable, so you always know what the type of the output is.
#' @description
#' `r lifecycle::badge("deprecated")`
#'
#' These functions have been deprecated because their behavior was inconsistent.
#'
#' * `flatten()` has been replaced by [list_flatten()].
#' * `flatten_lgl()`, `flatten_int()`, `flatten_dbl()`, and `flatten_chr()`
#' have been replaced by [list_c()].
#' * `flatten_dfr()` and `flatten_dfc()` have been replaced by [list_rbind()]
#' and [list_cbind()] respectively.
#'
#' @param .x A list to flatten. The contents of the list can be anything for
#' `flatten()` (as a list is returned), but the contents must match the
@@ -14,50 +21,61 @@
#' `flatten_dfr()` and `flatten_dfc()` return data frames created by
#' row-binding and column-binding respectively. They require dplyr to
#' be installed.
#' @keywords internal
#' @inheritParams map
#' @export
#' @examples
#' x <- rerun(2, sample(4))
#' x <- map(1:3, ~ sample(4))
#' x
#' x %>% flatten()
#' x %>% flatten_int()
#'
#' # You can use flatten in conjunction with map
#' x %>% map(1L) %>% flatten_int()
#' # But it's more efficient to use the typed map instead.
#' x %>% map_int(1L)
#' # was
#' x %>% flatten_int() %>% str()
#' # now
#' x %>% list_c() %>% str()
#'
#' x <- list(list(1, 2), list(3, 4))
#' # was
#' x %>% flatten() %>% str()
#' # now
#' x %>% list_flatten() %>% str()
flatten <- function(.x) {
lifecycle::deprecate_warn("0.4.0", "flatten()", "list_flatten()")
.Call(flatten_impl, .x)
}

#' @export
#' @rdname flatten
flatten_lgl <- function(.x) {
lifecycle::deprecate_warn("0.4.0", "flatten_lgl()", "list_c()")
.Call(vflatten_impl, .x, "logical")
}

#' @export
#' @rdname flatten
flatten_int <- function(.x) {
lifecycle::deprecate_warn("0.4.0", "flatten_int()", "list_c()")
.Call(vflatten_impl, .x, "integer")
}

#' @export
#' @rdname flatten
flatten_dbl <- function(.x) {
lifecycle::deprecate_warn("0.4.0", "flatten_dbl()", "list_c()")
.Call(vflatten_impl, .x, "double")
}

#' @export
#' @rdname flatten
flatten_chr <- function(.x) {
lifecycle::deprecate_warn("0.4.0", "flatten_chr()", "list_c()")
.Call(vflatten_impl, .x, "character")
}


#' @export
#' @rdname flatten
flatten_dfr <- function(.x, .id = NULL) {
lifecycle::deprecate_warn("0.4.0", "flatten_dfr()", "list_rbind()")
check_installed("dplyr", "for `flatten_dfr()`.")

res <- .Call(flatten_impl, .x)
@@ -67,6 +85,7 @@ flatten_dfr <- function(.x, .id = NULL) {
#' @export
#' @rdname flatten
flatten_dfc <- function(.x) {
lifecycle::deprecate_warn("0.4.0", "flatten_dfc()", "list_cbind()")
check_installed("dplyr", "for `flatten_dfc()`.")

res <- .Call(flatten_impl, .x)
@@ -76,4 +95,10 @@ flatten_dfc <- function(.x) {
#' @export
#' @rdname flatten
#' @usage NULL
flatten_df <- flatten_dfr
flatten_df <- function(.x, .id = NULL) {
lifecycle::deprecate_warn("0.4.0", "flatten_df()", "list_rbind()")
check_installed("dplyr", "for `flatten_df()`.")

res <- .Call(flatten_impl, .x)
dplyr::bind_rows(res, .id = .id)
}
14 changes: 0 additions & 14 deletions R/imap.R
@@ -59,20 +59,6 @@ imap_dbl <- function(.x, .f, ...) {
}


#' @rdname imap
#' @export
imap_dfr <- function(.x, .f, ..., .id = NULL) {
.f <- as_mapper(.f, ...)
map2_dfr(.x, vec_index(.x), .f, ..., .id = .id)
}

#' @rdname imap
#' @export
imap_dfc <- function(.x, .f, ...) {
.f <- as_mapper(.f, ...)
map2_dfc(.x, vec_index(.x), .f, ...)
}

#' @export
#' @rdname imap
iwalk <- function(.x, .f, ...) {
79 changes: 79 additions & 0 deletions R/list-combine.R
@@ -0,0 +1,79 @@
#' Combine list elements into a single data structure
#'
#' @description
#' * `list_c()` combines elements into a vector by concatenating them together
#' with [vctrs::vec_c()].
#'
#' * `list_rbind()` combines elements into a data frame by row-binding them
#' together with [vctrs::vec_rbind()].
#'
#' * `list_cbind()` combines elements into a data frame by column-binding them
#' together with [vctrs::vec_cbind()].
#'
#' @param x A list. For `list_rbind()` and `list_cbind()` the list must
#' only contain data frames.
#' @param ptype An optional prototype to ensure that the output type is always
#' the same.
#' @param id By default, `names(x)` are lost. Alternatively, supply a string
#' and the names will be saved into a column with name `{id}`. If `id`
#' is supplied and `x` is not named, the position of the elements will
#' be used instead of the names.
#' @param size An optional integer size to ensure that every input has the
#' same size (i.e. number of rows).
#' @param name_repair One of `"unique"`, `"universal"`, or `"check_unique"`.
#' See [vctrs::vec_as_names()] for the meaning of these options.
#' @export
#' @examples
#' x1 <- list(a = 1, b = 2, c = 3)
#' list_c(x1)
#'
#' x2 <- list(
#' a = data.frame(x = 1:2),
#' b = data.frame(y = "a")
#' )
#' list_rbind(x2)
#' list_rbind(x2, id = "id")
#' list_rbind(unname(x2), id = "id")
#'
#' list_cbind(x2)
list_c <- function(x, ptype = NULL) {
vec_check_list(x)
vctrs::vec_unchop(x, ptype = ptype)
}

#' @export
#' @rdname list_c
list_cbind <- function(
x,
name_repair = c("unique", "universal", "check_unique"),
size = NULL
) {
check_list_of_data_frames(x)

vctrs::vec_cbind(!!!x, .name_repair = name_repair, .size = size, .call = current_env())
}

#' @export
#' @rdname list_c
list_rbind <- function(x, id = rlang::zap(), ptype = NULL) {
check_list_of_data_frames(x)

vctrs::vec_rbind(!!!x, .names_to = id, .ptype = ptype, .call = current_env())
}


check_list_of_data_frames <- function(x, error_call = caller_env()) {
vec_check_list(x, call = error_call)

is_df <- map_lgl(x, is.data.frame)

if (all(is_df)) {
return()
}

bad <- which(!is_df)
cli::cli_abort(
"All elements of {.arg x} must be data frames. Elements {bad} are not.",
call = error_call
)
}
55 changes: 55 additions & 0 deletions R/list-flatten.R
@@ -0,0 +1,55 @@
#' Flatten a list
#'
#' Flattening a list removes a single layer of internal hierarchy,
#' i.e. it inlines elements that are lists leaving non-lists alone.
#'
#' @param x A list.
#' @param name_spec If both inner and outer names are present, control
#' how they are combined. Should be a glue specification that uses
#' variables `inner` and `outer`.
#' @param name_repair One of `"minimal"`, `"unique"`, `"universal"`, or
#' `"check_unique"`. See [vctrs::vec_as_names()] for the meaning of these
#' options.
#' @return A list. The list might be shorter if `x` contains empty lists,
#' the same length if it contains lists of length 1 or no sub-lists,
#' or longer if it contains lists of length > 1.
#' @export
#' @examples
#' x <- list(1, list(2, 3), list(4, list(5)))
#' x %>% list_flatten() %>% str()
#' x %>% list_flatten() %>% list_flatten() %>% str()
#'
#' # Flat lists are left as is
#' list(1, 2, 3, 4, 5) %>% list_flatten() %>% str()
#'
#' # Empty lists will disappear
#' list(1, list(), 2, list(3)) %>% list_flatten() %>% str()
#'
#' # Another way to see this is that it reduces the depth of the list
#' x <- list(
#' list(),
#' list(list())
#' )
#' x %>% pluck_depth()
#' x %>% list_flatten() %>% pluck_depth()
#'
#' # Use name_spec to control how inner and outer names are combined
#' x <- list(x = list(a = 1, b = 2), y = list(c = 1, d = 2))
#' x %>% list_flatten() %>% names()
#' x %>% list_flatten(name_spec = "{outer}") %>% names()
#' x %>% list_flatten(name_spec = "{inner}") %>% names()
list_flatten <- function(
x,
name_spec = "{outer}_{inner}",
name_repair = c("minimal", "unique", "check_unique", "universal")
) {
vec_check_list(x)

x <- map_if(x, vec_is_list, identity, .else = list)
vec_unchop(
x,
ptype = list(),
name_spec = name_spec,
name_repair = name_repair
)
}
9 changes: 7 additions & 2 deletions R/lmap.R
@@ -19,7 +19,8 @@
#' @inheritParams map_if
#' @inheritParams map_at
#' @inheritParams map
#' @return A list. There are no guarantees about the length.
#' @return A list or data frame, matching `.x`. There are no guarantees about
#' the length.
#' @family map variants
#' @export
#' @examples
@@ -86,5 +87,9 @@ lmap_helper <- function(.x, .ind, .f, ..., .else = NULL) {
out[[i]] <- res
}

flatten(out)
if (is.data.frame(.x)) {
list_cbind(out)
} else {
list_flatten(out)
}
}