diff --git a/NEWS.md b/NEWS.md
index 1fc2f97f..48fe9035 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,4 +1,4 @@
-# janitor 2.0.1 (unreleased)
+# janitor 2.0.1 (2020-04-12)
 
 ## Bug fixes and Breaking changes
 
diff --git a/docs/404.html b/docs/404.html
index 453c9135..7f43f6e3 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -8,11 +8,13 @@
 Page not found (404) • janitor
+
+
@@ -36,10 +38,12 @@
+
+
@@ -67,7 +71,7 @@
 janitor
-2.0.0
+2.0.1
@@ -101,7 +105,6 @@
 Changelog
-
@@ -153,7 +153,7 @@

library(readxl); library(janitor); library(dplyr); library(here)
 
 roster_raw <- read_excel(here("dirty_data.xlsx")) # available at http://github.com/sfirke/janitor
-glimpse(roster_raw)
+glimpse(roster_raw)
 #> Rows: 13
 #> Columns: 11
 #> $ `First Name`        <chr> "Jason", "Jason", "Alicia", "Ada", "Desus", "Chien-Shiung", "Chien-Shiung", NA,…
@@ -173,7 +173,7 @@ 

 .name_repair = make_clean_names) # Tells read_excel() how to repair repetitive column names, overriding the
 # default repair setting
-glimpse(roster_raw_cleaner)
+glimpse(roster_raw_cleaner)
 #> Rows: 13
 #> Columns: 11
 #> $ first_name <chr> "Jason", "Jason", "Alicia", "Ada", "Desus", "Chien-Shiung", "Chien-Shiung", NA, "…
@@ -190,9 +190,9 @@

This can be further cleaned:

roster <- roster_raw_cleaner %>%
   remove_empty(c("rows", "cols")) %>%
-  mutate(hire_date = excel_numeric_to_date(hire_date),
-         cert = coalesce(certification, certification_2)) %>% # from dplyr
-  select(-certification, -certification_2) # drop unwanted columns
+  mutate(hire_date = excel_numeric_to_date(hire_date),
+         cert = coalesce(certification, certification_2)) %>% # from dplyr
+  select(-certification, -certification_2) # drop unwanted columns
 
 roster
 #> # A tibble: 12 x 8
@@ -227,7 +227,7 @@ 

Finding duplicates

Use get_dupes() to identify and examine duplicate records during data cleaning. Let’s see if any teachers are listed more than once:

-roster %>% get_dupes(contains("name"))
+roster %>% get_dupes(contains("name"))
 #> # A tibble: 4 x 9
 #>   first_name   last_name dupe_count employee_status subject   hire_date  percent_allocat… full_time cert      
 #>   <chr>        <chr>          <int> <chr>           <chr>     <date>                <dbl> <chr>     <chr>     
@@ -272,7 +272,7 @@ 

#> <NA> 2 0.16666667 NA

Two variables:

roster %>%
-  filter(hire_date > as.Date("1950-01-01")) %>%
+  filter(hire_date > as.Date("1950-01-01")) %>%
   tabyl(employee_status, full_time)
 #>  employee_status No Yes
 #>   Administration  0   1
@@ -319,7 +319,7 @@ 

  • submit suggestions and report bugs: https://github.com/sfirke/janitor/issues
-  • let me know what you think on Mastodon @samfirke@a2mi.social
+  • let me know what you think on Mastodon: @samfirke@a2mi.social
  • compose a friendly e-mail to: samuel.firke AT gmail
diff --git a/docs/issue_template.html b/docs/issue_template.html
index 2d9cf95e..ce4dc689 100644
--- a/docs/issue_template.html
+++ b/docs/issue_template.html
@@ -8,11 +8,13 @@
 NA • janitor
+
+
@@ -36,10 +38,12 @@
+
+
@@ -67,7 +71,7 @@
 janitor
-2.0.0
+2.0.1

@@ -101,7 +105,6 @@ Changelog -
@@ -101,7 +105,6 @@ Changelog -
+
+

+janitor 2.0.1 (2020-04-12) 2020-04-12
+

+
+

+Bug fixes and Breaking changes

+

Transliteration of characters within make_clean_names() now operates across operating systems, independent of differences in stringi installations (Fix #365, thanks to @eamoncaddigan for reporting and @billdenney for fixing).

+

This bug patch represents a breaking change with the way that make_clean_names() worked in janitor versions 1.2.1.9000 and 2.0.0 as the transliterations are now more generalized and follow a more best-practice approach to transliterating to ASCII.
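To see what name cleaning looks like in practice, here is a minimal sketch. The input names are hypothetical, chosen to resemble the spreadsheet headers used elsewhere in these docs; the last one contains a non-ASCII character, and its exact cleaned form may differ between janitor versions, which is the point of the note above.

```r
library(janitor)

# Hypothetical messy column names (not taken from the package's test data)
messy <- c("First Name", "% Allocated", "Hire Date", "Caf\u00e9 Name")

# make_clean_names() returns a character vector of cleaned names
clean <- make_clean_names(messy)
print(clean)
```

If your pipeline depends on specific cleaned names, it may be worth comparing the output before and after upgrading.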

+
+

-janitor 2.0.0 (2020-04-07) Unreleased
+janitor 2.0.0 (2020-04-07) 2020-04-08

-Breaking Changes

+Breaking changes

  • clean_names() and make_clean_names() are now more locale-independent and translation to ASCII is simpler (in many cases, Unicode is removed, e.g., the Greek character “delta” becomes a “d”). You may also now control how substitutions occur and add your own substitutions (like “%” becoming “percent”). As a result of these changes, the clean names generated by these functions may break with what was produced in prior versions of janitor. (Fix #331, thanks to @billdenney)
@@ -255,7 +269,7 @@

-Major Features

+Major features

The new function row_to_names() handles the case where a dirty data file is read in with its names stored as a row of the data.frame, rather than in the names. This function sets the names of the data.frame to this row and optionally cleans up the rows above and including where the names were stored. Thanks to @billdenney for writing this feature.
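A minimal sketch of the situation described above, assuming a small hypothetical data.frame whose real header was read in as the first data row (the column names `X1`/`X2` and the values are made up for illustration):

```r
library(janitor)

# Hypothetical data frame: the true column names sit in row 1
raw <- data.frame(
  X1 = c("id", "1", "2"),
  X2 = c("score", "90", "85"),
  stringsAsFactors = FALSE
)

# Promote row 1 to the names; by default that row is also dropped from the data
fixed <- row_to_names(raw, row_number = 1)
print(fixed)
```

The values stay as character here, so a type conversion step would usually follow.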

@@ -288,7 +302,7 @@

A fully-overhauled tabyl

tabyl() is now a single function that can count combinations of one, two, or three variables, ala base R’s table(). The resulting tabyl data.frames can be manipulated and formatted using a family of adorn_ functions. See the tabyls vignette for more.

-The now-redundant legacy functions crosstab() and adorn_crosstab() have been deprecated, but remain in the package for now. Existing code that relies on the version of tabyl present in janitor versions <= 0.3.1 will break if the sort argument was used, as that argument no longer exists in tabyl (use dplyr::arrange() instead).
+The now-redundant legacy functions crosstab() and adorn_crosstab() have been deprecated, but remain in the package for now. Existing code that relies on the version of tabyl present in janitor versions <= 0.3.1 will break if the sort argument was used, as that argument no longer exists in tabyl (use dplyr::arrange() instead).
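As a rough illustration of the changelog entry above, the sketch below uses the built-in `mtcars` dataset (an assumption for the example, not data from the janitor docs) to show sorting a one-variable tabyl with `dplyr::arrange()` in place of the removed `sort` argument, and formatting a two-variable tabyl with `adorn_` functions.

```r
library(janitor)
library(dplyr)

# One-variable tabyl, ordered by count via arrange() rather than a sort argument
by_count <- mtcars %>%
  tabyl(cyl) %>%
  arrange(desc(n))
print(by_count)

# Two-variable tabyl, with a totals row and row-wise percentages
cross <- mtcars %>%
  tabyl(cyl, gear) %>%
  adorn_totals("row") %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 1)
print(cross)
```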

@@ -302,7 +316,7 @@

-Major Features

+Major features

-Minor Features

+Minor features

-Major Features

+Major features

Deprecated the following functions:

-Minor Features

+Minor features
@@ -101,7 +105,6 @@
 Changelog
-
@@ -118,7 +129,6 @@
 Changelog
-