Commit

more notes to planning page
sfirke committed Dec 23, 2016
1 parent 5795251 commit e459789
Showing 2 changed files with 14 additions and 4 deletions.
8 changes: 6 additions & 2 deletions planning.Rmd
@@ -25,7 +25,7 @@ This page is for planning the janitor package, at a high level. More-finite que

### Purpose

Provide a framework and associated functions for checking and cleaning dirty data.
Provide a framework and associated functions for checking and cleaning dirty data. There are two kinds of checks: interactive checks, like `tabyl`, and programmatic checks that, say, confirm in production that some variables contain no duplicate records, or contain no missing values.

#### In scope:

@@ -39,10 +39,14 @@ Provide a framework and associated functions for checking and cleaning dirty dat

### Priorities

#### Big items
1. Establish organizing framework for the package
1. Write up documentation/vignette showing check/clean iterative cycle
2. Rename functions to fit schema
2. Figure out new names for functions to fit schema, and rename them
3. Redo vignette
4. Redo homepage
2. New function: fuzzy dupes
3. New function family: bindability issues

#### Smaller items
1. `get_dupes` should have an option for returning with or without Ns as a column (it's so much faster without)
10 changes: 8 additions & 2 deletions planning.md
@@ -15,7 +15,7 @@ Package vision

### Purpose

Provide a framework and associated functions for checking and cleaning dirty data.
Provide a framework and associated functions for checking and cleaning dirty data. There are two kinds of checks: interactive checks, like `tabyl`, and programmatic checks that, say, confirm in production that some variables contain no duplicate records, or contain no missing values.

#### In scope:

@@ -29,11 +29,17 @@ Provide a framework and associated functions for checking and cleaning dirty dat

### Priorities

#### Big items

1. Establish organizing framework for the package
1. Write up documentation/vignette showing check/clean iterative cycle
2. Rename functions to fit schema
2. Figure out new names for functions to fit schema, and rename them
3. Redo vignette
4. Redo homepage

2. New function: fuzzy dupes
3. New function family: bindability issues

#### Smaller items

1. `get_dupes` should have an option for returning with or without Ns as a column (it's so much faster without)
