1 parent
194e5c3
commit 61c0edb
Showing 4 changed files with 149 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
Package: econdatar | ||
Title: Automation of data tasks to and from Codera Analytics' econometric data service | ||
Version: 3.0.0 | ||
Version: 3.0.1 | ||
Date: 2024-03-01 | ||
Authors@R: c(person("Byron", "Botha", role = c("aut", "cre"), email = "[email protected]"), | ||
person("Sebastian", "Krantz", role = "ctb")) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
\name{read_dataset} | ||
\alias{read_dataset} | ||
\alias{tidy_data} | ||
\title{ | ||
read_dataset | ||
} | ||
\description{ | ||
Returns the data for the given data set - ECONDATA:id(version), as a list, or as tidy \emph{data.table}'s. Available data sets can be looked up using \code{read_database()} or from the web platform (http://econdata.co.za). Tidying can be done directly within \code{read_dataset()}, or ex-post using \code{tidy_data()}. | ||
} | ||
\usage{ | ||
read_dataset(id, tidy = FALSE, \dots) | ||
tidy_data(x, \dots) | ||
} | ||
\arguments{ | ||
\item{id}{Data set identifier.} | ||
\item{x}{A raw API return object to be tidied. Can also be done directly in \code{read_dataset()} by setting \code{tidy = TRUE}. See \code{tidy} below for tidying options.} | ||
\item{\dots}{Further \emph{Optional} arguments: | ||
\tabular{llll}{ | ||
\code{agencyid} \tab\tab character. Agency responsible for the metadata creation/maintenance. \cr | ||
\code{version} \tab\tab character. Version(s) of the data (different versions will have different metadata), or 'all' to return all available versions. \cr | ||
\code{series_key} \tab\tab character. A character vector specifying a subset of time series (see the web platform (export function) for details). \cr | ||
\code{release} \tab\tab character (optionally with format \%Y-\%m-\%dT\%H:\%M:\%S, to be coerced to a date/time). One of: a release description, which returns the data associated with that release (if the given description matches an existing release); a date/time, which returns the data as it was at the given time; 'latest', which returns the latest release; or 'unreleased', which returns any unreleased data (useful for data that is updated more often than it is released, e.g. daily data). \cr | ||
\code{file} \tab\tab character. File name for retrieving data sets stored as JSON data from disk (output of \code{read_dataset()}). \cr | ||
\code{username} \tab\tab character. Web username. \cr | ||
\code{password} \tab\tab character. Web password. \cr | ||
} | ||
} | ||
\item{tidy}{logical. Return data and metadata in tidy \emph{data.table}'s (see Value), by passing the result through \code{tidy_data}. If \code{TRUE}, \code{read_dataset()/tidy_data()} admit the following additional arguments: | ||
\tabular{llll}{ | ||
\code{wide} \tab\tab logical, default: \code{TRUE}. Returns data in a column-based format, with \code{"label"} and \code{"source_identifier"} attributes on columns (when available) and an overall \code{"metadata"} attribute on the table. If \code{FALSE}, a long format is returned. See Value. \cr | ||
|
||
\code{codelabel} \tab\tab logical, default: \code{FALSE}. If \code{wide = TRUE}, setting \code{codelabel = TRUE} uses the data key to generate the \code{"label"}, when available. \cr | ||
|
||
\code{combine} \tab\tab logical, default: \code{FALSE}. If \code{wide = FALSE}, setting \code{combine = TRUE} will combine all data and metadata into a single long table, whereas the default \code{FALSE} will return data and metadata in separate tables, for more efficient storage. \cr | ||
|
||
\code{origmeta} \tab\tab logical, default: \code{FALSE}. If \code{wide = FALSE}, setting \code{origmeta = TRUE} will combine all metadata fields attached to the series in the dataset as they are. The default is to construct a standardized set of metadata variables, and then drop those not observed. See also \code{allmeta}. \cr | ||
|
||
\code{allmeta} \tab\tab logical, default: \code{FALSE}. If \code{wide = FALSE}, setting \code{allmeta = TRUE} always returns the full set of metadata fields, regardless of whether they are recorded for the given dataset. It is also possible that there are series with zero observations in a dataset. Such series are dropped in tidy output, but if \code{combine = FALSE}, \code{allmeta = TRUE} retains their metadata in the metadata table. \cr | ||
|
||
\code{prettymeta} \tab\tab logical, default: \code{TRUE}. Attempts to make the returned metadata more human-readable by replacing each code category and enumeration with its name. It is advisable to leave this set to \code{TRUE}; where speed is paramount, you may want to set it to \code{FALSE}. If multiple datasets are being queried, this option is automatically set to \code{FALSE}. \cr | ||
|
||
\code{release} \tab\tab logical, default: \code{FALSE}. \code{TRUE} allows you to apply \code{tidy_data()} to objects returned by \code{read_release()}. All other flags to \code{tidy_data()} are ignored. | ||
} | ||
} | ||
} | ||
\details{ | ||
An EconData account (http://econdata.co.za) is required to use this function. The user must provide their credentials either through the function arguments, or by setting the ECONDATA_CREDENTIALS environment variable using the syntax: "username;password", e.g. \code{Sys.setenv(ECONDATA_CREDENTIALS="username;password")}. If credentials are not supplied by the aforementioned methods a GUI dialog will prompt the user for credentials. | ||
} | ||
\value{ | ||
%% ~Describe the value returned | ||
If \code{tidy = FALSE}, a list of data frames is returned, where the names of the list are the EconData series codes, and each data frame has a single column named 'OBS_VALUE' containing the data, with corresponding dates attached as rownames. Each data frame further has a \code{"metadata"} attribute providing information about the series. The entire list of data frames also has a \code{"metadata"} attribute, providing information about the dataset. If multiple datasets (or versions of a dataset if \code{version} is specified as 'all') are being queried, a list of such lists is returned. | ||
|
||
If \code{tidy = TRUE} and \code{wide = TRUE} (the default), a single \emph{data.table} is returned where the first column is the date, and the remaining columns are series named by their EconData codes. Each series has two attributes: \code{"label"} provides a variable label combining important metadata from the \code{"metadata"} attribute in the non-tidy format, and \code{"source.code"} gives the series code assigned by the original data provider. The table has the same dataset-level \code{"metadata"} attribute as the list of data frames if \code{tidy = FALSE}. If multiple datasets (or versions of a dataset if \code{version} is specified as 'all') are being queried, a list of such \emph{data.table}'s is returned. | ||
If \code{tidy = TRUE} and \code{wide = FALSE} and \code{combine = FALSE} (the default), a named list of two \emph{data.table}'s is returned. The first, \code{"data"}, has columns 'code', 'date' and 'value' providing the data in a long format. The second, \code{"metadata"}, provides dataset and series-level metadata, with one row for each series. If \code{combine = TRUE}, these two tables are combined, with all repetitive content converted to factors for more efficient storage. If multiple datasets (or versions of a dataset if \code{version} is specified as 'all') are being queried, \code{combine = FALSE} gives a nested list, whereas \code{combine = TRUE} binds everything together into a single long frame. In general, if \code{wide = FALSE}, no attributes are attached to the tables or columns in the tables. | ||
|
||
%% \item{comp1 }{Description of 'comp1'} | ||
%% \item{comp2 }{Description of 'comp2'} | ||
%% ... | ||
} | ||
|
||
\seealso{ | ||
%% ~~objects to See Also as \code{\link{help}}, ~~~ | ||
\code{\link{write_dataset}} | ||
\code{\link{read_release}} | ||
} | ||
\examples{ | ||
\dontrun{ | ||
# library(econdatar) | ||
# Sys.setenv(ECONDATA_CREDENTIALS="username;password") | ||
# for ids/versions see: https://www.econdata.co.za/app | ||
|
||
# Electricity Generated | ||
ELECTRICITY <- read_dataset(id = "ELECTRICITY") | ||
ELECTRICITY_WIDE <- tidy_data(ELECTRICITY) # Or: read_dataset("ELECTRICITY", tidy = TRUE) | ||
ELECTRICITY_LONG <- tidy_data(ELECTRICITY, wide = FALSE) | ||
# Same as tidy_data(ELECTRICITY, wide = FALSE, combine = TRUE): | ||
with(ELECTRICITY_LONG, metadata[data, on = "data_key"]) | ||
|
||
# CPI Analytical Series: Different Revisions | ||
CPI_ANL <- read_dataset(id = "CPI_ANL_SERIES", version = "all") | ||
CPI_ANL_WIDE <- tidy_data(CPI_ANL) | ||
CPI_ANL_LONG <- tidy_data(CPI_ANL, wide = FALSE, combine = TRUE) | ||
CPI_ANL_ALLMETA <- tidy_data(CPI_ANL, wide = FALSE, allmeta = TRUE) # v2.0 has some 0-obs series | ||
|
||
# Can query a specific version by adding e.g. version = "2.0.0" to the call | ||
|
||
# Returns 5-10 years (daily average bond yields) not yet contained in the latest release | ||
# (particularly useful for daily data that is released monthly) | ||
MARKET_RATES <- read_dataset(id = "MARKET_RATES", series_key = "CMJD003.B.A", release = "unreleased") | ||
} | ||
} | ||
% Add one or more standard keywords, see file 'KEYWORDS' in the | ||
% R documentation directory. | ||
\keyword{ load }% use one of RShowDoc("KEYWORDS") | ||
\keyword{ download }% __ONLY ONE__ keyword per line |
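The Value section above describes the raw (\code{tidy = FALSE}) return as a list of data frames, each with an 'OBS_VALUE' column, dates as rownames, and a "metadata" attribute. The base R sketch below builds a small mock of that shape and stacks it into a long 'code'/'date'/'value' table by hand. It does not call the EconData API and is not the package's own implementation of \code{tidy_data()}; the series codes, dates, values and the \code{LABEL} metadata field are all made up for illustration.

```r
# Mock of the raw (tidy = FALSE) structure described in the Value section.
# All codes, dates, values and metadata fields here are hypothetical.
make_series <- function(values, dates, meta) {
  df <- data.frame(OBS_VALUE = values, row.names = dates)
  attr(df, "metadata") <- meta  # series-level metadata
  df
}

dates <- c("2023-01-01", "2023-02-01", "2023-03-01")
raw <- list(
  SERIES_A = make_series(c(1.0, 1.5, 2.0), dates, list(LABEL = "Hypothetical series A")),
  SERIES_B = make_series(c(10, 11, 12), dates, list(LABEL = "Hypothetical series B"))
)
attr(raw, "metadata") <- list(name = "Hypothetical dataset")  # dataset-level metadata

# Stack the per-series data frames into one long table
long <- do.call(rbind, lapply(names(raw), function(code) {
  data.frame(
    code  = code,
    date  = as.Date(rownames(raw[[code]])),
    value = raw[[code]]$OBS_VALUE,
    stringsAsFactors = FALSE
  )
}))

# One metadata row per series, keyed by the same code
meta <- data.frame(
  code  = names(raw),
  label = unname(vapply(raw, function(s) attr(s, "metadata")$LABEL, character(1))),
  stringsAsFactors = FALSE
)

print(long)
print(meta)
```

Keeping data and metadata in separate tables keyed by series code mirrors the two-table layout described for \code{wide = FALSE}; merging them on the code column would give a single combined long table.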
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
\name{write_dataset} | ||
\alias{write_dataset} | ||
\title{ | ||
write_dataset | ||
} | ||
\description{ | ||
Saves the data set. Available data sets can be looked up from the web platform (http://econdata.co.za). | ||
} | ||
\usage{ | ||
write_dataset(x, method = c("stage", "validate"), \dots) | ||
} | ||
\arguments{ | ||
\item{x}{Data set to upload.} | ||
\item{method}{Desired method. "stage" will stage the given data making it ready for release. "validate" will validate the given data against the schema derived from the data structure definition.} | ||
|
||
\item{\dots}{Further \emph{Optional} arguments: | ||
\tabular{llll}{ | ||
\code{file} \tab\tab character. File name for saving data set as JSON data to disk. \cr | ||
\code{username} \tab\tab character. EconData username. \cr | ||
\code{password} \tab\tab character. EconData password. \cr | ||
} | ||
} | ||
} | ||
\details{ | ||
An EconData account (http://econdata.co.za) is required to use this function. The user must provide their credentials either through the function arguments, or by setting the ECONDATA_CREDENTIALS environment variable using the syntax: "username;password". If credentials are not supplied by the aforementioned methods a GUI dialog will prompt the user for credentials. | ||
|
||
\emph{write_dataset} saves the data set according to the function arguments. Because this modifies the database, the calling user requires higher privileges than other \emph{econdatar} functions need: \emph{membership} with the relevant data provider. | ||
} | ||
|
||
%% ~Make other sections like Warning with \section{Warning }{....} ~ | ||
|
||
\seealso{ | ||
%% ~~objects to See Also as \code{\link{help}}, ~~~ | ||
\code{\link{read_dataset}} | ||
\code{\link{write_release}} | ||
} | ||
\examples{ | ||
\dontrun{ | ||
x <- read_dataset("MINING") | ||
|
||
write_dataset(x, file = "mining.json") | ||
} | ||
} | ||
% Add one or more standard keywords, see file 'KEYWORDS' in the | ||
% R documentation directory. | ||
\keyword{ save }% use one of RShowDoc("KEYWORDS") | ||
\keyword{ upload }% __ONLY ONE__ keyword per line |
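Both help pages describe supplying credentials through the ECONDATA_CREDENTIALS environment variable in the form "username;password". The base R sketch below shows one way such a value could be set and split into its two parts. How econdatar itself reads the variable is not shown in this diff, so \code{parse_credentials()} is a hypothetical helper written for illustration, not a function of the package, and the credential values are placeholders.

```r
# Store credentials in the documented "username;password" form.
# The values here are placeholders, not real credentials.
Sys.setenv(ECONDATA_CREDENTIALS = "username;password")

# Hypothetical helper (not part of econdatar): split the variable into
# username and password, failing loudly if the format is not as expected.
parse_credentials <- function(x = Sys.getenv("ECONDATA_CREDENTIALS")) {
  parts <- strsplit(x, ";", fixed = TRUE)[[1]]
  if (length(parts) != 2L || any(!nzchar(parts))) {
    stop("Expected credentials in the form 'username;password'")
  }
  list(username = parts[1], password = parts[2])
}

creds <- parse_credentials()
print(creds$username)
```

This sketch assumes neither part contains a semicolon; a password that does would be rejected by the length check rather than silently truncated.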