Rego Data Configuration File

| Parameter | Value |
|---|---|
| ## Data source | where the data comes from |
| data.source.type | one of `db` (database), `csv` (comma-separated values file), or `rdata` (RData file written by R's `save()`) |
| ## ...CSV spec | |
| csv.path | location of the CSV file, when the data source type is `csv` |
| csv.fname | CSV file name |
| csv.sep | field separator character (optional) |
| ## ...RData spec | |
| rdata.path | location of the RData file, when the data source type is `rdata` |
| rdata.fname | RData file name |
| rdata.dfname | name of the data-frame object in the RData file. Can be omitted if the RData file contains only one object. |
| ## ...DB spec | |
| db.dsn | |
| db.name | |
| db.tbl.name | |
| db.tbl.maxrows | |
| db.query.tmpl | a SQL query "template" file to use when fetching the data, e.g., `SELECT * FROM TBLNAME WHERE Y IN (0, 1) LIMIT MAXROWS` |
| ## Column specs | |
| col.types.fname | a text file of (column name, column type) pairs, where column type is 1 for continuous and 2 for categorical variables. Can be omitted if the data source type is `rdata`, in which case columns inheriting from factor are treated as categorical. |
| col.y | name of the response variable |
| col.id | name of the row-id column (optional). Often useful during prediction, when `<id, y, yHat>` tuples are generated. |
| col.weights | name of the weights column (optional) |
| row.weights.fname | name of a text file with customized weights for each row (no header, one weight per line) |
| col.skip.fname | name of a text file listing columns to skip (such as a row-id column), one column name per line (optional) |
| col.winz.fname | name of a text file with column-specific winsorizing parameters (optional) |
| ## Any preprocessing | |
| na.threshold | maximum fraction of NA values allowed per column (optional) |
| min.level.count | levels with fewer than this count will be merged |
| do.class.balancing | set to 1 to weight classes equally; 0 otherwise |
| ## HTML model report | |
| html.fname | file to which the model summary is written as an HTML report |
| html.title | |
| html.title2 | |
| html.min.var.imp | variables with an importance score lower than this are excluded from the HTML report |
| html.min.rule.imp | rules with an importance score lower than this are excluded from the HTML report |
| ## Other | |
| rand.seed | random number seed |
| log.level | one of kLogLevelDEBUG, kLogLevelINFO, kLogLevelWARNING, kLogLevelERROR, kLogLevelFATAL, kLogLevelCRITICAL. Controls the verbosity of logging messages. |
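
The sketch below illustrates one plausible reading of three of the parameters above: `db.query.tmpl`, `na.threshold`, and `min.level.count`. It is written in Python for illustration only and is not Rego's own code. The function names, the `_other_` label for merged levels, and the use of `None` to stand for NA are all hypothetical. It also assumes that `TBLNAME` and `MAXROWS` in the query template are placeholders filled from `db.tbl.name` and `db.tbl.maxrows`, and that columns exceeding `na.threshold` are candidates for exclusion; the table above does not state either of these explicitly.

```python
from collections import Counter


def render_query(template, tbl_name, max_rows):
    """Fill the TBLNAME and MAXROWS placeholders in a query template.

    Assumption: the placeholders are replaced by plain text substitution.
    """
    return template.replace("TBLNAME", tbl_name).replace("MAXROWS", str(max_rows))


def columns_over_na_threshold(columns, na_threshold):
    """Return names of columns whose fraction of missing values exceeds na_threshold.

    `columns` maps a column name to a list of values; None stands for NA.
    """
    over = []
    for name, values in columns.items():
        if not values:
            continue  # skip empty columns to avoid dividing by zero
        na_fraction = sum(v is None for v in values) / len(values)
        if na_fraction > na_threshold:
            over.append(name)
    return over


def merge_rare_levels(values, min_level_count, merged_label="_other_"):
    """Replace levels seen fewer than min_level_count times with one merged label.

    Assumption: rare levels are pooled into a single catch-all level.
    """
    counts = Counter(values)
    return [v if counts[v] >= min_level_count else merged_label for v in values]


query = render_query(
    "SELECT * FROM TBLNAME WHERE Y IN (0, 1) LIMIT MAXROWS", "my_table", 1000
)
sparse = columns_over_na_threshold(
    {"x1": [1.0, None, 3.0, 4.0], "x2": [None, None, None, 2.0]}, 0.5
)
merged = merge_rare_levels(["a", "a", "b", "c", "a"], 2)
```

With these inputs, `query` is the template with `my_table` and `1000` substituted, `sparse` contains only `x2` (75% missing, against 25% for `x1`), and `merged` keeps `a` while pooling the single occurrences of `b` and `c`.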
