Merge pull request #56 from GregFa/main
added function description for new subset functions
GregFa authored Aug 23, 2024
2 parents bf4eba4 + 121276b commit 7785f73
Showing 20 changed files with 459 additions and 202 deletions.
28 changes: 14 additions & 14 deletions src/io/export_to_type.jl
@@ -6,7 +6,7 @@ Creates a `Gmap` type/struct from gmap CSV file.
 # Argument
-- `filename` : A string containing the name(with directory) of the gmap CSV file.
+* `filename` : A string containing the name(with directory) of the gmap CSV file.
 # Output
@@ -56,7 +56,7 @@ Creates a `CrossType` type/struct from control file in json format.
 # Argument
-- `filename` : A string containing the name(with directory) of the control file in json format.
+* `filename` : A string containing the name(with directory) of the control file in json format.
 # Output
@@ -85,7 +85,7 @@ Creates a `Alleles` type/struct from control file in json format.
 # Argument
-- `filename` : A string containing the name(with directory) of the control file in json format.
+* `filename` : A string containing the name(with directory) of the control file in json format.
 # Output
@@ -112,7 +112,7 @@ Creates a `GenoType` type/struct from control file in json format.
 # Argument
-- `filename` : A string containing the name(with directory) of the control file in json format.
+* `filename` : A string containing the name(with directory) of the control file in json format.
 # Output
@@ -144,7 +144,7 @@ Creates a `GenoTranspose` type/struct from control file in json format.
 # Argument
-- `filename` : A string containing the name(with directory) of the control file in json format.
+* `filename` : A string containing the name(with directory) of the control file in json format.
 # Output
@@ -175,7 +175,7 @@ Creates a `Geno` type/struct from gmap CSV file and geno CSV file.
 # Argument
-- `filename` : A string containing the name(with directory) of the control file in json format.
+* `filename` : A string containing the name(with directory) of the control file in json format.
 # Output
@@ -289,7 +289,7 @@ Creates a `Pmap` type/struct from Pmap CSV file.
 # Argument
-- `filename` : A string containing the name(with directory) of the pmap CSV file.
+* `filename` : A string containing the name(with directory) of the pmap CSV file.
 # Output
@@ -354,7 +354,7 @@ Creates a `Pheno` type/struct from Pheno CSV file.
 # Argument
-- `filename` : A string containing the name(with directory) of the pheno CSV file.
+* `filename` : A string containing the name(with directory) of the pheno CSV file.
 # Output
@@ -399,7 +399,7 @@ Creates a `Phenocovar` type/struct from phenocovar CSV file.
 # Argument
-- `filename` : A string containing the name(with directory) of the phenocovar CSV file.
+* `filename` : A string containing the name(with directory) of the phenocovar CSV file.
 # Output
@@ -439,7 +439,7 @@ Creates a `Crossinfo` type/struct from crossinfo CSV file.
 # Argument
-- `filename` : A string containing the name(with directory) of the crossinfo CSV file.
+* `filename` : A string containing the name(with directory) of the crossinfo CSV file.
 # Output
@@ -480,7 +480,7 @@ Creates a `IsXChar` type/struct from gmap CSV file.
 # Argument
-- `filename` : A string containing the name(with directory) of the gmap CSV file.
+* `filename` : A string containing the name(with directory) of the gmap CSV file.
 # Output
@@ -510,7 +510,7 @@ Creates a `Covar` type/struct from gmap CSV file.
 # Argument
-- `filename` : A string containing the name(with directory) of the gmap CSV file.
+* `filename` : A string containing the name(with directory) of the gmap CSV file.
 # Output
@@ -554,7 +554,7 @@ If control file does not stipulate sex information, we assume all female
 # Argument
-- `filename` : A string containing the name(with directory) of the control file in json format.
+* `filename` : A string containing the name(with directory) of the control file in json format.
 # Output
@@ -601,7 +601,7 @@ Creates a `GeneticStudyData` type/struct from control file in json format.
 # Argument
-- `filename` : A string containing the name(with directory) of the control file in json format.
+* `filename` : A string containing the name(with directory) of the control file in json format.
 # Output
22 changes: 11 additions & 11 deletions src/io/io_helpers.jl
@@ -5,7 +5,7 @@ Parses a JSON file to a dictionary with keys containing the names of CSV file fo
 # Argument
-- `file` : A string containing the name of the JSON file
+* `file` : A string containing the name of the JSON file
 # Output
@@ -26,7 +26,7 @@ Writes a CSV file to data frame excluding the comments lines.
 # Argument
-- `filename` : A string containing the name of the CSVfile
+* `filename` : A string containing the name of the CSVfile
 # Output
@@ -51,10 +51,10 @@ end
 Locate the control file and determine the directory where it resides.
 # Arguments
-- `filename::String`: A string representing either the path to a file or a directory.
+* `filename::String`: A string representing either the path to a file or a directory.
 # Returns
-- `(String, String)`: A tuple containing the directory of the control file and the path
+* `(String, String)`: A tuple containing the directory of the control file and the path
 to the control file. If the provided `filename` is a directory, the function will search
 for JSON files within and return the path to the first JSON file found.
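Note on the docstring above: the lookup it describes can be sketched with Julia's standard library alone. This is a minimal illustration, not the package's implementation; the name `locate_control_sketch` is hypothetical, and raising an error when a directory holds no JSON file is an assumption (the docstring does not say what happens in that case).

```julia
# Hypothetical helper, for illustration only.
function locate_control_sketch(path::String)
    if isdir(path)
        # `readdir` returns sorted entry names; keep only the JSON files.
        json_files = filter(f -> endswith(f, ".json"), readdir(path))
        # Assumption: fail loudly when no control file is present.
        isempty(json_files) && error("no JSON file found in $(path)")
        return (path, joinpath(path, first(json_files)))
    else
        # A file path was given: its parent directory is the data directory.
        return (dirname(path), path)
    end
end

# Usage: a temporary directory containing a single control file.
dir = mktempdir()
write(joinpath(dir, "control.json"), "{}")
result_dir, result_file = locate_control_sketch(dir)
```

Passing `result_file` back in returns the same pair, since the file branch derives the directory with `dirname`.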
@@ -88,11 +88,11 @@ end
 Encode a matrix of genotype values using a dictionary of predefined mappings.
 # Arguments
-- `geno_dict::Dict{String, Any}`: A dictionary mapping genotype strings to integer codes.
-- `geno_val::AbstractArray`: An array of genotype strings to be encoded.
+* `geno_dict::Dict{String, Any}`: A dictionary mapping genotype strings to integer codes.
+* `geno_val::AbstractArray`: An array of genotype strings to be encoded.
 # Returns
-- `Matrix{Union{Missing, Int64}}` or `Matrix{Int64}`: A matrix of the same dimensions as
+* `Matrix{Union{Missing, Int64}}` or `Matrix{Int64}`: A matrix of the same dimensions as
 `geno_val` where each genotype string is replaced by its corresponding integer code.
 Genotypes not found in `geno_dict` are encoded as `missing`.
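Note on the docstring above: the encoding it describes amounts to a dictionary lookup per cell with `missing` as the fallback. A minimal sketch follows, assuming integer codes and a string matrix; `encode_genotypes_sketch` is a hypothetical name and the package's own function may differ in signature and return type.

```julia
# Hypothetical helper, for illustration only.
function encode_genotypes_sketch(geno_dict::AbstractDict{String},
                                 geno_val::AbstractMatrix{<:AbstractString})
    # Output matrix that can hold integer codes or `missing`.
    encoded = Matrix{Union{Missing, Int}}(missing, size(geno_val)...)
    for idx in eachindex(geno_val)
        # `get` returns the code when the genotype is known, else `missing`.
        encoded[idx] = get(geno_dict, geno_val[idx], missing)
    end
    return encoded
end

# Usage: "NA" is not in the dictionary, so it is encoded as `missing`.
codes = Dict("AA" => 1, "AB" => 2, "BB" => 3)
geno = ["AA" "AB"; "BB" "NA"]
encoded = encode_genotypes_sketch(codes, geno)
```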
@@ -156,11 +156,11 @@ end
 Check if a specified key exists in the given dictionary and return its corresponding value.
 # Arguments
-- `control_dict::Dict`: A dictionary from which the value associated with a key is to be retrieved.
-- `s::String`: The key for which the existence and value are checked within the dictionary.
+* `control_dict::Dict`: A dictionary from which the value associated with a key is to be retrieved.
+* `s::String`: The key for which the existence and value are checked within the dictionary.
 # Returns
-- `Any`: Returns the value associated with the key `s` in `control_dict` if it exists.
+* `Any`: Returns the value associated with the key `s` in `control_dict` if it exists.
 # Throws
 - Throws an error if the key `s` is not found in `control_dict`.
@@ -177,7 +177,7 @@ function check_key(control_dict::Dict, s::String)
     if (in(s, keys(control_dict)))
         val = control_dict[s]
     else
-        @warn "Error: $(s) not found in control file"
+        @warn "$(s) file not found in control file"
         val = missing
         # throw("Error: $(s) not found in control file")
     end
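Note on this hunk: as shown, the missing-key branch logs a warning and yields `missing`; the `throw` call is commented out, so the "Throws" entry in the docstring does not match the visible code. The behavior can be reproduced with the standalone sketch below. `check_key_sketch` is a hypothetical name, and `haskey` is used in place of `in(s, keys(...))` as an equivalent idiom.

```julia
# Hypothetical stand-in mirroring the branch logic visible in the hunk.
function check_key_sketch(control_dict::Dict, s::String)
    if haskey(control_dict, s)
        return control_dict[s]
    else
        # Warn instead of throwing, then signal absence with `missing`.
        @warn "$(s) file not found in control file"
        return missing
    end
end

control = Dict("geno" => "geno.csv")
geno_file = check_key_sketch(control, "geno")
pheno_file = check_key_sketch(control, "pheno")
```

Callers therefore need to test the result with `ismissing` rather than rely on an exception.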
4 changes: 2 additions & 2 deletions src/kinship/kinship.jl
@@ -10,7 +10,7 @@ Calculate kinship from the genotype probability array.
 # Arguments
-- `geno` is the genotype probability matrix for n individuals and p markers.
+* `geno` is the genotype probability matrix for n individuals and p markers.
 # Output
@@ -24,7 +24,7 @@ Calculate kinship from the genotype probability array.
 # Arguments
-- `geno` is the genotype probability matrix, n individuals and p markers, which
+* `geno` is the genotype probability matrix, n individuals and p markers, which
 contains `missing` values.
 # Output
2 changes: 1 addition & 1 deletion src/kinship/kinship_4way.jl
@@ -10,7 +10,7 @@ Note: In [R/qtl](https://cran.r-project.org/web/packages/qtl/qtl.pdf), genotypes
 # Argument
-- `genmat` : A matrix of genotypes for `four-way cross` ``(1,2, \\dots)``.
+* `genmat` : A matrix of genotypes for `four-way cross` ``(1,2, \\dots)``.
 size(genematrix)= (p,n), for `p` genetic markers x `n` individuals(or lines).
 # Output
2 changes: 1 addition & 1 deletion src/kinship/kinship_ctr.jl
@@ -6,7 +6,7 @@ Calculates a kinship by a centered genotype matrix (linear kernel), i.e. genotyp
 # Argument
-- `genmat` : A matrix of genotype data (0,1,2). size(genmat)=(n,p) for `n` individuals x `p` markers
+* `genmat` : A matrix of genotype data (0,1,2). size(genmat)=(n,p) for `n` individuals x `p` markers
 # Output
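Note on the docstring above: a centered linear-kernel kinship can be sketched as follows for an `n` x `p` genotype matrix. This is an illustration under assumptions, not the package's code: the name `kinship_centered_sketch` is hypothetical, and dividing by the number of markers `p` is one common normalization that the truncated docstring does not confirm.

```julia
using Statistics

# Hypothetical helper, for illustration only.
function kinship_centered_sketch(genmat::AbstractMatrix{<:Real})
    p = size(genmat, 2)
    # Subtract each marker's (column's) mean across individuals.
    centered = genmat .- mean(genmat, dims = 1)
    # Linear kernel on the centered data; normalizing by p is an assumption.
    return (centered * centered') ./ p
end

# Usage: 3 individuals x 3 markers coded 0/1/2.
G = [0.0 1.0 2.0; 2.0 1.0 0.0; 1.0 1.0 1.0]
K = kinship_centered_sketch(G)
```

The result is an `n` x `n` symmetric matrix; the third individual sits at every marker mean, so its row and column are zero.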
4 changes: 2 additions & 2 deletions src/kinship/kinship_gs.jl
@@ -6,9 +6,9 @@ Computes a kinship matrix using the Gaussian Kernel.
 # Arguments
-- `climate`: A matrix of genotype or climate information data. size(climate)=(m,r), such that `r` genotype markers (or days/years of climate factors,
+* `climate`: A matrix of genotype or climate information data. size(climate)=(m,r), such that `r` genotype markers (or days/years of climate factors,
 i.e. precipitations, temperatures, etc.), and `m` individuals (or environments/sites)
-- `ρ`: A free parameter determining the width of the kernel. It could be attained empirically.
+* `ρ`: A free parameter determining the width of the kernel. It could be attained empirically.
 # Output
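Note on the docstring above: a Gaussian-kernel relatedness matrix for an `m` x `r` data matrix can be sketched as below. The name `gaussian_kernel_sketch` is hypothetical, and the exact way `ρ` scales the squared distance (here `exp(-d² / ρ)`) is an assumption; the package may scale or normalize distances differently.

```julia
# Hypothetical helper, for illustration only.
function gaussian_kernel_sketch(data::AbstractMatrix{<:Real}, ρ::Real)
    m = size(data, 1)
    K = Matrix{Float64}(undef, m, m)
    for i in 1:m, j in 1:m
        # Squared Euclidean distance between rows i and j.
        d2 = sum(abs2, data[i, :] .- data[j, :])
        # Assumed scaling: a larger ρ gives a wider kernel.
        K[i, j] = exp(-d2 / ρ)
    end
    return K
end

# Usage: 3 individuals (or sites) x 2 features.
X = [0.0 0.0; 1.0 0.0; 0.0 2.0]
K = gaussian_kernel_sketch(X, 2.0)
```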
4 changes: 2 additions & 2 deletions src/kinship/kinship_lin.jl
@@ -6,9 +6,9 @@ Calculates a kinship (or climatic relatedness, [`kinship_gs`](@ref)) matrix by l
 # Arguments
-- `mat` : A matrix of genotype (or allele) probabilities usually extracted from [R/qtl](https://rqtl.org/tutorials/rqtltour.pdf),
+* `mat` : A matrix of genotype (or allele) probabilities usually extracted from [R/qtl](https://rqtl.org/tutorials/rqtltour.pdf),
 [R/qtl2](https://kbroman.org/qtl2/assets/vignettes/user_guide.html) or the counterpart packages. size(mat)= (p,n) for p genetic markers x n individuals.
-- `cross` : A scalar indicating instances of alleles or genotypes in a genetic marker. ex. 1 for genotypes (labeled as 0,1,2), 2 for RIF, 4 for four-way cross, 8 for HS mouse (allele probabilities), etc.
+* `cross` : A scalar indicating instances of alleles or genotypes in a genetic marker. ex. 1 for genotypes (labeled as 0,1,2), 2 for RIF, 4 for four-way cross, 8 for HS mouse (allele probabilities), etc.
 # Output
2 changes: 1 addition & 1 deletion src/kinship/kinship_man.jl
@@ -7,7 +7,7 @@ This function is for recombinant inbred line (RIL) (AA/BB), not for 4-way cross-
 # Argument
-- `genematrix` : A matrix of genotypes, i.e. 0,1 (or 1,2). size(genematrix)= (p,n) for `p` genetic markers x `n` individuals(or lines).
+* `genematrix` : A matrix of genotypes, i.e. 0,1 (or 1,2). size(genematrix)= (p,n) for `p` genetic markers x `n` individuals(or lines).
 # Output
2 changes: 1 addition & 1 deletion src/kinship/kinship_std.jl
@@ -8,7 +8,7 @@ It can also do with climatic information data. See [`kinship_gs`](@ref).
 # Argument
-- `genmat` : A matrix of genotype data (0,1,2). size(genmat)=(n,p) for `n` individuals x `p` markers
+* `genmat` : A matrix of genotype data (0,1,2). size(genmat)=(n,p) for `n` individuals x `p` markers
 # Output
6 changes: 3 additions & 3 deletions src/kinship/shrinkg.jl
@@ -7,9 +7,9 @@ This function runs faster by CPU parallelization. Add workers/processes using `
 # Arguments
-- `f `: A function of computing a kinship. It can only be used with [`kinship_man`](@ref), [`kinship_4way`](@ref).
-- `nb` : An integer indicating the number of bootstraps. It does not have to be a large number.
-- `geno` : A matrix of genotypes. See [`kinship_man`](@ref), [`kinship_4way`](@ref) for dimension.
+* `f `: A function of computing a kinship. It can only be used with [`kinship_man`](@ref), [`kinship_4way`](@ref).
+* `nb` : An integer indicating the number of bootstraps. It does not have to be a large number.
+* `geno` : A matrix of genotypes. See [`kinship_man`](@ref), [`kinship_4way`](@ref) for dimension.
 # Example
24 changes: 12 additions & 12 deletions src/loco/loco_bulkscan.jl
@@ -5,9 +5,9 @@ Single trait scan without covariates for LOCO data structure.
 # Arguments
-- `y`is the phenotype column matrix.
-- `dfG`is is a dataframe containing genotype values and genotype info such as the chromosome, loci...
-- `kwargs` are optional keywords arguments pertaining to the `BulkLMM.bulkscan()` function. For
+* `y`is the phenotype column matrix.
+* `dfG`is is a dataframe containing genotype values and genotype info such as the chromosome, loci...
+* `kwargs` are optional keywords arguments pertaining to the `BulkLMM.bulkscan()` function. For
 example:
 - method::String = "null-grid", h2_grid::Array{Float64, 1} = collect(0.0:0.1:0.9),
 - nb::Int64 = Threads.nthreads(),
@@ -34,10 +34,10 @@ Single trait scan without covariates for LOCO data structure.
 # Arguments
-- `y`is the phenotype column matrix.
-- `G`is vector of genotype matrices based on the chromosome.
-- `K` is a vector of kinship matrices.
-- `kwargs` are optional keywords arguments pertaining to the `BulkLMM.scan()` function. For
+* `y`is the phenotype column matrix.
+* `G`is vector of genotype matrices based on the chromosome.
+* `K` is a vector of kinship matrices.
+* `kwargs` are optional keywords arguments pertaining to the `BulkLMM.scan()` function. For
 example:
 - method::String = "null-grid", h2_grid::Array{Float64, 1} = collect(0.0:0.1:0.9),
 - nb::Int64 = Threads.nthreads(),
@@ -63,11 +63,11 @@ Single trait scan with covariates for LOCO data structure.
 # Arguments
-- `y`is the phenotype column matrix.
-- `G`is vector of genotype matrices based on the chromosome.
-- `covar` is covariate column matrix.
-- `K` is a vector of kinship matrices.
-- `kwargs` are optional keywords arguments pertaining to the `BulkLMM.scan()` function. For
+* `y`is the phenotype column matrix.
+* `G`is vector of genotype matrices based on the chromosome.
+* `covar` is covariate column matrix.
+* `K` is a vector of kinship matrices.
+* `kwargs` are optional keywords arguments pertaining to the `BulkLMM.scan()` function. For
 example:
 - method::String = "null-grid", h2_grid::Array{Float64, 1} = collect(0.0:0.1:0.9),
 - nb::Int64 = Threads.nthreads(),
14 changes: 7 additions & 7 deletions src/loco/loco_helpers.jl
@@ -4,9 +4,9 @@ get_loco_geno(dfG::DataFrame; chromosome_colname::String = "Chr", idx_start::Int
 Returns a vector of genotype matrices based on the chromosome.
 # Arguments
-- `dfG` is a dataframe containing genotype values and genotype info such as the chromosome, loci...
-- `chromosome_colname` column name containing chromosome information
-- `idx_start` indicates the first column index containing the genotype values
+* `dfG` is a dataframe containing genotype values and genotype info such as the chromosome, loci...
+* `chromosome_colname` column name containing chromosome information
+* `idx_start` indicates the first column index containing the genotype values
 """
 function get_loco_geno(dfG::DataFrame;
                        chromosome_colname::String = "Chr",
@@ -34,9 +34,9 @@ get_loco_geno_info(dfG::DataFrame; chromosome_colname::String = "Chr", idx_info
 Returns a vector of genotype information dataframes based on the chromosome.
 # Arguments
-- `dfG` is a dataframe containing genotype values and genotype info such as the chromosome, loci...
-- `chromosome_colname` column name containing chromosome information
-- `idx_info` indicates columns containing genotype information (e.g., Locus, Mb...)
+* `dfG` is a dataframe containing genotype values and genotype info such as the chromosome, loci...
+* `chromosome_colname` column name containing chromosome information
+* `idx_info` indicates columns containing genotype information (e.g., Locus, Mb...)
 """
 function get_loco_geno_info(dfG::DataFrame;
                             chromosome_colname = "Chr",
@@ -57,7 +57,7 @@ Calculates kinship matrices leaving out one chromosome out, and returns a vector
 of kinship matrices per chromosome.
 # Arguments
-- `G` is a vector of genotype matrices
+* `G` is a vector of genotype matrices
 """
 function calcLocoKinship(G::Vector{Matrix{Float64}})

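Note on `calcLocoKinship`: its body is not visible in this hunk, so the following is only a sketch of the leave-one-chromosome-out idea stated in the docstring. It assumes each matrix in `G` is individuals x markers and uses a plain linear kernel as a placeholder for whichever kinship function the package actually calls; `loco_kinship_sketch` is a hypothetical name.

```julia
# Hypothetical helper, for illustration only.
function loco_kinship_sketch(G::Vector{Matrix{Float64}})
    # Placeholder kernel (an assumption): X * X' scaled by the marker count.
    linear_kinship(X) = (X * X') ./ size(X, 2)
    return map(eachindex(G)) do c
        # Concatenate the markers of every chromosome except chromosome c.
        others = reduce(hcat, G[setdiff(eachindex(G), c)])
        linear_kinship(others)
    end
end

# Usage: 2 individuals, 3 chromosomes with 2, 1 and 2 markers.
G1 = [1.0 0.0; 0.0 1.0]
G2 = reshape([1.0, 1.0], 2, 1)
G3 = [0.0 2.0; 2.0 0.0]
Ks = loco_kinship_sketch(Matrix{Float64}[G1, G2, G3])
```

Each `Ks[c]` is then paired with the genotypes of chromosome `c` when scanning, so markers on the tested chromosome do not contribute to their own kinship estimate. The sketch requires at least two chromosomes.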