This function generates a set of differentially expressed (DE) gene IDs with associated fold changes for a given number of genes, number of simulations and fraction of DE genes.

DESetup(ngenes=10000, nsims=25,
p.DE=0.1, pLFC,
p.B=NULL, bLFC=NULL, bPattern="uncorrelated",
sim.seed)

Arguments

ngenes

The total number of genes to simulate. Default is 10000.

nsims

Number of simulations to run. Default is 25.

p.DE

Numeric vector between 0 and 1 representing the fraction of genes that are differentially expressed due to phenotype, i.e. biological signal. Default is 0.1.

pLFC

The log2 phenotypic fold change for DE genes. This can be: (1) a constant, e.g. 2; (2) a vector of values whose length is the number of DE genes. If the input is a vector and its length is not the number of DE genes, it will be sampled with replacement to generate the log2 fold changes; (3) a function that takes an integer n and generates a vector of length n, e.g. function(x) rnorm(x, mean=0, sd=1.5).

p.B

Numeric vector between 0 and 1 representing the fraction of genes that are differentially expressed between batches. Default is NULL, i.e. no batch effect.

bLFC

The log2 batch fold change for all genes. This can be: (1) a constant, e.g. 2; (2) a vector of values with length being number of all genes. If the input is a vector and the length is not the number of total genes, it will be sampled with replacement to generate log2 fold changes; (3) a function that takes an integer n, and generates a vector of length n, e.g. function(x) rnorm(x, mean=0, sd=1.5). Note that only two batches will be simulated.

bPattern

Character vector for batch effect pattern if p.B is non-null. Possible options include: "uncorrelated", "orthogonal" and "correlated". Default is "uncorrelated".

sim.seed

Simulation seed.
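
The three accepted forms of pLFC and bLFC described above (constant, vector, function) can be illustrated with a minimal base R sketch. The helper resolve_lfc is hypothetical and is not part of the package; it only mirrors the documented behaviour, and the rounding of ngenes * p.DE is an assumption.

```r
# Hypothetical helper (not the package's implementation): expand a
# log2 fold change specification into a vector of length n.
resolve_lfc <- function(lfc, n) {
  if (is.function(lfc)) {
    # (3) a function that takes n and returns n values
    return(lfc(n))
  }
  if (length(lfc) == 1) {
    # (1) a constant, repeated for every gene
    return(rep(lfc, n))
  }
  if (length(lfc) == n) {
    # (2) a vector that already has the required length
    return(lfc)
  }
  # (2, continued) wrong length: sample with replacement.
  # Index-based sampling avoids sample()'s special case for length-one input.
  lfc[sample.int(length(lfc), size = n, replace = TRUE)]
}

set.seed(1)
n_de <- round(10000 * 0.1)  # ngenes * p.DE with the defaults; rounding is assumed

const_lfc <- resolve_lfc(2, n_de)
vec_lfc   <- resolve_lfc(c(-2, -1, 1, 2), n_de)
fun_lfc   <- resolve_lfc(function(x) rnorm(x, mean = 0, sd = 1.5), n_de)
```

In each case the result has one log2 fold change per DE gene. For bLFC the same logic would apply with n set to the total number of genes.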

Value

A list with the following entries:

ngenes

An integer for number of genes.

nsims

An integer for number of simulations.

sim.seed

The specified simulation seed.

p.DE

Fraction of DE genes.

DEid

A list (length=nsims) of vectors (length=ngenes*p.DE) for the IDs of DE genes.

glfc

A list (length=nsims) of vectors (length=ngenes) for phenotypic log fold change of all genes, i.e. nonDE=0 and DE=lfc.

blfc

A list (length=nsims) of vectors (length=ngenes) for batch log fold change of all genes.

design

Two-group comparison.
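
The relationship between the DEid and glfc entries for a single simulation can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: gene IDs are taken to be integer positions, ngenes * p.DE is rounded, and all variable names are hypothetical.

```r
set.seed(43856)
ngenes <- 10000
p.DE   <- 0.2
n_de   <- round(ngenes * p.DE)  # rounding is an assumption

# IDs of DE genes, drawn without replacement (assumed to be integer positions)
de_id <- sample.int(ngenes, size = n_de)

# Signed, gamma-distributed log2 fold changes, as in the Examples section
lfc <- sample(c(-1, 1), size = n_de, replace = TRUE) * rgamma(n_de, 3, 3)

# Full-length vector: 0 for non-DE genes, lfc for DE genes
glfc <- numeric(ngenes)
glfc[de_id] <- lfc
```

Repeating this nsims times and collecting the results in lists would give structures shaped like the documented DEid and glfc entries.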

Examples

# NOT RUN {
desettings <- DESetup(ngenes = 10000, nsims = 25,
p.DE = 0.2, pLFC = function(x) sample(c(-1,1), size=x,replace=TRUE)*rgamma(x, 3, 3),
p.B=0.1, bLFC = function(x) rnorm(x, mean=0, sd=1.5), bPattern="uncorrelated",
sim.seed = 43856)
# }