Classic high-dimensional mediation analysis

hima_classic estimates and tests classic high-dimensional mediation effects, using linear regression for continuous outcomes and logistic regression for binary outcomes.

Usage

hima_classic(
  X,
  M,
  Y,
  COV.XM = NULL,
  COV.MY = COV.XM,
  Y.type = c("continuous", "binary"),
  M.type = c("gaussian", "negbin"),
  penalty = c("MCP", "SCAD", "lasso"),
  topN = NULL,
  scale = TRUE,
  Bonfcut = 0.05,
  verbose = FALSE,
  parallel = FALSE,
  ncore = 1,
  ...
)

Arguments

X: a vector of exposure. Do not use data.frame or matrix.
M: a data.frame or matrix of high-dimensional mediators. Rows represent samples, columns represent variables.
Y: a vector of outcome. Can be either continuous or binary (0-1). Do not use data.frame or matrix.
COV.XM: a data.frame or matrix of covariates for testing the association M ~ X. Covariates specified here will not be penalized. Default = NULL. If the covariates contain mixed data types, please make sure all categorical variables are properly formatted as factor type.
COV.MY: a data.frame or matrix of covariates for testing the association Y ~ M. Covariates specified here will not be penalized. If not specified, the same set of covariates used for M ~ X will be applied (i.e., COV.XM). Using different sets of covariates is allowed, but this needs to be handled carefully.
Y.type: data type of outcome (Y). Either 'continuous' (default) or 'binary'.
M.type: data type of mediator (M). Either 'gaussian' (default) or 'negbin' (i.e., negative binomial).
penalty: the penalty to be applied to the model. Either 'MCP' (the default), 'SCAD', or 'lasso'.
topN: an integer specifying the number of top markers from sure independent screening. Default = NULL. If NULL, topN will be either ceiling(n/log(n)) for continuous outcome, or ceiling(n/(2*log(n))) for binary outcome, where n is the sample size. If the sample size is greater than topN (pre-specified or calculated), all mediators will be included in the test (i.e. low-dimensional scenario).
scale: logical. Should the function scale the data? Default = TRUE.
Bonfcut: Bonferroni-corrected p value cutoff applied to select significant mediators. Default = 0.05.
verbose: logical. Should the function be verbose? Default = FALSE.
parallel: logical. Enable parallel computing feature? Default = FALSE.
ncore: number of cores to use for parallel computing. Valid only when parallel = TRUE.
...: other arguments passed to ncvreg.
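
The default for topN described above can be reproduced by hand to see how many mediators would pass the screening step for a given sample size. The following is a minimal sketch based only on the formulas stated in the topN description; default_topN is a hypothetical helper written for illustration and is not part of the package.

```r
# Hypothetical helper (not exported by the package): computes the default
# topN from the sample size n, following the formulas in the topN description.
default_topN <- function(n, Y.type = c("continuous", "binary")) {
  Y.type <- match.arg(Y.type)
  if (Y.type == "binary") {
    ceiling(n / (2 * log(n)))
  } else {
    ceiling(n / log(n))
  }
}

default_topN(500, "continuous") # 81
default_topN(500, "binary")     # 41
```

With 500 samples, the stated formulas give 81 screened mediators for a continuous outcome and 41 for a binary outcome.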

Value

A data.frame containing mediation testing results of selected mediators.

Index: name of the selected significant mediator.
alpha_hat: coefficient estimates of exposure (X) -> mediators (M) (adjusted for covariates).
beta_hat: coefficient estimates of mediators (M) -> outcome (Y) (adjusted for covariates and exposure).
IDE: mediation (indirect) effect, i.e., alpha*beta.
rimp: relative importance of the mediator.
pmax: joint raw p-value of the selected significant mediator (based on Bonferroni method).
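
To make the IDE column concrete, the sketch below recomputes it as the product of alpha_hat and beta_hat. The data.frame and its numbers are hypothetical and chosen only for illustration; they are not output from hima_classic.

```r
# Hypothetical estimates for three mediators (illustrative values only,
# not produced by hima_classic).
res <- data.frame(
  Index     = c("M1", "M2", "M3"),
  alpha_hat = c(0.5, -0.4, 0.2),
  beta_hat  = c(0.8, 0.5, -1.0)
)

# Indirect (mediation) effect: product of the X -> M and M -> Y coefficients.
res$IDE <- res$alpha_hat * res$beta_hat
res
```

Note that the sign of IDE depends on both coefficients: a negative alpha_hat combined with a positive beta_hat yields a negative indirect effect.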

References

Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. 2016. DOI: 10.1093/bioinformatics/btw351. PMID: 27357171; PMCID: PMC5048064

Examples

if (FALSE) { # \dontrun{
library(HIMA) # load the package that provides hima_classic

# Note: In the following examples, M1, M2, and M3 are true mediators.

# When Y is continuous and normally distributed
# Example 1 (continuous outcome):
data(ContinuousOutcome)
pheno_data <- ContinuousOutcome$PhenoData
mediator_data <- ContinuousOutcome$Mediator

hima.fit <- hima_classic(
  X = pheno_data$Treatment,
  Y = pheno_data$Outcome,
  M = mediator_data,
  COV.XM = pheno_data[, c("Sex", "Age")],
  Y.type = "continuous",
  scale = FALSE, # Disabled only for simulation data
  verbose = TRUE
)
hima.fit

# When Y is binary
# Example 2 (binary outcome):
data(BinaryOutcome)
pheno_data <- BinaryOutcome$PhenoData
mediator_data <- BinaryOutcome$Mediator

hima.logistic.fit <- hima_classic(
  X = pheno_data$Treatment,
  Y = pheno_data$Disease,
  M = mediator_data,
  COV.XM = pheno_data[, c("Sex", "Age")],
  Y.type = "binary",
  scale = FALSE, # Disabled only for simulation data
  verbose = TRUE
)
hima.logistic.fit
} # }