hima
is a wrapper function designed to perform various HIMA methods for estimating and testing high-dimensional mediation effects.
hima
can automatically select the appropriate HIMA method based on the outcome and mediator data type.
Arguments
- formula
an object of class
formula
representing the overall effect model to be fitted, specified asoutcome ~ exposure + covariates
. The "exposure" variable (the variable of interest) must be listed first on the right-hand side of the formula.- data.pheno
a data frame containing the exposure, outcome, and covariates specified in the formula. Variable names in
data.pheno
must match those in the formula. Whenscale = TRUE
, the exposure and covariates will be scaled (the outcome retains its original scale).- data.M
a
data.frame
ormatrix
of high-dimensional mediators, with rows representing samples and columns representing mediator variables. Whenscale = TRUE
,data.M
will be scaled.- mediator.type
a character string indicating the data type of the high-dimensional mediators (
data.M
). Options are:'gaussian'
(default): for continuous mediators.'negbin'
: for count data mediators modeled using the negative binomial distribution (e.g., RNA-seq data).'compositional'
: for compositional data mediators (e.g., microbiome data).- penalty
a character string specifying the penalty method to apply in the model. Options are:
'DBlasso'
: De-biased LASSO (default).'MCP'
: Minimax Concave Penalty.'SCAD'
: Smoothly Clipped Absolute Deviation.'lasso'
: Least Absolute Shrinkage and Selection Operator. Note: Survival HIMA and microbiome HIMA can only be performed with'DBlasso'
. Quantile HIMA and efficient HIMA cannot use'DBlasso'
; they always apply'MCP'
.- quantile
logical. Indicates whether to use quantile HIMA (
hima_quantile
). Default isFALSE
. Applicable only for classic HIMA with a continuous outcome andmediator.type = 'gaussian'
. IfTRUE
, specify the desired quantile(s) using thetau
parameter; otherwise, the defaulttau = 0.5
(i.e., median) is used.- efficient
logical. Indicates whether to use efficient HIMA (
hima_efficient
). Default isFALSE
. Applicable only for classic HIMA with a continuous outcome andmediator.type = 'gaussian'
.- scale
logical. Determines whether the function scales the data (exposure, mediators, and covariates). Default is
TRUE
. Note: For simulation studies, setscale = FALSE
to avoid estimate compression (i.e., shrinkage of estimates toward zero due to scaling).- sigcut
numeric. The significance cutoff for selecting mediators. Default is
0.05
.- contrast
a named list of contrasts to be applied to factor variables in the covariates (cannot be the variable of interest).
- subset
an optional vector specifying a subset of observations to use in the analysis.
- verbose
logical. Determines whether the function displays progress messages. Default is
FALSE
.- ...
reserved passing parameter (or for future use).
Value
A data.frame containing mediation testing results of selected mediators.
- ID:
Mediator ID/name.
- alpha:
Coefficient estimates of exposure (X) –> mediators (M) (adjusted for covariates).
- beta:
Coefficient estimates of mediators (M) –> outcome (Y) (adjusted for covariates and exposure).
- alpha*beta:
The estimated indirect (mediation) effect of exposure on outcome through each mediator.
- Relative Importance:
The proportion of each mediator's mediation effect relative to the sum of the absolute mediation effects of all significant mediators.
- p-value:
The joint p-value assessing the significance of each mediator's indirect effect, calculated based on the corresponding statistical approach.
- tau:
The quantile level of the outcome (applicable only when using the quantile mediation model).
References
1. Zhang H, Zheng Y, Zhang Z, Gao T, Joyce B, Yoon G, Zhang W, Schwartz J, Just A, Colicino E, Vokonas P, Zhao L, Lv J, Baccarelli A, Hou L, Liu L. Estimating and Testing High-dimensional Mediation Effects in Epigenetic Studies. Bioinformatics. 2016. DOI: 10.1093/bioinformatics/btw351. PMID: 27357171; PMCID: PMC5048064
2. Zhang H, Zheng Y, Hou L, Zheng C, Liu L. Mediation Analysis for Survival Data with High-Dimensional Mediators. Bioinformatics. 2021. DOI: 10.1093/bioinformatics/btab564. PMID: 34343267; PMCID: PMC8570823
3. Zhang H, Chen J, Feng Y, Wang C, Li H, Liu L. Mediation Effect Selection in High-dimensional and Compositional Microbiome data. Stat Med. 2021. DOI: 10.1002/sim.8808. PMID: 33205470; PMCID: PMC7855955
4. Zhang H, Chen J, Li Z, Liu L. Testing for Mediation Effect with Application to Human Microbiome Data. Stat Biosci. 2021. DOI: 10.1007/s12561-019-09253-3. PMID: 34093887; PMCID: PMC8177450
5. Perera C, Zhang H, Zheng Y, Hou L, Qu A, Zheng C, Xie K, Liu L. HIMA2: High-dimensional Mediation Analysis and Its Application in Epigenome-wide DNA Methylation Data. BMC Bioinformatics. 2022. DOI: 10.1186/s12859-022-04748-1. PMID: 35879655; PMCID: PMC9310002
6. Zhang H, Hong X, Zheng Y, Hou L, Zheng C, Wang X, Liu L. High-Dimensional Quantile Mediation Analysis with Application to a Birth Cohort Study of Mother–Newborn Pairs. Bioinformatics. 2024. DOI: 10.1093/bioinformatics/btae055. PMID: 38290773; PMCID: PMC10873903
7. Bai X, Zheng Y, Hou L, Zheng C, Liu L, Zhang H. An Efficient Testing Procedure for High-dimensional Mediators with FDR Control. Statistics in Biosciences. 2024. DOI: 10.1007/s12561-024-09447-4.
Examples
if (FALSE) { # \dontrun{
# Note: In the following examples, M1, M2, and M3 are true mediators.
# Example 1 (continuous outcome - linear HIMA):
head(ContinuousOutcome$PhenoData)
e1 <- hima(Outcome ~ Treatment + Sex + Age,
data.pheno = ContinuousOutcome$PhenoData,
data.M = ContinuousOutcome$Mediator,
mediator.type = "gaussian",
penalty = "MCP", # Can be "DBlasso" for hima_dblasso
scale = FALSE
) # Disabled only for simulation data
summary(e1)
# Efficient HIMA (only applicable to mediators and outcomes that are
# both continuous and normally distributed.)
e1e <- hima(Outcome ~ Treatment + Sex + Age,
data.pheno = ContinuousOutcome$PhenoData,
data.M = ContinuousOutcome$Mediator,
mediator.type = "gaussian",
efficient = TRUE,
penalty = "MCP", # Efficient HIMA does not support DBlasso
scale = FALSE
) # Disabled only for simulation data
summary(e1e)
# Example 2 (binary outcome - logistic HIMA):
head(BinaryOutcome$PhenoData)
e2 <- hima(Disease ~ Treatment + Sex + Age,
data.pheno = BinaryOutcome$PhenoData,
data.M = BinaryOutcome$Mediator,
mediator.type = "gaussian",
penalty = "MCP",
scale = FALSE
) # Disabled only for simulation data
summary(e2)
# Example 3 (time-to-event outcome - survival HIMA):
head(SurvivalData$PhenoData)
e3 <- hima(Surv(Time, Status) ~ Treatment + Sex + Age,
data.pheno = SurvivalData$PhenoData,
data.M = SurvivalData$Mediator,
mediator.type = "gaussian",
penalty = "DBlasso",
scale = FALSE
) # Disabled only for simulation data
summary(e3)
# Example 4 (compositional data as mediator, e.g., microbiome):
head(MicrobiomeData$PhenoData)
e4 <- hima(Outcome ~ Treatment + Sex + Age,
data.pheno = MicrobiomeData$PhenoData,
data.M = MicrobiomeData$Mediator,
mediator.type = "compositional",
penalty = "DBlasso"
) # Scaling is always enabled internally for hima_microbiome
summary(e4)
#' # Example 5 (quantile mediation anlaysis - quantile HIMA):
head(QuantileData$PhenoData)
# Note that the function will prompt input for quantile level.
e5 <- hima(Outcome ~ Treatment + Sex + Age,
data.pheno = QuantileData$PhenoData,
data.M = QuantileData$Mediator,
mediator.type = "gaussian",
quantile = TRUE,
penalty = "MCP", # Quantile HIMA does not support DBlasso
scale = FALSE, # Disabled only for simulation data
tau = c(0.3, 0.5, 0.7)
) # Specify multiple quantile level
summary(e5)
} # }