The enrichment analysis is based on the subset of the network connected to particular TFs (TF regulons). For community-level and general enrichment, see calculateCommunitiesEnrichment and calculateGeneralEnrichment, respectively. This function requires the eGRN graph to exist in the GRN object, as produced by build_eGRN_graph. Results can subsequently be visualized with the function plotTFEnrichment.

calculateTFEnrichment(
  GRN,
  rankType = "degree",
  n = 3,
  TF.IDs = NULL,
  ontology = c("GO_BP", "GO_MF"),
  algorithm = "weight01",
  statistic = "fisher",
  background = "neighborhood",
  background_geneTypes = "all",
  pAdjustMethod = "BH",
  forceRerun = FALSE
)

Arguments

GRN

Object of class GRN

rankType

Character. Default "degree". One of: "degree", "EV", "custom". This parameter determines the criterion used to identify the "top" TFs. If set to "degree", the function selects the top TFs based on the number of genes they are connected to, i.e. based on their degree centrality. If set to "EV", it selects the top TFs based on their eigenvector centrality score in the network. If set to "custom", a set of TF IDs must be passed to the "TF.IDs" parameter.

n

Numeric. Default 3. If a value between 0 and 1 is passed, it is treated as the fraction (percentage) of top nodes to select. If an integer is passed, it is treated as the number of top nodes. This parameter is not relevant if rankType = "custom".

TF.IDs

Character vector. Default NULL. If the rank type is set to "custom", a vector of TF IDs for which the GO enrichment should be calculated should be passed to this parameter.

ontology

Character vector of ontologies. Default c("GO_BP", "GO_MF"). Valid values are "GO_BP", "GO_MF", "GO_CC", "KEGG", "DO", and "Reactome", referring to GO Biological Process, GO Molecular Function, GO Cellular Component, KEGG, Disease Ontology, and Reactome Pathways, respectively. GO ontologies require the topGO, "KEGG" the clusterProfiler, "DO" the DOSE, and "Reactome" the ReactomePA packages, respectively. As they are listed under Suggests, they may not yet be installed, and the function will throw an error if they are missing.

algorithm

Character. Default "weight01". One of: "classic", "elim", "weight", "weight01", "lea", "parentchild". Name of the algorithm that handles the GO graph structure. Only relevant if ontology is GO related (GO_BP, GO_MF, GO_CC), ignored otherwise. Valid inputs are those supported by the topGO package. For general information about the algorithms, see https://academic.oup.com/bioinformatics/article/22/13/1600/193669. weight01 is a mixture between the elim and the weight algorithms.

statistic

Character. Default "fisher". One of: "fisher", "ks", "t". Statistical test to be used. Only relevant if ontology is GO related (GO_BP, GO_MF, GO_CC), ignored otherwise. Valid inputs are a subset of the tests supported by the topGO package; some were removed because they do not appear to work properly in topGO itself. For the other ontologies, the test statistic is always Fisher.

background

Character. Default "neighborhood". One of: "all_annotated", "all_RNA", "all_RNA_filtered", "neighborhood". Set of genes to be used to construct the background for the enrichment analysis. This can either be all annotated genes in the reference genome (all_annotated), all genes from the provided RNA data (all_RNA), all genes from the provided RNA data excluding those marked as filtered after executing filterData (all_RNA_filtered), or all the genes that are within the neighborhood of any peak (before applying any filters except for the user-defined promoterRange value in addConnections_peak_gene) (neighborhood).

background_geneTypes

Character vector of gene types that should be considered for the background. Default "all". Only gene types as defined in the GRN object, slot GRN@annotation$genes$gene.type are allowed. The special keyword "all" means no filter on gene type.

pAdjustMethod

Character. Default "BH". One of: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr". This parameter is only relevant for the following ontologies: KEGG, DO, Reactome. For the GO ontologies, no separate p-value adjustment is applied; instead, the chosen algorithm (see the algorithm parameter) serves as the adjustment.

forceRerun

TRUE or FALSE. Default FALSE. Force execution, even if the GRN object already contains the result. Overwrites the old results.
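
To make the rankType = "degree" and n arguments above more concrete, the following base R sketch ranks TFs by the number of distinct genes they are connected to and then picks the top ones, either by count or by fraction. It is an illustration only and not GRaNIE's implementation: the edge list, the gene IDs, and the helper resolve_n are hypothetical, and the rounding used for fractional n is an assumption.

```r
# Hypothetical TF-gene edge list. TF names are borrowed from the example log,
# gene IDs are made up for illustration.
edges <- data.frame(
  TF   = c("EGR1.0.A", "EGR1.0.A", "EGR1.0.A", "E2F6.0.A", "E2F6.0.A", "BATF3.0.B"),
  gene = c("g1", "g2", "g3", "g1", "g4", "g2"),
  stringsAsFactors = FALSE
)

# Degree of a TF = number of distinct genes it is connected to
degree <- vapply(split(edges$gene, edges$TF),
                 function(g) length(unique(g)), integer(1))
degree <- sort(degree, decreasing = TRUE)

# Assumption: a value strictly between 0 and 1 is a fraction of all TFs
# (rounded up here); any other value is an absolute count.
# GRaNIE's exact rounding may differ.
resolve_n <- function(n, total) {
  if (n > 0 && n < 1) ceiling(n * total) else min(as.integer(n), total)
}

top_by_count    <- names(degree)[seq_len(resolve_n(2,   length(degree)))]
top_by_fraction <- names(degree)[seq_len(resolve_n(0.3, length(degree)))]
```

With eigenvector centrality (rankType = "EV"), only the scoring step would change; the selection of the top n TFs would follow the same pattern.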

Value

An updated GRN object, with the enrichment results stored in the stats$Enrichment$byTF slot.

Details

All enrichment functions use the TF-gene graph as defined in the `GRN` object. See the `ontology` argument for currently supported ontologies. Also note that some parameter combinations for `algorithm` and `statistic` are incompatible; an error message is thrown in such a case.
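
As a rough illustration of what a Fisher-type enrichment test against a background gene set looks like, the base R sketch below tests a single term for one TF regulon and then applies two of the p-value adjustment methods listed for `pAdjustMethod`, whose names are those accepted by stats::p.adjust. All counts and p-values are invented for illustration, and the sketch does not reproduce the internals of topGO, clusterProfiler, DOSE, or ReactomePA.

```r
# Illustrative counts (all made up):
N <- 2000  # genes in the chosen background
K <- 100   # background genes annotated to the term
n <- 50    # genes in the TF regulon (all part of the background)
k <- 8     # regulon genes annotated to the term

# 2x2 contingency table: rows = in regulon / not in regulon,
# columns = annotated to term / not annotated
tab <- matrix(c(k, K - k, n - k, N - n - K + k), nrow = 2,
              dimnames = list(c("regulon", "rest"), c("term", "other")))

# One-sided Fisher's exact test for over-representation
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Equivalent hypergeometric tail probability P(X >= k)
p_hyper <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# Multiple-testing adjustment over several terms (raw p-values are made up)
p_raw  <- c(0.001, 0.01, 0.02, 0.2)
p_bh   <- p.adjust(p_raw, method = "BH")
p_bonf <- p.adjust(p_raw, method = "bonferroni")
```

The choice of `background` changes both N and K in such a test, which is why the same regulon can yield different p-values under different background definitions.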

See also

Examples

# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
#> Downloading GRaNIE example object from https://git.embl.de/grp-zaugg/GRaNIE/-/raw/master/data/GRN.rds
#> INFO [2023-08-16 17:27:43] Storing GRN@data$RNA$counts matrix as sparse matrix because fraction of 0s is > 0.1 (0.44)
#> Finished successfully. You may explore the example object. Start by typing the object name to the console to see a summaty. Happy GRaNIE'ing!
GRN = calculateTFEnrichment(GRN, n = 5, ontology = "GO_BP", forceRerun = FALSE)
#> INFO [2023-08-16 17:27:43] Calculating TF enrichment. This may take a while
#> INFO [2023-08-16 17:27:43] n = 5 equals finding the top 5 degree-central TFs in the network
#> INFO [2023-08-16 17:27:43]  Finished successfully. Execution time: 0.1 secs
#> INFO [2023-08-16 17:27:43] Running enrichment analysis for the following TFs: EGR1.0.A, E2F6.0.A, E2F7.0.B, EGR2.0.A, BATF3.0.B
#> INFO [2023-08-16 17:27:43]  Running enrichment analysis for genes connected to the TF EGR1.0.A
#> INFO [2023-08-16 17:27:43]   Ontology GO_BP
#> INFO [2023-08-16 17:27:43] Data already exists in object (GRN@Enrichment$byTF$EGR1.0.A$GO_BP). Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:43]  Running enrichment analysis for genes connected to the TF E2F6.0.A
#> INFO [2023-08-16 17:27:43]   Ontology GO_BP
#> INFO [2023-08-16 17:27:43] Data already exists in object (GRN@Enrichment$byTF$E2F6.0.A$GO_BP). Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:43]  Running enrichment analysis for genes connected to the TF E2F7.0.B
#> INFO [2023-08-16 17:27:43]   Ontology GO_BP
#> INFO [2023-08-16 17:27:43] Data already exists in object (GRN@Enrichment$byTF$E2F7.0.B$GO_BP). Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:43]  Running enrichment analysis for genes connected to the TF EGR2.0.A
#> INFO [2023-08-16 17:27:43]   Ontology GO_BP
#> INFO [2023-08-16 17:27:43] Data already exists in object (GRN@Enrichment$byTF$EGR2.0.A$GO_BP). Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:43]  Running enrichment analysis for genes connected to the TF BATF3.0.B
#> INFO [2023-08-16 17:27:43]   Ontology GO_BP
#> INFO [2023-08-16 17:27:43] Data already exists in object (GRN@Enrichment$byTF$BATF3.0.B$GO_BP). Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:43]  Finished successfully. Execution time: 0.7 secs
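
The run above selects TFs by degree centrality. A call with rankType = "custom" might look like the following sketch, which uses only arguments documented on this page and two TF IDs taken from the log output above. It assumes that the GRaNIE package (and topGO for GO ontologies) is installed and that the example object can be downloaded; the guard skips the call otherwise. The variable name custom_TFs is arbitrary.

```r
# TF IDs taken from the log output of the degree-based run
custom_TFs = c("EGR1.0.A", "E2F6.0.A")

# Sketch only: needs GRaNIE (plus topGO for GO ontologies) and network access
if (requireNamespace("GRaNIE", quietly = TRUE)) {
  library(GRaNIE)
  GRN = loadExampleObject()
  # n is not relevant when rankType = "custom"
  GRN = calculateTFEnrichment(GRN, rankType = "custom", TF.IDs = custom_TFs,
                              ontology = "GO_BP", forceRerun = FALSE)
}
```

With forceRerun = FALSE, results already stored in the object are reused rather than recomputed, as the log messages above indicate.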