GRN object
calculateCommunitiesEnrichment.Rd
The enrichment analysis is based on the subset of the network connected to a particular community, as identified by `calculateCommunitiesStats`. See `calculateTFEnrichment` and `calculateGeneralEnrichment` for TF-specific and general enrichment, respectively.
This function requires that the `GRN` object contains the eGRN graph, as produced by `build_eGRN_graph`, as well as community information, as calculated by `calculateCommunitiesStats`. Results can subsequently be visualized with the function `plotCommunitiesEnrichment`.
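The order of calls described above can be sketched as follows. This is a minimal sketch, not the package's official example: it assumes that GRaNIE and topGO are installed, that the example object can be downloaded, and that each function can be called with its default arguments (only the arguments of `calculateCommunitiesEnrichment` are documented on this page).

```r
# Sketch only: assumes GRaNIE and topGO are installed and that the
# functions below work with their default arguments on the example object.
if (requireNamespace("GRaNIE", quietly = TRUE) &&
    requireNamespace("topGO", quietly = TRUE)) {
  library(GRaNIE)

  GRN <- loadExampleObject()

  # 1. Build the eGRN graph (prerequisite)
  GRN <- build_eGRN_graph(GRN)

  # 2. Identify communities in the graph (prerequisite)
  GRN <- calculateCommunitiesStats(GRN)

  # 3. Run the enrichment analysis for each community
  GRN <- calculateCommunitiesEnrichment(GRN, ontology = "GO_BP")

  # 4. Visualize the results (default arguments are an assumption here)
  plotCommunitiesEnrichment(GRN)
}
```

The guard at the top skips the whole sketch if a required package is missing, which avoids the error the function would otherwise throw.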
calculateCommunitiesEnrichment(
  GRN,
  ontology = c("GO_BP", "GO_MF"),
  algorithm = "weight01",
  statistic = "fisher",
  background = "neighborhood",
  background_geneTypes = "all",
  selection = "byRank",
  communities = NULL,
  pAdjustMethod = "BH",
  forceRerun = FALSE
)
GRN
Object of class `GRN`.
ontology
Character vector of ontologies. Default `c("GO_BP", "GO_MF")`. Valid values are `"GO_BP"`, `"GO_MF"`, `"GO_CC"`, `"KEGG"`, `"DO"`, and `"Reactome"`, referring to GO Biological Process, GO Molecular Function, GO Cellular Component, KEGG, Disease Ontology, and Reactome Pathways, respectively. The GO ontologies require the `topGO` package, `"KEGG"` requires `clusterProfiler`, `"DO"` requires `DOSE`, and `"Reactome"` requires `ReactomePA`. Because these packages are listed under `Suggests`, they may not yet be installed, and the function throws an error if a required package is missing.
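Because a missing package only surfaces as an error at run time, it can help to check the requirements up front. The following is a minimal sketch in base R; the names `requiredPackage` and `missingEnrichmentPackages` are hypothetical and not part of GRaNIE, and the mapping simply restates the requirements listed above.

```r
# Hypothetical lookup (not part of GRaNIE): ontology -> required package,
# as stated in the description of the ontology argument.
requiredPackage <- c(
  GO_BP    = "topGO",
  GO_MF    = "topGO",
  GO_CC    = "topGO",
  KEGG     = "clusterProfiler",
  DO       = "DOSE",
  Reactome = "ReactomePA"
)

# Hypothetical helper: return the required packages that are not installed
missingEnrichmentPackages <- function(ontology) {
  unknown <- setdiff(ontology, names(requiredPackage))
  if (length(unknown) > 0) {
    stop("Unknown ontology: ", paste(unknown, collapse = ", "))
  }
  pkgs <- unique(unname(requiredPackage[ontology]))
  # requireNamespace() returns FALSE instead of failing if a package is absent
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages that still need to be installed for the default ontologies
missingEnrichmentPackages(c("GO_BP", "GO_MF"))
```

An empty character vector means all packages needed for the requested ontologies are available.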
algorithm
Character. Default `"weight01"`. One of: `"classic"`, `"elim"`, `"weight"`, `"weight01"`, `"lea"`, `"parentchild"`. Name of the algorithm that handles the GO graph structure. Only relevant if `ontology` is GO-related (`GO_BP`, `GO_MF`, `GO_CC`); ignored otherwise. Valid inputs are those supported by the `topGO` package. For general information about the algorithms, see https://academic.oup.com/bioinformatics/article/22/13/1600/193669. `weight01` is a mixture of the `elim` and `weight` algorithms.
statistic
Character. Default `"fisher"`. One of: `"fisher"`, `"ks"`, `"t"`. Statistical test to be used. Only relevant if `ontology` is GO-related (`GO_BP`, `GO_MF`, `GO_CC`); ignored otherwise. Valid inputs are a subset of those supported by the `topGO` package; some were removed because they do not appear to work properly in `topGO` itself. For the other ontologies, the test statistic is always Fisher.
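To make the `"fisher"` statistic concrete: for a single term, an over-representation test compares how many genes of interest carry the annotation with how many would be expected given the background. The sketch below uses base R's `fisher.test` with made-up counts. It illustrates the plain test only and does not reproduce how the other `topGO` algorithms reweight or eliminate genes along the GO graph.

```r
# Made-up counts for one term (illustration only, not GRaNIE output)
nBackground     <- 1000  # genes in the background
nAnnotated      <- 50    # background genes annotated with the term
nCommunity      <- 40    # genes in the community (subset of the background)
nCommunityAnnot <- 8     # community genes annotated with the term

# 2 x 2 contingency table: community membership vs. term annotation
tab <- matrix(
  c(nCommunityAnnot,
    nCommunity - nCommunityAnnot,
    nAnnotated - nCommunityAnnot,
    nBackground - nCommunity - (nAnnotated - nCommunityAnnot)),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("inCommunity", "notInCommunity"),
                  c("annotated", "notAnnotated"))
)

# One-sided test for over-representation of the term in the community
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

With these counts, about 2 annotated genes would be expected in the community by chance, so observing 8 yields a small p-value. The result also depends directly on `nBackground` and `nAnnotated`, which is why the choice of background set matters.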
background
Character. Default `"neighborhood"`. One of: `"all_annotated"`, `"all_RNA"`, `"all_RNA_filtered"`, `"neighborhood"`. Set of genes used to construct the background for the enrichment analysis. This can be all annotated genes in the reference genome (`all_annotated`), all genes from the provided RNA data (`all_RNA`), all genes from the provided RNA data excluding those marked as filtered after executing `filterData` (`all_RNA_filtered`), or all genes within the neighborhood of any peak (`neighborhood`). For `neighborhood`, genes are collected before any filters are applied, except for the user-defined `promoterRange` value in `addConnections_peak_gene`.
background_geneTypes
Character vector of gene types that should be considered for the background. Default `"all"`. Only gene types as defined in the `GRN` object, slot `GRN@annotation$genes$gene.type`, are allowed. The special keyword `"all"` means no filter on gene type.
selection
Character. Default `"byRank"`. One of: `"byRank"`, `"byLabel"`. Specifies whether communities are selected by their rank, where the largest community (the one with the most vertices) has a rank of 1, or by their label. Note that the label is independent of the rank.
communities
`NULL` or numeric vector or character vector. Default `NULL`. If set to `NULL`, the enrichment is calculated for all communities. If a numeric vector is specified (when `selection = "byRank"`), it gives the ranks of the communities. For example, `communities = c(1,4)` denotes the largest and the fourth largest community. If a character vector is specified (when `selection = "byLabel"`), it gives the names of the communities instead. For example, `communities = c("1","4")` denotes the communities with the names "1" and "4", which may or may not be the largest and fourth largest communities.
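The difference between rank and label can be illustrated with a toy example in base R. The community assignment below is made up, and the code only demonstrates the concept; how GRaNIE stores communities internally or breaks ties between equally sized communities is not shown here.

```r
# Made-up community assignment: names are genes, values are community labels
membership <- c(g1 = "3", g2 = "3", g3 = "3", g4 = "1", g5 = "1", g6 = "2")

# Community sizes, sorted so that the largest community comes first
sizes <- sort(table(membership), decreasing = TRUE)

# Rank 1 is the largest community; the label is whatever name it carries
ranking <- data.frame(
  rank  = seq_along(sizes),
  label = names(sizes),
  size  = as.integer(sizes),
  stringsAsFactors = FALSE
)
ranking

# Selecting by rank 1 picks the community labelled "3" in this toy data,
# whereas selecting by label "1" picks the second-largest community
labelOfRank1 <- ranking$label[ranking$rank == 1]
rankOfLabel1 <- ranking$rank[ranking$label == "1"]
```

In this toy data, `communities = c(1)` with `selection = "byRank"` and `communities = c("1")` with `selection = "byLabel"` would therefore refer to different communities.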
pAdjustMethod
Character. Default `"BH"`. One of: `"holm"`, `"hochberg"`, `"hommel"`, `"bonferroni"`, `"BH"`, `"BY"`, `"fdr"`. Method for multiple-testing correction. This parameter is only relevant for the following ontologies: KEGG, DO, Reactome. For the GO ontologies, the chosen `algorithm` serves as the adjustment.
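The accepted values match the method names of base R's `p.adjust`. Whether the function calls `p.adjust` directly is an assumption here; the sketch below only shows what two of these corrections do to a set of made-up p-values.

```r
# Made-up raw p-values for five terms (illustration only)
p <- c(0.001, 0.008, 0.020, 0.041, 0.300)

# Benjamini-Hochberg ("BH"); "fdr" is an alias for the same method in p.adjust
pBH <- p.adjust(p, method = "BH")

# Bonferroni multiplies each p-value by the number of tests (capped at 1)
pBonf <- p.adjust(p, method = "bonferroni")

cbind(raw = p, BH = pBH, bonferroni = pBonf)
```

BH controls the false discovery rate and is less conservative than Bonferroni, so its adjusted values are never larger than the Bonferroni-adjusted ones.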
forceRerun
`TRUE` or `FALSE`. Default `FALSE`. Force execution, even if the `GRN` object already contains the result. Overwrites the old results.
An updated `GRN` object, with the enrichment results stored in the `stats$Enrichment$byCommunity` slot.
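A minimal sketch of how the stored results might be inspected, assuming GRaNIE is installed and the example object can be downloaded. The slot path is taken from the description above; the internal layout of the list (for example, one element per community) is an assumption, so `str` is used to explore it rather than relying on specific element names.

```r
# Sketch only: assumes GRaNIE is installed and the example object
# already contains community enrichment results.
if (requireNamespace("GRaNIE", quietly = TRUE)) {
  GRN <- GRaNIE::loadExampleObject()

  # Slot path as documented: stats$Enrichment$byCommunity
  enrichment <- GRN@stats$Enrichment$byCommunity

  # Explore the top-level structure without assuming element names
  print(names(enrichment))
  str(enrichment, max.level = 2)
}
```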
All enrichment functions use the TF-gene graph as defined in the `GRN` object. See the `ontology` argument for currently supported ontologies. Also note that some parameter combinations for `algorithm` and `statistic` are incompatible; an error is thrown in such a case.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
#> Downloading GRaNIE example object from https://git.embl.de/grp-zaugg/GRaNIE/-/raw/master/data/GRN.rds
#> INFO [2023-08-16 17:27:27] Storing GRN@data$RNA$counts matrix as sparse matrix because fraction of 0s is > 0.1 (0.44)
#> Finished successfully. You may explore the example object. Start by typing the object name to the console to see a summaty. Happy GRaNIE'ing!
GRN = calculateCommunitiesEnrichment(GRN, ontology = c("GO_BP"), forceRerun = FALSE)
#> INFO [2023-08-16 17:27:27] Running enrichment analysis for all 6 communities. This may take a while...
#> INFO [2023-08-16 17:27:27] Community 1
#> INFO [2023-08-16 17:27:28] Data already exists in object or the specified file already exists. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:28] Community 2
#> INFO [2023-08-16 17:27:28] Data already exists in object or the specified file already exists. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:28] Community 3
#> INFO [2023-08-16 17:27:28] Data already exists in object or the specified file already exists. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:28] Community 4
#> INFO [2023-08-16 17:27:28] Data already exists in object or the specified file already exists. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:28] Community 5
#> INFO [2023-08-16 17:27:28] Data already exists in object or the specified file already exists. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:28] Community 6
#> INFO [2023-08-16 17:27:28] Data already exists in object or the specified file already exists. Set forceRerun = TRUE to regenerate and overwrite.
#> INFO [2023-08-16 17:27:28] Finished successfully. Execution time: 1 secs