GRN
object.generateStatsSummary.Rd
This function calls filterGRNAndConnectGenes repeatedly and stores the total number of connections and other statistics from each run so that they can be summarized afterwards. All arguments are identical to those of filterGRNAndConnectGenes; see the help for that function for details. The function plot_stats_connectionSummary can be used afterwards for plotting.
generateStatsSummary(
GRN,
TF_peak.fdr = c(0.001, 0.01, 0.05, 0.1, 0.2),
TF_peak.connectionTypes = "all",
peak_gene.fdr = c(0.001, 0.01, 0.05, 0.1, 0.2),
peak_gene.r_range = c(0, 1),
gene.types = c("protein_coding"),
allowMissingGenes = c(FALSE, TRUE),
allowMissingTFs = c(FALSE),
forceRerun = FALSE
)
GRN: Object of class GRN.
TF_peak.fdr: Numeric vector[0,1]. Default c(0.001, 0.01, 0.05, 0.1, 0.2). TF-peak FDR values to iterate over.
TF_peak.connectionTypes: Character vector. Default "all". TF-peak connection types to consider. The special keyword "all" denotes all connection types (e.g., expression and TFActivity) that are found in the GRN object. By default, only expression is present in the object, so "all" and expression are usually equivalent unless calculation of TF-peak links based on TF activity has also been enabled.
peak_gene.fdr: Numeric vector[0,1]. Default c(0.001, 0.01, 0.05, 0.1, 0.2). Peak-gene FDR values to iterate over.
peak_gene.r_range: Numeric vector of length 2[-1,1]. Default c(0,1). The correlation range of peak-gene connections to keep.
gene.types: Character vector of supported gene types. Default c("protein_coding", "lincRNA"). Filter for gene types to retain; genes whose type is not listed here are filtered out. The special keyword "all" indicates no filter and retains all gene types. The specified names must match the names as stored in the GRN object (see GRN@annotation$genes$gene.type) and correspond 1:1 to the gene type names as provided by biomaRt, with the exception of lncRNAs, which is internally renamed to lincRNAs when first fetching all gene types. This is done due to a recent change in biomaRt and aims at keeping backwards compatibility with GRN objects.
allowMissingGenes: Logical vector of length 1 or 2. Default c(FALSE, TRUE). Allow genes to be missing for peak-gene connections? If both FALSE and TRUE are given, the code loops over both.
allowMissingTFs: Logical vector of length 1 or 2. Default c(FALSE). Allow TFs to be missing for TF-peak connections? If both FALSE and TRUE are given, the code loops over both.
forceRerun: TRUE or FALSE. Default FALSE. Force execution, even if the GRN object already contains the result. Overwrites the old results.
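To get a feel for how many filtering runs a given set of arguments implies, the argument grid can be enumerated in base R. This is only a sketch: it assumes that every combination of the vector-valued arguments is evaluated once, which the descriptions above suggest but do not state explicitly. The names param_grid and n_runs are made up for illustration, and the code does not call GRaNIE.

```r
# Hypothetical illustration, not GRaNIE code: enumerate the argument
# combinations implied by the default values shown in the usage section.
TF_peak.fdr       <- c(0.001, 0.01, 0.05, 0.1, 0.2)
peak_gene.fdr     <- c(0.001, 0.01, 0.05, 0.1, 0.2)
allowMissingGenes <- c(FALSE, TRUE)
allowMissingTFs   <- c(FALSE)

# Assumption: the function evaluates the full cross-product of these vectors.
param_grid <- expand.grid(
  TF_peak.fdr       = TF_peak.fdr,
  peak_gene.fdr     = peak_gene.fdr,
  allowMissingGenes = allowMissingGenes,
  allowMissingTFs   = allowMissingTFs
)

# One row per combination of filter settings.
n_runs <- nrow(param_grid)
print(n_runs)
print(head(param_grid))
```

Under that assumption, the defaults give 5 x 5 x 2 x 1 = 50 combinations per connection type, which is why trimming the FDR vectors (as in the example call with two values each) can shorten the run time noticeably.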
An updated GRN object, with additional information added from this function.
# See the Workflow vignette on the GRaNIE website for examples
GRN = loadExampleObject()
#> Downloading GRaNIE example object from https://git.embl.de/grp-zaugg/GRaNIE/-/raw/master/data/GRN.rds
#> INFO [2023-08-16 17:28:10] Storing GRN@data$RNA$counts matrix as sparse matrix because fraction of 0s is > 0.1 (0.44)
#> Finished successfully. You may explore the example object. Start by typing the object name to the console to see a summaty. Happy GRaNIE'ing!
GRN = generateStatsSummary(GRN, TF_peak.fdr = c(0.01, 0.1), peak_gene.fdr = c(0.01, 0.1))
#> INFO [2023-08-16 17:28:10] Generating summary. This may take a while...
#> INFO [2023-08-16 17:28:10]
#> Real data...
#>
#> INFO [2023-08-16 17:28:10] Calculate network stats for TF-peak FDR of 0.01
#> INFO [2023-08-16 17:28:15] Calculate network stats for TF-peak FDR of 0.1
#> INFO [2023-08-16 17:28:20]
#> Permuted data...
#>
#> INFO [2023-08-16 17:28:20] Calculate network stats for TF-peak FDR of 0.01
#> INFO [2023-08-16 17:28:24] Calculate network stats for TF-peak FDR of 0.1
#> INFO [2023-08-16 17:28:29] Finished successfully. Execution time: 18.7 secs