GRN
object.addTFBS.Rd
For this, a folder that contains one TFBS file per TF in bed or bed.gz format must be given (see details). The folder must also contain a so-called translation table; see the argument translationTable for details. We provide example files for selected supported genome assemblies (hg19, hg38 and mm10) that are fully compatible with GRaNIE as separate downloads. For more information, check https://difftf.readthedocs.io/en/latest/chapter2.html#dir-tfbs.
addTFBS(
  GRN,
  source = "custom",
  motifFolder = NULL,
  TFs = "all",
  translationTable = "translationTable.csv",
  translationTable_sep = " ",
  filesTFBSPattern = "_TFBS",
  fileEnding = ".bed",
  nTFMax = NULL,
  EnsemblVersion = NULL,
  JASPAR_useSpecificTaxGroup = NULL,
  JASPAR_removeAmbiguousTFs = TRUE,
  forceRerun = FALSE,
  ...
)
Object of class GRN
Character. One of "custom", "JASPAR". Default "custom". If a custom source is used, further details about the motif folder and files must be provided (see the other function arguments). If set to "JASPAR", the JASPAR2022 database is used.
Character. No default. Only relevant if source = "custom". Path to the folder that contains the TFBS predictions. The files must be in BED format, 6 columns, one file per TF. See the other parameters for more details. The folder must also contain a so-called translation table; see the argument translationTable for details.
Character vector. Default "all". Only relevant if source = "custom". Vector of TF names to include. The special keyword "all" can be used to include all TFs found in the folder specified by motifFolder. If "all" is specified anywhere in the vector, all TFs will be included. Otherwise, TF names must match the file names found in the folder, without the file suffix.
Character. Default "translationTable.csv". Only relevant if source = "custom". Name of the translation table file that is located in the same folder as the TFBS files. This file must have at least 2 columns, called ENSEMBL and ID. ID denotes the ID for the TF that is used throughout the pipeline (e.g., AHR) and the prefix of the corresponding file name (e.g., AHR.0.B if the file for AHR is called AHR.0.B_TFBS.bed.gz), while ENSEMBL denotes the Ensembl ID (e.g., ENSG00000106546; dot suffixes are removed automatically if present).
Character. Default " " (white space character). Only relevant if source = "custom". The column separator for the translationTable file.
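As a minimal sketch of the layout described above, the following R snippet writes a two-column translation table using the default file name and the default space separator. The temporary folder and the second row (CTCF, its file prefix and its Ensembl ID) are illustrative assumptions, not files shipped with GRaNIE.

```r
# Hypothetical folder standing in for motifFolder
motif_folder <- file.path(tempdir(), "TFBS_example")
dir.create(motif_folder, showWarnings = FALSE, recursive = TRUE)

# One row per TF: ENSEMBL holds the Ensembl gene ID, ID the file-name prefix
translation <- data.frame(
  ENSEMBL = c("ENSG00000106546", "ENSG00000102974"),
  ID      = c("AHR.0.B", "CTCF.0.A"),  # second entry is an assumed example
  stringsAsFactors = FALSE
)

# Write with a header row and a single space as the column separator
table_path <- file.path(motif_folder, "translationTable.csv")
write.table(translation, file = table_path, sep = " ",
            quote = FALSE, row.names = FALSE)

# Read the file back the same way to confirm the required columns exist
check <- read.table(table_path, header = TRUE, sep = " ",
                    stringsAsFactors = FALSE)
stopifnot(all(c("ENSEMBL", "ID") %in% colnames(check)))
```

If a different separator is used when writing the file, the same value would need to be passed as translationTable_sep.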
Character. Default "_TFBS". Only relevant if source = "custom". Suffix for the file names in the TFBS folder that is not part of the TF name. Can be empty. For example, for the TF CTCF, if the file is called CTCF.all.TFBS.bed, set this parameter to ".all.TFBS".
Character. Default ".bed". Only relevant if source = "custom". File ending for the files from the motif folder.
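To illustrate how filesTFBSPattern and fileEnding relate file names to TF names, here is a small base R sketch. The helper tf_names_from_files is hypothetical and only mirrors the naming convention described above; it is not GRaNIE's internal implementation.

```r
# Illustrative helper (not part of GRaNIE): derive TF names from file names
# by stripping the combined suffix filesTFBSPattern + fileEnding.
tf_names_from_files <- function(files,
                                filesTFBSPattern = "_TFBS",
                                fileEnding = ".bed") {
  suffix <- paste0(filesTFBSPattern, fileEnding)
  # Keep only files that end with the expected suffix
  matching <- files[endsWith(files, suffix)]
  # Drop the suffix to obtain the TF name
  substr(matching, 1, nchar(matching) - nchar(suffix))
}

# File names taken from the examples in the argument descriptions
name_ahr <- tf_names_from_files(
  c("AHR.0.B_TFBS.bed.gz", "translationTable.csv"),
  fileEnding = ".bed.gz"
)
name_ctcf <- tf_names_from_files(
  "CTCF.all.TFBS.bed",
  filesTFBSPattern = ".all.TFBS"
)
```

Here name_ahr is "AHR.0.B" and name_ctcf is "CTCF"; the translation table file is skipped because it does not carry the TFBS suffix.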
NULL or Integer[1,]. Default NULL. Maximal number of TFs to import. Can be used for testing purposes, e.g., setting to 5 only imports 5 TFs even though the whole motifFolder has many more TFs defined.
NULL or Character(1). Default NULL. Only relevant if source is not set to "custom", ignored otherwise. The Ensembl version to use for the retrieval of gene IDs from their provided database names (e.g., JASPAR) via biomaRt. By default (NULL), the newest version is used for the most recent genome assembly versions (see biomaRt::listEnsemblArchives() for supported versions). This parameter can override this to use a custom (older) version instead.
NULL or Character(1). Default NULL. Should a tax group instead of the specific genome assembly be used for retrieving the TF list? This is useful for genomes other than human or mouse, for which JASPAR otherwise returns too few TFs. If set to NULL, the specific genome version as provided in the object is used for species in the opts list within TFBSTools::getMatrixSet, while tax_group is used instead if this argument is not NULL. For example, it can be set to "vertebrates" to use the vertebrates TF collection. For more details, see ?TFBSTools::getMatrixSet.
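A hedged sketch of a JASPAR-based call follows. The wrapper add_jaspar_tfbs is a hypothetical convenience function; only the argument names come from the usage section above. Calling it requires the GRaNIE package and an existing GRN object, neither of which is created here.

```r
# Sketch only: wraps addTFBS for the JASPAR source.
# Defining the wrapper does not require GRaNIE; calling it does.
add_jaspar_tfbs <- function(GRN, tax_group = NULL) {
  GRaNIE::addTFBS(
    GRN,
    source = "JASPAR",
    # NULL: use the genome assembly stored in the object;
    # otherwise a tax group such as "vertebrates"
    JASPAR_useSpecificTaxGroup = tax_group,
    JASPAR_removeAmbiguousTFs = TRUE,
    forceRerun = TRUE
  )
}

# Example call, assuming a GRN object already exists in the session:
# GRN <- add_jaspar_tfbs(GRN, tax_group = "vertebrates")
```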
TRUE or FALSE. Default TRUE. Remove TFs for which the name as provided by JASPAR cannot be mapped uniquely to one and only one Ensembl ID?
TRUE or FALSE. Default FALSE. Force execution, even if the GRN object already contains the result. Overwrites the old results.
Additional named elements for the opts function argument from ?TFBSTools::getMatrixSet that is used to query the JASPAR database.
An updated GRN object, with additional information added from this function (GRN@annotation$TFs in particular).
# See the Workflow vignette on the GRaNIE website for examples
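As a complement to the vignette, a call for the custom source might look like the sketch below. The wrapper add_custom_tfbs and the folder path in the commented call are hypothetical; the argument names and values are taken from the usage and argument descriptions above. Running the call requires the GRaNIE package, an existing GRN object, and a folder with TFBS files plus a translation table.

```r
# Sketch only: wraps addTFBS for a custom TFBS folder.
# Defining the wrapper does not require GRaNIE; calling it does.
add_custom_tfbs <- function(GRN, motif_folder) {
  GRaNIE::addTFBS(
    GRN,
    source = "custom",
    motifFolder = motif_folder,
    TFs = "all",
    translationTable = "translationTable.csv",
    translationTable_sep = " ",
    filesTFBSPattern = "_TFBS",   # files named like AHR.0.B_TFBS.bed.gz
    fileEnding = ".bed.gz",
    nTFMax = 5,                   # import only 5 TFs, e.g., for a quick test
    forceRerun = TRUE
  )
}

# Example call, assuming a GRN object and a prepared folder exist:
# GRN <- add_custom_tfbs(GRN, motif_folder = "/path/to/TFBS_folder")
```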