New features

  • say hello to a new function filterConnectionsForPlotting() that can be used to include or exclude particular connections from the stored eGRN for visualization purposes only (!). Note that this filter only applies to visualization and enables a flexible system to visually explore particular features of the stored eGRN. THis is particularly handy when the eGRN is large. For more details, see the help pages of the new function.
  • similarly, the function visualizeGRN() now by default only visualizes connections that are marked as such (the result from filterConnectionsForPlotting()) - that is, it excludes connections that the user beforehand excluded from plotting. This allows to specifically plot only part of the eGRN network and explore specific T&F regulons, for example, a feature that before was not so easy to do.
  • It is now possible to integrate SNP data into GRaNIE via the new function addSNPData(). For more information, see the Package vignette.
  • version jump due to new Bioconductor development cycle

New features and stability improvements

  • we replaced biomaRt for the full genome annotation retrieval in addData with a different approach that is more reliable, as we had more and more issues with biomaRt in the recent past. While using the old biomaRt approach is still an option, the default is now to use the AnnotationHub package from Bioconductor. This makes GRaNIE overall more stable and less reliant on biomaRt due to the strict timeouts and query size restrictions.

New features and vignette updates

  • a new correlation method has been implemented that replaces the “old” robust method (via addRobustRegression) that was available as an experimental feature until now. It is implemented in the WGCNA package and called biweight midcorrelation or short bicor, a robust type of correlation based on medians that can be used as an alternative to Spearman correlation. Biweight midcorrelation has been shown to be more robust in evaluating similarity in gene expression networks and is often used for weighted correlation network analysis. In addition, this new correlation type can now also be selected for the TF-peak correlations, which was not possible before. Lastly, the code has been cleaned and simplified and all instances of addRobustRegression have been removed and replaced by a new third option bicor (in addition to Pearson and Spearman, as before) for the corMethod argument in multiple functions that support this feature. All vignettes have been updated accordingly.
  • in addData, when a DESeq2 size factor normalization is selected by the user, it is now explicitly checked whether enough genes are available that contain no 0 values on which the size factor normalization is based on. If this is not the case (the default and hard-coded limit is currently set to a minimum of 100), an error is thrown. This becomes particularly relevant for single-cell derived data with a high fraction of 0s, and prevents a normalization based on very few genes and improves the error messages that DESeq2 throws otherwise for an improved user experience.

Bugfixes and stability improvements

  • improved the stability of the biomaRt call, which did not work as originally intended in case of temporary connection failures. Now, calls to biomaRt are attempted up to 40 times to increase the chances of not suffering from connection issues. Also, the approach to deal with BiocParallel failures has been changed.

Paper acceptance and publication update

Bugfixes

  • many small bugfixes and other small improvements to homogenize the user experience due to the usage of systematic unit tests

New features and vignette updates

  • we provide two new functions with this update:
    1. getGRNSummary() that summarizes a GRN object and returns a named list, which can be used to compare different GRN objects ore easily among each other, for example.
    2. plotCorrelations() for scatter plots of the underlying data for either TF-peak, peak-gene or TF-gene pairs. This can be useful to visualize specific TF-peak, peak-gene or TF-gene pairs to investigate the underlying data and to judge the reasonability of the inferred connection.
  • methods vignette updates

Bugfixes

  • various small bugfixes that were accidentally introduced in the latest change from using the TF.ID instead of TF.name column as unique TF identifier

New features and vignette updates

  • added two new supported genomes: rn6/rn7 and dm6 for the rat and the Drosophila (fruit fly) genome, respectively
  • added preliminary support for a new, alternative way of how to import TF and TFBS data into GRaNIE. We now additionally offer a more user-friendly way by making it possible to directly use the JASPAR2022 database. You do not need any custom files anymore for this approach! See the Package vignette for more details.

Bugfixes

  • fixed a regression bug in addConnections_TF_peak (Columnpeak.GC.classdoesn't exist.) that was caused due to the recent GC modifications

New features and vignette updates

  • additional significant methods vignette updates
  • updates and clarifications for the workflow vignette
  • a new QC plot for plotDiagnosticPlots_TFPeaks (and indirectly in addConnections_TF_peak when plotDiagnosticPlots = TRUE) on page 1 that shows the total number of connections for real and background TF-peak links as calculated and stored in the GRN object, stratified by TF-peak FDR and correlation bin. This is a similar plot as we show in the paper and helps comparing foreground and background.

Improvements

  • speed improvements for plotDiagnosticPlots_TFPeaks (and indirectly in addConnections_TF_peak when plotDiagnosticPlots = TRUE) when plotAsPDF = FALSE

Bugfixes

  • fixed a bug that only occurred in addConnections_TF_peak when using useGCCorrection = TRUE

New features and vignette updates

  • significant methods vignette updates that help clarifying methods details

Minor changes

  • Small workflow vignette updates

Bug fixes

  • we were informed that newer versions of dplyr (1.1.0) changed their default behavior for the function if_else when NULL is involved, which caused an error. We changed the implementation to accommodate for that and now avoid dplyr::if_else and use base R ifelse instead.

Minor changes

  • Small vignette updates and fixing typos / improved wording

Bug fixes

  • due to a change from USCS that affected GenomeInfoDb::getChromInfoFromUCSC("hg38") (see here for more details), the minimum required version of GenomeInfoDb had to be increased to 1.34.8. If you have troubles installing at least this version, we recommend updating to the newest Bioconductor version 3.16 or (without warranties) use the following line to manually install the newest version directly from GitHub outside of Bioconductor (not recommended): BiocManager::install("Bioconductor/GenomeInfoDb)"
  • small change in addData() so that peak IDs are stored with the same name in the object in case the user-provided peak IDs have the format chr:start:end as opposed to the required chr:start-end. filterData() otherwise incorrectly discarded all peaks because of the ID mismatch caused by the two different formats.
  • fixed a rare edge case in filterGRNAndConnectGenes() that caused an error when 0 TF-peak connections were found beforehand

New features

  • We are excited to announce that we added a new vignette for how to use GRaNIE for single-cell data! We plan to update it regularly with new information. Check it out here!

New features

  • significant updated to the package details vignette
  • revisited and improved the internal logging and object history. The time when a function was called is now added to the list name, which allows the storage of multiple instances of the same function.
  • new parameter in addData(): geneAnnotation_customHost to specify a custom host and overriding the default and previously hard-coded hostname when retrieving gene annotation data via biomaRt.
  • the function getGRNConnections() can now also include the various additional metadata for all type parameters and not only the default type all.filtered.

Bug fixes

  • fixed an error that appeared in rare cases when a chromosome name from either peak or RNA data could not be found in biomaRt such as GL000194.1. Peaks from chromosomes with irretrievable lengths are now automatically discarded.
  • significant updates to the package details vignette

New features

  • the function plotDiagnosticPlots_peakGene() (which is also called indirectly from addConnections_peak_gene() when setting plotDiagnosticPlots = TRUE) now stores the plot data for the QC plots from the first page into the GRN object. It is stored in GRN@stats$peak_genes
  • the columns of the result table from getGRNConnections() are now explained in detail in the R help, and we reference this from the Vignette and other places
  • various significant Vignette updates

Bug fixes

  • optimized the column names for the function getGRNConnections(), which now does not return duplicate columns for particular cases anymore
  • improved printing in the log for the function filterData() and addData()
  • the loadExampleObject() function has been optimized and should now force download an example object when requesting it.
  • the package version as stored in the GRN object now works correctly.

Minor changes

  • further code cleaning in light of the tidyselect changes in version 1.2.0 to eliminate deprecated warnings
  • the default gene types for addConnections_peak_gene() and plotDiagnosticPlots_peakGene() have been homogenized and changed to list(c("all"), c("protein_coding")). Before, the default was list(c("protein_coding", "lincRNA")), but we decided to now split this into two separate lists: Once for all genes irrespective of the gene type and once for only protein-coding genes. As before, lincRNA or other gene types can of course still be selected and chosen.
  • various minor changes

Bug fixes

Minor changes

  • further code cleaning in light of the tidyselect changes in version 1.2.0 to eliminate deprecated warnings

Major changes

  • the default URL for the example GRN object in loadExampleObject() had to be changed due to changes in the IT infrastructure. The new stable default URL is now , in the same Git repository that provides GRaNIE outside of Bioconductor.

Bug fixes

  • fixing bugs introduced due to the tidyverse 1.2.0 related code cleaning
  • other bugfix accidentally introduced in the previous commits

Bug fixes

  • revisited the import of TADs and made the code more error-prone and fixed some bugs related to TADs. Importing TADs now works again as before.

Minor changes

  • code cleaning in light of the tidyselect changes in version 1.2.0 to eliminate deprecated warnings

New features

New features

Minor changes

  • first round of code cleaning in light of the tidyselect changes in version 1.2.0 to eliminate deprecated warnings

Major changes

  • the topGO package is now required package and not optional anymore. The reasoning for this is that the standard vignette should run through with the default arguments, and GO annotation is the default ontology so topGO is needed for this. Despite this package still being optional from a strict workflow point of view, we feel this is a better way and improves user friendliness by not having to install another package in the middle of the workflow.

Minor changes

  • in initializeGRN(), the objectMetadata argument is now checked whether it contains only atomic elements, and an error is thrown if this is not the case. As this list is not supposed to contain real data, checking this prevents the print(GRN) function to unnecessarily print the whole content of the provided object metadata, thereby breaking the original purpose.

New features

  • addTFBS() got two more arguments to make it more flexible. Now, it is possible to specify the file name of the translation table to be used via the argument translationTable, which makes it more flexible than the previously hard-coded name "translationTable.csv. In addition, the column separator for this file can now be specified via the argument translationTable_sep
  • Overlapping TFBS data with the peak is now more error-tolerant and does not error out in case that some chromosome or contig names from the TFBS BED files contain elements the size of which cannot be retrieved online. This was the case for some contig names with the suffix decoy, for example. If such elements are found, a warning is now thrown and they are ignored as they are usually not wanted anyway.
  • in case a GRN objects contains 0 connections (e..g, because of too strict filtering), subsequent functions as well as the print function now give a more user-friendly warning / error message.

New features

  • additional normalization schemes have been implemented, including GC-aware normalization schemes for peaks, and existing normalization methods have been renamed for clarity. See ?addData for details.
  • further reduced the package burden; the large genome annotation packages are now more or less fully optional and only needed when a GC-aware normalization has been chosen or when additional peak annotation is wanted. However, in contrast to before, none of these annotation packages are strictly required anywhere anymore. The vignettes have been updated accordingly.

Minor changes

  • various small changes in the code
  • vignette updates

Major changes and new features

  • major object changes and optimizations, particularly related to storing the count matrices in an optimized and simpler format. In short, the count matrices are now stored either as normal or sparse matrices, depending on the amount of zeros present. In addition, only the counts after normalization are saved, the raw counts before applying normalization are not stored anymore. If no normalization is wished by the user, as before, the “normalized” counts are equal to the raw counts. GRaNIE is now more readily applicable for larger analyses and single-cell analysis even though we just started actively optimizing for it, so we cannot yet recommend applying our framework in a single-cell manner. Older GRN objects are automatically changed internally when executing the major functions upon the first invocation.
  • various Documentation and R help updates
  • the function generateStatsSummary() now doesnt alter the stored filtered connections in the object anymore. This makes its usage more intuitive and it can be used anywhere in the workflow.
  • removed redundant biomaRt calls in the code. This saves time and makes the code less vulnerable to timeout issues caused by remote services
  • due to the changes described above, the function plotPCA_all() now can only plot the normalized counts and not the raw counts anymore (except when no normalization is wanted)
  • the GO enrichments are now also storing, for each GO term, the ENSEMBL IDs of the genes that were found in the foreground. This facilitates further exploration of the enrichment results.

Minor changes

  • many small changes in the code

Major changes and new features

  • many Documentation and R help updates, the Package Details Vignette is online
  • The workflow vignette is now improved: better figure resolution, figure aspect ratios are optimized, and a few other changes
  • the eGRN graph structure as built by build_eGRN_graph() in the GRaNIE object is now reset whenever the function filterGRNAndConnectGenes() is successfully executed to make sure that enrichment functions etc are not using an outdated graph structure.
  • the landing page of the website has been extended and overhauled
  • removed some dependency packages and moved others into Suggests to lower the installation burden of the package. In addition, removed topGO from the Depends section (now in Suggests) and removed tidyverse altogether (before in Depends). Detailed explanations when and how the packages listed under Suggests are needed can now be found in the new Package Details Vignette and are clearly given to the user when executing the respective functions
  • major updates to the function getGRNConnections(), which now has more arguments allowing a more fine-tuned and rich retrieval of eGRN connections, features and feature metadata
  • a new function add_featureVariation() to quantify and interpret multiple sources of biological and technical variation for features (TFs, peaks, and genes) in a GRN object, see the R help for more information
  • filterGRNAndConnectGenes() now doesnt include feature metadata columns to save space in the result data frame that is created. The help has been updated to make clear that getGRNConnections() includes these features now.

Minor changes

  • small changes in the GRN object structure, moved GRN@data$TFs@translationTable to GRN@annotation@TFs. All exported functions run automatically a small helper function to make this change for any GRN object automatically to adapt to the new structure
  • many small changes in the code, updated argument checking, and preparing rigorous unit test inclusion
  • internally renaming the (recently changed / renamed) gene type lncRNA from biomaRt to lincRNA to be compatible with older versions of GRaNIE

New features

  • added the argument maxWidth_nchar_plot to all functions that plot enrichments, and changed the default from 100 to 50.

Bug fixes

  • fixed a small bug that resulted in the enrichment plots to ignore the value of maxWidth_nchar_plot

Major changes and new features

  • Bioconductor acceptance: this version is the final version for the Bioconductor 3.15 release branch
  • full inclusion of the GRN visualization
  • extensive vignette updates
  • added the possibility to print only particular output pages for all plot functions

Bug fixes

  • various minor bug fixes

Minor changes

  • various minor changes

Major changes and new features

  • all enrichment analyses have been extended and improved, we added additional ontologies (KEGG, DO, and Reactome), more information in the resulting summary plots
  • all plotting functions have been homogenized and extended, PDF width and height can now be set for all exported plot functions. Also, the possibility to not to a PDF but instead to the currently active graphics device is possible. Lastly, setting a different filename is finally possible. Collectively, this provides ultimate flexibility for customizing file names, the output devices used and PDF sizes
  • we added a function to build the eGRN network that can be parameterized and that allows future developmemt more easily. Now, network-specific parameters can be changed, such as whether loops should be allowed
  • we removed the GRaNIEdev package, the development now happens in a separate branch rather than a different package
  • we added Leiden clustering for community clustering (see https://www.nature.com/articles/s41598-019-41695-z for a comparison with louvain)
  • extensive vignette updates

Bug fixes

  • various minor bug fixes

Minor changes

  • changed the object structure slightly (graph slot and structure within the stats$enrichment slot)

Major changes and new features

  • major overhaul and continuous work on peak-gene QC plots
  • the filterData functions has now more filter parameters, such as filtering for CV. Also, all filters uniformly have a min and max filter.
  • integrated network statistics and various enrichment analyses
  • handling of edge cases and rare events in various functions
  • packages have been renamed to GRaNIE as basename (before: GRN)

Bug fixes

  • various minor bug fixes

Minor changes

  • changed the object structure slightly and moved some gene and peak annotation data (such as mean, CV) to the appropriate annotation slot

Major changes and new features

  • improved PCA plotting, PCA plots are now produced for both raw and normalized data
  • new filters for the function filterGRaNIEAndConnectGenes() (peak_gene.maxDistance) as well as more flexibility how to adjust the peak-gene raw p-values for multiple testing (including the possibility to use IHW - experimental)
  • new function plotDiagnosticPlots_TFPeaks() for plotting (this function was previously called only internally, but is now properly exported), in analogy to plotDiagnosticPlots_peakGene()

Bug fixes

  • various minor bug fixes (PCA plotting, compatibility when providing pre-normalized data)

Minor changes

  • changed the object structure slightly and cleaned the config slot, for example
  • some functions have been added / renamed to make the workflow more clear and streamlined, see Vignette for details
  • some default parameters changed

Major changes and new features

  • improved PCA plotting, also works for pre-normalized counts now when provided as input originally
  • more flexibility for data normalization
  • homogenized wordings, function calls and workflow clarity, removed unnecessary warnings when plotting peak-gene diagnostic plots, added more R help documentation
  • added IHW (Independent Hypothesis Weighting) as a multiple testing procedure for peak-gene p-values in addition to now allowing all methods that are supported by p.adjust

Bug fixes

  • various minor bug fixes

Minor changes

Major changes and new features

  • significant speed improvements for the peak-FDR calculations and subsequent plotting
  • TF-peak diagnostic plots now also show negatively correlated TF-peak statistics irrespective of whether they have been filtered out in the object / pipeline. This may be useful for diagnostic purposes to check whether excluding them is a sensible choice and to confirm the numbers are low

Bug fixes

  • Numbers for connections per correlation bin in the TF-peak diagnostic plots were wrong as they did not correctly differentiate between the different connection types in case multiple ones had been specified (e.g., expression and TF activity). This has been fixed.

Minor changes

first published package version