You can also access our individual websites (via the Members page) for further information about our research and lists of our publications.
Search results
-
Journal article: Tang W, Bertaux F, Thomas P, et al., 2020,
bayNorm: Bayesian gene expression recovery, imputation and normalisation for single cell RNA-sequencing data
, Bioinformatics, Vol: 36, Pages: 1174-1181, ISSN: 1367-4803
Motivation: Normalisation of single cell RNA sequencing (scRNA-seq) data is a prerequisite to their interpretation. The marked technical variability, high amounts of missing observations and batch effect typical of scRNA-seq datasets make this task particularly challenging. There is a need for an efficient and unified approach for normalisation, imputation and batch effect correction.
Results: Here, we introduce bayNorm, a novel Bayesian approach for scaling and inference of scRNA-seq counts. The method’s likelihood function follows a binomial model of mRNA capture, while priors are estimated from expression values across cells using an empirical Bayes approach. We first validate our assumptions by showing this model can reproduce different statistics observed in real scRNA-seq data. We demonstrate using publicly-available scRNA-seq datasets and simulated expression data that bayNorm allows robust imputation of missing values generating realistic transcript distributions that match single molecule FISH measurements. Moreover, by using priors informed by dataset structures, bayNorm improves accuracy and sensitivity of differential expression analysis and reduces batch effect compared to other existing methods. Altogether, bayNorm provides an efficient, integrated solution for global scaling normalisation, imputation and true count recovery of gene expression measurements from scRNA-seq data.
Availability: The R package “bayNorm” is available at https://github.com/WT215/bayNorm. The code for analysing data in this paper is available at https://github.com/WT215/bayNorm_papercode.
Contact: samuel.marguerat@imperial.ac.uk or v.shahrezaei@imperial.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
-
Journal article: Greenbury S, Barahona M, Johnston I, 2020,
HyperTraPS: Inferring probabilistic patterns of trait acquisition in evolutionary and disease progression pathways
, Cell Systems, Vol: 10, Pages: 39-51, ISSN: 2405-4712
The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalisable statistical platform to infer the dynamic pathways by which many, potentially interacting, discrete traits are acquired or lost over time in biomedical systems. The platform uses HyperTraPS (hypercubic transition path sampling) to learn progression pathways from cross-sectional, longitudinal, or phylogenetically-linked data with unprecedented efficiency, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. Its Bayesian structure quantifies uncertainty in pathway structure and allows interpretable predictions of behaviours, such as which symptom a patient will acquire next. We exploit the model’s topology to provide visualisation tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.
-
Journal article: Hoffmann T, Peel L, Lambiotte R, et al., 2020,
Community detection in networks without observing edges
, Science Advances, Vol: 6, ISSN: 2375-2548
We develop a Bayesian hierarchical model to identify communities of time series. Fitting the model provides an end-to-end community detection algorithm that does not extract information as a sequence of point estimates but propagates uncertainties from the raw data to the community labels. Our approach naturally supports multiscale community detection as well as the selection of an optimal scale using model comparison. We study the properties of the algorithm using synthetic data and apply it to daily returns of constituents of the S&P100 index as well as climate data from US cities.
-
Journal article: Liu Z, Barahona M, 2020,
Graph-based data clustering via multiscale community detection
, Applied Network Science, Vol: 5, Pages: 1-20, ISSN: 2364-8228
We present a graph-theoretical approach to data clustering, which combines the creation of a graph from the data with Markov Stability, a multiscale community detection framework. We show how the multiscale capabilities of the method allow the estimation of the number of clusters, as well as alleviating the sensitivity to the parameters in graph construction. We use both synthetic and benchmark real datasets to compare and evaluate several graph construction methods and clustering algorithms, and show that multiscale graph-based clustering achieves improved performance compared to popular clustering methods without the need to set externally the number of clusters.
-
Journal article: Tonn MK, Thomas P, Barahona M, et al., 2020,
Computation of Single-Cell Metabolite Distributions Using Mixture Models.
, Front Cell Dev Biol, Vol: 8, ISSN: 2296-634X
Metabolic heterogeneity is widely recognized as the next challenge in our understanding of non-genetic variation. A growing body of evidence suggests that metabolic heterogeneity may result from the inherent stochasticity of intracellular events. However, metabolism has been traditionally viewed as a purely deterministic process, on the basis that highly abundant metabolites tend to filter out stochastic phenomena. Here we bridge this gap with a general method for prediction of metabolite distributions across single cells. By exploiting the separation of time scales between enzyme expression and enzyme kinetics, our method produces estimates for metabolite distributions without the lengthy stochastic simulations that would be typically required for large metabolic models. The metabolite distributions take the form of Gaussian mixture models that are directly computable from single-cell expression data and standard deterministic models for metabolic pathways. The proposed mixture models provide a systematic method to predict the impact of biochemical parameters on metabolite distributions. Our method lays the groundwork for identifying the molecular processes that shape metabolic heterogeneity and its functional implications in disease.
-
Journal article: Hodges M, Yaliraki SN, Barahona M, 2019,
Edge-based formulation of elastic network models
, Physical Review Research, Pages: 033211-033211
We present an edge-based framework for the study of geometric elastic network models to model mechanical interactions in physical systems. We use a formulation in the edge space, instead of the usual node-centric approach, to characterise edge fluctuations of geometric networks defined in d-dimensional space and define the edge mechanical embeddedness, an edge mechanical susceptibility measuring the force felt on each edge given a force applied on the whole system. We further show that this formulation can be directly related to the infinitesimal rigidity of the network, which additionally permits three- and four-centre forces to be included in the network description. We exemplify the approach in protein systems, at both the residue and atomistic levels of description.
-
Journal article: McGrath T, Spreckley E, Rodriguez A, et al., 2019,
The homeostatic dynamics of feeding behaviour identify novel mechanisms of anorectic agents
, PLoS Biology, Vol: 17, Pages: 1-30, ISSN: 1544-9173
Better understanding of feeding behaviour will be vital in reducing obesity and metabolic syndrome, but we lack a standard model that captures the complexity of feeding behaviour. We construct an accurate stochastic model of rodent feeding at the bout level in order to perform quantitative behavioural analysis. Analysing the different effects on feeding behaviour of PYY3-36, lithium chloride, GLP-1 and leptin shows the precise behavioural changes caused by each anorectic agent. Our analysis demonstrates that the changes in feeding behaviour evoked by the anorectic agents investigated do not mimic the behaviour of well-fed animals, and that the intermeal interval is influenced by fullness. We show how robust homeostatic control of feeding thwarts attempts to reduce food intake, and how this might be overcome. In silico experiments suggest that introducing a minimum intermeal interval or modulating upper gut emptying can be as effective as anorectic drug administration.
-
Journal article: Latorre-Pellicer A, Lechuga-Vieco AV, Johnston IG, et al., 2019,
Regulation of mother-to-offspring transmission of mtDNA heteroplasmy
, Cell Metabolism, Vol: 30, Pages: 1120-1130.e5, ISSN: 1550-4131
mtDNA is present in multiple copies in each cell derived from the expansions of those in the oocyte. Heteroplasmy, more than one mtDNA variant, may be generated by mutagenesis, paternal mtDNA leakage, and novel medical technologies aiming to prevent inheritance of mtDNA-linked diseases. Heteroplasmy phenotypic impact remains poorly understood. Mouse studies led to contradictory models of random drift or haplotype selection for mother-to-offspring transmission of mtDNA heteroplasmy. Here, we show that mtDNA heteroplasmy affects embryo metabolism, cell fitness, and induced pluripotent stem cell (iPSC) generation. Thus, genetic and pharmacological interventions affecting oxidative phosphorylation (OXPHOS) modify competition among mtDNA haplotypes during oocyte development and/or at early embryonic stages. We show that heteroplasmy behavior can fall on a spectrum from random drift to strong selection, depending on mito-nuclear interactions and metabolic factors. Understanding heteroplasmy dynamics and its mechanisms provide novel knowledge of a fundamental biological process and enhance our ability to mitigate risks in clinical applications affecting mtDNA transmission.
-
Book chapter: Schaub MT, Delvenne J-C, Lambiotte R, et al., 2019,
Structured networks and coarse-grained descriptions: a dynamical perspective
, Advances in Network Clustering and Blockmodeling, Editors: Doreian, Batagelj, Ferligoj, Publisher: John Wiley and Sons, Ltd, Pages: 333-361, ISBN: 9781119224709
This chapter discusses the interplay between structure and dynamics in complex networks. Given a particular network with an endowed dynamics, our goal is to find partitions aligned with the dynamical process acting on top of the network. We thus aim to gain a reduced description of the system that takes into account both its structure and dynamics. In the first part, we introduce the general mathematical setup for the types of dynamics we consider throughout the chapter. We provide two guiding examples, namely consensus dynamics and diffusion processes (random walks), motivating their connection to social network analysis, and provide a brief discussion on the general dynamical framework and its possible extensions. In the second part, we focus on the influence of graph structure on the dynamics taking place on the network, focusing on three concepts that allow us to gain insight into this notion. First, we describe how time scale separation can appear in the dynamics on a network as a consequence of graph structure. Second, we discuss how the presence of particular symmetries in the network give rise to invariant dynamical subspaces that can be precisely described by graph partitions. Third, we show how this dynamical viewpoint can be extended to study dynamics on networks with signed edges, which allow us to discuss connections to concepts in social network analysis, such as structural balance. In the third part, we discuss how to use dynamical processes unfolding on the network to detect meaningful network substructures. We then show how such dynamical measures can be related to seemingly different algorithm for community detection and coarse-graining proposed in the literature. We conclude with a brief summary and highlight interesting open future directions.
-
Journal article: Lubba CH, Sethi SS, Knaute P, et al., 2019,
catch22: CAnonical time-series CHaracteristics
, Data Mining and Knowledge Discovery, Vol: 33, Pages: 1821-1852, ISSN: 1384-5810
Capturing the dynamical properties of time series concisely as interpretable feature vectors can enable efficient clustering and classification for time-series applications across science and industry. Selecting an appropriate feature-based representation of time series for a given application can be achieved through systematic comparison across a comprehensive time-series feature library, such as those in the hctsa toolbox. However, this approach is computationally expensive and involves evaluating many similar features, limiting the widespread adoption of feature-based representations of time series for real-world applications. In this work, we introduce a method to infer small sets of time-series features that (i) exhibit strong classification performance across a given collection of time-series problems, and (ii) are minimally redundant. Applying our method to a set of 93 time-series classification datasets (containing over 147,000 time series) and using a filtered version of the hctsa feature library (4791 features), we introduce a set of 22 CAnonical Time-series CHaracteristics, catch22, tailored to the dynamics typically encountered in time-series data-mining tasks. This dimensionality reduction, from 4791 to 22, is associated with an approximately 1000-fold reduction in computation time and near linear scaling with time-series length, despite an average reduction in classification accuracy of just 7%. catch22 captures a diverse and interpretable signature of time series in terms of their properties, including linear and non-linear autocorrelation, successive differences, value distributions and outliers, and fluctuation scaling properties. We provide an efficient implementation of catch22, accessible from many programming environments, that facilitates feature-based time-series analysis for scientific, industrial, financial and medical applications using a common language of interpretable time-series properties.
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.