Research interests

I am a computational biologist with a background in theory, experience in statistical modeling, software development and data analysis, and a keen interest in multi-scale macromolecular modeling.

Since late 2015, I have been a researcher in bioinformatics and computational biology in the Molecular Assemblies and Genome Integrity team, which is part of the Structural Biology and Radiobiology laboratory at CEA, France. Our team also belongs to the Institute for Integrative Biology of the Cell.

Current research

My current research focuses on the structural prediction of macromolecular interactions by integrating heterogeneous sources of information, such as evolutionary data. I also maintain a strong interest in statistical modeling and functional genomics.

My young researcher project, ESPRINet, is funded by the ANR. In this project, I investigate the prediction of protein-RNA interaction networks using structural, evolutionary and omics data.

Our team recently released InterEvDock3, an updated version of the InterEvDock server for protein-protein docking using evolutionary information.

Our team also participates in CAPRI (the Critical Assessment of PRedicted Interactions), a community-wide initiative for testing computational algorithms in blind predictions of experimentally determined 3D structures of macromolecular complexes.

I have been involved in many collaborations with wet-lab biologists. Some recent examples can be found in Marsin et al. (2021), Dai et al. (2021), Rahman et al. (2020), Vilela et al. (2019) and Fischboeck et al. (2019).

Post-doctoral research

From September 2013 to September 2015, I was a post-doctoral researcher in the lab of Johannes Soeding (Gene Center, LMU Munich, and Max Planck Institute for Biophysical Chemistry, Goettingen, Germany). From February 2014 to September 2015, I was the recipient of a Humboldt Research Fellowship for Post-doctoral researchers.

During my postdoc in the Soeding lab, I started working on a project involving high-throughput sequencing data analysis. In collaboration with the laboratory of Prof. Patrick Cramer, I analyzed data from 4tU-seq (next-generation sequencing of newly synthesized RNA), ChIP-seq (to map the binding sites of regulatory protein factors over the genome), and PAR-CLIP (to identify the binding sites of RNA-binding proteins), with the goal of studying transcription and genomic regulation. Our genome-wide study of transcription termination in yeast was published in Baejen, Andreani et al. (2017).

During my postdoc in the Soeding lab, I also worked on the de novo prediction of residue-residue contacts in proteins from correlated mutations, using statistical modeling and pattern recognition. This topic has important applications in the field of protein structure prediction from sequence alone. Methods of global statistical network analysis can explain the observed correlations between columns in a multiple sequence alignment by a small set of directly coupled pairs of columns. Strong couplings are indicative of residue-residue contacts, and a structure can be computed from the predicted contacts. The structural regularity of paired β-strands leads to characteristic patterns in the noisy matrices of couplings. I developed bbcontacts, a tool which predicts β-β contacts by detecting these characteristic patterns in the 2D map of predicted coupling scores using hidden Markov models (HMMs). The method was published in Bioinformatics, and bbcontacts is open-source software under the GNU Affero General Public License v3 (or later).
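To illustrate why paired β-strands stand out in a coupling map, here is a minimal sketch that scans a matrix of coupling scores for short diagonal (parallel) or anti-diagonal (antiparallel) runs of strong couplings. It is not the bbcontacts algorithm, which relies on hidden Markov models; the function name best_ladder, its parameters length and min_sep, and the toy matrix are all assumptions made for this example.

```python
import numpy as np


def best_ladder(couplings, length=4, min_sep=3):
    """Find the strongest ladder-like pattern in a square coupling matrix.

    Hypothetical helper for illustration only: it sums coupling scores
    along short diagonal (parallel) and anti-diagonal (antiparallel)
    windows and returns (score, i, j, orientation) for the best window.
    """
    n = couplings.shape[0]
    k = np.arange(length)
    best = (float("-inf"), -1, -1, "")
    for i in range(n - length + 1):
        for j in range(i + 1, n):
            # Parallel pairing: residues (i+k, j+k) for k = 0..length-1.
            # The second strand must start at least min_sep after the
            # last residue of the first strand.
            if j + length <= n and j - (i + length - 1) >= min_sep:
                score = float(couplings[i + k, j + k].sum())
                if score > best[0]:
                    best = (score, i, j, "parallel")
            # Antiparallel pairing: residues (i+k, j-k).
            # The closest pair (k = length-1) must be min_sep apart.
            if (j - length + 1) - (i + length - 1) >= min_sep:
                score = float(couplings[i + k, j - k].sum())
                if score > best[0]:
                    best = (score, i, j, "antiparallel")
    return best


# Toy coupling map (assumed data): zeros plus one planted antiparallel
# ladder pairing residues 2-15, 3-14, 4-13 and 5-12.
demo = np.zeros((20, 20))
for a, b in [(2, 15), (3, 14), (4, 13), (5, 12)]:
    demo[a, b] = demo[b, a] = 1.0

result = best_ladder(demo)
print(result)
```

On real coupling maps the scores are noisy, which is one reason a probabilistic model such as an HMM is better suited than this fixed-window sum.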

Similarly, the detection of coupling patterns corresponding to interactions between α-helices was used, together with cryo-EM data and molecular dynamics simulations, to build a structural model of the membrane protein insertase YidC. This is described in Wickles et al. (2014).

Ph.D. research

In July 2013, I completed my Ph.D. in structural bioinformatics, under the supervision of Raphaël Guerois at CEA Saclay (France). I worked on protein-protein interactions from a structural and evolutionary point of view.

We developed the InterEvol database, which contains all non-redundant protein-protein interfaces with a known structure (retrieved from the PDB). InterEvol was designed to explore 3D structures of homologous interfaces of protein complexes and provides tools to retrieve multiple sequence alignments of orthologous pairs of proteins and to visualize structural and evolutionary information jointly in a PyMOL plugin. InterEvol is described in Faure et al. (2012).

Using a dataset of over 1,000 pairs of structural interologs (homologous heteromeric complexes) extracted from InterEvol, I analyzed the conservation of interface contacts and highlighted a surprising degree of plasticity in this respect. I also identified rather invariant features in the evolution of protein interfaces, which provide important leads for extracting meaningful information from the evolutionary history of binding partners. These findings are described in Andreani et al. (2012).

We then developed InterEvScore, an interface scoring function which couples evolutionary information with a multi-body potential in order to discriminate near-native interfaces from decoys in protein-protein docking. InterEvScore is described in Andreani et al. (2013). The InterEvScore package can be downloaded from this page.

During my Ph.D., I also worked in collaboration with structural biologists and geneticists on projects involving protein-protein docking, remote homology detection, the study of interaction networks and the design of specific peptide inhibitors. One such collaboration led to the results described in Lombardi et al. (2013). Another led to the results described in Lisboa, Andreani et al. (2014).