Probabilistic Inference and Computational Biology (PROBIC)

We develop methods for efficient probabilistic inference in complex modelling problems. We develop models for genomic time series data using Gaussian processes and methods for quantitative analysis of sequencing data. We also develop theory and methods for efficient differentially private Bayesian inference. We are part of the Probabilistic Machine Learning group at HIIT.

Group members

  • Dr Antti Honkela, PI, Assistant Professor, Academy Research Fellow
  • Dr Onur Dikmen, Postdoctoral researcher
  • Mikko Heikkilä, Doctoral student

Selected publications

A. Sankar, B. Malone, S. Bayliss, B. Pascoe, G. Méric, M.D. Hitchings, S.K. Sheppard, E.J. Feil, J. Corander and A. Honkela.
Bayesian identification of bacterial strains from sequencing data.
Microbial Genomics 2 (2016), doi:10.1099/mgen.0.000075.

A. Honkela, J. Peltonen, H. Topa, I. Charapitsa, F. Matarese, K. Grote, H.G. Stunnenberg, G. Reid, N.D. Lawrence and M. Rattray.
Genome-wide modelling of transcription kinetics reveals patterns of RNA production delays.
Proc. Natl. Acad. Sci. U S A 112(42):13115-13120 (2015), doi:10.1073/pnas.1420404112.

P. Glaus, A. Honkela, and M. Rattray.
Identifying differentially expressed transcripts from RNA-seq data with biological variation.
Bioinformatics 28(13):1721-1728 (2012), doi:10.1093/bioinformatics/bts260.

A. Honkela, T. Raiko, M. Kuusela, M. Tornio, and J. Karhunen.
Approximate Riemannian conjugate gradient learning for fixed-form variational Bayes.
Journal of Machine Learning Research 11(Nov):3235-3268 (2010).

A. Honkela, C. Girardot, E. H. Gustafson, Y.-H. Liu, E. E. M. Furlong, N. D. Lawrence and M. Rattray.
Model-based method for transcription factor target identification with limited data.
Proc. Natl. Acad. Sci. U S A 107(17):7793-7798 (2010), doi:10.1073/pnas.0914285107.

Free software packages based on our research

  • BIB: Bayesian Identification of Bacterial strains from sequencing data
  • BitSeq: Transcript isoform expression and differential expression estimation from RNA-seq data (also available through Bioconductor)
  • tigre: Ranking transcription factor candidate target genes based on time series gene expression data
  • tigreBrowser: Web-based browser for genomic time course modelling results

