Parsimonious Modelling

The Parsimonious Modelling research group of the Helsinki Institute for Information Technology HIIT operates at the Department of Information and Computer Science (ICS) of the Aalto University School of Science in Finland. It is part of the Algorithmic Data Analysis (Algodan) Centre of Excellence, selected by the Academy of Finland.

The group develops novel computational data analysis methods and applies them in two fields: cancer genomics and environmental informatics. Parsimonious modelling aims at simple, compact, or sparse models learned from data when very little or no a priori information about the modelled problem is available. Simple models make the problem domain easier for humans to understand.
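To make the idea of a sparse model concrete, here is a minimal sketch that assumes an L1-penalised linear regression (the lasso) fitted by coordinate descent. This is a generic textbook illustration, not the group's own method or software; the function names, the synthetic data, and the penalty value are all assumptions chosen for the example.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value towards zero; anything within +/- threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_sparse_linear_model(rows, targets, penalty, sweeps=200):
    """Minimise (1/2n) * ||y - Xw||^2 + penalty * ||w||_1 by coordinate descent."""
    n, d = len(rows), len(rows[0])
    columns = [[row[j] for row in rows] for j in range(d)]
    weights = [0.0] * d
    residuals = list(targets)  # y - Xw, with w = 0 initially
    for _ in range(sweeps):
        for j, column in enumerate(columns):
            scale = sum(x * x for x in column) / n
            if scale == 0.0:
                continue  # an all-zero input carries no information
            # Correlation of input j with the residual that ignores input j.
            rho = sum(x * (r + weights[j] * x) for x, r in zip(column, residuals)) / n
            new_weight = soft_threshold(rho, penalty) / scale
            delta = new_weight - weights[j]
            residuals = [r - delta * x for r, x in zip(residuals, column)]
            weights[j] = new_weight
    return weights


# Synthetic data (an assumption for illustration): five inputs, of which only
# inputs 0 and 2 influence the target; the other three are pure noise.
rng = random.Random(0)
rows = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
targets = [3.0 * r[0] - 2.0 * r[2] + rng.gauss(0.0, 0.1) for r in rows]

weights = fit_sparse_linear_model(rows, targets, penalty=0.3)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("weights:", [round(w, 2) for w in weights])
print("selected inputs:", selected)
```

In this sketch the penalty drives the weights of the three noise inputs to exactly zero, so the fitted model names only the two inputs that matter. The price is that the remaining weights are shrunk somewhat towards zero, which is the usual trade-off between compactness and fit.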

Both application fields pose similar challenges for data analysis: the observed data are high-dimensional and the noise levels are moderate or large, and both factors cause fundamental problems for any data analysis. Seeking new areas of application, and engaging with the newest application domains that generate many novel types of data, helps in finding new, unsolved problem settings.

Group members:

Contributed open-source software:

  • BernoulliMix — a program package of finite mixture models of multivariate Bernoulli distributions
  • dplR — Dendrochronology Program Library in R

Active participation in conference organization:

  • The Eleventh International Symposium on Intelligent Data Analysis (IDA 2012) in Helsinki, Finland
  • The Tenth International Symposium on Intelligent Data Analysis (IDA 2011) in Porto, Portugal
  • The Fourteenth International Conference on Discovery Science (DS 2011) in Espoo, Finland

Former personnel and visitors:

Selected publications:

  1. Miguel A. Prada, Janne Toivola, Jyrki Kullaa, and Jaakko Hollmén. Three-way analysis of structural health monitoring data. Neurocomputing, 80:119–128, March 2012.
  2. Alexios Kotsifakos, Panagiotis Papapetrou, Jaakko Hollmén, and Dimitrios Gunopulos. A subsequence matching with gaps-range-tolerances framework: A query-by-humming application. Proceedings of the VLDB Endowment (PVLDB), 4(11):761–771, August 2011.
  3. Mikko Korpela, Pekka Nöjd, Jaakko Hollmén, Harri Mäkinen, Mika Sulkava, Pertti Hari. Photosynthesis, temperature and radial growth of Scots Pine in northern Finland: identifying the influential time intervals. Trees - Structure and Function, 25(2):323–332, April 2011.
  4. Prem Raj Adhikari, Bimal Babu Upadhyaya, Chen Meng, and Jaakko Hollmén. Gene selection in time-series gene expression data. In Proceedings of the 6th IAPR Conference on Pattern Recognition in Bioinformatics (PRIB 2011), volume 7036 of Lecture Notes in Bioinformatics, pages 145–156, Springer-Verlag, November 2011.
  5. Serafin Alonso, Mika Sulkava, Miguel Angel Prada, Manuel Dominguez, and Jaakko Hollmén. Comparative analysis of power consumption in university building using envSOM. In Advances in Intelligent Data Analysis X: Proceedings of the 10th International Symposium (IDA 2011), volume 7014 of Lecture Notes in Computer Science, pages 10–21, Springer-Verlag, October 2011.
  6. Mark J. Brewer, Mika Sulkava, Harri Mäkinen, Mikko Korpela, Pekka Nöjd, and Jaakko Hollmén. Logistic fitting method for detecting onset and cessation of tree stem radius increase. In Proceedings of the 12th International Conference on Intelligent Data Engineering and Automated Learning (IDEAL 2011), volume 6936 of Lecture Notes in Computer Science, pages 204–211, Springer-Verlag, September 2011.
  7. Orestis Kostakis, Panagiotis Papapetrou, and Jaakko Hollmén. ARTEMIS: Assessing the similarity of event-interval sequences. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), volume 6912 of Lecture Notes in Computer Science, pages 229–244, Springer-Verlag, September 2011.
  8. Serafin Alonso, Mika Sulkava, Miguel Angel Prada, Manuel Dominguez, and Jaakko Hollmén. EnvSOM: a SOM algorithm conditioned on the environment for clustering and visualization. In Proceedings of the 8th Workshop on Self-Organizing Maps (WSOM 2011), volume 6731 of Lecture Notes in Computer Science, pages 61–70, Springer-Verlag, June 2011.
  9. Konsta Sirvio and Jaakko Hollmén. Forecasting road condition after maintenance works with linear methods and radial basis function networks. In Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN'11), volume 6792 of Lecture Notes in Computer Science, pages 405–412, Springer-Verlag, June 2011.
  10. Orestis Kostakis, Panagiotis Papapetrou, and Jaakko Hollmén. Distance measure for querying sequences of temporal intervals. In Proceedings of the 4th International Conference on PErvasive Technologies Related to Assistive Environments (PETRA 2011), May 2011.
  11. Alexios Kotsifakos, Vassilis Athitsos, Panagiotis Papapetrou, Jaakko Hollmén, and Dimitrios Gunopulos. Model-based search in large time series databases. In Proceedings of the 4th International Conference on PErvasive Technologies Related to Assistive Environments (PETRA 2011). ACM, May 2011.
  12. Maurizio Bocca, Janne Toivola, Lasse Eriksson, Jaakko Hollmén, and Heikki Koivo. Structural health monitoring in wireless sensor networks by the embedded Goertzel algorithm. In Proceedings of the 2nd ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS 2011), pages 206–214, IEEE, April 2011.
  13. Anu Usvasalo, Riikka Raty, Arja Harila-Saari, Pirjo Koistinen, Eeva-Riitta Savolainen, Sakari Knuutila, Erkki Elonen, Ulla M. Saarinen-Pihkala, Jaakko Hollmén. Prognostic classification of patients with acute lymphoblastic leukemia by using gene copy number profiles identified from array-based comparative genomic hybridization data. Leukemia Research, 34(11):1476–1482, November, 2010.
  14. Prem Raj Adhikari and Jaakko Hollmén. Patterns from Multi-Resolution 0-1 Data. In Bart Goethals, Nikolaj Tatti, and Jilles Vreeken, editors, Proceedings of the ACM SIGKDD Workshop on Useful Patterns (UP'10), pages 8–12. July 25, 2010. Washington, DC, USA.
  15. Mikko Korpela, Harri Mäkinen, Pekka Nöjd, Jaakko Hollmén, and Mika Sulkava. Automatic detection of onset and cessation of tree stem radius increase using dendrometer data. Neurocomputing, 73(10–12):2039–2046, June, 2010.
  16. Janne Toivola, Miguel A. Prada and Jaakko Hollmén. Novelty detection in projected spaces for structural health monitoring. In Paul R. Cohen, Niall M. Adams, and Michael R. Berthold, editors, Advances in Intelligent Data Analysis IX, volume 6065 of LNCS, pages 208–219. Springer-Verlag. May 2010. Tucson, Arizona, USA.
  17. S. Luyssaert, P. Ciais, S. L. Piao, E.-D. Schulze, M. Jung, S. Zaehle, M. J. Schelhaas, M. Reichstein, G. Churkina, D. Papale, G. Abril, C. Beer, J. Grace, D. Loustau, G. Matteucci, F. Magnani, G. J. Nabuurs, H. Verbeeck, M. Sulkava, G. R. van der Werf, and I. A. Janssens. The European carbon balance. Part 3: forests. Global Change Biology, 16(5):1429–1450, May 2010.
  18. Laxman Yetukuri, Jarkko Tikka, Jaakko Hollmén, and Matej Orešič. Functional prediction of unidentified lipids using supervised classifiers. Metabolomics, 6(1):18–26, March, 2010.
  19. Michaela Wrage, Salla Ruosaari, Paul P. Eijk, Jussuf T. Kaifi, Jaakko Hollmén, Emre F. Yekebas, Jacob R. Izbicki, Ruud H. Brakenhoff, Thomas Streichert, Sabine Riethdorf, Bauke Ylstra, Klaus Pantel, and Harriet Wikman. Genomic profiles associated with early micrometastasis in lung cancer: Relevance of 4q deletion. Clinical Cancer Research, 15(5):1566–1574, 2009.
  20. Janne Toivola and Jaakko Hollmén. Feature extraction and selection from vibration measurements for structural health monitoring. In Niall M. Adams, Céline Robardet, Arno Siebes, and Jean-François Boulicaut, editors, Proceedings of the 8th International Symposium on Intelligent Data Analysis (IDA 2009), volume 5772 of Lecture Notes in Computer Science, pages 213–224. Springer-Verlag, 2009.
  21. Jarkko Tikka. Input variable selection methods for construction of interpretable regression models. Doctoral dissertation, Helsinki University of Technology, December, 2008.
  22. Salla Ruosaari. Microarrays in Lung Cancer Research: From Comparative Analyses to Verified Findings. Doctoral dissertation, University of Helsinki, June, 2008.
  23. Samuel Myllykangas, Jarkko Tikka, Tom Böhling, Sakari Knuutila, and Jaakko Hollmén. Classification of human cancers based on DNA copy number amplification modeling. BMC Medical Genomics, 1(15), May 2008.
  24. Mika Sulkava. Learning from environmental data: methods for analysis of forest nutrition time series. Doctoral dissertation, Helsinki University of Technology, January 2008.
  25. Jarkko Tikka, Jaakko Hollmén. Sequential Input Selection Algorithm for Long-term Prediction of Time Series. Neurocomputing, 71(13–15): 2604–2615, August 2008.
  26. S. Luyssaert, I.A. Janssens, M. Sulkava, D. Papale, A.J. Dolman, M. Reichstein, J. Hollmén, J.G. Martin, T. Suni, T. Vesala, D. Loustau, B.E. Law, and E.J. Moors. Photosynthesis drives anomalies in net carbon-exchange of pine forests at different latitudes. Global Change Biology, 13(10):2110–2127, October 2007.
  27. Timo Similä and Jarkko Tikka. Input selection and shrinkage in multiresponse linear regression. Computational Statistics & Data Analysis, 52(1):406–422, September, 2007.
  28. Jaakko Hollmén and Jarkko Tikka. Compact and Understandable Descriptions of Mixtures of Bernoulli Distributions. In Proceedings of the 7th International Symposium on Intelligent Data Analysis (IDA 2007), volume 4723 of Lecture Notes in Computer Science, pages 1–12. Springer-Verlag, September 2007. Ljubljana, Slovenia.
  29. H. Wikman, S. Ruosaari, P. Nymark, V.K. Sarhadi, J. Saharinen, E. Vanhala, A. Karjalainen, J. Hollmén, S. Knuutila, S. Anttila. Gene expression and copy number profiling suggests the importance of allelic imbalance in 19p in asbestos-associated lung cancer. Oncogene, 26(32):4730–4737, July 2007.
  30. Mika Sulkava, Sebastiaan Luyssaert, Pasi Rautio, Ivan A. Janssens, Jaakko Hollmén. Modeling the effects of varying data quality on trend detection in environmental monitoring. Ecological Informatics, 2(1):167–176, June 2007.
  31. Penny Nymark, Pamela M. Lindholm, Mikko V. Korpela, Leo Lahti, Salla Ruosaari, Samuel Kaski, Jaakko Hollmén, Sisko Anttila, Vuokko L. Kinnula, and Sakari Knuutila. Gene Expression Profiles in Asbestos-exposed Epithelial and Mesothelial Lung Cell Lines. BMC Genomics, 8(62), March 2007.
  32. S. Myllykangas, J. Himberg, T. Böhling, B. Nagy, J. Hollmén, and S. Knuutila. DNA copy number amplification profiling of human neoplasms. Oncogene, 25(55):7324–7332, November 2006.
  33. Penny Nymark, Harriet Wikman, Salla Ruosaari, Jaakko Hollmén, Esa Vanhala, Antti Karjalainen, Sisko Anttila, and Sakari Knuutila. Identification of specific gene copy number changes in asbestos-related lung cancer. Cancer Research, 66(11):5737–5743, June 2006.
  34. Mika Sulkava, Jarkko Tikka, and Jaakko Hollmén. Sparse regression for analyzing the development of foliar nutrient concentrations in coniferous trees. Ecological Modelling, 191(1):118–130, January 2006.