
Data Mining: Theory and Applications
The Data Mining group in Otaniemi conducts research on finding local patterns and global models in discrete high-dimensional data. Techniques for this task include both algorithmics in the traditional computer science sense and probabilistic methods.
We develop new concepts, algorithms, principles and frameworks for algorithmic data analysis. We believe that developing new concepts and algorithms is, at its best, an iterative process: interacting extensively with the application experts, formulating computational concepts, analyzing the properties of the concepts, designing algorithms and analyzing their performance, implementing and experimenting with the algorithms, and applying the results in practice. Our current research topics include
- randomization methods,
- theory of clustering,
- analysis of orders, and
- analysis of binary data.
Personnel
- Prof Heikki Mannila, group leader
- Senior and post-doctoral researchers:
- Doctoral students:
- Students:
- Ville Pettersson
Former personnel: Ella Bingham, Dumitru Erhan, Gemma Garriga, Robert Gwadera, Heli Hiisilä, Johan Himberg, Saara Hyvönen, Jaripekka Juhala, Mikko Katajamaa, Olli-Pekka Koistinen, Mikko Koivisto, Kalle Korpiaho, Aino Lahdenperä, Teemu Murtola, Anne Patrikainen, Kai Puolamäki, Antti Rasinen, Salla Ruosaari, Antti Savolainen, Jouni K. Seppänen, Nikolaj Tatti, Johanna Tikanmäki, Antti Ukkonen.
The Data Mining research group is located at the Department of Information and Computer Science at the Helsinki University of Technology (TKK) on the Otaniemi campus, Espoo. We are members of the Helsinki Institute for Information Technology HIIT and the Finnish Centre of Excellence for Algorithmic Data Analysis Research (Algodan). Our group is also associated with PASCAL2 (Pattern Analysis, Statistical Modelling and Computational Learning), a Network of Excellence funded by the European Union.
You can find us at the Department of Information and Computer Science at TKK (contact information and how to get here). For personal contact information, see the individual members' home pages.
Research topics and selected publications
Our current topics include randomization methods (see, e.g., Randomization of real-valued matrices for assessing the significance of data mining results by Ojala et al. in ICDM 2008; Assessing data mining results via swap randomization by Gionis, Mannila et al. in KDD-06), theory of clustering (see, e.g., An approximation ratio for biclustering by Hanhijärvi et al. in 2008), analysis of orders (see, e.g., Antti Ukkonen's PhD thesis in 2008) and analysis of binary data (e.g., Nikolaj Tatti's PhD thesis in 2008; Banded structure in binary matrices by Garriga et al. in KDD-08).
Our current application areas include biology, ecology and paleontology (examples of this work include Biogeography of European land mammals shows environmentally distinct and spatially coherent clusters by Heikinheimo et al. in 2006; Seriation in paleontological data using Markov chain Monte Carlo methods by Puolamäki et al. in 2006; Higher origination and extinction rates in larger mammals by Mannila and others in 2008), visual analytics (see, e.g., our KDD-09 workshop), and borders in various disciplines such as biology, ecology, history and geography.
Some highlights and an overview of our research are presented in our chapter (From Data to Knowledge Research Unit: Research Projects under the CIS Laboratory) of the CIS Biennial report 2006-2007, as well as in the HIIT Annual report 2007 (especially subsections 4.1.1 and 4.1.5).
You can see the group members' recent publications at the TKK Publications Register (updated yearly). Please see the individual group members' home pages for their lists of publications.