History of Previous Talks in Spring 2019

Personal Data and Personal AI

Date: June 10, 2019

Abstract: How can data-hungry algorithms be fed with personal data from multiple sources, and how can this be done for the benefit of people and under their control? Antti ‘Jogi’ Poikola from Finland’s AI Accelerator (https://faia.fi) will present the human-centric MyData paradigm for personal data management and its connections to the development of personal AI. MyData (https://mydata.org) is striving for a fair, sustainable, and prosperous digital society, where the sharing of personal data is based on trust.

Speaker: Antti Poikola
Co-leader, Finnish AI Accelerator
Co-founder, MyData Global

Place of Seminar: Seminar Room T6, Konemiehentie 2, Otaniemi

AI for Design: Games and Beyond

Date: June 3, 2019

Abstract: The talk will provide an overview of AI and Machine Learning for/in design. The emphasis will be on designing games, which are common testbeds for AI models and algorithms, and also an ubermedia that can encompass a wide variety of other media and design. This is especially true when considering games that combine both physical and digital elements, e.g., my group’s mixed reality climbing and trampoline games; here, designing a viable product also requires traditional non-digital or “industrial” design competence. The uses of AI and ML in game design range from generative models for content creation, opponent and companion AI to testing and optimization of designs using computational models/simulations of player cognition, motivation, emotion, and embodied experience.

Bio: Perttu Hämäläinen is the professor of computer games at Aalto University. Hämäläinen’s primary research interest is gameplay innovation through technologies such as AI & ML, procedural animation, and computer vision. He is best known for his work on novel and award-winning exergames for physical activity motivation (e.g., Augmented Climbing Wall, Kick Ass Kung-Fu) and intelligent control algorithms for physically based simulated characters. His main publication venues are ACM SIGGRAPH, ACM CHI, and ACM CHI PLAY. Videos and more information: http://perttu.info, https://github.com/PerttuHamalainen/MediaAI

Speaker: Perttu Hämäläinen
Affiliation: Professor of Computer Science, Aalto University

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Searching for Experiences

Date: May 27, 2019

Abstract: The rise of Artificial Intelligence raises an important challenge: how can we develop AI that increases the well-being of its users? This talk will describe a set of projects that address this issue and raise challenges for Natural Language Understanding and data management. The main theme of these projects is to help users create experiences that make them happy. The first project, Jo, is a smart-journaling application that allows users to enjoy the insights of the field of Positive Psychology in the context of their own lives. Users log their important moments via short texts and Jo attempts to give them insights that help them take steps toward creating more positive moments in their lives. The second project, Voyageur, helps users create positive experiences when they shop online for services. The observation underlying our project is that while users are searching for experiences (e.g., restaurant outings, vacations), online services only enable them to search based on objective non-experiential attributes. Voyageur introduces the idea of subjective databases, which enable users to query directly for experiential aspects.

Bio: Alon Halevy was the CEO of Megagon Labs until December 2018. Previously, Alon led the Structured Data Research Group at Google for 10 years and before that he was a professor of computer science at the University of Washington. Alon is a founder of Nimble Technology, and of Transformatic, Inc., which was acquired by Google in 2005. Alon is the author of two books: “The Infinite Emotions of Coffee” and “Principles of Data Integration.” Alon is an ACM Fellow and has received the Sloan Fellowship and the PECASE Award. He received his Ph.D. in Computer Science from Stanford University in 1993.

Speaker: Alon Halevy

Place of Seminar: Maarintie 8, Lecture Hall AS1, Aalto University
NOTE: This talk is part of the Helsinki Distinguished Lecture Series and requires registration.

Advances in Compression and Exploration via Probabilistic Machine Learning

Date: May 23, 2019

Abstract: In this talk, I will describe two recent contributions in the area of probabilistic machine learning. The first one is MIRACLE, a method for finding compressed representations of neural network weights which can be very useful for the design of mobile apps and energy-efficient hardware. We encode the network weights using a random sample, requiring only a number of bits corresponding to the Kullback-Leibler divergence between the sampled variational distribution and the encoding distribution. Unlike other methods, we can explicitly control the compression rate while optimizing the expected loss on the training set. The employed encoding scheme can be shown to be close to the optimal information-theoretical lower bound. The second contribution is Successor Uncertainties (SU), a probabilistic Q-learning method for balancing exploration and exploitation in reinforcement learning. SU can incorporate uncertainty about long-term consequences of actions accounting for the fundamental dependencies in state-action values implied by the Bellman equation. SU outperforms existing algorithms on several tabular benchmarks and attains strong performance on the Atari benchmark suite.
Speaker: José Miguel Hernández-Lobato
Affiliation: Professor of Computer Science, University of Cambridge

Place of Seminar: Lecture Hall T2, Konemiehentie 2, Aalto University

Efficient estimation of AUC in a sliding window

Date: May 20, 2019

Abstract: In many applications, monitoring area under the ROC curve (AUC) in a sliding window over a data stream is a natural way of detecting changes in the system. The drawback is that computing AUC in a sliding window is expensive, especially if the window size is large and the data flow is significant.

In this paper we propose a scheme for maintaining an approximate AUC in a sliding window of length k. More specifically, we propose an algorithm that, given ϵ, estimates AUC within ϵ/2, and can maintain this estimate in O((log k)/ϵ) time, per update, as the window slides. This provides a speed-up over the exact computation of AUC, which requires O(k) time, per update. The speed-up becomes more significant as the size of the window increases. Our estimate is based on grouping the data points together, and using these groups to calculate AUC. The grouping is designed carefully such that (i) the groups are small enough, so that the error stays small, (ii) the number of groups is small, so that enumerating them is not expensive, and (iii) the definition is flexible enough so that we can maintain the groups efficiently.

Our experimental evaluation demonstrates that the average approximation error in practice is much smaller than the approximation guarantee ϵ/2, and that we can achieve significant speed-ups with only a modest sacrifice in accuracy.
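
For orientation, the exact baseline that the abstract compares against can be sketched as follows. This is a minimal illustration only, not the approximation algorithm from the talk: it recomputes AUC from scratch for every window by pairwise comparison, so it is slower than the O(k) exact update mentioned above. The function names `exact_auc` and `sliding_auc`, and the convention that labels are 1 (positive) and 0 (negative), are assumptions made for this example.

```python
from collections import deque


def exact_auc(window):
    """Exact AUC of (score, label) pairs; labels are assumed to be 1 or 0.

    AUC is the fraction of (positive, negative) pairs in which the positive
    item has the higher score; ties count as half a win.
    """
    pos = [s for s, y in window if y == 1]
    neg = [s for s, y in window if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined unless both classes are present
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sliding_auc(stream, k):
    """Yield-free helper: list of exact AUC values for each full window of length k."""
    window = deque(maxlen=k)  # the oldest item is dropped automatically
    values = []
    for item in stream:
        window.append(item)
        if len(window) == k:
            values.append(exact_auc(window))
    return values


if __name__ == "__main__":
    stream = [(0.9, 1), (0.1, 0), (0.8, 1), (0.7, 0), (0.2, 1)]
    print(sliding_auc(stream, 3))
```

The cost of this naive recomputation grows with the window length k, which is the motivation given in the abstract for maintaining an approximate estimate instead.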
Speaker: Nikolaj Tatti
Affiliation: Professor of Computer Science, University of Helsinki

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Detecting Stable Surface Adsorbates with Bayesian Optimization

Date: May 13, 2019

Abstract: Our frontier technologies are increasingly based on complex functional materials. Optimizing their properties requires precise structural knowledge on an atomic scale, for which conventional computational methods are prohibitively expensive. We present a novel method for atomistic structure search, in which we combine Bayesian optimization with quantum mechanical simulations to explore optimal structures. With this method, we study the adsorption of a camphor molecule on a copper surface and identify the stable adsorbate structures in comparison with atomic force microscopy images.
Speaker: Jari Järvi
Affiliation: Doctoral candidate of applied physics, Aalto University

Place of Seminar: Seminar Room T6, Konemiehentie 2, Aalto University

The 2018 Otaniemi/Espoo Geothermal Stimulation Experiment: Challenges of Standard And Opportunities for Novel Data Analysis Techniques

Date: May 6, 2019

Abstract: Seismologist G. Hillers explains the basic concept of an Enhanced Geothermal System for carbon-free energy production using the example of the 2018 Otaniemi/Espoo stimulation experiment that took place some 6 km below the campus of Aalto University. Between May and July 2018, the St1 Deep Heat company injected water into the rock mass to create fractures that will allow water to circulate and to heat up. This process is accompanied by the occurrence of thousands of small earthquakes. The Institute of Seismology faces the challenge of analyzing the terabytes of seismic time series data, and later the earthquake catalog data, that were collected from a network consisting of borehole and surface stations. The presentation introduces the traditional analysis techniques, what observables they yield, and how these inform us about the processes at depth, as well as the algorithmic and computational challenges associated with this “big” data set, and sketches why ML and AI techniques are also becoming increasingly popular in seismological contexts. The intention is to initiate collaboration between the institute and the CS community at Kumpula and Aalto.
Speaker: Gregor Hillers
Affiliation: Professor of Seismology, University of Helsinki

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

AI approaches to support language learning

Date: April 29, 2019

Abstract: We will overview Revita, a project in CALL – computer-aided (human) language learning. Specifically, we work on language learning beyond the elementary level – for intermediate and advanced learners (beginners get adequate support from the myriad existing services and applications). Revita aims to simulate a good teacher, by modeling and assessing the learner’s state and progress. One important aspect of our approach is allowing the user to learn from /authentic/ materials – arbitrary texts chosen by the users themselves.

According to surveys, Revita is the first AI-based system of scale that:
– works beyond the elementary level: targets intermediate to advanced learners,
– has multi-lingual focus (an English-only system exists for advanced essay assessment),
– is used in official university-level curricula – beyond “academic” experiments.
We are working on expanding to additional learning environments and additional languages.

Our approach allows us to collect data to analyze patterns of language learning in great depth and detail. From on-going studies with actual learners, we collect data about the learning process – about typical mistakes, paths of progress, etc. This data provides a playground for research problems, which we will discuss. I will focus on two sides of the research:
– building the tools needed to collect data, and
– methods of analysis and applications of the collected data, which include neural networks, in particular, translation and sequence-to-sequence models.

Bio: Roman Yangarber has led the Research Group in NLP at the Department of Computer Science, University of Helsinki, over the last 10 years. The group has been working on a variety of themes in NLP, researching how language works, and how computers can better understand language. Research themes include analysis of news media, and modeling language evolution. The more recent research – AI support for language learning – has resulted in a system being used by end-users at several universities; it has also won a best paper award at a Digital Humanities conference last year.
Speaker: Roman Yangarber
Affiliation: Professor of Computer Science, University of Helsinki

Place of Seminar: Seminar Room T5, Konemiehentie 2, Aalto University


Investigating Illegal Wildlife Trade From Social Media and Other Digital Platforms

Date: April 15, 2019

Abstract: Illegal wildlife trade (e.g. illegal trade of ivory harvested from poached elephants) is one of the main threats to biodiversity, involving thousands of species. Traditionally, illegal wildlife trade has thrived on physical markets. In this age of global connectivity, however, illegal wildlife trade has moved to online markets, especially social media platforms. Social media and other digital platforms offer good conditions for illegal wildlife trade to thrive, as the platforms are easily accessible and have a high number of users. While this represents a pressing threat to the species targeted in the illegal wildlife trade, scientists can use data mined from digital platforms to investigate illegal wildlife trade at an unprecedented spatio-temporal scale. The use of digital data sources in combination with methods from artificial intelligence can potentially be used to provide new insights, which might help stop illegal wildlife trade. Many social media platforms provide an application programming interface that allows access to user-generated text, images and videos, as well as to accompanying metadata, such as where and when the content was uploaded, and connections between users. This deluge of data can be used to investigate illegal wildlife trade in a cost-efficient manner, but requires methods from computer science to be used efficiently in conservation science. In the presentation, I will show (i) how machine learning can be used to automatically identify content pertaining to the illegal wildlife trade from high-volume data mined from social media platforms and (ii) how natural language processing can be used to assess preferences, reactions and sentiment of social media users towards illegal wildlife trade.

Speaker: Enrico Di Minin
Affiliation: Research Fellow in Conservation and Sustainability Science, University of Helsinki

Place of Seminar: Seminar Room T5, Konemiehentie 2, Aalto University

NOTE: Due to maintenance work in T6, this week's seminar will be held in T5.

Deploying A.I.

Date: April 8, 2019

Abstract: Expectations for AI changing the world are great, and academic research groups have ambitious goals for deploying their research in the real world.

In this talk I discuss the potential of AI in a range of industries and application areas. What are the main opportunities for AI, what are the main obstacles to deploying AI on a large scale, and what are the realistic routes for deployment?

My viewpoint is that of a researcher in planning and decision-making, with long-term experience in trying to apply AI to real-world industrial problems, including telecoms, energy, and software production.

I will argue that the core obstacle to AI having a very broad impact on society is the integration of AI technologies into existing software infrastructure. This obstacle also points to many promising research opportunities.

Bio: Jussi Rintanen has done A.I. research at the junction of automated reasoning, search, and planning and decision-making at the universities of Ulm and Freiburg in Germany (1999 to 2005), at the Australian National University (2006-2011), with applied AI projects at National ICT Australia (2006-2011), and since 2012 at Aalto University. He leads the A.I. and Software Systems research group at Aalto.

Speaker: Jussi Rintanen
Affiliation: Professor of Computer Science, Aalto University

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Simulation-Based Learning for Robots

Date: April 1, 2019

Abstract: The use of reinforcement learning is challenging for physical systems because exploration is costly in terms of time and wear of equipment, as well as possibly being unsafe. For many mechanical systems, decent simulation models can be built and used instead of the physical system. Thus, the learning is performed in simulation. Even so, realistic simulation may be costly and thus sample efficiency is important. Moreover, it is not uncommon that simulation models lack realism.

In this talk, I will talk about some of the machine learning challenges associated with the use of reinforcement learning and simulation to build robust policies for physical systems. In particular, I will talk about regression models as surrogates for costly simulations. Moreover, I’ll emphasize the value of good initial policies especially in sparse reward settings and how a single human demonstration can be used as a starting point for incrementally learning a generalizable policy. I will also briefly address the reality gap problem. In particular, I will talk about how domain randomization can be used to build robustness towards known uncertainties in simulation parameters.

Bio: Ville Kyrki is an Associate Professor at Aalto University School of Electrical Engineering, where he leads the Intelligent Robotics group. The group develops intelligent robotic systems and robotic vision with a particular emphasis on methods and systems that cope with imperfect knowledge and uncertain senses.

Speaker: Ville Kyrki
Affiliation: Professor of Electrical Engineering and Automation, Aalto University

Place of Seminar: Seminar Room T6, Konemiehentie 2, Aalto University

Data science and computational history: opportunities for collaboration

Date: March 18, 2019

Abstract: Helsinki Computational History Group (COMHIS) is a multidisciplinary team that studies intellectual history (http://helsinki.fi/computational-history). The work in the group is guided by methods from various backgrounds, ranging from modern data science and machine learning to history and linguistics. “Computational history” implies the use of mixed methods in which a big data approach is combined with expert subject knowledge in intellectual history and book history. Lately we have been focusing on an integrated study of large historical metadata and full-text sources (especially British eighteenth-century literature and different historical newspaper resources). From a methods perspective, we have been developing contextual tools for bibliographic sources to study influence and networks; text-reuse detection to study intertextuality (using BLAST to deal with OCR mistakes); materiality explorations of printed items based on information derived from layout, font etc.; stylometry to study particular questions of authorship; and word embeddings and other text mining methods when thinking about conceptual change. A bottleneck in computational history is the preprocessing and harmonization of data – we take care of that. What we are looking for from a data science audience are novel methodological ideas about ways to use data science to model our historical data. For example, we have ideas on how to use computer vision for some aspects of our work, but we need a trained data scientist with a computer vision background to collaborate with us on this front. This talk will outline these aspects of our research in order to explore opportunities for collaboration.

Speaker: Mikko Tolonen

Affiliation: Professor of Digital Humanities, University of Helsinki

Place of Seminar: Seminar Room T6, Konemiehentie 2, Aalto University

Recent Advances in Aalto ASR Group

Date: March 11, 2019

Abstract: I will introduce the current research in my automatic speech recognition (ASR) group at Aalto University. It includes the new acoustic and language models by which we won the MGB ASR Challenge 2017 and the new applications such as automatic pronunciation evaluation and verbal description of audiovisual data.

Bio: Dr. Mikko Kurimo is a professor in speech and language processing at Aalto University, Finland. He has led Aalto’s speech recognition research group since 2000 as well as several national and international research projects. His work is internationally best known for unsupervised subword language modeling for morphologically complex languages such as Finnish, Estonian and Arabic.

Speaker: Mikko Kurimo

Affiliation: Professor of Speech and Language Processing, Aalto University

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Machine Learning for Clinical Decision Support – How To Make It Work

Date: March 4, 2019

Abstract: Clinical Decision Support Systems (CDSSs) aim to help healthcare professionals in making more efficient decisions. Their task may be to help, e.g., with early diagnosis of a disease, creation of treatment plans, or following the effectiveness of a certain treatment. Recent advances in data-driven approaches and machine learning methods, combined with the increased availability of data, have pushed this field forward. However, the speed of uptake of such methods in routine clinical use is much slower than what we are used to e.g. in industrial or financial applications.

In this talk, we will go through several issues that we run into when developing machine learning approaches specifically for real-life healthcare settings. We will consider the need for explainability, dealing with incomplete data, and integration with other systems, as well as assessment of performance/cost-effectiveness/impact. We will do this using two decision support systems that we have developed as examples: one for assisting in diagnosis of dementias, and one for outcome prediction and treatment planning in traumatic brain injuries. What went well, what could be improved, and what can we learn from that?

Speaker: Mark van Gils

Affiliation: Research Professor, VTT Technical Research Center of Finland

Place of Seminar: Seminar Room T6, Konemiehentie 2, Aalto University

Learning From Electronic Health Records: From Temporal Abstractions to Time Series Interpretability

Date: February 25, 2019

Abstract: The first part of the talk will focus on data mining methods for learning from Electronic Health Records (EHRs), which are typically perceived as big and complex patient data sources. On them, scientists strive to perform predictions on patients’ progress, to understand and predict response to therapy, to detect adverse drug effects, and many other learning tasks. Medical researchers are also interested in learning from cohorts of population-based studies and of experiments. Learning tasks include the identification of disease predictors that can lead to new diagnostic tests and the acquisition of insights on interventions. The talk will elaborate on data sources, methods, and case studies in medical mining.
The second part of the talk will tackle the issue of interpretability and explainability of opaque machine learning models, with focus on time series classification. Time series classification has received great attention over the past decade with a wide range of methods focusing on predictive performance by exploiting various types of temporal features. Nonetheless, little emphasis has been placed on interpretability and explainability. This talk will formulate the novel problem of explainable time series tweaking, where, given a time series and an opaque classifier that provides a particular classification decision for the time series, the objective is to find the minimum number of changes to be performed to the given time series so that the classifier changes its decision to another class. Moreover, it will be shown that the problem is NP-hard. Two instantiations of the problem will be presented. The classifier under investigation will be the random shapelet forest classifier. Moreover, two algorithmic solutions for the two problem instantiations will be presented along with simple optimizations, as well as a baseline solution using the nearest neighbor classifier.
Website: https://papapetrou.blogs.dsv.su.se/

Speaker: Panagiotis Papapetrou

Affiliation: Professor of Computer Science, Stockholm University

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Securing Pedestrians in Autonomous Traffic Ecosystems of Smart Cities

Date: February 18, 2019

Abstract: Future smart cities must be based on sustainable principles; they must provide improved quality of life for all citizens, and be safe and as emission-free as possible. This goal requires radical reforms in traffic, namely creating an automated ground vehicle ecosystem and relocating part of the transportation of goods, and probably also of people, to the airspace by using Unmanned Aerial Vehicles (UAVs). Pedestrians and bicyclists must be included in traffic monitoring and control actions, a fact that has been largely neglected so far in the discussion about automated traffic. All these goals demand the development of sophisticated spatiotemporal data analysis methods. In order to implement a functional traffic ecosystem that assures safe cooperation of all these actors, knowledge of their positions, the ability to predict their movements, the effects caused by changes in the operating environment, and the capability to fuse all this information together are crucial.

The most challenging actors from the navigation perspective are pedestrians. Their motion is unrestricted, they spend a large portion of their time indoors, and they have strict demands for navigation equipment. This talk will give a glimpse of the research goals of the Spatiotemporal Data Analysis research group and will look a bit more closely at a specific application of infrastructure-free pedestrian navigation, especially user motion recognition via machine learning to improve the navigation result.

Speaker: Laura Ruotsalainen

Affiliation: Professor of Computer Science, University of Helsinki

Place of Seminar: Seminar Room T6, Konemiehentie 2, Aalto University


Machine Learning For Tomography

Date: February 11, 2019

Abstract: Tomography refers to imaging methods where one attempts to recover the internal structure of a physical body from non-invasive boundary measurements. The most famous example is X-ray tomography used in hospitals. The mathematics of inverse problems focuses on extracting information from indirect data, and tomography is a central research topic in the field. Recently, machine learning has offered new data-driven possibilities for image reconstruction. Some of the results using, for example, the “U-net” are truly stunning. However, they are largely “black boxes,” and especially in medical imaging there is a great need for interpretability. This talk presents some ideas on using traditional inverse problems mathematics for calculating nonlinear features that are then used as inputs for machine learning. This way one could increase interpretability, reduce the size of networks needed for learning, allow the use of smaller training data sets, and increase the robustness of the network (the image formation tasks in ill-posed inverse problems of tomography are very sensitive to noise).

Speaker: Samuli Siltanen

Affiliation: Professor of Industrial Mathematics, University of Helsinki

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki


Probabilistic Modelling With the Experts

Date: February 4, 2019

Abstract: I will discuss multiple-data-source prediction problems typical of omics-based precision medicine. What is less typical is that some of the data sources are expert users, whose time is costly, changing the problem to active learning or experimental design for prediction. We have addressed this setup as a probabilistic modelling problem, where different types of sources need different modelling assumptions. I will demonstrate that promising results can be achieved in treatment effectiveness prediction tasks in restricted settings, even by explaining human variation with noise models. Richer behaviour requires richer models that draw from cognitive science.

Speaker: Samuel Kaski

Affiliation: Professor of Computer Science, Aalto University

Place of Seminar: Seminar Room T6, Konemiehentie 2

Gray Box Models for Controllable and Transparent Interactive AI

Date: January 28, 2019

Abstract: Gray box modelling combines first-principles-based white box models with a data-driven approach to strike a balance between representation power and controllability. I discuss interactive AI and computational design as application areas for gray box models. So-called light gray models learn psychological parameters from data. Dark gray models, on the other hand, are data-driven models pretrained with white-box models. As a promising approach in intelligent user interfaces, I discuss models of human performance and cognition, which can predict the human consequences of a design decision. These models can 1) represent population and individual characteristics in a psychologically meaningful way and 2) predict the adaptive behavioral response of a person. However, previously their use has been limited because of a lack of appropriate likelihood inference methods. I discuss the use of probabilistic machine learning methods for learning model parameters from real-world data and making decisions in light of confidence levels.

Speaker: Antti Oulasvirta

Affiliation: Professor of Computer Science, Aalto University

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki

Can Physiological Signals be used for Implicit Interaction with Information?

Date: January 21, 2019

Abstract: Wouldn't it be nice if computers were able to empathize with us effortlessly and understand what we are interested in or what is entertaining to us while we look for information and consume content? The talk will introduce physiological signals as input to computing systems outside of the medical domain. A definition of implicit interaction will be presented along with the roles it can have in improving the experience of users. Several recent cases will be presented to assess how far we are from using physiological signals, including reflection on current challenges.

Speaker: Giulio Jacucci

Affiliation: Professor of Computer Science, University of Helsinki

Place of Seminar: Seminar Room T6, Aalto University


Machine learning in cancer research and oncology

Date: January 14, 2019

Abstract: Medicine in general, and oncology in particular, is experiencing a paradigm shift in which several molecularly targeted therapies have become available to patients, and many more will become available in the future. This change is driven by technological advances that have reduced the barriers to measuring large amounts of data at pathophysiological and molecular levels from a cancer patient. Finding the right drug for the right patient requires a joint effort where machine learning experts collaborate with translational and clinical experts. In this presentation I focus on ongoing cancer research projects in the Faculty of Medicine where collaboration with the local machine learning community can lead to scientific breakthroughs and medical benefits.

Speaker: Sampsa Hautaniemi

Affiliation: Professor of Systems Biology, University of Helsinki

Place of Seminar: Lecture Hall Exactum D122, University of Helsinki